Netezza technicaloverviewportugues

© 2011 IBM Corporation. June 6, 2022. IBM Netezza TwinFin®: Leader in Data Warehouse Appliances. Silvio Ferrari, IBM Netezza Systems Engineer.


Page 1: Netezza technicaloverviewportugues

© 2011 IBM Corporation, April 8, 2023

IBM Netezza TwinFin®

Leader in Data Warehouse Appliances

Silvio Ferrari

IBM Netezza Systems Engineer

Page 2: Netezza technicaloverviewportugues

Netezza, IM and BAO

[Diagram: the IBM information management landscape. Labels shown: Content; Structured Data; Analyze; Integrate; Manage; Governance; Quality; Lifecycle Management; Security & Privacy; Data; Transactional & Collaborative Applications; Business Analytics Applications; Streaming Information; Streams; Big Data; Data Warehouses; Data Warehouse Appliances; Master Data; External Information Sources; www.]

Page 3: Netezza technicaloverviewportugues

True Appliances

• Specialized devices
• Optimized for a single purpose
• Complete solution
• Fast installation
• Very simple operation
• Industry-standard interfaces
• Low cost

• Netezza announced its server in 2002
• In the best Gartner quadrant since 2008 (2008 Data Warehouse Database Management Systems Magic Quadrant report, released on December 23, 2008)

Page 4: Netezza technicaloverviewportugues

The Simplicity of an Appliance

Netezza

Page 5: Netezza technicaloverviewportugues

Data integration

Loading data into the IBM Netezza Appliance (inserting)

• Ab Initio
• Business Objects/SAP
• Composite Software
• Expressor Software
• GoldenGate Software (Oracle)
• Informatica
• IBM Information Server
• Sunopsis (Oracle)
• WisdomForce
• ... and many more

Interfaces: SQL, ODBC, JDBC, OLE-DB

Page 6: Netezza technicaloverviewportugues

Reporting and Analysis

Querying the IBM Netezza Appliance (extracting)

• Actuate
• Business Objects/SAP
• Cognos (IBM)
• Information Builders
• Kalido
• KXEN
• MicroStrategy
• Oracle OBIEE
• QlikTech
• Quest Software
• SAS
• SPSS (IBM)
• Unica (IBM)
• ... and many more

Interfaces: SQL, ODBC, JDBC, OLE-DB

Page 7: Netezza technicaloverviewportugues

The IBM Netezza AMPP™ architecture (hardware)

[Diagram: client tools (Advanced Analytics, Loader, ETL, BI, Applications) connect to the Netezza Appliance, which consists of Hosts, an Internal Network, and S-Blades™ (each with FPGA, Memory, and CPU) attached to Disks.]

Page 8: Netezza technicaloverviewportugues

Blade Servers

• CPUs
• Memory

Page 9: Netezza technicaloverviewportugues

IBM Netezza Database Accelerator

• CPUs
• Memory
• FPGA

Page 10: Netezza technicaloverviewportugues

Our secret:

FPGA
• Decompress
• Eliminate unused columns
• Restrict, Visibility

CPU
• Complex operations (∑): Joins, Aggs, etc.

select DISTRICT,
       PRODUCTGRP,
       sum(NRX)
from MTHLY_RX_TERR_DATA
where MONTH = '20091201'
  and MARKET = 509123
  and SPECIALTY = 'GASTRO'
group by DISTRICT, PRODUCTGRP

How the query flows through the stages:
• Slice of table MTHLY_RX_TERR_DATA (compressed): decompressed by the FPGA
• select DISTRICT, PRODUCTGRP, sum(NRX): only these columns are kept
• where MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'GASTRO': only matching rows are kept
• sum(NRX): computed by the CPU
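The division of labor on this slide can be sketched in Python. This is a toy model only: zlib stands in for Netezza's compression, the sample rows are invented, and the function names (`store_compressed`, `fpga_stage`, `cpu_stage`, `run_query`) are hypothetical. It illustrates the filter-early idea: decompress, keep only matching rows and needed columns, and aggregate only what is left.

```python
import json
import zlib
from collections import defaultdict

# Invented sample rows standing in for a slice of MTHLY_RX_TERR_DATA.
ROWS = [
    {"DISTRICT": "D1", "PRODUCTGRP": "P1", "NRX": 10, "MONTH": "20091201",
     "MARKET": 509123, "SPECIALTY": "GASTRO", "NOTES": "unused"},
    {"DISTRICT": "D1", "PRODUCTGRP": "P1", "NRX": 5, "MONTH": "20091201",
     "MARKET": 509123, "SPECIALTY": "GASTRO", "NOTES": "unused"},
    {"DISTRICT": "D2", "PRODUCTGRP": "P1", "NRX": 7, "MONTH": "20091201",
     "MARKET": 509123, "SPECIALTY": "CARDIO", "NOTES": "unused"},
    {"DISTRICT": "D2", "PRODUCTGRP": "P2", "NRX": 3, "MONTH": "20091101",
     "MARKET": 509123, "SPECIALTY": "GASTRO", "NOTES": "unused"},
]


def store_compressed(rows):
    """Compress a table slice (zlib is only a stand-in for the real format)."""
    return zlib.compress(json.dumps(rows).encode("utf-8"))


def fpga_stage(block, needed_columns, predicate):
    """Decompress, keep rows that pass the WHERE clause, drop unused columns."""
    rows = json.loads(zlib.decompress(block).decode("utf-8"))
    for row in rows:
        if predicate(row):
            yield {col: row[col] for col in needed_columns}


def cpu_stage(rows):
    """Aggregate the already-reduced stream: sum(NRX) per group."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["DISTRICT"], row["PRODUCTGRP"])] += row["NRX"]
    return dict(totals)


def run_query(block):
    filtered = fpga_stage(
        block,
        needed_columns=("DISTRICT", "PRODUCTGRP", "NRX"),
        predicate=lambda r: r["MONTH"] == "20091201"
        and r["MARKET"] == 509123
        and r["SPECIALTY"] == "GASTRO",
    )
    return cpu_stage(filtered)


if __name__ == "__main__":
    print(run_query(store_compressed(ROWS)))
```

Only two of the four sample rows reach `cpu_stage`, and each carries three columns instead of seven, which is the point of filtering before the general-purpose processor sees the data.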

Page 11: Netezza technicaloverviewportugues

Simplicity of the IBM Netezza Appliance (software)

• dbspace/tablespace: no sizing or configuration
• redo/physical/logical log: no sizing or configuration
• Table pages/blocks: no sizing or configuration
• Table extents: no sizing or configuration
• Temp space: no allocation or monitoring
• dbspaces: no RAID-level decisions
• Logical volumes: no file creation
• OS kernel: no changes
• OS kernel: no required patch levels
• No JAD sessions needed to configure host/network/storage
• No storage administration
• No indexes or tuning
• No software installation

Installation steps:
- connect electrical power
- run tests (8 h)
- hand the server over to the customer

DBAs become data managers instead of database administrators

Page 12: Netezza technicaloverviewportugues

Complexity versus IBM Netezza Simplicity

Creating a database:

0. CREATE DATABASE TEST
   LOGFILE 'E:\OraData\TEST\LOG1TEST.ORA' SIZE 2M,
           'E:\OraData\TEST\LOG2TEST.ORA' SIZE 2M,
           'E:\OraData\TEST\LOG3TEST.ORA' SIZE 2M,
           'E:\OraData\TEST\LOG4TEST.ORA' SIZE 2M,
           'E:\OraData\TEST\LOG5TEST.ORA' SIZE 2M
   EXTENT MANAGEMENT LOCAL
   MAXDATAFILES 100
   DATAFILE 'E:\OraData\TEST\SYS1TEST.ORA' SIZE 50M
   DEFAULT TEMPORARY TABLESPACE temp TEMPFILE 'E:\OraData\TEST\TEMP.ORA' SIZE 50M
   UNDO TABLESPACE undo DATAFILE 'E:\OraData\TEST\UNDO.ORA' SIZE 50M
   NOARCHIVELOG
   CHARACTER SET WE8ISO8859P1;

1. Oracle* table and indexes
2. Oracle tablespace
3. Oracle datafile
4. Veritas file
5. Veritas file system
6. Veritas striped logical volume
7. Veritas mirror/plex
8. Veritas sub-disk
9. SunOS raw device
10. Brocade SAN switch
11. EMC Symmetrix volume
12. EMC Symmetrix striped meta-volume
13. EMC Symmetrix hyper-volume
14. EMC Symmetrix remote volume (replication)
15. Days/weeks of planning meetings


IBM Netezza: ZERO parameters:

CREATE DATABASE my_db;

Page 13: Netezza technicaloverviewportugues

ORACLE

CREATE TABLE "MRDWDDM"."RDWF_DDM_ROOMS_SOLD" ("ID_PROPERTY" NUMBER(5,

0) NOT NULL ENABLE, "ID_DATE_STAY" NUMBER(5, 0) NOT NULL ENABLE,

"CD_ROOM_POOL" CHAR(4) NOT NULL ENABLE, "CD_RATE_PGM" CHAR(4) NOT

NULL ENABLE, "CD_RATE_TYPE" CHAR(1) NOT NULL ENABLE,

"CD_MARKET_SEGMENT" CHAR(2) NOT NULL ENABLE, "ID_CONFO_NUM_ORIG"

NUMBER(9, 0) NOT NULL ENABLE, "ID_CONFO_NUM_CUR" NUMBER(9, 0) NOT

NULL ENABLE, "ID_DATE_CREATE" NUMBER(5, 0) NOT NULL ENABLE,

"ID_DATE_ARRIVAL" NUMBER(5, 0) NOT NULL ENABLE, "ID_DATE_DEPART"

NUMBER(5, 0) NOT NULL ENABLE, "QY_ROOMS" NUMBER(5, 0) NOT NULL

ENABLE, "CU_REV_PROJ_NET_LOCAL" NUMBER(21, 3) NOT NULL ENABLE,

"CU_REV_PROJ_NET_USD" NUMBER(21, 3) NOT NULL ENABLE,

"QY_DAYS_STAY_CUR" NUMBER(3, 0) NOT NULL ENABLE, "CD_BOOK_SOURCE"

CHAR(1) NOT NULL ENABLE) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255

STORAGE( FREELISTS 6) TABLESPACE "DDM_ROOMS_SOLD_DATA" NOLOGGING

PARTITION BY RANGE ("ID_PROPERTY" ) (PARTITION "PART1" VALUES LESS

THAN (600) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255

STORAGE(INITIAL 16777216 FREELISTS 6 FREELIST GROUPS 1) TABLESPACE

"DDM_ROOMS_SOLD_DATA" NOLOGGING NOCOMPRESS, PARTITION "PART2" VALUES

LESS THAN (1200) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255

STORAGE(INITIAL 16777216 FREELISTS 6 FREELIST GROUPS 1) TABLESPACE

"DDM_ROOMS_SOLD_DATA" NOLOGGING NOCOMPRESS, PARTITION "PART3" VALUES

LESS THAN (1800) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255

STORAGE(INITIAL 16777216 FREELISTS 6 FREELIST GROUPS 1) TABLESPACE

"DDM_ROOMS_SOLD_DATA" NOLOGGING NOCOMPRESS, PARTITION "PART4" VALUES

LESS THAN (2400) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255

STORAGE(INITIAL 16777216 FREELISTS 6 FREELIST GROUPS 1) TABLESPACE

"DDM_ROOMS_SOLD_DATA" NOLOGGING NOCOMPRESS, PARTITION "PART5" VALUES

LESS THAN (3000) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255

STORAGE(INITIAL 16777216 FREELISTS 6 FREELIST GROUPS 1) TABLESPACE

"DDM_ROOMS_SOLD_DATA" NOLOGGING NOCOMPRESS, PARTITION "PART6" VALUES

LESS THAN (MAXVALUE) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255

STORAGE(INITIAL 16777216 FREELISTS 6 FREELIST GROUPS 1) TABLESPACE

"DDM_ROOMS_SOLD_DATA" NOLOGGING NOCOMPRESS ) ;

ORACLE Indexes

CREATE INDEX "MRDWDDM"."RDWF_DDM_ROOMS_SOLD_IDX1" ON "RDWF_DDM_ROOMS_SOLD"

("ID_PROPERTY" , "ID_DATE_STAY" , "CD_ROOM_POOL" , "CD_RATE_PGM" ,

"CD_RATE_TYPE" , "CD_MARKET_SEGMENT" ) PCTFREE 10 INITRANS 6 MAXTRANS 255

STORAGE( FREELISTS 10) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING

PARALLEL ( DEGREE 4 INSTANCES 1) LOCAL(PARTITION "PART1" PCTFREE 10

INITRANS 6 MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840 MINEXTENTS 1

MAXEXTENTS 100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS 1 BUFFER_POOL

DEFAULT) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING, PARTITION "PART2"

PCTFREE 10 INITRANS 6 MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840

MINEXTENTS 1 MAXEXTENTS 100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS

1 BUFFER_POOL DEFAULT) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING,

PARTITION "PART3" PCTFREE 10 INITRANS 6 MAXTRANS 255 STORAGE(INITIAL

4194304 NEXT 4259840 MINEXTENTS 1 MAXEXTENTS 100000 PCTINCREASE 0

FREELISTS 10 FREELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE

"DDM_DATAMART_INDEX_L" NOLOGGING, PARTITION "PART4" PCTFREE 10 INITRANS 6

MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840 MINEXTENTS 1 MAXEXTENTS

100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)

TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING, PARTITION "PART5" PCTFREE 10

INITRANS 6 MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840 MINEXTENTS 1

MAXEXTENTS 100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS 1 BUFFER_POOL

DEFAULT) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING, PARTITION "PART6"

PCTFREE 10 INITRANS 6 MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840

MINEXTENTS 1 MAXEXTENTS 100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS

1 BUFFER_POOL DEFAULT) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING ) ;

ORACLE Bitmap index

CREATE BITMAP INDEX "CRDBO"."SNAPSHOT_MONTH_IDX13" ON

"SNAPSHOT_OPPTY_MONTH_HIST" ("SNAPSHOT_YEAR" ) PCTFREE 10 INITRANS 2

MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4194304 MINEXTENTS 2 MAXEXTENTS

2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL

DEFAULT) TABLESPACE "SFA_DATAMART_INDEX" NOLOGGING ;

ORACLE Table Clusters

CREATE CLUSTER "MRDW"."CT_INTRMDRY_CAL" ("ID_YEAR_CAL" NUMBER(4, 0),

"ID_MONTH_CAL" NUMBER(2, 0), "ID_PROPERTY" NUMBER(5, 0)) SIZE 16384

PCTFREE 10 PCTUSED 90 INITRANS 3 MAXTRANS 255 STORAGE(INITIAL

83886080 NEXT 41943040 MINEXTENTS 1 MAXEXTENTS 1017 PCTINCREASE 0

FREELISTS 4 FREELIST GROUPS 1 BUFFER_POOL RECYCLE) TABLESPACE

"TSS_FACT" ;

Netezza

CREATE TABLE MRDWDDM.RDWF_DDM_ROOMS_SOLD (

ID_PROPERTY numeric(5, 0) NOT NULL ,

ID_DATE_STAY integer NOT NULL ,

CD_ROOM_POOL CHAR(4) NOT NULL ,

CD_RATE_PGM CHAR(4) NOT NULL ,

CD_RATE_TYPE CHAR(1) NOT NULL ,

CD_MARKET_SEGMENT CHAR(2) NOT NULL ,

ID_CONFO_NUM_ORIG integer NOT NULL ,

ID_CONFO_NUM_CUR integer NOT NULL ,

ID_DATE_CREATE integer NOT NULL ,

ID_DATE_ARRIVAL integer NOT NULL ,

ID_DATE_DEPART integer NOT NULL ,

QY_ROOMS integer NOT NULL ,

CU_REV_PROJ_NET_LOCAL numeric(21, 3) NOT NULL ,

CU_REV_PROJ_NET_USD numeric(21, 3) NOT NULL ,

QY_DAYS_STAY_CUR smallint NOT NULL ,

CD_BOOK_SOURCE CHAR(1) NOT NULL)

distribute on random;

• No indexes

• No administration or tuning

• Distribute the data randomly or by columns

Netezza simplicity: creating a table
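The choice between distributing randomly and distributing by column can be illustrated with a small Python sketch. Everything here is an assumption for illustration: round-robin stands in for `distribute on random`, CRC32 stands in for whatever hash function the appliance actually uses, and the function names and the eight data slices are hypothetical.

```python
import zlib
from collections import Counter
from itertools import cycle


def distribute_random(rows, n_slices):
    """Round-robin stand-in for DISTRIBUTE ON RANDOM: even spread, whatever the values."""
    counts = Counter()
    for _, slice_id in zip(rows, cycle(range(n_slices))):
        counts[slice_id] += 1
    return counts


def distribute_on_column(rows, column, n_slices):
    """Hash stand-in for DISTRIBUTE ON (column): equal values land on the same slice."""
    counts = Counter()
    for row in rows:
        key = str(row[column]).encode("utf-8")
        counts[zlib.crc32(key) % n_slices] += 1
    return counts


if __name__ == "__main__":
    # Invented rows: a unique confirmation number and a two-valued rate type.
    rows = [
        {"ID_CONFO_NUM_ORIG": i, "CD_RATE_TYPE": "A" if i % 2 else "B"}
        for i in range(1000)
    ]
    print("random:           ", sorted(distribute_random(rows, 8).items()))
    print("by unique column: ", sorted(distribute_on_column(rows, "ID_CONFO_NUM_ORIG", 8).items()))
    print("by 2-value column:", sorted(distribute_on_column(rows, "CD_RATE_TYPE", 8).items()))
```

A column with only two distinct values can occupy at most two slices however many are available, which is why the choice of distribution column matters.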

Page 14: Netezza technicaloverviewportugues

Traditional Complexity versus Netezza Simplicity (RDBMS 101)

Oracle:

CREATE TABLE EDW_PROD.EDW_RESPD_EXPSR_MIN_FACT

(

RPT_PERIOD_DIM_ID NUMBER NOT NULL,

SRVY_WEEK_DIM_ID NUMBER NOT NULL,

DATE_DIM_ID NUMBER NOT NULL,

SRVC_MKT_SEG_DIM_ID NUMBER NOT NULL,

RESPD_HHLD_DIM_ID NUMBER NOT NULL,

MDOTLT_DIM_ID NUMBER NOT NULL,

LSTN_LOC_DIM_ID NUMBER NOT NULL,

EXPSR_MIN_CNT NUMBER NOT NULL,

RESPD_WGHT_NMBR NUMBER,

PRELIM_DAILY_WGHT_NMBR NUMBER,

FINAL_DAILY_WGHT_NMBR NUMBER,

TIMESHIFT_SECOND_CNT NUMBER,

BGN_EXPSR_UTC_TS DATE,

END_EXPSR_UTC_TS DATE,

BGN_EXPSR_LOCAL_TS DATE,

END_EXPSR_LOCAL_TS DATE,

BGN_BCST_UTC_TS DATE,

END_BCST_UTC_TS DATE,

BGN_BCST_LOCAL_TS DATE,

END_BCST_LOCAL_TS DATE,

SOURCE_ID VARCHAR2(50 BYTE),

ACTIVE_IND CHAR(1 BYTE) DEFAULT 'Y' NOT NULL,

INSERT_TS DATE NOT NULL,

UPDATE_TS DATE NOT NULL,

METADATA_ID NUMBER,

MEDIA_CODE VARCHAR2(10 BYTE),

MDOTLT_HIER_DIM_ID NUMBER,

OUT_OF_MKT_IND CHAR(1 BYTE)

)

Netezza:

CREATE TABLE EDW_PROD.EDW_RESPD_EXPSR_MIN_FACT

(

RPT_PERIOD_DIM_ID INTEGER NOT NULL,

SRVY_WEEK_DIM_ID INTEGER NOT NULL,

DATE_DIM_ID INTEGER NOT NULL,

SRVC_MKT_SEG_DIM_ID INTEGER NOT NULL,

RESPD_HHLD_DIM_ID INTEGER NOT NULL,

MDOTLT_DIM_ID INTEGER NOT NULL,

LSTN_LOC_DIM_ID INTEGER NOT NULL,

EXPSR_MIN_CNT NUMERIC(9,2) NOT NULL,

RESPD_WGHT_NMBR NUMERIC(9,2),

PRELIM_DAILY_WGHT_NMBR NUMERIC(9,2),

FINAL_DAILY_WGHT_NMBR NUMERIC(9,2),

TIMESHIFT_SECOND_CNT INTEGER,

BGN_EXPSR_UTC_TS TIMESTAMP,

END_EXPSR_UTC_TS TIMESTAMP,

BGN_EXPSR_LOCAL_TS TIMESTAMP,

END_EXPSR_LOCAL_TS TIMESTAMP,

BGN_BCST_UTC_TS TIMESTAMP,

END_BCST_UTC_TS TIMESTAMP,

BGN_BCST_LOCAL_TS TIMESTAMP,

END_BCST_LOCAL_TS TIMESTAMP,

SOURCE_ID VARCHAR(50),

ACTIVE_IND CHAR(1) DEFAULT 'Y' NOT NULL,

INSERT_TS TIMESTAMP NOT NULL,

UPDATE_TS TIMESTAMP NOT NULL,

METADATA_ID INTEGER,

MEDIA_CODE VARCHAR(10),

MDOTLT_HIER_DIM_ID INTEGER,

OUT_OF_MKT_IND CHAR(1)

) distribute on random;

Oracle, continued: 516 BASE TABLE PARTITIONS…

TABLESPACE AT_EDW_REXMIN

PCTUSED 0

PCTFREE 10

INITRANS 1

MAXTRANS 255

LOGGING

PARTITION BY RANGE (RPT_PERIOD_DIM_ID)

(

PARTITION RP0000 VALUES LESS THAN (0)

NOLOGGING

NOCOMPRESS

TABLESPACE AT_EDW_REXMIN

PCTFREE 10

INITRANS 1

MAXTRANS 255

STORAGE (

INITIAL 96K

NEXT 96K

MINEXTENTS 1

MAXEXTENTS UNLIMITED

PCTINCREASE 0

BUFFER_POOL DEFAULT

),

PARTITION RP0001 VALUES LESS THAN (2)

NOLOGGING

NOCOMPRESS

TABLESPACE AT_EDW_REXMIN

PCTFREE 10

INITRANS 1

MAXTRANS 255

STORAGE (

INITIAL 96K

NEXT 96K

MINEXTENTS 1

MAXEXTENTS UNLIMITED

PCTINCREASE 0

BUFFER_POOL DEFAULT

),

PARTITION RP0002 VALUES LESS THAN (3)

NOLOGGING

NOCOMPRESS

TABLESPACE AT_EDW_REXMIN

PCTFREE 10

INITRANS 1

MAXTRANS 255

STORAGE (

INITIAL 96K

NEXT 96K

MINEXTENTS 1

MAXEXTENTS UNLIMITED

PCTINCREASE 0

BUFFER_POOL DEFAULT ), …

… PLUS DDL FOR 513 MORE PARTITIONS

Index REXMIN_SOURCE_ID_I on 515 PARTITIONS…

CREATE INDEX EDW_PROD.REXMIN_SOURCE_ID_I ON EDW_PROD.EDW_RESPD_EXPSR_MIN_FACT

(SOURCE_ID)

TABLESPACE AI_EDW_REXMIN

INITRANS 2

MAXTRANS 255

LOGGING

LOCAL (

PARTITION RP0000

NOLOGGING

NOCOMPRESS

TABLESPACE AI_EDW_REXMIN

PCTFREE 10

INITRANS 2

MAXTRANS 255

STORAGE (

INITIAL 96K

NEXT 96K

MINEXTENTS 1

MAXEXTENTS UNLIMITED

PCTINCREASE 0

BUFFER_POOL DEFAULT

),

PARTITION RP0001

NOLOGGING

NOCOMPRESS

TABLESPACE AI_EDW_REXMIN

PCTFREE 10

INITRANS 2

MAXTRANS 255

STORAGE (

INITIAL 96K

NEXT 96K

MINEXTENTS 1

MAXEXTENTS UNLIMITED

PCTINCREASE 0

BUFFER_POOL DEFAULT

),

PARTITION RP0002

NOLOGGING

TABLESPACE AI_EDW_REXMIN

PCTFREE 10

INITRANS 2

MAXTRANS 255

STORAGE (

INITIAL 96K

NEXT 96K

MINEXTENTS 1

MAXEXTENTS UNLIMITED

PCTINCREASE 0

BUFFER_POOL DEFAULT ), …

… PLUS DDL FOR 512 MORE PARTITIONS

Index REXMIN_LLOC_FK_BI on 515 PARTITIONS…

CREATE BITMAP INDEX EDW_PROD.REXMIN_LLOC_FK_BI ON EDW_PROD.EDW_RESPD_EXPSR_MIN_FACT

(LSTN_LOC_DIM_ID)

TABLESPACE AI_EDW_REXMIN

INITRANS 2

MAXTRANS 255

LOGGING

LOCAL (

PARTITION RP0000

NOLOGGING

TABLESPACE AI_EDW_REXMIN

PCTFREE 10

INITRANS 2

MAXTRANS 255

STORAGE (

INITIAL 96K

NEXT 96K

MINEXTENTS 1

MAXEXTENTS UNLIMITED

PCTINCREASE 0

BUFFER_POOL DEFAULT

),

PARTITION RP0001

NOLOGGING

TABLESPACE AI_EDW_REXMIN

PCTFREE 10

INITRANS 2

MAXTRANS 255

STORAGE (

INITIAL 96K

NEXT 96K

MINEXTENTS 1

MAXEXTENTS UNLIMITED

PCTINCREASE 0

BUFFER_POOL DEFAULT

), …

… PLUS DDL FOR 513 MORE PARTITIONS

Index REXMIN_REHH_FK_BI on 515 PARTITIONS…

CREATE BITMAP INDEX EDW_PROD.REXMIN_REHH_FK_BI ON EDW_PROD.EDW_RESPD_EXPSR_MIN_FACT

(RESPD_HHLD_DIM_ID)

TABLESPACE AI_EDW_REXMIN

INITRANS 2

MAXTRANS 255

LOGGING

LOCAL (

PARTITION RP0000

NOLOGGING

TABLESPACE AI_EDW_REXMIN

PCTFREE 10

INITRANS 2

MAXTRANS 255

STORAGE (

INITIAL 96K

NEXT 96K

MINEXTENTS 1

MAXEXTENTS UNLIMITED

PCTINCREASE 0

BUFFER_POOL DEFAULT

),

PARTITION RP0001

NOLOGGING

TABLESPACE AI_EDW_REXMIN

PCTFREE 10

INITRANS 2

MAXTRANS 255

STORAGE (

INITIAL 96K

NEXT 96K

MINEXTENTS 1

MAXEXTENTS UNLIMITED

PCTINCREASE 0

BUFFER_POOL DEFAULT

), …

… PLUS DDL FOR 513 MORE PARTITIONS

Index REXMIN_SMS_FK_BI on 515 PARTITIONS…

CREATE BITMAP INDEX EDW_PROD.REXMIN_SMS_FK_BI ON EDW_PROD.EDW_RESPD_EXPSR_MIN_FACT

(SRVC_MKT_SEG_DIM_ID)

TABLESPACE AI_EDW_REXMIN

INITRANS 2

MAXTRANS 255

LOGGING

LOCAL (

PARTITION RP0000

NOLOGGING

TABLESPACE AI_EDW_REXMIN

PCTFREE 10

INITRANS 2

MAXTRANS 255

STORAGE (

INITIAL 96K

NEXT 96K

MINEXTENTS 1

MAXEXTENTS UNLIMITED

PCTINCREASE 0

BUFFER_POOL DEFAULT

), …

… PLUS DDL FOR 514 MORE PARTITIONS

Index REXMIN_SRWK_FK_BI on 515 PARTITIONS…

CREATE BITMAP INDEX EDW_PROD.REXMIN_SRWK_FK_BI ON EDW_PROD.EDW_RESPD_EXPSR_MIN_FACT

(SRVY_WEEK_DIM_ID)

TABLESPACE AI_EDW_REXMIN

INITRANS 2

MAXTRANS 255

LOGGING

LOCAL (

PARTITION RP0000

NOLOGGING

TABLESPACE AI_EDW_REXMIN

PCTFREE 10

INITRANS 2

MAXTRANS 255

STORAGE (

INITIAL 96K

NEXT 96K

MINEXTENTS 1

MAXEXTENTS UNLIMITED

PCTINCREASE 0

BUFFER_POOL DEFAULT

), …

… PLUS DDL FOR 514 MORE PARTITIONS

Index REXMIN_RP_FK_BI on 515 PARTITIONS…

CREATE BITMAP INDEX EDW_PROD.REXMIN_RP_FK_BI ON EDW_PROD.EDW_RESPD_EXPSR_MIN_FACT

(RPT_PERIOD_DIM_ID)

TABLESPACE AI_EDW_REXMIN

INITRANS 2

MAXTRANS 255

LOGGING

LOCAL ( …

… PLUS DDL FOR 515 PARTITIONS

Index REXMIN_DATE_FK_BI on 515 PARTITIONS…

CREATE BITMAP INDEX EDW_PROD.REXMIN_DATE_FK_BI ON EDW_PROD.EDW_RESPD_EXPSR_MIN_FACT

(DATE_DIM_ID)

TABLESPACE AI_EDW_REXMIN

INITRANS 2

MAXTRANS 255

LOGGING

LOCAL ( …

… PLUS DDL FOR 515 PARTITIONS

Index REXMIN_MEDO_FK_BI on 515 PARTITIONS…

CREATE BITMAP INDEX EDW_PROD.REXMIN_MEDO_FK_BI ON EDW_PROD.EDW_RESPD_EXPSR_MIN_FACT

(MDOTLT_DIM_ID)…

… PLUS DDL FOR TABLESPACE + 515 PARTITIONS

Oracle: 34,500 KB of DDL

Netezza: 250 KB of DDL

Page 15: Netezza technicaloverviewportugues

Comparison of network requirements (internal and external)

Exadata (full rack)                                          | TwinFin 12 (full rack)
22 IP addresses for the InfiniBand network                   | -
68 IP addresses for Ethernet (for a single cluster)          | 5 IP addresses
10 network drops minimum (with 50+ reported as typical)      | 4 network drops
Total: 90 IP addresses and at least 10 network drops         | Total: 5 IP addresses and 4 network drops

Page 16: Netezza technicaloverviewportugues

Monitoring data distribution with NzAdmin

• A poor distribution.

• The user chose the wrong column(s) for distributing the data.

• Note: In this case, the user chose the first column of the table as the distribution column. An incorrect decision.

Page 17: Netezza technicaloverviewportugues

A good distribution: 2.2 trillion records

Page 18: Netezza technicaloverviewportugues

Monitoring: homogeneous distribution of data across the system

Skew analysis relative to the system

The utilization load should be equivalent across the SPUs
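One way to put a number on "equivalent load across the SPUs" is to compare the busiest SPU with the average. The sketch below is an illustration only: the function name, the metric, and the row counts are assumptions, not the calculation NzAdmin performs.

```python
def skew_report(rows_per_spu):
    """Summarize how evenly rows are spread across SPUs.

    A skew ratio of 1.0 means a perfectly even spread. Because a parallel
    scan finishes only when the slowest SPU finishes, a ratio of 2.0
    suggests the scan takes roughly twice as long as it would if balanced.
    """
    if not rows_per_spu:
        raise ValueError("need at least one SPU")
    mean = sum(rows_per_spu) / len(rows_per_spu)
    largest = max(rows_per_spu)
    return {
        "mean": mean,
        "max": largest,
        "skew_ratio": largest / mean if mean else 1.0,
    }


if __name__ == "__main__":
    # Invented row counts for a four-SPU system.
    print("balanced:", skew_report([250, 250, 250, 250]))
    print("skewed:  ", skew_report([700, 100, 100, 100]))
```

The two sample systems hold the same 1,000 rows, but in the skewed one a single SPU does 70% of the work while the other three sit mostly idle.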

Page 19: Netezza technicaloverviewportugues

Backup and Restore

Integration and certification with market-leading tools:
– Simplifies integration with the main backup and restore tools
– Support for the X/Open Backup Services API (XBSA)
– IBM Tivoli Storage Manager (TSM) certification
– Symantec Veritas NetBackup™ certification

Incremental backup and restore:
– Significantly shorter backup times compared with a full backup
– Available in the nzbackup utility
– Full or partial restores

Example weekly schedule:
Sun: Full | Mon: Differential | Tue: Differential | Wed: Cumulative | Thu: Differential | Fri: Differential | Sat: Differential
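A weekly schedule like this one determines which backup sets a restore needs. The Python sketch below works that out under two stated assumptions: a differential backup holds the changes since the previous backup of any type, and a cumulative backup holds all changes since the last full backup. The schedule is the one read off the slide's graphic, and `restore_chain` is a hypothetical helper, not part of nzbackup or nzrestore.

```python
# Weekly schedule as read from the slide (assumed mapping of days to types).
SCHEDULE = [
    ("Sun", "full"),
    ("Mon", "differential"),
    ("Tue", "differential"),
    ("Wed", "cumulative"),
    ("Thu", "differential"),
    ("Fri", "differential"),
    ("Sat", "differential"),
]


def restore_chain(schedule, target_day):
    """Return the backups to apply, oldest first, to restore to target_day."""
    days = [day for day, _ in schedule]
    end = days.index(target_day)  # raises ValueError for an unknown day
    chain = []
    skipping_to_full = False
    # Walk backwards from the target day until a full backup is reached.
    for day, kind in reversed(schedule[: end + 1]):
        if kind == "full":
            chain.append((day, kind))
            break
        if skipping_to_full:
            # Already covered by a later cumulative backup.
            continue
        chain.append((day, kind))
        if kind == "cumulative":
            skipping_to_full = True
    else:
        raise ValueError("no full backup at or before " + target_day)
    return list(reversed(chain))


if __name__ == "__main__":
    for day in ("Tue", "Fri"):
        print(day, "->", restore_chain(SCHEDULE, day))
```

Under these assumptions, the mid-week cumulative backup shortens the chain: a Friday restore needs four backup sets instead of six.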

Page 20: Netezza technicaloverviewportugues

The IBM Netezza TwinFin™ - Expansion

When expanding:

- a complete new system is shipped

- data is migrated ONLINE

- IPs are redirected

- the original server is shut down and returned

Page 21: Netezza technicaloverviewportugues

i-Class: Analytics Without Constraints

Big Data
• Analyze wider and deeper data
  > Additional dimensions
  > Richer history

Big Math
• Increase computational intensity
  > More complex models
  > Faster execution for results

Page 22: Netezza technicaloverviewportugues

Advanced Analytics with TwinFin i-Class

[Diagram labels: SAS, SPSS; R, S+; SQL; Fraud Detection; Demand Forecasting]

Page 23: Netezza technicaloverviewportugues

Simple to Install and Operate

Operations
• Simply load and go… it's an appliance!
• Installation in ~2 days!
• Easy to evaluate, and works as advertised!

BI Developers & DBAs: more agile
• No configuration or physical modeling
• No indexes or tuning: immediate performance
• Data-model agnostic
• Data architects and DBAs focus on the business, not on physical modeling

ETL Developers
• No aggregate tables needed: simplified ETL logic
• Faster loads and transformations

Business Analysts
• "Train of thought" analysis: 10 to 100x faster
• Ad hoc queries: no tuning, no indexes
• Complex queries over large datasets
• Lower latency: simultaneous loads and queries
• OnStream processing across hundreds of nodes

Page 24: Netezza technicaloverviewportugues

A family of appliances for the entire management lifecycle:

Appliance | Purpose                                   | Capacity
Skimmer   | Development and test systems              | 1 TB to 10 TB
TwinFin   | High-performance analytic data warehouse  | 1 TB to 1.5 PB
Cruiser   | SQL-accessible archiving, backup / DR     | 100 TB to 10 PB

Page 25: Netezza technicaloverviewportugues

Speed

“…when something took 24 hours I could only do so much with it, but when something takes 10 seconds, I may be able to completely rethink the business process…”

- SVP Application Development, Nielsen

• 15,000 users running 800,000+ queries per day
• 50X faster than before

Source: http://www.youtube.com/watch?v=yOwnX14nLrE&feature=player_embedded

Page 26: Netezza technicaloverviewportugues

Simplicity

“Allowing the business users access to the Netezza box was what sold it.”

- Steve Taff, Executive Dir. of IT Services

• 200X faster than Oracle system
• ROI in less than 3 months
• Up and running 6 months before having any training

[Graphic labels: DAYS, WEEKS, MONTHS]

Page 27: Netezza technicaloverviewportugues


“NYSE … has replaced an Oracle IO relational database with a data warehousing appliance from Netezza, allowing it to conduct rapid searches of 650 terabytes of data.”

ComputerWeekly.com

Source: http://www.computerweekly.com/Articles/2008/04/14/230265/NYSE-improves-data-management-with-datawarehousing.htm

Scalability

1 PB on Netezza

7 years of historical data

100-200% annual data growth

Page 28: Netezza technicaloverviewportugues

Smart

“Because of (Netezza’s) in-database technology, we believe we'll be able to do 600 predictive models per year (10X as many as before) with the same staff.”

- Eric Williams, CIO and executive VP

• Coupon redemption rates as high as 25%
• Predicts what shoppers are likely to buy in future visits

Page 29: Netezza technicaloverviewportugues

Everyone promises, but... we prove it!

• We prove that we are simple
• We prove that we deliver performance
• We prove it inside your environment
• We prove that we integrate with your tools
• We prove that we are "easy to do business with"
• We prove that we have the lowest TCO
• We prove business value

Page 30: Netezza technicaloverviewportugues

PoC success rate: 86%

One of “The five most important M&A Deals of 2010”

- Wall Street Journal

Page 31: Netezza technicaloverviewportugues

• Digital Media
• Financial Services
• Government
• Health & Life Sciences
• Retail / Consumer Products
• Telecom
• Other

Page 32: Netezza technicaloverviewportugues

Thank you!

(backup slides)

Page 33: Netezza technicaloverviewportugues

Oracle Exadata versus Netezza TwinFin: Netezza’s Competitive Advantage

Architecture
  Oracle Exadata:
  – Two layers:
    • Clustered SMP DB layer (RAC)
    • Shared-disk MPP storage layer
  Results in: Compromised performance
  Netezza TwinFin:
  – True MPP with FPGA acceleration of processing in each MPP node
  – Best architecture for DW and advanced analytics due to minimization of contention/bottlenecks

Speed
  Oracle Exadata:
  – Tuned for OLTP (e.g. FlashCache)
  – RAC unfit for DW workloads
  Results in: Poor DW performance
  Netezza TwinFin:
  – Appliance tuned for DW and advanced analytics
  – Highest DW performance
  – Operational simplicity

Simplicity
  Oracle Exadata:
  – Complexity of Oracle Real Application Clusters (RAC)
  – Constant tuning for performance
  Results in: Complex administration
  Netezza TwinFin:
  – True appliance with HW/SW created to provide high performance for DW
  – No tuning
  – More time spent delivering business value rather than tuning for acceptable performance

Smart
  Oracle Exadata:
  – Very limited push-down of analytics
  – RAC bottleneck for analytic performance
  Results in: Poor analytic performance
  Netezza TwinFin:
  – Push-down of many diverse analytics (SAS, R, Gnu, etc.) through i-Class
  – Ability to accelerate the analytics used by many prospects

Costs
  Oracle Exadata:
  – Acquisition cost can exceed $7M per rack
    • Hardware: $1M
    • Software: more than $6M!
  – High maintenance and software subscription
  – Continuing high admin costs
  Results in: High total cost of ownership
  Netezza TwinFin:
  – Low, transparent initial cost
  – Simple install requires no additional professional services
  – Standard maintenance includes HW/SW support and SW upgrades
  – Easily understood, predictable costs
  – Minimal “extra” services, so easier to budget for Netezza

Page 34: Netezza technicaloverviewportugues

Analysis Summary: Oracle Exadata Database Machine

Exadata is limited in the processing it does. It won’t handle:
– Complex joins
– Distinct aggregation
– Analytical functions

Most work is still done on the Oracle database server:
– Lots of data movement
– Loss of performance

Oracle says Exadata can do OLTP or DW or both at the same time:
– Vastly different workloads requiring vastly different tuning
– Netezza customers report that Exadata is poor at DW and analytics

Page 35: Netezza technicaloverviewportugues

Query Throughput ≠ Scan Rate

• Oracle Exadata throws together very fast hardware and hopes it produces fast results.

• Exadata offers very fast scan rates, but that just means it can get data off the disks quickly.

• Overall query throughput also relies on the speed of all the other components, including the software.

• Oracle Exadata can be very fast for simple queries but gets slower with increasing complexity.

• Netezza is designed for balance: it works fast for all query types.

Page 36: Netezza technicaloverviewportugues

Netezza’s Advantages over Oracle

• Oracle RAC is still Oracle RAC. It is still:
  – Complex: needs to be tuned
  – Temperamental: needs retuning for different configurations
  – Difficult: needs specialized skills and constant maintenance

• Netezza is much easier. With hardware and software optimized for data warehouse applications, there is:
  – No need for labor-intensive tuning
  – No requirement for partitioning, indexing or building cubes

• Database Machine is a resource hog:
  – For a full-rack Oracle Exadata Database Machine, you will need to supply at least 90 IP addresses (22 IP addresses for the InfiniBand network, 68 IP addresses for Ethernet, assuming a single cluster), and a minimum of 10 network drops (with 50+ reported as being typical).
  – In contrast, a Netezza TwinFin-12 requires 5 IP addresses and 4 network drops. The core Netezza theme of simplicity is reflected in installation as in operation.

Page 37: Netezza technicaloverviewportugues

TwinFin™ 24 Specification

• 16 (8*2) disk enclosures
• 192 (96*2) 1 TB SAS drives (8 hot spares)
• RAID 1 mirroring

• 24 Netezza S-Blades:
  • 192 cores (Intel Quad-Core 2.5 GHz)
  • 192 FPGAs (125 MHz)
  • 384 GB DDR2 RAM (1+ TB compressed)
  • Linux 64-bit kernel

• 2 Hosts (Active-Passive):
  • 24 cores (Quad-Core Intel 2.6 GHz)
  • 96 GB memory
  • 4 x 146 GB SAS drives
  • Red Hat Linux 5 64-bit
  • 10G internal network

• User data capacity: 250 TB
• Data scan speed: 290 TB/hr
• Load speed (per system): 2.0 TB/hr

• Power per rack: 7,400 watts
• Cooling per rack: 25,500 BTU/hour

Page 38: Netezza technicaloverviewportugues

Compress Engine in Action

On Data Load
• Rows separated into columnar streams
• Each stream independently compiled
• Field instructions applied to block headers
• Compressed data maintains row-based structure

On Data Scan/Query
• FPGA executes field instructions to decompile at wire speed
• Data re-assembled into rows for processing by the other FAST Engines

Diagram steps:
1. Burst rows into column streams
2. Compile independent streams
3. Apply field instructions
4. Compressed storage retains all structural properties of row-wise uncompressed storage
5. Execute field instructions to recover full-sized values
6. Reassemble values to recover full-sized, uncompressed rows and pass them on to the remaining FAST engines
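The load and scan steps above can be sketched in Python as a round trip. This is a simplified illustration: run-length encoding is a stand-in chosen for brevity, not the encoding the Compress Engine actually uses, and all function names and sample rows are hypothetical.

```python
from itertools import groupby


def burst(rows):
    """Split row tuples into one stream per column."""
    return [list(column) for column in zip(*rows)]


def rle_encode(stream):
    """Encode one column stream as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(stream)]


def rle_decode(encoded):
    """Recover the full-sized values of one column stream."""
    return [value for value, count in encoded for _ in range(count)]


def compress(rows):
    """On load: burst rows into columns and encode each stream independently."""
    return [rle_encode(stream) for stream in burst(rows)]


def decompress(columns):
    """On scan: decode each stream and reassemble the original rows."""
    return [tuple(values) for values in zip(*(rle_decode(c) for c in columns))]


if __name__ == "__main__":
    # Invented rows: (MONTH, MARKET, SPECIALTY).
    rows = [
        ("20091201", 509123, "GASTRO"),
        ("20091201", 509123, "GASTRO"),
        ("20091201", 509124, "CARDIO"),
    ]
    packed = compress(rows)
    print(packed)
    print(decompress(packed) == rows)
```

Encoding each column separately pays off because values within a column tend to repeat, while reassembly hands the rest of the pipeline ordinary rows.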

Page 39: Netezza technicaloverviewportugues

Default Workload Management: Short Query Bias

Short Query Bias (SQB)
– Short queries prioritized ahead of longer-running queries
– Real-time responses to users performing short queries
– Invaluable feature for large mixed-workload environments

[Illustration labels: "8 Items or Less", "Full Carts Here"]
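The express-lane idea behind Short Query Bias can be sketched as a queue that lets short queries go first. This is a toy model under stated assumptions: the `Query` class, the `apply_short_query_bias` function, the two-second threshold, and the cost estimates are all hypothetical, and the real feature manages system resources rather than simply reordering a list.

```python
from dataclasses import dataclass


@dataclass
class Query:
    name: str
    estimated_seconds: float


def apply_short_query_bias(queue, short_threshold_seconds=2.0):
    """Move queries estimated at or under the threshold ahead of longer ones.

    Arrival order is preserved within each group, like shoppers keeping
    their place inside the express lane and the full-cart lanes.
    """
    short = [q for q in queue if q.estimated_seconds <= short_threshold_seconds]
    long_running = [q for q in queue if q.estimated_seconds > short_threshold_seconds]
    return short + long_running


if __name__ == "__main__":
    queue = [
        Query("monthly_rollup", 900.0),
        Query("lookup", 0.3),
        Query("full_scan", 120.0),
        Query("dashboard", 1.5),
    ]
    for query in apply_short_query_bias(queue):
        print(query.name, query.estimated_seconds)
```

In this model the interactive lookup and dashboard queries no longer wait behind the 15-minute rollup that arrived first.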

Page 40: Netezza technicaloverviewportugues

GRA Test: Fidelity to User Settings

[Chart: vertical axis from 0 to 60; series rsg1_actual_%, rsg2_actual_%, rsg3_actual_%]