20
Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

Embed Size (px)

DESCRIPTION

Indexes Types of Indexes supported by Teradata  Unique Primary Index  Non-Unique Primary Index  Unique Secondary Index  Non-Unique Secondary Index

Citation preview

Page 1: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

Last Updated : 27th April 2004

Center of ExcellenceData Warehousing Group

Teradata Performance Optimization

Page 2: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

Performance Optimization Tools and Techniques

Performance Optimization Techniques– Indexes Choices– Macros & Views– Reducing Row Size – Compression– Optimization Through SQL Tuning– Join Indexes– Hash Indexes– Partitioned Primary Index– Priority Scheduler– Expanding your Teradata configuration.

Query Analysis Tools– Explain/Visual Explain Facility– Query Capture Facility– Target Level Emulation– Teradata System Emulation Tool– Teradata Index Wizard– Teradata Statistics Wizard

System Monitoring – Resource Usage– Teradata Manager

Page 3: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

IndexesTypes of Indexes supported by Teradata

Unique Primary IndexNon-Unique Primary IndexUnique Secondary IndexNon-Unique Secondary Index

Page 4: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

Unique/Non-Unique IndexesA Unique Index identifies one & only one

data rowA Non-Unique Index identifies one or many

data rows

Page 5: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

Criteria For Index SelectionPrimary Index * The column (or column set) chosen should be the

set selection most frequently used to select rows from the table and should be unique (UPI) or close to unique (NUPI).

Cont...

Page 6: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

Criteria For Index SelectionSecondary Index

* Secondary indexes always have an associated subtable.

* The column (or column set) chosen should be a frequently used set selection.

* Large Table/Small table Joins : Joins that involve three or more small tables and one large

table are called Large Table/Small Table (LT/ST) joins. The optimizer first joins the small tables together and

then joins that result with the large table (also called as Product/Merge Join).

To work well, the join fields of the small table must comprise an index of the large table. The join fields do not have to be indexes in the small table.

Collect statistics for all tables on their indexes used in a query.

Page 7: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

Value Ordered NUSIsNUSI rows are stores row hash order

which is very efficient for the queries with equality condition on the SI column(s).

Queries with inequality conditions on the SI column(s) typically does full table scan of SI subtable.

Value ordered NUSI rows are sorted by data value. It is possible to search only a portion of the index subtable for a given range of key values.

Value-ordered NUSIs are very efficient for range queries.

Page 8: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

Partitioned Primary IndexPerformance benefit from data locality is

achieved by creating local partitions of data rows within each virtual AMP.

Following performance benefits can be achieved by PPILarge performance gain for range queries. Improves load time.Deleting entire partition is very fast. PI access are not affected.

Page 9: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

Explain FacilityAllows you to experiment with different

approaches to an answer, then select the one that performs best.

Provides information about the relative time the query would take to execute.

Page 10: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

Identifying Columns to IndexRun EXPLAINs on typical queries with and

without indexes defined on various columns to determine which performs best.

Run HELP INDEX tablename statements to produce information helpful for interpreting the EXPLAIN statements you run.

Cont….

Page 11: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

Identifying Columns to IndexRun COLLECT STATISTICS on the tables to

be indexed to provide data for assessing the cost/benefit balance afforded by indexes.

Page 12: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

MacrosTeradata macros are SQL statements that

are stored on the server and executed there.

Macros are particularly useful for improving performance.

Page 13: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

ViewsViews provide a means for application

programmers to develop and test SQL statements that are highly optimized.

These views can then be provided to users who can use them without worrying about tying up system resources needlessly.

Well written macros provide the same facility.

Page 14: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

Compressing ColumnsMost frequent value of a column can be

compressed to reduce the row size. Tables with large numbers of rows and

fields with limited numbers of unique values are very good candidates for compression.

CPU cost overhead for compression processing is minimal.

Page 15: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

Optimization Through SQL Tuning

Correlated Sub queries Correlated sub queries are faster than use of a

temporary table as Correlated sub queries fully integrated with the global join plan to minimize the cost.

Case Expression CASE expressions help increase performance because

they return multiple results in a single pass over the data.

SELECT item_number, item_description,item_price as “Current//Price”,CASEWHEN item_season = ‘summer' and item_count < 3THEN item_price *(1-.50)WHEN item_season = ‘summer’ and item_count >= 3THEN item_price *(1-.25)WHEN item_season = ‘spring’ THEN item_price * (1-.33)ELSE NULL END AS “Sale//Price”FROM inventory_tableWHERE item_season in (‘spring’ or ‘summer’);

Cont….

Page 16: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

Optimization Through SQL Tuning

Ordered Analytical Functions Using Ordered Analytical Functions, you can perform

data analysis within the database engine itself taking full advantage of Teradata parallel architecture.

CALENDAR view for date related operations.Optimized Empty table INSERT/SELECT.

INSERT INTO Summary_Table SELECT * FROM Store1;

INSERT INTO Summary_TableSELECT * FROM Store2;

INSERT INTO Summary_TableSELECT * FROM Store3;

INSERT INTO Summary_Table SELECT * FROM Store1

; INSERT INTO Summary_Table SELECT * FROM Store2

;INSERT INTO Summary_TableSELECT * FROM Store3;

Page 17: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

Join IndexesA Join Index is a data structure that stores

and maintains join results. Frequently executed joins can be stored in

Join Indexes to improve performance.Join Indexes are maintained automatically.Join Index -

Stores pre-joins without de-normalizing the database.

Stores summary data without de-normalizing the database.

Replicates the tables on the same AMP for join.Sparse Index - Can be used to index a

portion of a table.

Page 18: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

Hash IndexesHash indexes create a full or partial

replication of a base table with a primary index on a foreign key column table to facilitate joins of very large tables by hashing them to the same AMP.

Eliminate data distribution for join processing.

Page 19: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

Priority SchedulingCan be used to control resources allocated

to users. Administrator can specify performance

group while creating the user.It manages resource distribution to

improve performance of one application at the expense of other.

Resource Partition }

RHM

L

Weight 40 Weight

20Weight 10Weight

5

Performance GroupAllocation Group

Page 20: Last Updated : 27 th April 2004 Center of Excellence Data Warehousing Group Teradata Performance Optimization

Additional SMP/MPP NodesYou can increase the performance of a

Teradata RDBMS by adding SMP nodes to your system. Performance increases at a nearly linear rate with the addition of SMP nodes to the configuration.

Each MPP system is certified to support as many as 512 nodes. More nodes can be added on a custom basis to improve Performance.