jeff-moss
Partitioning and Compression for Datawarehouses.
2. Who Dunnit ?
3. Agenda
4. My Background
5. What Is Data Segment Compression ?
6. Where Can Data Segment Compression Be Used ?
7. How Does Segment Compression Work ?

Database block structures:
  Block Common Header (20 bytes)
  Transaction Header (24 bytes fixed + 24 bytes per ITL)
  Data Header (14 bytes)
  Compressed Data Header (16 bytes - variable)
  Table Directory (8 bytes)
  Row Directory (2 bytes per row)
  Symbol Table
  Row Data Area
  Tail (4 bytes)

Example rows:
  ID   DESCRIPTION                  CONTACT TYPE  OUTCOME  FOLLOWUP
  100  Call to discuss bill amount  TEL           NO       YES
  101  Call to discuss new product  MAIL          NO       N/A
  102  Call to discuss new product  TEL           YES      N/A

Symbol Table:
   1  100
   2  Call to discuss bill amount
   3  TEL
   4  NO
   5  YES
   6  101
   7  Call to discuss new product
   8  MAIL
   9  N/A
  10  102

Row Data Area (symbol references):
  Row 100:   1  2  3  4  5
  Row 101:   6  7  8  4  9
  Row 102:  10  7  3  5  9

8. What Affects Compression ?
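The slide 7 example can be read as a per-block dictionary encoding, which is also why repeated values matter for the factors that follow. Below is a minimal Python sketch, assuming every distinct value gets a symbol numbered in order of first appearance. The function name compress_block is made up for illustration, and the sketch mirrors the slide's diagram rather than the actual on-disk format.

```python
# Rows from the slide 7 example: ID, DESCRIPTION, CONTACT TYPE, OUTCOME, FOLLOWUP.
ROWS = [
    ("100", "Call to discuss bill amount", "TEL", "NO", "YES"),
    ("101", "Call to discuss new product", "MAIL", "NO", "N/A"),
    ("102", "Call to discuss new product", "TEL", "YES", "N/A"),
]


def compress_block(rows):
    """Replace each column value with a reference into a per-block symbol table.

    Assumption: every distinct value gets a symbol, numbered from 1 in order
    of first appearance, as drawn on the slide.
    """
    symbol_ids = {}    # value -> symbol number
    encoded_rows = []  # one list of symbol numbers per row
    for row in rows:
        encoded = []
        for value in row:
            if value not in symbol_ids:
                symbol_ids[value] = len(symbol_ids) + 1
            encoded.append(symbol_ids[value])
        encoded_rows.append(encoded)
    return symbol_ids, encoded_rows


symbol_table, row_data = compress_block(ROWS)

for value, number in symbol_table.items():
    print(number, value)
for row, encoded in zip(ROWS, row_data):
    print("Row", row[0], "->", encoded)
```

Each repeated value ("NO", "TEL", "N/A", the second description) is stored once in the symbol table and referenced from every row that uses it.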
9. Compression v Block Size
10. Compression v ITL
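As a rough illustration of how block size and ITL count interact, the structure sizes listed on slide 7 can be added up. This is a back-of-the-envelope sketch only: it treats the Compressed Data Header as a fixed 16 bytes (the slide marks it as variable), ignores PCTFREE, and the 8192-byte block size, ITL counts and row count are assumed values chosen for the example. The function names are hypothetical.

```python
# Structure sizes in bytes, as listed on slide 7.
BLOCK_COMMON_HEADER = 20
TRANSACTION_HEADER_FIXED = 24
TRANSACTION_HEADER_PER_ITL = 24
DATA_HEADER = 14
COMPRESSED_DATA_HEADER = 16  # assumption: fixed here, variable in reality
TABLE_DIRECTORY = 8
ROW_DIRECTORY_PER_ROW = 2
TAIL = 4


def block_overhead(itl_slots, rows):
    """Bytes used by block structures for a given ITL count and row count."""
    return (
        BLOCK_COMMON_HEADER
        + TRANSACTION_HEADER_FIXED
        + TRANSACTION_HEADER_PER_ITL * itl_slots
        + DATA_HEADER
        + COMPRESSED_DATA_HEADER
        + TABLE_DIRECTORY
        + ROW_DIRECTORY_PER_ROW * rows
        + TAIL
    )


def space_for_data(block_size, itl_slots, rows):
    """Bytes left for the symbol table and row data area."""
    return block_size - block_overhead(itl_slots, rows)


# Assumed example values: 8 KB block, 100 rows, 2 versus 10 ITL slots.
for itl_slots in (2, 10):
    print(itl_slots, "ITL slots:", space_for_data(8192, itl_slots, 100), "bytes for data")
```

Under these assumptions each extra ITL slot costs 24 bytes per block, so the relative cost is larger in small blocks than in large ones.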
11. Compression v Number Of Columns
12. Compression v PCTFREE
13. Compression v NDV
14. Compression v Column Length
15. Compression v Ordering
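Why ordering matters can be shown with a toy model: if each block keeps its own symbol table, the number of symbol entries a block needs equals the number of distinct values that land in it. The sketch below is not Oracle's algorithm; the function name symbol_entries and the figure of 4 rows per block are assumptions made for the example.

```python
def symbol_entries(values, rows_per_block):
    """Total symbol-table entries needed when values are split into blocks.

    Toy model: one symbol entry per distinct value per block.
    """
    total = 0
    for start in range(0, len(values), rows_per_block):
        block = values[start:start + rows_per_block]
        total += len(set(block))
    return total


# The same 20 values in two different orders.
uniform = [1, 2, 3, 4, 5] * 4   # 1 2 3 4 5 1 2 3 4 5 ...
colocated = sorted(uniform)     # 1 1 1 1 2 2 2 2 ...

print("uniformly distributed:", symbol_entries(uniform, 4))
print("colocated:", symbol_entries(colocated, 4))
```

In this model the colocated ordering needs one entry per block, while the uniformly distributed ordering needs a separate entry for almost every row.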
Uniformly distributed: 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5
Colocated:             1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 5 5 5 5
16. Get Max Compression Order Package
Running mgmt_p_get_max_compress_order...
----------------------------------------------------------------------------------------------------
Table: BIG_TABLE
Sample Size: 10000
Unique Run ID: 25012006232119
ORDER BY Prefix:
----------------------------------------------------------------------------------------------------
Creating MASTER Table: TEMP_MASTER_25012006232119
Creating COLUMN Table 1: COL1
Creating COLUMN Table 2: COL2
Creating COLUMN Table 3: COL3
----------------------------------------------------------------------------------------------------
The output below lists each column in the table and the number of blocks/rows and space
used when the table data is ordered by only that column, or in the case where a prefix
has been specified, where the table data is ordered by the prefix and then that column.
From this one can determine if there is a specific ORDER BY which can be applied to the
data in order to maximise compression within the table whilst, in the case of a prefix
being present, ordering data as efficiently as possible for the most common access path(s).
----------------------------------------------------------------------------------------------------
NAME                           COLUMN                               BLOCKS         ROWS SPACE_GB
============================== ============================== ============ ============ ========
TEMP_COL_001_25012006232119    COL1                                    290        10000    .0022
TEMP_COL_002_25012006232119    COL2                                    345        10000    .0026
TEMP_COL_003_25012006232119    COL3                                    555        10000    .0042
17. Pros & Cons
18. Pros & Cons
19. Data Warehousing Specifics
1 - Table Compression in Oracle 9iR2: A Performance Analysis
20. Things To Watch Out For
21. A Funny Thing
Thanks to Julian Dyke for the block dumping information: http://www.juliandyke.com
22. What Is Partitioning ?
23. Partition To Tablespace Mapping
[Diagram: monthly partitions P_JAN_2005 through P_DEC_2005 and P_JAN_2006 through P_MAR_2006, mapped to quarterly tablespaces T_Q1_2005, T_Q2_2005, T_Q3_2005, T_Q4_2005 and T_Q1_2006; tablespaces are marked Read / Write or Read Only]
24. Read Only Tablespaces
[Diagram labels: Partition, Tablespace]
25. Why Partition ? - Performance
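The performance argument rests on partition pruning: a date-range query only needs to read the partitions whose date range overlaps the predicate. The sketch below models that idea in Python under the assumption of one partition per calendar month, named in the P_MON_YYYY style used elsewhere in the deck. The function names monthly_partitions and prune are hypothetical, and this is an illustration of the concept, not of how the optimizer implements it.

```python
import calendar
from datetime import date

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def monthly_partitions(year):
    """Map partition name -> (first day, last day) for each month of a year."""
    partitions = {}
    for month in range(1, 13):
        last_day = calendar.monthrange(year, month)[1]
        name = f"P_{MONTHS[month - 1]}_{year}"
        partitions[name] = (date(year, month, 1), date(year, month, last_day))
    return partitions


def prune(partitions, range_start, range_end):
    """Return the partitions whose date range overlaps the query range."""
    return [
        name
        for name, (first, last) in partitions.items()
        if first <= range_end and last >= range_start
    ]


# Query range from the slide: 01-JAN-2005 to 30-JUN-2005.
needed = prune(monthly_partitions(2005), date(2005, 1, 1), date(2005, 6, 30))
print(needed)
```

For the half-year range on the slide, six of the twelve monthly partitions qualify and the other six are never touched.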
SELECT SUM(sales)
FROM   part_tab
WHERE  sales_date BETWEEN '01-JAN-2005' AND '30-JUN-2005'

[Diagram: Sales Fact Table, partitions JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC]
* Oracle 10gR2 Data Warehousing Manual
26. Why Partition ? - Manageability
27. Why Partition ? - Scalability
28. Why Partition ? - Availability
[Diagram: same partition to tablespace mapping as slide 23, with tablespaces marked Read / Write or Read Only]
29. Fact Table Partitioning
Partition key candidates: Transaction Date, Load Date
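The difference between the two partition keys can be seen by bucketing the slide's nine example rows by month, once on transaction date and once on load date. This is a small Python sketch for illustration; the function name partition_by is hypothetical and the rows are taken from the slide.

```python
from collections import defaultdict
from datetime import date

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

# (transaction date, customer, load date), as shown on the slide.
ROWS = [
    (date(2005, 1, 7), "Customer 1", date(2005, 1, 9)),
    (date(2005, 1, 15), "Customer 2", date(2005, 1, 17)),
    (date(2005, 1, 21), "Customer 7", date(2005, 4, 4)),
    (date(2005, 1, 22), "Customer 3", date(2005, 2, 1)),
    (date(2005, 2, 2), "Customer 4", date(2005, 2, 5)),
    (date(2005, 2, 26), "Customer 5", date(2005, 2, 28)),
    (date(2005, 3, 6), "Customer 2", date(2005, 3, 7)),
    (date(2005, 3, 12), "Customer 3", date(2005, 3, 15)),
    (date(2005, 4, 9), "Customer 9", date(2005, 4, 10)),
]


def partition_by(rows, key_index):
    """Group customers into monthly partitions using the chosen date column."""
    partitions = defaultdict(list)
    for row in rows:
        month_name = MONTHS[row[key_index].month - 1]
        partitions[month_name].append(row[1])
    return dict(partitions)


by_tran_date = partition_by(ROWS, 0)
by_load_date = partition_by(ROWS, 2)

print("By transaction date:", by_tran_date)
print("By load date:", by_load_date)
```

The late-arriving row (Customer 7, transacted 21-JAN-2005 but loaded 04-APR-2005) is the one to watch: keyed on transaction date it goes back into the January partition, while keyed on load date it lands in April alongside the rest of that load.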
Partitioned by Load Date:
  Partition           Tran Date    Customer    Load Date
  January Partition   07-JAN-2005  Customer 1  09-JAN-2005
                      15-JAN-2005  Customer 2  17-JAN-2005
  February Partition  22-JAN-2005  Customer 3  01-FEB-2005
                      02-FEB-2005  Customer 4  05-FEB-2005
                      26-FEB-2005  Customer 5  28-FEB-2005
  March Partition     06-MAR-2005  Customer 2  07-MAR-2005
                      12-MAR-2005  Customer 3  15-MAR-2005
  April Partition     21-JAN-2005  Customer 7  04-APR-2005
                      09-APR-2005  Customer 9  10-APR-2005

Partitioned by Transaction Date:
  Partition           Tran Date    Customer    Load Date
  January Partition   07-JAN-2005  Customer 1  09-JAN-2005
                      15-JAN-2005  Customer 2  17-JAN-2005
                      21-JAN-2005  Customer 7  04-APR-2005
                      22-JAN-2005  Customer 3  01-FEB-2005
  February Partition  02-FEB-2005  Customer 4  05-FEB-2005
                      26-FEB-2005  Customer 5  28-FEB-2005
  March Partition     06-MAR-2005  Customer 2  07-MAR-2005
                      12-MAR-2005  Customer 3  15-MAR-2005
  April Partition     09-APR-2005  Customer 9  10-APR-2005

30. Watch out for
Jonathan Lewis: Cost-Based Oracle Fundamentals, Chapter 2
31. Partitioning Feature: Characteristic Reason Matrix
Feature: Partition Truncation, Exchange Partition, Archiving, Pruning (Partition Elimination), Partition wise joins, Parallel DML, Local Indexes, Read Only Partitions
Characteristic: Availability, Scalability, Manageability, Performance
32. Questions ?
33. References: Papers
34. References: Online Presentation / Code