Informix partitioning interval_rolling_window_table


DESCRIPTION

Informix interval partitioning


  • 1. Deep dive into interval partitioning & rolling window table in IBM Informix. Keshava Murthy, IBM Informix Development

2. Partitioning 101; Interval partitioning; Rolling window table partitioning

3. Partitioning 101
What? The ability to partition a table or index into multiple physical partitions. Applications see a single schema or table. Underneath, the table or index is organized into multiple partitions; query processing and tools understand this and combine the partitions to provide a single view of the table. E.g. the UNION (ALL) of the states makes up the UNITED STATES of AMERICA.
Why? Capacity, parallelism, query performance (parallelism, partition elimination), time cyclic data management, faster statistics collection, multi-temperature data storage.
How?
  - CREATE TABLE with PARTITION (FRAGMENT) BY clause
  - ALTER FRAGMENT ON TABLE ... INIT
  - CREATE INDEX on a partitioned table
  - CREATE INDEX explicitly with PARTITION clause
Query processing and more:
  - Scans all the fragments to complete the scan
  - Parallelization
  - Partition elimination during scan and join
  - Parallelized index builds

4. [Diagram of the storage hierarchy; labels: DBSPACE, CHUNK, Extent, Pages, Partition]

5. Tables, Indices and Partitions
[Diagram; labels: Customer_table, idx_cust_id, Storesales_table, Idx_store_id, Partition]

6.
    CREATE TABLE customer_p (id int, lname varchar(32))
    FRAGMENT BY ROUND ROBIN
        PARTITION part1 IN dbs1,
        PARTITION part2 IN dbs1,
        PARTITION part3 IN dbs2;

    CREATE TABLE customer_p (id int, state varchar(32))
    FRAGMENT BY EXPRESSION
        PARTITION part1 (state = "CA") IN dbs1,
        PARTITION part2 (state = "KS") IN dbs1,
        PARTITION part3 (state = "OR") IN dbs1,
        PARTITION part4 (state = "NV") IN dbs1;

    CREATE TABLE customer (id int, state char(2), zipcode decimal(5,0))
    FRAGMENT BY EXPRESSION
        PARTITION partca93 (state = "CA" and zipcode 93000) IN dbs2,
        PARTITION partks (state = "KS") IN dbs3,
        PARTITION partor (state = "OR") IN dbs1,
        PARTITION part4 (state = "NV") IN dbs1;

7.
Top IDS features utilized for building warehouse
  • Multi-threaded Dynamic Scalable Architecture (DSA)
      - Scalability and performance
      - Optimal usage of hardware and OS resources
  • DSS parameters to optimize memory
      - DSS queries
      - Efficient hash joins
  • Parallel Data Query for parallel operations
      - Light scans, extensive calculations, sorts, multiple joins
      - Ideal for DSS queries and batch operations
  • Data compression
  • Time cyclic data mgmt
      - Fragment elimination, fragment attach and detach
  • Data/index distribution schemas
      - Improve large data volume manageability
      - Increase performance by maximizing I/O throughput
  • Configurable page size
      - On disk and in memory
      - Additional performance gains
  • Large chunks support
      - Allows IDS instances to handle large volumes
  • Quick sequential scans
      - Essential for table scans common to DSS environments

8. Top IDS features utilized for building warehouse (repeat of slide 7): Fragmentation Features

9. List fragmentation
    CREATE TABLE customer
        (id SERIAL, fname CHAR(32), lname CHAR(32), state CHAR(2), phone CHAR(12))
    FRAGMENT BY LIST (state)
        PARTITION p0 VALUES ("KS", "IL", "IN") IN dbs0,
        PARTITION p1 VALUES ("CA", "OR", "NV") IN dbs1,
        PARTITION p2 VALUES ("NY", "MN") IN dbs2,
        PARTITION p3 VALUES (NULL) IN dbs3,
        PARTITION p4 REMAINDER IN dbs3;

10.
Open Loops with Partitioning
As of 11.50:
  1. UPDATE STATISTICS on a large fragmented table takes a long time
  2. Need to explicitly create new partitions for each new range of data
  3. Need database and application downtime to manage the application

11. Smarter Statistics Collection

12. Smarter UPDATE STATISTICS
  • Statistics collection by partition
      - Distinct histograms for each partition
      - All the histograms are combined
      - Each data partition has a UDI counter
      - Subsequently, only recollect modified partitions and update the global histogram
  • Smarter statistics
      - Only recollect if 10% of the data has changed
      - Automatic statistics during attach, detach

13. UPDATE STATISTICS during ATTACH, DETACH
  • Automatically kick off an update statistics refresh in the background; fragment level statistics need to be enabled
  • Tasks eliminated by interval fragmentation
      - Running update statistics manually after ALTER operations
  • Time taken to collect statistics is reduced as well

14. Fragment Level Statistics (FLS)
  • Generate and store column distribution at fragment level
  • Fragment level stats are combined to form the column distribution
  • System monitors UDI (Update/Delete/Insert) activity on each fragment
  • Stats are refreshed only for frequently updated fragments
  • Fragment level distribution is used to re-calculate the column distribution
  • No need to re-generate stats across the entire table

15. Generating Table Level Statistics
  • Distribution is created for the entire column dataset from all fragments.
  • Stored in sysdistrib with the (tabid, colno) combination.
  • The dbschema utility can decode and display the encoded distribution.
  • Optimizer uses the in-memory distribution representation for query optimization.
[Diagram; labels: Frag 1, Frag 2, Frag n; Feed Column Data; SORT; Feed Sorted Data; Bin Generator & Encoder; Store Encoded Distribution; Sysdistrib Catalog table; Decode Distribution; Data Distribution Cache]

16.
Generating Fragment Level Statistics
[Diagram; labels: Frag 1, Frag 2, Frag n; Feed Column Data; SORT; Feed Sorted Data; Mini-Bin Generator & Encoder; Store Encoded Minibins; Sysfragdist Catalog Table; Feed decode Minibins; Mini-Bin Merger & Bin Encoder; Store Encoded Distribution; Sysdistrib Catalog Table; Decode Distribution; Data Distribution Cache]

17. STATLEVEL property
  • STATLEVEL defines the granularity or level of statistics created for the table.
  • Can be set using CREATE TABLE or ALTER TABLE.
  • STATLEVEL [TABLE | FRAGMENT | AUTO] are the allowed values.
      - TABLE: the entire table dataset is read and table level statistics are stored in the sysdistrib catalog.
      - FRAGMENT: the dataset of each fragment is read and fragment level statistics are stored in the new sysfragdist catalog. This option is only allowed for fragmented tables.
      - AUTO: when UPDATE STATISTICS is run, the system determines whether TABLE or FRAGMENT level statistics should be created.

18. UPDATE STATISTICS extensions
  • UPDATE STATISTICS [AUTO | FORCE];
  • UPDATE STATISTICS HIGH FOR TABLE [AUTO | FORCE];
  • UPDATE STATISTICS MEDIUM FOR TABLE tab1 SAMPLING SIZE 0.8 RESOLUTION 1.0 [AUTO | FORCE];
  • The mode specified in the UPDATE STATISTICS statement overrides the AUTO_STAT_MODE session setting. The session setting overrides the ONCONFIG AUTO_STAT_MODE parameter.

19. UPDATE STATISTICS extensions
  • New metadata columns nupdates, ndeletes and ninserts in sysdistrib and sysfragdist store the corresponding counter values from the partition page at the time of statistics generation. Subsequent UPDATE STATISTICS runs use these columns to evaluate whether statistics are stale or reusable.
  • Statistics evaluation is done at fragment level for tables with fragment level statistics and at table level for the rest.
  • Statistics created by MEDIUM or HIGH mode (column distributions) are evaluated.
  • LOW statistics are saved at the fragment level as well and are aggregated to collect global statistics.

20.
Alter Fragment Attach/Detach
  • Automatic background refreshing of column statistics after executing ALTER FRAGMENT ATTACH/DETACH on a table with fragmented statistics.
  • Refreshing of statistics begins after the ALTER has been committed.
  • For an ATTACH operation, fragmented statistics of the new fragment are built and table level statistics are rebuilt from all fragmented statistics. Any existing fragments with out-of-date column statistics are rebuilt at this time too.
  • For a DETACH operation, table level statistics of the resulting tables are rebuilt from the fragmented statistics.
  • The background task that refreshes statistics is refreshstats; it prints errors in online.log if any are encountered.

21. Design for Time Cyclic data mgmt
    create table mytrans (
        custid integer,
        proc_date date,
        store_loc char(12)
        ...
    ) fragment by expression
        ......
        (proc_date < DATE ("01/01/2009")) in fe_auth_log20081231,
        (MONTH(proc_date) = 1) in frag2009Jan,
        (MONTH(proc_date) = 2) in frag2009Feb,
        ...
        (MONTH(proc_date) = 10 and proc_date < DATE ("10/26/2009")) in frag2009Oct,
        (proc_date = DATE ("10/26/2009")) in frag20091026,
        (proc_date = DATE ("10/27/2009")) in frag20091027,
        (proc_date = DATE ("10/28/2009")) in frag20091027,
        (proc_date = DATE ("10/29/2009")) in frag20091027,
        (proc_date = DATE ("10/30/2009")) in frag20091027,
        (proc_date = DATE ("10/31/2009")) in frag20091027,
        (proc_date = DATE ("11/01/2009")) in frag20091027;

22.
                          Round Robin   List   Expression   Interval
    Parallelism           Yes           Yes    Yes          Yes
    Range Expression      No            Yes    Yes          Yes
    Equality Expression   No            Yes    Yes          Yes
    FLS                   Yes           Yes    Yes          Yes
    Smarter Stats         Yes           Yes    Yes          Yes
    ATTACH ONLINE         No            No     No           Yes
    DETACH ONLINE         No            No     No           Yes
    MODIFY ONLINE         No            No     No           Yes (MODIFY transition value)
    Create index ONLINE   Yes           Yes    Yes          Not yet
    Storage Provisioning  No            No     No           Yes

23.
Fragment elimination

    Type of filter         Nonoverlapping,        Overlapping on a       Nonoverlapping,
    (WHERE clause)         single fragment key    single column key      multiple column key
    Range expression       Can eliminate          Cannot eliminate       Cannot eliminate
    Equality expression    Can eliminate          Can eliminate          Can eliminate

24. New fragmentation strategies in Informix v11.70
  • List fragmentation
      - Similar to expression based fragmentation
      - Syntax compatibility
  • Interval fragmentation
      - Like expression, but policy based
      - Improves availability of the system

25. Time Cyclic Data management Time-cycl