

For the Complete Technology & Database Professional Q4-15

SELECTJournal

www.ioug.org

Precisely My Point: Leverage Attribute Clustering and Zone Mapping in Oracle Database 12.1.0.2

Securing an Oracle Database Environment with SELinux

DBA 101: Becoming an Oracle DBA

IOUG Press Corner: Boost Your Career Through Writing

Features:

Complete your goals, plan for challenges, reinvent yourself

Renewal

Precisely My Point: Leverage Attribute Clustering and Zone Mapping in Oracle Database 12.1.0.2

By Jim Czuprynski ◾ Eric Mader, Editor

In this issue’s article, I’ll review the new attribute clustering and zone mapping features and show how to leverage them with engineered systems features.

Oracle Database 12c’s latest release (12.1.0.2) offers several new features that provide speedier access to data that is stored on spinning disk. These new features grant an Oracle DBA the power to:

◾◾ Leverage attribute clustering to improve query performance when data is retrieved from HDD devices by storing a table’s data in a specific order, and therefore in closer proximity on disk, for faster access times.

◾◾ Implement zone mapping to improve the performance of physical reads when data must be retrieved from HDDs by capturing metadata about exactly where data is stored on disk, thus enabling pruning of physical I/O activity.

◾◾ Leverage both these feature sets in concert with engineered systems features as an alternative to creating multiple indexes on a table’s columnar data, thereby potentially also improving DML throughput during batch processing as well as OLTP transaction activity.

Data Warehousing, Star Schemas and the Lack of Ordered Data

When I speak with my Oracle DBA colleagues about their most vexing issues, I’ve found that, with rare exceptions (say, application workloads that demand extremely quick response time, like banking, financial and trading systems), they aren’t usually worried about the response time of their OLTP applications or even the tables loaded via batch processing. On the contrary, it’s their IT organization’s decision support systems (DSS) and data warehousing (DW) application workloads that continue to demand ever-faster response time.

Ongoing IT industry trends are increasing the demand for ever-faster queries as well:

◾◾ As big data and the Internet of Things (IoT) continue to penetrate almost every line of business, analytical queries are required to return meaningful results in almost real time about semi-structured or completely unstructured data.


◾◾ DSS/DW queries continue to demand more computing resources as the volume, complexity and multidimensionality of data expands. (Intel announced in September 2014 that 90 percent of all data that exists today was created in just the last two years.)

◾◾ The typical knee-jerk reaction to decreasing query response time — adding indexes to tables in search of faster searches, joins and ordered retrievals — has become untenable as the sheer volume of data grows.

◾◾ To add to the complexity of these issues, the number of dimensions that need to be analyzed, as well as the detailed data present in those dimensions, continues to expand.

Consider the typical star schema query in Listing 1, which attempts to identify all sales detail for customers located in a specific town in Wisconsin who also purchased products within a specific product category. Note that several attributes of the Products and Customers dimensions (not the join keys that link each dimension to its fact table) are used as selection criteria:

SELECT
     P.prod_category
    ,P.prod_subcategory
    ,C.cust_state_province
    ,C.cust_city
    ,SUM(S.quantity_sold)
    ,SUM(S.amount_sold)
  FROM
     sh.zm_sales S
    ,sh.zm_products P
    ,sh.zm_customers C
 WHERE S.prod_id = P.prod_id
   AND S.cust_id = C.cust_id
   AND C.cust_city = 'Bay City'
   AND C.cust_state_province = 'WI'
   AND P.prod_category = 'Software/Other'
 GROUP BY
     P.prod_category
    ,P.prod_subcategory
    ,C.cust_state_province
    ,C.cust_city
 ORDER BY
     P.prod_category
    ,P.prod_subcategory
    ,C.cust_state_province
    ,C.cust_city;

Listing 1: Typical Sales History Star Schema Query

In this case, the attributes of the Product dimension (SH.ZM_PRODUCTS) and Customer dimension (SH.ZM_CUSTOMERS) that are required for filtering only the required data are completely unavailable in the Sales fact table (SH.ZM_SALES). An unfortunate fact of most data stored within fact tables is that if the columnar data for the same Product and Customer are not stored closely together (or clustered) within the fact table, which is not an uncommon situation, then the query is quite likely to consume considerably more physical I/O when these data need to be retrieved from the fact table.

Of course, it’s quite unreasonable to expect the fact table’s data to be stored in the right order all the time for every query; the data could have been loaded in any order from any number of sources, and even if it’s loaded from a batch process or static data source, there’s no way to guarantee that it’s sorted the best way to satisfy this particular query. An Oracle DBA would probably respond to complaints of poor response time by creating a non-unique index on the column combinations used for this particular query. However, this solution cannot address a crucial fact: the index cannot capture the necessary data because it is only present within the attributes of the Product and Customer dimensions.

Attribute Clustering

Oracle Database 12.1.0.2 offers a potential solution to this performance issue through its new attribute clustering features. Simply put, attribute clustering tends to reduce the physical I/O performed against data segments by grouping a table’s row pieces into a more appropriate order on physical disk based on how these data are most likely to be queried:

◾◾ Searches for data within tables that employ attribute clustering will tend to be significantly faster because that data will be clustered in closer proximity on disk, so considerably less physical I/O will be needed to retrieve the same number of row pieces.

◾◾ If an index has been defined on the columns most likely to be used for selection criteria, that index’s clustering factor will be significantly lower than if the data was not ordered appropriately.

◾◾ Attribute clustering also permits ordering the row pieces within the fact table based on how the data is grouped together in corresponding dimension tables, or even based on the order in which rows will be joined between fact and dimension table(s).

◾◾ Data that’s been ordered via attribute clustering also tends to improve the data’s compressibility because similar groups of data values will be located closer to each other, perhaps even within the same database block or extent, and this usually has a positive impact on data that is typically accessed via full table scans. Because of their proximity, clustered data also tends to require fewer CPU cycles to compress.

◾◾ Finally, data that’s been grouped via attribute clustering can take advantage of in-memory min/max pruning as well as advanced Oracle engineered systems features such as Exadata storage indexes and zone maps, which we will explore in the second section of this article.

Linear Versus Interleaved Attribute Clustering

Attribute clustering offers two different methods of data clustering. The default method, LINEAR, ensures that data will be ordered within a table based on the order of the specified column(s) in the CLUSTERING clause for a table or table partition. Listing 2 illustrates the simplest form of linear attribute clustering for table SH.ZM_SALES, clustering on the column values for CUST_ID and PROD_ID:

ALTER TABLE sh.zm_sales DROP CLUSTERING;

ALTER TABLE sh.zm_sales ADD CLUSTERING BY LINEAR ORDER (cust_id, prod_id);

ALTER TABLE sh.zm_sales MOVE ALLOW CLUSTERING;

EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','ZM_SALES');

Listing 2: Applying LINEAR Attribute Clustering

It’s important to realize that until the ALTER TABLE … MOVE ALLOW CLUSTERING; command is issued, the data in SH.ZM_SALES will remain in its originally loaded order. Once the data has been clustered, however, a simple query retrieving the first 100 rows proves that the data in this table has indeed been physically ordered based on the specified clustering clause. Note that the ordering of the rows is evident from the progression of values for each row’s ROWID, as Listing 3 shows:

SQL> SELECT ROWID, cust_id, prod_id, quantity_sold, amount_sold
       FROM sh.zm_sales
      WHERE rownum < 101;

ROWID                 CUST_ID    PROD_ID QUANTITY_SOLD AMOUNT_SOLD
------------------ ---------- ---------- ------------- -----------
AAAW1dAAKAAAmISAAA          2         13             1     1232.16
AAAW1dAAKAAAmISAAB          2         14             1     1259.99
AAAW1dAAKAAAmISAAC          2         14             1     1259.99
AAAW1dAAKAAAmISAAD          2         14             1     1259.99
. . .
AAAW1dAAKAAAmISAAq          2        146             1       16.79
AAAW1dAAKAAAmISAAr          2        146             1       16.79
AAAW1dAAKAAAmISAAs          2        148             1       29.39
AAAW1dAAKAAAmISAAt          7         41             1       48.36
AAAW1dAAKAAAmISAAu          7         42             1       48.36
AAAW1dAAKAAAmISAAv          7         45             1       48.36
AAAW1dAAKAAAmISAAw          8         19             1       63.02
AAAW1dAAKAAAmISAAx          8         19             1       63.02
AAAW1dAAKAAAmISAAy          8         23             1       24.08
AAAW1dAAKAAAmISAAz          8         25             1      128.32
. . .
AAAW1dAAKAAAmISABD          8        140             1       42.58
AAAW1dAAKAAAmISABE          8        146             1       17.03
AAAW1dAAKAAAmISABF          8        148             1        29.8
AAAW1dAAKAAAmISABG          9         38             1       32.22
AAAW1dAAKAAAmISABH          9         40             1       46.79
AAAW1dAAKAAAmISABI          9         43             1       46.79
AAAW1dAAKAAAmISABJ          9         44             1       46.79
AAAW1dAAKAAAmISABK          9         45             1       46.79
. . .
AAAW1dAAKAAAmISABh          9        128             1       28.86
AAAW1dAAKAAAmISABi          9        128             1       28.86
AAAW1dAAKAAAmISABj          9        128             1       28.86

100 rows selected.

Listing 3: Proving LINEAR Attribute Clustering Implementation

The second type of clustering, INTERLEAVED, applies a special algorithm known as a Z-Order Curve (also known as Morton Order) to cluster related data into discrete groups based on the columns specified. Listing 4 shows how the INTERLEAVED clustering method is implemented for table SH.ZM_SALES, and Listing 5 shows the resulting row order.

ALTER TABLE sh.zm_sales DROP CLUSTERING;

ALTER TABLE sh.zm_sales
  ADD CLUSTERING BY INTERLEAVED ORDER (cust_id, prod_id)
  YES ON LOAD
  YES ON DATA MOVEMENT
  WITHOUT MATERIALIZED ZONEMAP;

ALTER TABLE sh.zm_sales MOVE ALLOW CLUSTERING;

EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','ZM_SALES');

Listing 4: Applying INTERLEAVED Attribute Clustering


SQL> SELECT ROWID, cust_id, prod_id, quantity_sold, amount_sold
       FROM sh.zm_sales
      WHERE rownum < 101;

ROWID                 CUST_ID    PROD_ID QUANTITY_SOLD AMOUNT_SOLD
------------------ ---------- ---------- ------------- -----------
AAAWeZAAKAAABiSAAA        987         13             1     1232.16
AAAWeZAAKAAABiSAAB       1660         13             1     1232.16
AAAWeZAAKAAABiSAAC       1762         13             1     1232.16
AAAWeZAAKAAABiSAAD       1843         13             1     1232.16
AAAWeZAAKAAABiSAAE       1948         13             1     1232.16
. . .
AAAWeZAAKAAABiSAAX        659         13             1     1232.16
AAAWeZAAKAAABiSAAY        848         13             1     1232.16
AAAWeZAAKAAABiSAAZ        949         13             1     1232.16
AAAWeZAAKAAABiSAAa       1242         13             1     1232.16
AAAWeZAAKAAABiSAAb       1291         13             1     1232.16
AAAWeZAAKAAABiSAAc       1422         13             1     1232.16
AAAWeZAAKAAABiSAAd       1485         13             1     1232.16
AAAWeZAAKAAABiSAAe       1580         13             1     1232.16
. . .
AAAWeZAAKAAABiSAA2      14457         13             1     1232.16
AAAWeZAAKAAABiSAA3      17011         13             1     1232.16
AAAWeZAAKAAABiSAA4      17566         13             1     1232.16
AAAWeZAAKAAABiSAA5      17633         13             1     1232.16
AAAWeZAAKAAABiSAA6          2         13             1     1232.16
. . .
AAAWeZAAKAAABiSABf      11453         13             1     1232.16
AAAWeZAAKAAABiSABg      12783         13             1     1232.16
AAAWeZAAKAAABiSABh      15826         13             1     1232.16
AAAWeZAAKAAABiSABi      26631         13             1     1232.16
AAAWeZAAKAAABiSABj        343         13             1     1237.31

Listing 5: INTERLEAVED Attribute Clustering: Ordered Results

While the order of these rows under INTERLEAVED clustering may appear to be distributed in an almost desperately random fashion when compared to LINEAR clustering, the rows are actually arranged according to an ingenious algorithm that allows a series of values to be located quickly and efficiently within several smaller “boxed sets” of rows. Even though the algorithm is almost 50 years old (it was first proposed by G. M. Morton in 1966), it has proven extremely effective for searches, especially when a B-Tree index structure is used to access these smaller boxed sets of rows.

Oracle 12.1.0.2 implements INTERLEAVED clustering using what is typically called the Z-Order Curve method, as illustrated in Figure 1. The Z-Order Curve method essentially passes over each “boxed set” of clustered values in a backward “Z” pattern using a series of bits to record the minimum and maximum values within each group of values. This “Z” pattern can be repeated virtually infinitely, but it requires a very small number of bits to record the clustered values.
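The bit interleaving behind a Z-Order Curve can be sketched in a few lines of Python. This is only an illustration of the general Morton ordering idea, not Oracle’s internal implementation; the function names `interleave_bits` and `z_order_sort` and the 16-bit width are assumptions made for this example.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Return the Morton (Z-order) code for two non-negative integers."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)       # bits of x land on even positions
        code |= ((y >> i) & 1) << (2 * i + 1)   # bits of y land on odd positions
    return code


def z_order_sort(rows):
    """Sort (cust_id, prod_id) pairs by their Morton code."""
    return sorted(rows, key=lambda r: interleave_bits(r[0], r[1]))


if __name__ == "__main__":
    # Walk a 4 x 4 grid of (x, y) values in Z order.
    grid = [(x, y) for x in range(4) for y in range(4)]
    for row in z_order_sort(grid):
        print(row, interleave_bits(*row))
```

Because neighboring codes share their high-order bits, rows that are close in both columns tend to end up near each other, which is why neither column dominates the ordering the way the leading column does under LINEAR clustering.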

An interesting analogy for the way that interleaved clustering works is the puzzle of the nine dots, in which the challenge is to draw a line through nine dots arranged in three rows of three dots each, using only four straight lines and without ever lifting pencil off paper. Interleaved clustering essentially attempts to do the same thing, but with (in theory) an unlimited number of data points in n-dimensional space instead of just a few dots on a piece of paper.

Attribute Clustering and Dimensional Joins

Another interesting feature of attribute clustering is the ability to arrange data within a fact table or fact table partition based not on how the data is sorted within the fact table, but instead how it is ordered within one or more corresponding dimensions. Consider Listing 6, which shows an implementation of a dimensional join between the ZM_SALES fact table and one of its corresponding dimensions, ZM_PRODUCTS:  

ALTER TABLE sh.zm_sales DROP CLUSTERING;

ALTER TABLE sh.zm_sales
  ADD CLUSTERING sh.zm_sales
  JOIN sh.zm_products ON (sh.zm_sales.prod_id = sh.zm_products.prod_id)
  BY LINEAR ORDER (
     sh.zm_products.prod_category
    ,sh.zm_products.prod_subcategory)
  YES ON LOAD
  YES ON DATA MOVEMENT
  WITHOUT MATERIALIZED ZONEMAP;

ALTER TABLE sh.zm_sales MOVE ALLOW CLUSTERING;

EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','ZM_SALES');

Listing 6: Single Dimensional Join

In this scenario, attribute clustering will use the order of the Product Category and Product Subcategory codes as they are stored in the SH.ZM_PRODUCTS dimension table. The key advantage of this clustering method is that it can essentially pre-group and pre-order rows in the fact table in the same way that one or more queries against the fact and dimension tables are likely to retrieve them.
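As a rough mental model of join-based clustering, the Python sketch below orders fact rows by attributes that live only in a dimension, looked up through the join key. It is a conceptual illustration rather than a description of how the database performs the work; the function name `cluster_by_dimension` and all sample values in `PRODUCTS` and `FACT_ROWS` are hypothetical.

```python
# Hypothetical dimension rows, keyed by the join column prod_id.
PRODUCTS = {
    13: {"prod_category": "Photo", "prod_subcategory": "Cameras"},
    14: {"prod_category": "Photo", "prod_subcategory": "Camcorders"},
    40: {"prod_category": "Software/Other", "prod_subcategory": "Accessories"},
}

# Hypothetical fact rows in their original load order.
FACT_ROWS = [
    {"cust_id": 2, "prod_id": 40},
    {"cust_id": 7, "prod_id": 13},
    {"cust_id": 2, "prod_id": 14},
    {"cust_id": 9, "prod_id": 13},
]


def cluster_by_dimension(fact_rows, products):
    """Order fact rows by (prod_category, prod_subcategory) found via prod_id."""
    def sort_key(row):
        dim = products[row["prod_id"]]
        return (dim["prod_category"], dim["prod_subcategory"])
    return sorted(fact_rows, key=sort_key)


if __name__ == "__main__":
    for row in cluster_by_dimension(FACT_ROWS, PRODUCTS):
        print(row)
```

The fact rows themselves never store the category or subcategory; the ordering is derived entirely from the dimension, which mirrors the intent of the JOIN clause in the clustering definition.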

Figure 1: Z-Order Curve Patterns for INTERLEAVED Attribute Clustering


The previous scenario only used one dimension for attribute clustering, but it is possible to use more than one dimension as well, as Listing 7 shows. In this scenario, the fact table’s rows will be ordered using the INTERLEAVED (Z-Order Curve) algorithm instead of the default LINEAR method.

ALTER TABLE sh.zm_sales DROP CLUSTERING;

ALTER TABLE sh.zm_sales
  ADD CLUSTERING sh.zm_sales
  JOIN sh.zm_products ON (sh.zm_sales.prod_id = sh.zm_products.prod_id)
  JOIN sh.zm_customers ON (sh.zm_sales.cust_id = sh.zm_customers.cust_id)
  BY INTERLEAVED ORDER (
     sh.zm_sales.cust_id
    ,sh.zm_sales.prod_id
    ,sh.zm_products.prod_category
    ,sh.zm_products.prod_subcategory)
  YES ON LOAD
  YES ON DATA MOVEMENT
  WITHOUT MATERIALIZED ZONEMAP;

ALTER TABLE sh.zm_sales MOVE ALLOW CLUSTERING;

EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','ZM_SALES');

Listing 7: Multi-Dimensional Join: Simple

Finally, Listing 8 shows one more attribute clustering scenario: the use of both the Customer and Product dimensions to order the fact table in LINEAR order based on four columns not present in the fact table at all.

ALTER TABLE sh.zm_sales DROP CLUSTERING;

ALTER TABLE sh.zm_sales
  ADD CLUSTERING sh.zm_sales
  JOIN sh.zm_products ON (sh.zm_sales.prod_id = sh.zm_products.prod_id)
  JOIN sh.zm_customers ON (sh.zm_sales.cust_id = sh.zm_customers.cust_id)
  BY LINEAR ORDER (
     sh.zm_customers.cust_state_province
    ,sh.zm_customers.cust_city
    ,sh.zm_products.prod_category
    ,sh.zm_products.prod_subcategory)
  YES ON LOAD
  YES ON DATA MOVEMENT
  WITHOUT MATERIALIZED ZONEMAP;

ALTER TABLE sh.zm_sales MOVE ALLOW CLUSTERING;

EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','ZM_SALES');

Listing 8: Multi-Dimensional Join: Complex

This attribute clustering method is particularly useful when queries against the fact table and these two dimensions typically need to group and order data within a hierarchy of values; in this case, both the State and City attributes of the Customer dimension as well as the major and minor Categories of the Product dimension.

Attribute Clustering Metadata

Oracle Database 12.1.0.2 also provides four new data dictionary views, summarized in Table 1, that contain the metadata about which attribute clustering methods have been applied to database objects.

Data Dictionary View       Contents
-------------------------  --------------------------------------------------
DBA_CLUSTERING_TABLES      Shows which tables have at least one attribute cluster
DBA_CLUSTERING_KEYS        Displays keys and values which control attribute clustering
DBA_CLUSTERING_DIMENSIONS  Lists which dimensions are used to control sorting and ordering within fact tables
DBA_CLUSTERING_JOINS       When joins between dimension and fact tables are employed, identifies the dimensions and columns used to define attribute clustering

Table 1: Attribute Clustering Metadata: Data Dictionary Views

The current state of attribute clustering for the sample tables we’re using for these demonstrations is illustrated in the following queries. Listing 9 shows a query against DBA_CLUSTERING_TABLES and the resulting output, which identifies the tables that are using attribute clustering.

SET LINESIZE 100
SET PAGESIZE 20000
COL owner      FORMAT A12 HEADING "Table|Owner"
COL table_name FORMAT A30 HEADING "Table Name"
COL cls_type   FORMAT A12 HEADING "Cluster|Type"
COL cls_onld   FORMAT A08 HEADING "Cluster|On Data|Loads?"
COL cls_ondm   FORMAT A08 HEADING "Cluster|On Data|Moves?"
COL wzmap      FORMAT A08 HEADING "With|Zone|Map?"
TTITLE "Clustering Attribute Metadata|(from DBA_CLUSTERING_TABLES)"
SELECT
     owner
    ,table_name
    ,clustering_type cls_type
    ,on_load cls_onld
    ,on_datamovement cls_ondm
    ,with_zonemap wzmap
  FROM dba_clustering_tables
 WHERE owner = 'SH'
 ORDER BY 1,2;
TTITLE OFF

Clustering Attribute Metadata
(from DBA_CLUSTERING_TABLES)

                                      Cluster  Cluster  With
Table                    Cluster      On Data  On Data  Zone
Owner  Table Name        Type         Loads?   Moves?   Map?
SH     ZM_SALES          LINEAR       YES      YES      NO

Listing 9: Tables Using Attribute Clustering

Listing 10 displays a query against DBA_CLUSTERING_KEYS and the resulting output that shows which columns implement the ordering of data within fact tables using attribute clustering:

SET LINESIZE 130
SET PAGESIZE 20000
COL owner       FORMAT A12 HEADING "Table|Owner"
COL table_name  FORMAT A30 HEADING "Table Name"
COL column_name FORMAT A30 HEADING "Column Name"
COL position    FORMAT 999 HEADING "Pos|#"
TTITLE "Clustering Columns Metadata|(from DBA_CLUSTERING_KEYS)"
SELECT
     detail_owner owner
    ,detail_name table_name
    ,detail_column column_name
    ,position
  FROM dba_clustering_keys
 WHERE owner = 'SH'
 ORDER BY 1,2,3;
TTITLE OFF

Clustering Columns Metadata
(from DBA_CLUSTERING_KEYS)

Table                                        Pos
Owner        Table Name                        # Column Name
------------ ------------------------------ ---- ------------------------------
SH           ZM_CUSTOMERS                      1 CUST_STATE_PROVINCE
SH           ZM_CUSTOMERS                      2 CUST_CITY
SH           ZM_PRODUCTS                       3 PROD_CATEGORY
SH           ZM_PRODUCTS                       4 PROD_SUBCATEGORY

Listing 10: Attribute Clustering Keys

The query against DBA_CLUSTERING_DIMENSIONS and its resulting output in Listing 11 shows which dimension tables are used to enforce attribute clustering on their corresponding fact tables.

SET LINESIZE 130
SET PAGESIZE 20000
COL owner      FORMAT A12 HEADING "Table|Owner"
COL table_name FORMAT A30 HEADING "Table Name"
COL dim_owner  FORMAT A12 HEADING "Dimension|Owner"
COL dim_name   FORMAT A30 HEADING "Dimension Name"
TTITLE "Clustering Dimensions|(from DBA_CLUSTERING_DIMENSIONS)"
SELECT
     owner
    ,table_name
    ,dimension_owner dim_owner
    ,dimension_name dim_name
  FROM dba_clustering_dimensions
 WHERE owner = 'SH'
 ORDER BY 1,2,3;
TTITLE OFF

Clustering Dimensions
(from DBA_CLUSTERING_DIMENSIONS)

Table                                       Dimension
Owner        Table Name                     Owner        Dimension Name
------------ ------------------------------ ------------ ------------------------------
SH           ZM_SALES                       SH           ZM_PRODUCTS
SH           ZM_SALES                       SH           ZM_CUSTOMERS

Listing 11: Attribute Clustering Dimensions

Finally, Listing 12 shows a query against DBA_CLUSTERING_JOINS and its resulting output, which identifies the joins between fact and dimension tables that are used to implement attribute clustering.

SET LINESIZE 130
SET PAGESIZE 20000
COL owner       FORMAT A12 HEADING "Table|Owner"
COL table_name  FORMAT A30 HEADING "Table Name"
COL column_name FORMAT A30 HEADING "Column Name"
TTITLE "Clustering Joins|(from DBA_CLUSTERING_JOINS)"
SELECT
     owner
    ,table_name
    ,tab1_owner
    ,tab1_name
    ,tab1_column
    ,tab2_owner
    ,tab2_name
    ,tab2_column
  FROM dba_clustering_joins
 WHERE tab1_owner = 'SH'
 ORDER BY 1,2,3;
TTITLE OFF

Clustering Joins
(from DBA_CLUSTERING_JOINS)

Join       Join        Join         Joined    Joined           Joined
Table      Table       Column       Table     Table            Column
Owner      Name        Name         Owner     Name             Name
---------- ----------- ------------ --------- ---------------- -------
SH         ZM_SALES    CUST_ID      SH        ZM_CUSTOMERS     CUST_ID
SH         ZM_SALES    PROD_ID      SH        ZM_PRODUCTS      PROD_ID

Listing 12: Attribute Clustering Joins

Zone Maps: Hit ’Em Where They Ain’t

While attribute clustering stores related data in an appropriate pattern based on the desired column ordering of the values, the zone mapping features of Oracle Database 12.1.0.2 do the inverse: they ascertain exactly where data is stored on I/O devices and thus eliminate unnecessary physical I/O processing through Oracle Database 12c’s proprietary in-memory min/max pruning features.

Zone maps are database objects specifically designed to locate table and table partition data with minimal physical I/O. Zone maps also pair nicely with Exadata storage indexes; note, however, that they are supported only on selected Oracle engineered systems hardware.
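The min/max pruning idea can be modeled with a short Python sketch. It is a simplified illustration under stated assumptions, not Oracle’s implementation: a “zone” here is just a fixed-size slice of a list, and the names `build_zone_map`, `scan_with_pruning`, `CLUSTERED` and `UNCLUSTERED` are invented for this example.

```python
# Hypothetical CUST_ID values: the same rows, stored in two different orders.
CLUSTERED = [2, 2, 7, 7, 8, 8, 9, 9]
UNCLUSTERED = [9, 2, 8, 7, 2, 9, 7, 8]


def build_zone_map(values, zone_size):
    """Record (start, end, min, max) for each contiguous zone of rows."""
    zones = []
    for start in range(0, len(values), zone_size):
        chunk = values[start:start + zone_size]
        zones.append((start, start + len(chunk), min(chunk), max(chunk)))
    return zones


def scan_with_pruning(values, zones, target):
    """Return matching row positions and the number of zones actually read."""
    hits, zones_read = [], 0
    for start, end, lo, hi in zones:
        if target < lo or target > hi:
            continue  # target cannot be in this zone, so skip reading it
        zones_read += 1
        hits.extend(i for i in range(start, end) if values[i] == target)
    return hits, zones_read


if __name__ == "__main__":
    for label, data in (("clustered", CLUSTERED), ("unclustered", UNCLUSTERED)):
        zones = build_zone_map(data, zone_size=2)
        print(label, scan_with_pruning(data, zones, target=8))
```

With the clustered ordering only one of the four zones has to be read to find CUST_ID 8, while the unclustered ordering forces all four to be read. That contrast is the reason a zone map benefits from data that has already been ordered by attribute clustering.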

Creating Zone Maps

Zone maps can be created either automatically during the specification of attribute clustering for a table or partition, or they can be created after attribute clustering has been implemented. In fact, attribute clustering is not even required to implement a zone map, but as we’ll see a bit later, it makes sense to leverage both features simultaneously. Listing 13 shows how a zone map can be created during attribute clustering creation:

SQL> ALTER TABLE sh.zm_sales DROP CLUSTERING;

Table altered.

SQL> ALTER TABLE sh.zm_sales
       ADD CLUSTERING BY INTERLEAVED ORDER (cust_id, prod_id)
       YES ON LOAD
       YES ON DATA MOVEMENT
       WITH MATERIALIZED ZONEMAP;

Table altered.

SQL> ALTER TABLE sh.zm_sales MOVE ALLOW CLUSTERING;

Table altered.

SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','ZM_SALES');

PL/SQL procedure completed.

Listing 13: Creating a Zone Map Automatically During Attribute Clustering Implementation

However, only one zone map can exist per table at any time, so any existing zone map must be dropped before a new one can be created, as Listing 14 illustrates:

SQL> CREATE MATERIALIZED ZONEMAP sh.mzm_sales
       TABLESPACE ado_cold_data
       REFRESH ON LOAD DATA MOVEMENT
     AS
     SELECT
          SYS_OP_ZONE_ID(S.ROWID)
         ,MIN(cust_id)
         ,MAX(cust_id)
         ,MIN(prod_id)
         ,MAX(prod_id)
       FROM sh.zm_sales S
      GROUP BY SYS_OP_ZONE_ID(S.ROWID);

ERROR at line 1:
ORA-31958: fact table "SH"."ZM_SALES" already has a zonemap "SH"."MZM_SALES" on it

SQL> DROP MATERIALIZED ZONEMAP sh.mzm_sales;

Zone map dropped.

SQL> CREATE MATERIALIZED ZONEMAP sh.mzm_sales
       TABLESPACE ado_cold_data
       REFRESH ON LOAD DATA MOVEMENT
     AS . . .

Zone map created.

Listing 14: Creating a Zone Map Manually When One Already Exists

Zone Map Metadata

Just as with attribute clustering, metadata about zone maps is retained within the database’s data dictionary. Table 2 summarizes the two new data dictionary views in Oracle Database 12.1.0.2 that contain the metadata for zone map objects.

Data Dictionary View   Contents
---------------------  --------------------------------------------------
DBA_ZONEMAPS           Describes Zone Map metadata, including when a Zone Map was refreshed
DBA_ZONEMAP_MEASURES   Describes functional methods the Zone Map has utilized to map out data (e.g. MIN/MAX to enable pruning)

Table 2: Zone Maps Metadata: Data Dictionary Views

Listing 15 illustrates how to query the DBA_ZONEMAPS data dictionary view to see information about which zone maps exist, and Listing 16 shows the results of a query against the DBA_ZONEMAP_MEASURES view to identify which measures — in other words, the functions that were used to obtain the boundaries of each zone map — and which columns of underlying database tables were used to construct those boundaries.

SET LINESIZE 130
SET PAGESIZE 20000
COL owner           FORMAT A12 HEADING "Zone Map|Owner"
COL zonemap_name    FORMAT A30 HEADING "Zone Map Name"
COL fact_owner      FORMAT A12 HEADING "Fact|Table|Owner"
COL fact_table      FORMAT A30 HEADING "Fact Table Name"
COL hierarchical    FORMAT A12 HEADING "Hierarchy"
COL scale           FORMAT A12 HEADING "Zone|Map|Scale|Fctr"
COL with_clustering FORMAT A05 HEADING "With|Clst?"
COL pruning         FORMAT A08 HEADING "Pruning|Enabled?"
COL refresh_mode    FORMAT A17 HEADING "Refresh|Mode"
COL refresh_method  FORMAT A11 HEADING "Refresh|Method"
COL last_rfsh_mthd  FORMAT A11 HEADING "Last|Refresh|Method"
COL last_rfsh_dtm   FORMAT A11 HEADING "Last|Refreshed On"
COL invalid         FORMAT A08 HEADING "Invalid?"
COL stale           FORMAT A08 HEADING "Stale?"
COL unusable        FORMAT A08 HEADING "Unusable?"
COL compile_state   FORMAT A13 HEADING "Compile|State"
TTITLE "Zone Mapping Metadata|(from DBA_ZONEMAPS)"
SELECT
     owner
    ,zonemap_name
    ,fact_owner
    ,fact_table
    ,scale
    ,hierarchical
    ,with_clustering
    ,pruning
    ,refresh_mode
    ,refresh_method
    ,last_refresh_method last_rfsh_mthd
    ,TO_CHAR(last_refresh_time,'yyyy-mm-dd.mi:ss') last_rfsh_dtm
    ,invalid
    ,stale
    ,unusable
    ,compile_state
  FROM dba_zonemaps
 WHERE owner IN ('AP','SH','TPCH')
 ORDER BY 1,2,3,4;
TTITLE OFF

Zone Mapping Metadata
(from DBA_ZONEMAPS)

Zone   Zone        Fact    Fact       Zone Map
Map    Map         Table   Table      Scale              With  Pruning
Owner  Name        Owner   Name       Fctr     Hierarchy Clst? Enabled?
------ ----------- ------- ---------- -------- --------- ----- --------
SH     MZM_SALES   SH      ZM_SALES   10       NO        NO    ENABLED

Zone   Zone                                     Last
Map    Map        Refresh           Refresh     Refresh     Last
Owner  Name       Mode              Method      Method      Refreshed On
------ ---------- ----------------- ----------- ----------- ----------------
SH     MZM_SALES  LOAD DATAMOVEMENT FORCE       COMPLETE    2015-02-19.46.49

                              Compile
Invalid?  Stale?   Unusable?  State
--------- -------- ---------- -------
NO        NO       NO         VALID

Listing 15: Zone Maps Metadata

SET LINESIZE 130
SET PAGESIZE 20000
COL owner        FORMAT A12 HEADING "Zone Map|Owner"
COL zonemap_name FORMAT A30 HEADING "Zone Map Name"
COL position     FORMAT 99999 HEADING "Pos|In|SELECT"
COL agg_function FORMAT A30 HEADING "Aggregate|Function"
COL column_name  FORMAT A30 HEADING "Column Name"
TTITLE "Zone Mapping Metadata|(from DBA_ZONEMAP_MEASURES)"
SELECT
     owner
    ,zonemap_name
    ,position_in_select position
    ,agg_function
    ,agg_column_name column_name
  FROM dba_zonemap_measures
 WHERE owner IN ('AP','SH','TPCH')
 ORDER BY 1,2,3;
TTITLE OFF

Zone Mapping Metadata
(from DBA_ZONEMAP_MEASURES)

                                  Pos
Zone Map                           In Aggregate
Owner     Zone Map Name        SELECT Function          Column Name
--------- -------------------- ------ ----------------- -----------------
SH        MZM_SALES                 2 MIN               MIN_1_CUST_ID
SH        MZM_SALES                 3 MAX               MAX_1_CUST_ID
SH        MZM_SALES                 4 MIN               MIN_2_PROD_ID

Listing 16: Zone Mapping MEASUREs Metadata

To emphasize the point that zone maps are actually retained within the database, they can be queried just like any database object. Listing 17 shows the contents of the SH.MZM_SALES zone map; note the distinct range of values for CUST_ID and PROD_ID that this zone map establishes, as well as the number of rows in each zone map set:

SELECT
     min_1_cust_id
    ,max_1_cust_id
    ,min_2_prod_id
    ,max_2_prod_id
    ,zone_level$
    ,zone_state$
    ,zone_rows$
  FROM sh.mzm_sales
 ORDER BY 1,2,3,4;

 MIN_1_   MAX_1_   MIN_2_   MAX_2_   ZONE_   ZONE_    ZONE_
CUST_ID  CUST_ID  PROD_ID  PROD_ID  LEVEL$  STATE$    ROWS$
-------  -------  -------  -------  ------  ------  -------
      2      324       13      148       0       0    25495
    324     2506       13      148       0       0   216496
   2506     4897       13      148       0       0   216526
   4897     8809       13      148       0       0   216518
   8809    28700       13      148       0       0   211813
  28700   101000       13      148       0       0    31995

Listing 17: Zone Maps: Minimum and Maximum Attribute Values

Building Non-Default Zone MapsInterestingly, Oracle 12.1.0.2 also provides the capability to build your own zone maps using functions of your choice. Depending on your perspective, this may either be a thrilling experiment or a daunting task; however, it appears that the primary reason for this flexibility is to allow the construction of more complex zone mapping structures in later releases of the database. The example in Listing 18 illustrates how to reconstruct the SH.MZM_SALES zone map using appropriate minimum and maximum values for four columns in the SH.PRODUCTS and SH.CUSTOMERS dimension tables. Note the call to the SYS_OP_ZONE_ID function to obtain an appropriate value for that mandatory column of the Zone Map:

DROP MATERIALIZED ZONEMAP sh.mzm_sales;

CREATE MATERIALIZED ZONEMAP sh.mzm_sales
    TABLESPACE ado_cold_data
    REFRESH ON LOAD DATA MOVEMENT
AS
SELECT
     SYS_OP_ZONE_ID(S.ROWID)
    ,MIN(cust_state_province)
    ,MAX(cust_state_province)
    ,MIN(cust_city)
    ,MAX(cust_city)
    ,MIN(prod_category)
    ,MAX(prod_category)
    ,MIN(prod_subcategory)
    ,MAX(prod_subcategory)
  FROM sh.zm_sales S
  LEFT OUTER JOIN sh.zm_products P
    ON S.prod_id = P.prod_id
  LEFT OUTER JOIN sh.zm_customers C
    ON S.cust_id = C.cust_id
 GROUP BY SYS_OP_ZONE_ID(S.ROWID);

Zone Mapping Metadata (from DBA_ZONEMAP_MEASURES)

                                 Pos In  Aggregate
Owner     Zone Map Name          SELECT  Function   Column Name
--------- --------------------- ------ ---------- -------------------------
SH        MZM_SALES                   2 MIN        MIN_1_CUST_STATE_PROVINCE
SH        MZM_SALES                   3 MAX        MAX_1_CUST_STATE_PROVINCE
SH        MZM_SALES                   4 MIN        MIN_2_CUST_CITY
SH        MZM_SALES                   5 MAX        MAX_2_CUST_CITY
SH        MZM_SALES                   6 MIN        MIN_3_PROD_CATEGORY
SH        MZM_SALES                   7 MAX        MAX_3_PROD_CATEGORY
SH        MZM_SALES                   8 MIN        MIN_4_PROD_SUBCATEGORY
SH        MZM_SALES                   9 MAX        MAX_4_PROD_SUBCATEGORY

Listing 18: Building a Zone Map with Aggregate Functions
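The shape of the query in Listing 18 (group the fact rows by zone, then record a MIN and MAX per tracked attribute) can be sketched outside the database. The Python below is only a conceptual model: the zone size, the zone_id() stand-in for SYS_OP_ZONE_ID, and the sample rows are all assumptions made for illustration, and real zones cover far more rows than the four used here.

```python
from collections import defaultdict

ZONE_SIZE = 4  # hypothetical rows per zone, chosen to keep the example small


def zone_id(row_position):
    """Hypothetical stand-in for SYS_OP_ZONE_ID: map a row's position to a zone."""
    return row_position // ZONE_SIZE


def build_zone_map(rows, columns):
    """Group rows by zone and record (min, max) for each tracked column."""
    zones = defaultdict(dict)
    for position, row in enumerate(rows):
        entry = zones[zone_id(position)]
        for column in columns:
            value = row[column]
            low, high = entry.get(column, (value, value))
            entry[column] = (min(low, value), max(high, value))
    return dict(zones)


# Made-up rows standing in for the joined sales, customers and products data.
SAMPLE_ROWS = [
    {"cust_city": "Bay City", "prod_category": "Software/Other"},
    {"cust_city": "Aachen", "prod_category": "Hardware"},
    {"cust_city": "Madison", "prod_category": "Photo"},
    {"cust_city": "Dallas", "prod_category": "Electronics"},
    {"cust_city": "Zurich", "prod_category": "Photo"},
    {"cust_city": "Tokyo", "prod_category": "Photo"},
    {"cust_city": "Yuma", "prod_category": "Software/Other"},
    {"cust_city": "Utica", "prod_category": "Peripherals and Accessories"},
]

ZONE_MAP = build_zone_map(SAMPLE_ROWS, ["cust_city", "prod_category"])

if __name__ == "__main__":
    for zid, entry in sorted(ZONE_MAP.items()):
        print(zid, entry)
```

Each resulting entry plays the role of one zone map row: a zone identifier plus one minimum and one maximum per tracked attribute, which is the same pairing that DBA_ZONEMAP_MEASURES reports above.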

Putting It All Together: Attribute Clustering Plus Zone Mapping

At this point in our discussion, it shouldn’t come as a great surprise that attribute clustering and zone maps work quite nicely together. They are especially effective at reducing the requirement to create alternative B-Tree or bitmap indexes on one or more columns in a table in a possibly futile attempt to improve the performance of just a few queries at the cost of DML performance.

For example, Listing 19 shows the resulting execution plan for our original query in Listing 1. Note that the query was able to take advantage of the SH.MZM_SALES zone map that was created via the code in Listing 18; the plan shows the application of the zone map to filtering operations against the SH.ZM_SALES table through the SYS_ZMAP_FILTER function:

Plan hash value: 2722944393
------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name                    | Rows | Bytes | Cost |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                         |   11 |   858 | 1624 |
|   1 | SORT GROUP BY                          |                         |   11 |   858 | 1624 |
|*  2 | HASH JOIN                              |                         |   16 |  1248 | 1623 |
|   3 | JOIN FILTER CREATE                     | :BF0000                 |    9 |   549 |  425 |
|   4 | MERGE JOIN CARTESIAN                   |                         |    9 |   549 |  425 |
|*  5 | TABLE ACCESS STORAGE FULL              | ZM_CUSTOMERS            |    1 |    26 |  423 |
|   6 | BUFFER SORT                            |                         |   14 |   490 |    2 |
|   7 | TABLE ACCESS BY INDEX ROWID BATCHED    | ZM_PRODUCTS             |   14 |   490 |    2 |
|*  8 | INDEX RANGE SCAN                       | ZM_PRODUCTS_PROD_CAT_IX |   14 |       |    0 |
|   9 | JOIN FILTER USE                        | :BF0000                 | 918K |   14M | 1196 |
|* 10 | TABLE ACCESS STORAGE FULL WITH ZONEMAP | ZM_SALES                | 918K |   14M | 1196 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
. . .
  10 - storage(SYS_OP_BLOOM_FILTER(:BF0000,"S"."PROD_ID","S"."CUST_ID"))
       filter(SYS_ZMAP_FILTER('/* ZM_PRUNING */ SELECT "ZONE_ID$", CASE WHEN
              BITAND(zm."ZONE_STATE$",1)=1 THEN 1 ELSE CASE WHEN
              (zm."MIN_3_PROD_CATEGORY" > :1 OR zm."MAX_3_PROD_CATEGORY" < :2 OR
              zm."MIN_2_CUST_CITY" > :3 OR zm."MAX_2_CUST_CITY" < :4 OR
              zm."MIN_1_CUST_STATE_PROVINCE" > :5 OR
              zm."MAX_1_CUST_STATE_PROVINCE" < :6) THEN 3 ELSE 2 END END
              FROM "SH"."MZM_SALES" zm WHERE zm."ZONE_LEVEL$"=0
              ORDER BY zm."ZONE_ID$"',SYS_OP_ZONE_ID(ROWID),
              'Software/Other','Software/Other','Bay City','Bay City','WI','WI')<3
              AND SYS_OP_BLOOM_FILTER(:BF0000,"S"."PROD_ID","S"."CUST_ID"))

Listing 19: Execution Plan for Query Leveraging Zone Mapping
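The CASE expression embedded in the SYS_ZMAP_FILTER call can be read as a three-way verdict per zone, with the outer "< 3" test keeping every zone that is not provably irrelevant. The Python sketch below mirrors that reading for equality predicates only. It is an interpretation of the plan text rather than Oracle's implementation; in particular, treating the low bit of ZONE_STATE$ as "this zone's entry cannot be trusted" is an assumption, and the zone contents are invented for the example.

```python
def zone_verdict(zone, predicates):
    """Mirror the CASE expression shown in the plan's filter predicate.

    1 = low bit of zone_state is set: assume the entry is unusable and scan
    3 = some predicate value falls outside the zone's [min, max]: skip
    2 = the zone may contain matching rows: scan
    """
    if zone["zone_state"] & 1 == 1:
        return 1
    for column, value in predicates.items():
        low, high = zone[column]
        if low > value or high < value:
            return 3
    return 2


def must_scan(zone, predicates):
    """Equivalent of the outer '< 3' comparison in the plan."""
    return zone_verdict(zone, predicates) < 3


# Hypothetical zone entries; the values are not taken from the article.
ZONES = [
    {"zone_id": 0, "zone_state": 0,
     "cust_state_province": ("AK", "MN"),
     "cust_city": ("Aachen", "Madison"),
     "prod_category": ("Electronics", "Photo")},
    {"zone_id": 1, "zone_state": 0,
     "cust_state_province": ("NY", "WY"),
     "cust_city": ("Appleton", "Yuma"),
     "prod_category": ("Hardware", "Software/Other")},
    {"zone_id": 2, "zone_state": 1,
     "cust_state_province": ("AK", "AZ"),
     "cust_city": ("Aachen", "Akron"),
     "prod_category": ("Electronics", "Hardware")},
]

# The literal values that appear as bind arguments in the plan's filter.
PREDICATES = {
    "prod_category": "Software/Other",
    "cust_city": "Bay City",
    "cust_state_province": "WI",
}

VERDICTS = [zone_verdict(z, PREDICATES) for z in ZONES]
SCANNED = [z["zone_id"] for z in ZONES if must_scan(z, PREDICATES)]

if __name__ == "__main__":
    print(VERDICTS, SCANNED)
```

In this model zone 0 is skipped because its product categories end before 'Software/Other', zone 1 is scanned because all three values fall inside its ranges, and zone 2 is scanned regardless of its ranges because its state flag is set.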

Leveraging In-Memory Min-Max Partition Pruning

Pairing attribute clustering with zone maps also offers another intriguing performance improvement opportunity: the ability to leverage in-memory partition pruning based on the range of minimum and maximum values that zone maps can track to minimize physical I/O as storage is searched for the appropriate values. To illustrate the power of combining attribute clustering and zone maps, consider table AP.RANDOMIZED_PARTED, which was created and loaded via the code in Listing 20:

CREATE TABLE ap.randomized_parted (
     key_id    NUMBER(8)
    ,key_date  DATE
    ,key_desc  VARCHAR2(32)
    ,key_sts   NUMBER(2) NOT NULL
)
    CLUSTERING BY LINEAR ORDER (key_sts)
        YES ON LOAD
        YES ON DATA MOVEMENT
        WITH MATERIALIZED ZONEMAP (zm_randomized_parted)
    PARTITION BY RANGE(key_date) (
         PARTITION p1_frigid
            VALUES LESS THAN (TO_DATE('2010-01-01','yyyy-mm-dd'))
            TABLESPACE ado_cold_data
        ,PARTITION p2_cool
            VALUES LESS THAN (TO_DATE('2013-01-01','yyyy-mm-dd'))
            TABLESPACE ado_cool_data
        ,PARTITION p3_warm
            VALUES LESS THAN (TO_DATE('2014-01-01','yyyy-mm-dd'))
            TABLESPACE ado_warm_data
        ,PARTITION p4_hot
            VALUES LESS THAN (TO_DATE('2014-07-01','yyyy-mm-dd'))
            TABLESPACE ado_hot_data
        ,PARTITION p5_radiant
            VALUES LESS THAN (MAXVALUE)
            TABLESPACE ap_data
    );

INSERT INTO ap.randomized_parted
SELECT * FROM ap.randomized_sorted;

COMMIT;

Listing 20: Creating a Partitioned Table to Demonstrate In-Memory Min-Max Partition Pruning


The query in Listing 21 results in the optimizer choosing quite a different approach to accessing these data, as the execution plan in Listing 22 shows:

EXPLAIN PLAN FOR
SELECT MIN(key_id), MAX(key_id), COUNT(*)
  FROM ap.randomized_parted
 WHERE key_date BETWEEN TO_DATE('2014-01-01','YYYY-MM-DD')
                    AND TO_DATE('2014-09-30','YYYY-MM-DD')
   AND key_sts < 50;

SELECT plan_table_output
  FROM TABLE(DBMS_XPLAN.DISPLAY(FORMAT => 'BASIC PREDICATE PARTITION'));

Listing 21: Leveraging Zone Mapping and Attribute Clustering to Prune Partitions

Plan hash value: 4272865020
-----------------------------------------------------------------------------------------
| Id  | Operation                              | Name              | Pstart | Pstop    |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                   |         |         |
|   1 | SORT AGGREGATE                         |                   |         |         |
|   2 | PX COORDINATOR                         |                   |         |         |
|   3 | PX SEND QC (RANDOM)                    | :TQ10000          |         |         |
|   4 | SORT AGGREGATE                         |                   |         |         |
|   5 | PX BLOCK ITERATOR                      |                   | KEY(AP) | KEY(AP) |
|*  6 | TABLE ACCESS STORAGE FULL WITH ZONEMAP | RANDOMIZED_PARTED | KEY(AP) | KEY(AP) |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   6 - storage("KEY_STS"<50 AND "KEY_DATE"<=TO_DATE(' 2014-09-30 00:00:00',
               'syyyy-mm-dd hh24:mi:ss'))
       filter(SYS_ZMAP_FILTER('/* ZM_PRUNING */ SELECT "ZONE_ID$", CASE WHEN
              BITAND(zm."ZONE_STATE$",1)=1 THEN 1 ELSE CASE WHEN
              (zm."MIN_1_KEY_STS" >= :1) THEN 3 ELSE 2 END END
              FROM "AP"."ZM_RANDOMIZED_PARTED" zm WHERE zm."ZONE_LEVEL$"=0
              ORDER BY zm."ZONE_ID$"',SYS_OP_ZONE_ID(ROWID),50)<3
              AND "KEY_STS"<50 AND "KEY_DATE"<=TO_DATE(' 2014-09-30 00:00:00',
              'syyyy-mm-dd hh24:mi:ss'))

Listing 22: Resulting Execution Plan Leveraging Partition Pruning
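Why the clustering on KEY_STS matters for the "MIN_1_KEY_STS >= :1" test in this plan can be shown with a small model. The Python sketch below compares how many zones that test would skip when the same values are stored scattered versus stored in clustered order. The zone size and the generated data are assumptions chosen for illustration; they are not the contents of AP.RANDOMIZED_PARTED.

```python
ZONE_ROWS = 100  # hypothetical number of rows per zone


def zone_minimums(values):
    """Minimum key_sts value in each consecutive zone of ZONE_ROWS rows."""
    return [min(values[i:i + ZONE_ROWS])
            for i in range(0, len(values), ZONE_ROWS)]


def zones_skipped(values, upper_bound):
    """Count zones pruned for 'key_sts < upper_bound'.

    A zone is skipped when its minimum is already >= the bound,
    which is the condition tested in the plan's filter.
    """
    return sum(1 for zone_min in zone_minimums(values)
               if zone_min >= upper_bound)


# 1,000 status values from 0 to 99, stored in a scattered order in which
# every zone of 100 rows ends up holding the full range of values ...
scattered = [(i * 37) % 100 for i in range(1000)]
# ... and the same values stored in clustered (sorted) order.
clustered = sorted(scattered)

SKIPPED_SCATTERED = zones_skipped(scattered, 50)
SKIPPED_CLUSTERED = zones_skipped(clustered, 50)

if __name__ == "__main__":
    print("scattered:", SKIPPED_SCATTERED, "of", len(zone_minimums(scattered)))
    print("clustered:", SKIPPED_CLUSTERED, "of", len(zone_minimums(clustered)))
```

With scattered storage every zone contains low values, so its minimum never reaches the bound and nothing is pruned. With clustered storage half of the zones hold only values of 50 or more and can be skipped without being read.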

Contact

Jim Czuprynski has accumulated over 30 years of experience during his career in information technology. He served in diverse roles at several Fortune 1000 companies in those three decades (mainframe programmer, applications developer, business analyst, and project manager) before becoming an Oracle database administrator in 2001. He is an Oracle ACE Director and currently holds OCP certification for Oracle 9i, 10g and 11g.

Oracle Database Release 12.1.0.2: Concluding the Series

The last four articles in this series have delved deeply into many of the new feature sets of Oracle Database Release 12.1.0.2. Though many IT organizations are doubtless waiting for Oracle Database Release 12.2 before beginning their foray into Oracle Database 12c, these articles have hopefully made a convincing case that there’s no time like the present to begin exploring 12.1.0.2’s key features, especially the potentially dramatic increase in database application performance through In-Memory Column Store (IMCS).
