7/27/2019 B208 Access & Constraints
Module 8: Access Considerations and Constraints
After completing this module, you will be able to:
Analyze Optimizer Access scenarios.
Explain partial value searches and data conversions.
Identify the effects of conflicting data types.
Determine the cost of I/Os.
Identify column level attributes and constraints.
Identify table level attributes and constraints.
Add, modify and drop constraints from tables.
Explain how the Identity column allocates new numbers.
Access Method Comparison
Unique Primary Index
Very efficient
One AMP, one row
No spool file
Non-Unique Primary Index
Efficient if the number of rows
per value is reasonable and
there are no severe spikes.
One AMP, multiple rows
Spool file if needed
Unique Secondary Index
Very efficient
Two AMPs, one row
No spool file
Non-Unique Secondary Index
Efficient only if the number of rows
accessed is a small percentage of
the total data rows in the table.
All AMPs, multiple rows
Spool file if needed
Full-Table Scan
Efficient since each row is touched
only once.
All AMPs, all rows
Spool file may equal the table in size
The Optimizer chooses the fastest access method.
COLLECT STATISTICS to help the Optimizer make
good decisions.
Optimizer Access Scenarios
SINGLE TABLE CASE
WHERE Table_1.Col_1 = :value_1
AND Table_1.Col_2 = :value_2 ;
Column the Optimizer uses for access (rows: index on Col_1; columns: index on Col_2):
Col_1 \ Col_2    USI                 NUSI                        NOT INDEXED
UPI              UPI                 UPI                         UPI
NUPI             NUPI or USI (1)     NUPI                        NUPI
USI              USI                 USI                         USI
NUSI             USI                 Either, Both, or FTS (2)    NUSI or FTS (3)
NOT INDEXED      USI                 NUSI or FTS (3)             FTS
Notes:
1. The Optimizer prefers Primary Indexes over Secondary Indexes. It chooses the NUPI if only one I/O (block) is accessed.
The Optimizer prefers Unique indexes over non-unique indexes. Only one row is
involved with a USI even though it is a two-AMP operation.
2. Depending on relative selectivity, the Optimizer may use either NUSI, may use both
with NUSI Bit Mapping, or may do a FTS.
3. It depends on the selectivity of the index.
Partial Value Searches
Column values must not be decomposable.
LIKE, INDEX, and SUBSTRING operators indicate decomposable data.
Show all calls placed by people within Area Code 415:
SELECT . . . , phone, . . .
FROM Call
WHERE phone LIKE '415%' ;
Always decompose data to the finest level of access usage.
Use the SQL concatenation operator ( || ) to display the data:
SELECT . . . , area_code || '/' || phone, . . .
FROM Call
WHERE area_code = 415 ;
The Teradata Database does a FTS on a partial index value unless the index is
ordered by value (Value-ordered NUSI or Hash Index).
Data storage and display should be treated as separate issues.
Data Conversions
Columns (or values) must be of the same data type to be compared.
If column (or value) types differ, internal conversion is performed.
Character data is compared using the host's collating sequence.
Unequal-length character strings are converted by right-padding the shorter
one with blanks.
Numeric values are converted to the same underlying representation.
Character to numeric comparison requires the character value to be
converted to a numeric value.
Data conversion is expensive and generally unnecessary.
Implement data types at the Domain level.
Comparison across data types may indicate that Domain definitions are not
clearly understood.
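The right-padding rule for unequal-length character strings can be illustrated with a small sketch. This is plain Python rather than Teradata code; the helper name `padded_equal` is hypothetical, and the sketch mimics only the blank-padding step, not the host's collating sequence.

```python
def padded_equal(left: str, right: str) -> bool:
    """Compare two strings after right-padding the shorter one with blanks."""
    width = max(len(left), len(right))
    return left.ljust(width) == right.ljust(width)


# A CHAR(6) value 'ABC   ' matches the shorter literal 'ABC' once padded.
print(padded_equal("ABC   ", "ABC"))   # True
# Padding is applied on the right only, so leading blanks still matter.
print(padded_equal(" ABC", "ABC"))     # False
```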
Storing Numeric Data
When comparing character data to numeric, Teradata will always convert
character to numeric, then do the comparison.
Case 1
Table 1
CREATE TABLE Emp1
(Emp_no CHAR(6), Emp_name CHAR(20))
PRIMARY INDEX (Emp_no);
Statement 1
SELECT *
FROM Emp1
WHERE Emp_no = '1234';
Statement 2 (results in a Full Table Scan)
SELECT *
FROM Emp1
WHERE Emp_no = 1234;
Case 2
Table 2
CREATE TABLE Emp2
(Emp_no INTEGER, Emp_name CHAR(20))
PRIMARY INDEX (Emp_no);
Statement 1
SELECT *
FROM Emp2
WHERE Emp_no = 1234;
Statement 2 (results in unnecessary conversion)
SELECT *
FROM Emp2
WHERE Emp_no = '1234';
Comparison Rules:
To compare columns, they must be of the same data type.
Character data types will always be converted to numeric (when comparing
character to numeric).
Bottom Line:
Always store numeric data in numeric data types to avoid unnecessary and
costly data conversions.
Data Conversion Example
CREATE SET TABLE TFACT01.Table1
(col1 CHAR(12) NOT NULL)
UNIQUE PRIMARY INDEX (col1);
EXPLAIN SELECT * FROM Table1 WHERE col1 = '8';
1) First, we do a single-AMP RETRIEVE step from TFACT01.Table1 by way of the unique primary index
"TFACT01.Table1.col1 = '8' " with no residual conditions. The estimated time for this step is 0.03
seconds.
-> The row is sent directly back to the user as the result of statement 1. The total estimated time is
0.03 seconds.
EXPLAIN SELECT * FROM Table1 WHERE col1 = 8;
1) First, we lock a distinct TFACT01."pseudo table" for read on a RowHash to prevent global deadlock
for TFACT01.Table1.
2) Next, we lock TFACT01.Table1 for read.
3) We do an all-AMPs RETRIEVE step from TFACT01.Table1 by way of an all-rows scan with a
condition of ("(TFACT01.Table1.col1 (FLOAT, FORMAT '-9.99999999999999E-999')UNICODE)=
8.00000000000000E 000") into Spool 1, which is built locally on the AMPs. The size of Spool 1 is
estimated with no confidence to be 1,001 rows. The estimated time for this step is 0.28 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated
time is 0.28 seconds.
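One way to see why the numeric literal forces an all-rows scan: several different character values convert to the same number, so the literal cannot be hashed to locate a single row. The sketch below is a loose Python analogy, not Teradata behavior; the `rows` dictionary and both function names are hypothetical stand-ins for a table hashed on a CHAR column.

```python
# Hypothetical stand-in for a table hashed on a CHAR primary index.
rows = {"8": "row A", "08": "row B", "8.0": "row C", "9": "row D"}


def lookup_by_string(value: str) -> list:
    """Same type as the index: one direct probe, like the single-AMP retrieve."""
    return [rows[value]] if value in rows else []


def lookup_by_number(value: float) -> list:
    """Numeric literal: every stored key is converted before it is compared."""
    return [row for key, row in rows.items() if float(key) == value]


print(lookup_by_string("8"))   # ['row A']
print(lookup_by_number(8))     # ['row A', 'row B', 'row C']
```

The string lookup touches one entry, while the numeric lookup has to visit and convert every key, which mirrors the single-AMP versus all-AMPs plans in the two EXPLAIN outputs.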
Matching Data Types
The hashing algorithm treats the following data types as identical:
INTEGER = DATE = DECIMAL (x,0)
CHAR = VARCHAR = LONG VARCHAR
BYTE = VARBYTE
GRAPHIC = VARGRAPHIC
Administer data type assignments at the domain level.
Give matching Primary Indexes across tables the same data type.
Counting I/O Operations
Many factors influence the number of physical I/Os in a transaction:
Cache hits
Swapping
Rows per block
Cylinder splits/migrates
Mini-Cylpacks
Number of spool files
Spool file sizes
I/Os may be done serially or in parallel.
Data and index block I/O may or may not require Cylinder Index I/O.
Changes to data rows and USI rows require Transient Journal I/O.
I/O counts indicate the relative cost of a transaction.
A given I/O operation may not cause any actual physical I/O.
Transient Journal I/O
The Transient Journal:
Is a journal of transaction before images.
Provides for automatic rollback in the event of TXN failure.
Is automatic and transparent.
TJ space comes from available free cylinders in the system.
When a transaction completes, TJ space is returned to free cylinder lists.
Provides Transaction Integrity.
Therefore, when modifying a table, there are I/Os for both the data table and the
Transient Journal.
Some situations where Transient Journal is not used include:
INSERT / SELECT into an empty table
DELETE FROM tablename ALL
Utilities such as FastLoad and MultiLoad
INSERT and DELETE Operations
INSERT INTO tablename . . . ;     DELETE FROM tablename . . . ;
* = I/O operation
DATA ROW
  * READ DATA BLOCK
  * WRITE TRANSIENT JOURNAL
    INSERT or DELETE the DATA ROW
  * WRITE NEW DATA BLOCK
  * WRITE CYLINDER INDEX
For each USI
  * READ INDEX BLOCK
  * WRITE TRANSIENT JOURNAL
    INSERT or DELETE the NEW INDEX ROW
  * WRITE NEW INDEX BLOCK
  * WRITE CYLINDER INDEX
For each NUSI
  * READ INDEX BLOCK
    ADD or DELETE the ROWID on the ROWID LIST or
    ADD or DELETE the SUBTABLE ROW
  * WRITE NEW INDEX BLOCK
  * WRITE CYLINDER INDEX
I/O operations per row = 4 + [ 4 * (#USIs) ] + [ 3 * (#NUSIs) ]
Double for FALLBACK
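The per-row count above is easy to tabulate for a given table design. The following is a minimal sketch in Python; the function name `io_per_row` and its parameters are hypothetical, and it simply evaluates the slide's formula.

```python
def io_per_row(usis: int, nusis: int, fallback: bool = False) -> int:
    """I/O operations per inserted or deleted row, per the formula above."""
    count = 4 + 4 * usis + 3 * nusis
    # FALLBACK doubles the count.
    return count * 2 if fallback else count


print(io_per_row(usis=1, nusis=2))                 # 4 + 4 + 6 = 14
print(io_per_row(usis=1, nusis=2, fallback=True))  # 28
```

As noted under Counting I/O Operations, these are relative costs; a counted I/O may not cause an actual physical I/O.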
Permanent Journal I/O
SINGLE image journaling is not allowed on FALLBACK tables.
BEFORE IMAGE    AFTER IMAGE    PJ I/O COUNT (Count)
NONE            NONE           0
NONE            SINGLE         2
SINGLE          NONE           2
SINGLE          SINGLE         4
NONE            DUAL           4
DUAL            NONE           4
SINGLE          DUAL           6
DUAL            SINGLE         6
DUAL            DUAL           8
These counts include:
1. Write the PJ block,
2. Write the Cylinder Index.
The total number of Permanent Journal I/O operations per row is:
INSERT : Total PJ I/O = Count + (#USIs * Count)
DELETE : Total PJ I/O = Count + (#USIs * Count)
UPDATE : Total PJ I/O = Count + (#USIs changed * Count * 2)
Total I/O = Total PJ I/O + DATA I/O
Changes to NUSI columns cause no additional I/Os.
Changes to PI columns double the counts.
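The Permanent Journal counts can be sketched the same way. In the table, each image copy (SINGLE = one copy, DUAL = two) accounts for two I/Os: one PJ block write and one Cylinder Index write. That reading is an inference from the listed counts, and the names `IMAGE_COPIES`, `pj_count` and `pj_io_insert` are hypothetical. Only the INSERT formula is shown.

```python
IMAGE_COPIES = {"NONE": 0, "SINGLE": 1, "DUAL": 2}


def pj_count(before: str, after: str) -> int:
    """PJ I/O count: two I/Os (PJ block + Cylinder Index) per image copy."""
    return 2 * (IMAGE_COPIES[before] + IMAGE_COPIES[after])


def pj_io_insert(before: str, after: str, usis: int) -> int:
    """Total PJ I/O for an INSERT: Count + (#USIs * Count)."""
    count = pj_count(before, after)
    return count + usis * count


print(pj_count("SINGLE", "DUAL"))          # 6
print(pj_io_insert("SINGLE", "DUAL", 2))   # 6 + 2 * 6 = 18
```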
Table Level Attributes
CREATE MULTISET TABLE Table_1, FALLBACK,
DATABLOCKSIZE = 16384 BYTES, FREESPACE = 10 PERCENT, CHECKSUM = NONE
(column1 INTEGER,
column2 CHAR(5) );
SET Don't allow duplicate rows
MULTISET Allow duplicate rows (ANSI)
DATABLOCKSIZE = Maximum multi-row block size for table in:
  BYTES Rounded to nearest sector (512)
  KILOBYTES (or KBYTES) Increments of 1024
  MINIMUM DATABLOCKSIZE (7168)
  MAXIMUM DATABLOCKSIZE (130,560)
  IMMEDIATE May be used to immediately re-block the data (ALTER)
FREESPACE Percent of freespace to keep on cylinder during load operations (0 - 75%).
CHECKSUM = DEFAULT | NONE | LOW | MEDIUM | HIGH | ALL
  Disk I/O Integrity Check (V2R5.1 feature)
Column Level Constraints
PRIMARY KEY No Nulls, No Duplicates
UNIQUE No Nulls, No Duplicates
CHECK Verify values or range
REFERENCES Relates to other columns
CREATE TABLE Table_2
(col1 INTEGER NOT NULL CONSTRAINT primary_1 PRIMARY KEY,
col2 INTEGER NOT NULL CONSTRAINT unique_1 UNIQUE,
col3 INTEGER CONSTRAINT check_1 CHECK (col3 > 0),
col4 INTEGER CONSTRAINT reference_1 REFERENCES Table_3(col_a)
);
All constraints are named.
All constraints are at column level.
PRIMARY KEY columns must have NOT NULL attribute.
UNIQUE columns must also have NOT NULL attribute.
Table Level Constraints
CREATE TABLE Table_4
(col1 INTEGER NOT NULL,
col2 INTEGER NOT NULL,
col3 INTEGER NOT NULL,
col4 INTEGER NOT NULL,
col5 INTEGER,
col6 INTEGER,
CONSTRAINT primary_1 PRIMARY KEY (col1, col2),
CONSTRAINT unique_1 UNIQUE (col3, col4),
CONSTRAINT check_1 CHECK (col2 > 0 OR col4 > 0),
CONSTRAINT reference_1 FOREIGN KEY (col5, col6)
REFERENCES Table_5 (colA, colB),
CHECK (col4 > col5),
FOREIGN KEY (col3) REFERENCES Table_6 (colX)
);
Some constraints are named (primary_1, unique_1, check_1, reference_1).
Some constraints are unnamed (the final CHECK and FOREIGN KEY).
All constraints are at table level.
Example: SHOW Department Table
SHOW TABLE Department;
CREATE SET TABLE PD.Department , FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT
(
dept_number INTEGER NOT NULL,
dept_name CHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
dept_mgr_number INTEGER,
budget_amount DECIMAL(10,2),
CONSTRAINT dn_1000_plus CHECK ( dept_number > 999 ),
CONSTRAINT refer_1 FOREIGN KEY ( dept_mgr_number ) REFERENCES
PD.EMPLOYEE ( EMPLOYEE_NUMBER ))
UNIQUE PRIMARY INDEX primary_1 ( dept_number )
UNIQUE INDEX ( dept_name );
Notes:
Primary key constraint becomes a named index.
Unique constraint becomes a unique index.
All constraints are specified at table level.
Note: The Primary Key constraint defined with the CREATE TABLE
doesn't appear in this SHOW TABLE.
Altering Table Constraints
To add constraints to a table:
ALTER TABLE tablename
ADD CONSTRAINT constrname CHECK . . .
ADD CONSTRAINT constrname UNIQUE . . .
ADD CONSTRAINT constrname PRIMARY KEY . . .
ADD CONSTRAINT constrname FOREIGN KEY . . .
To modify existing constraints:
ALTER TABLE tablename
MODIFY CONSTRAINT constrname . . . ;
To drop constraints:
ALTER TABLE tablename
DROP CONSTRAINT constrname ;
Note:
The only constraint that can be modified
is a named CHECK constraint.
In V2R5, the ALTER TABLE command can also be used to add new columns (up to
2048) to an existing table.
Identity Column Overview
Also known as a DBS Generated Unique Primary Index: A table-level unique
number system-generated for every row as it is inserted in the table.
Identity Columns may be used to ...
Guarantee row uniqueness in a table
Guarantee even row distribution for a table
Optimize and simplify initial port from other databases that use generated keys
Identity Columns are valid for:
Single inserts
Multi-session concurrent insert requests (e.g., TPump)
INSERT SELECT
Identity Columns Save Overhead/Maintenance Costs:
Reduce need for uniqueness constraints
Reduce manual coding tasks
Generate unique PK values
Comply with the ANSI Standard
Identity Column Implementation
Characteristics of the IDENTITY Column feature are ...
Implemented at column level in a CREATE TABLE statement
Data type may be any exact numeric type
GENERATED ALWAYS always generates a value
GENERATED BY DEFAULT generates a value only when no value is specified
GENERATED ALWAYS + NO CYCLE implies uniqueness
CYCLE restarts numbering after the maximum/minimum number is
generated
DBSControl setting indicates the number pool size to reserve for generating
numbers
Each Vproc may reserve 1 to 1,000,000 numbers; default is 100,000.
Numbering gaps can occur
Generated numbers do not reflect row insertion sequence
Exact incrementing is not guaranteed
Scalability and performance are favored over enforced sequential
numbering
Identity Column Example 1
Example 1: GENERATED ALWAYS AS IDENTITY
This option always generates a value. It does not cycle and does not repeat previously used values.
CREATE TABLE Table_A
(Cust_Number INTEGER GENERATED ALWAYS AS IDENTITY
(START WITH 1001 INCREMENT BY 1 MAXVALUE 1000000 NO CYCLE),
LName VARCHAR(15),
Zip_code INTEGER);
INSERT INTO Table_A SELECT c_custid, c_lname, c_zipcode FROM Customer;
Customer has 500 rows; the new customer
numbers generated are not sequentially
numbered from 1001 to 1500.
Numbering gaps can occur; exact incrementing is not guaranteed.
Pools (ranges of numbers) are reserved
and allocated by Teradata software.
Default for next allocation pool is
DBSControl parameter value of 100,000.
SELECT * FROM Table_A ORDER BY 1;
Cust_Number LName Zip_Code
1001 Tatem 89714
1002 Kroger 98101
1003 Yang 77481
1004 Miller 45458
: : :
101001 Powell 57501
101002 Gordan 89714
101003 Smoothe 80002
: : :
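The jump from 1004 to 101001 in the result follows from pool reservation. The sketch below is a simplified Python model, not Teradata's implementation: the class `IdentityAllocator`, the two-vproc setup and the alternating row arrival are all assumptions made for illustration, and pool exhaustion is ignored.

```python
class IdentityAllocator:
    """Hands out ranges (pools) of identity values to vprocs on request."""

    def __init__(self, start: int, pool_size: int) -> None:
        self.next_pool_start = start
        self.pool_size = pool_size

    def reserve_pool(self):
        """Reserve the next pool and return an iterator over its numbers."""
        first = self.next_pool_start
        self.next_pool_start += self.pool_size
        return iter(range(first, first + self.pool_size))


# START WITH 1001 and the default pool size of 100,000.
allocator = IdentityAllocator(start=1001, pool_size=100_000)
# Two hypothetical vprocs, each reserving its own pool.
pools = [allocator.reserve_pool() for _ in range(2)]

# Six rows arrive, alternating between the two vprocs.
generated = [next(pools[i % 2]) for i in range(6)]
print(sorted(generated))  # [1001, 1002, 1003, 101001, 101002, 101003]
```

Each vproc numbers rows from its own pool, so the values are unique but neither gap-free nor in insertion order.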
Identity Column Example 2
Example 2: GENERATED BY DEFAULT AS IDENTITY
This option generates a value only when no value is specified for the column.
CREATE TABLE Table_B
(Cust_Number INTEGER GENERATED BY DEFAULT AS IDENTITY
(START WITH 10000000 INCREMENT BY -1 MINVALUE 0),
LName VARCHAR(15),
Zip_code INTEGER);
INSERT INTO Table_B SELECT NULL, c_lname, c_zipcode FROM Customer;
Customer has 500 rows; new customer
numbers are generated because NULL was
part of the SELECT.
If MINVALUE is not used, the minimum value for an INTEGER is -2,147,483,647.
The CYCLE option is not used; the default is NO
CYCLE.
GENERATED BY DEFAULT provides
capability of copying the contents of one
table with an Identity column into another.
SELECT * FROM Table_B ORDER BY 1 DESC;
Cust_Number LName Zip_Code
10000000 Tatem 89714
9999999 Kroger 98101
9999998 Yang 77481
9999997 Miller 45458
: : :
9900000 Powell 57501
9899999 Gordan 89714
9899998 Smoothe 80002
: : :
Identity Column Considerations
Generated Always Identity Columns
Typically define the Primary Index.
Define as the Primary Index only if it is the primary path.
If it is also used as an access path, consider it as a Secondary Index.
Generated By Default Identity Columns
Facilitate copying data from one table into another.
Use a numeric type large enough to hold all the values that will ever be required.
Never use as a substitute for a good logical database design.
May not optimally utilize Teradata join and access capabilities.
Restrictions
A table can only have 1 Identity column.
FastLoad and MultiLoad do not support Identity columns with Teradata V2R5.0.
The ALTER TABLE statement cannot add an Identity Column to an existing table.
Cannot be part of a composite primary or a composite secondary index.
Cannot be used with Global Temporary or volatile tables.
Cannot be used in a join index, hash index, PPI or value-ordered index.
Atomic UPSERTs are not supported on a table with an Identity Column as its PI.
GENERATED ALWAYS Identity Column value updates are not supported.
Note: With Teradata V2R5.1, Identity columns are supported with the FastLoad, MultiLoad, and Teradata Warehouse Builder (TWB) utilities.
Review Questions
1. Which one of the following situations requires the use of the Transient Journal?
a. INSERT / SELECT into an empty table
b. UPDATE all the rows in a table
c. DELETE all the rows in a table
d. loading a table with FastLoad
2. What is a negative impact of updating a UPI value?
______________________________________________________
______________________________________________________
3. What are the 4 types of constraints?
_____________ _____________ _____________ _____________
4. True or False? A primary key constraint is always implemented as a primary index.
5. True or False? A primary key constraint is always implemented as a unique index.
6. True or False? Multi-column constraints must be coded as table level constraints.
7. True or False? Only named check constraints may be modified.
8. True or False? Named primary key constraints may always be dropped if they are no longer
needed.
9. True or False? Using the START WITH 1 and INCREMENT BY 1 options with an Identity
column will provide sequential numbering with no gaps for the column.
Module 8: Review Question Answers
1. Which one of the following situations requires the use of the Transient Journal?
a. INSERT / SELECT into an empty table
b. UPDATE all the rows in a table (correct answer)
c. DELETE all the rows in a table
d. loading a table with FastLoad
2. What is a negative impact of updating a UPI value?
Very I/O intensive - updating the Primary Index requires that (internally) the data row be deleted and
re-inserted into the table as well as updating the existing secondary index references to the new
RowID
3. What are the 4 types of constraints?
Primary Key Unique References Check
4. True or False? (Answer: False) A primary key constraint is always implemented as a primary index.
5. True or False? (Answer: True) A primary key constraint is always implemented as a unique index.
6. True or False? (Answer: True) Multi-column constraints must be coded as table level constraints.
7. True or False? (Answer: True) Only named check constraints may be modified.
8. True or False? (Answer: False) Named primary key constraints may always be dropped if they are no longer
needed.
9. True or False? (Answer: False) Using the START WITH 1 and INCREMENT BY 1 options with an Identity