BY LECTURER/ AISHA DAWOOD DW Lab # 4 Overview of Extraction, Transformation, and Loading


Transformation Flow

From an architectural perspective, you can transform your data in two ways:

■ Multistage Data Transformation

■ Pipelined Data Transformation

LAB EXERCISE #4 Oracle Data Warehousing


Multistage Data Transformation

The data transformation logic for most data warehouses consists of multiple steps. For example, in transforming new records to be inserted into a sales table, there may be separate logical transformation steps to validate each dimension key.
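One way to picture the multistage approach is to give each validation step its own SQL statement and its own staging table. The sketch below is only an illustration under stated assumptions: the staging tables new_sales_step1 and new_sales_step2 are hypothetical names, and it assumes the dimension tables products and customers are keyed by prod_id and cust_id.

```sql
-- Step 1 (hypothetical staging table): keep only rows whose product key
-- exists in the products dimension.
CREATE TABLE new_sales_step1 NOLOGGING AS
SELECT s.*
FROM   sales_activity_direct s
WHERE  EXISTS (SELECT 1 FROM products p WHERE p.prod_id = s.product_id);

-- Step 2 (hypothetical staging table): of those rows, keep only the ones
-- whose customer key exists in the customers dimension (assumed key: cust_id).
CREATE TABLE new_sales_step2 NOLOGGING AS
SELECT s1.*
FROM   new_sales_step1 s1
WHERE  EXISTS (SELECT 1 FROM customers c WHERE c.cust_id = s1.customer_id);

-- Step 3: load the validated rows into the sales fact table.
INSERT /*+ APPEND */ INTO sales
SELECT product_id, customer_id, TRUNC(sales_date), 3,
       promotion_id, quantity, amount
FROM   new_sales_step2;

COMMIT;
```

Keeping each step in a separate table makes intermediate results easy to inspect and restart, at the cost of extra storage and extra passes over the data.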


Pipelined Data Transformation


Loading Mechanisms

You can use the following mechanisms for loading a data warehouse:

■ Loading a Data Warehouse with SQL*Loader

■ Loading a Data Warehouse with External Tables

■ Loading a Data Warehouse with OCI and Direct-Path APIs

■ Loading a Data Warehouse with Export/Import


■ Loading a Data Warehouse with SQL*Loader


Transformation Mechanisms

You have the following choices for transforming data inside the database:

■ Transforming Data Using SQL

■ Transforming Data Using PL/SQL

■ Transforming Data Using Table Functions

Transforming Data Using SQL

Once data is loaded into the database, data transformations can be executed using SQL operations. There are four basic techniques for implementing SQL data transformations:

■ CREATE TABLE ... AS SELECT and INSERT /*+APPEND*/ AS SELECT (Data substitution)

■ Transforming Data Using UPDATE (Data substitution)

■ Transforming Data Using MERGE

■ Transforming Data Using Multitable INSERT


CREATE TABLE ... AS SELECT and INSERT /*+APPEND*/ AS SELECT

The CREATE TABLE ... AS SELECT statement (CTAS) is a powerful tool for efficiently executing a SQL query and storing the results of that query in a new database table. The INSERT /*+APPEND*/ ... AS SELECT statement offers the same capabilities with existing database tables.

The following SQL statement inserts data from sales_activity_direct into the sales table of the sample schema, using a SQL function to truncate the sales date values to midnight and assigning a fixed channel ID of 3.

INSERT /*+ APPEND NOLOGGING PARALLEL */
INTO sales
SELECT product_id, customer_id, TRUNC(sales_date), 3,
       promotion_id, quantity, amount
FROM sales_activity_direct;


Note: A fixed-value substitution like this is typically needed when you receive data from multiple source systems for your data warehouse.
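The CTAS form described in this section can be sketched in the same way. This is a minimal illustration, not part of the lab: the target table name sales_direct_stage is hypothetical, and the column aliases are assumptions added because CTAS requires a name for every expression in the select list.

```sql
-- Hypothetical staging table: CTAS creates it and fills it in one statement.
CREATE TABLE sales_direct_stage
  NOLOGGING
  PARALLEL
AS
SELECT product_id,
       customer_id,
       TRUNC(sales_date) AS sales_date,   -- time portion reset to midnight
       3                 AS channel_id,   -- fixed channel ID, as in the INSERT example
       promotion_id,
       quantity,
       amount
FROM   sales_activity_direct;
```

Unlike INSERT /*+APPEND*/, this form fails if sales_direct_stage already exists, so it suits one-off staging tables rather than repeated loads into the same table.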


Transforming Data Using UPDATE

Another technique for implementing a data substitution is to use an UPDATE statement to modify the sales.channel_id column. An UPDATE will provide the correct result.
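A minimal sketch of such an UPDATE follows. It assumes that the rows loaded from the direct-sales source arrived with a NULL channel_id and that 3 is the key for the direct channel, matching the fixed value used in the INSERT example.

```sql
-- Assumption: rows from the direct-sales source were loaded with channel_id NULL.
UPDATE sales
SET    channel_id = 3
WHERE  channel_id IS NULL;

COMMIT;
```

If the substitution touches a very large share of the rows, rebuilding the table with a CTAS statement may be more efficient than updating it in place.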


Transforming Data Using MERGE

Oracle Database's merge functionality extends SQL by introducing the keyword MERGE, which provides the ability to update or insert a row conditionally into a table or an out-of-line single-table view.

Example: assume that new data for the dimension table products is propagated to the data warehouse and has to be either inserted or updated. The table products_delta has the same structure as products.

Merge Operation Using SQL
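A hedged sketch of the merge for this example is given below. It assumes prod_id is the key shared by products and products_delta, and it lists only the columns that appear elsewhere in this lab (prod_list_price, prod_min_price); a real products table will likely have further mandatory columns that must be added to the INSERT branch.

```sql
MERGE INTO products t
USING products_delta s
ON (t.prod_id = s.prod_id)
-- Existing product: refresh its prices from the delta table.
WHEN MATCHED THEN
  UPDATE SET t.prod_list_price = s.prod_list_price,
             t.prod_min_price  = s.prod_min_price
-- New product: insert it (column list shortened for illustration).
WHEN NOT MATCHED THEN
  INSERT (prod_id, prod_list_price, prod_min_price)
  VALUES (s.prod_id, s.prod_list_price, s.prod_min_price);
```

A single MERGE replaces the separate UPDATE and INSERT statements that would otherwise be needed, so products_delta is read only once.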


Transforming Data Using Multitable INSERT

Many times, external data sources have to be segregated based on logical attributes for insertion into different target objects. A multitable INSERT statement offers the benefits of the INSERT ... SELECT statement when multiple tables are involved as targets.


Example (Unconditional Insert)

The following statement aggregates the transactional sales information, stored in sales_activity_direct, on a daily basis and inserts into both the sales and the costs fact table for the current day.

INSERT ALL
  INTO sales VALUES (product_id, customer_id, today, 3, promotion_id,
                     quantity_per_day, amount_per_day)
  INTO costs VALUES (product_id, today, promotion_id, 3,
                     product_cost, product_price)
SELECT TRUNC(s.sales_date) AS today, s.product_id, s.customer_id,
       s.promotion_id, SUM(s.amount) AS amount_per_day,
       SUM(s.quantity) quantity_per_day,
       p.prod_min_price*0.8 AS product_cost,
       p.prod_list_price AS product_price
FROM sales_activity_direct s, products p
WHERE s.product_id = p.prod_id
  AND TRUNC(sales_date) = TRUNC(SYSDATE)
GROUP BY TRUNC(sales_date), s.product_id, s.customer_id,
         s.promotion_id, p.prod_min_price*0.8, p.prod_list_price;


Example (Conditional ALL Insert)

The following statement inserts a row into the sales and costs tables for all sales transactions with a valid promotion and stores the information about multiple identical orders of a customer in a separate table cum_sales_activity. It is possible two rows will be inserted for some sales transactions, and none for others.

INSERT ALL
  WHEN promotion_id IN (SELECT promo_id FROM promotions) THEN
    INTO sales VALUES (product_id, customer_id, today, 3, promotion_id,
                       quantity_per_day, amount_per_day)
    INTO costs VALUES (product_id, today, promotion_id, 3,
                       product_cost, product_price)
  WHEN num_of_orders > 1 THEN
    INTO cum_sales_activity VALUES (today, product_id, customer_id,
                                    promotion_id, quantity_per_day,
                                    amount_per_day, num_of_orders)
SELECT TRUNC(s.sales_date) AS today, s.product_id, s.customer_id,
       s.promotion_id, SUM(s.amount) AS amount_per_day,
       SUM(s.quantity) quantity_per_day, COUNT(*) num_of_orders,
       p.prod_min_price*0.8 AS product_cost,
       p.prod_list_price AS product_price
FROM sales_activity_direct s, products p
WHERE s.product_id = p.prod_id
  AND TRUNC(sales_date) = TRUNC(SYSDATE)
GROUP BY TRUNC(sales_date), s.product_id, s.customer_id,
         s.promotion_id, p.prod_min_price*0.8, p.prod_list_price;


Conditional FIRST Insert

The following statement inserts into an appropriate shipping manifest according to the total quantity and the weight of a product order. An exception is made for high-value orders, which are sent by express unless their weight classification is too high. All incorrect orders, in this simple example represented as orders without a quantity, are stored in a separate table. It assumes the existence of appropriate tables large_freight_shipping, express_shipping, default_shipping, and incorrect_sales_order.

INSERT FIRST
  WHEN ((sum_quantity_sold > 10 AND prod_weight_class < 5) AND
        sum_quantity_sold >= 1) OR
       (sum_quantity_sold > 5 AND prod_weight_class > 5) THEN
    INTO large_freight_shipping VALUES
      (time_id, cust_id, prod_id, prod_weight_class, sum_quantity_sold)
  WHEN sum_amount_sold > 1000 AND sum_quantity_sold >= 1 THEN
    INTO express_shipping VALUES
      (time_id, cust_id, prod_id, prod_weight_class,
       sum_amount_sold, sum_quantity_sold)
  WHEN (sum_quantity_sold >= 1) THEN
    INTO default_shipping VALUES
      (time_id, cust_id, prod_id, sum_quantity_sold)
  ELSE
    INTO incorrect_sales_order VALUES (time_id, cust_id, prod_id)
SELECT s.time_id, s.cust_id, s.prod_id, p.prod_weight_class,
       SUM(amount_sold) AS sum_amount_sold,
       SUM(quantity_sold) AS sum_quantity_sold
FROM sales s, products p
WHERE s.prod_id = p.prod_id AND s.time_id = TRUNC(SYSDATE)
GROUP BY s.time_id, s.cust_id, s.prod_id, p.prod_weight_class;


Example (Mixed Conditional and Unconditional Insert)

The following example inserts new customers into the customers table and stores all new customers with a cust_credit_limit of 4500 or higher in an additional, separate table for further promotions.

INSERT FIRST
  WHEN cust_credit_limit >= 4500 THEN
    INTO customers
    INTO customers_special VALUES (cust_id, cust_credit_limit)
  ELSE
    INTO customers
SELECT * FROM customers_new;
