7/30/2019 5-Join LookUp and Merge Stages
Join, Lookup, and Merge stages
These three stages combine two or more input
links according to the values of user-designated key
column(s).
1. Join Stage
The Join stage is an active stage. It performs join
operations on two data sets input to the stage and then
outputs the resulting data set.
The input data sets are notionally identified as the
left set and the right set. You can specify which is
which.
It has any number of input links and a single output
link.
The stage can perform one of four join operations:
Inner: transfers records from the input data sets whose key columns contain equal values to the output data set. Records whose key columns do not contain equal values are dropped.
Left outer: transfers all values from the left data set, but transfers values from the right data set only where key columns match. The operator drops the key column from the right data set.
Right outer: transfers all values from the right data set, but transfers values from the left data set only where key columns match. The operator drops the key column from the left data set.
Full outer: transfers records in which the contents of the key columns are equal from the left and right input data sets to the output data set. It also transfers records whose key columns contain unequal values from both input data sets to the output data set.
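The four join operations can be illustrated outside DataStage with a small Python sketch. This is not how the stage is implemented; the function name `join`, the record layout (lists of dicts), and the sample data are assumptions made only for illustration. Columns missing from an unmatched record are simply absent here, where the real stage would supply default or null values.

```python
from collections import defaultdict

def join(left, right, key, how="inner"):
    """Join two lists of dict records on `key` (hypothetical helper).

    `how` is one of "inner", "left", "right", "full".
    """
    if how not in ("inner", "left", "right", "full"):
        raise ValueError(f"unknown join type: {how}")

    # Index the right input by key value so matches are found quickly.
    right_by_key = defaultdict(list)
    for rec in right:
        right_by_key[rec[key]].append(rec)

    out = []
    matched_keys = set()
    for l in left:
        matches = right_by_key.get(l[key], [])
        if matches:
            matched_keys.add(l[key])
            # Duplicate keys on either side yield every combination.
            for r in matches:
                # The key column is kept once; left values win on name clashes.
                out.append({**r, **l})
        elif how in ("left", "full"):
            # Unmatched left record is kept by left and full outer joins.
            out.append(dict(l))

    if how in ("right", "full"):
        # Unmatched right records are kept by right and full outer joins.
        for r in right:
            if r[key] not in matched_keys:
                out.append(dict(r))
    return out

# Illustrative sample data (not from the slides).
left = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
right = [{"id": 2, "city": "Pune"}, {"id": 3, "city": "Oslo"}]
inner_rows = join(left, right, "id", "inner")
full_rows = join(left, right, "id", "full")
```

With this sample data, the inner join keeps only the record for id 2, while the full outer join keeps one record for each of ids 1, 2, and 3.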
[Figures: Join stage examples]
2. Lookup Stage
The Lookup stage is an active stage. It is used to
perform lookup operations on a lookup table contained in
a Lookup File Set stage or provided by one of the
database stages that support reference output links.
Lookup tables should be small enough to fit into physical
memory; otherwise performance suffers because of paging.
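The idea of an in-memory lookup can be sketched in Python as follows. This is only an illustration of the concept, not the stage's implementation. The function name `lookup`, the record layout, and the sample data are assumptions; the four unmatched-record options (fail, continue, drop, reject) are the ones listed in the synopsis of this deck.

```python
def lookup(source, table, key, if_not_found="fail"):
    """Look up each source record in an in-memory table (hypothetical helper).

    Returns (output, rejects). `table` is a list of dict records.
    """
    if if_not_found not in ("fail", "continue", "drop", "reject"):
        raise ValueError(f"unknown option: {if_not_found}")

    # The whole lookup table is loaded into memory once, which is why it
    # should be small. Assumption: keys are unique; a later duplicate
    # silently replaces an earlier one in this sketch.
    index = {rec[key]: rec for rec in table}

    output, rejects = [], []
    for row in source:
        hit = index.get(row[key])
        if hit is not None:
            # Table entries are reusable: many source rows may match one entry.
            output.append({**hit, **row})
        elif if_not_found == "fail":
            raise LookupError(f"no lookup entry for {key}={row[key]!r}")
        elif if_not_found == "continue":
            output.append(dict(row))   # pass through without lookup columns
        elif if_not_found == "reject":
            rejects.append(dict(row))  # send to the reject output
        # "drop": the row is discarded
    return output, rejects

# Illustrative sample data (not from the slides).
orders = [{"cust": 1, "amt": 10}, {"cust": 9, "amt": 5}]
customers = [{"cust": 1, "name": "Ann"}]
kept, rejected = lookup(orders, customers, "cust", if_not_found="reject")
```

Because the source is streamed and only the table is held in memory, the source needs no sorting, but the table size drives memory use.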
[Figure: Lookup stage example]
[Figure: Lookup stage editor]
3. Merge Stage
The Merge stage is an active stage. It can have any number of input links, a single output link, and the same number of reject links as there are update input links. The Merge stage combines a sorted master data set with one or more sorted update data sets.
The columns from the records in the master and update data sets are merged so that the output record contains all the columns from the master record plus any additional columns from each update record.
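The master/update behaviour can be sketched in Python as below. This is a simplified illustration, not the stage's algorithm: the real stage relies on sorted inputs, whereas this sketch uses dictionaries for brevity. The function name `merge`, the `unmatched_masters` parameter, and the sample data are assumptions; the keep/drop choice and the per-update reject sets follow the synopsis in this deck.

```python
def merge(master, updates, key, unmatched_masters="keep"):
    """Merge a master list with several update lists (hypothetical helper).

    Returns (output, rejects), where `rejects` holds one list per update input.
    """
    if unmatched_masters not in ("keep", "drop"):
        raise ValueError(f"unknown option: {unmatched_masters}")

    # One index per update input. Assumption: keys are unique within each
    # update input.
    indexes = [{u[key]: u for u in upd} for upd in updates]

    output = []
    for m in master:
        rec = dict(m)
        hit = False
        for idx in indexes:
            # A matched update record is consumed, so it cannot match again.
            u = idx.pop(m[key], None)
            if u is not None:
                hit = True
                # Add only the columns the record does not already have.
                for col, val in u.items():
                    rec.setdefault(col, val)
        if hit or unmatched_masters == "keep":
            output.append(rec)

    # Whatever is left in each index never matched a master record.
    rejects = [list(idx.values()) for idx in indexes]
    return output, rejects

# Illustrative sample data (not from the slides).
master = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
update1 = [{"id": 1, "city": "Pune"}, {"id": 3, "city": "Oslo"}]
update2 = [{"id": 2, "age": 40}]
merged, rejects = merge(master, [update1, update2], "id")
```

In this example the update record for id 3 has no master record, so it ends up in the reject list for the first update input.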
Synopsis: Join, Lookup, and Merge

                                  Join                        Lookup                             Merge
Model                             RDBMS-style relational      Source - in-RAM LU table           Master - Update(s)
Memory usage                      light                       heavy                              light
# and names of inputs             exactly 2: 1 left, 1 right  1 Source, N LU tables              1 Master, N Update(s)
Mandatory input sort              both inputs                 no                                 all inputs
Duplicates in primary input       OK (x-product)              OK                                 Warning!
Duplicates in secondary input(s)  OK (x-product)              Warning!                           OK only when N = 1
Options on unmatched primary      NONE                        [fail] | continue | drop | reject  [keep] | drop
Options on unmatched secondary    NONE                        NONE                               capture in reject set(s)
On match, secondary entries are   reusable                    reusable                           consumed
# Outputs                         1                           1 out, (1 reject)                  1 out, (N rejects)
Captured in reject set(s)         Nothing (N/A)               unmatched primary entries          unmatched secondary entries