Optimization of Lookup Transformation


DECLARATION

I hereby declare that this document is based on my personal experiences and/or experiences of my project members. To the best of my knowledge, this document does not contain any material that infringes the copyrights of any other individual or organization including the customers of Infosys.

  

Balaji Subramanian

MHRSINFA Project

Date: 07th December, 2004

ACKNOWLEDGEMENTS

I take immense pleasure in expressing my gratitude to my Project Manager, Mr. Lakshmi Narasimhan CR, under whose leadership we are executing the MHRSINFA project.

Many thanks to my Module Leader, Mr. Tak Amit Ashok, who guided me in completing this BOK.

I also take this opportunity to thank our teammates, who provided many of the inputs used to create this BOK.

 Thank You,

Balaji Subramanian

INDEX

DECLARATION
ACKNOWLEDGEMENTS
1. LOOKUP TRANSFORMATION OVERVIEW
2. LOOKUP PROPERTIES
3. LOOKUP CACHE
4. LOOKUP TRANSFORMATION TIPS

1. LOOKUP TRANSFORMATION OVERVIEW

The Lookup transformation is used to look up data in a relational table, view, or synonym. We can import a lookup definition from any relational database to which both the Informatica Client and Server can connect. Lookup is a passive transformation. We can use multiple Lookup transformations in a mapping.

The Informatica Server queries the lookup table based on the lookup ports in the transformation. It compares Lookup transformation port values to lookup table column values based on the lookup condition. We can configure the Lookup transformation to perform different types of lookups: connected or unconnected, cached or uncached.

CONNECTED AND UNCONNECTED LOOKUPS

Connected and unconnected Lookup transformations receive input and send output in different ways. We can configure a connected Lookup transformation to receive input directly from the mapping pipeline, or we can configure an unconnected Lookup transformation to receive input from the result of an expression in another transformation.

• A connected lookup receives input values directly from the pipeline, whereas an unconnected lookup receives input values from the result of a :LKP expression in another transformation.

• A connected lookup can return multiple columns from the same row, whereas an unconnected lookup can return only one column from each row.

• An unconnected Lookup transformation can be called more than once in a mapping.

• If we need to return only one column from the table, it is better to use an unconnected Lookup transformation. With an unconnected lookup, we can decide row by row whether to perform the lookup at all, by wrapping the lookup call in an IIF condition in the Expression transformation that calls it.
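The last point can be illustrated with a small Python sketch. This is not Informatica expression syntax: in a mapping, the call would be an IIF expression around a :LKP reference. Here a hypothetical `lookup_dept_name` function stands in for the unconnected lookup, the table contents are made up, and a counter shows that the lookup is invoked only for rows that need it.

```python
# Hypothetical lookup table: department id -> department name.
DEPT_TABLE = {10: "Finance", 20: "Sales", 30: "HR"}

lookup_calls = 0  # counts how often the "unconnected lookup" is invoked


def lookup_dept_name(dept_id):
    """Stand-in for an unconnected lookup: returns one column for one key."""
    global lookup_calls
    lookup_calls += 1
    return DEPT_TABLE.get(dept_id)  # None models a NULL (no match)


def resolve_dept_name(row):
    """Mimics the idea of IIF(ISNULL(dept_name), <call lookup>, dept_name)."""
    if row["dept_name"] is None:
        return lookup_dept_name(row["dept_id"])
    return row["dept_name"]


rows = [
    {"dept_id": 10, "dept_name": None},     # needs a lookup
    {"dept_id": 20, "dept_name": "Sales"},  # already populated, no lookup
    {"dept_id": 30, "dept_name": None},     # needs a lookup
]
resolved = [resolve_dept_name(r) for r in rows]
```

Three rows pass through, but the lookup runs only twice. A connected lookup would be evaluated for every row in the pipeline.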

 

CACHED OR UNCACHED

We can configure a Lookup transformation to cache the lookup table. The Informatica Server builds a cache in memory when it processes the first row of data in a cached Lookup transformation. It allocates memory for the cache based on the amount we configure in the transformation or session properties. The Informatica Server stores condition values in the index cache and output values in the data cache. The Informatica Server queries the cache for each row that enters the transformation.

  

2. LOOKUP PROPERTIES

Properties for the Lookup transformation identify the database source, how the Informatica Server processes the transformation, and how it handles caching and multiple matches.

The following are some of the Lookup transformation properties.

LOOKUP SQL OVERRIDE

• Overrides the default SQL statement used to query the lookup table.

• Specifies the SQL statement that we want the Informatica Server to use for querying lookup values.

• Use only with the lookup cache enabled.

• Enter only the SELECT, FROM, and WHERE clauses when entering the SQL override.

 

LOOKUP CACHING ENABLED

Indicates whether the Informatica Server caches lookup values during the session. When we enable lookup caching, the Informatica Server queries the lookup table once, caches the values, and looks up values in the cache during the session. This can improve session performance.

When we disable caching, the Informatica Server issues a SELECT statement against the lookup table each time a row passes into the transformation.
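The difference between the two modes can be sketched in Python with an in-memory SQLite database. This is only an analogy for what the Informatica Server does, not its implementation; the `dept` table, its columns, and the source keys are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.executemany(
    "INSERT INTO dept VALUES (?, ?)",
    [(10, "Finance"), (20, "Sales"), (30, "HR")],
)

source_keys = [10, 20, 10, 30, 20, 10]  # hypothetical source rows


def uncached_lookup(keys):
    """One SELECT per source row, as with caching disabled."""
    queries, results = 0, []
    for key in keys:
        row = conn.execute(
            "SELECT dept_name FROM dept WHERE dept_id = ?", (key,)
        ).fetchone()
        queries += 1
        results.append(row[0] if row else None)
    return results, queries


def cached_lookup(keys):
    """One SELECT for the whole table, then in-memory lookups."""
    cache = dict(conn.execute("SELECT dept_id, dept_name FROM dept"))
    queries = 1
    return [cache.get(key) for key in keys], queries


uncached_results, uncached_queries = uncached_lookup(source_keys)
cached_results, cached_queries = cached_lookup(source_keys)
```

Both functions return the same values. The uncached version issues one query per source row (six here), while the cached version issues a single query, which is why caching tends to pay off as the number of source rows grows.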

LOOKUP DATA CACHE SIZE

Indicates the maximum size the Informatica Server allocates to the data cache in memory. If the Informatica Server cannot allocate the configured amount of memory when initializing the session, it fails the session. When the Informatica Server cannot store all the data cache data in memory, it pages to disk as necessary.

• The Lookup Data Cache Size is 2,000,000 bytes by default. The minimum size is 1,024 bytes. Use only with the lookup cache enabled.

• For optimized performance, the data cache size should be the total size of all fields * the number of rows.

 

LOOKUP INDEX CACHE SIZE

Indicates the maximum size the Informatica Server allocates to the index cache in memory. If the Informatica Server cannot allocate the configured amount of memory when initializing the session, it fails the session. When the Informatica Server cannot store all the index cache data in memory, it pages to disk as necessary.

• The Lookup Index Cache Size is 1,000,000 bytes by default. The minimum size is 1,024 bytes. Use only with the lookup cache enabled.

• For optimized performance, the index cache size should be the total size of the condition fields * the number of rows.
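The two sizing rules above amount to simple arithmetic, sketched here in Python. The field widths and row count are hypothetical, and the result is a rough estimate only: it applies the rules of thumb from this document and ignores any per-row overhead the server may add.

```python
# Defaults quoted in this document, in bytes.
DEFAULT_DATA_CACHE = 2_000_000
DEFAULT_INDEX_CACHE = 1_000_000


def estimate_cache_sizes(condition_field_bytes, all_field_bytes, row_count):
    """Apply the rules: index = condition fields * rows, data = all fields * rows."""
    index_bytes = sum(condition_field_bytes) * row_count
    data_bytes = sum(all_field_bytes) * row_count
    return index_bytes, data_bytes


# Hypothetical lookup: an 8-byte key used in the condition, plus a
# 40-byte and a 12-byte column, over 100,000 rows.
index_bytes, data_bytes = estimate_cache_sizes(
    condition_field_bytes=[8],
    all_field_bytes=[8, 40, 12],
    row_count=100_000,
)

index_exceeds_default = index_bytes > DEFAULT_INDEX_CACHE
data_exceeds_default = data_bytes > DEFAULT_DATA_CACHE
```

For these assumed numbers the index cache estimate (800,000 bytes) fits within the default, while the data cache estimate (6,000,000 bytes) does not, so the data cache size would be the one to raise to avoid paging to disk.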

3. LOOKUP CACHE

The Informatica Server creates the cache files by default in the $PMCacheDir. If the data does not fit in the memory cache, the Informatica Server stores the overflow values in the cache files. When the session completes, the Informatica Server releases cache memory and deletes the cache files unless we configure the Lookup transformation to use a persistent cache.

When configuring a lookup cache, we can specify any of the following options:

PERSISTENT CACHE

If we want to save and reuse the cache files, we can configure the transformation to use a persistent cache. Use a persistent cache when the lookup table does not change between session runs. The first time the Informatica Server runs a session using a persistent lookup cache, it saves the cache files to disk instead of deleting them. The next time the Informatica Server runs the session, it builds the memory cache from the cache files.

RECACHE FROM DATABASE

If the persistent cache is not synchronized with the lookup table, we can configure the Lookup transformation to rebuild the lookup cache.
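The persistent cache and recache options can be pictured with the following Python sketch. It only mimics the behavior described above: the JSON file, the `load_cache` function, and the `recache_from_database` flag are hypothetical stand-ins, not the server's cache file format or property names.

```python
import json
import os
import tempfile

db_reads = 0  # counts full reads of the (simulated) lookup table


def read_lookup_table():
    """Stand-in for querying the lookup table in the database."""
    global db_reads
    db_reads += 1
    return {"10": "Finance", "20": "Sales"}


def load_cache(cache_file, recache_from_database=False):
    """Reuse the saved cache file if present, unless a rebuild is requested."""
    if os.path.exists(cache_file) and not recache_from_database:
        with open(cache_file) as fh:
            return json.load(fh)
    cache = read_lookup_table()
    with open(cache_file, "w") as fh:
        json.dump(cache, fh)  # keep the file for the next run
    return cache


cache_dir = tempfile.mkdtemp()  # plays the role of the cache directory
cache_file = os.path.join(cache_dir, "lkp_dept.json")  # hypothetical name

first_run = load_cache(cache_file)   # builds from the table, saves the file
second_run = load_cache(cache_file)  # builds from the saved file
rebuilt = load_cache(cache_file, recache_from_database=True)  # forced rebuild
```

The second run does not touch the lookup table at all. The table is read again only when a rebuild is requested, which is the case to use when the saved cache may be out of step with the table.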

STATIC CACHE

We can configure a static, or read-only, cache for any lookup table. By default, the Informatica Server creates a static cache. It caches the lookup table and looks up values in the cache for each row that comes into the transformation. When the lookup condition is true, the Informatica Server returns a value from the lookup cache. The Informatica Server does not update the cache while it processes the Lookup transformation.

DYNAMIC CACHE

We can configure the Lookup transformation to use a dynamic cache when the target table is also the lookup table. With a dynamic cache, the Informatica Server updates the lookup cache as it passes rows to the target.
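The following Python sketch shows why this matters when the target is also the lookup table. It is a simplified model under stated assumptions: it covers inserts only, and the `cust_id` key, the row layout, and the function name are hypothetical.

```python
def load_with_dynamic_cache(source_rows, target_rows):
    """Insert only rows whose key is not already in the target/lookup table."""
    # The initial cache is built from the target table (key -> row).
    cache = {row["cust_id"]: row for row in target_rows}
    inserted = []
    for row in source_rows:
        if row["cust_id"] not in cache:
            cache[row["cust_id"]] = row  # update the cache...
            target_rows.append(row)      # ...as the row goes to the target
            inserted.append(row["cust_id"])
    return inserted


target = [{"cust_id": 1, "name": "Asha"}]
source = [
    {"cust_id": 1, "name": "Asha"},  # already in the target
    {"cust_id": 2, "name": "Ravi"},  # new
    {"cust_id": 2, "name": "Ravi"},  # duplicate within the same run
]
inserted_ids = load_with_dynamic_cache(source, target)
```

Because the cache is updated as rows are passed on, the second occurrence of customer 2 is recognized as already present. With a static cache built once at the start, both occurrences would miss the cache and the row would be inserted twice.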

SHARED CACHE

We can configure multiple Lookup transformations to share a single lookup cache. The Informatica Server builds the cache when it processes the first Lookup transformation. It uses the same cache to perform lookups for subsequent Lookup transformations that share the cache.

We can share an unnamed cache between transformations in the same mapping. We can share a named cache between transformations in the same or different mappings.

4. LOOKUP TRANSFORMATION TIPS

 

• If we include more than one lookup condition, place the conditions with an equal sign first to optimize lookup performance.

• We can increase performance by overriding the default ORDER BY clause with one that has fewer columns. By default, the Informatica Server generates an ORDER BY clause for a cached lookup that contains all lookup ports. To increase performance, we can suppress the default ORDER BY clause and enter an override ORDER BY with fewer columns. The generated clause is suppressed by placing a comment notation, such as two dashes (--), after the override ORDER BY.

• For optimal performance, the default buffer block size should be the total size of all fields * (20 to 100).

• Add an index to the columns used in a lookup condition.

• Cache small lookup tables to improve session performance. The result of the lookup query and processing is the same, whether or not we cache the lookup table.

 

• Use a persistent lookup cache for static lookup tables. If the lookup table does not change between sessions, configure the Lookup transformation to use a persistent lookup cache. The Informatica Server then saves and reuses cache files from session to session, eliminating the time required to read the lookup table.

 

• When the source is large, cache lookup table columns for lookup tables of 500,000 rows or fewer.

• Cache a lookup table only if the number of lookup calls is more than 10-20% of the lookup table rows.

• For small lookup tables (fewer than 5,000 rows), cache the table when there are more than 5-10 lookup calls.
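The last two rules of thumb can be combined into a small decision helper, sketched below in Python. How the two rules are combined is an assumption made for illustration: the function name and parameter names are hypothetical, and the defaults take the lower end of each range given above (10% and 5 calls).

```python
def should_cache(lookup_rows, lookup_calls,
                 call_ratio=0.10, small_table_rows=5_000, small_table_calls=5):
    """Return True if the rules of thumb above suggest caching the lookup."""
    if lookup_rows < small_table_rows:
        # Small table: a handful of calls already justifies the cache.
        return lookup_calls > small_table_calls
    # Larger table: cache only if calls exceed a share of the table rows.
    return lookup_calls > call_ratio * lookup_rows


small_few_calls = should_cache(lookup_rows=1_000, lookup_calls=5)
small_many_calls = should_cache(lookup_rows=1_000, lookup_calls=6)
large_few_calls = should_cache(lookup_rows=100_000, lookup_calls=5_000)
large_many_calls = should_cache(lookup_rows=100_000, lookup_calls=20_000)
```

The thresholds are parameters so that a project can tighten them toward the upper end of each range (20% and 10 calls) after measuring its own sessions.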

 

 

 
