Informatica Powermart Sessions & Batches By Partha Sarathi Seth



Declaration

I hereby declare that this document is based on my personal experiences and/or experiences of my project members. To the best of my knowledge, this document does not contain any material that infringes the copyrights of any other individual or organization including the customers of Infosys.

Partha Sarathi Seth([email protected])

Target readers: All

Keywords: Informatica, Powermart, Batch, Session


Table of Contents

Introduction to Powermart
Sessions
    Creating a Session
    General Attributes
    Time
    Log Files
    Transformations
Batch
    Batch Attributes
Event Based Scheduling
Reference


Introduction to Powermart

Informatica Powermart is an ETL tool that Extracts, Transforms and Loads data from source to target. It involves various individual transformations that perform several types of operations to convert, cleanse and integrate source data before loading into the target.

Powermart 5.1 provides a Designer, Server Manager and Repository Manager.

• Designer – It provides an environment to build Mappings and Mapplets.

• Server Manager – Among other functions, it helps in creating sessions and batches.

• Repository Manager – It helps in the maintenance of the Repository.

A Mapping contains the data flow between a source and a target. It contains Source definition, Transformation, Target definition and Connectors.

A Mapplet is a set of transformations. It is created once as a reusable object and can then be used in various Mappings.

Sessions

In short, a Session executes a Mapping. A session contains additional information viz., Source, Destination, Log Files, Error Files, Schedule, etc.

Creating a Session

A session is created for every mapping that needs to be executed. The Server Manager provides a Session wizard to enter the session attributes.

Go to Operations → Add Session in the menu and select the Mapping to invoke the Session wizard.

General Attributes

1. Enter the session name. Generally, the session is named after the mapping, with an ‘S’ prefix.

2. Select the Source type.


3. Select the Source database name. The first time, you will have to configure the database details (Menu: Server Configuration → Database Connections, then add a new connection).

4. Source Options: Enter the location of the file (if the source is a file) or enter the database name, if the source is a table.


5. Select the target type.

6. Target Options: The following options are provided.

• Insert – If this option is selected, all the rows flagged for insert will be inserted.

• Update (as update) – If this option is selected, all the rows flagged for update will be updated.

• Update (as insert) – If this option is selected, all the rows flagged for update will be inserted.

• Update (else insert) – If this option is selected, each row flagged for update will be updated if it exists in the target; otherwise it will be inserted.

• Delete – If this option is selected, all the rows flagged for delete will be deleted.

• Truncate Table – If this option is selected, the target table is truncated before the load.
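The effect of the Insert, Update and Delete options on a single row can be summarised in a small Python sketch. This is only an illustrative model of the behaviour described above, not Informatica code: the function name `resolve_action`, the flag strings and the `update_mode` values are all hypothetical, and the Truncate Table option is not modelled.

```python
def resolve_action(flag, update_mode="as_update", exists_in_target=False):
    """Return the operation applied to one row (illustrative model only).

    flag             -- how the row is flagged: "insert", "update" or "delete"
    update_mode      -- hypothetical labels for the three Update options:
                        "as_update", "as_insert" or "else_insert"
    exists_in_target -- whether a matching row is already in the target
    """
    if flag == "insert":
        return "insert"
    if flag == "delete":
        return "delete"
    if flag == "update":
        if update_mode == "as_update":
            return "update"
        if update_mode == "as_insert":
            return "insert"
        if update_mode == "else_insert":
            # Update when the row exists, otherwise fall back to an insert.
            return "update" if exists_in_target else "insert"
    raise ValueError(f"unsupported combination: {flag!r}, {update_mode!r}")
```

For example, `resolve_action("update", "else_insert", exists_in_target=False)` yields `"insert"`, which mirrors the "Update (else insert)" bullet.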


7. Reject Options: Enter the location of the Reject file. This file will hold all the rejected records.

8. Pre-Session Commands: The Server Manager provides an option to write Pre-Session UNIX scripts, which are executed before the Session runs. This option is very useful if you have to process or format the source file before the session is executed.


9. Post Session Commands and Email: Similar to the Pre-Session commands, the Server Manager provides an option to write post-session UNIX scripts. This is a very useful option if you have event-based scheduling (I will elaborate on event-based scheduling later in the document). In addition, the Server Manager provides Email options, which are triggered on success or failure of a session.

10. Configuration Parameters: In the Event based scheduling section, enter the location of the Event file and the Event file name.


Time

The Server Manager provides several options when it comes to scheduling a Session.

• Run only on demand – If this option is selected, then the session would execute only when you manually start the session.

• Run once – If this option is selected, then the “Start Date” section is enabled and you will be able to schedule the session to run only once for a particular day and time.

• Run every – If this option is selected, then the “Day-Minutes-Hours” section will be enabled and you will be able to set the frequency at which the session runs.

• Customised Repeat – If this option is selected, then the session executes based on the dates and times specified in the Repeat dialog.

• Run continuously – If this option is enabled, the session restarts after every completion of execution.


The Customised Repeat option is the most flexible of all the available options. Clicking the Edit button opens the Repeat dialog.

You can define a Daily, Weekly or a Monthly schedule.

In a Daily schedule, you can select either the “Run once” or the “Run every” option.

In a Weekly schedule, you can select the day or days of the week on which you would like to schedule the session.

In a Monthly schedule, you can either select “Run on a day”, like the 1st and 31st of every month, or select “Run on the”, like the Last Monday of every month. If you schedule the session to run on the 31st of every month, the Server Manager automatically schedules the session on the last day of the month for months that have fewer than 31 days.
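The two monthly rules can be illustrated with a short Python sketch. It only reproduces the date arithmetic described above using the standard library; the function names are hypothetical and this is not how the Server Manager is implemented internally.

```python
import calendar
from datetime import date, timedelta


def monthly_run_date(year, month, day_of_month):
    """'Run on a day': fall back to the last day when the month is shorter."""
    last_day = calendar.monthrange(year, month)[1]  # number of days in the month
    return date(year, month, min(day_of_month, last_day))


def last_weekday_of_month(year, month, weekday):
    """'Run on the': last given weekday of the month (calendar.MONDAY == 0)."""
    last_day = calendar.monthrange(year, month)[1]
    end_of_month = date(year, month, last_day)
    # Step back from the end of the month to the requested weekday.
    offset = (end_of_month.weekday() - weekday) % 7
    return end_of_month - timedelta(days=offset)
```

For instance, `monthly_run_date(2023, 2, 31)` gives 28 February 2023, and `last_weekday_of_month(2024, 1, calendar.MONDAY)` gives 29 January 2024.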


Log Files

The log file path and name have to be entered in this dialog. The Server Manager provides an option to either save the session logs by timestamp or specify the number of log files to retain. In the latter case, the Server Manager suffixes numbers (starting from 0) to the log file names.

You can also specify the parameter file name and location.

In Batch handling, you can select one of the following options:

• Run always – If this option is selected, the session will be executed even if the previous session in the batch has failed.

• Run if Previously Completed – If this option is selected, the session will be executed only if the previous session in the batch has completed successfully.
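The difference between the two Batch handling options can be sketched in Python as follows. This is a simplified model under stated assumptions, not the Server Manager's actual logic: the function name, the handling labels and the status strings are hypothetical, and the sketch assumes that a session which did not run counts as "not completed" for the session after it.

```python
def run_batch_sequentially(sessions):
    """Run sessions in order and return a {name: status} dict.

    sessions -- list of (name, run, handling) tuples, where run() returns
                True on success and handling is "run_always" or
                "run_if_previous_completed" (hypothetical labels).
    """
    statuses = {}
    previous_ok = True  # the first session has no predecessor to wait on
    for name, run, handling in sessions:
        if handling == "run_if_previous_completed" and not previous_ok:
            statuses[name] = "not run"
            # Assumption: a session that did not run counts as not completed.
            previous_ok = False
            continue
        previous_ok = bool(run())
        statuses[name] = "succeeded" if previous_ok else "failed"
    return statuses
```

With a failing second session, a third session set to "Run if Previously Completed" is not executed, while a fourth session set to "Run always" still runs.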

The server manager also provides several error handling options.


Transformations

In the Session level override for Transformation, you can override the attributes of a mapping transformation.


On successful creation of a Session, the session appears with the following details.

Note: A Session or a Batch can also be started from the command line using the ‘pmcmd’ command. The ‘pmcmd’ call can be embedded in a shell script and scheduled using any scheduling tool, such as ‘crontab’.

Batch

Batches help in grouping sessions for either sequential or parallel execution. There are two types of Batches.

Sequential Batch – It executes sessions one after the other.

Concurrent Batch – It executes sessions in parallel.

As in the case of Sessions, the Server Manager also provides a Batch wizard to enter the Batch attributes. The Batch attributes override the session attributes.


Go to Operations → Add Batch in the menu to invoke the Batch wizard.

Batch Attributes

Enter the Batch Name and the location of the parameter file. Select the “Concurrent” check box to make it a concurrent batch. By default, the batch will be a sequential batch.

The Scheduling process is exactly the same as in the case of a session. The Batch scheduling will override the session scheduling.


The following example explains how a Batch works. We have 6 sessions, s_Test0, s_Test1, s_Test2, s_Test3, s_Test4 and s_Test5.

• The parent batch is Sequential_Test_Batch.

• Session s_Test0, batch Concurrent_Test_Batch and Session s_Test5 are sequential.

• Batches Sequential_Test_Batch_01 and Sequential_Test_Batch_02 will run in parallel.

• Sessions s_Test1 and s_Test2 are sequential.

• Sessions s_Test3 and s_Test4 are also sequential.

All sessions and/or batches inside a sequential batch will run sequentially. All sessions and/or batches inside a concurrent batch will run in parallel.

In the above example, when the Batch Sequential_Test_Batch is started, s_Test0 is executed first. On completion of s_Test0, the Concurrent_Test_Batch starts, and hence batches Sequential_Test_Batch_01 and Sequential_Test_Batch_02 execute in parallel. But s_Test2 starts only after the completion of s_Test1. Similarly, s_Test4 starts only after the completion of s_Test3. Session s_Test5 starts only after completion of sessions s_Test2 and s_Test4.
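The execution order in this example can be simulated with a short Python sketch. It is a toy model that uses threads to stand in for parallel execution, not Informatica code. The `Session` and `Batch` classes are hypothetical, and placing s_Test1/s_Test2 in Sequential_Test_Batch_01 and s_Test3/s_Test4 in Sequential_Test_Batch_02 is an assumption based on the description.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Session:
    def __init__(self, name):
        self.name = name

    def run(self, finished, lock):
        # A real session would execute a mapping; here we only record completion.
        with lock:
            finished.append(self.name)


class Batch:
    def __init__(self, name, children, concurrent=False):
        self.name = name
        self.children = children
        self.concurrent = concurrent

    def run(self, finished, lock):
        if self.concurrent:
            # Start every child at once and wait for all of them to finish.
            with ThreadPoolExecutor(max_workers=len(self.children)) as pool:
                futures = [pool.submit(c.run, finished, lock) for c in self.children]
                for future in futures:
                    future.result()
        else:
            # Run children strictly one after the other.
            for child in self.children:
                child.run(finished, lock)


def build_example():
    batch_01 = Batch("Sequential_Test_Batch_01", [Session("s_Test1"), Session("s_Test2")])
    batch_02 = Batch("Sequential_Test_Batch_02", [Session("s_Test3"), Session("s_Test4")])
    concurrent = Batch("Concurrent_Test_Batch", [batch_01, batch_02], concurrent=True)
    return Batch("Sequential_Test_Batch", [Session("s_Test0"), concurrent, Session("s_Test5")])


def run_example():
    finished, lock = [], threading.Lock()
    build_example().run(finished, lock)
    return finished
```

Calling `run_example()` returns the session names in completion order. s_Test0 is always first and s_Test5 always last, while the relative order of the two inner batches may vary between runs.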

Event Based Scheduling

In the case of Event based scheduling, the session waits for an event/indicator file at a specific location on the server. When the file arrives, the session starts and the event file is deleted.

When you create a session, you will have to enter the Event file name and location in the “Event based scheduling” area.

Now, if you start the session manually, the Server manager waits for the event file to appear in the specified location before running the session. If you schedule the session, the server manager first waits for the scheduled time to arrive and then for the event file.

While the session waits for the scheduled time, it appears in “Scheduled” status in the Server Manager Monitor. While the session waits for the event file, it appears in “File wait” status in the monitor.
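The wait-then-delete behaviour described above can be approximated with a simple polling loop. The following Python sketch is only an illustration of the idea; the function name, the polling interval and the timeout parameter are assumptions and do not reflect how the Informatica server detects the file.

```python
import os
import time


def wait_for_event_file(path, poll_seconds=1.0, timeout=None):
    """Block until the event file exists, then delete it.

    Returns True once the file was found and removed, or False if the
    optional timeout (in seconds) expired first.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while not os.path.exists(path):  # corresponds to the "File wait" status
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
    os.remove(path)  # the event file is consumed when the session starts
    return True
```

A caller would start the actual work only when the function returns True.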

The Event based scheduling is very useful when you have to execute two or more independent batches in sequence.

For example, I have 4 independent batches: Batch1, Batch2, Batch3 and Batch4. The following figure shows the process flow.

All the batches are scheduled to run at 10:00 hrs every day, but Batch2, Batch3 and Batch4 have event file dependencies as well.


[Figure: process flow. Batch1, Batch2, Batch3 and Batch4 are each scheduled every day at 10:00 hrs. Batch1 → Event File 1 → Batch2; Batch2 → Event File 2A → Batch3; Batch2 → Event File 2B → Batch4.]

At 10:00 hrs, Batch1 starts execution, whereas Batch2, Batch3 and Batch4 go to the “File wait” status. The last session of Batch1 will have to contain a script, entered in the “Post Session Command” dialog, that creates Event File 1. Similarly, the last session of Batch2 will have to contain scripts to create event files 2A and 2B.

On successful completion of Batch1 and the creation of Event File 1, Batch2 starts its execution. Batch2 creates Event Files 2A and 2B before completion. Batch3 and Batch4 will start as soon as they find event files 2A and 2B, respectively.

Note: The Server manager doesn’t look into the content of the event file.
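Because the content of the event file is not examined, an empty file is enough to signal the next batch. The document uses UNIX scripts for this step; the Python sketch below shows the same idea under stated assumptions. The function name is hypothetical, and the directory and file names passed in are placeholders rather than values from the document.

```python
from pathlib import Path


def create_event_files(directory, names):
    """Create one empty event file per name and return the created paths."""
    created = []
    for name in names:
        event_file = Path(directory) / name
        event_file.touch()  # empty file: only its presence matters
        created.append(event_file)
    return created
```

In the example above, the last session of Batch2 would call something equivalent to `create_event_files(event_dir, ["event_2a", "event_2b"])`, where `event_dir` and both file names are placeholders.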


It is good practice to place dummy sessions at the start and end of an independent batch that is scheduled to run sequentially. All event file related post-processing can then be handled in the dummy sessions.

Reference

• Server Manager Online help.
