
Informatica interview questions and answers.

What is the difference between Informatica 7.0 and 8.0?

Features of Informatica 8. The architecture of PowerCenter 8 has changed a lot:

1. PC8 is service-oriented for modularity, scalability and flexibility.
2. The Repository Service and Integration Service (replacements for the Repository Server and Informatica Server) can run on different computers in a network (so-called nodes), even redundantly.
3. Management is centralized, which means services can be started and stopped on nodes via a central web interface.
4. Client tools access the repository via that centralized machine; resources are distributed dynamically.
5. Running all services on one machine is still possible, of course.
6. It supports unstructured data, including spreadsheets, email, Microsoft Word files, presentations and PDF documents. It provides high availability and seamless failover, eliminating single points of failure.
7. It adds performance improvements: to bump up system performance, Informatica has added "pushdown optimization", which moves data transformation processing to the native relational database engine whenever it is most appropriate.
8. Informatica has added more tightly integrated data profiling, cleansing, and matching capabilities.
9. Informatica has added a new web-based administrative console.
10. Ability to write a Custom transformation in C++ or Java.
11. A midstream SQL transformation has been added in 8.1.1 (not in 8.1).
12. Dynamic configuration of caches and partitioning.
13. The Java transformation is introduced.
14. User-defined functions.
15. The PowerCenter 8 release has an "Append to Target file" feature.

In a scenario I have columns col1, col2, col3 containing the rows (1, x, y) and (2, a, b), and I want the output as col1, col2 with the rows (1, x), (1, y), (2, a), (2, b). What is the procedure?

Use a Normalizer transformation: set occurs = 1 on the key column (col1) and occurs = 2 on the value column, so that col2 and col3 feed the two occurrences of a single multi-occurring port. The Normalizer generates one output row per occurrence; connect the output ports to the target.
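For reference, the same row generation expressed in plain SQL, as a minimal sketch assuming a hypothetical source table src(col1, col2, col3):

SELECT col1, col2 FROM src
UNION ALL
SELECT col1, col3 FROM src
ORDER BY col1;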

On one day I load 10 rows into my target, and on the next day I get 10 more rows to be added to my target, out of which 5 are updated rows. How can I send them to the target? How can I insert and update the records?

We can do this by identifying the granularity of the target table. We can then use a CRC external procedure to compare the newly generated CRC number with the old one, and if they do not match, update the row.
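A minimal sketch of the flagging logic in an Update Strategy expression, with hypothetical ports: lkp_CRC is the stored checksum returned by a Lookup on the target, and in_CRC is the checksum computed for the incoming row:

-- new key: insert; changed checksum: update; unchanged: reject (not written)
IIF(ISNULL(lkp_CRC), DD_INSERT, IIF(lkp_CRC != in_CRC, DD_UPDATE, DD_REJECT))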

What is the method of loading 5 flat files having the same structure into a single target, and which transformations can I use?

Use a Union transformation; otherwise, write the file paths of all five files into one file list and use that file in the session properties with the source filetype set to Indirect.
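A sketch of such a file list (the paths are hypothetical); with the source filetype set to Indirect, the session reads each listed file in turn:

/data/src/sales_file1.dat
/data/src/sales_file2.dat
/data/src/sales_file3.dat
/data/src/sales_file4.dat
/data/src/sales_file5.dat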

In a scenario where I want to change the dimensions of a table and normalize a denormalized table, which transformation can I use?

You can use the Normalizer transformation. It will normalize the records.

What is meant by the direct and indirect loading options in sessions?


Direct loading is used when the session reads a single source file, whereas indirect loading is used when the session reads a file list naming multiple files of the same structure.

With direct loading we can perform the recovery process, but with indirect loading we cannot.

SCD mappings mostly use an unconnected Lookup transformation, and an unconnected Lookup uses a static cache only. In that case, how can you insert or update data in the target using a static cache?

How many types of dimensions are available in Informatica?

There are 3 types:

1. Star schema

2. Snowflake schema

3. Galaxy schema

3 types. They are:

1. SCD (slowly changing dimension) Type 1
2. SCD Type 2
3. SCD Type 3

When we create the target as a flat file and the source as Oracle, how can I specify the first row as column names in the flat file?

Use a pre-SQL statement, but this is a hard-coding method: if you change the column names or put extra columns in the flat file, you will have to change the insert statement.

You can also achieve this by changing the setting in the Informatica Repository Manager to display the column headings. The only disadvantage of this is that it will be applied to all the files generated by that server.

1. Can you explain one critical mapping? 2. As a performance issue, which one is better: a connected Lookup transformation or an unconnected one?

It depends on your data and the type of operation you are doing.

If you need to look up a value for all the rows, or for most of the rows coming out of the source, then go for a connected Lookup.

Otherwise, go for an unconnected Lookup.

This applies especially in a conditional case like the following:

we have to get a value for the field 'customer' from either the order table or the customer_data table, on the basis of the following rule:

If customer_name is null then customer = customer_data.Customer_Id,

otherwise

customer = order.customer_name.

So in this case we would go for an unconnected Lookup.

Dimensions are:

1. SCD (slowly changing dimensions)
2. Rapidly changing dimensions
3. Junk dimensions
4. Large dimensions
5. Degenerated dimensions
6. Conformed dimensions

How can you improve the performance of the Aggregator transformation?

We can improve Aggregator performance in the following ways:

1. Send sorted input.

2. Increase the aggregator cache size, i.e. the index cache and data cache.

3. Give the transformation only the input/output you need, i.e. reduce the number of input and output ports.

Use a Sorter transformation to sort the input, enable Sorted Input in the Aggregator properties, and filter the records beforehand.

Why did you use stored procedures in your ETL application?

Usage of stored procedures has the following advantages:

1. Checks the status of the target database

2. Drops and recreates indexes

3. Determines if enough space exists in the database

4. Performs a specialized calculation

Why did you use the Update Strategy in your application?

Update Strategy is used to flag the data for Insert, Update or Delete depending on some condition. You can do this at session level too, but there you cannot define any condition. For example, if you want to do update and insert in one mapping, you will create two flows and make one insert and one update, depending on some condition. Refer to Update Strategy in the Transformation Guide for more information.

I have used it in cases where I wanted to insert and update records in the same mapping.

How do you create a single Lookup transformation using multiple tables?

Write an override SQL query and adjust the ports as per the query.

That is, by writing a SQL override and specifying the joins in the SQL override.
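A sketch of such a lookup SQL override, assuming hypothetical EMP and DEPT tables joined on DEPTNO; the selected columns must match the Lookup transformation's ports:

SELECT E.EMPNO AS EMPNO, E.ENAME AS ENAME, D.DNAME AS DNAME
FROM EMP E, DEPT D
WHERE E.DEPTNO = D.DEPTNO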

With an update strategy, which gives more performance, a target table or a flat file target, and why?

Pros: loading, sorting and merging operations will be faster on a flat file, as there is no index concept and the data will be in ASCII mode.

Cons: there is no concept of updating existing records in a flat file.

As there are no indexes, lookup speed against flat files will be lower.

How to load the time dimension?

We can use SCD Type 1/2/3 to load any dimension, based on the requirement.

We can also use a stored procedure to populate the time dimension.

What is a source qualifier?

When you add a relational or a flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier represents the rows that the Informatica Server reads when it executes a session. The basic purpose of a source qualifier is to convert database-specific data types into Informatica-specific types, so that data can be integrated easily.

After dragging the ports of three sources (SQL Server, Oracle, Informix) to a single source qualifier, can you map these three ports directly to the target?

If you drag three heterogeneous sources and populate the target without any join, you are creating a Cartesian product. Without a join, not only different sources but even homogeneous sources show the same error.

If you do not want to use joins at source qualifier level, you can add some joins separately.

In a Source Qualifier we can join tables from the same database only.

How can you work with a remote database in Informatica? Did you work directly by using remote connections?

You can work with a remote database, but you have to configure an FTP connection with details such as the IP address and user authentication.

What is the PowerCenter repository?

Standalone repository. A repository that functions individually, unrelated and unconnected to other repositories. Global repository. (PowerCenter only.) The centralized repository in a domain, a group of connected repositories. Each domain can contain one global repository. The global repository can contain common objects to be shared throughout the domain through global shortcuts. Local repository. (PowerCenter only.) A repository within a domain that is not the global repository. Each local repository in the domain can connect to the global repository and use objects in its shared folders.

The PowerCenter repository is used to store Informatica's metadata. Information such as mapping names, locations, target definitions, source definitions, transformations and flows is stored as metadata in the repository.

What are the types of metadata that are stored in the repository?


Source definitions: definitions of database objects (tables, views, synonyms) or files that provide source data.

Target definitions: definitions of database objects or files that contain the target data.

Multi-dimensional metadata: target definitions that are configured as cubes and dimensions.

Mappings: a set of source and target definitions along with transformations containing business logic that you build into the transformation. These are the instructions that the Informatica Server uses to transform and move data.

Reusable transformations: transformations that you can use in multiple mappings.

Mapplets: a set of transformations that you can use in multiple mappings.

Sessions and workflows: sessions and workflows store information about how and when the Informatica Server moves data. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. A session is a type of task that you can put in a workflow. Each session corresponds to a single mapping.

Define the Informatica repository.

Informatica Repository: the Informatica repository is at the center of the Informatica suite. You create a set of metadata tables within the repository database that the Informatica applications and tools access. The Informatica client and server access the repository to save and retrieve metadata.

What is the difference between a mapplet and a reusable transformation?

Mapplet: one or more transformations.

Reusable transformation: only one transformation.

Mapplet: a set of transformations that is reusable.

Reusable transformation: a single transformation which is reusable.

What is the difference between partitioning of relational targets and partitioning of file targets?

Partitioning can be done on both relational and flat file targets.

Informatica supports the following partition types:

1. Database partitioning

2. Round-robin

3. Pass-through

4. Hash-key partitioning

5. Key range partitioning

All of these are applicable for relational targets. For flat file targets, only database partitioning is not applicable.

Informatica supports N-way partitioning. You can just specify the name of the target file and create the partitions; the rest will be taken care of by the Informatica session.

What is a parameter file?

When you start a workflow, you can optionally enter the directory and name of a parameter file. The Informatica Server runs the workflow using the parameters in the file you specify.


For UNIX shell users, enclose the parameter file name in single quotes:

-paramfile '$PMRootDir/myfile.txt'

For Windows command prompt users, the parameter file name cannot have beginning or trailing spaces. If the name includes spaces, enclose the file name in double quotes:

-paramfile "$PMRootDir\my file.txt"

Note: When you write a pmcmd command that includes a parameter file located on another machine, use the backslash (\) with the dollar sign ($), i.e. \$PMRootDir. This ensures that the machine where the variable is defined expands the server variable.

pmcmd startworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 -f east -w wSalesAvg -paramfile '$PMRootDir/myfile.txt'
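For reference, a minimal sketch of what such a parameter file might contain (the folder, workflow, session, variable and connection names here are hypothetical):

[east.WF:wSalesAvg.ST:s_m_sales_avg]
$$ExtractDate=2004-06-30
$DBConnection_Source=SALES_ORA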

Can you start a batch within a batch?

You cannot. If you want to start a batch that resides in another batch, create a new independent batch and copy the necessary sessions into the new batch.

What is a batch? Describe the types of batches.

A batch is a group of sessions grouped together for execution.

There are two types of batches:

1. Concurrent
2. Sequential

Can you copy a session to a different folder or repository?

In addition, you can copy the workflow from the Repository Manager. This will automatically copy the mapping, associated sources, targets and session to the target folder.

Yes, it is possible. For copying a session to a folder in the same repository, or to another folder in a different repository, we can use the Repository Manager (which is a client-side tool). By simply dragging the session to the target destination, the session will be copied.

How does the Informatica server increase session performance through partitioning the source?

For relational sources, the Informatica server creates multiple connections, one for each partition of a single source, and extracts a separate range of data through each connection. The Informatica server reads multiple partitions of a single source concurrently. Similarly, for loading, the Informatica server creates multiple connections to the target and loads partitions of data concurrently.

Why do we use partitioning of the session in Informatica?

Performance can be improved by processing data in parallel in a single session by creating multiple partitions of the pipeline.

The Informatica server can achieve high performance by partitioning the pipeline and performing the extract, transformation and load for each partition in parallel.

Which tool do you use to create and manage sessions and batches, and to monitor and stop the Informatica server?

Informatica Workflow Manager and Informatica Workflow Monitor.


Can you generate reports in Informatica?

It is an ETL tool; you cannot make reports from it, but you can generate a metadata report, which is not going to be used for business analysis.

How can you recognise whether or not newly added rows in the source get inserted in the target?

If it is a Type 2 dimension the above answer is fine, but if you want to get information on all the insert and update statements, you need to use the session log file with tracing configured to verbose.

You will get a complete account of which records were inserted and which were not.

What are the different types of Type 2 dimension mapping?

1. Version number

2. Flag

3. Date

What are the mappings that we use for slowly changing dimension tables?


We can use the following transformations in a mapping for a slowly changing dimension table:

• Expression
• Lookup
• Filter
• Sequence Generator
• Update Strategy

What are the types of mapping in the Getting Started Wizard?

1. Simple Pass-Through
2. Slowly Growing Target

What are the types of mapping wizards that are provided in Informatica?

Simple Pass-Through

Slowly Growing Target

Slowly Changing Dimension:

Type 1: most recent values

Type 2: full history (by version, flag, or date)

Type 3: current and one previous value

What are the options in the target session for the Update Strategy transformation?

Update as Insert:

This option specifies that all the update records from the source be flagged as inserts in the target. In other words, instead of updating the records in the target, they are inserted as new records.

Update else Insert:

This option enables Informatica to flag the records either for update if they are old, or for insert if they are new records from the source.

The full set of options is: insert, update, delete, update as update, update as insert, and update else insert.

What is Data Driven?

The Informatica Server follows instructions coded into Update Strategy transformations within the session mapping to determine how to flag rows for insert, delete, update, or reject. If the mapping for the session contains an Update Strategy transformation, this field is marked Data Driven by default

What is the default source option for the Update Strategy transformation?

DATA DRIVEN

What is the Update Strategy transformation?

The model you choose constitutes your update strategy: how to handle changes to existing rows. In PowerCenter and PowerMart, you set your update strategy at two different levels:

Within a session. When you configure a session, you can instruct the Informatica Server to either treat all rows in the same way (for example, treat all rows as inserts), or use instructions coded into the session mapping to flag rows for different database operations.

Within a mapping. Within a mapping, you use the Update Strategy transformation to flag rows for insert, delete, update, or reject.
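Within the mapping, the Update Strategy expression must evaluate to one of the built-in constants DD_INSERT (0), DD_UPDATE (1), DD_DELETE (2) or DD_REJECT (3). A minimal sketch with a hypothetical flag port:

-- delete rows flagged 'D' in the source, update everything else
IIF(IN_ACTION_FLAG = 'D', DD_DELETE, DD_UPDATE)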

What are the basic needs to join two sources in a Source Qualifier?

Both tables should have a common field with the same data type.

It is not necessary that they follow a primary and foreign key relationship, but if any relationship exists it will help you from a performance point of view.

What is the default join that the Source Qualifier provides?

An inner equijoin is the default join in the Source Qualifier.

(The Joiner transformation, by contrast, supports the following join types, which you set in the Properties tab: Normal (default), Master Outer, Detail Outer, and Full Outer.)

What is the target load order?

A target load order group is the collection of source qualifiers, transformations, and targets linked together in a mapping.

What are the tasks that the Source Qualifier performs?

Join data originating from the same source database. You can join two or more tables with primary-foreign key relationships by linking the sources to one Source Qualifier.

Filter records when the Informatica Server reads source data. If you include a filter condition, the Informatica Server adds a WHERE clause to the default query.

Specify an outer join rather than the default inner join. If you include a user-defined join, the Informatica Server replaces the join information specified by the metadata in the SQL query.

Specify sorted ports. If you specify a number for sorted ports, the Informatica Server adds an ORDER BY clause to the default SQL query.

Select only distinct values from the source. If you choose Select Distinct, the Informatica Server adds a SELECT DISTINCT statement to the default SQL query.

Create a custom query to issue a special SELECT statement for the Informatica Server to read source data. For example, you might use a custom query to perform aggregate calculations or execute a stored procedure.

What is the Source Qualifier transformation?

When you add a relational or a flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier represents the rows that the Informatica Server reads when it executes a session.


Why do we use the Stored Procedure transformation?

A Stored Procedure transformation is an important tool for populating and maintaining databases. Database administrators create stored procedures to automate time-consuming tasks that are too complicated for standard SQL statements.

What are the types of groups in the Router transformation?

A Router transformation has the following types of groups:

Input group: the Designer copies property information from the input ports of the input group to create a set of output ports for each output group.

Output groups: there are two types of output groups, user-defined groups and the default group. You cannot modify or delete output ports or their properties.

What is the Router transformation?

A Router transformation is similar to a Filter transformation because both transformations allow you to use a condition to test data. A Filter transformation tests data for one condition and drops the rows of data that do not meet the condition. However, a Router transformation tests data for one or more conditions and gives you the option to route rows of data that do not meet any of the conditions to a default output group.

What is the Rank Index in the Rank transformation?

The port on which you want to generate the rank is known as the rank port; the generated values are known as the rank index.

How does the Informatica server sort string values in the Rank transformation?

When the Informatica Server runs in Unicode data movement mode, it uses the sort order configured in the session properties.

We can run the Informatica server in either Unicode or ASCII data movement mode. Unicode mode: in this mode the Informatica server sorts the data as per the sort order configured in the session.

ASCII mode: in this mode the Informatica server sorts the data as per the binary order.

Which transformation should we use to normalize COBOL and relational sources?

The Normalizer transformation normalizes records from COBOL and relational sources, allowing you to organize the data according to your own needs. A Normalizer transformation can appear anywhere in a data flow when you normalize a relational source. Use a Normalizer transformation instead of the Source Qualifier transformation when you normalize a COBOL source. When you drag a COBOL source into the Mapping Designer workspace, the Normalizer transformation automatically appears, creating input and output ports for every column in the source

Difference between static cache and dynamic cache

Static cache: you cannot insert into or update the cache. The Informatica server returns a value from the lookup table or cache when the condition is true; when the condition is not true, it returns the default value for connected transformations and NULL for unconnected transformations.

Dynamic cache: you can insert rows into the cache as you pass them to the target. The Informatica server inserts rows into the cache when the condition is false; this indicates that the row is not in the cache or target table, and you can pass these rows to the target table.

What are the types of lookup caches?

1. Static cache

2. Dynamic cache

3. Persistent cache

Differences between connected and unconnected Lookup?

Connected Lookup: receives input values directly from the pipeline. Unconnected Lookup: receives input values from the result of an :LKP expression in another transformation.

Connected: you can use a dynamic or static cache. Unconnected: you can use a static cache only.

Connected: the cache includes all lookup columns used in the mapping. Unconnected: the cache includes all lookup output ports in the lookup condition and the lookup/return port.

Connected: supports user-defined default values. Unconnected: does not support user-defined default values.

What is the Lookup transformation?

Using it, we can access data from a relational table which is not a source in the mapping.

For example, suppose the source contains only Empno, but we also want Empname in the mapping. Then instead of adding another table which contains Empname as a source, we can look up the table and get the Empname into the target.

What are the Joiner caches?

Specifies the directory used to cache master records and the index to these records. By default, the cached files are created in a directory specified by the server variable $PMCacheDir. If you override the directory, make sure the directory exists and contains enough disk space for the cache files. The directory can be a mapped or mounted drive.

There are 2 types of cache in the Joiner: 1. Data cache 2. Index cache.

What are the settings that you use to configure the Joiner transformation?

The master and detail source, the type of join, and the condition of the join. The Joiner transformation supports the following join types, which you set in the Properties tab:

Normal (default), Master Outer, Detail Outer, Full Outer

In which conditions can we not use the Joiner transformation (limitations of the Joiner transformation)? This limitation is no longer valid as of version 7.2.

Now we can use a Joiner even if the data is coming from the same source.

What is the difference between the Joiner transformation and the Source Qualifier transformation?

Source Qualifier: joins homogeneous sources.

Joiner: joins heterogeneous sources.

What is the aggregate cache in the Aggregator transformation?

When you run a workflow that uses an Aggregator transformation, the Informatica Server creates index and data caches in memory to process the transformation. If the Informatica Server requires more space, it stores overflow values in cache files.

Can you use mapping parameters or variables created in one mapping in another mapping?

No. You might want to use a workflow parameter/variable if you want it to be visible to other mappings/sessions.

What are mapping parameters and mapping variables?

Please refer to the documentation for a fuller understanding.

Mapping variables have two identities: the start value and the current value.

Start value = current value (when the session starts the execution of the underlying mapping).

Start value <> current value (while the session is in progress and the variable value changes on one or more occasions).

The current value at the end of the session is nothing but the start value for the subsequent run of the same session.

What are the unsupported repository objects for a mapplet?

A mapplet cannot contain the following objects: COBOL sources, Normalizer transformations, XML sources and XML Source Qualifier transformations, target definitions, other mapplets, and pre- and post-session stored procedures.

What are the methods for creating reusable transformations?

You can create them using 2 methods:

using the Transformation Developer, or creating a normal one and promoting it to reusable.

What are reusable transformations?

A transformation that can be reused is known as a reusable transformation.

You can create them using 2 methods:

using the Transformation Developer, or creating a normal one and promoting it to reusable.

What are the active and passive transformations?

Transformations can be active or passive. An active transformation can change the number of rows that pass through it, such as a Filter transformation that removes rows that do not meet the filter condition.

A passive transformation does not change the number of rows that pass through it, such as an Expression transformation that performs a calculation on data and passes all rows through the transformation.

What is a mapplet?

A mapplet should have a Mapplet Input transformation which receives input values, and a Mapplet Output transformation which passes the final modified data back to the mapping.

When the mapplet is displayed within the mapping, only the input and output ports are displayed, so that the internal logic is hidden from the end user's point of view.

Which transformation do you need when using COBOL sources as source definitions?

The Normalizer transformation, which is used to normalize the data, since COBOL sources often consist of denormalized data.

What is a transformation?

It is a process of converting a given input to the desired output.

How many ways can you update a relational source definition, and what are they?

Two ways: 1. Edit the definition. 2. Reimport the definition.

Where should you place the flat file to import the flat file definition into the Designer?

There is no such restriction on where to place the source file. From a performance point of view it is better to place the file in the server's local src folder; if you need the path, please check the server properties available in the Workflow Manager.


That doesn't mean we should not place it in any other folder; but if we place it in the server src folder, the source will be selected by default at session creation time.

To provide support for mainframe source data, which files are used as source definitions?

COBOL copybook files.

What is data cleansing?

Data cleansing is a two step process including DETECTION and then CORRECTION of errors in a data set.

What is Load Manager?

I am providing the answer, which I have taken from the Informatica 7.1.1 manual.

Ans: While running a workflow, the PowerCenter Server uses the Load Manager process and the Data Transformation Manager (DTM) process to run the workflow and carry out workflow tasks. When the PowerCenter Server runs a workflow, the Load Manager performs the following tasks:

1. Locks the workflow and reads workflow properties.
2. Reads the parameter file and expands workflow variables.
3. Creates the workflow log file.
4. Runs workflow tasks.
5. Distributes sessions to worker servers.
6. Starts the DTM to run sessions.
7. Runs sessions from master servers.
8. Sends post-session email if the DTM terminates abnormally.

When the PowerCenter Server runs a session, the DTM performs the following tasks:

1. Fetches session and mapping metadata from the repository.
2. Creates and expands session variables.
3. Creates the session log file.
4. Validates session code pages if data code page validation is enabled; checks query conversions if data code page validation is disabled.
5. Verifies connection object permissions.
6. Runs pre-session shell commands.
7. Runs pre-session stored procedures and SQL.
8. Creates and runs mapping, reader, writer, and transformation threads to extract, transform, and load data.
9. Runs post-session stored procedures and SQL.
10. Runs post-session shell commands.
11. Sends post-session email.

How can we partition a session in Informatica?

The Informatica® PowerCenter® Partitioning option optimizes parallel processing on multiprocessor hardware by providing a thread-based architecture and built-in data partitioning. GUI-based tools reduce the development effort necessary to create data partitions and streamline ongoing troubleshooting and performance tuning tasks, while ensuring data integrity throughout the execution process. As the amount of data within an organization expands and real-time demand for information grows, the PowerCenter Partitioning option enables hardware and applications to provide outstanding performance and jointly scale to handle large volumes of data and users.

Explain the Informatica architecture in detail.

The Informatica server connects to the source data and target data using native ODBC drivers; it also connects to the repository for running sessions and retrieving metadata information.

source ------> informatica server --------> target
                        |
                        |
                   REPOSITORY

The PowerCenter Server is a repository client application. It connects to the Repository Server and Repository Agent to retrieve workflow and mapping metadata from the repository database. When the PowerCenter Server requests a repository connection from the Repository Server, the Repository Server starts and manages the Repository Agent. The Repository Server then re-directs the PowerCenter Server to connect directly to the Repository Agent.

How to read rejected or bad data from the bad file and reload it to the target?

Correct the rejected data and send it to the target relational tables using the reject loader utility. Find the rejected data by using the column indicator and row indicator.

Difference between the Informatica Repository Server and the Informatica Server?

Informatica Repository Server: manages connections to the repository from client applications. Informatica Server: extracts the source data, performs the data transformation, and loads the transformed data into the target.

How do you transfer data from the data warehouse to a flat file?

You can write a mapping with the flat file as a target using a DUMMY_CONNECTION. A flat file target is built by pulling a source into target space using the Warehouse Designer tool.

What are the tasks that the Load Manager process will do?

Manages session and batch scheduling: when you start the Informatica server, the Load Manager launches and queries the repository for a list of sessions configured to run on the Informatica server. When you configure a session, the Load Manager maintains a list of sessions and session start times. When you start a session, the Load Manager fetches the session information from the repository to perform validations and verifications prior to starting the DTM process.

Locking and reading the session: when the Informatica server starts a session, the Load Manager locks the session in the repository. Locking prevents you from starting the same session again and again.

Reading the parameter file: if the session uses a parameter file, the Load Manager reads the parameter file and verifies that the session-level parameters are declared in the file.

Verifying permissions and privileges: when the session starts, the Load Manager checks whether or not the user has the privileges to run the session.

Creating log files: the Load Manager creates a log file containing the status of the session.

The Load Manager also sends the 'failure mails' in case of failure in the execution of the subsequent DTM process.

How can you create or import a flat file definition into the Warehouse Designer?

You can create a flat file definition in the Warehouse Designer: create a new target and select the type as flat file. Save it, and you can enter the various columns for that created target by editing its properties. Once the target is created and saved, you can import it from the Mapping Designer.

What are connected and unconnected transformations?

An unconnected transformation can't be connected to another transformation in the data flow, but it can be called inside another transformation.

What is a time dimension? Give an example.

In a relational data model, for normalization purposes, year lookup, quarter lookup, month lookup, and week lookup tables are not merged into a single table. In dimensional data modeling (star schema), these tables would be merged into a single table called the TIME DIMENSION, for performance and for slicing data.

This dimension helps to find the sales done on a daily, weekly, monthly and yearly basis. We can have a trend analysis by comparing this year's sales with the previous year's, or this week's sales with the previous week's.

Discuss the advantages and disadvantages of the star and snowflake schemas.

In a STAR schema there is no relation between any two dimension tables, whereas in a SNOWFLAKE schema there can be relations between the dimension tables.

What are the main advantages and purpose of using the Normalizer transformation in Informatica?

The Normalizer transformation is used mainly with COBOL sources, where most of the time the data is stored in denormalized format. Also, the Normalizer transformation can be used to create multiple rows from a single row of data.

At the most, how many transformations can be used in a mapping?

In a mapping we can use any number of transformations, depending on the project and the transformations needed.

What is the difference between normal load and bulk load?

Normal load: a normal load writes information to the database log file, so that it will be helpful if any recovery is needed. When the source file is a text file and you are loading data to a table, in such cases you should use normal load only, or else the session will fail.

Bulk mode: a bulk load does not write information to the database log file, so if any recovery is needed we can't do anything in such cases.

Comparatively, bulk load is much faster than normal load.

What is a junk dimension?

A "junk" dimension is a collection of random transactional codes, flags and/or text attributes that are unrelated to any particular dimension. The junk dimension is simply a structure that provides a convenient place to store the junk attributes. A good example would be a trade fact in a company that brokers equity trades.


Can we look up a table from a Source Qualifier transformation (unconnected lookup)?

No, we can't.

I will explain why:

1) Unless you assign the output of the source qualifier to another transformation or to a target, there is no way it will include the field in the query.

2) The source qualifier doesn't have any variable fields to use in an expression.

How to get the first 100 rows from the flat file into the target?

1. Use the test download option if you want to use it for testing.

2. Put a counter/sequence generator in the mapping and filter on its value.

Difference between summary filter and detail filter?

Summary filter: we can apply it to a group of records that contain common values.

Detail filter: we can apply it to each and every record in a database.

What is the difference between a view and a materialized view?

Materialized views are schema objects that can be used to summarize, precompute, replicate, and distribute data, e.g. to construct a data warehouse.

A materialized view provides indirect access to table data by storing the results of a query in a separate schema object, unlike an ordinary view, which does not take up any storage space or contain any data.

Compare the data warehousing top-down approach with the bottom-up approach.

Top-down:

ODS --> ETL --> Data warehouse --> Data mart --> OLAP

Bottom-up:

ODS --> ETL --> Data mart --> Data warehouse --> OLAP

Discuss which is better among incremental load, normal load and bulk load.

It depends on the requirement. Otherwise, incremental load can be better, as it takes only the data that is not already available on the target.

What is the difference between connected and unconnected stored procedures?

Unconnected:

The unconnected Stored Procedure transformation is not connected directly to the flow of the mapping. It either runs before or after the session, or is called by an expression in another transformation in the mapping.

Connected:

The flow of data through a mapping in connected mode also passes through the Stored Procedure transformation. All data entering the transformation through the input ports affects the stored procedure. You should use a connected Stored Procedure transformation when you need data from an input port sent as an input parameter to the stored procedure, or the results of a stored procedure sent as an output parameter to another transformation.


Differences between Informatica 6.2 and Informatica 7.0?

Features in 7.1 are:

1. Union and Custom transformations

2. Lookup on flat files

3. Grid servers working on different operating systems can coexist on the same server

4. We can use pmcmdrep

5. We can export independent and dependent repository objects

6. We can move mappings into any web application

7. Version control

8. Data profiling

What are the differences between Informatica PowerCenter versions 6.2 and 7.1, and also between versions 6.2 and 5.1?

The main difference between Informatica 5.1 and 6.1 is that 6.1 introduced the Repository Server, and in place of the Server Manager (5.1) it introduced the Workflow Manager and Workflow Monitor.

What's the difference between the Informatica PowerCenter Server, the Repository Server and the repository?

The repository is a database in which all Informatica components are stored in the form of tables. The Repository Server controls the repository and maintains data integrity and consistency across the repository when multiple users use Informatica. The PowerCenter Server (Informatica Server) is responsible for the execution of the components (sessions) stored in the repository.

How do you create the staging area in your database?

A staging area in a DW is used as a temporary space to hold all the records from the source system. So, more or less, it should be an exact replica of the source systems, except for the load strategy, where we use truncate and reload options.

So create it using the same layout as in your source tables, or use the Generate SQL option in the Warehouse Designer tab.

What do the Expression and Filter transformations do in the Informatica Slowly Growing Target wizard?

The Expression transformation detects and flags the new rows from the source.

The Filter transformation filters out the rows that are not flagged and passes the flagged rows to the Update Strategy transformation.

In a filter expression we want to compare one date field with the DB2 system field CURRENT DATE. Our syntax: datefield = CURRENT DATE (we didn't define it by ports; it is a system field), but this is not valid (PMParser: Missing Operator). Can someone help us?

The DB2 date format is "yyyymmdd", whereas SYSDATE in Oracle gives "dd-mm-yy", so conversion of the DB2 date format to the local database date format is compulsory; otherwise you will get that type of error.

Briefly explain the versioning concept in PowerCenter 7.1.


When you create a version of a folder referenced by shortcuts, all shortcuts continue to reference their original object in the original version. They do not automatically update to the current folder version.

For example, if you have a shortcut to a source definition in the Marketing folder, version 1.0.0, then you create a new folder version, 1.5.0, the shortcut continues to point to the source definition in version 1.0.0.

Maintaining versions of shared folders can result in shortcuts pointing to different versions of the folder. Though shortcuts to different versions do not affect the server, they might prove more difficult to maintain. To avoid this, you can recreate shortcuts pointing to earlier versions, but this solution is not practical for much-used objects. Therefore, when possible, do not version folders referenced by shortcuts.

How to join two tables without using the Joiner transformation?

It is possible to join two or more tables by using the Source Qualifier, provided the tables have a relationship.

When you drag and drop the tables you will get a Source Qualifier for each table. Delete all the source qualifiers and add a common Source Qualifier for all of them. Right-click on the Source Qualifier and choose Edit; on the Properties tab you will find the SQL Query attribute, where you can write your SQL.

You can also do it in the session, under Mapping --> Source: there you have an option called User Defined Join where you can write your SQL.
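A sketch of such a user-defined join condition, assuming hypothetical EMP and DEPT tables; the Source Qualifier appends it to the WHERE clause of the generated query:

EMP.DEPTNO = DEPT.DEPTNO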

Identifying bottlenecks in various components of Informatica and resolving them.

The best way to find bottlenecks is to write to a flat file target and see where the bottleneck is.

Can Informatica be used as a cleansing tool? If yes, give examples of transformations that can implement a data cleansing routine.

Yes, we can use Informatica for cleansing data; some of the time we use staging steps to cleanse the data. It depends on performance; otherwise we can use an Expression transformation to cleanse data.

For example, if a field X has some values and others are NULL, and it is assigned to a target field that is a NOT NULL column, inside an Expression we can assign a space or some constant value to avoid session failure.

If the input data is in one format and the target is in another format, we can change the format in an Expression.

We can assign some default values in the target to represent a complete set of data in the target.

How do you decide whether you need to do aggregations at database level or at Informatica level?

It depends on the requirement. If you have a powerful database you can create an aggregation table or view at the database level; otherwise it is better to use Informatica. Here is why you might still use Informatica:

Whatever the case, Informatica is a third-party tool, so it will take more time to process aggregations compared to the database. But Informatica has an option called "incremental aggregation", which updates the stored aggregate values with stored values + new values, so there is no need to process all the values again and again. This holds as long as nobody has deleted the cache files; if that happens, the entire aggregation has to be executed in Informatica as well.

In a database we don't have an incremental aggregation facility.

How do we estimate the depth of the session scheduling queue? Where do we set the maximum number of concurrent sessions that Informatica can run at a given time?

Please be more specific on the first half of the question.

You set the maximum number of concurrent sessions in the Informatica server configuration; by default it is 10, and you can set it to any number.

How do we estimate the number of partitions that a mapping really requires? Is it dependent on the machine configuration?

It depends on the Informatica version we are using: Informatica 6 supports only 32 partitions, whereas Informatica 7 supports 64 partitions.

Suppose a session is configured with a commit interval of 10,000 rows and the source has 50,000 rows. Explain the commit points for source-based commit and target-based commit. Assume appropriate values wherever required.

Source-based commit will commit the data into the target based on the commit interval, so for every 10,000 source rows it commits into the target (at 10,000, 20,000, 30,000, 40,000 and 50,000 rows).

Target-based commit will commit the data into the target based on the buffer size of the target, i.e. it commits the data whenever the buffer fills. Let us assume that the buffer size is 6,000: then it commits the data for every 6,000 rows.

We are using the Update Strategy transformation in a mapping. How can we know whether the insert, update, reject or delete option has been selected during the running of the session in Informatica?

In the Designer, while creating the Update Strategy transformation, uncheck "Forward Rejected Rows". Any rejected rows will then automatically be logged in the session log file.

Updated or inserted rows can be known only by checking the target file or table.

What is the procedure to write the query to list the highest salaries of three employees?

The following is the query to find out the top three salaries

in ORACLE (take the emp table):

select * from emp e where 3 > (select count(*) from emp x where x.sal > e.sal) order by sal desc;

in SQL Server (take the emp table):

select top 3 sal from emp order by sal desc;

Which objects are required by the debugger to create a valid debug session?

Initially the session should be a valid session.
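On databases that support analytic functions, a hedged alternative over the same emp table is:

select sal from (
  select sal, dense_rank() over (order by sal desc) rnk from emp
) where rnk <= 3;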

The source, target, lookups and expressions should be available, and at least one breakpoint should be set for the debugger to debug your session.

What is the limit to the number of sources and targets you can have in a mapping?

As per my knowledge, there is no such restriction on the number of sources or targets used inside a mapping.

The real question is: if you make N tables participate at a time in processing, what is the position of your database? From an organizational point of view it is never encouraged to use N number of tables at a time, as it reduces database and Informatica server performance.

Which is better among connected lookup and unconnected lookup transformations in Informatica or any other ETL tool?

If you have a well-defined source you can use a connected lookup; if the source is not well defined or comes from a different database, you can go for an unconnected lookup.

That is how we use them.

In dimensional modeling, is the fact table normalized or denormalized? In the case of a star schema, and in the case of a snowflake schema?

In dimensional modeling:

Star schema: a single fact table is surrounded by a group of dimension tables comprised of de-normalized data.

Snowflake schema: a single fact table is surrounded by a group of dimension tables comprised of normalized data.

The star schema (sometimes referenced as a star join schema) is the simplest data warehouse schema, consisting of a single fact table with a compound primary key, with one segment for each dimension and with additional columns of additive, numeric facts. The star schema makes multi-dimensional database (MDDB) functionality possible using a traditional relational database. Because relational databases are the most common data management system in organizations today, implementing multi-dimensional views of data using a relational database is very appealing. Even if you are using a specific MDDB solution, its sources are likely relational databases. Another reason for using a star schema is its ease of understanding.

Fact tables in a star schema are mostly in third normal form (3NF), but dimension tables are in de-normalized second normal form (2NF). If you normalize the dimension tables, they look like snowflakes (see snowflake schema) and the same problems of relational databases arise: you need complex queries, and business users cannot easily understand the meaning of the data. Although query performance may be improved by advanced DBMS technology and hardware, highly normalized tables make reporting difficult and applications complex.

The snowflake schema is a more complex data warehouse model than a star schema, and is a type of star schema. It is called a snowflake schema because the diagram of the schema resembles a snowflake. Snowflake schemas normalize dimensions to eliminate redundancy; that is, the dimension data is grouped into multiple tables instead of one large table. For example, a product dimension table in a star schema might be normalized into a products table, a product-category table, and a product-manufacturer table in a snowflake schema. While this saves space, it increases the number of dimension tables and requires more foreign key joins. The result is more complex queries and reduced query performance.

What is the difference between the IIF and DECODE functions?

You can use nested IIF statements to test multiple conditions. The following example tests for various conditions and returns 0 if SALES is zero or negative:

IIF( SALES > 0, IIF( SALES < 50, SALARY1, IIF( SALES < 100, SALARY2, IIF( SALES < 200, SALARY3, BONUS))), 0 )

You can use DECODE instead of IIF in many cases. DECODE may improve readability. The following shows how you can use DECODE instead of IIF:

DECODE( TRUE,
SALES > 0 and SALES < 50, SALARY1,
SALES > 49 AND SALES < 100, SALARY2,
SALES > 99 AND SALES < 200, SALARY3,
SALES > 199, BONUS)

What are variable ports, and list two situations when they can be used?

We have mainly three kinds of ports: input, output and variable. An input port indicates data flowing into the transformation; an output port is used when data is mapped to the next transformation; a variable port is used when mathematical calculations are required, for example to hold an intermediate result reused by several output ports, or to carry a value from the previous row for comparison with the current row.

How does the server recognise the source and target databases?

By using an ODBC connection if it is relational, and an FTP connection if it is a flat file. We can make sure of this with the connections in the properties of the session, for both sources and targets.

How to retrieve the records from a rejected file? Explain with syntax or an example.

During the execution of the workflow, all the rejected rows are stored in bad files (under the Informatica server installation, e.g. C:\Program Files\Informatica PowerCenter 7.1\Server). These bad files can be imported as flat file sources; then, through a direct mapping, we can load these files in the desired format.

How to look up data in multiple tables?

If the two tables are relational, then you can use the SQL lookup override option to join the two tables in the lookup properties. You cannot join a flat file and a relational table.

E.g.: the default lookup query is "select <lookup table columns> from <lookup table>". You can continue this query: add the column names of the second table with a qualifier, and a WHERE clause. If you want to use an ORDER BY, then put -- at the end of the ORDER BY.
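A sketch of such an override, assuming hypothetical PRODUCTS and CATEGORIES tables; the trailing -- comments out the ORDER BY clause that Informatica appends to the override:

SELECT P.PRODUCT_ID, P.PRODUCT_NAME, C.CATEGORY_NAME
FROM PRODUCTS P, CATEGORIES C
WHERE P.CATEGORY_ID = C.CATEGORY_ID
ORDER BY P.PRODUCT_ID --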

What is the procedure to load the fact table.Give in detail?

Based on the requirement to your fact table, choose the sources and data and transform it based onyour business needs. For the fact table, you need a primary key so use a sequence generatortransformation to generate a unique key and pipe it to the target (fact) table with the foreign keysfrom the source tables.

What is the use of incremental aggregation? Explain me in brief with an example.

It is a session option. When the Informatica server performs incremental aggregation, it passes only the new source data through the mapping and uses historical cache data to update the aggregate calculations incrementally. For example, if yesterday's run aggregated sales per customer, today's run reads only today's rows and adds them to the cached totals instead of re-aggregating the entire history. We use it for performance.

How do you delete duplicate rows in a flat file source? Is there any option in Informatica?

Use a Sorter transformation; it has a "distinct" option, so make use of it.

How do you use mapping parameters, and what is their use?

In the Designer you will find the mapping parameters and variables options, and you can assign values to them there. As for their use: suppose you are doing incremental extraction daily and your source system contains a day column. Without parameters, every day you would have to open the mapping and change the day value so that the right data is extracted, which is a layman's work. That is where mapping parameters and variables come in: once you assign a value to a mapping variable, it can change between sessions.
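A minimal parameter-file sketch (the folder, workflow, session, and parameter names are all illustrative):

[MyFolder.WF:wf_daily_extract.ST:s_m_daily_extract]
$$EXTRACT_DAY=2010-06-15

The Source Qualifier filter can then reference the parameter, for example DAY_COLUMN = TO_DATE('$$EXTRACT_DAY', 'YYYY-MM-DD'), and only the parameter file needs editing between runs.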

In the concept of mapping parameters and variables, the variable value is saved to the repository after the session completes, and the next time you run the session the server takes the saved value from the repository and continues from it. For example, say a run ended with a value of 50 stored in the repository, and the next run should start with the value 70 rather than 51. How can this be done?

You can do one thing after running the mapping: in the Workflow Manager (Start --> Session), right-click on the session and you will get a menu; in it, go for persistent values. There you will find the last value stored in the repository for the mapping variable. Remove it and put in your desired value, then run the session. That should do the task.

Can anyone comment on the significance of Oracle 9i in Informatica compared to Oracle 8 or 8i? I mean, how is Oracle 9i advantageous over Oracle 8 or 8i when used with Informatica?

It's fairly simple: Oracle 9i added several things 8i lacked. For example, list partitioning is available only from 9i onwards, and 9i extended support for user-defined data types and improved LOB (BLOB/CLOB) handling.

Can we use an Aggregator (or another active transformation) after an Update Strategy transformation?


You can use an Aggregator after an Update Strategy. The problem is that once you perform the update strategy, say you have flagged some rows for deletion, and you then perform an Aggregator transformation over all rows using, for example, the SUM function, the rows flagged for deletion will be subtracted in that aggregation.

Why are dimension tables denormalized in nature?

Because in data warehousing, historical data must be maintained. Maintaining history means, for example, keeping an employee's previous employer details as well as the current ones in the same table; if the natural primary key (employee id) were enforced, duplicate records with the same employee id would not be allowed. So to maintain historical data we use surrogate keys (for example, an Oracle sequence on the key column), which let us keep multiple versions of the same business entity.

Since all the dimensions maintain historical data in this way, they are de-normalized: there are near-duplicate entries, meaning not exactly duplicate records, but additional records maintained in the table for the same employee number.

In a sequential batch, how can we stop a single session?

We can stop it using the pmcmd command, or in the Workflow Monitor by right-clicking on that particular session and selecting Stop. This stops the current session and the sessions that follow it in the batch.
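A sketch of the pmcmd form (service, domain, folder, workflow, and task names are placeholders, and the exact flag spellings vary between PowerCenter versions):

pmcmd stoptask -sv IntSvc_dev -d Domain_dev -u admin -p secret -f MyFolder -w wf_nightly_load s_load_customers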

How do you handle decimal places while importing a flat file into Informatica?

While importing the flat file definition, just specify the scale for the numeric data type. In the mapping, the flat file source supports only the number datatype (there is no separate decimal or integer). The Source Qualifier associated with that source will have a decimal datatype for that number port:

source -> number datatype port -> Source Qualifier -> decimal datatype

Integer is not supported directly; the decimal datatype takes care of it.

If your workflow is running slow in Informatica, where do you start troubleshooting, and what steps do you follow?

When the workflow is running slowly, you have to find the bottlenecks, checking in this order:

1. target
2. source
3. mapping
4. session
5. system

If you have four lookup tables in the workflow, how do you troubleshoot to improve performance?

There are many ways to improve a mapping that has multiple lookups:

1. Create an index on the lookup table, if we have the permissions (staging area).

2. Divide the mapping in two: (a) dedicate one to inserts (source minus target, i.e. the new rows), so only new rows come into the mapping and the process is fast; (b) dedicate the second to updates (source intersect target, i.e. the rows that already exist).

3. Increase the cache size of the lookups.

Can anyone explain error handling in Informatica with examples, so that it will be easy to explain the same in an interview?


Go to the session log file. There you will find information regarding the session initiation process, the errors encountered, and the load summary. By examining the errors encountered during the session run, you can resolve them.

There is also the bad file, which generally has the format *.bad and contains the records rejected by the Informatica server. It carries two kinds of indicators: one for the row and one for each column. The row indicator signifies what operation was being performed (insertion, deletion, updating, etc.). The column indicators contain information regarding why the column was rejected (violation of a not-null constraint, value error, overflow, etc.). If one rectifies the errors in the data present in the bad file and then reloads the data into the target, the table will contain only valid data.
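An illustrative reject-file line (the data values are made up; treat the layout, with the row indicator first and a column indicator before each column value, as the usual one):

0,D,1921,D,Nelson,D,04/12/98

Row indicators: 0 = insert, 1 = update, 2 = delete, 3 = reject. Column indicators: D = valid data, O = overflow, N = null, T = truncated.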

How do I import VSAM files from source to target? Do I need a special plug-in?

To the best of my knowledge, you use the PowerExchange tool to read the VSAM file (or convert it to Oracle tables), and then do the mapping to the target table as usual.

Differences between normalization and the Normalizer transformation.

Normalizer: a transformation mainly used for COBOL sources; it changes rows into columns and columns into rows.

Normalization: a database design process used to remove redundancy and inconsistency.

What is an IQD file?

IQD stands for Impromptu Query Definition. This file is mainly used with the Cognos Impromptu tool: after creating an IMR (report), we save the IMR as an IQD file, which is then used while creating a cube in PowerPlay Transformer (selecting Impromptu Query Definition as the data source type).

What is data merging, data cleansing, sampling?

Merging: combining data from multiple sources or files into a single, consistent target.

Cleansing: identifying and removing redundancy and inconsistency in the data.

Sampling: sending only a sample of the data from source to target rather than the full set.

Could anyone please tell me the steps required for a type 2 dimension/versioned data mapping, and how we can implement it?

1. Determine whether the incoming row is (1) a new record, (2) an updated record, or (3) a record that already exists unchanged in the table, using two Lookup transformations. Split the mapping into three separate flows using a Router transformation, with conditions like the sketch after this answer.

2. For case (1), create a pipeline that inserts the row into the table.

3. For case (2), create two pipelines from the same source: one updating (expiring) the old record, one inserting the new version.

Hope this makes sense.
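As a hedged sketch, the Router group conditions might look like the following (the port names are hypothetical; lkp_* ports come from the lookup on the dimension, in_* from the source):

-- new record: the business key was not found in the dimension
ISNULL(lkp_CUSTOMER_KEY)

-- changed record: the key was found but a tracked attribute differs
NOT ISNULL(lkp_CUSTOMER_KEY) AND lkp_ADDRESS != in_ADDRESS

-- unchanged record: found and identical, so it can be filtered out
NOT ISNULL(lkp_CUSTOMER_KEY) AND lkp_ADDRESS = in_ADDRESS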

Without using an Update Strategy transformation, how can we do updates to our target table?

In the session properties there is the "Treat source rows as" option, and for each target there are insert/update settings such as:

insert
update as update
update as insert
update else insert

By choosing the right combination (for example, treat source rows as update together with "update else insert" on the target) you can easily update the target without an Update Strategy transformation.

Two relational tables are connected to a Source Qualifier transformation. What are the possible errors that can be thrown?

The only two requirements, as far as I know, are:

1. Both tables should have a primary key / foreign key relationship.
2. Both tables should be available in the same schema or the same database.

What is the best way to show metadata (number of rows at source, target, and each transformation level, plus error-related data) in a report format?

You can select these details from the repository tables; for example, the view REP_SESS_LOG gives these data.
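A sketch of such a query; the exact columns of the view vary by PowerCenter version, so treat the column names here as assumptions to verify against your repository:

SELECT SUBJECT_AREA, SESSION_NAME,
       SUCCESSFUL_ROWS, FAILED_ROWS,
       FIRST_ERROR_CODE, FIRST_ERROR_MSG
FROM   REP_SESS_LOG
ORDER  BY SESSION_TIMESTAMP DESC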

If you had to split the source-level key going into two separate tables, one as a surrogate key and the other as a primary key, and Informatica does not guarantee that the keys are loaded in the proper order into those tables, what are the different ways you could handle this situation?

Define a primary key / foreign key relationship between the two targets and use the constraint-based load ordering session option, so the parent (primary key) table is loaded before the child (surrogate/foreign key) table.

How do you append records to a flat file in Informatica? In DataStage we have the options to (i) overwrite the existing file or (ii) append to the existing file.

This is not there in Informatica 7, but it is included in version 8.0, where you can append to a flat file (at the time of this answer, 8.0 was about to ship).

What are partition points?

Partition points mark the thread boundaries in a source pipeline and divide the pipeline into stages.

What are cost-based and rule-based approaches, and what is the difference?

Cost-based and rule-based approaches are optimization techniques used with databases, where we need to optimize a SQL query.

Basically, Oracle provides two optimizers (indeed three, but we use only these two techniques, because the third has some disadvantages). Whenever you process a SQL query in Oracle, what the Oracle engine internally does is read the query and decide the best possible way of executing it. In this process, Oracle follows one of these optimization techniques:

1. Cost-based optimizer (CBO): if a SQL query can be executed in two different ways (say path 1 and path 2 for the same query), the CBO calculates the cost of each path, analyses which path has the lower execution cost, and executes that path, thereby optimizing the query execution.

2. Rule-based optimizer (RBO): this follows a fixed set of ranked rules for executing a query. Depending on the rules that apply, the optimizer runs the query.

Use:

If the table you are querying has already been analysed, Oracle goes with the CBO. If the table is not analysed, Oracle follows the RBO.


For the first time, if the table is not analysed, Oracle will go with a full table scan.
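For illustration (the table and column names are hypothetical): statistics can be gathered so the CBO has costs to work with, and a hint can force rule-based behaviour for a single statement:

-- gather statistics so the cost-based optimizer can price access paths
ANALYZE TABLE sales COMPUTE STATISTICS;

-- force the rule-based optimizer for one query via a hint
SELECT /*+ RULE */ * FROM sales WHERE region_id = 10;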

What is a mystery dimension?

A mystery dimension (often called a junk dimension) is used to maintain miscellaneous data in your project: typically low-cardinality flags and indicators that do not belong in any other dimension.

What is MicroStrategy? What is it used for? Can anyone explain it in detail?

MicroStrategy is another BI tool, a HOLAP one: you can create two-dimensional reports and also cubes in it. It is basically a reporting tool, and it has a full range of reporting on the web as well as in Windows.

Can I start and stop a single session in a concurrent batch?

Yes, sure: just right-click on the particular session and go to the recovery option, or use the Event-Wait and Event-Raise tasks.

I want to prepare a questionnaire. The details are as follows: 1. Identify a large company/organization that is a prime candidate for a DWH project (for example, a telecommunication company, an insurance company, or a bank may be a prime candidate). 2. Give at least four reasons for selecting the organization. 3. Prepare a questionnaire consisting of at least 15 non-trivial questions to collect requirements/information about the organization; this information is required to build the data warehouse. Can you please tell me what those 15 questions to ask a company, say a telecom company, should be?

First of all, meet your sponsors and produce a BRD (business requirements document) about their expectations from this data warehouse (the main aim comes from them); for example, they need the customer billing process. Then go to the business management team, who may ask for metrics out of the billing process for their use. Management people may want monthly usage, billing metrics, sales organization, and rate plan data to perform sales-rep, channel-performance, and rate-plan analysis.

So your dimension tables can be: Customer (customer id, name, city, state, etc.), Sales rep (sales rep number, name, id), Sales org (sales org id), Bill (bill #, bill date, number), Rate plan (rate plan code). And the fact table can be: Billing details (bill #, customer id, minutes used, call details, etc.). You can follow a star or snowflake schema in this case, depending on the granularity of your data.

What is the difference between a cached lookup and an uncached lookup? Can I run a mapping without starting the Informatica server?

The difference between a cached and an uncached lookup is this: when you configure the Lookup transformation as cached, it stores all the lookup table data in a cache when the first input record enters the transformation, so the lookup SELECT statement executes only once and the values of the input records are compared with the values in the cache. In an uncached lookup, the SELECT statement executes for each input record entering the transformation, and it has to connect to the database each time a new record enters.

And no, a mapping cannot run without the Informatica server started; the server is what executes sessions.

What is the difference between stop and abort?

Stop: if the session you want to stop is part of a batch, you must stop the batch; if the batch is part of a nested batch, stop the outermost batch.

Abort: you can issue the abort command; it is similar to the stop command except that it has a 60-second timeout. If the server cannot finish processing and committing data within 60 seconds, it kills the session outright.

Here's the difference in more depth. ABORT is equivalent to:

1. kill -9 on Unix (NOT kill -7, but yes, kill -9)
2. SIGTERM ABEND (force ABEND) on a mainframe


3. A Windows "force quit" on the application.

What does this do? Each session uses shared/locked memory blocks (semaphores). The ABORT function kills just the code threads, leaving the memory locked, shared, and allocated. The good news: the AIX operating system appears to clean up these lost memory blocks. The bad news: most other operating systems do not clear the memory, leaving it "taken" from the system. The only way to clear this memory is to warm-boot/cold-boot (restart) the Informatica server machine; yes, the entire box must be restarted to get the memory back.

If you find your box running slower and slower over time, or not having enough memory to allocate new sessions, then I suggest that ABORT not be used.

So then the question is: When I ask for a STOP, it takes forever. How do I get the session to stop fast?

Well, first things first. STOP is a REQUEST to stop. It fires a request (equivalent to a Ctrl-C in SQL*Plus) to the source database and waits for the source database to clean up. The bigger the data in the source query, the more time it takes to "roll back" the source query to maintain transaction consistency in the source database (i.e. a join of huge tables, a big GROUP BY, a big ORDER BY).

It then cleans up the buffers in memory by releasing the data (without writing to the target) but it WILL run the data all the way through to the target buffers, never sending it to the target DB. The bigger the session memory allocations, the longer it takes to clean up.

Then it fires a request to stop against the target DB, and waits for the target to roll-back. The higher the commit point, the more data the target DB has to "roll-back".

FINALLY, it shuts the session down.

WHAT IF I NEED THE SESSION STOPPED NOW? Pick up the phone, call the source system DBA, and have them KILL the source query IN THE DATABASE. This sends an EOF (end of file) downstream to Informatica, and Informatica will take less time to stop the session.

If you use abort, be aware, you are choosing to "LOSE" memory on the server in which Informatica is running (except AIX).

If you use ABORT and you then re-start the session, chances are, not only have you lost memory - but now you have TWO competing queries on the source system after the same data, and you've locked out any hope of performance in the source database. You're competing for resources with a defunct query that's STILL rolling back.

Can we run a group of sessions without using the Workflow Manager?

Yes, it is possible: using the pmcmd command you can run a group of sessions (a workflow) without opening the Workflow Manager. That is the answer as per my knowledge.
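A sketch of the command to start a whole workflow of sessions (service, domain, folder, and workflow names are placeholders, and the exact flag spellings vary between PowerCenter versions):

pmcmd startworkflow -sv IntSvc_dev -d Domain_dev -u admin -p secret -f MyFolder wf_nightly_load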

If a session fails after loading 10,000 records into the target, how can you load the records from the 10,001st record onward when you run the session the next time, in Informatica 6.1?

Running the session in recovery mode will work, but the target load type should be Normal. If it is Bulk, recovery will not work as expected, because bulk loading bypasses the database logging that recovery relies on.

Explain the use of the Update Strategy transformation.

To flag source records as INSERT, DELETE, UPDATE, or REJECT for the target database. The default flag is insert. It is a must for incremental data loading.

What are mapping parameters and variables, and in which situations can we use them?

If we need to change certain attributes of a mapping every time the session runs, it would be very tedious to edit the mapping and change the attribute each time. So we use mapping parameters and variables and define their values in a parameter file; then we only need to edit the parameter file to change the attribute values, which keeps the process simple.

A mapping parameter's value remains constant throughout the run; if we need to change it, we edit the parameter file.

The value of a mapping variable, however, can be changed during the run using variable functions. If we need to increment an attribute value by 1 after every session run, we use a mapping variable; with a mapping parameter we would have to manually edit the attribute value in the parameter file after every session run.
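As a hedged sketch of a variable function (the variable and port names are hypothetical): in an Expression transformation you might evaluate, for every row,

SETMAXVARIABLE($$LAST_EXTRACT_TS, UPDATE_TS)

After a successful run, the server saves the highest UPDATE_TS seen back to the repository, and the next run's Source Qualifier filter (for example UPDATE_TS > '$$LAST_EXTRACT_TS') picks up only the newer rows.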

What is a worklet, what is the use of a worklet, and in which situations can we use it?

A set of workflow tasks is called a worklet. Workflow tasks include:

1) Timer 2) Decision 3) Command 4) Event-Wait 5) Event-Raise 6) Email, etc.

A worklet groups such tasks into a reusable unit that can be used in different situations and workflows.

What is the difference between a dimension table and a fact table, and what are the different types of dimension and fact tables?

A fact table contains measurable data; it has fewer columns and many rows, and its primary key is typically a composite of the dimension foreign keys.

Different types of fact tables: additive, non-additive, semi-additive.

A dimension table contains the textual description of the data; it has many columns and fewer rows, and it contains a primary key (usually a surrogate key).

How do you configure a mapping in Informatica?

You should configure the mapping with the least number of transformations and expressions to do the most amount of work possible. You should minimize the amount of data moved by deleting unnecessary links between transformations. For transformations that use data cache (such as Aggregator, Joiner, Rank, and Lookup transformations), limit connected input/output or output ports. Limiting the number of connected input/output or output ports reduces the amount of data the transformations store in the data cache.

You can also perform the following tasks to optimize the mapping:

Configure single-pass reading. Optimize datatype conversions. Eliminate transformation errors. Optimize transformations. Optimize expressions.


What logic will you implement to load the data into one fact table from 'n' dimension tables?

Normally everyone uses:

1) slowly changing dimensions
2) slowly growing dimensions

In either case, the fact load looks up each of the n dimensions by its business key to fetch the surrogate keys, which are loaded into the fact table along with the measures.