Data Stage Director Guide


  • 8/6/2019 Data Stage Director Guide

    Ascential DataStage

    Director Guide

    Version 7.1
    January 2004
    Part No. 00D-007DS71

    Published by Ascential Software Corporation.

    © 2004 Ascential Software Corporation. All rights reserved. Ascential, DataStage, QualityStage, AuditStage, ProfileStage, and MetaStage are trademarks of Ascential Software Corporation or its affiliates and may be registered in the United States or other jurisdictions. Windows is a trademark of Microsoft Corporation. Unix is a registered trademark of The Open Group. Adobe and Acrobat are registered trademarks of Adobe Systems Incorporated. Other marks are the property of the owners of those marks.

    This product may contain or utilize third party components subject to the user documentation previously provided by Ascential Software Corporation or contained herein.

    Documentation Team: Mandy deBelin

    Table of Contents

    Preface
        Organization of This Manual
        Documentation Conventions
        DataStage Documentation

    Chapter 1. Introducing DataStage
        What Is a Data Warehouse?
        Why Do I Need One?
        What Does DataStage Do?
        How Is DataStage Packaged?
        DataStage Projects and Jobs

    Chapter 2. The DataStage Director
        Starting the DataStage Director
        The DataStage Director Window
            Job Category Pane
            Display Area
            Menu Bar
            Toolbar
            Status Bar
        Job Status View
            Job States
            Job Status Details
        Shortcut Menus
            Shortcut Menus in the Job Status View
            Shortcut Menus in the Job Log View
            Shortcut Menus in the Job Schedule View
            Shortcut Menus in the Job Category Pane
            Shortcut Menu in the Monitor Window

        Filtering the Job Status or Job Schedule View
            Examples of Filtering by Job Name
        Finding Text
        Sorting Columns
        Printing the Current View
            What Is in the Printout?
            Changing the Printer Setup
        Director Options
            General Page
            Limits Page
            View Page
            Priority Page
        Choosing an Alternative Project
            Viewing Jobs in Another Project
            Viewing Jobs on a Different Server
        Exiting the DataStage Director

    Chapter 3. Running DataStage Jobs
        Setting Job Options
        Validating a Job
        Running a Job
        Stopping a Job
        Resetting a Job
        Setting Default Job Parameters
        Job Scheduling
            Job Schedule View
            Viewing Details of a Job Schedule
            Scheduling a Job
            Unscheduling a Job
            Rescheduling a Job
        Deleting a Job
        Job Administration
            Cleaning Up Job Resources
            Clearing a Job Status File
        Multiple Job Invocations

        Setting Tracing Options

    Chapter 4. Job Batches
        What Is a Job Batch?
        Creating a Job Batch
        Running a Job Batch
        Scheduling a Job Batch
        Unscheduling a Job Batch
        Rescheduling a Job Batch
        Job Schedule Errors
        Editing a Job Batch
        Copying a Job Batch
        Deleting a Job Batch

    Chapter 5. Monitoring Jobs
        The Monitor Window
            Monitor Shortcut Menu
            Setting the Server Update Interval
        Switching Between Monitor Windows
        The Stage Status Window

    Chapter 6. The Job Log File
        Job Log View
        The Event Detail Window
        Viewing Related Logs
        Filtering the Job Log View
        Purging Log File Entries
            Purging Log Entries Immediately
            Purging Log Entries Automatically

    Glossary
    Index

    Preface

    Ascential DataStage is a powerful software suite that is used to develop and run DataStage jobs. A DataStage job can extract data from different sources, and then cleanse, integrate, and transform the data according to your requirements. The clean data is ready to be imported into a data warehouse for analysis and processing by business information software.

    This manual describes the DataStage Director, the DataStage component that is used to validate, schedule, run, and monitor DataStage server jobs and parallel jobs. For information about how to perform these tasks for DataStage mainframe jobs, refer to the documentation supplied with the mainframe computer. (For a brief explanation of server, parallel, and mainframe jobs, refer to "DataStage Projects and Jobs" on page 1-3.)

    To use this manual you should be familiar with the Windows 2000 or Windows NT interface, but no other special skills or knowledge are required.

    Organization of This Manual

    This manual contains the following:

    Chapter 1 contains an overview of DataStage and its component parts.

    Chapter 2 describes the DataStage Director and how to use it.

    Chapter 3 covers how to validate, run, delete, schedule, and administer DataStage server jobs.

    Chapter 4 describes job batches.

    Chapter 5 describes how to monitor a running server job.

    Chapter 6 describes the job log file and the Job Log view.

    The Glossary defines terms that have specific meaning in DataStage.

    Documentation Conventions

    This manual uses the following conventions:

    Convention      Usage

    Bold            In syntax, bold indicates commands, function names, keywords, and options that must be input exactly as shown. In text, bold indicates keys to press, function names, and menu selections.

    Italic          In syntax, italic indicates information that you supply. In text, italic also indicates UNIX commands and options, file names, and pathnames.

    Plain           In text, plain indicates Windows NT commands and options, file names, and pathnames.

    Courier         Courier indicates examples of source code and system output.

    Courier Bold    In examples, courier bold indicates characters that the user types or keys the user presses (for example, ).

    ➤               A right arrow between menu commands indicates you should choose each command in sequence. For example, "Choose File ➤ Exit" means you should choose File from the menu bar, then choose Exit from the File pull-down menu.

    DataStage Documentation

    DataStage documentation includes the following:

    DataStage Director Guide: This guide describes the DataStage Director and how to validate, schedule, run, and monitor DataStage server jobs.

    DataStage Manager Guide: This guide describes the DataStage Manager and describes how to use and maintain the DataStage Repository.

    DataStage Designer Guide: This guide describes the DataStage Designer, and gives a general description of how to create, design, and develop a DataStage application.

    DataStage Server: Server Job Developer Guide: This guide describes the tools that are used in building a server job, and it supplies programmer's reference information.

    DataStage Enterprise Edition: Parallel Job Developer Guide: This guide describes the tools that are used in building a parallel job, and it supplies programmer's reference information.

    DataStage Enterprise MVS Edition: Mainframe Job Developer Guide: This guide describes the tools that are used in building a mainframe job, and it supplies programmer's reference information.

    DataStage Administrator Guide: This guide describes DataStage setup, routine housekeeping, and administration.

    DataStage Install and Upgrade Guide: This guide contains instructions for installing DataStage on Windows and UNIX platforms, and for upgrading existing installations of DataStage.

    DataStage NLS Guide: This guide contains information about using the NLS features that are available in DataStage when NLS is installed.

    These guides are also available online in PDF format. You can read them using the Adobe Acrobat Reader supplied with DataStage. See DataStage Install and Upgrade Guide for details on installing the manuals and the Adobe Acrobat Reader.

    You can use the Acrobat search facilities to search the whole DataStage document set. To use this feature, select Edit ➤ Search, then choose the "All PDF documents in" option and specify the DataStage docs directory (by default this is C:\Program Files\Ascential\DataStage\Docs).

    Extensive online help is also supplied. This is particularly useful when you have become familiar with DataStage, and need to look up specific information.

    Chapter 1. Introducing DataStage

    Many organizations want to make better use of their data. But that data may be stored in different formats in different types of database. Some data sources may be dormant archives, others may be busy operational databases. Extracting and cleaning data from these varied sources has always been time-consuming and costly, until now. DataStage makes it simple to design and develop efficient applications that make data warehousing a reality where it was impossible before.

    What Is a Data Warehouse?

    A data warehouse is a central database containing copies of data from all the operational sources and archive systems in an organization. But the database does not have to be large. Instead of storing details of every transaction, order, or set of sales figures, the data warehouse stores totals, averages, area figures, and so on. This data is structured to make it easy to query and to generate reports.

    Inside the data warehouse you can perform analyses that would be impractical on a working database. This means that anyone who needs access to the data gets all the information they want, and only the information they want. The data warehouse can be created or updated at any time, with minimum disruption to working systems.

    Why Do I Need One?

    Working databases are busy. It is hard to gain an accurate picture of the contents of the database at any time because it changes frequently. By transferring working data into a data warehouse, you can take snapshots of what is going on. Also, working databases contain dirty data: records in different formats, with key values missing or out of range, and so on. In a fast-moving working database it is difficult to trap mistakes or incomplete entries. Using DataStage, you can cleanse data before loading it into a data warehouse, ensuring that your business decisions are based only on valid information.

    As well as a working database, you may have archive systems or incompatible data sources that you have inherited. These may be static, but inaccessible because their format is different from your working system. You can use DataStage to transform this data into compatible formats that can be stored in the data warehouse.

    What Does DataStage Do?

    DataStage comes in between your data and your data warehouse. DataStage jobs process the data to meet your needs, including:

    Extraction. DataStage takes data from indexed files, sequential files, networked databases, archives, and external data sources and stores it in the data warehouse.

    Aggregation. DataStage takes your working data, calculates totals and averages, then stores it in the data warehouse. This means you have to store much less data, which is quicker and easier to access.

    Transformation. DataStage converts inconsistent data into the required format and loads it into the data warehouse.
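
    The kinds of processing listed above can be illustrated outside DataStage with a small Python sketch. This is not DataStage code and does not use any DataStage interface; the record layout, the field names (region, amount), and the function names are assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical working data; the field names are invented for this sketch.
RAW_ROWS = [
    {"region": "north", "amount": "120.50"},
    {"region": "North ", "amount": "79.50"},
    {"region": "south", "amount": "200"},
]


def transform(row):
    """Convert an inconsistent record into one required format."""
    return {
        "region": row["region"].strip().lower(),
        "amount": float(row["amount"]),
    }


def aggregate(rows):
    """Reduce detail rows to a total and an average per region."""
    amounts = defaultdict(list)
    for row in rows:
        amounts[row["region"]].append(row["amount"])
    return {
        region: {"total": sum(values), "average": sum(values) / len(values)}
        for region, values in amounts.items()
    }


# "Extraction" in this sketch is simply reading the in-memory list above.
summary = aggregate(transform(row) for row in RAW_ROWS)
```

    Three detail rows are reduced to one summary entry per region, which is the sense in which an aggregated warehouse stores much less data than the working database.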

    How Is DataStage Packaged?

    DataStage is client/server software. The server holds the data while it is being processed. The client is the interface to DataStage that is used for designing and running jobs, or managing the data in the Repository. The client components include:

    DataStage Designer, for creating DataStage server and mainframe jobs. Server jobs are compiled into executable programs that are scheduled by the DataStage Director and run by the DataStage Server. Mainframe jobs are downloaded from the Designer to mainframe computers, where they are compiled and run by mainframe tools.

    DataStage Director, for running DataStage server jobs.

    DataStage Manager, for viewing and editing the contents of the Repository.

    DataStage Administrator, for administering DataStage projects and conducting housekeeping on the server.

    The client and server components installed depend on the edition of DataStage you have purchased. DataStage is packaged in two ways:

    Developer's Edition, used by developers to design, develop, and create executable DataStage jobs. The Developer's Edition contains all the client components described earlier and the server. The developer's role, DataStage Designer, and DataStage Manager are described in DataStage Designer Guide and DataStage Manager Guide.

    Operator's Edition, used by operators to validate, schedule, run, and monitor DataStage jobs that have been developed elsewhere. The Operator's Edition contains DataStage Director, DataStage Administrator, and DataStage Server components only.

    DataStage Projects and Jobs

    You always enter DataStage through a DataStage project. When you start a DataStage client you are prompted to attach to a project. Each project contains DataStage jobs and the components required to develop or run them.

    DataStage jobs are made up of individual stages. A stage represents a data source or a process. For example, one stage may extract data from a data source, while another transforms it. The data required at each stage and how it is handled is specified in the job design. When the job is run, the processing described in the job design is performed. Variable parameters such as file names, dates, and so on, can be specified when the job is run. DataStage jobs can be exported for use on other DataStage systems.

    DataStage supports three types of job:

    Server jobs are both developed and compiled using DataStage client tools. Compilation of a server job creates an executable that is scheduled and run from the DataStage Director.

    Parallel jobs. These are compiled and run on the DataStage server in a similar way to server jobs, but support parallel processing on SMP, MPP, and cluster systems.

    Mainframe jobs are developed using the same DataStage client tools as for server jobs, but compilation and execution occur on a mainframe computer. The DataStage Designer generates a COBOL source file and supporting JCL script, then lets you upload them to the target mainframe computer. The job is compiled and run on the mainframe computer under the control of native mainframe software.

    There are also:

    Job Sequences. A job sequence allows you to specify a sequence of DataStage jobs to be executed, and actions to take depending on results.

    Chapter 2. The DataStage Director

    The DataStage Director is the client component that validates, runs, schedules, and monitors jobs run by the DataStage Server. It is the starting point for most of the tasks a DataStage operator needs to do in respect of DataStage jobs.

    Note: DataStage mainframe jobs run on a mainframe computer, and use mainframe-specific tools. These jobs are not visible in the DataStage Director. In this manual the term "job" therefore refers to DataStage server and parallel jobs only. For information about running DataStage mainframe jobs, consult the documentation supplied with your mainframe software.

    This chapter describes the interface to the DataStage Director and how to use it, including:

    Starting the DataStage Director

    Using the DataStage Director window

    Finding text in the DataStage Director window, sorting data, and printing out the display

    Setting options and defaults for the DataStage Director window and for jobs you want to run

    Switching between projects and exiting the DataStage Director

    Starting the DataStage Director

    To start the DataStage Director:

    1. Choose Start ➤ Programs ➤ Ascential DataStage ➤ DataStage Director, or choose the appropriate program folder if you installed DataStage elsewhere. The Attach to Project dialog box appears.

    2. Enter the name of your host in the Host system field. This is the name of the system where the DataStage Server is installed.

    3. Enter your user name in the User name field. This is your user name on the server system.

    4. Enter your password in the Password field. If you are connecting to a Windows NT server via LAN Manager, you can select the Omit check box. The User name and Password fields gray out and you log on to the server using your current Windows account details.

    CAUTION: Think carefully before using the Omit option to log on to DataStage. If you use this option, note that:

        You cannot specify UNC pathnames in DataStage clients.

        You must specify the host name in uppercase.

        You cannot access remote files.

        You may get errors if you import meta data from an ODBC data source if you are not logged on to the same domain as the server.

    5. Enter the name of the project you want to use or choose one from the Project list, which displays all the projects installed on the server.

    6. Click OK. The DataStage Director window appears.

    Note: You can also start the DataStage Director from the DataStage Designer or the DataStage Manager by choosing Tools ➤ Run Director. You are automatically attached to the same project and you do not see the Attach to Project dialog box. For more information about the DataStage Designer and Manager, see DataStage Designer Guide and DataStage Manager Guide.
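
    The rules that govern the Attach to Project fields, including the uppercase host name required with the Omit option, can be summarized in a short Python sketch. It is a hypothetical model for checking your own notes or scripts, not part of DataStage: the AttachRequest class, its field names, and the check function are all assumptions, and treating the user name as mandatory when Omit is cleared is this sketch's reading of steps 3 and 4.

```python
from dataclasses import dataclass


@dataclass
class AttachRequest:
    """Hypothetical model of the fields in the Attach to Project dialog box."""
    host: str
    project: str
    user: str = ""
    password: str = ""
    omit: bool = False  # mirrors the Omit check box


def check(request):
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    if not request.host:
        problems.append("host system is required")
    if not request.project:
        problems.append("project is required")
    if request.omit:
        # With Omit selected, the host name must be given in uppercase.
        if request.host != request.host.upper():
            problems.append("host name must be uppercase when Omit is selected")
    elif not request.user:
        # Assumption of this sketch: a user name is needed without Omit.
        problems.append("user name is required unless Omit is selected")
    return problems
```

    For example, check(AttachRequest(host="dsserver", project="sales", omit=True)) reports the uppercase rule, while the same request with host="DSSERVER" reports nothing. The host and project names here are invented.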

    The DataStage Director Window

    The DataStage Director window appears when you start the Director.

    This section describes the features of the DataStage Director window including:

    The job category pane

    The display area

    The menu bar

    The toolbar

    The status bar

    Job Category Pane

    The left pane of the DataStage Director window is the job category pane. It displays the job category tree, which lists job categories and subcategories that contain server jobs. The jobs in the currently selected category are listed in the display area. You can hide the job category pane by choosing View ➤ Show Categories.

    Display Area

    The display area is the main part of the DataStage Director window. There are three views:

    Job Status. The default view, which appears in the right pane of the DataStage Director window. It displays the status of all jobs in the category currently selected in the job category tree. If you hide the job category pane, the Job Status view includes a Category column, and displays the status of all server jobs in the current project, regardless of their category. See "Job Status View" on page 2-6 for more information.

    Job Schedule. Displays a summary of scheduled jobs and batches in the currently selected job category. If the job category pane is hidden, the display area shows all scheduled jobs and batches, regardless of their category. See Chapter 3, "Running DataStage Jobs," for a description of this view. To switch to the Job Schedule view, choose View ➤ Schedule, or click the Schedule button on the toolbar.

    Job Log. Displays the log file for a job chosen from the Job Status view or the Job Schedule view. The job category pane is always hidden. See Chapter 6, "The Job Log File," for more details. To switch to this view, choose View ➤ Log, or click the Log button on the toolbar.

    Updating the Display

    You can set how often the display area is updated from the server by editing the Director options. For more information, see "Director Options" on page 2-19.

    You can also update the screen immediately by doing one of the following:

    Choose View ➤ Refresh.

    Choose Refresh from the shortcut menus (see "Shortcut Menus" on page 2-9 for more details).

    Press Ctrl-R.

    Each entry in the display area represents a job, scheduled job, or event in the job log, depending on the current view. An icon is displayed by default for each entry. To hide the icons, choose Tools ➤ Options ➤ View.

    Note: You can increase the refresh rate by organizing jobs within job categories, so that you do not display more jobs than necessary in the display area. For information about how to organize jobs within categories, refer to DataStage Designer Guide.

    Menu Bar

    The menu bar has six pull-down menus that give access to all the functions of the Director:

    Project. Opens an alternative project and sets up printing.

    View. Displays or hides the toolbar, status bar, buttons, or job category pane, specifies the sorting order, changes the view, filters entries, shows further details of entries, and refreshes the screen.

    Search. Starts a text search dialog box.

    Job. Validates, runs, schedules, stops, and resets jobs, purges old entries from the job log file, deletes unwanted jobs, cleans up job resources (if the administrator has enabled this option), and allows you to set default job parameter values.

    Tools. Monitors running jobs, manages job batches, and starts the DataStage Designer and DataStage Manager. It also starts MetaStage Explorer and Quality Manager, if these components are installed on the system, and custom software. If you are running parallel jobs on a UNIX server, it also allows you to manage data sets (see Parallel Job Developer Guide for details).

    Help. Invokes the Help system. You can also get help from any screen or dialog box in the DataStage Director.

    Toolbar

    The toolbar gives quick access to the main functions of the DataStage Director.

    [Figure: the DataStage Director toolbar. Button labels: Open Project, Print View, Job Status, Job Schedule, Job Log, Find, Sort - Ascending, Sort - Descending, Run a Job, Stop a Job, Reset a Job, Schedule a Job, Reschedule a Job, Help.]

    The toolbar is displayed by default, but can be hidden by choosing View ➤ Toolbar or by changing the Director options. See "Director Options" on page 2-19 for more details. To display ToolTips, let the cursor rest on a button in the toolbar.

    Status Bar

    The status bar appears at the bottom of the DataStage Director window and displays the following information:

    The name of a job (if you are displaying the Job Log view).

    The number of entries in the display. If you look at the Job Status or Job Schedule view and use the Filter Entries command, this panel specifies the number of lines that meet the filter criteria. If you have set a filter, then (filtered) or (limited) is displayed.

    The date and time on the DataStage server.

    Note: Under certain circumstances, the number of entries in the display is replaced by the last error message issued by the server. The message disappears when the screen is refreshed.

    Job Status View

    The Job Status view is the default view displayed when you start the DataStage Director (see "The DataStage Director Window" on page 2-3). You can change the default view by editing the Director options (see "Director Options" on page 2-19 for more details). You can also filter the jobs displayed in the view (see "Filtering the Job Status or Job Schedule View" on page 2-13).

    The Job Status view shows the status of all the jobs in the currently selected job category, or, if the job category pane is hidden, in the current project. The view has the following columns:

    Column                Description

    Job name              The name of the job.
    Status                The status of the job. See "Job States" for the possible job states and what they mean.
    Started / On date     The time and date a job was started. These fields are only filled in for a job with a status of Running.
    Last ran / On date    The time and date the job was finished, stopped, or aborted. These columns are blank for jobs that have never been run.
    Description           A description of the job, if available.

    Job States

    The Status column in the Job Status view displays the current status of the job. The possible job states are as follows:

    Job State             Description

    Compiled              The job has been compiled but has not been validated or run since compilation.
    Not compiled          The job is under development and has not been compiled successfully.
    Running               The job is currently being run, reset, or validated.
    Finished              The job has finished.
    Finished (see log)    The job has finished but warning messages were generated or rows were rejected. View the log file for more details.
    Stopped               The job was stopped by the operator.
    Aborted               The job finished prematurely.
    Validated OK          The job has been validated with no errors.
    Validated (see log)   The job has been validated but warning messages were generated or rows were rejected. View the log file for more details.
    Failed validation     The job has been validated, but an error was found.
    Has been reset        The job has been reset with no errors.

    Job Status Details

    To view more details about a job's status, select it in the display and do one of the following:

    Choose View ➤ Detail.

    Right-click to display the shortcut menu and choose Detail.

    Double-click the job in the display.

    The Job Status Detail dialog box appears.

  • 8/6/2019 Data Stage Director Guide

    23/98

    The DataStage Director 2-9

    This dialog box contains details of the selected job's status:

    This field    Contains this information
    Project       The name of the project and the DataStage server.
    Status        The current status of the chosen job.
    Wave #        An internal number used by DataStage when the job is run.
    Job name      The name of the job.
    Started at    The date and time the job was started. This field is used only for a job with a status of Running.
    Last run at   The date and time the job was last run.
    Description   A description of the job. This is the description entered by the developer when the job was created. If the job has job parameters, this column also displays the values used in the run.

    Use Copy to copy the whole window or selected text to the Clipboard for use elsewhere.

    Click Next or Previous to display status details for the next or previous job in the list.

    Click Close to close the dialog box.

    Shortcut Menus

    DataStage has shortcut menus that appear when you right-click in the display area or job category pane. The menu you see depends on the view or window you are using, and what is highlighted in the window when you click the mouse.

    Shortcut Menus in the Job Status View

    This Job menu appears when you right-click on a selected job in the Job Status view. If you right-click in the display area when the cursor is not over a job, a subset of this menu appears. From the Job menu you can:

    • Add the job to the schedule
    • Start a Monitor window (available only if an entry is selected)
    • Set job parameter defaults
    • Display the Job Schedule or Job Log view
    • Use Find to search for text in the display area
    • Filter (limit) the jobs listed in the display area
    • Refresh the display
    • Display details of a log entry (available only if an entry is selected)
    • Delete the selected job

    Shortcut Menus in the Job Log View

    This menu appears when you right-click on an entry in the Job Log view. If you right-click in the display area when the cursor is not over a log entry, a subset of this menu appears. From the full menu you can:

    • Start a Monitor window (available only if a log entry is selected)
    • Set job parameter defaults
    • Display the Job Status or Job Schedule view
    • Use Find to search for text in the display area
    • Filter (limit) log entries
    • Refresh the display
    • Display details of a log entry (available only if a log entry is selected)
    • Jump to the batch log from a job within a batch
    • Delete an entry

    Shortcut Menus in the Job Schedule View

    This Job menu appears when you right-click on a job in the Job Schedule view. If you right-click in the display area when the cursor is not over a job, a subset of this menu appears. From the Job menu you can:

    • Schedule, reschedule, or unschedule the selected job
    • Set job parameter defaults
    • Display the Job Status or Job Log view
    • Use Find to search for text in the display area
    • Filter (limit) the jobs listed in the display area
    • Refresh the display
    • Display details of an entry in the Job Schedule view (available only if an entry is selected)
    • Delete an entry

    Shortcut Menus in the Job Category Pane

    When you right-click in the job category pane, this menu appears, from which you can:

    • Filter (limit) the jobs listed in the display area
    • Refresh the job category pane

    Shortcut Menu in the Monitor Window

    This menu appears when you right-click in the Monitor window (described in Chapter 5). From this menu you can:

    • Refresh the display
    • Display details of a selected stage
    • Show link information in the Monitor window
    • Show CPU usage in the Monitor window
    • Clean up job resources
    • Save or clear the Monitor window settings
    • Display information about the DataStage release

    Filtering the Job Status or Job Schedule View

    By default, the Job Status view or Job Schedule view displays information about all the jobs in the currently selected job category. If you hide the job category pane, the view displays information about all the jobs in a project. On a large project, the display may include more information than you need. If you want to focus on specific types of job, based either on their name or their status, you can filter the view so that it displays only those jobs.

    To filter the jobs in the Job Status view or the Job Schedule view:

    1. If the view you want to filter is not already displayed, choose View ➤ Status or View ➤ Schedule, as appropriate.

    2. Start the Filter facility by doing one of the following:

       • Choose View ➤ Filter Entries from the menu bar.
       • Choose Filter from the shortcut menu.
       • Press Ctrl-T.

    The Filter Jobs dialog box appears:

    3. Choose which jobs to include in the view by clicking either the All jobs or Jobs matching option button in the Include area. If you select Jobs matching, enter a string in the Jobs matching field. Only jobs that match this string will be displayed. The string can include wildcards, character lists, and character ranges.

       Wildcard/Pattern   Description
       ?                  Matches any single character.
       *                  Matches zero or more characters.
       #                  Matches a single digit.
       [charlist]         Matches any single character in charlist.
       [!charlist]        Matches any single character not in charlist.
       [a-z]              Matches any single character in the range a-z.

    4. Choose which jobs to exclude from the view by clicking either the No jobs or Jobs matching option button in the Exclude area. If you select Jobs matching, enter a string in the Jobs matching field. Only jobs that match this string will be excluded. The string definition is the same as in step 3.

    5. Specify the status of the jobs you want to display by clicking an option button in the Job status area.

       • All lists jobs that have any status.
       • All, except Not compiled lists jobs with any status except Not compiled.
       • Terminated normally lists jobs with a status of Finished, Validated, Compiled, or Has been reset.
       • Terminated abnormally lists jobs with a status of Aborted, Stopped, Failed validation, Finished (see log), or Validated (see log).

    6. If you want to restrict the display to released jobs, select the Include only released jobs check box in the Released jobs area.

    7. Click OK to activate the filter. The updated view displays the jobs that meet the filter criteria. The status bar indicates that the entries have been filtered.
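    The Include and Exclude matching described in steps 3 and 4 can be sketched in a few lines of Python. This is an illustration only, not the Director's own implementation: the function names are hypothetical, and the sketch assumes that a pattern must match the whole job name and that matching is case-sensitive, neither of which is stated in this guide.

```python
import re


def pattern_to_regex(pattern):
    """Translate a Jobs matching pattern (?, *, #, [list], [!list]) to a regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "?":
            parts.append(".")          # any single character
        elif ch == "*":
            parts.append(".*")         # zero or more characters
        elif ch == "#":
            parts.append("[0-9]")      # a single digit
        elif ch == "[":
            end = pattern.find("]", i + 1)
            body = pattern[i + 1:end] if end != -1 else ""
            negate = body.startswith("!")
            chars = body[1:] if negate else body
            if not chars:
                parts.append(re.escape(ch))  # no usable list: literal '['
            else:
                # Escape regex-special class characters but keep '-' for ranges.
                chars = re.sub(r"([\\\]\[^])", r"\\\1", chars)
                parts.append("[" + ("^" if negate else "") + chars + "]")
                i = end
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts))


def filter_jobs(names, include=None, exclude=None):
    """Keep names that match include (if given) and do not match exclude."""
    inc = pattern_to_regex(include) if include else None
    exc = pattern_to_regex(exclude) if exclude else None
    return [
        name for name in names
        if (inc is None or inc.fullmatch(name))
        and not (exc is not None and exc.fullmatch(name))
    ]


jobs = ["job2input", "job2output", "job3input", "job3output"]
print(filter_jobs(jobs, include="job2*"))                    # ['job2input', 'job2output']
print(filter_jobs(jobs, include="job2*", exclude="*input"))  # ['job2output']
```

    Translating each pattern to a regular expression keeps the include and exclude tests symmetrical, so the same translation serves both fields.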

    Examples of Filtering by Job Name

    The following examples show how to use the Jobs matching field in the Filter Jobs dialog box to filter the Job Status view or the Job Schedule view.

    Example 1

    Job Names    Include Filter   Job View
    job2input    job2*            job2input
    job2output                    job2output
    job3input
    job3output

    Example 2

    Continuing Example 1, if you also specify *input as an Exclude filter, the Job Status view shows only job2output.

    Example 3

    Job Names   Include Filter   Job View
    A3tires     [A-E]3*          A3tires
    A3valves                     A3valves
    B3tires                      B3tires
    B3valves                     B3valves
    F3tires
    F3valves

    Finding Text

    If there are many entries in the display area, you can use Find to search for a particular job or event. You start Find in one of three ways:

    • Choose Search ➤ Find.
    • Choose Find from the shortcut menu.
    • Click the Find button on the toolbar.

    The Find dialog box appears:

    The Look in field shows the currently selected job category. If the job category pane is hidden, and the display area lists jobs from all job categories, the Look in field specifies All Categories. You cannot edit this field.

    To use Find:

    1. Enter text in the Find what field. This could be a date, time, status, or the name of a job.

       Note: If the text entered matches any portion of the text in any column, this constitutes a match.

    2. If the displayed entry must match the case of the text you entered, select the Match Case check box. The default setting is cleared.

    3. Choose the search direction by clicking the Up or Down option button. The default setting is Down.

    4. Click Find Next. The display columns are searched to find the entered text. The first occurrence of the text is highlighted in the display. The text can appear in any column or row of the display area.

    5. Click Find Next again to search for the next occurrence of the text.

    6. Click Cancel to close the Find dialog box.

    Note: You can also use Search ➤ Find Next to search for an entry in the display. If there is a search string in the Find dialog box, Find Next acts in the same way as the Find Next button on the Find dialog box. If there is no search string in the Find dialog box, this option displays the Find dialog box where you must enter a search string.
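    The matching rules used by Find (a match on any portion of any column, optional case matching, and an Up or Down direction) can be modeled as follows. This is a minimal sketch with a hypothetical function name and sample rows; it assumes the search stops at the end of the display rather than wrapping around, which this guide does not specify.

```python
def find_next(rows, text, current=None, match_case=False, direction="down"):
    """Return the index of the next row that contains text, or None.

    rows is a list of tuples of column strings; current is the index of the
    currently highlighted row (None if nothing is highlighted yet).
    """
    if direction not in ("down", "up"):
        raise ValueError("direction must be 'down' or 'up'")
    needle = text if match_case else text.lower()
    if direction == "down":
        first = 0 if current is None else current + 1
        candidates = range(first, len(rows))
    else:
        first = len(rows) - 1 if current is None else current - 1
        candidates = range(first, -1, -1)
    for index in candidates:
        for cell in rows[index]:
            haystack = cell if match_case else cell.lower()
            if needle in haystack:   # any portion of any column is a match
                return index
    return None


rows = [
    ("job2input", "Finished", "10:15:02"),
    ("job2output", "Running", "10:20:41"),
    ("job3input", "Aborted", "09:01:13"),
]
first_hit = find_next(rows, "job2")                      # first occurrence
second_hit = find_next(rows, "job2", current=first_hit)  # Find Next again
print(first_hit, second_hit)                             # 0 1
```

    Passing the previous result back as current mirrors clicking Find Next repeatedly.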


    Sorting Columns

    You can organize the entries in the display area by sorting the columns in ascending or descending order. The column currently being used for sorting is indicated by a symbol in the column title:

    • > indicates the sort is in ascending order.
    • < indicates the sort is in descending order.

    To sort a column do any one of the following:

    • Click the column title. This selects the column for sorting and toggles between ascending and descending.
    • Click the Ascending or Descending button on the toolbar.
    • Choose View ➤ Ascending or View ➤ Descending.

    If you choose a column that contains a date or a time, both the date and time columns are sorted together.
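    One way to picture the combined date and time sort is as a sort on a compound key. The sketch below is an assumption about the behavior, not a description of the Director's code; the row layout and the column names job, date, and time are hypothetical, and every row is assumed to have both values filled in.

```python
from datetime import date, time


def sort_rows(rows, column, descending=False):
    """Sort display rows; date and time columns are sorted as one key."""
    if column in ("date", "time"):
        def key(row):
            return (row["date"], row["time"])   # date first, then time
    else:
        def key(row):
            return row[column]
    return sorted(rows, key=key, reverse=descending)


rows = [
    {"job": "b", "date": date(2004, 1, 5), "time": time(9, 0)},
    {"job": "a", "date": date(2004, 1, 4), "time": time(23, 0)},
    {"job": "c", "date": date(2004, 1, 5), "time": time(8, 30)},
]
print([row["job"] for row in sort_rows(rows, "time")])   # ['a', 'c', 'b']
```

    Sorting on the time column alone would place job c first; using the pair keeps entries in true chronological order.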

    Printing the Current View

    You can print information from the current view. The content of the printout depends on the view you are using and the options you choose in the Print dialog box. To print the current view:

    1. Do one of the following to display the Print dialog box:

       • Choose Project ➤ Print.
       • Click the Print button on the toolbar.

       This dialog box contains the name of the printer to use. By default, this is the default Windows printer. For information on how to specify an alternative printer, see "Changing the Printer Setup" on page 2-19.

    2. Choose the range of items to print by clicking an appropriate option button in the Range area:

       • All entries prints all entries in the current view.
       • Current page prints all entries visible in the display area for the current view.
       • Current item prints the selected item only.

    3. Choose what to print by clicking an appropriate option button in the Print what area:

       • Summary only prints a summary for each item.
       • Full details prints detailed information for each item.

    4. Specify the print quality to use by choosing an appropriate setting from the Print quality list:

       • High (the default setting)
       • Medium
       • Low
       • Draft

       Note: This setting is ignored if Print to file is selected.

    5. Select the Print to file check box if you want to print to a file only.

    6. Click OK. If you are printing to file, the Print to file dialog box appears. Enter the name of a text file to use. The default is DSDirect.txt in the Program Files\Ascential\DataStage directory.

    What Is in the Printout?

    The content of the printout depends on the view you are using:

    • For the Job Status view, the printout contains the current status for each job in the project, and the date and time the job was last run.

    • For the Job Schedule view, the printout contains an entry for each scheduled job in the project specifying when the job is scheduled to run.

    • For the Job Log view, the printout can include information about each event in the job log file. For more information about the job log file, see Chapter 6.

    Changing the Printer Setup

    Printouts from the Director are normally output to the default Windows printer. To choose an alternative printer or specify other printer settings:

    1. Choose Project ➤ Print Setup. The Print Setup dialog box appears. (This dialog box is also displayed if you click Setup in the Print dialog box.)

    2. Change any of the settings as required or choose a different printer.

    3. Click OK to save the settings and close the dialog box.

    Director Options

    When you start the DataStage Director, the current settings for the Director options determine what is displayed in the DataStage Director window. The settings include:

    • The DataStage Director window position and size
    • The time interval between updates from the server
    • The number of data rows processed before a job is stopped (only applies to server jobs)
    • The number of unexpected warnings that are permitted before a job is aborted
    • Whether the toolbar, status bar, or icons are displayed
    • The filter criteria
    • The process priority level for the DataStage Director

    You can modify the settings for the Director options by choosing Tools ➤ Options. The Options dialog box appears:

    The five pages in the dialog box are described in the following sections.

    General Page

    From the General page you can:

    • Change the server update interval
    • Save window settings
    • Compare the client and server times

    Changing the Server Update Interval

    The Refresh Interval (seconds) is the time, in seconds, between updates from the server. Click the arrow buttons to increase or decrease the value in the box. The default setting is 4, the minimum is 2, and the maximum is 65.

    Note that if you choose a long refresh time, the status displayed in the DataStage Director window may not represent what is happening on the server. For example, if you start a run, the job status may not update to Running until a whole refresh interval has elapsed. Conversely, if you choose a refresh time that is too short, the DataStage Director requests information from the server at a rate that is too frequent and unproductive. You must find a value between these extremes that meets your update requirements.
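    The bounds on the refresh interval can be expressed as a simple clamp. The constant and function names below are hypothetical and are used only to restate the documented limits (default 4, minimum 2, maximum 65); the sketch assumes out-of-range values are pulled back into range rather than rejected.

```python
REFRESH_DEFAULT = 4   # seconds, the documented default
REFRESH_MIN = 2       # seconds, the documented minimum
REFRESH_MAX = 65      # seconds, the documented maximum


def clamp_refresh_interval(seconds=REFRESH_DEFAULT):
    """Return a refresh interval forced into the documented 2 to 65 range."""
    return max(REFRESH_MIN, min(REFRESH_MAX, int(seconds)))


print(clamp_refresh_interval(1), clamp_refresh_interval(30), clamp_refresh_interval(100))
# 2 30 65
```

    With an interval of n seconds, the displayed status can lag the server by up to n seconds, which is the trade-off the paragraph above describes.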

    Saving Window Settings

    The Save on exit area contains two check boxes that determine the settings that are saved on exit from the DataStage Director.

    • If Main window size and position is selected, the DataStage Director is restarted at the same screen coordinates as when it exited.

    • If Filter settings is selected, the current filter settings are saved and used when the DataStage Director is started.

    Both check boxes are selected by default.

    Comparing Client and Server Times

    If you select the Compare client and server system times check box, when the Director first attaches to a project, it checks that the system times on the client and the server are within five minutes of each other. If they are not, a warning message appears. This check box is selected by default.
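    The five-minute comparison can be illustrated as follows. The function name is hypothetical, and the sketch assumes that a difference of exactly five minutes still counts as acceptable and that both times are expressed in the same time zone.

```python
from datetime import datetime, timedelta

MAX_SKEW = timedelta(minutes=5)   # the documented tolerance


def clocks_agree(client_time, server_time, max_skew=MAX_SKEW):
    """Return True if the two system times differ by no more than max_skew."""
    return abs(client_time - server_time) <= max_skew


client = datetime(2004, 1, 15, 10, 0, 0)
server = datetime(2004, 1, 15, 10, 6, 30)
if not clocks_agree(client, server):
    print("Warning: client and server system times differ by more than five minutes")
```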

    Limits Page

    The Limits page sets the maximum number of rows to process in a job run, and the maximum number of warning messages to allow before a job aborts. The limits apply to all server jobs in the current session. You can override the settings for an individual job when it is validated, run, or scheduled.

    Setting the Maximum Number of Rows

    The option buttons set the maximum number of rows to be processed by the job. Click No limit to process all rows or Stop stages after n rows to specify a number of rows.

    Enter a value from 1 through 99999 or click the arrow buttons to increase or decrease the value. The default value is 1000.

    Setting the Maximum Number of Warnings

    The option buttons set the maximum number of warning messages allowed before a job is aborted. Select No limit to log all error messages or Abort job after to specify a number of warning messages. Enter a value from 1 through 99999 or click the arrow buttons to increase or decrease the value. The default value is 50.
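    The two limits and their per-run override can be modeled with a small value object. This is a sketch only: the RunLimits class and its field names are hypothetical, None stands in for the No limit option, and the defaults shown are the documented default values of the two number boxes.

```python
from dataclasses import dataclass, replace
from typing import Optional

LIMIT_MIN, LIMIT_MAX = 1, 99999   # the documented range for both limits


@dataclass(frozen=True)
class RunLimits:
    max_rows: Optional[int] = 1000     # None means "No limit"
    max_warnings: Optional[int] = 50   # None means "No limit"

    def __post_init__(self):
        # Reject values outside the range the dialog box accepts.
        for name in ("max_rows", "max_warnings"):
            value = getattr(self, name)
            if value is not None and not LIMIT_MIN <= value <= LIMIT_MAX:
                raise ValueError(
                    f"{name} must be None or between {LIMIT_MIN} and {LIMIT_MAX}"
                )


session_limits = RunLimits()                        # limits for the session
this_run = replace(session_limits, max_rows=None)   # override for one job run
print(session_limits, this_run)
```

    Keeping the session limits immutable and deriving a copy for each run reflects the statement that an individual job can override the settings without changing them for the session.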

    View Page

    The options on the View page determine what is displayed in the DataStage Director window.

    The check boxes in the Show area are selected by default:

    Toolbar         Displays the toolbar.
    Status bar      Displays the status bar.
    Date and time   Displays the date and time (of the server) on the status bar.
    Icons           Displays the icons in the views.

    Specify the view to display when the Director is started, by clicking the appropriate option button:

    • Status of jobs (the default setting)
    • Schedule
    • Log for last job

    Priority Page

    The Priority page is included for sites where the DataStage client and server components are installed on the same computer (this currently excludes systems running parallel jobs). When jobs are running, the performance of the DataStage Director may be noticeably slower. You can improve the performance by changing the priority of the DataStage Director process.

    Setting the Process Priority

    The option buttons in the Set process priority area allow you to change the priority of the DataStage Director process.

    • Click High always to increase the priority of the DataStage Director, regardless of where the client and server components are installed. Note that if you choose this option and the client and server components are installed on different computers, you may not see an improvement in the performance of the DataStage Director.

    • Click High if project on local system to increase the priority of the DataStage Director process if the client and server components reside on the same machine. This is the default setting.

    • Click Normal always if you do not want to change the priority setting for the DataStage Director process.

    Displaying the Current State

    The Current state area displays the current priority setting.

    Note: If you choose a high priority setting, it may take longer for a job to run. This is because processor cycles are directed toward monitoring jobs rather than running them.

    Click OK to save the settings.
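    The three Set process priority options amount to a small decision rule, sketched below. The function name is hypothetical, and the sketch assumes that High if project on local system leaves the priority at Normal when the project is on a remote server, which the option name implies but this guide does not state outright.

```python
def effective_priority(setting, project_is_local):
    """Return "High" or "Normal" for a given Set process priority option."""
    if setting == "High always":
        return "High"
    if setting == "High if project on local system":
        return "High" if project_is_local else "Normal"
    if setting == "Normal always":
        return "Normal"
    raise ValueError(f"unknown priority setting: {setting!r}")


# The default setting, with client and server on different computers:
print(effective_priority("High if project on local system", project_is_local=False))
# Normal
```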

    Choosing an Alternative Project

    When you start the DataStage Director, the project chosen in the Attach to Project dialog box is opened. You can view the jobs in another DataStage project without exiting the DataStage Director.

    Viewing Jobs in Another Project

    To view the jobs in another project:

    1. Do one of the following to display the Open Project dialog box:

       • Choose Project ➤ Open.
       • Click the Open Project button on the toolbar.

    2. Choose the project you want to open from the Projects list box. This list box contains all the DataStage projects on the server specified in the Host system field, which is the server you initially attached to.

    3. Click OK to open the chosen project. The updated DataStage Director window displays the jobs in the new project.

    Viewing Jobs on a Different Server

    To open a project on a different DataStage server:

    1. Click New host. The Attach to Project dialog box appears.

    2. Enter the name of the DataStage server and your logon details before choosing the project from the Project drop-down list.

    3. Click OK to open the chosen project. The updated DataStage Director window displays the jobs in the new project.

    Note: If you have Monitor windows open when you choose an alternative project, you are prompted to confirm that you want to change projects. If you click Yes, the Monitor windows are closed before the new project is opened. See Chapter 5, "Monitoring Jobs," for more details.

    Exiting the DataStage Director

    To exit the Director, choose Project ➤ Exit. Any open windows (for example, Monitor windows) are automatically closed on exit.

    3 Running DataStage Jobs

    This chapter describes how to run DataStage jobs, including the following topics:

    • Setting job options
    • Validating jobs
    • Starting, stopping, and resetting a job run
    • Deleting jobs from a project
    • Cleaning up the resources of jobs that have hung or aborted
    • Creating multiple job invocations

    These tasks are performed from the Job Status view in the DataStage Director window. To switch to this view, choose View ➤ Status, or click the Status button on the toolbar.

    Setting Job Options

    Each time you validate, run, or schedule a job you can:

    • Change the job parameters associated with the job, as appropriate, and set values for environment variables that have been defined as job parameters. (You can also set up default values for job parameters.)

    • Override any default limits for row processing (server jobs and job sequences only) and warning messages that are set for the job run.

    • Assign Invocation IDs to create multiple job invocations. You can create as many invocations as you want. For more information on creating multiple job invocations, see "Multiple Job Invocations" on page 3-17.

    • Set tracing options for server jobs and job sequences. For more information, see "Setting Tracing Options" on page 3-20.

    You do this from the Job Run Options dialog box, which appears automatically when you run, validate, or schedule a job.

    Some parameters have default values assigned to them. You can use the default or enter another value. You can reinstate the default values by clicking Set to Default or All to Default. In the example screen, the default values are encrypted strings, which are shown as asterisks. You can also set your own defaults for job parameters from the Director.

    Some job parameters hold variable information such as dates or file names that you need to enter for each job run. You must enter appropriate values in all the fields before you can continue.

    If the job's designer included help text for the job parameters, you can get help by selecting the parameter and clicking Property Help.

    You can also use this dialog box to set values for environment variables that affect parallel job runs. When you design the job, you can add environment variables to the list of job parameters; this dialog box then asks you to supply values for those variables for this run. Environment variables are identified by a $ sign. When setting a value for an environment variable, you can specify the special value $ENV, which instructs DataStage to use the current setting for the environment variable. Note that you cannot use $ENV when viewing data on parallel jobs. You will be warned if you try to do this.

    Note: The dialog box displays a Parameters page only if the job has parameters.
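    The effect of the special value $ENV can be illustrated with a short sketch. This is not DataStage code: the function name and the sample parameter names are hypothetical, and the sketch assumes that an environment variable that is not set resolves to None.

```python
import os


def resolve_parameters(params, environ=None):
    """Replace the special value $ENV with the variable's current setting.

    params maps parameter names to the values entered for the run; names
    beginning with $ are treated as environment variables.
    """
    environ = os.environ if environ is None else environ
    resolved = {}
    for name, value in params.items():
        if name.startswith("$") and value == "$ENV":
            resolved[name] = environ.get(name[1:])   # current setting, or None
        else:
            resolved[name] = value                   # value entered for this run
    return resolved


run_values = {"$APT_CONFIG_FILE": "$ENV", "$TMPDIR": "/scratch", "TargetDir": "/data/out"}
current_env = {"APT_CONFIG_FILE": "/opt/config/default.apt", "TMPDIR": "/tmp"}
print(resolve_parameters(run_values, current_env))
```

    In this example $APT_CONFIG_FILE takes the current environment setting, while $TMPDIR uses the value entered for the run.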

    Validating a Job

    You can check that a job or job invocation will run successfully by validating it. Jobs should be validated before running them for the first time, or after making any significant changes to job parameters.

    When a server job is validated, the following checks are made without actually extracting, converting, or writing data:

    • Connections are made to the data sources or data warehouse.

    • SQL SELECT statements are prepared.

    • Files are opened. Intermediate files in Hashed File, UniVerse, or ODBC stages that use the local data source are created, if they do not already exist.

    When a parallel job is validated, the job is run in check only mode so data is not affected.

    To validate a job:

    1. Select the job or job invocation you want to validate in the Job Status view.

    2. Choose Job ➤ Validate. The Job Run Options dialog box appears. See "Setting Job Options" on page 3-1.

    3. Fill in the job parameters as required.

    4. Click Validate. Click OK to acknowledge the message. The job is validated and the job's status is updated to Running.

    Note: It may take some time for the job status to be updated, depending on the load on the server and the refresh rate for the client.

    Once validation is complete, the updated job's status displays one of these status messages:

    • Validated OK. You can now schedule or run the job.

    • Failed validation. You need to view the job log file for details of where the validation failed. For more details, see Chapter 6, "The Job Log File."

    If you want to monitor the validation in progress, you can use a Monitor window. For more information, see Chapter 5, "Monitoring Jobs."

    Running a Job

    You can run a job in two ways:

    • Immediately.

    • By scheduling it to run at a later time or date. See "Job Scheduling" on page 3-6 for how to do this.

    If you run a job immediately, you must ensure that the data sources and data warehouse are accessible, and that other users on your system will not be affected by the job run. To run a job immediately:

    1. Select the job or job invocation in the Job Status view.

    2. Do one of the following:

       • Choose Job ➤ Run Now.
       • Click the Run button on the toolbar.

       The Job Run Options dialog box appears. See "Setting Job Options" on page 3-1.

    3. Fill in the job parameters and check warning and row limits for the job, as appropriate.

    4. Optionally, click Validate to validate the job.

    5. Click Run. The job is scheduled to run with the current date and time and the job's status is updated to Running.

    Note: It may take some time for the job status to be updated, depending on the load on the server and the refresh rate for the client.

    Stopping a Job

    To stop a job that is currently running:

    1. Select it in the Job Status view.

    2. Do one of the following:

       • Choose Job ➤ Stop.
       • Click the Stop button on the toolbar.

    The job or invocation is stopped, regardless of the stage currently being processed, and the job's status is updated to Stopped.

    Note: It may take some time for the job status to be updated, depending on the load on the server and the refresh rate for the client.

    Resetting a Job

    If a job has stopped or aborted, it is difficult to determine whether all the required data was written to the target data tables. When a job has a status of Stopped or Aborted, you must reset it before running the job again.

    By resetting a job, you set it back to a runnable state and, optionally, return your target files to the state they were in before the job was run.

    Note: You can only reinstate sequential files and hashed files to a prerun state if the backup option has been chosen on the corresponding stage in the job design.

    If you want to undo the updates performed during a successful job, you can also use the Reset command for jobs with a status of Finished. The Reset command is not available for jobs with a status of Not compiled or Running.

    To reset a job or job invocation:

    1. Select the job or invocation you want to reset in the Job Status view.

    2. Choose Job ➤ Reset or click the Reset button on the toolbar. A message box appears.

    3. Click Yes to reset the tables. All the files in the job are reinstated to the state they were in before the job was run. The job's status is updated to Has been reset.

    Note: It may take some time for the job status to be updated, depending on the load on the server and the refresh rate for the client.
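    The rules in this section about which job states require a reset, and for which states the Reset command is unavailable, can be summarized in code. The names below are hypothetical, and the sketch assumes that Reset is available for every state other than the two this section rules out.

```python
MUST_RESET_BEFORE_RUN = {"Stopped", "Aborted"}
RESET_UNAVAILABLE = {"Not compiled", "Running"}


def must_reset_before_run(status):
    """Return True if a job in this state must be reset before it is run again."""
    return status in MUST_RESET_BEFORE_RUN


def reset_available(status):
    """Return True if the Reset command can be used on a job in this state."""
    return status not in RESET_UNAVAILABLE


for status in ("Aborted", "Finished", "Running"):
    print(status, must_reset_before_run(status), reset_available(status))
```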

    Restarting Job Sequences

    If a sequence is restartable (that is, it is recording checkpoint information) and one of its jobs fails during a run, then the following status appears in the DataStage Director:

    Aborted/restartable

    In this case you can take one of the following actions:

    • Run Job. The sequence is re-executed, using the checkpoint information to ensure that only the required components are re-executed.

    • Reset Job. All the checkpoint information is cleared, ensuring that the whole job sequence will be run when you next specify run job.

    Note: If, during sequence execution, the flow diverts to an error handling stage, DataStage does not checkpoint anything more. This is to ensure that stages in the error handling path will not be skipped if the job is restarted and another error is encountered.
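    The difference between Run Job and Reset Job on a restartable sequence can be pictured with a toy model of checkpointing. This sketch is purely illustrative: the function names, the use of a set to hold checkpoints, and the clearing of checkpoints after a successful run are all assumptions, and the error handling rule in the note above is not modeled.

```python
def run_sequence(steps, checkpoints):
    """Run (name, action) steps in order, skipping those already checkpointed.

    checkpoints is a set of completed step names that persists between runs.
    """
    for name, action in steps:
        if name in checkpoints:
            continue                      # completed in an earlier run
        try:
            action()
        except Exception:
            return "Aborted/restartable"  # checkpoints are kept for a restart
        checkpoints.add(name)
    checkpoints.clear()                   # assumed: a clean finish starts afresh
    return "Finished"


def reset_sequence(checkpoints):
    """Discard all checkpoint information so the whole sequence runs next time."""
    checkpoints.clear()
```

    Calling run_sequence again with the same checkpoint set models Run Job, which re-executes only the steps that did not complete; calling reset_sequence first models Reset Job.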

    Setting Default Job Parameters

    You can set default values for any parameters a job has. These override any defaults set in the job design. Note, however, that if you recompile the job or upgrade DataStage, the defaults are reset to those set in the job design.

    To set default parameters:

    1. Select the job in the display area.

    2. Choose Job ➤ Set Defaults. The Set Job Parameter Defaults dialog box appears.

    3. If defaults have been set in the Designer for this job, they will be displayed. Edit them to override them.
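    The order of precedence implied here (values entered for a run, then defaults set in the Director, then defaults from the job design) can be sketched as a layered lookup. The function and parameter names are hypothetical and the sketch is an interpretation of the text, not a description of how DataStage stores these values.

```python
from collections import ChainMap


def effective_parameters(design_defaults, director_defaults=None, run_values=None):
    """Combine parameter values; earlier layers take precedence over later ones."""
    layers = ChainMap(run_values or {}, director_defaults or {}, design_defaults)
    return dict(layers)


design = {"SourceDir": "/data/in", "BatchSize": "500"}
director = {"BatchSize": "1000"}
print(effective_parameters(design, director, {"SourceDir": "/data/rerun"}))

# After a recompile the Director defaults are reset, leaving the design defaults:
print(effective_parameters(design))
```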


    Job SchedulingYou can schedule a job to run in a number of ways:

    Once today at a specified time Once tomorrow at a specified time On a specific day and at a particular time Daily at a particular time On the next occurrence of a particular date and time

    Each job can be scheduled to run on any number of occasions, usingdifferent job parameters if necessary. For example, you can schedule a jobto run at different times on different days.The scheduled jobs aredisplayed in the Job Schedule view. For example, you can schedule it torun at different times on different days.

    Note: Windows NT restricts job scheduling to administrators, so you need to be logged on as a Windows NT administrator to use the DataStage scheduling features. Microsoft has published a workaround to this restriction; visit http://support.microsoft.com/directory/ and look up article Q124859 for details.

    Job Schedule View

    The Job Schedule view displays details of all scheduled and unscheduled jobs and batches in the currently selected job category. If the job category pane is hidden, the Job Schedule view displays details of all scheduled and unscheduled jobs and batches in the project, regardless of their job category.

    To display the job schedule, choose View ➤ Schedule or click the Schedule button on the toolbar.

    You can filter the view to display specific types of job, based on their name or status (see "Filtering the Job Status or Job Schedule View" on page 2-13). The icon on the left of the Job name column indicates that a job is scheduled.

    The To be run column shows when the job is scheduled to run, as shown in the following table:

    To be run    Means this

    Every n      n is a number representing the date. For example, Every 12&27 means the job is scheduled to run on the 12th and 27th day of each month.

    Every x      x represents the day of the week: M = Monday, T = Tuesday, W = Wednesday, Th = Thursday, F = Friday, S = Saturday, Su = Sunday. For example, Every Th&F means the job is scheduled to run every Thursday and Friday.

    Every n&x    n is a date and x is a day of the week (as above). For example, Every 10&Su means the job is scheduled to run on the 10th day of every month and every Sunday.

    Today        The job runs today at the specified time.

    Tomorrow     The job runs tomorrow at the specified time.

    Next n       n is a date (as above). For example, Next 28 means the job runs on the next 28th of the month.

    Next x       x is a day of the week. For example, Next W means the job runs on the next Wednesday in the month.

    Next n&x     n is a number and x is a day of the week. For example, Next 5&12&T means the job is scheduled to run on the next 5th and 12th day of the month, and the next Tuesday.

    The At time column lists the time at which the job will run. This is displayed in the system's current time format: 12-hour or 24-hour clock.

    The Parameters/Description column lists the parameters required to run the job. Each job has built-in job parameters which must be entered when you schedule or run a job. The entered values are displayed in this column in the following format:

    parameter1 name = value, parameter2 name = value,

    A brief description appears here if there is a short description defined and there are no job parameters.

    Viewing Details of a Job Schedule

    To view more details about a scheduled job or batch, select it in the display area and do one of the following:

    • Choose View ➤ Detail.
    • Choose Detail from the shortcut menu.
    • Double-click the job or batch in the display.

    If you choose a batch, it has the same effect as choosing Tools ➤ Batch, as described in "Creating a Job Batch" on page 4-2.

    If you choose a job, the Job Schedule Detail dialog box appears:

    This dialog box contains a summary of the job details and all the settings used to schedule the job.

    This field       Contains this information

    Project          The name of the project and the DataStage server.
    Schedule #       The schedule number the job has been assigned.
    Occurrences      The number of times the job will be run using this schedule. A value of Repeats means that the job is continuously rescheduled.
    Job name         The name of the job.
    Run time         The time the job is set to run, in 24-hour format.
    Run date         The date the job is set to run.
    Job parameters   The job parameters. Each entry in this field is in the format parameter name=value.

    Note: The parameter name displayed here is the name used internally by the job, not the descriptive parameter name you see when you enter job parameter values.

    Use Copy to copy the schedule details and job parameters to the Clipboard for use elsewhere.

    Click Next or Previous to display schedule details for the next or previous job in the list. These buttons are only active if the next or previous job is scheduled to run.

    Click Close to close the window.
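    The To be run notation shown in the Job Schedule view (for example, Every 12&27 or Next 5&12&T) is regular enough to decode in a script. The following is a minimal sketch, assuming the notation is exactly as described in this chapter; the function name and the shape of the returned dictionary are hypothetical.

```python
from typing import Dict, List

# Day-of-week codes as listed for the To be run column.
DAY_CODES: Dict[str, str] = {
    "M": "Monday", "T": "Tuesday", "W": "Wednesday", "Th": "Thursday",
    "F": "Friday", "S": "Saturday", "Su": "Sunday",
}


def decode_to_be_run(text: str) -> Dict[str, object]:
    """Split a To be run value such as 'Every 10&Su' into frequency, dates, and days."""
    keyword, _, rest = text.strip().partition(" ")
    if keyword not in ("Every", "Next"):
        # Values such as Today or Tomorrow carry no date or day list.
        return {"frequency": text.strip(), "dates": [], "days": []}
    dates: List[int] = []
    days: List[str] = []
    for token in rest.split("&"):
        token = token.strip()
        if token.isdigit():
            dates.append(int(token))          # a day of the month
        elif token in DAY_CODES:
            days.append(DAY_CODES[token])     # a day of the week
        else:
            raise ValueError(f"unrecognized schedule token: {token!r}")
    return {"frequency": keyword, "dates": dates, "days": days}
```

    For example, decode_to_be_run("Next 5&12&T") separates the dates 5 and 12 from the day Tuesday.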

    Scheduling a Job

    To schedule a job:

    1. Select the job or job invocation you want to schedule in the Job Status or Job Schedule view.

    Note: You cannot schedule a job with a status of Not compiled.

    2. Do one of the following to display the Add to schedule dialog box:

    • Choose Job ➤ Add to Schedule.
    • Choose Add To Schedule from the appropriate shortcut menu.
    • Click the Schedule button on the toolbar.

    3. Choose when to run the job by clicking the appropriate option button:

    • Today runs the job today at the specified time (in the future).
    • Tomorrow runs the job tomorrow at the specified time.
    • Every runs the job on the chosen day or date at the specified time in this month and repeats the run at the same date and time in the following months.
    • Next runs the job on the next occurrence of the day or date at the specified time.
    • Daily runs the job every day at the specified time.

    4. If you selected Every or Next in step 3, choose the day to run the job by doing one of the following:

    • Choose an appropriate day or days from the Day list.
    • Choose a date from the calendar.

    Note: If you choose an invalid date, for example, 31 September, the behavior of the scheduler depends upon the server operating system, and you may not receive a warning of the invalid date. Refer to your server documentation for further information.

    5. Choose the time to run the job. There are two time formats:

    • 12-hour clock. Click either AM or PM.
    • 24-hour clock. Click 24H Clock.

    Click the arrow buttons to increase or decrease the hours and minutes, or enter the values directly.

    6. Click OK. The Add to schedule dialog box closes and the Job Run Options dialog box appears.

    7. Fill in the job parameter fields and check warning and row limits, as appropriate.

    8. Click Schedule. The job is scheduled to run and is added to the Job Schedule view.
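    Because the scheduler may not warn you about an invalid date such as 31 September (see the note in step 4), it can be worth checking a day of the month before you schedule an Every run on it. The sketch below uses only the Python standard library; the function name is hypothetical and the check is independent of DataStage.

```python
import calendar
from typing import List


def months_missing_day(day: int, year: int) -> List[int]:
    """Return the month numbers (1-12) of `year` that do not contain the given day."""
    if not 1 <= day <= 31:
        raise ValueError("day of month must be between 1 and 31")
    missing: List[int] = []
    for month in range(1, 13):
        # monthrange returns (weekday of the 1st, number of days in the month).
        days_in_month = calendar.monthrange(year, month)[1]
        if day > days_in_month:
            missing.append(month)
    return missing
```

    An empty result means the chosen day exists in every month of that year; otherwise the listed months are the ones in which the scheduler's behavior depends on the server operating system.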

    Unscheduling a Job

    If you want to prevent a job from running at the scheduled time, you must unschedule it. To unschedule a job:

    1. Select the job you want to unschedule in the Job Schedule view.

    2. Do one of the following:

    • Choose Job ➤ Unschedule.
    • Choose Unschedule from the Job shortcut menu.

    If the job is not scheduled to run at another time, the To be run column is updated to Not scheduled, and the job is not run again until you add it to the schedule.

    Rescheduling a Job

    If you have a job scheduled to run, but you want to change the frequency, day, or time it is run, you can reschedule it. To reschedule a job:

    1. Select the job you want to reschedule in the Job Schedule view.

    2. Do one of the following to display the Add to schedule dialog box:

    • Choose Job ➤ Reschedule.
    • Choose Reschedule from the Job shortcut menu.
    • Click the Reschedule button on the toolbar.

    The current settings for the job are shown in the dialog box.

    3. Edit the frequency, day, or time you want the job to run.

    4. Click OK. The Add to schedule dialog box closes and the Job Run Options dialog box appears.

    5. Fill in the job parameters and check warning and row limits as appropriate.

    6. Click Reschedule. The job is rescheduled and the To be run column in the Job Schedule view is updated.

    Deleting a Job

    You can remove unwanted or old versions of jobs from your project as follows:

    1. Select the job or job invocation in the Job Status view. You can make multiple selections.

    2. Choose Job ➤ Delete. A message asks you to confirm that you want to delete the chosen job or jobs.

    3. Click Yes to delete the jobs. A message confirms they have been deleted.

    4. Click OK. The jobs and all the associated components used at run time are deleted, including the files and records used by the Job Log view and the Monitor window.

    5. If you delete a job that is part of a batch, edit the batch to remove the deleted job to prevent the batch from failing. See Chapter 4, "Job Batches."

    If you delete the last server job from a job category, that category is removed from the job category tree in the Director window.

    Job Administration

    From the DataStage Administration client, the administrator can enable job administration commands that let you clean up the resources of a job that has hung or aborted. These commands help you return the job to a state in which you can rerun it after the cause of the problem has been fixed. You should use them with care, and only after you have tried to reset the job and you are sure it has hung or aborted.

    There are two job administration commands:

    • Cleanup Resources
    • Clear Status File

    Cleaning Up Job Resources

    This facility only applies to server jobs. The Cleanup Resources command lets you:

    • View and end job processes
    • View and release the associated locks

    To run this command, do one of the following:

    • Choose Job ➤ Cleanup Resources from the menu bar.
    • Choose Cleanup Resources from the Monitor window shortcut menu.

    The Job Resources dialog box appears, from which you can view and clean up the resources of the selected job.

    In the Processes area:

    This column              Displays this information

    PID #                    The process identification number.
    Context                  The process context. In a job with more than one active stage, a PID may be reused during a job run. In that case the context field may have entries for more than one active stage. The context will always be listed as Unavailable if the Show All option button is selected.
    User Name                The identity of the user whose job started the process.
    Last Command Processed   The command executed most recently by the process.

    In the Locks area:

    This column    Displays this information

    PID/User #     The identification number of the process associated with the lock.
    Lock Type      The type of lock: File, Record, or Group.
    Item Id        The identity of the item (record) locked by the process. For a Group lock, this column is left blank.

    Viewing Processes and Locks

    By default, the Job Resources dialog box displays all the processes and locks associated with the job currently selected in the DataStage Director. You can, however, filter the display by using the option buttons in the Processes and Locks areas of the Job Resources dialog box.

    In the Processes area:

    • Show All displays all current DataStage processes.
    • Show by job (the default) displays all processes for the selected job.

    In the Locks area:

    • Show All displays all the current locks.
    • Show by job (the default) displays all locks for the selected job.
    • Show by process displays all the locks associated with the process that you have selected in the Processes area in the Job Resources dialog box.

    Ending Job Processes

    To end job processes:

    1. From the Job Resources dialog box, choose the range of processes to list by using the Show All or Show by job option buttons in the Processes area.

    2. To end all the processes associated with a job, click Logout All. (This button is disabled if you have clicked the Show All option button.)

    To end a particular process, select the process in the Processes list box, then click Logout.

    3. Wait for the processes to end (log out) and the display to update.

    You can refresh the display manually at any time by clicking Refresh .

    If this procedure fails to end a process that you believe is causing a job to hang, try the following steps:

    1. Log out of all DataStage clients.

    2. Try to end the process using the Windows NT Task Manager or kill the process in UNIX.

    3. Stop and restart the DataStage Server Engine.

    4. Reset the job from the DataStage Director (see "Resetting a Job" on page 3-5).

    If there is a problem with a job, you can also release locks (see the next section), or clear the job status file (see "Clearing a Job Status File" on page 3-16).

    Releasing Locks

    To release locks:

    1. From the Job Resources dialog box, choose the range of locks to list by doing either of the following:

    • Click the Show by job option button in the Locks area.
    • Select a process in the Processes area, then click the Show by process option button.

    Note: You cannot release locks if you have clicked the Show All option button in the Locks area.

    2. Click Release All. Each of the displayed locks is unlocked and the display updates automatically. (You cannot select or release individual locks.)

    You can refresh the display manually at any time by clicking Refresh .
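    The relationship between the Locks area filters and Release All can be summarized in a small model. This is a sketch of the rules stated above, not of how DataStage stores locks: the Lock class, its job field, and both function names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Lock:
    pid: int             # PID/User # column
    lock_type: str       # Lock Type column: "File", "Record", or "Group"
    item_id: str = ""    # Item Id column; blank for a Group lock
    job: str = ""        # owning job; an assumption, not a column in the dialog box


def shown_locks(locks: List[Lock], view: str,
                job: Optional[str] = None, pid: Optional[int] = None) -> List[Lock]:
    """Return the locks listed for a view: 'all', 'job', or 'process'."""
    if view == "all":
        return list(locks)
    if view == "job":
        return [lock for lock in locks if lock.job == job]
    if view == "process":
        return [lock for lock in locks if lock.pid == pid]
    raise ValueError(f"unknown view: {view!r}")


def release_all(locks: List[Lock], view: str,
                job: Optional[str] = None, pid: Optional[int] = None) -> List[Lock]:
    """Model Release All: every displayed lock is released; return the locks that remain."""
    if view == "all":
        raise ValueError("locks cannot be released while Show All is selected")
    released = shown_locks(locks, view, job, pid)
    return [lock for lock in locks if lock not in released]
```

    The model makes the two restrictions explicit: Release All acts on the whole displayed set rather than on individual locks, and it is unavailable in the Show All view.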

    Clearing a Job Status File

    When you clear a job's status file you reset the status records associated with all stages in that job. You should therefore use this option with great care, and only when you believe a job has hung or aborted. In particular, before you clear a status file you should:

    1. Try to reset the job (see "Resetting a Job" on page 3-5).

    2. Ensure that all the job's processes have ended (see "Ending Job Processes" on page 3-15).

    To clear the job status file, choose Job ➤ Clear Status File from the menu bar. The job status changes to Compiled and no evidence remains that the job has ever run.

    If there is a problem with a job, you can also release the locks (see "Releasing Locks" on page 3-16).

    Multiple Job Invocations

    You can create multiple invocations of a DataStage server or parallel job, with each invocation starting with different parameters to process different data sets. A job invocation can be invoked regardless of the state of other invocations which are processing different data sets.

    If you run a job in the Director without giving an Invocation Id, you cannot create any new invocations of that job until the job has finished. If you want to run several invocations of the same job at the same time, you must give an Invocation Id for the first invocation.

    The job designer should ensure that the job is suitable to have multiple invocations run. For example, an unsuitable job may have different invocations running concurrently and writing to the same table. An unsuitable job may also adversely affect job performance.

    Parallel job invocations resulting from a decision to invoke multiple invocations of a job should not be confused with the several instances of the same job that you get when running a partitioned job across several processors. In the latter case the partitioning and collecting built in to the job will handle the situation where several processes want to read or write to the same data source.

    Creating Multiple Job Invocations

    1. You enable multiple job invocations from the DataStage Designer. You must select the Allow Multiple Instance option on the Job Properties page. See the DataStage Designer Guide for more details on editing job properties.

    2. Compile the job as normal. See the DataStage Designer Guide for more details on compiling jobs.

    Note: When you recompile a job, its invocations are canceled and new invocations have to be created. This ensures that job integrity is maintained after job design changes.

    3. From the Job Status view select the job, and choose Job ➤ Validate. The Job Run Options dialog box appears.

    4. Enter an Invocation Id in the text field. This is suffixed to the job name to create the invocation. For example, if the job name is Exercise5 and you enter an Invocation Id of Test1, the job appears as Exercise5.Test1.

    5. Click Validate. The Job Status view now shows the new job invocation.
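    The naming rule in step 4, and the restriction on running a job without an Invocation Id, can be expressed in a few lines. This is a minimal sketch of those two rules only; the function and class names are hypothetical and nothing here calls DataStage.

```python
from typing import Optional, Set


def invocation_name(job_name: str, invocation_id: Optional[str] = None) -> str:
    """Suffix the Invocation Id to the job name, for example Exercise5.Test1."""
    return f"{job_name}.{invocation_id}" if invocation_id else job_name


class RunningJobs:
    """Toy model of the rule about running a job without an Invocation Id."""

    def __init__(self) -> None:
        self.running: Set[str] = set()

    def start(self, job_name: str, invocation_id: Optional[str] = None) -> str:
        # A run started without an Invocation Id blocks new invocations until it finishes.
        if job_name in self.running:
            raise RuntimeError(f"{job_name} is running without an Invocation Id")
        name = invocation_name(job_name, invocation_id)
        self.running.add(name)
        return name

    def finish(self, name: str) -> None:
        self.running.discard(name)
```

    In this model, Exercise5.Test1 and Exercise5.Test2 can run side by side, but starting Exercise5 on its own prevents any invocation from starting until that run has finished.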

    Running a Job Invocation

    You run an invocation in the same way as any other job.

    1. From the Job Status view, select the invocation from the list and choose Job ➤ Run Now. The Job Run Options dialog box appears.

    2. Fill in any Parameters, Rows, and Warning Limits as required. Click Run.

    Note: Another way to run an invocation is to choose the job from the list, enter the Invocation Id in the text field, and then click Run.

    Viewing the Job Log for an Invocation

    You view the job log for a job invocation in exactly the same way as any other job log.

    Setting Tracing Options

    The Tracing page is included in the Job Run Options dialog box to help Ascential Software analysts during troubleshooting. You can generate tracing information and performance statistics for server jobs.

    The options on this page determine the amount of diagnostic information generated the next time a job is run. Diagnostic information is generated only for the active stages in a chosen job. To specify the job, highlight it in the Job Status or Job Schedule view before choosing Tools ➤ Options.

    The Stage names list box contains the names of the active stages in the job in the format jobname.stagename. To set a trace level, highlight the stages in the Stage names list box, then select any of the following check boxes:

    • Report row data. Records an entry for every data row read on input and written on output. This check box is cleared by default.

    • Property values. Records an entry for every input and output opened and closed. This check box is cleared by default.

    • Subroutine calls. Records an entry for every BASIC subroutine used. This check box is cleared by default.

    • Performance statistics. When performance tracing is turned on, a special log entry is generated immediately after the stage completion message. This contains performance statistics about the selected stages in tabular form. (For more information, see Chapter 2 of the DataStage Server Job Developer's Guide.)

    When the job runs, a file is created for each active stage in the job. The files are named using the format jobname.stagename.trace and are stored in the &PH& subdirectory of your DataStage server installation directory. If you need to view the content of these files, contact your local Ascential Customer Support center for assistance.
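    If you need to tell a support analyst which trace files to expect, the naming rule above is easy to apply in a script. The sketch below only builds the expected file names; the installation directory, job name, and stage name used with it are placeholders, and the function name is hypothetical.

```python
from pathlib import PurePosixPath
from typing import List


def trace_file_paths(server_dir: str, job_name: str,
                     stage_names: List[str]) -> List[PurePosixPath]:
    """Build the expected jobname.stagename.trace paths under the &PH& subdirectory."""
    ph_dir = PurePosixPath(server_dir) / "&PH&"
    return [ph_dir / f"{job_name}.{stage}.trace" for stage in stage_names]
```

    One path is returned for each active stage name passed in, in the same order.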

    Chapter 4. Job Batches

    This chapter describes how to create and run DataStage job batches, including the following topics:

    • What is a job batch
    • Creating and running job batches
    • Scheduling, unscheduling and rescheduling job batches
    • Deleting batches from a project

    What Is a Job Batch?

    A job batch is a group of jobs or separate invocations of the same job (with different job parameters) that you want to run sequentially. DataStage treats a batch as though it were a single job. If any job fails to complete successfully, the batch run stops. You can follow the progress of jobs within a batch by examining the log files of each job or of the batch itself. These contain messages about the progress of the batch, as well as the job.

    You can create, schedule, edit, or delete job batches.
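    The run-in-order, stop-on-first-failure behavior of a batch can be sketched as follows. This is a generic illustration in Python, not the job control code that DataStage generates for a batch; the function name and the log messages are hypothetical.

```python
from typing import Callable, List, Tuple


def run_batch(jobs: List[Tuple[str, Callable[[], bool]]]) -> Tuple[bool, List[str]]:
    """Run jobs sequentially; stop the batch as soon as one job fails.

    Each job is a (name, action) pair whose action returns True on success.
    Returns the overall result and a simple batch log.
    """
    log: List[str] = []
    for name, run_job in jobs:
        log.append(f"starting {name}")
        if not run_job():
            log.append(f"{name} failed; batch stopped")
            return False, log          # later jobs are never started
        log.append(f"{name} finished")
    return True, log
```

    The returned log plays the role of the batch's own log: it records the progress of the batch as a whole, while each job would keep its own log as well.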

    Creating a Job Batch

    To create a job batch:

    1. Choose Tools ➤ Batch ➤ New. The Create New Batch dialog box appears.

    2. Enter a name for the new batch in the Batch name field. You can also select it from the list of existing batches and categories.

    3. In the Category field, enter the name of the job category in which the new batch will be created. You can also select it from the list of existing batches and categories. If you leave the Category field blank, the new batch is located below the Jobs node in the job category tree.

    4. Click OK. The Job Properties dialog box appears.

    5. Select the jobs to add to the batch from the Add Job list on the Job control page. This list displays all the server and parallel jobs in the project. You are prompted to enter parameter values, and row and warning limits for each job in the batch. As each job is added to the batch, the control information is added to the Job control page.

    6. Click the General tab. The General page appears at the front of the Job Properties dialog box. Optionally, enter a brief description of the batch in the Short job description field. This description appears in the Parameters/Description column in the Job Schedule view.

    7. Optionally, enter a more detailed description of the batch in the Full job description field.

    8. Click the Parameters tab. The Parameters page appears at the front of the Job Properties dialog box.

    9. Define any parameters that you want to specify for the batch, for example, a user name and password to prompt for when the batch is run.

    10. When you have defined the batch, click OK. The batch is compiled and appears in the Job Status view.

    Note: The Dependencies page allows you to specify the dependencies of the batch job. You only need to do this if you intend to package the batch job for deployment on another system using the DataStage Manager. Information on dependencies and packaging jobs is in the DataStage Designer Guide, the DataStage Manager Guide, and the DataStage Developer's Help.

    Running a Job Batch

    You can run a job batch in the same way as a standard job:

    • Immediately.
    • By scheduling it to run at a later time or date. See "Scheduling a Job Batch" on page 4-5 for how to do this.

    If you run a batch immediately, you must ensure that the data sources and data warehouse are accessible, and that other users on your system will not be affected by the job run.

    To run a batch immediately:

    1. Select the batch in the Job Status view.

    2. Do one of the following:

    • Choose Job ➤ Run Now.
    • Click the Run button on the toolbar.

    The Job Run Options dialog box appears. See "Setting Job Options" on page 3-1.

    3. Fill in the job parameters and check warning and row limits for the batch, as appropriate.

    4. Optionally, click Validate to validate the job.

    5. Click Run . The batch is started and its status is updated to Running.

    Note: It may take some time for the job status to be updated