SAS/ETS 9.22 User's Guide


SAS Documentation

The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2010. SAS/ETS 9.22 User's Guide. Cary, NC: SAS Institute Inc.

SAS/ETS 9.22 User's Guide

Copyright 2010, SAS Institute Inc., Cary, NC, USA

ISBN 978-1-60764-543-6

All rights reserved. Produced in the United States of America.

For a hard-copy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc.

For a Web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time you acquire this publication.

U.S. Government Restricted Rights Notice: Use, duplication, or disclosure of this software and related documentation by the U.S. government is subject to the Agreement with SAS Institute and the restrictions set forth in FAR 52.227-19, Commercial Computer Software-Restricted Rights (June 1987).

SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513.

1st electronic book, May 2010

1st printing, May 2010

SAS Publishing provides a complete selection of books and electronic products to help customers use SAS software to its fullest potential. For more information about our e-books, e-learning products, CDs, and hard-copy books, visit the SAS Publishing Web site at support.sas.com/publishing or call 1-800-727-3228.

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.

Other brand and product names are registered trademarks or trademarks of their respective companies.

Contents

I General Information
Chapter 1. What's New in SAS/ETS 9.22
Chapter 2. Introduction
Chapter 3. Working with Time Series Data
Chapter 4. Date Intervals, Formats, and Functions
Chapter 5. SAS Macros and Functions
Chapter 6. Nonlinear Optimization Methods

II Procedure Reference
Chapter 7. The ARIMA Procedure
Chapter 8. The AUTOREG Procedure
Chapter 9. The COMPUTAB Procedure
Chapter 10. The COUNTREG Procedure
Chapter 11. The DATASOURCE Procedure
Chapter 12. The ENTROPY Procedure (Experimental)
Chapter 13. The ESM Procedure
Chapter 14. The EXPAND Procedure
Chapter 15. The FORECAST Procedure
Chapter 16. The LOAN Procedure
Chapter 17. The MDC Procedure
Chapter 18. The MODEL Procedure
Chapter 19. The PANEL Procedure
Chapter 20. The PDLREG Procedure
Chapter 21. The QLIM Procedure
Chapter 22. The SEVERITY Procedure (Experimental)
Chapter 23. The SIMILARITY Procedure
Chapter 24. The SIMLIN Procedure
Chapter 25. The SPECTRA Procedure
Chapter 26. The STATESPACE Procedure
Chapter 27. The SYSLIN Procedure
Chapter 28. The TIMEID Procedure (Experimental)
Chapter 29. The TIMESERIES Procedure
Chapter 30. The TSCSREG Procedure
Chapter 31. The UCM Procedure
Chapter 32. The VARMAX Procedure
Chapter 33. The X11 Procedure
Chapter 34. The X12 Procedure

III Data Access Engines
Chapter 35. The SASECRSP Interface Engine
Chapter 36. The SASEFAME Interface Engine
Chapter 37. The SASEHAVR Interface Engine

IV Time Series Forecasting System
Chapter 38. Overview of the Time Series Forecasting System
Chapter 39. Getting Started with Time Series Forecasting
Chapter 40. Creating Time ID Variables
Chapter 41. Specifying Forecasting Models
Chapter 42. Choosing the Best Forecasting Model
Chapter 43. Using Predictor Variables
Chapter 44. Command Reference
Chapter 45. Window Reference
Chapter 46. Forecasting Process Details

V SAS/ETS Model Editor (Experimental)
Chapter 47. SAS/ETS Model Editor Window Reference

VI Investment Analysis
Chapter 48. Overview
Chapter 49. Portfolios
Chapter 50. Investments
Chapter 51. Computations
Chapter 52. Analyses
Chapter 53. Details

Subject Index

Syntax Index

Credits and Acknowledgments

Credits

Documentation

Editing Anne Jones

Technical Review Evan L. Anderson, Ming-Chun Chang, Jan Chvosta, Brent Cohen, Allison Crutchfield, Paige Daniels, Gül Ege, Bruce Elsheimer, Donald J. Erdman, Kelly Fellingham, Sanggohn Han, Laura Jackson, Wilma S. Jackson, Wen Ji, Kurt Jones, Kathleen Kiernan, Michael J. Leonard, Li C. Li, Mark R. Little, Kevin Meyer, Gina Marie Mondello, Steve Morrison, Youngjin Park, Jim Seabolt, David Schlotzhauer, Rajesh Selukar, Jennifer Sloan, Mark Traccarella, Michele A. Trovero, Charles Sun, Donna E. Woodward

Documentation Production Tim Arnold

Software

The procedures in SAS/ETS software were implemented by members of the Advanced Analytics division. Program development includes design, programming, debugging, support, documentation, and technical review. In the following list, the name of the developer who currently has principal support responsibility for the procedure is listed first.

ARIMA Rajesh Selukar, Michael J. Leonard, Terry Woodfield

AUTOREG Xilong Chen, Jan Chvosta, Richard Potter, Jason Qiao, John P. Sall

COMPUTAB Michael J. Leonard, Alan R. Eaton

COUNTREG Jan Chvosta, Laura Jackson

DATASOURCE Kelly Fellingham, Meltem Narter

ENTROPY Xilong Chen, Arthur Sinko, Greg Sterijevski, Donald J. Erdman

ESM Michael J. Leonard

EXPAND Marc Kessler, Michael J. Leonard, Mark R. Little

FORECAST Michael J. Leonard, Mark R. Little, John P. Sall

LOAN Richard Potter, Gül Ege

MDC Jan Chvosta

MODEL Marc Kessler, Donald J. Erdman, Mark R. Little, John P. Sall

PANEL Jan Chvosta, Greg Sterijevski

PDLREG Xilong Chen, Richard Potter, Jan Chvosta, Leigh A. Ihnen

QLIM Jan Chvosta

SIMILARITY Michael J. Leonard

SEVERITY Mahesh V. Joshi

SIMLIN Mark R. Little, John P. Sall

SPECTRA Marc Kessler, Rajesh Selukar, Donald J. Erdman, John P. Sall

STATESPACE Donald J. Erdman, Michael J. Leonard

SYSLIN Laura Jackson, Donald J. Erdman, Leigh A. Ihnen, John P. Sall

TIMEID Marc Kessler, Michael J. Leonard

TIMESERIES Marc Kessler, Michael J. Leonard

TSCSREG Jan Chvosta


UCM Rajesh Selukar

VARMAX Youngjin Park

X11 Wilma S. Jackson, R. Bart Killam, Leigh A. Ihnen, Richard D. Langston

X12 Wilma S. Jackson

Time Series Forecasting System Evan L. Anderson, Michael J. Leonard, Meltem Narter, Gül Ege

Investment Analysis System Gül Ege, Scott Gray, Michael J. Leonard

Compiler and Symbolic Differentiation Andrew Henrick, Stacey Christian

SASEHAVR Kelly Fellingham

SASECRSP Kelly Fellingham, Peng Zang

SASEFAME Kelly Fellingham

Testing Shu An, Ming-Chun Chang, Bruce Elsheimer, Kelly Fellingham, Sanggohn Han, Li C. Li, Jennifer Sloan, Charles Sun, Peng Zang

Technical Support

Members Paige Daniels, Wen Ji, Kurt Jones, Kathleen Kiernan, Gina Marie Mondello, David Schlotzhauer, Donna E. Woodward


Acknowledgments

Hundreds of people have helped the SAS System in many ways since its inception. The following individuals have been especially helpful in the development of the procedures in SAS/ETS software. Acknowledgments for the SAS System generally appear in Base SAS software documentation and SAS/ETS software documentation.

David Amick             Idaho Office of Highway Safety
David M. DeLong         Duke University
David Dickey            North Carolina State University
Douglas J. Drummond     Center for Survey Statistics
Michel Ferland          Statistics Canada
Susie Fortier           Statistics Canada
William Fortney         Boeing Computer Services
Wayne Fuller            Iowa State University
A. Ronald Gallant       The University of North Carolina at Chapel Hill
Phil Hanser             Sacramento Municipal Utilities District
Marvin Jochimsen        Mississippi R&O Center
Jeff Kaplan             Sun Guard
Ken Kraus               Center for Research in Security Prices
Dominique Ladiray       INSEE
George McCollister      San Diego Gas & Electric
Douglas Miller          Purdue University
Brian Monsell           U.S. Census Bureau
Robert Parks            Washington University
Benoit Quenneville      Statistics Canada
Gregory Sali            Idaho Office of Highway Safety
Bob Spatz               Center for Research in Security Prices
Mary Young              Salt River Project

The final responsibility for the SAS System lies with SAS Institute alone. We hope that you will always let us know your opinions about the SAS System and its documentation. It is through your participation that SAS software is continuously improved.


Part I

General Information

Chapter 1

What's New in SAS/ETS 9.22

Contents
Overview
   Highlights of Enhancements
   Highlights of Enhancements in SAS/ETS 9.2
AUTOREG Procedure
COUNTREG Procedure
MDC Procedure
MODEL Procedure
QLIM Procedure
SASEFAME Engine
SASEHAVR Engine
New SEVERITY Procedure (Experimental)
SIMILARITY Procedure
New TIMEID Procedure (Experimental)
TIMESERIES Procedure
UCM Procedure
X12 Procedure
SAS/ETS Model Editor Application (Experimental)
Date Intervals, Formats, and Functions

Overview

This chapter summarizes the new features available in SAS/ETS 9.22.

If you have used SAS/ETS procedures in the past, you can review this chapter to learn about the new features that have been added. When you see a new feature that might be useful for your work, turn to the appropriate chapter to read about the feature in detail.

Highlights of Enhancements

The following new procedures have been added to SAS/ETS software:


The SEVERITY procedure (Experimental)

The TIMEID procedure (Experimental)

The SIMILARITY procedure, which performs similarity analysis for sets of time series, was experimental in the previous release and now has production status.

A new Java application, called the SAS/ETS Model Editor (Experimental), provides a graphical user interface for editing nonlinear statistical models and provides a convenient way to use the MODEL procedure.

New features have been added to the following SAS/ETS components:

The AUTOREG procedure

The COUNTREG procedure

The MDC procedure

The MODEL procedure

The QLIM procedure

SASEFAME interface engine

SASEHAVR interface engine

The TIMESERIES procedure

The UCM procedure

The X12 procedure

New features for defining custom time intervals have been added to Base SAS software that might be of interest to SAS/ETS users. For more information, see SAS Language Reference: Dictionary.

Highlights of Enhancements in SAS/ETS 9.2

Users who are updating directly to SAS/ETS 9.22 from a release prior to SAS/ETS 9.2 can find information about the SAS/ETS 9.2 changes and enhancements in the chapter "What's New in SAS/ETS" in the SAS/ETS 9.2 User's Guide (see support.sas.com/whatsnewets92 or http://support.sas.com/documentation/cdl/en/etsug/60372/HTML/default/whatsnew_toc.htm).

AUTOREG Procedure

The following new features have been added to the AUTOREG procedure:

Three asymmetric GARCH models, namely quadratic GARCH, threshold GARCH, and power GARCH, are implemented to measure the impact of news on future volatility. Power GARCH also considers the long memory property in the volatility.

In addition to the two existing tests for ARCH effects, Lee and King's ARCH test and Wong and Li's ARCH test are implemented. Lee and King's ARCH test is a one-sided locally most mean powerful (LMMP) test; Wong and Li's ARCH test is robust to outliers. If the NLAG= option is specified, the statistics based on the final model residuals, along with the OLS residuals, can also be computed.

The Hannan-Quinn criterion (HQC) is implemented and included in the summary statistics.

Four statistical tests of independence are implemented: the BDS test, the runs test, the turning point test, and the rank version of the von Neumann ratio test. They are powerful tools for model selection and specification testing.

The augmented Dickey-Fuller (ADF) test for unit root is implemented. This test accounts for some form of dependence between the innovations of the time series. The ADF formulation includes lags of the order p in the regression. When the lag is specified to be zero, it reduces to the standard Dickey-Fuller unit root test. In the presence of regressors, the Engle-Granger cointegration test is performed using the augmented Dickey-Fuller test statistic.

The Elliott-Rothenberg-Stock (ERS) unit root test and the Ng-Perron (NP) unit root test are implemented. These tests also perform automatic lag length selection by using the information criterion. The Bayesian information criterion (BIC) is used in the ERS test, and the modified Akaike information criterion (AICc) is used in the Ng-Perron test.

The CLASS statement is now supported. A CLASS statement enables you to declare classification variables for use as explanatory effects in a model. When a CLASS variable is used as a predictor in the MODEL statement, the procedure automatically creates a dummy regressor that corresponds to each discrete value or level of the CLASS variable.

The MODEL statement now supports the use of CLASS variables and interaction terms as predictors.

The AR, GARCH, and HETERO parameters can be specified in the TEST and RESTRICT statements.

The likelihood ratio (LR) test and the Lagrange multiplier (LM) test are supported in the TEST statement when the GARCH= option is specified.
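The way a CLASS variable expands into dummy regressors can be illustrated outside SAS. The following Python sketch is not the procedure's implementation; it only shows the idea of one 0/1 indicator column per level. The function name dummy_regressors and the sample values are hypothetical, and details such as the ordering of levels or whether one level is dropped for identifiability are assumptions that this chapter does not specify.

```python
def dummy_regressors(values):
    """Return (levels, rows): one 0/1 indicator column per distinct level."""
    levels = sorted(set(values))          # assumed ordering: sorted levels
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows

# Hypothetical classification variable with three levels.
levels, rows = dummy_regressors(["east", "west", "east", "north"])
print(levels)   # ['east', 'north', 'west']
for row in rows:
    print(row)
```

Each observation has exactly one indicator equal to 1, the one that matches its level.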

COUNTREG Procedure

The following new features have been added to the COUNTREG procedure:

The CLASS statement is now supported. A CLASS statement enables you to declare classification variables for use as explanatory effects in a model. When a CLASS variable is used as a predictor in the MODEL statement, the procedure automatically creates a dummy regressor that corresponds to each discrete value or level of the CLASS variable.

The FREQ statement is now supported. A FREQ statement specifies a variable whose values indicate the number of cases that are represented by each observation. That is, the procedure treats each observation as if it had appeared n times in the input data set, where n is the value of the FREQ variable.

The WEIGHT statement is now supported. A WEIGHT statement specifies a variable whose values supply weights for each observation in the data set. These weights control the importance (weight) given to the data observations in fitting the model.

The NLOPTIONS statement enables you to specify options for the subsystem that is used for the nonlinear optimization.
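The effect of a FREQ variable can be checked numerically. The Python sketch below is an illustration only and does not reproduce PROC COUNTREG: it assumes a Poisson model with a single common mean, and the function name poisson_loglik and the data are hypothetical. It shows that multiplying each observation's log-likelihood contribution by its frequency gives the same value as repeating the observation that many times.

```python
import math

def poisson_loglik(mu, counts, freq=None):
    """Poisson log likelihood with common mean mu; freq[i] repeats observation i."""
    if freq is None:
        freq = [1] * len(counts)
    return sum(f * (y * math.log(mu) - mu - math.lgamma(y + 1))
               for y, f in zip(counts, freq))

counts = [0, 2, 5]
freq = [3, 1, 2]                      # hypothetical FREQ variable values

# Expanding each observation freq[i] times gives the same log likelihood.
expanded = [y for y, f in zip(counts, freq) for _ in range(f)]
ll_freq = poisson_loglik(1.5, counts, freq)
ll_expanded = poisson_loglik(1.5, expanded)
print(abs(ll_freq - ll_expanded) < 1e-9)
```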

MDC Procedure

The following new features have been added to the MDC procedure:

The CLASS statement is now supported. A CLASS statement enables you to declare classification variables for use as explanatory effects in a model. When a CLASS variable is used as a predictor in the MODEL statement, the procedure automatically creates a dummy regressor that corresponds to each discrete value or level of the CLASS variable.

The TEST statement is now supported to test linear equality restrictions on the parameters. Three tests are available: Wald, Lagrange multiplier, and likelihood ratio.

MODEL Procedure

The following feature has been added to the MODEL procedure:

For the GMM estimation method, Hansen's J statistic for the test of overidentifying restrictions is reported along with its probability.


QLIM Procedure

The following new features have been added to the QLIM procedure:

The TE1 and TE2 options output technical efficiency measures for each producer in stochastic frontier models as suggested by Battese and Coelli (1988) and Jondrow et al. (1982).

The WEIGHT statement is now supported. A WEIGHT statement identifies a variable to supply weights for each observation in the data set. By default, the weights are normalized so that they add up to the sample size. If the NONORMALIZE option is used, the actual weights are used without normalization.
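The default normalization amounts to a simple rescaling, which the following Python sketch illustrates. It is not taken from PROC QLIM; the function name normalize_weights and the weight values are hypothetical, and the treatment of zero, negative, or missing weights is not covered.

```python
def normalize_weights(weights):
    """Rescale weights so that they sum to the number of observations."""
    n = len(weights)
    total = sum(weights)
    return [w * n / total for w in weights]

raw = [2.0, 1.0, 1.0, 4.0]            # hypothetical WEIGHT variable values
normalized = normalize_weights(raw)
print(normalized)                     # [1.0, 0.5, 0.5, 2.0]
print(sum(normalized))                # 4.0
```

The relative sizes of the weights are unchanged; only their total is rescaled to the sample size.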

SASEFAME Engine

The SASEFAME interface engine provides a seamless interface between Fame and SAS data to enable SAS users to access and process time series, case series, and formulas that reside in a Fame database. The following enhancements have been made to the SASEFAME access engine for Fame databases:

The INSET= option enables you to pass Fame commands through an input SAS data set and select your Fame input variables by using the KEEPLIST= clause or the WHERE= clause as selection input for BY variables.

The DBVERSION= option displays the version number of the Fame Work database in the SAS log. SASEFAME uses Fame 10, which does not allow version 2 databases. Use the Fame compress utility with the -m option to convert your version 2 databases to version 3 or 4. The default is version 4.

The TUNEFAME= option tunes the Fame database engine's use of memory to reduce I/O times in favor of a bigger virtual memory for caching database objects. The default is 100 MB.

The TUNECHLI= option tunes the C host language interface (CHLI) database engine's use of memory to reduce I/O times in favor of a bigger virtual memory for caching database objects. The default is 100 MB.

The WILDCARD= option enables you to select series by using the new Fame 10 wildcarding capabilities, which allow a longer 242-character wildcard to match data object series names within the Fame database.

The interface uses the most current version of Fame 10 CHLI. The SAS log reports the version number of the Fame 10 CHLI:

NOTE: The SASEFAME engine is using Version 10.03 of the HLI.


SASEHAVR Engine

The SASEHAVR interface engine is a seamless interface between Haver and SAS data processing that enables SAS users to read economic and financial time series data that reside in a Haver Analytics DLX (Data Link Express) database. The following enhancements have been made to the SASEHAVR access engine for Haver Analytics databases:

The AGGMODE= option enables you to specify a STRICT or RELAXED aggregation method. AGGMODE=RELAXED is the default setting. Aggregation is supported only from a more frequent time interval to a less frequent time interval, such as from weekly to monthly. The SAS log reports the status of AGGMODE.

The SHORT= option enables you to specify the list of Haver short sources to be included in the output SAS data set. This list is comma-delimited and must be surrounded by quotation marks.

The DROPSHORT= option enables you to specify the list of Haver short sources to be excluded from the output SAS data set. This list is comma-delimited and must be surrounded by quotation marks.

The LONG= option enables you to specify the list of Haver long sources to be included in the output SAS data set. This list is comma-delimited and must be surrounded by quotation marks.

The DROPLONG= option enables you to specify the list of Haver long sources to be excluded from the output SAS data set. This list is comma-delimited and must be surrounded by quotation marks.

The GEOG1= option enables you to specify the list of Haver geography1 codes to be included in the output SAS data set. This list is comma-delimited and must be surrounded by quotation marks.

The DROPGEOG1= option enables you to specify the list of Haver geography1 codes to be excluded from the output SAS data set. This list is comma-delimited and must be surrounded by quotation marks.

The GEOG2= option enables you to specify the list of Haver geography2 codes to be included in the output SAS data set. This list is comma-delimited and must be surrounded by quotation marks.

The DROPGEOG2= option enables you to specify the list of Haver geography2 codes to be excluded from the output SAS data set. This list is comma-delimited and must be surrounded by quotation marks.

The OUTSELECT=ON option specifies that the output data set show values of selection keys such as geography codes, groups, sources, and short and long sources for each selected variable name (time series) in the database. The SAS log reports the status of OUTSELECT.

The OUTSELECT=OFF option specifies that the output data set show the observations in range for all selected time series. This is the default for this option.

The interface now uses the most current version of DLXAPI32. The SAS log reports the version number of the Haver DLX API.

New SEVERITY Procedure (Experimental)

The new SEVERITY procedure fits models for statistical distributions of the severity (magnitude) of events. Typical examples of such events are insurance loss payments and intermittent sales of products.

The SEVERITY procedure is experimental for this release. It provides the following features:

The magnitude of events can be modeled as a random variable with a continuous parametric probability distribution. The SEVERITY procedure uses the maximum likelihood method to fit multiple specified distributions and identifies the best model based on a specified model selection criterion.

The SEVERITY procedure is delivered with a set of predefined models for several commonly used distributions. These include the Burr, exponential, gamma, inverse Gaussian, lognormal, Pareto, generalized Pareto, and Weibull distributions.

The SEVERITY procedure can be extended to fit any continuous parametric distribution. You can specify the distribution's model by using a set of functions and subroutines that are defined by using the FCMP procedure. The model must include functions to provide the values of the probability density function (PDF) and the cumulative distribution function (CDF) of the distribution. The model can also optionally include functions or subroutines that provide the distribution's description, the number of parameters, initial values and bounds for the parameters, the scale parameter transform, and the gradient vector and the Hessian matrix of the PDF and the CDF with respect to the parameters.

Exogenous variables can be specified for fitting a model that has a scale parameter. The exogenous variables are modeled such that their linear combination affects the scale parameter via a specified link function. The regression coefficients that are associated with the variables in the linear combination are estimated along with the parameters of the distribution. Currently, only the exponential link function is supported.

Censoring and truncation can be specified for each observed value of the response variable. Global values can also be specified to override the individual values that are associated with each observed value. Currently, only censoring from above (that is, right-censoring) and truncation from below (that is, left-truncation) are allowed.
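The fit-and-select workflow described above can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not the SEVERITY procedure itself: it considers only the exponential and lognormal distributions (whose maximum likelihood estimates have closed forms), ignores censoring, truncation, and exogenous variables, and uses AIC as the selection criterion purely as an example. The function names and the loss values are hypothetical.

```python
import math

def fit_exponential(x):
    """Maximum likelihood fit of an exponential distribution with scale theta."""
    theta = sum(x) / len(x)                       # MLE of the scale
    loglik = sum(-math.log(theta) - xi / theta for xi in x)
    return {"name": "exponential", "params": (theta,), "loglik": loglik}

def fit_lognormal(x):
    """Maximum likelihood fit of a lognormal distribution (mu, sigma)."""
    logs = [math.log(xi) for xi in x]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((l - mu) ** 2 for l in logs) / len(logs))
    loglik = sum(-l - math.log(sigma) - 0.5 * math.log(2 * math.pi)
                 - (l - mu) ** 2 / (2 * sigma ** 2) for l in logs)
    return {"name": "lognormal", "params": (mu, sigma), "loglik": loglik}

def aic(fit):
    """Akaike information criterion: smaller is better."""
    return -2 * fit["loglik"] + 2 * len(fit["params"])

# Made-up loss amounts, for illustration only.
losses = [120.0, 95.0, 310.0, 150.0, 80.0, 2200.0, 175.0, 130.0]
fits = [fit_exponential(losses), fit_lognormal(losses)]
best = min(fits, key=aic)
for fit in fits:
    print(fit["name"], round(aic(fit), 2))
print("selected:", best["name"])
```

Each candidate distribution is fit separately, and the candidate with the smallest criterion value is reported as the selected model.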


SIMILARITY Procedure

The SIMILARITY procedure was classified as experimental in SAS/ETS 9.2. PROC SIMILARITY now has production status.

New TIMEID Procedure (Experimental)

The new TIMEID procedure analyzes the sequence of ID values in a SAS data set to identify the time interval between observations and verifies that the observations in the data set represent a properly spaced time series.

The TIMEID procedure provides the following features:

Specified time intervals and alignments can be used to evaluate a data set's time ID values in terms of the distributions of duplicated values, alignment offsets, and the gaps between adjacent observations.

The time interval's width, shift, and alignment can be inferred from a time ID variable. When either the interval or its alignment is specified, this information is used to guide the process of inferring the remaining quantity.

When multiple BY groups are present, detailed diagnostics for each BY group are reported in addition to summarized diagnostic information that applies to all BY groups in the data set.
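The kind of diagnostic described above can be illustrated with a deliberately simplified Python sketch. It is not the algorithm that PROC TIMEID uses: it measures spacing only in whole days, takes the most frequent positive spacing as the inferred interval, and ignores shift and alignment. The function name summarize_time_ids and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date

def summarize_time_ids(ids):
    """Summarize the spacing of time ID values, measured in days."""
    ids = sorted(ids)
    diffs = [(b - a).days for a, b in zip(ids, ids[1:])]
    duplicates = sum(1 for d in diffs if d == 0)
    positive = [d for d in diffs if d > 0]
    spacing = Counter(positive).most_common(1)[0][0]   # most frequent spacing
    gaps = sum(1 for d in positive if d > spacing)     # wider than the usual spacing
    return {"spacing_days": spacing, "duplicates": duplicates, "gaps": gaps}

# Hypothetical weekly series with one duplicated ID and one missing week.
ids = [date(2010, 1, 4), date(2010, 1, 11), date(2010, 1, 11),
       date(2010, 1, 18), date(2010, 2, 1), date(2010, 2, 8)]
print(summarize_time_ids(ids))
# {'spacing_days': 7, 'duplicates': 1, 'gaps': 1}
```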

TIMESERIES Procedure

Three features have been added to the TIMESERIES procedure for performing spectral analyses of the input time series and native database accumulation of data for a time series.

Singular Spectrum Analysis

Singular spectrum analysis (SSA) is a technique for decomposing a time series into additive components and categorizing these components based on the magnitudes of their contributions. SSA uses a single parameter, the window length, to quantify patterns in a time series without relying on preconceived notions about the structure of the time series. The window length represents the maximum lag considered in the analysis and corresponds to the dimensionality of the principal component analysis (PCA) on which the SSA is based.

In addition to SSA output options, an SSA statement has been added to explicitly control the window length parameter and the grouping of SSA series components.
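The role of the window length can be seen in the embedding step of SSA. The Python sketch below is an illustration under stated assumptions and not PROC TIMESERIES code: it builds the matrix of lagged windows (one common convention places each window in a row) and shows the diagonal averaging that maps such a matrix back to a series. The decomposition of the matrix into components, which lies between these two steps, is omitted. The function names and the series are hypothetical.

```python
def trajectory_matrix(series, window):
    """Rows are lagged windows of the series (the SSA embedding step)."""
    k = len(series) - window + 1
    return [series[i:i + window] for i in range(k)]

def diagonal_average(matrix):
    """Average the anti-diagonals of a trajectory-like matrix back into a series."""
    k, window = len(matrix), len(matrix[0])
    sums = [0.0] * (k + window - 1)
    counts = [0] * (k + window - 1)
    for i in range(k):
        for j in range(window):
            sums[i + j] += matrix[i][j]
            counts[i + j] += 1
    return [s / c for s, c in zip(sums, counts)]

series = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
X = trajectory_matrix(series, window=3)
print(X)
# [[1.0, 3.0, 2.0], [3.0, 2.0, 5.0], [2.0, 5.0, 4.0], [5.0, 4.0, 6.0]]
print(diagonal_average(X) == series)   # True
```

In a full analysis, each group of components would be a matrix of the same shape as X, and diagonal averaging would turn each group into one additive component series.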

Fourier Spectrum Analysis

Functionality similar to that available in PROC SPECTRA for analyzing periodograms of time series data has been incorporated into PROC TIMESERIES. Now ODS graphical representations of periodograms and spectral density estimates can be computed and displayed.

Database Accumulation

For Teradata-based input data sets, aggregation and accumulation can be performed using native facilities in the database server. Most ACCUMULATE= options specified in the ID and VAR statements can be performed by the database server.

UCM Procedure

The ARMA model specification options in the IRREGULAR statement, which were experimental in SAS 9.2, are now production.

X12 Procedure

Many new features have been added to the X12 procedure.

The CHECK statement produces statistics for diagnostic checking of residuals from the estimated regARIMA model. The following new tables are associated with the CHECK statement: "Autocorrelation of regARIMA Model Residuals", "Partial Autocorrelation of regARIMA Model Residuals", "Autocorrelation of Squared regARIMA Model Residuals", "Summary Statistics for the Unstandardized Residuals", "Normality Statistics for regARIMA Model Residuals", and "Table G Rs: 10*LOG(SPECTRUM) of the regARIMA Model Residuals". If ODS GRAPHICS ON is specified, the following new plots are associated with diagnostic checking output: the autocorrelation function (ErrorACF) plot of the residuals, the partial autocorrelation function (ErrorPACF) plot of the residuals, the autocorrelation function (SqErrorACF) plot of the squared residuals, a histogram (ResidualHistogram) of the residuals, and a spectral plot (SpectralPlot) of the residuals.

The MAXLAG option of the IDENTIFY statement specifies the maximum number of lags for the sample ACF and PACF that are associated with model identification.

The following tables are now available through the OUTPUT statement: E1, E2, E3, and E8.

The SIGMALIM option of the X11 statement enables you to specify the upper and lower sigma limits that are used to identify and decrease the weight of extreme irregular values in the internal seasonal adjustment computations.

The TYPE option of the X11 statement controls which factors are removed from the original series to produce the seasonally adjusted series (table D11) and also the final trend cycle (table D12).

The OUTSTAT= option of the X12 statement specifies the optional output data set that contains the summary statistics related to each seasonally adjusted series. The data set is sorted by the BY-group variables, if any, and by series names.

The PERIODOGRAM option of the X12 statement enables you to specify that the PERIODOGRAM rather than the SPECTRUM of the series be plotted in the G tables and plots.

The PLOTS= option of the X12 statement controls the plots that are produced through ODS Graphics.

The SPECTRUMSERIES option of the X12 statement specifies the table name of the series that is used in the spectrum of the original series (table G0). The table names that can be specified are A1, A19, B1, or E1. The default is B1.

The following tables are now available through the TABLES statement: E1, E2, and E3.

The following tables are now available through ODS: "Model Description for ARIMA Model Identification", "Model Description for ARIMA Model Estimation", "Final Seasonal Filter Selection via Global MSR", "Seasonal Filters by Period", and "Final Trend Cycle Statistics". The model description information was previously displayed in notes; an ODS table enables you to export the information to a data set. The seasonal filter and trend filter tables are new.

Auxiliary variables have been added to ACF and PACF data sets that are available through ODS OUTPUT. The following variables have been added: _NAME_, Transform, Adjust, Regressors, Diff, and Sdiff. The purpose of the new variables is to help you identify the source of the data when multiple ACFs and PACFs are calculated.

The following new feature is experimental:

The AUXDATA= option of the X12 statement specifies an auxiliary input data set that can contain user-defined variables specified in the INPUT statement, the USERVAR= option of the REGRESSION statement, or the USERDEFINED statement. The AUXDATA= option is useful when user-defined regressors are used for multiple time series data sets or multiple BY groups.

SAS/ETS Model Editor Application (Experimental)

A new interactive application, the SAS/ETS Model Editor, enables you to define, fit, and simulate nonlinear statistical models using the MODEL procedure. The SAS/ETS Model Editor enables you to use the powerful features of PROC MODEL through a convenient and interactive graphical user interface.

Date Intervals, Formats, and Functions

The custom time intervals that are available in Base SAS software can be used in SAS/ETS procedures. Custom time intervals enable you to specify beginning and ending dates and seasonality for time intervals according to any definition. Such intervals can be used to define the following:

fiscal intervals such as monthly intervals that begin on a day other than the first day of the month (for example, intervals that begin on the 10th day of each month)

fiscal intervals such as monthly intervals that begin on different days for different months (for example, March of 2000 can begin on March 10, but April of 2000 can begin on April 12)

business days, such as banking days that exclude holidays

hourly intervals that omit hours that the business is closed
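The first kind of interval in the list above can be illustrated outside SAS with a short Python sketch. It does not show how SAS defines custom intervals; the function name fiscal_month_start and the start_day parameter are hypothetical and only demonstrate the idea of a monthly interval that begins on the 10th.

```python
from datetime import date

def fiscal_month_start(d: date, start_day: int = 10) -> date:
    """Return the first day of the custom monthly interval that contains d.

    Assumes 1 <= start_day <= 28 so that the start day exists in every month.
    """
    if d.day >= start_day:
        return date(d.year, d.month, start_day)
    # d falls before the start day, so it belongs to the interval
    # that began in the previous calendar month.
    if d.month == 1:
        return date(d.year - 1, 12, start_day)
    return date(d.year, d.month - 1, start_day)

print(fiscal_month_start(date(2000, 3, 15)))  # 2000-03-10
print(fiscal_month_start(date(2000, 3, 5)))   # 2000-02-10
```

Intervals that begin on different days in different months would need a lookup table of start dates rather than a single start_day.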

Chapter 2

Introduction

Contents

Overview of SAS/ETS Software
    Uses of SAS/ETS Software
    Contents of SAS/ETS Software

About This Book
    Chapter Organization
    Typographical Conventions

Where to Turn for More Information
    Accessing the SAS/ETS Sample Library
    Online Help System
    SAS Short Courses
    SAS Technical Support Services

Major Features of SAS/ETS Software
    Discrete Choice and Qualitative and Limited Dependent Variable Analysis
    Regression with Autocorrelated and Heteroscedastic Errors
    Simultaneous Systems Linear Regression
    Linear Systems Simulation
    Polynomial Distributed Lag Regression
    Nonlinear Systems Regression and Simulation
    ARIMA (Box-Jenkins) and ARIMAX (Box-Tiao) Modeling and Forecasting
    Vector Time Series Analysis
    State Space Modeling and Forecasting
    Spectral Analysis
    Seasonal Adjustment
    Structural Time Series Modeling and Forecasting
    Time Series Cross-Sectional Regression Analysis
    Automatic Time Series Forecasting
    Time Series Interpolation and Frequency Conversion
    Trend and Seasonal Analysis on Transaction Databases
    Access to Financial and Economic Databases
    Spreadsheet Calculations and Financial Report Generation
    Loan Analysis, Comparison, and Amortization
    Time Series Forecasting System
    Investment Analysis System
    ODS Graphics

Related SAS Software
    Base SAS Software
    SAS Forecast Studio
    SAS High-Performance Forecasting
    SAS/GRAPH Software
    SAS/STAT Software
    SAS/IML Software
    SAS/IML Stat Studio
    SAS/OR Software
    SAS/QC Software
    MLE for User-Defined Likelihood Functions
    JMP Software
    SAS Enterprise Guide
    SAS Add-In for Microsoft Office
    Enterprise Miner: Time Series nodes
    SAS Risk Products

References

Overview of SAS/ETS Software

SAS/ETS software, a component of the SAS System, provides SAS procedures for:

econometric analysis

time series analysis

time series forecasting

systems modeling and simulation

discrete choice analysis

analysis of qualitative and limited dependent variable models

seasonal adjustment of time series data

financial analysis and reporting

access to economic and financial databases

time series data management

In addition to SAS procedures, SAS/ETS software also includes seamless access to economic and financial databases and interactive environments for time series forecasting and investment analysis.


Uses of SAS/ETS Software

SAS/ETS software provides tools for a wide variety of applications in business, government, and academia. Major uses of SAS/ETS procedures are economic analysis, forecasting, economic and financial modeling, time series analysis, financial reporting, and manipulation of time series data.

The common theme relating the many applications of the software is time series data: SAS/ETS software is useful whenever it is necessary to analyze or predict processes that take place over time or to analyze models that involve simultaneous relationships.

Although SAS/ETS software is most closely associated with business, finance, and economics, time series data also arise in many other fields. SAS/ETS software is useful whenever time dependencies, simultaneous relationships, or dynamic processes complicate data analysis. For example, an environmental quality study might use SAS/ETS software's time series analysis tools to analyze pollution emissions data. A pharmacokinetic study might use SAS/ETS software's features for nonlinear systems to model the dynamics of drug metabolism in different tissues.

The diversity of problems for which econometrics and time series analysis tools are needed is reflected in the applications reported by SAS users. The following are some applications of SAS/ETS software presented by SAS users at past annual conferences of the SAS Users Group International (SUGI):

forecasting college enrollment (Calise and Earley 1997)

fitting a pharmacokinetic model (Morelock et al. 1995)

testing interaction effect in reducing sudden infant death syndrome (Fleming, Gibson, and Fleming 1996)

forecasting operational indices to measure productivity changes (McCarty 1994)

spectral decomposition and reconstruction of nuclear plant signals (Hoyer and Gross 1993)

estimating parameters for the constant-elasticity-of-substitution translog model (Hisnanick 1993)

applying econometric analysis for mass appraisal of real property (Amal and Weselowski 1993)

forecasting telephone usage data (Fishetti, Heathcote, and Perry 1993)

forecasting demand and utilization of inpatient hospital services (Hisnanick 1992)

using conditional demand estimation to determine electricity demand (Keshani and Taylor 1992)

estimating tree biomass for measurement of forestry yields (Parresol and Thomas 1991)

evaluating the theory of input separability in the production function of U.S. manufacturing (Hisnanick 1991)


forecasting dairy milk yields and composition (Benseman 1990)

predicting the gloss of coated aluminum products subject to weathering (Khan 1990)

learning curve analysis for predicting manufacturing costs of aircraft (Le Bouton 1989)

analyzing Dow Jones stock index trends (Early, Sweeney, and Zekavat 1989)

analyzing the usefulness of the composite index of leading economic indicators for forecasting the economy (Lin and Myers 1988)

Contents of SAS/ETS Software

Procedures

SAS/ETS software includes the following SAS procedures:

ARIMA ARIMA (Box-Jenkins) and ARIMAX (Box-Tiao) modeling and forecasting

AUTOREG regression analysis with autocorrelated or heteroscedastic errors and ARCH and GARCH modeling

COMPUTAB spreadsheet calculations and financial report generation

COUNTREG regression modeling for dependent variables that represent counts

DATASOURCE access to financial and economic databases

ENTROPY maximum entropy-based regression

ESM forecasting by using exponential smoothing models with optimized smoothing weights

EXPAND time series interpolation, frequency conversion, and transformation of time series

FORECAST automatic forecasting

LOAN loan analysis and comparison

MDC multinomial discrete choice analysis

MODEL nonlinear simultaneous equations regression and nonlinear systems modeling and simulation

PANEL panel data models

PDLREG polynomial distributed lag regression

QLIM qualitative and limited dependent variable analysis

SIMILARITY similarity analysis of time series data for time series data mining

SIMLIN linear systems simulation

SPECTRA spectral and cross-spectral analysis

STATESPACE state space modeling and automated forecasting of multivariate time series

SYSLIN linear simultaneous equations models


TIMESERIES analysis of time-stamped transactional data

TSCSREG time series cross-sectional regression analysis

UCM unobserved components analysis of time series

VARMAX vector autoregressive and moving-average modeling and forecasting

X11 seasonal adjustment (Census X-11 and X-11 ARIMA)

X12 seasonal adjustment (Census X-12 ARIMA)

Macros

SAS/ETS software includes the following SAS macros:

%AR generates statements to define autoregressive error models for the MODEL procedure

%BOXCOXAR investigates Box-Cox transformations useful for modeling and forecasting a time series

%DFPVALUE computes probabilities for Dickey-Fuller test statistics

%DFTEST performs Dickey-Fuller tests for unit roots in a time series process

%LOGTEST tests to determine whether a log transformation is appropriate for modeling and forecasting a time series

%MA generates statements to define moving-average error models for the MODEL procedure

%PDL generates statements to define polynomial distributed lag models for the MODEL procedure

These macros are part of the SAS AUTOCALL facility and are automatically available for use in your SAS program. Refer to SAS Macro Language: Reference for information about the SAS macro facility.

Access Interfaces to Economic and Financial Databases

In addition to PROC DATASOURCE, these SAS/ETS access interfaces provide seamless access to financial and economic databases:

SASECRSP LIBNAME engine for accessing time series and event data residing in a CRSPAccess database.

SASEFAME LIBNAME engine for accessing time or case series data residing in a FAME database.

SASEHAVR LIBNAME engine for accessing time series residing in a HAVER ANALYTICS Data Link Express (DLX) database.


The Time Series Forecasting System

SAS/ETS software includes an interactive forecasting system, described in Part IV. This graphical user interface to SAS/ETS forecasting features was developed with SAS/AF software and uses PROC ARIMA and other internal routines to perform time series forecasting. The Time Series Forecasting System makes it easy to forecast time series and provides many features for graphical data exploration and graphical comparisons of forecasting models and forecasts. (You must have SAS/GRAPH installed to use the graphical features of the system.)

The Investment Analysis System

The Investment Analysis System, described in Part V, is an interactive environment for analyzing the time-value of money in a variety of investments. Various analyses are provided to help analyze the value of investment alternatives: time value, periodic equivalent, internal rate of return, benefit-cost ratio, and break-even analysis.
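Two of the analyses named above, time value and internal rate of return, rest on simple discounting arithmetic. The following Python sketch illustrates that arithmetic only; it is not the Investment Analysis System's implementation, and the function names, the bisection search, and its default search interval are assumptions made for the example.

```python
def npv(rate, cash_flows):
    """Net present value of cash_flows[t] occurring at the end of period t (t = 0, 1, ...)."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))

def irr(cash_flows, lo=-0.99, hi=10.0, tol=1e-10):
    """Internal rate of return by bisection.

    Assumes the NPV changes sign exactly once between lo and hi.
    """
    f_lo = npv(lo, cash_flows)
    if f_lo * npv(hi, cash_flows) > 0:
        raise ValueError("NPV does not change sign on the search interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = npv(mid, cash_flows)
        if f_lo * f_mid <= 0:
            hi = mid              # the root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid  # the root lies in [mid, hi]
    return (lo + hi) / 2.0

# Pay 1000 now, receive 1100 one period later: the rate of return is 10%.
flows = [-1000.0, 1100.0]
print(round(irr(flows), 6))  # 0.1
```

The internal rate of return is the discount rate at which the net present value is zero, which is why a root finder is all that is needed.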

About This Book

This book is a user's guide to SAS/ETS software. Because SAS/ETS software is a part of the SAS System, this book assumes that you are familiar with Base SAS software and have the books SAS Language Reference: Dictionary and Base SAS Procedures Guide available for reference. It also assumes that you are familiar with SAS data sets, the SAS DATA step, and basic SAS procedures such as PROC PRINT and PROC SORT. Chapter 3, "Working with Time Series Data," summarizes the aspects of Base SAS software that are most relevant to the use of SAS/ETS software.

Chapter Organization

Following a brief "What's New" chapter, this book is divided into five major parts. Part I contains general information to aid you in working with SAS/ETS software. Part II explains the SAS procedures of SAS/ETS software. Part III describes the available data access interfaces for economic and financial databases. Part IV is the reference for the Time Series Forecasting System, an interactive forecasting menu system that uses PROC ARIMA and other routines to perform time series forecasting. Finally, Part V is the reference for the Investment Analysis System.

The new features added to SAS/ETS software since the publication of SAS/ETS Software: Changes and Enhancements for Release 8.2 are summarized in Chapter 1, "What's New in SAS/ETS 9.22." If you have used SAS/ETS software in the past, you might want to skim this chapter to see what's new.

Part I contains the following chapters.

Chapter 2, the current chapter, provides an overview of SAS/ETS software and summarizes related SAS publications, products, and services.


Chapter 3, "Working with Time Series Data," discusses the use of SAS data management and programming features for time series data.

Chapter 4, "Date Intervals, Formats, and Functions," summarizes the time intervals, date and datetime informats, date and datetime formats, and date and datetime functions available in the SAS System.

Chapter 5, "SAS Macros and Functions," documents SAS macros and DATA step financial functions provided with SAS/ETS software. The macros use SAS/ETS procedures to perform Dickey-Fuller tests, test for the need for log transformations, or select optimal Box-Cox transformation parameters for time series data.

Chapter 6, "Nonlinear Optimization Methods," documents the NonLinear Optimization subsystem used by some ETS procedures to perform nonlinear optimization tasks.

Part II contains chapters that explain the SAS procedures that make up SAS/ETS software. These chapters appear in alphabetical order by procedure name.

Part III contains chapters that document the ETS access interfaces to economic and financial databases.

Each of the chapters that document the SAS/ETS procedures (Part II) and the SAS/ETS access interfaces (Part III) is organized as follows:

1. The Overview section gives a brief description of the procedure.

2. The Getting Started section provides a tutorial introduction on how to use the procedure.

3. The Syntax section is a reference to the SAS statements and options that control the procedure.

4. The Details section discusses various technical details.

5. The Examples section contains examples of the use of the procedure.

6. The References section contains technical references on methodology.

Part IV contains the chapters that document the features of the Time Series Forecasting System.

Part V contains chapters that document the features of the Investment Analysis System.

Typographical Conventions

This book uses several type styles for presenting information. The following list explains the meaning of the typographical conventions used in this book:

roman is the standard type style used for most text.

UPPERCASE ROMAN is used for SAS statements, options, and other SAS language elements when they appear in the text. However, you can enter these elements in your own SAS programs in lowercase, uppercase, or a mixture of the two.

UPPERCASE BOLD is used in the "Syntax" sections' initial lists of SAS statements and options.

oblique is used for user-supplied values for options in the syntax definitions. In the text, these values are written in italic.

helvetica is used for the names of variables and data sets when they appear in the text.

bold is used to refer to matrices and vectors and to refer to commands.

italic is used for terms that are defined in the text, for emphasis, and for references to publications.

bold monospace is used for example code. In most cases, this book uses lowercase type for SAS statements.

Where to Turn for More Information

This section describes other sources of information about SAS/ETS software.

Accessing the SAS/ETS Sample Library

The SAS/ETS Sample Library includes many examples that illustrate the use of SAS/ETS software, including the examples used in this documentation. To access these sample programs, select Help from the menu and then select SAS Help and Documentation. From the Contents list, select the section Sample SAS Programs under Learning to Use SAS.

Online Help System

You can access online help information about SAS/ETS software in two ways, depending on whether you are using the SAS windowing environment in the command line mode or the pull-down menu mode.

If you are using a command line, you can access the SAS/ETS help menus by typing help on the SAS windowing environment command line. Or you can issue the command help ARIMA (or another procedure name) to display the help for that particular procedure.

If you are using the SAS windowing environment pull-down menus, you can pull down the Help menu and make the following selections:


SAS Help and Documentation

Learning to Use SAS in the Contents list

SAS Products

SAS/ETS

The content of the Online Help System follows closely that of this book.

SAS Short Courses

The SAS Education Division offers a number of training courses that might be of interest to SAS/ETS users. Please check the SAS Web site for the current list of available training courses.

SAS Technical Support Services

As with all SAS products, the SAS Technical Support staff is available to respond to problems and answer technical questions regarding the use of SAS/ETS software.

Major Features of SAS/ETS Software

The following sections briefly summarize major features of SAS/ETS software. See the chapters on individual procedures for more detailed information.

Discrete Choice and Qualitative and Limited Dependent Variable Analysis

The MDC procedure provides maximum likelihood (ML) or simulated maximum likelihood estimates of multinomial discrete choice models in which the choice set consists of unordered multiple alternatives.

The MDC procedure supports the following models and features:

conditional logit

nested logit


heteroscedastic extreme value

multinomial probit

mixed logit

pseudo-random or quasi-random numbers for simulated maximum likelihood estimation

bounds imposed on the parameter estimates

linear restrictions imposed on the parameter estimates

SAS data set containing predicted probabilities and linear predictor (x'β) values

decision tree and nested logit

model fit and goodness-of-fit measures including

likelihood ratio

Aldrich-Nelson

Cragg-Uhler 1

Cragg-Uhler 2

Estrella

Adjusted Estrella

McFadden's LRI

Veall-Zimmermann

Akaike Information Criterion (AIC)

Schwarz Criterion or Bayesian Information Criterion (BIC)
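Several of the fit measures listed above are simple functions of the maximized log likelihood. As a hedged illustration (not PROC MDC output), the following Python sketch computes a few of them from their standard textbook definitions. The argument names are assumptions: loglik is the log likelihood of the fitted model, loglik0 is the log likelihood of the restricted model with all slope parameters set to zero, k is the number of estimated parameters, and n is the sample size. How PROC MDC counts n and k is not restated here.

```python
import math

def fit_measures(loglik, loglik0, k, n):
    """Textbook goodness-of-fit measures based on log likelihoods."""
    lr = 2.0 * (loglik - loglik0)                  # likelihood ratio statistic
    return {
        "likelihood_ratio": lr,
        "aldrich_nelson": lr / (lr + n),
        "mcfadden_lri": 1.0 - loglik / loglik0,    # likelihood ratio index
        "aic": -2.0 * loglik + 2.0 * k,
        "sbc": -2.0 * loglik + k * math.log(n),
    }

m = fit_measures(loglik=-80.0, loglik0=-100.0, k=3, n=100)
print(m["likelihood_ratio"], m["aic"])  # 40.0 166.0
```

Smaller AIC and SBC values indicate a better trade-off between fit and the number of parameters; the other measures increase as the fit improves.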

The QLIM procedure analyzes univariate and multivariate limited dependent variable models where dependent variables take discrete values or dependent variables are observed only in a limited range of values. This procedure includes logit, probit, Tobit, and general simultaneous equations models. The QLIM procedure supports the following models:

linear regression model with heteroscedasticity

probit with heteroscedasticity

logit with heteroscedasticity

Tobit (censored and truncated) with heteroscedasticity

Box-Cox regression with heteroscedasticity

bivariate probit

bivariate Tobit

sample selection models


multivariate limited dependent models

The COUNTREG procedure provides regression models in which the dependent variable takes nonnegative integer count values. The COUNTREG procedure supports the following models:

Poisson regression

negative binomial regression with quadratic and linear variance functions

zero inflated Poisson (ZIP) model

zero inflated negative binomial (ZINB) model

fixed and random effect Poisson panel data models

fixed and random effect NB (negative binomial) panel data models
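To make the difference between the Poisson and zero-inflated Poisson (ZIP) models concrete, the following Python sketch evaluates both probability functions. It is an illustration only: lam (the Poisson mean) and phi (the probability of a structural zero) are notation chosen for this example, and in a regression setting both would depend on covariates.

```python
import math

def poisson_pmf(y, lam):
    """P(Y = y) for a Poisson count with mean lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)

def zip_pmf(y, lam, phi):
    """P(Y = y) for a zero-inflated Poisson count.

    With probability phi the count is a structural zero; otherwise it is
    drawn from a Poisson distribution with mean lam.
    """
    base = (1.0 - phi) * poisson_pmf(y, lam)
    return phi + base if y == 0 else base

# Zero inflation moves probability mass toward zero.
print(zip_pmf(0, 2.0, 0.3) > poisson_pmf(0, 2.0))  # True
```

The ZINB model follows the same pattern with a negative binomial distribution in place of the Poisson.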

The PANEL procedure deals with panel data sets that consist of time series observations on each of several cross-sectional units.

The PANEL procedure supports the following models and methods:

one-way and two-way models

fixed and random effects

autoregressive models

the Parks method

dynamic panel estimator

the Da Silva method for moving-average disturbances

Regression with Autocorrelated and Heteroscedastic Errors

The AUTOREG procedure provides regression analysis and forecasting of linear models with autocorrelated or heteroscedastic errors. The AUTOREG procedure includes the following features:

estimation and prediction of linear regression models with autoregressive errors

any order autoregressive or subset autoregressive process

optional stepwise selection of autoregressive parameters

choice of the following estimation methods:

exact maximum likelihood

exact nonlinear least squares


Yule-Walker

iterated Yule-Walker

tests for any linear hypothesis that involves the structural coefficients

restrictions for any linear combination of the structural coefficients

forecasts with confidence limits

estimation and forecasting of ARCH (autoregressive conditional heteroscedasticity), GARCH (generalized autoregressive conditional heteroscedasticity), I-GARCH (integrated GARCH), E-GARCH (exponential GARCH), and GARCH-M (GARCH in mean) models

combination of ARCH and GARCH models with autoregressive models, with or without regressors

estimation and testing of general heteroscedasticity models

variety of model diagnostic information including the following:

autocorrelation plots

partial autocorrelation plots

Durbin-Watson test statistic and generalized Durbin-Watson tests to any order

Durbin h and Durbin t statistics

Akaike information criterion

Schwarz information criterion

tests for ARCH errors

Ramsey's RESET test

Chow and PChow tests

Phillips-Perron stationarity test

CUSUM and CUSUMSQ statistics

exact significance levels (p-values) for the Durbin-Watson statistic

embedded missing values
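Two of the quantities named in the list above have compact definitions that are easy to compute by hand. The following Python sketch is an illustration only, not PROC AUTOREG's algorithm: it computes the first-order Durbin-Watson statistic from a residual series and the sample lag-1 autocorrelation, which is the Yule-Walker estimate of the coefficient in a first-order autoregressive model. Sign conventions for reported autoregressive parameters differ between software packages.

```python
def durbin_watson(resid):
    """First-order Durbin-Watson statistic of a residual series."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den

def lag1_autocorrelation(resid):
    """Sample lag-1 autocorrelation (Yule-Walker estimate for an AR(1) process)."""
    n = len(resid)
    mean = sum(resid) / n
    c0 = sum((e - mean) ** 2 for e in resid)
    c1 = sum((resid[t] - mean) * (resid[t - 1] - mean) for t in range(1, n))
    return c1 / c0

# A residual series that alternates in sign is negatively autocorrelated.
e = [1.0, -1.0, 1.0, -1.0]
print(durbin_watson(e), lag1_autocorrelation(e))  # 3.0 -0.75
```

Values of the Durbin-Watson statistic near 2 suggest little first-order autocorrelation; values well below 2 suggest positive autocorrelation and values well above 2 suggest negative autocorrelation.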

Simultaneous Systems Linear Regression

The SYSLIN and ENTROPY procedures provide regression analysis of a simultaneous system of linear equations.

The SYSLIN procedure includes the following features:

estimation of parameters in simultaneous systems of linear equations

full range of estimation methods including the following:


ordinary least squares (OLS)

two-stage least squares (2SLS)

three-stage least squares (3SLS)

iterated 3SLS (IT3SLS)

seemingly unrelated regression (SUR)

iterated SUR (ITSUR)

limited-information maximum likelihood (LIML)

full-information maximum likelihood (FIML)

minimum expected loss (MELO)

general K-class estimators

weighted regression

any number of restrictions for any linear combination of coefficients, within a single model or across equations

tests for any linear hypothesis, for the parameters of a single model or across equations

wide range of model diagnostics and statistics including the following:

usual ANOVA tables and R-square statistics

Durbin-Watson statistics

standardized coefficients

test for overidentifying restrictions

residual plots

standard errors and t tests

covariance and correlation matrices of parameter estimates and equation errors

predicted values, residuals, parameter estimates, and variance-covariance matrices saved in output SAS data sets

other features of the SYSLIN procedure that enable you to do the following:

impose linear restrictions on the parameter estimates

test linear hypotheses about the parameters

write predicted and residual values to an output SAS data set

write parameter estimates to an output SAS data set

write the crossproducts matrix (SSCP) to an output SAS data set

use raw data, correlations, covariances, or cross products as input
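Of the estimation methods listed above, two-stage least squares (2SLS) is the easiest to show in miniature. The following Python sketch covers only the special case of one endogenous regressor x and one instrument z, where 2SLS reduces to the simple instrumental variables estimator. It is an illustration under those assumptions, not the SYSLIN procedure's algorithm, and the function names are hypothetical.

```python
def ols(y, x):
    """Intercept and slope of the least squares regression of y on x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def two_stage_slope(y, x, z):
    """2SLS slope of y on the endogenous regressor x with instrument z."""
    a, b = ols(x, z)                   # first stage: regress x on z
    x_hat = [a + b * zi for zi in z]   # fitted values of x
    return ols(y, x_hat)[1]            # second stage: regress y on fitted x

def iv_slope(y, x, z):
    """Equivalent closed form: cov(z, y) / cov(z, x)."""
    n = len(y)
    mx, my, mz = sum(x) / n, sum(y) / n, sum(z) / n
    s_zy = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y))
    s_zx = sum((zi - mz) * (xi - mx) for zi, xi in zip(z, x))
    return s_zy / s_zx

z = [0.0, 1.0, 2.0, 3.0]
x = [1.0, 2.0, 4.0, 5.0]
y = [3.0, 5.0, 9.0, 11.0]   # y = 1 + 2x exactly, so both estimates are close to 2
print(round(two_stage_slope(y, x, z), 6), round(iv_slope(y, x, z), 6))
```

With several equations, several instruments, or cross-equation restrictions, the same two stages are carried out with matrix algebra.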

The ENTROPY procedure supports the following models and features:

generalized maximum entropy (GME) estimation


generalized cross entropy (GCE) estimation

normed moment generalized maximum entropy

maximum entropy-based seemingly unrelated regression (MESUR) estimation

pure inverse estimation

estimation of parameters in simultaneous systems of linear equations

Markov models

unordered multinomial choice problems

weighted regression

any number of restrictions for any linear combination of coefficients, within a single model or across equations

tests for any linear hypothesis, for the parameters of a single model or across equations

Linear Systems Simulation

The SIMLIN procedure performs simulation and multiplier analysis for simultaneous systems of linear regression models. The SIMLIN procedure includes the following features:

reduced form coefficients

interim multipliers

total multipliers

dynamic multipliers

multipliers for higher order lags

dynamic forecasts and simulations

goodness-of-fit statistics

acceptance of the equation system coefficients estimated by the SYSLIN procedure as input
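The idea behind multiplier analysis can be seen in the smallest possible dynamic model, a single equation y_t = a*y_{t-1} + b*x_t. The following Python sketch is an illustration under that assumption; the function names are hypothetical, and the sketch does not reproduce the SIMLIN procedure's matrix computations or its exact terminology for each kind of multiplier.

```python
def lag_responses(a, b, horizon):
    """Effect on y at lags 0..horizon of a one-time unit change in x,
    for the model y_t = a*y_{t-1} + b*x_t."""
    return [b * a ** k for k in range(horizon + 1)]

def total_multiplier(a, b):
    """Long-run effect on y of a permanent unit change in x."""
    if abs(a) >= 1:
        raise ValueError("the model is not stable, so the long-run effect is not finite")
    return b / (1.0 - a)

print(lag_responses(0.5, 2.0, 3))  # [2.0, 1.0, 0.5, 0.25]
print(total_multiplier(0.5, 2.0))  # 4.0
```

The lag-0 response is the immediate effect, and the total multiplier is the sum of the responses over all lags. In a system of equations, a and b become coefficient matrices of the reduced form.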

Polynomial Distributed Lag Regression

The PDLREG procedure provides regression analysis for linear models with polynomial distributed (Almon) lags. The PDLREG procedure includes the following features:


entry of any number of regressors as a polynomial lag distribution and the use of any number of covariates

use of any order lag length and degree polynomial for lag distribution

optional upper and lower endpoint restrictions

specification of any number of linear restrictions on covariates

option to repeat analysis over a range of degrees for the lag distribution polynomials

support for autoregressive errors to any lag
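In a polynomial distributed (Almon) lag, the coefficient on the i-th lag of a regressor is constrained to lie on a polynomial in i. The following Python sketch illustrates that constraint with an ordinary power-basis polynomial; the function name and the choice of basis are assumptions for this example, and the parameterization that PROC PDLREG uses internally might differ.

```python
def almon_weights(alpha, lag_length):
    """Lag coefficients beta_i = sum over j of alpha[j] * i**j, for i = 0..lag_length.

    alpha holds the polynomial coefficients, so len(alpha) - 1 is the degree.
    """
    return [
        sum(a * i ** j for j, a in enumerate(alpha))
        for i in range(lag_length + 1)
    ]

# A degree-2 polynomial spread over lags 0 through 3.
print(almon_weights([1.0, 0.5, -0.25], 3))  # [1.0, 1.25, 1.0, 0.25]
```

Only the polynomial coefficients are estimated (three in this example), no matter how many lags are included, which is the main appeal of the method.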


Nonlinear Systems Regression and Simulation

The MODEL procedure provides parameter estimation, simulation, and forecasting of dynamic nonlinear simultaneous equation models. The MODEL procedure includes the following features:

nonlinear regression analysis for systems of simultaneous equations, including weighted nonlinear regression

full range of parameter estimation methods including the following:

nonlinear ordinary least squares (OLS)

nonlinear seemingly unrelated regression (SUR)

nonlinear two-stage least squares (2SLS)

nonlinear three-stage least squares (3SLS)

iterated SUR

iterated 3SLS

generalized method of moments (GMM)

nonlinear full-information maximum likelihood (FIML)

simulated method of moments (SMM)

support for dynamic multi-equation nonlinear models of any size or complexity

use of the full power of the SAS programming language for model definition, including left-hand-side expressions

hypothesis tests of nonlinear functions of the parameter estimates

linear and nonlinear restrictions of the parameter estimates

bounds imposed on the parameter estimates

computation of estimates and standard errors of nonlinear functions of the parameter estimates


estimation and simulation of ordinary differential equations (ODEs)

vector autoregressive error processes and polynomial lag distributions easily specified for the nonlinear equations

variance modeling (ARCH, GARCH, and others)

computation of goal-seeking solutions of nonlinear systems to find input values needed to produce target outputs

dynamic, static, or n-period-ahead-forecast simulation modes

simultaneous solution or single equation solution modes

Monte Carlo simulation using parameter estimate covariance and across-equation residuals covariance matrices or user-specified random functions

a variety of diagnostic statistics including the following:

model R-square statistics

general Durbin-Watson statistics and exact p-values

asymptotic standard errors and t tests

first-stage R-square statistics

covariance estimates

collinearity diagnostics

simulation goodness-of-fit statistics

Theil inequality coefficient decompositions

Theil relative change forecast error measures

heteroscedasticity tests

Godfrey test for serial correlation

Hausman specification test

Chow tests

block structure and dependency structure analysis for the nonlinear system

listing and cross-reference of fitted model

automatic calculation of needed derivatives by using exact analytic formula

efficient sparse matrix methods used for model solution; choice of other solution methods

Model definition, parameter estimation, simulation, and forecasting can be performed interactively in a single SAS session, or models can be stored in files and reused and combined in later runs.


ARIMA (Box-Jenkins) and ARIMAX (Box-Tiao) Modeling and Forecasting

The ARIMA procedure provides the identification, parameter estimation, and forecasting of autoregressive integrated moving-average (Box-Jenkins) models, seasonal ARIMA models, transfer function models, and intervention models. The ARIMA procedure includes the following features:

complete ARIMA (Box-Jenkins) modeling with no limits on the order of autoregressive or moving-average processes

model identification diagnostics including the following:

autocorrelation function

partial autocorrelation function

inverse autocorrelation function

cross-correlation function

extended sample autocorrelation function

minimum information criterion for model identification

squared canonical correlations

stationarity tests

outlier detection

intervention analysis

regression with ARMA errors

transfer function modeling with fully general rational transfer functions

seasonal ARIMA models

ARIMA model-based interpolation of missing values

several parameter estimation methods including the following:

exact maximum likelihood

conditional least squares

exact nonlinear unconditional least squares (ELS or ULS)

prewhitening transformations

forecasts and confidence limits for all models

forecasting tied to parameter estimation methods: finite memory forecasts for models estimated by maximum likelihood or exact nonlinear least squares methods and infinite memory forecasts for models estimated by conditional least squares


diagnostic statistics to help judge the adequacy of the model including the following:

Akaike's information criterion (AIC)

Schwarz's Bayesian criterion (SBC or BIC)

Box-Ljung chi-square test statistics for white-noise residuals

autocorrelation function of residuals

partial autocorrelation function of residuals

inverse autocorrelation function of residuals

automatic outlier detection
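The Box-Ljung chi-square statistic in the list above summarizes the first m residual autocorrelations in a single number. The following Python sketch computes it from its textbook definition, Q = n(n+2) * sum of r_k^2/(n-k) for k = 1 to m; it is an illustration only and does not reproduce PROC ARIMA's output or its degrees-of-freedom adjustment for estimated parameters. The function names are hypothetical.

```python
def autocorrelations(x, max_lag):
    """Sample autocorrelations r_1..r_max_lag of the series x."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]

def ljung_box(x, max_lag):
    """Box-Ljung statistic Q computed from the first max_lag autocorrelations."""
    n = len(x)
    r = autocorrelations(x, max_lag)
    return n * (n + 2) * sum(rk ** 2 / (n - k) for k, rk in enumerate(r, start=1))

print(ljung_box([1.0, -1.0, 1.0, -1.0], 1))  # 4.5
```

Under the hypothesis that the residuals are white noise, Q is approximately chi-square distributed, so a large value indicates that autocorrelation remains in the residuals.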

Vector Time Series Analysis

The VARMAX procedure enables you to model the dynamic relationship both between the dependent variables and between the dependent and independent variables. The VARMAX procedure includes the following features:

several modeling features:

vector autoregressive model

vector autoregressive model with exogenous variables

vector autoregressive and moving-average model

Bayesian vector autoregressive model

vector error correction model

Bayesian vector error correction model

GARCH-type multivariate conditional heteroscedasticity models

criteria for automatically determining AR and MA orders:

Akaike information criterion (AIC)

corrected AIC (AICC)

Hannan-Quinn (HQ) criterion

final prediction error (FPE)

Schwarz Bayesian criterion (SBC), also known as Bayesian information criterion (BIC)

AR order identification aids:

partial cross-correlations

Yule-Walker estimates

partial autoregressive coefficients

partial canonical correlations

testing the presence of unit roots and cointegration:

Dickey-Fuller tests

Johansen cointegration test for nonstationary vector processes of integrated order one

Stock-Watson common trends test for the possibility of cointegration among nonstationary vector processes of integrated order one

Johansen cointegration test for nonstationary vector processes of integrated order two

model parameter estimation methods:

least squares (LS)

maximum likelihood (ML)

model checks and residual analysis using the following tests:

Durbin-Watson (DW) statistics

F test for autoregressive conditional heteroscedastic (ARCH) disturbance

F test for AR disturbance

Jarque-Bera normality test

Portmanteau test

seasonal deterministic terms

subset models

multiple regression with distributed lags

dead-start model that does not have present values of the exogenous variables

Granger-causal relationships between two distinct groups of variables

infinite order AR representation

impulse response function (or infinite order MA representation)

decomposition of the predicted error covariances

roots of the characteristic functions for both the AR and MA parts to evaluate the proximity of the roots to the unit circle

contemporaneous relationships among the components of the vector time series

forecasts

conditional covariances for GARCH models
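Among the residual checks listed above, the Durbin-Watson statistic is simple enough to show directly. The following Python sketch uses the standard definition, DW = sum of (e_t - e_{t-1})^2 divided by the sum of e_t^2, for a single residual series. The function name `durbin_watson` is hypothetical, and the sketch does not reproduce how the VARMAX procedure reports the statistic for each equation.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for one residual series.

    Values near 2 indicate little first-order autocorrelation; values near 0
    indicate positive autocorrelation and values near 4 negative autocorrelation.
    """
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


print(durbin_watson([1, -1, 1, -1]))  # alternating residuals: negative autocorrelation
print(durbin_watson([2, 2, 2]))       # constant residuals: positive autocorrelation
```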


State Space Modeling and Forecasting

The STATESPACE procedure provides automatic model selection, parameter estimation, and forecasting of state space models. (State space models encompass an alternative general formulation of multivariate ARIMA models.) The STATESPACE procedure includes the following features:

multivariate ARIMA modeling by using the general state space representation of the stochastic process

automatic model selection using Akaike's information criterion (AIC)

user-specified state space models including restrictions

transfer function models with random inputs

any combination of simple and seasonal differencing; input series can be differenced to any order for any lag lengths


ability to save selected and fitted model in a data set and reuse for forecasting

wide range of output options including the ability to print any statistics concerning the data and their covariance structure, the model selection process, and the final model fit

Spectral Analysis

The SPECTRA procedure provides spectral analysis and cross-spectral analysis of time series. The SPECTRA procedure includes the following features:

efficient calculation of periodogram and smoothed periodogram using fast finite Fourier transform and Chirp-Z algorithms

multiple spectral analysis, including raw and smoothed spectral and cross-spectral function estimates, with user-specified window weights

choice of kernel for smoothing

output of the following spectral estimates to a SAS data set:

Fourier sine and cosine coefficients

periodogram

smoothed periodogram

cospectrum

quadrature spectrum

amplitude

phase spectrum

squared coherency

Fisher's Kappa and Bartlett's Kolmogorov-Smirnov test statistic for testing a null hypothesis of white noise
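To make the periodogram and Fisher's Kappa concrete, the following Python sketch evaluates the Fourier cosine and sine coefficients at the frequencies 2*pi*k/n and forms the ordinates (n/2)(a_k^2 + b_k^2), which is one common scaling. It is a sketch under stated assumptions: it uses a direct O(n^2) sum rather than the fast Fourier transform or Chirp-Z algorithms, the function names `periodogram` and `fisher_kappa` are hypothetical, and the exact set of ordinates and scaling used by the SPECTRA procedure should be checked in its chapter.

```python
import math


def periodogram(series):
    """Periodogram ordinates at frequencies 2*pi*k/n, k = 1..floor((n-1)/2)."""
    n = len(series)
    ordinates = []
    for k in range(1, (n - 1) // 2 + 1):
        w = 2 * math.pi * k / n
        # Fourier cosine and sine coefficients at frequency w
        a = 2 / n * sum(x * math.cos(w * t) for t, x in enumerate(series))
        b = 2 / n * sum(x * math.sin(w * t) for t, x in enumerate(series))
        ordinates.append(n / 2 * (a * a + b * b))
    return ordinates


def fisher_kappa(ordinates):
    """Largest ordinate divided by the mean ordinate."""
    return max(ordinates) / (sum(ordinates) / len(ordinates))


# A pure cycle of period 4 concentrates all power in a single ordinate.
cycle = [1, 0, -1, 0, 1, 0, -1, 0]
p = periodogram(cycle)
print(p, fisher_kappa(p))
```

A Kappa value that is large relative to its white-noise reference distribution indicates that one frequency dominates, which is evidence against the white-noise hypothesis.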

Seasonal Adjustment

The X11 procedure provides seasonal adjustment of time series by using the Census X-11 or X-11 ARIMA method. The X11 procedure is based on the U.S. Bureau of the Census X-11 seasonal adjustment program and also supports the X-11 ARIMA method developed by Statistics Canada. The X11 procedure includes the following features:

decomposition of monthly or quarterly series into seasonal, trend, trading day, and irregular components

both multiplicative and additive form of the decomposition

all the features of the Census Bureau program

support of the X-11 ARIMA method

support of sliding spans analysis

processing of any number of variables at once with no maximum length for a series

computation of tests for stable, moving, and combined seasonality

optional printing or storing in SAS data sets of the individual X11 tables that show the various components at different stages of the computation; full control over what is printed or output

ability to project seasonal component one year ahead, which enables reintroduction of seasonal factors for an extrapolated series

The X12 procedure provides seasonal adjustment of time series using the X-12 ARIMA method. The X12 procedure is based on the U.S. Bureau of the Census X-12 ARIMA seasonal adjustment program (version 0.3). It also supports the X-11 ARIMA method developed by Statistics Canada and the previous X-11 method of the U.S. Census Bureau. The X12 procedure includes the following features:

decomposition of monthly or quarterly series into seasonal, trend, trading day, and irregular components

support of multiplicative, additive, pseudo-additive, and log additive forms of decomposition

support of the X-12 ARIMA method


support of regARIMA modeling

automatic identification of outliers

support of TRAMO-based automatic model selection

use of regressors to process missing values within the span of the series

processing of any number of variables at once with no maximum length for a series

computation of tests for stable, moving, and combined seasonality

spectral analysis of original, seasonally adjusted, and irregular series

optional printing or storing in a SAS data set of the individual X11 tables that show the various components at different stages of the decomposition; full control over what is printed or output

optional projection of seasonal component one year ahead, which enables reintroduction of seasonal factors for an extrapolated series

Structural Time Series Modeling and Forecasting

The UCM procedure provides a flexible environment for analyzing time series data using structural time series models, also called unobserved components models (UCM). These models represent the observed series as a sum of suitably chosen components such as trend, seasonal, cyclical, and regression effects. You can use the UCM procedure to formulate comprehensive models that bring out all the salient features of the series under consideration. Structural models are applicable in the same situations where Box-Jenkins ARIMA models are applicable; however, the structural models tend to be more informative about the underlying stochastic structure of the series. The UCM procedure includes the following features:

general unobserved components modeling where the models can include trend, multiple seasons and cycles, and regression effects

maximum-likelihood estimation of the model parameters

model diagnostics that include a variety of goodness-of-fit statistics, and extensive graphical diagnosis of the model residuals

forecasts and confidence limits for the series and all the model components

model-based seasonal decomposition

extensive plotting capability that includes the following:

forecast and confidence interval plots for the series and model components such as trend, cycles, and seasons

diagnostic plots such as residual plot, residual autocorrelation plots, and so on

seasonal decomposition plots such as trend, trend plus cycles, trend plus cycles plus seasons, and so on

model-based interpolation of missing values in the series

full sample (also called smoothed) estimates of the model components

Time Series Cross-Sectional Regression Analysis

The TSCSREG procedure provides combined time series cross-sectional regression analysis. The TSCSREG procedure includes the following features:

estimation of the regression parameters under several common error structures:

Fuller and Battese method (variance component model)

Wansbeek-Kapteyn method

Parks method (autoregressive model)

Da Silva method (mixed variance component moving-average model)

one-way fixed effects

two-way fixed effects

one-way random effects

two-way random effects

any number of model specifications

unbalanced panel data for the fixed or random-effects models

variety of estimates and statistics including the following:

underlying error components estimates

regression parameter estimates

standard errors of estimates

t-tests

R-square statistic

correlation matrix of estimates

covariance matrix of estimates

autoregressive parameter estimate

cross-sectional components estimates

autocovariance estimates

F tests of linear hypotheses about the regression parameters

specification tests
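The one-way fixed effects structure listed above can be illustrated with the textbook within transformation: subtract each cross section's mean from its observations, then regress the demeaned response on the demeaned regressor. The following Python sketch handles a single regressor. It is a minimal sketch of the general technique, not the TSCSREG procedure's code, and the function name `within_slope` and the row layout are hypothetical.

```python
from collections import defaultdict


def within_slope(rows):
    """One-way fixed effects slope for rows of (cross_section_id, x, y)."""
    groups = defaultdict(list)
    for unit, x, y in rows:
        groups[unit].append((x, y))

    sxy = sxx = 0.0
    for obs in groups.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        # Demeaning removes the cross section's own intercept (its fixed effect).
        sxy += sum((x - mean_x) * (y - mean_y) for x, y in obs)
        sxx += sum((x - mean_x) ** 2 for x, _ in obs)
    return sxy / sxx


# Two cross sections with different intercepts (10 and -5) but a common slope of 2.
panel = [
    ("A", 1, 12), ("A", 2, 14), ("A", 3, 16),
    ("B", 1, -3), ("B", 2, -1), ("B", 3, 1),
]
print(within_slope(panel))
```

Because each cross section is demeaned separately, the sketch also works for unbalanced panels in which cross sections have different numbers of observations.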


Automatic Time Series Forecasting

The ESM procedure provides a quick way to generate forecasts for many time series or transactional data in one step by using exponential smoothing methods. All parameters associated with the forecasting model are optimized based on the data.

You can use the following smoothing models:

simple

double

linear

damped trend

seasonal

Winters method (additive and multiplicative)

Additionally, PROC ESM can transform the data before applying the smoothing methods using any of these transformations:

log

square root

logistic

Box-Cox

In addition to forecasting, the ESM procedure can also produce graphic output.

The ESM procedure can forecast both time series data, whose observations are equally spaced at a specific time interval (for example, monthly or weekly), and transactional data, whose observations are not spaced with respect to any particular time interval. (Internet, inventory, sales, and similar data are typical examples of transactional data. For transactional data, the data are accumulated based on a specified time interval to form a time series.)
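The accumulation step described above can be sketched in Python as follows. The sketch sums time-stamped amounts into calendar months and leaves months with no transactions as missing (`None`). The function name `accumulate_monthly`, the choice of a sum as the accumulation statistic, and the treatment of empty periods are all assumptions made for illustration; the procedure's own accumulation options are described in its chapter.

```python
from datetime import date


def accumulate_monthly(transactions):
    """Sum (date, amount) transactions into an equally spaced monthly series.

    Returns a list of ((year, month), total) pairs covering every month from
    the first to the last transaction; months with no data get None.
    """
    totals = {}
    for when, amount in transactions:
        key = (when.year, when.month)
        totals[key] = totals.get(key, 0) + amount
    if not totals:
        return []

    series = []
    (year, month), last = min(totals), max(totals)
    while (year, month) <= last:
        series.append(((year, month), totals.get((year, month))))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return series


sales = [(date(2010, 1, 5), 3), (date(2010, 1, 20), 4), (date(2010, 3, 2), 5)]
print(accumulate_monthly(sales))
```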

The ESM procedure is a replacement for the older FORECAST procedure. ESM is often more convenient to use than PROC FORECAST but it supports only exponential smoothing models.
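The simplest of the smoothing models, simple exponential smoothing, updates a level as a weighted average of the newest observation and the previous level. The following Python sketch shows that recursion and one way to choose the weight from the data by minimizing the one-step-ahead sum of squared errors. It is a sketch under stated assumptions: the level is initialized at the first observation, the weight is chosen by a coarse grid search, and the function names are hypothetical. The ESM procedure's own initialization and optimizer are not reproduced here.

```python
def smooth(series, alpha):
    """Simple exponential smoothing: level_t = alpha*x_t + (1-alpha)*level_{t-1}."""
    level = series[0]  # assumption: start the level at the first observation
    levels = [level]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
        levels.append(level)
    return levels


def one_step_sse(series, alpha):
    """Sum of squared one-step-ahead errors; the forecast of x_t is level_{t-1}."""
    levels = smooth(series, alpha)
    return sum((x - prev) ** 2 for x, prev in zip(series[1:], levels))


def best_alpha(series, grid=None):
    """Pick the smoothing weight on a grid that minimizes one_step_sse."""
    grid = grid or [i / 10 for i in range(1, 11)]
    return min(grid, key=lambda a: one_step_sse(series, a))


print(smooth([10, 20, 30], 0.5))
print(best_alpha([1, 2, 3, 4, 5]))
```

The forecast for every future period is the last smoothed level, which is why a steadily trending series pushes the selected weight toward 1; the linear and damped-trend models exist to handle such series.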

The FORECAST procedure provides forecasting of univariate time series using automatic trend extrapolation. PROC FORECAST is an easy-to-use procedure for automatic forecasting and uses simple popular methods that do not require statistical modeling of the time series, such as exponential smoothing, time trend with autoregressive errors, and the Holt-Winters method.

The FORECAST procedure supplements the powerful forecasting capabilities of the econometric and time series analysis procedures described previously. You can use PROC FORECAST when you have many series to forecast and you want to extrapolate trends without developing a model for each series.

The FORECAST procedure includes the following features:

choice of the following forecasting methods:

EXPO method: exponential smoothing (single, double, triple, or Holt two-parameter smoothing)

exponential smoothing as an ARIMA model

WINTERS method: uses updating equations similar to exponential smoothing to fit model parameters

ADDWINTERS method: like the WINTERS method except that the seasonal parameters are added to the trend instead of multiplied with the trend

STEPAR method: stepwise autoregressive models with constant, linear, or quadratic trend and autoregressive errors to any order

Holt-Winters forecasting method with constant, linear, or quadratic trend

additive variant of the Holt-Winters method

support for up to three levels of seasonality for Holt-Winters method: time-of-year, day-of-week, or time-of-day

ability to forecast any number of variables at once

forecast confidence limits for all methods

Time Series Interpolation and Frequency Conversion

The EXPAND procedure provides time interval conversion and missing value interpolation for time series. The EXPAND procedure includes the following features:

conversion of time series frequency; for example, constructing quarterly estimates from annual series or aggregating quarterly values to annual values

conversion of irregular observations to periodic observations

interpolation of missing values in time series

conversion of observation types; for example, estimating stocks from flows and vice versa. All possible conversions are supported between any of the following:

beginning of period

end of period

period midpoint

period total


period average

conversion of time series phase shift; for example, conversion between fiscal years and calendar years

identifying observations including the following:

identification of the time interval of the input values

validation of the input data set observations

computation of the ID values for the observations in the output data set

choice of four interpolation methods:

cubic splines

linear splines

step functions

simple aggregation

ability to perform extrapolation by a linear projection of the trend of the cubic spline curve fit to the input data

ability to transform series before and after interpolation (or without interpolation) by using any of the following:

constant shift or scale

sign change or absolute value

logarithm, exponential, square root, square, logistic, inverse logistic

lags, leads, differences

classical decomposition

bounds, trims, reverse series

centered moving, cumulative, or backward moving average

centered moving, cumulative, or backward moving range

centered moving, cumulative, or backward moving geometric mean

centered moving, cumulative, or backward moving maximum

centered moving, cumulative, or backward moving median

centered moving, cumulative, or backward moving minimum

centered moving, cumulative, or backward moving product

centered moving, cumulative, or backward moving corrected sum of squares

centered moving, cumulative, or backward moving uncorrected sum of squares

centered moving, cumulative, or backward moving rank

centered moving, cumulative, or backward moving standard deviation

centered moving, cumulative, or backward moving sum

centered moving, cumulative, or backward moving t-value

centered moving, cumulative, or backward moving variance

support for a wide range of time series frequencies:

YEAR

SEMIYEAR

QUARTER

MONTH

SEMIMONTH

TENDAY

WEEK

WEEKDAY

DAY

HOUR

MINUTE

SECOND

support for repeating or shifting the basic interval types to define a great variety of different frequencies, such as fiscal years, biennial periods, work shifts, and so forth

Refer to Chapter 3, "Working with Time Series Data," and Chapter 4, "Date Intervals, Formats, and Functions," for more information about time series data transformations.
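Two of the operations listed for the EXPAND procedure, interpolation of missing values and aggregation to a lower frequency, can be sketched in Python as follows. The sketch uses linear interpolation between neighboring known values (the "linear splines" choice above, not cubic splines) and simple totals over fixed-size groups. The function names `interpolate_linear` and `aggregate_totals` are hypothetical, and the sketch treats observations as equally spaced points rather than modeling observation types such as period totals or averages.

```python
def interpolate_linear(values):
    """Fill interior None gaps with straight lines between known neighbors.

    Leading and trailing missing values are left as None (no extrapolation).
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


def aggregate_totals(values, periods_per_group=4):
    """Sum consecutive groups, for example four quarters into one annual total.

    A trailing incomplete group is dropped.
    """
    last_start = len(values) - periods_per_group
    return [
        sum(values[i:i + periods_per_group])
        for i in range(0, last_start + 1, periods_per_group)
    ]


quarterly = interpolate_linear([1.0, None, None, 4.0, 5.0, 6.0, 7.0, 8.0])
print(quarterly)
print(aggregate_totals(quarterly))
```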

Trend and Seasonal Analysis on Transaction Databases

The TIMESERIES procedure can accumulate transactional data to time series and perform trend and seasonal analysis on the accumulated time series.

Time series analyses performed by the TIMESERIES procedure include the following:

descriptive statistics relevant for time series data

seasonal decomposition and seasonal adjustment analysis

correlation analysis

cross-correlation analysis

The TIMESERIES procedure includes the following features:

ability to process large amounts of time-stamped transactional data


statistical methods useful for large-scale time series analysis or (temporal) data mining

output data sets stored in either a time series format (default) or a coordinate format (transposed)

The TIMESERIES procedure is normally used to prepare data for subsequent analysis that uses other SAS/ETS procedures or other parts of the SAS system. The time series format is most useful when the data are to be analyzed with SAS/ETS procedures. The coordinate format is most useful when the data are to be analyzed with SAS/STAT procedures or SAS Enterprise Miner. (For example, clustering time-stamped transactional data can be achieved by using the results of the TIMESERIES procedure with the clustering procedures of SAS/STAT and the nodes of SAS Enterprise Miner.)

Access to Financial and Economic Databases

The DATASOURCE procedure and the SAS/ETS data access interface LIBNAME engines (SASECRSP, SASEFAME, and SASEHAVR) provide seamless, efficient access to time series data from data files supplied by a variety of commercial and governmental data vendors.

The DATASOURCE procedure includes the following features:

support for data files distributed by the following data vendors:

DRI/McGraw-Hill

FAME Information Services

HAVER ANALYTICS

Standard & Poor's Compustat Service

Center for Research in Security Prices (CRSP)

International Monetary Fund

U.S. Bureau of Labor Statistics

U.S. Bureau of Economic Analysis

Organization for Economic Cooperation and Development (OECD)

ability to select the series, frequency, time range, and cross sections of extracted data

ability to create an output data set containing descriptive information on the series available in the data file

ability to read EBCDIC data on ASCII systems and vice versa

The SASECRSP interface LIBNAME engine includes the following features:

enables random access to time series data residing in CRSPAccess databases

provides a seamless interface between CRSP and SAS data processing

uses the LIBNAME statement to enable you to specify which time series you would like to read from the CRSPAccess database, and how you would like to perform selection

enables you to access CRSP Stock, CRSP/COMPUSTAT Merged (CCM), or CRSP Indices data

provides convenient formats, informats, and functions for CRSP and SAS datetime conversions

The SASEFAME interface LIBNAME engine includes the following features:

provides SAS and FAME users flexibility in accessing and processing time series data, case series, and formulas that reside in either a FAME database or a SAS data set

provides a seamless interface between FAME and SAS data processing

uses the LIBNAME statement to enable you to specify which time series you would like to read from the FAME database

enables you to convert the selected time series to the same time scale

works with the SAS DATA step to perform further subsetting and to store the resulting time series into a SAS data set

performs more analysis if desired either in the same SAS session or in another session at a later time

supports the FAME CROSSLIST function for subsetting via BYGROUPS using the CROSSLIST= option

you can use a FAME namelist that contains your BY variables for selection in the CROSSLIST

you can use a SAS input data set, INSET, that contains the BY selection variables along with the WHERE= option in your SASEFAME libref

supports the use of FAME in a client/server environment that uses the FAME CHLI capability on your FAME server

enables access to your FAME remote data when you specify the port number of the TCP/IP service that is defined for your FAME server and the node name of your FAME master server in your SASEFAME libref's physical path

The SASEHAVR interface LIBNAME engine includes the following features:

enables Windows users random access to economic and financial data residing in a HAVER ANALYTICS Data Link Express (DLX) database

the following types of HAVER data sets are available:

United States Economic Indicators

Specialized Databases


Financial Indicators

Industry

Industrial Co

Documents
