SAS/ETS 9.22 User's Guide
SAS Documentation
The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2010. SAS/ETS 9.22 User's Guide. Cary, NC: SAS Institute Inc.
SAS/ETS 9.22 User's Guide
Copyright 2010, SAS Institute Inc., Cary, NC, USA
ISBN 978-1-60764-543-6
All rights reserved. Produced in the United States of America.
For a hard-copy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc.
For a Web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time you acquire this publication.
U.S. Government Restricted Rights Notice: Use, duplication, or disclosure of this software and related documentation by the U.S. government is subject to the Agreement with SAS Institute and the restrictions set forth in FAR 52.227-19, Commercial Computer Software-Restricted Rights (June 1987).
SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513.
1st electronic book, May 2010
1st printing, May 2010
SAS Publishing provides a complete selection of books and electronic products to help customers use SAS software to its fullest potential. For more information about our e-books, e-learning products, CDs, and hard-copy books, visit the SAS Publishing Web site at support.sas.com/publishing or call 1-800-727-3228.
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are registered trademarks or trademarks of their respective companies.
Contents
I General Information
Chapter 1. What's New in SAS/ETS 9.22
Chapter 2. Introduction
Chapter 3. Working with Time Series Data
Chapter 4. Date Intervals, Formats, and Functions
Chapter 5. SAS Macros and Functions
Chapter 6. Nonlinear Optimization Methods
II Procedure Reference
Chapter 7. The ARIMA Procedure
Chapter 8. The AUTOREG Procedure
Chapter 9. The COMPUTAB Procedure
Chapter 10. The COUNTREG Procedure
Chapter 11. The DATASOURCE Procedure
Chapter 12. The ENTROPY Procedure (Experimental)
Chapter 13. The ESM Procedure
Chapter 14. The EXPAND Procedure
Chapter 15. The FORECAST Procedure
Chapter 16. The LOAN Procedure
Chapter 17. The MDC Procedure
Chapter 18. The MODEL Procedure
Chapter 19. The PANEL Procedure
Chapter 20. The PDLREG Procedure
Chapter 21. The QLIM Procedure
Chapter 22. The SEVERITY Procedure (Experimental)
Chapter 23. The SIMILARITY Procedure
Chapter 24. The SIMLIN Procedure
Chapter 25. The SPECTRA Procedure
Chapter 26. The STATESPACE Procedure
Chapter 27. The SYSLIN Procedure
Chapter 28. The TIMEID Procedure (Experimental)
Chapter 29. The TIMESERIES Procedure
Chapter 30. The TSCSREG Procedure
Chapter 31. The UCM Procedure
Chapter 32. The VARMAX Procedure
Chapter 33. The X11 Procedure
Chapter 34. The X12 Procedure
III Data Access Engines
Chapter 35. The SASECRSP Interface Engine
Chapter 36. The SASEFAME Interface Engine
Chapter 37. The SASEHAVR Interface Engine
IV Time Series Forecasting System
Chapter 38. Overview of the Time Series Forecasting System
Chapter 39. Getting Started with Time Series Forecasting
Chapter 40. Creating Time ID Variables
Chapter 41. Specifying Forecasting Models
Chapter 42. Choosing the Best Forecasting Model
Chapter 43. Using Predictor Variables
Chapter 44. Command Reference
Chapter 45. Window Reference
Chapter 46. Forecasting Process Details
V SAS/ETS Model Editor (Experimental)
Chapter 47. SAS/ETS Model Editor Window Reference
VI Investment Analysis
Chapter 48. Overview
Chapter 49. Portfolios
Chapter 50. Investments
Chapter 51. Computations
Chapter 52. Analyses
Chapter 53. Details
Subject Index
Syntax Index
Credits and Acknowledgments
Credits
Documentation
Editing Anne Jones
Technical Review Evan L. Anderson, Ming-Chun Chang, Jan Chvosta, Brent Cohen, Allison Crutchfield, Paige Daniels, Gül Ege, Bruce Elsheimer, Donald J. Erdman, Kelly Fellingham, Sanggohn Han, Laura Jackson, Wilma S. Jackson, Wen Ji, Kurt Jones, Kathleen Kiernan, Michael J. Leonard, Li C. Li, Mark R. Little, Kevin Meyer, Gina Marie Mondello, Steve Morrison, Youngjin Park, Jim Seabolt, David Schlotzhauer, Rajesh Selukar, Jennifer Sloan, Mark Traccarella, Michele A. Trovero, Charles Sun, Donna E. Woodward
Documentation Production Tim Arnold
Software
The procedures in SAS/ETS software were implemented by members of the Advanced Analytics division. Program development includes design, programming, debugging, support, documentation, and technical review. In the following list, the name of the developer who currently has principal support responsibility for the procedure is listed first.
ARIMA Rajesh Selukar, Michael J. Leonard, Terry Woodfield
AUTOREG Xilong Chen, Jan Chvosta, Richard Potter, Jason Qiao, John P. Sall
COMPUTAB Michael J. Leonard, Alan R. Eaton
COUNTREG Jan Chvosta, Laura Jackson
DATASOURCE Kelly Fellingham, Meltem Narter
ENTROPY Xilong Chen, Arthur Sinko, Greg Sterijevski, Donald J. Erdman
ESM Michael J. Leonard
EXPAND Marc Kessler, Michael J. Leonard, Mark R. Little
FORECAST Michael J. Leonard, Mark R. Little, John P. Sall
LOAN Richard Potter, Gül Ege
MDC Jan Chvosta
MODEL Marc Kessler, Donald J. Erdman, Mark R. Little, John P. Sall
PANEL Jan Chvosta, Greg Sterijevski
PDLREG Xilong Chen, Richard Potter, Jan Chvosta, Leigh A. Ihnen
QLIM Jan Chvosta
SIMILARITY Michael J. Leonard
SEVERITY Mahesh V. Joshi
SIMLIN Mark R. Little, John P. Sall
SPECTRA Marc Kessler, Rajesh Selukar, Donald J. Erdman, John P. Sall
STATESPACE Donald J. Erdman, Michael J. Leonard
SYSLIN Laura Jackson, Donald J. Erdman, Leigh A. Ihnen, John P. Sall
TIMEID Marc Kessler, Michael J. Leonard
TIMESERIES Marc Kessler, Michael J. Leonard
TSCSREG Jan Chvosta
UCM Rajesh Selukar
VARMAX Youngjin Park
X11 Wilma S. Jackson, R. Bart Killam, Leigh A. Ihnen, Richard D. Langston
X12 Wilma S. Jackson
Time Series Forecasting System Evan L. Anderson, Michael J. Leonard, Meltem Narter, Gül Ege
Investment Analysis System Gül Ege, Scott Gray, Michael J. Leonard
Compiler and Symbolic Differentiation Andrew Henrick, Stacey Christian
SASEHAVR Kelly Fellingham
SASECRSP Kelly Fellingham, Peng Zang
SASEFAME Kelly Fellingham
Testing Shu An, Ming-Chun Chang, Bruce Elsheimer, Kelly Fellingham, Sanggohn Han, Li C. Li, Jennifer Sloan, Charles Sun, Peng Zang
Technical Support
Members Paige Daniels, Wen Ji, Kurt Jones, Kathleen Kiernan, Gina Marie Mondello, David Schlotzhauer, Donna E. Woodward
Acknowledgments
Hundreds of people have helped the SAS System in many ways since its inception. The following individuals have been especially helpful in the development of the procedures in SAS/ETS software. Acknowledgments for the SAS System generally appear in Base SAS software documentation and SAS/ETS software documentation.
David Amick, Idaho Office of Highway Safety
David M. DeLong, Duke University
David Dickey, North Carolina State University
Douglas J. Drummond, Center for Survey Statistics
Michel Ferland, Statistics Canada
Susie Fortier, Statistics Canada
William Fortney, Boeing Computer Services
Wayne Fuller, Iowa State University
A. Ronald Gallant, The University of North Carolina at Chapel Hill
Phil Hanser, Sacramento Municipal Utilities District
Marvin Jochimsen, Mississippi R&O Center
Jeff Kaplan, Sun Guard
Ken Kraus, Center for Research in Security Prices
Dominique Ladiray, INSEE
George McCollister, San Diego Gas & Electric
Douglas Miller, Purdue University
Brian Monsell, U.S. Census Bureau
Robert Parks, Washington University
Benoit Quenneville, Statistics Canada
Gregory Sali, Idaho Office of Highway Safety
Bob Spatz, Center for Research in Security Prices
Mary Young, Salt River Project
The final responsibility for the SAS System lies with SAS Institute alone. We hope that you will always let us know your opinions about the SAS System and its documentation. It is through your participation that SAS software is continuously improved.
Part I
General Information
Chapter 1
What's New in SAS/ETS 9.22
Contents
Overview
  Highlights of Enhancements
  Highlights of Enhancements in SAS/ETS 9.2
AUTOREG Procedure
COUNTREG Procedure
MDC Procedure
MODEL Procedure
QLIM Procedure
SASEFAME Engine
SASEHAVR Engine
New SEVERITY Procedure (Experimental)
SIMILARITY Procedure
New TIMEID Procedure (Experimental)
TIMESERIES Procedure
UCM Procedure
X12 Procedure
SAS/ETS Model Editor Application (Experimental)
Date Intervals, Formats, and Functions
Overview
This chapter summarizes the new features available in SAS/ETS 9.22.
If you have used SAS/ETS procedures in the past, you can review this chapter to learn about the new features that have been added. When you see a new feature that might be useful for your work, turn to the appropriate chapter to read about the feature in detail.
Highlights of Enhancements
The following new procedures have been added to SAS/ETS software:
The SEVERITY procedure (Experimental)
The TIMEID procedure (Experimental)
The SIMILARITY procedure, which performs similarity analysis for sets of time series, was experimental in the previous release and is now production status.
A new Java application, called the SAS/ETS Model Editor (Experimental), provides a graphical user interface for editing nonlinear statistical models and provides a convenient way to use the MODEL procedure.
New features have been added to the following SAS/ETS components:
The AUTOREG procedure
The COUNTREG procedure
The MDC procedure
The MODEL procedure
The QLIM procedure
SASEFAME interface engine
SASEHAVR interface engine
The TIMESERIES procedure
The UCM procedure
The X12 procedure
New features for defining custom time intervals have been added to Base SAS software that might be of interest to SAS/ETS users. For more information, see SAS Language Reference: Dictionary.
Highlights of Enhancements in SAS/ETS 9.2
Users who are updating directly to SAS/ETS 9.22 from a release prior to SAS/ETS 9.2 can find information about the SAS/ETS 9.2 changes and enhancements in the chapter "What's New in SAS/ETS" in the SAS/ETS 9.2 User's Guide (see support.sas.com/whatsnewets92, http://support.sas.com/documentation/cdl/en/etsug/60372/HTML/default/whatsnew_toc.htm).
AUTOREG Procedure
The following new features have been added to the AUTOREG procedure:
Three asymmetric GARCH models, namely quadratic GARCH, threshold GARCH, and power GARCH, are implemented to measure the impact of news on future volatility. Power GARCH also accounts for the long memory property of the volatility.
In addition to the two existing tests for the presence of ARCH effects, Lee and King's ARCH test and Wong and Li's ARCH test are implemented. Lee and King's ARCH test is a one-sided locally most mean powerful (LMMP) test; Wong and Li's ARCH test is robust to outliers. If the NLAG= option is specified, the statistics based on the final model residuals, along with the OLS residuals, can also be computed.
The Hannan-Quinn criterion (HQC) is implemented and included in the summary statistics.
Four statistical tests of independence are implemented: the BDS test, the runs test, the turning point test, and the rank version of the von Neumann ratio test. They are powerful tools for model selection and specification testing.
The augmented Dickey-Fuller (ADF) test for a unit root is implemented. This test accounts for some form of dependence between the innovations of the time series. The ADF formulation includes lags of order p in the regression. When the lag is specified to be zero, it reduces to the standard Dickey-Fuller unit root test. In the presence of regressors, the Engle-Granger cointegration test is performed using the augmented Dickey-Fuller test statistic.
The Elliott-Rothenberg-Stock (ERS) unit root test and the Ng-Perron (NP) unit root test are implemented. These tests also perform automatic lag length selection by using an information criterion. The Bayesian information criterion (BIC) is used in the ERS test, and the modified Akaike information criterion (AICc) is used in the Ng-Perron test.
The CLASS statement is now supported. A CLASS statement enables you to declare classification variables for use as explanatory effects in a model. When a CLASS variable is used as a predictor in the MODEL statement, the procedure automatically creates a dummy regressor that corresponds to each discrete value or level of the CLASS variable.
The MODEL statement now supports the use of CLASS variables and interaction terms as predictors.
The AR, GARCH, and HETERO parameters can be specified in the TEST and RESTRICT statements.
The likelihood ratio (LR) test and the Lagrange multiplier (LM) test are supported in the TEST statement when the GARCH= option is specified.
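As an illustration of the Hannan-Quinn criterion mentioned in the list above, the following Python sketch uses the textbook definition, HQC = -2 ln(L) + 2 k ln(ln(N)). The function name and the example numbers are hypothetical, and the exact form that PROC AUTOREG reports is documented in the AUTOREG chapter, not here.

```python
import math


def hannan_quinn(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Textbook Hannan-Quinn criterion: -2 ln(L) + 2 k ln(ln(N)).

    Smaller values indicate a better trade-off between fit and model size.
    """
    if n_obs < 3:
        # ln(ln(N)) is positive only when N > e, so require at least 3 points.
        raise ValueError("n_obs must be at least 3")
    return -2.0 * log_likelihood + 2.0 * n_params * math.log(math.log(n_obs))


# Hypothetical comparison: a larger model must raise the log likelihood
# enough to pay for its extra parameters.
small = hannan_quinn(log_likelihood=-250.0, n_params=3, n_obs=200)
large = hannan_quinn(log_likelihood=-249.5, n_params=5, n_obs=200)
preferred = "small" if small < large else "large"
```

In this made-up comparison the gain of 0.5 in log likelihood does not cover the penalty for two extra parameters, so the smaller model is preferred.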
COUNTREG Procedure
The following new features have been added to the COUNTREG procedure:
The CLASS statement is now supported. A CLASS statement enables you to declare classification variables for use as explanatory effects in a model. When a CLASS variable is used as a predictor in the MODEL statement, the procedure automatically creates a dummy regressor that corresponds to each discrete value or level of the CLASS variable.
The MODEL statement now supports the use of CLASS variables and interaction terms as predictors.
The FREQ statement is now supported. A FREQ statement specifies a variable whose values indicate the number of cases that are represented by each observation. That is, the procedure treats each observation as if it had appeared n times in the input data set, where n is the value of the FREQ variable.
The WEIGHT statement is now supported. A WEIGHT statement specifies a variable whose values supply weights for each observation in the data set. These weights control the importance (weight) given to the data observations in fitting the model.
The NLOPTIONS statement enables you to specify options for the subsystem that is used for the nonlinear optimization.
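The FREQ semantics described above (an observation with frequency n counts as n identical observations) can be sketched in Python with a Poisson log likelihood. This is only an illustration of the idea under simple assumptions; the function names and data are hypothetical and do not reflect how PROC COUNTREG is implemented.

```python
import math


def poisson_loglik(counts, rate):
    """Poisson log likelihood of the counts at a single common rate."""
    return sum(y * math.log(rate) - rate - math.lgamma(y + 1) for y in counts)


def poisson_loglik_freq(counts, freqs, rate):
    """Same log likelihood, with each term multiplied by its frequency."""
    return sum(
        n * (y * math.log(rate) - rate - math.lgamma(y + 1))
        for y, n in zip(counts, freqs)
    )


# Hypothetical data: the count 0 was seen 3 times, 1 twice, and 4 once.
counts, freqs = [0, 1, 4], [3, 2, 1]
expanded = [y for y, n in zip(counts, freqs) for _ in range(n)]

compact_value = poisson_loglik_freq(counts, freqs, rate=1.0)
expanded_value = poisson_loglik(expanded, rate=1.0)
```

Both values agree up to rounding, which is the sense in which a frequency variable stands in for repeated rows.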
MDC Procedure
The following new features have been added to the MDC procedure:
The CLASS statement is now supported. A CLASS statement enables you to declare classification variables for use as explanatory effects in a model. When a CLASS variable is used as a predictor in the MODEL statement, the procedure automatically creates a dummy regressor that corresponds to each discrete value or level of the CLASS variable.
The MODEL statement now supports the use of CLASS variables and interaction terms as predictors.
The TEST statement is now supported to test linear equality restrictions on the parameters. Three tests are available: Wald, Lagrange multiplier, and likelihood ratio.
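The dummy regressors that a CLASS variable produces, one per discrete level, can be pictured with a short Python sketch. The function name and the sample levels are hypothetical, and details such as level ordering or how a redundant column is handled are assumptions rather than a description of the procedure's internals.

```python
def dummy_columns(values):
    """Build one 0/1 indicator column per distinct level of a class variable."""
    levels = sorted(set(values))
    return {
        level: [1 if value == level else 0 for value in values]
        for level in levels
    }


# Hypothetical classification variable with three levels.
region = ["east", "west", "east", "north"]
columns = dummy_columns(region)
# columns["east"] is [1, 0, 1, 0]: rows 1 and 3 belong to the "east" level.
```

Each observation has exactly one indicator equal to 1, so the columns sum to 1 in every row.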
MODEL Procedure
The following feature has been added to the MODEL procedure:
For the GMM estimation method, Hansen's J statistic for the test of overidentifying restrictions is reported along with its probability.
QLIM Procedure
The following new features have been added to the QLIM procedure:
The TE1 and TE2 options output technical efficiency measures for each producer in stochastic frontier models as suggested by Battese and Coelli (1988) and Jondrow et al. (1982).
The WEIGHT statement is now supported. A WEIGHT statement identifies a variable to supply weights for each observation in the data set. By default, the weights are normalized so that they add up to the sample size. If the NONORMALIZE option is used, the actual weights are used without normalization.
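The default weight normalization described above (weights rescaled so that they add up to the sample size) amounts to the following arithmetic. This Python sketch is illustrative only; the function name is hypothetical, and treating the sample size as the number of weights supplied is an assumption.

```python
def normalize_weights(weights):
    """Rescale weights so that they sum to the number of observations."""
    n = len(weights)
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w * n / total for w in weights]


# Hypothetical raw weights for three observations.
normalized = normalize_weights([1.0, 1.0, 2.0])
# normalized is [0.75, 0.75, 1.5], which sums to 3, the sample size.
```

Relative importance is unchanged by the rescaling; only the overall scale of the weights differs from the unnormalized case.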
SASEFAME Engine
The SASEFAME interface engine provides a seamless interface between Fame and SAS data to enable SAS users to access and process time series, case series, and formulas that reside in a Fame database. The following enhancements have been made to the SASEFAME access engine for Fame databases:
The INSET= option enables you to pass Fame commands through an input SAS data set and select your Fame input variables by using the KEEPLIST= clause or the WHERE= clause as selection input for BY variables.
The DBVERSION= option displays the version number of the Fame Work database in the SAS log. SASEFAME uses Fame 10, which does not allow version 2 databases. Use the Fame compress utility with the -m option to convert your version 2 databases to version 3 or 4. The default is version 4.
The TUNEFAME= option tunes the Fame database engine's use of memory to reduce I/O times in favor of a bigger virtual memory for caching database objects. The default is 100 MB.
The TUNECHLI= option tunes the C host language interface (CHLI) database engine's use of memory to reduce I/O times in favor of a bigger virtual memory for caching database objects. The default is 100 MB.
The WILDCARD= option enables you to select series by using the new Fame 10 wildcarding capabilities, which allow a longer 242-character wildcard to match data object series names within the Fame database.
The interface uses the most current version of Fame 10 CHLI. The SAS log reports the version number of the Fame 10 CHLI:
NOTE: The SASEFAME engine is using Version 10.03 of the HLI.
SASEHAVR Engine
The SASEHAVR interface engine is a seamless interface between Haver and SAS data processing that enables SAS users to read economic and financial time series data that reside in a Haver Analytics DLX (Data Link Express) database. The following enhancements have been made to the SASEHAVR access engine for Haver Analytics databases:
The AGGMODE= option enables you to specify a STRICT or RELAXED aggregation method. AGGMODE=RELAXED is the default setting. Aggregation is supported only from a more frequent time interval to a less frequent time interval, such as from weekly to monthly. The SAS log reports the status of AGGMODE.
The SHORT= option enables you to specify the list of Haver short sources to be included in the output SAS data set. This list is comma-delimited and must be surrounded by quotation marks.
The DROPSHORT= option enables you to specify the list of Haver short sources to be excluded from the output SAS data set. This list is comma-delimited and must be surrounded by quotation marks.
The LONG= option enables you to specify the list of Haver long sources to be included in the output SAS data set. This list is comma-delimited and must be surrounded by quotation marks.
The DROPLONG= option enables you to specify the list of Haver long sources to be excluded from the output SAS data set. This list is comma-delimited and must be surrounded by quotation marks.
The GEOG1= option enables you to specify the list of Haver geography1 codes to be included in the output SAS data set. This list is comma-delimited and must be surrounded by quotation marks.
The DROPGEOG1= option enables you to specify the list of Haver geography1 codes to be excluded from the output SAS data set. This list is comma-delimited and must be surrounded by quotation marks.
The GEOG2= option enables you to specify the list of Haver geography2 codes to be included in the output SAS data set. This list is comma-delimited and must be surrounded by quotation marks.
The DROPGEOG2= option enables you to specify the list of Haver geography2 codes to be excluded from the output SAS data set. This list is comma-delimited and must be surrounded by quotation marks.
The OUTSELECT=ON option specifies that the output data set show values of selection keys such as geography codes, groups, sources, and short and long sources for each selected variable name (time series) in the database. The SAS log reports the status of OUTSELECT.
The OUTSELECT=OFF option specifies that the output data set show the observations in range for all selected time series. This is the default for this option.
The interface is now using the most current version of DLXAPI32. The SAS log reports the version number of the Haver DLX API.
New SEVERITY Procedure (Experimental)
The new SEVERITY procedure fits models for statistical distributions of the severity (magnitude) of events. Typical examples of such events are insurance loss payments and intermittent sales of products.
The SEVERITY procedure is experimental for this release. It provides the following features:
The magnitude of events can be modeled as a random variable with a continuous parametric probability distribution. The SEVERITY procedure uses the maximum likelihood method to fit multiple specified distributions and identifies the best model based on a specified model selection criterion.
The SEVERITY procedure is delivered with a set of predefined models for several commonly used distributions. These include the Burr, exponential, gamma, inverse Gaussian, lognormal, Pareto, generalized Pareto, and Weibull distributions.
The SEVERITY procedure can be extended to fit any continuous parametric distribution. You can specify the distribution's model by using a set of functions and subroutines that are defined by using the FCMP procedure. The model must include functions to provide the values of the probability density function (PDF) and the cumulative distribution function (CDF) of the distribution. The model can also optionally include functions or subroutines that provide the distribution's description, the number of parameters, initial values and bounds for the parameters, the scale parameter transform, and the gradient vector and the Hessian matrix of the PDF and the CDF with respect to the parameters.
Exogenous variables can be specified for fitting a model that has a scale parameter. The exogenous variables are modeled such that their linear combination affects the scale parameter via a specified link function. The regression coefficients that are associated with the variables in the linear combination are estimated along with the parameters of the distribution. Currently, only the exponential link function is supported.
Censoring and truncation can be specified for each observed value of the response variable. Global values can also be specified to override the individual values that are associated with each observed value. Currently, only censoring from above (that is, right-censoring) and truncation from below (that is, left-truncation) are allowed.
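The first feature in the list above, fitting several candidate distributions by maximum likelihood and choosing among them with a selection criterion, can be sketched in Python for two distributions that have closed-form estimates. Everything here is an assumption made for illustration: the function names, the use of AIC as the criterion, and the restriction to uncensored data with no regressors. It is not the procedure's algorithm.

```python
import math


def fit_exponential(x):
    """Closed-form MLE for the exponential distribution (scale = mean)."""
    n = len(x)
    scale = sum(x) / n
    loglik = -n * math.log(scale) - sum(x) / scale
    return {"name": "exponential", "params": {"scale": scale},
            "loglik": loglik, "k": 1}


def fit_lognormal(x):
    """Closed-form MLE for the lognormal distribution (mu, sigma^2 of log x)."""
    n = len(x)
    logs = [math.log(v) for v in x]
    mu = sum(logs) / n
    var = sum((v - mu) ** 2 for v in logs) / n
    if var <= 0:
        raise ValueError("lognormal fit needs at least two distinct values")
    loglik = -sum(logs) - 0.5 * n * math.log(2 * math.pi * var) - 0.5 * n
    return {"name": "lognormal", "params": {"mu": mu, "sigma2": var},
            "loglik": loglik, "k": 2}


def aic(fit):
    """Akaike information criterion; chosen here only as an example criterion."""
    return 2 * fit["k"] - 2 * fit["loglik"]


def select_best(x, fitters=(fit_exponential, fit_lognormal)):
    """Fit every candidate and return the one with the smallest AIC."""
    fits = [fitter(x) for fitter in fitters]
    return min(fits, key=aic), fits


# Hypothetical loss amounts.
losses = [120.0, 340.0, 95.0, 2200.0, 410.0, 150.0, 780.0]
best, all_fits = select_best(losses)
```

A user-defined distribution would enter this sketch as one more fitter function, which loosely mirrors the role that the PDF and CDF functions play in the extension mechanism described above.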
SIMILARITY Procedure
The SIMILARITY procedure was classified as experimental in SAS/ETS 9.2. PROC SIMILARITY is now production status.
New TIMEID Procedure (Experimental)
The new TIMEID procedure analyzes the sequence of ID values in a SAS data set to identify the time interval between observations and verifies that the observations in the data set represent a properly spaced time series.
The TIMEID procedure provides the following features:
Specified time intervals and alignments can be used to evaluate a data set's time ID values in terms of the distributions of duplicated values, alignment offsets, and the gaps between adjacent observations.
The time interval's width, shift, and alignment can be inferred from a time ID variable. When either the interval or its alignment is specified, this information is used to guide the process of inferring the remaining quantity.
When multiple BY groups are present, detailed diagnostics for each BY group are reported in addition to summarized diagnostic information, which applies to all BY groups in the data set.
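The kind of diagnostics listed above (an inferred interval width, duplicated ID values, and gaps between adjacent observations) can be illustrated with a small Python sketch. It is a deliberate simplification under stated assumptions: intervals are a fixed number of days, the most frequent positive step is taken as the width, and alignment and shift are ignored. The function name is hypothetical.

```python
from collections import Counter
from datetime import date


def diagnose_time_ids(ids):
    """Infer a fixed-width interval in days and count duplicates and gaps."""
    ordered = sorted(ids)
    steps = [(b - a).days for a, b in zip(ordered, ordered[1:])]
    positive = [s for s in steps if s > 0]
    if not positive:
        raise ValueError("need at least two distinct time ID values")
    # Treat the most frequent positive step as the interval width.
    width = Counter(positive).most_common(1)[0][0]
    return {
        "width_days": width,
        "duplicates": sum(1 for s in steps if s == 0),
        "gaps": sum(1 for s in positive if s > width),
    }


# Hypothetical weekly series with one repeated date and one missing week.
ids = [date(2010, 1, 1), date(2010, 1, 8), date(2010, 1, 8),
       date(2010, 1, 15), date(2010, 1, 29), date(2010, 2, 5)]
report = diagnose_time_ids(ids)
# report is {"width_days": 7, "duplicates": 1, "gaps": 1}
```

Calendar-based intervals such as months do not have a fixed width in days, so a real implementation needs interval arithmetic rather than day counts.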
TIMESERIES Procedure
Three features have been added to the TIMESERIES procedure for performing spectral analyses of the input time series and native database accumulation of data for a time series.
Singular Spectrum Analysis
Singular spectrum analysis (SSA) is a technique for decomposing a time series into additive components and categorizing these components based on the magnitudes of their contributions. SSA uses a single parameter, the window length, to quantify patterns in a time series without relying on preconceived notions about the structure of the time series. The window length represents the maximum lag considered in the analysis and corresponds to the dimensionality of the PCA (principal component analysis) on which the SSA is based.
In addition to SSA output options, an SSA statement has been added to explicitly control the window length parameter and the grouping of SSA series components.
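To make the role of the window length concrete, the following Python sketch builds the trajectory matrix that standard descriptions of SSA decompose. It shows only this embedding step; the decomposition and grouping steps are omitted, and the function name is hypothetical.

```python
def trajectory_matrix(series, window):
    """Embed a series into a window x (n - window + 1) trajectory matrix.

    Row i holds the series shifted by i steps, so the number of rows equals
    the window length, which is the largest lag that is represented.
    """
    n = len(series)
    if not 1 < window < n:
        raise ValueError("window must satisfy 1 < window < len(series)")
    k = n - window + 1
    return [[series[i + j] for j in range(k)] for i in range(window)]


# Hypothetical short series with window length 3.
matrix = trajectory_matrix([1, 2, 3, 4, 5], window=3)
# matrix is [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
```

The matrix has constant anti-diagonals, and a principal component analysis of it therefore works in a space whose dimension is the window length.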
Fourier Spectrum Analysis
Functionality similar to that available in PROC SPECTRA for analyzing periodograms of time series data has been incorporated into PROC TIMESERIES. Now ODS graphical representations of periodograms and spectral density estimates can be computed and displayed.
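For readers unfamiliar with periodograms, this Python sketch computes one directly from the discrete Fourier transform of a mean-centered series. The normalization used here (squared modulus divided by the series length) is one common convention chosen for illustration; the scaling that PROC SPECTRA and PROC TIMESERIES use may differ, and the function name is hypothetical.

```python
import cmath
import math


def periodogram(x):
    """Return (frequency, power) pairs for Fourier frequencies k/n, k >= 1."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    result = []
    for k in range(1, n // 2 + 1):
        coef = sum(
            v * cmath.exp(-2j * math.pi * k * t / n)
            for t, v in enumerate(centered)
        )
        result.append((k / n, abs(coef) ** 2 / n))
    return result


# Hypothetical series: a pure sinusoid that completes 3 cycles in 24 points.
series = [math.sin(2 * math.pi * 3 * t / 24) for t in range(24)]
peak_frequency, peak_power = max(periodogram(series), key=lambda p: p[1])
# peak_frequency is 0.125 cycles per observation, that is, 3/24.
```

A strong peak at one frequency indicates a dominant cycle of the corresponding period, here 8 observations.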
Database Accumulation
For Teradata-based input data sets, aggregation and accumulation can be performed using native facilities in the database server. Most ACCUMULATE= options specified in the ID and VAR statements can be performed by the database server.
UCM Procedure
The ARMA model specification options in the IRREGULAR statement, which were experimental in SAS 9.2, are now production.
X12 Procedure
Many new features have been added to the X12 procedure.
The CHECK statement produces statistics for diagnostic checking of residuals from the estimated regARIMA model. The following new tables are associated with the CHECK statement: "Autocorrelation of regARIMA Model Residuals," "Partial Autocorrelation of regARIMA Model Residuals," "Autocorrelation of Squared regARIMA Model Residuals," "Summary Statistics for the Unstandardized Residuals," "Normality Statistics for regARIMA Model Residuals," and "Table G Rs: 10*LOG(SPECTRUM) of the regARIMA Model Residuals." If ODS GRAPHICS ON is specified, the following new plots are associated with diagnostic checking output: the autocorrelation function (ErrorACF) plot of the residuals, the partial autocorrelation function (ErrorPACF) plot of the residuals, the autocorrelation function (SqErrorACF) plot of the squared residuals, a histogram (ResidualHistogram) of the residuals, and a spectral plot (SpectralPlot) of the residuals.
The MAXLAG option of the IDENTIFY statement specifies the maximum number of lags for the sample ACF and PACF that are associated with model identification.
The following tables are now available through the OUTPUT statement: E1, E2, E3, and E8.
The SIGMALIM option of the X11 statement enables you to specify the upper and lower sigma limits that are used to identify and decrease the weight of extreme irregular values in the internal seasonal adjustment computations.
The TYPE option of the X11 statement controls which factors are removed from the original series to produce the seasonally adjusted series (table D11) and also the final trend cycle (table D12).
The OUTSTAT= option of the X12 statement specifies the optional output data set that contains the summary statistics related to each seasonally adjusted series. The data set is sorted by the BY-group variables, if any, and by series names.
The PERIODOGRAM option of the X12 statement enables you to specify that the PERIODOGRAM rather than the SPECTRUM of the series be plotted in the G tables and plots.
The PLOTS= option of the X12 statement controls the plots that are produced through ODS Graphics.
The SPECTRUMSERIES option of the X12 statement specifies the table name of the series that is used in the spectrum of the original series (table G0). The table names that can be specified are A1, A19, B1, or E1. The default is B1.
The following tables are now available through the TABLES statement: E1, E2, and E3.
The following tables are now available through ODS: "Model Description for ARIMA Model Identification," "Model Description for ARIMA Model Estimation," "Final Seasonal Filter Selection via Global MSR," "Seasonal Filters by Period," and "Final Trend Cycle Statistics." The model description information was previously displayed in notes; an ODS table enables you to export the information to a data set. The seasonal filter and trend filter tables are new.
Auxiliary variables have been added to ACF and PACF data sets that are available through ODS OUTPUT. The following variables have been added: _NAME_, Transform, Adjust, Regressors, Diff, and Sdiff. The purpose of the new variables is to help you identify the source of the data when multiple ACFs and PACFs are calculated.
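The sigma limits mentioned for the SIGMALIM option in the list above are easiest to picture as a graduated weight. The Python sketch below assumes the conventional X-11 scheme: full weight for an irregular value within the lower limit, zero weight beyond the upper limit, and a linear decrease in between. The function name and the limits of 1.5 and 2.5 are assumptions made for illustration and are not taken from this chapter.

```python
def irregular_weight(z, lower=1.5, upper=2.5):
    """Weight for an irregular value that lies z standard deviations out.

    Assumed scheme: weight 1 up to the lower limit, weight 0 from the upper
    limit on, and a linear decrease between the two limits.
    """
    if not 0 < lower < upper:
        raise ValueError("limits must satisfy 0 < lower < upper")
    distance = abs(z)
    if distance <= lower:
        return 1.0
    if distance >= upper:
        return 0.0
    return (upper - distance) / (upper - lower)


weights = [irregular_weight(z) for z in (1.0, 2.0, 3.0)]
# weights is [1.0, 0.5, 0.0]
```

Widening the gap between the two limits makes the down-weighting of extreme values more gradual.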
The following new feature is experimental:
The AUXDATA= option of the X12 statement specifies an auxiliary input data set that can contain user-defined variables specified in the INPUT statement, the USERVAR= option of the REGRESSION statement, or the USERDEFINED statement. The AUXDATA= option is useful when user-defined regressors are used for multiple time series data sets or multiple BY groups.
SAS/ETS Model Editor Application (Experimental)
A new interactive application, the SAS/ETS Model Editor, enables you to define, fit, and simulate nonlinear statistical models using the MODEL procedure. The SAS/ETS Model Editor enables you to use the powerful features of PROC MODEL through a convenient and interactive graphical user interface.
Date Intervals, Formats, and Functions
The custom time intervals that are available in Base SAS software can be used in SAS/ETS procedures. Custom time intervals enable you to specify beginning and ending dates and seasonality for time intervals according to any definition. Such intervals can be used to define the following:
fiscal intervals such as monthly intervals that begin on a day other than the first day of the month (for example, intervals that begin on the 10th day of each month)
fiscal intervals such as monthly intervals that begin on different days for different months (for example, March of 2000 can begin on March 10, but April of 2000 can begin on April 12)
business days, such as banking days that exclude holidays
hourly intervals that omit hours that the business is closed
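As a rough illustration of the idea (written in Python, not SAS syntax), a custom interval definition can be thought of as a sorted table of beginning and ending dates. The sketch below assumes a hypothetical `FISCAL_MONTHS` table and hypothetical helper names; it is not how SAS stores or evaluates custom intervals, only a picture of what "beginning and ending dates according to any definition" means.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical fiscal months: each interval is a (begin, end) pair of dates,
# sorted by begin date and non-overlapping. The first two begin dates follow
# the example in the text (March 10, 2000 and April 12, 2000).
FISCAL_MONTHS = [
    (date(2000, 3, 10), date(2000, 4, 11)),
    (date(2000, 4, 12), date(2000, 5, 9)),
    (date(2000, 5, 10), date(2000, 6, 9)),
]

def find_interval(intervals, day):
    """Return the index of the interval that contains `day`, or None."""
    begins = [begin for begin, _ in intervals]
    i = bisect_right(begins, day) - 1  # last interval that begins on or before `day`
    if i >= 0 and intervals[i][0] <= day <= intervals[i][1]:
        return i
    return None

def intervals_between(intervals, start, end):
    """Count the interval boundaries crossed when moving from `start` to `end`."""
    i, j = find_interval(intervals, start), find_interval(intervals, end)
    if i is None or j is None:
        raise ValueError("date falls outside the custom interval definition")
    return j - i
```

With this table, April 11, 2000 still belongs to the fiscal month that began on March 10, while April 12 starts the next one. Business-day or business-hour intervals would use the same structure with shorter spans and gaps for holidays or closed hours.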
Chapter 2
Introduction
Contents
Overview of SAS/ETS Software
  Uses of SAS/ETS Software
  Contents of SAS/ETS Software
About This Book
  Chapter Organization
  Typographical Conventions
Where to Turn for More Information
  Accessing the SAS/ETS Sample Library
  Online Help System
  SAS Short Courses
  SAS Technical Support Services
Major Features of SAS/ETS Software
  Discrete Choice and Qualitative and Limited Dependent Variable Analysis
  Regression with Autocorrelated and Heteroscedastic Errors
  Simultaneous Systems Linear Regression
  Linear Systems Simulation
  Polynomial Distributed Lag Regression
  Nonlinear Systems Regression and Simulation
  ARIMA (Box-Jenkins) and ARIMAX (Box-Tiao) Modeling and Forecasting
  Vector Time Series Analysis
  State Space Modeling and Forecasting
  Spectral Analysis
  Seasonal Adjustment
  Structural Time Series Modeling and Forecasting
  Time Series Cross-Sectional Regression Analysis
  Automatic Time Series Forecasting
  Time Series Interpolation and Frequency Conversion
  Trend and Seasonal Analysis on Transaction Databases
  Access to Financial and Economic Databases
  Spreadsheet Calculations and Financial Report Generation
  Loan Analysis, Comparison, and Amortization
  Time Series Forecasting System
  Investment Analysis System
  ODS Graphics
Related SAS Software
  Base SAS Software
  SAS Forecast Studio
  SAS High-Performance Forecasting
  SAS/GRAPH Software
  SAS/STAT Software
  SAS/IML Software
  SAS/IML Stat Studio
  SAS/OR Software
  SAS/QC Software
  MLE for User-Defined Likelihood Functions
  JMP Software
  SAS Enterprise Guide
  SAS Add-In for Microsoft Office
  Enterprise Miner Time Series nodes
  SAS Risk Products
References
Overview of SAS/ETS Software
SAS/ETS software, a component of the SAS System, provides SAS procedures for:
econometric analysis
time series analysis
time series forecasting
systems modeling and simulation
discrete choice analysis
analysis of qualitative and limited dependent variable models
seasonal adjustment of time series data
financial analysis and reporting
access to economic and financial databases
time series data management
In addition to SAS procedures, SAS/ETS software also includes seamless access to economic and financial databases and interactive environments for time series forecasting and investment analysis.
Uses of SAS/ETS Software
SAS/ETS software provides tools for a wide variety of applications in business, government, and academia. Major uses of SAS/ETS procedures are economic analysis, forecasting, economic and financial modeling, time series analysis, financial reporting, and manipulation of time series data.
The common theme relating the many applications of the software is time series data: SAS/ETS software is useful whenever it is necessary to analyze or predict processes that take place over time or to analyze models that involve simultaneous relationships.
Although SAS/ETS software is most closely associated with business, finance, and economics, time series data also arise in many other fields. SAS/ETS software is useful whenever time dependencies, simultaneous relationships, or dynamic processes complicate data analysis. For example, an environmental quality study might use SAS/ETS software's time series analysis tools to analyze pollution emissions data. A pharmacokinetic study might use SAS/ETS software's features for nonlinear systems to model the dynamics of drug metabolism in different tissues.
The diversity of problems for which econometrics and time series analysis tools are needed is reflected in the applications reported by SAS users. The following are some applications of SAS/ETS software presented by SAS users at past annual conferences of the SAS Users Group International (SUGI):
forecasting college enrollment (Calise and Earley 1997)
fitting a pharmacokinetic model (Morelock et al. 1995)
testing interaction effect in reducing sudden infant death syndrome (Fleming, Gibson, and Fleming 1996)
forecasting operational indices to measure productivity changes (McCarty 1994)
spectral decomposition and reconstruction of nuclear plant signals (Hoyer and Gross 1993)
estimating parameters for the constant-elasticity-of-substitution translog model (Hisnanick 1993)
applying econometric analysis for mass appraisal of real property (Amal and Weselowski 1993)
forecasting telephone usage data (Fishetti, Heathcote, and Perry 1993)
forecasting demand and utilization of inpatient hospital services (Hisnanick 1992)
using conditional demand estimation to determine electricity demand (Keshani and Taylor 1992)
estimating tree biomass for measurement of forestry yields (Parresol and Thomas 1991)
evaluating the theory of input separability in the production function of U.S. manufacturing (Hisnanick 1991)
forecasting dairy milk yields and composition (Benseman 1990)
predicting the gloss of coated aluminum products subject to weathering (Khan 1990)
learning curve analysis for predicting manufacturing costs of aircraft (Le Bouton 1989)
analyzing Dow Jones stock index trends (Early, Sweeney, and Zekavat 1989)
analyzing the usefulness of the composite index of leading economic indicators for forecasting the economy (Lin and Myers 1988)
Contents of SAS/ETS Software
Procedures
SAS/ETS software includes the following SAS procedures:
ARIMA       ARIMA (Box-Jenkins) and ARIMAX (Box-Tiao) modeling and forecasting
AUTOREG     regression analysis with autocorrelated or heteroscedastic errors and ARCH and GARCH modeling
COMPUTAB    spreadsheet calculations and financial report generation
COUNTREG    regression modeling for dependent variables that represent counts
DATASOURCE  access to financial and economic databases
ENTROPY     maximum entropy-based regression
ESM         forecasting by using exponential smoothing models with optimized smoothing weights
EXPAND      time series interpolation, frequency conversion, and transformation of time series
FORECAST    automatic forecasting
LOAN        loan analysis and comparison
MDC         multinomial discrete choice analysis
MODEL       nonlinear simultaneous equations regression and nonlinear systems modeling and simulation
PANEL       panel data models
PDLREG      polynomial distributed lag regression
QLIM        qualitative and limited dependent variable analysis
SIMILARITY  similarity analysis of time series data for time series data mining
SIMLIN      linear systems simulation
SPECTRA     spectral and cross-spectral analysis
STATESPACE  state space modeling and automated forecasting of multivariate time series
SYSLIN      linear simultaneous equations models
TIMESERIES  analysis of time-stamped transactional data
TSCSREG     time series cross-sectional regression analysis
UCM         unobserved components analysis of time series
VARMAX      vector autoregressive and moving-average modeling and forecasting
X11         seasonal adjustment (Census X-11 and X-11 ARIMA)
X12         seasonal adjustment (Census X-12 ARIMA)
Macros
SAS/ETS software includes the following SAS macros:
%AR         generates statements to define autoregressive error models for the MODEL procedure
%BOXCOXAR   investigates Box-Cox transformations useful for modeling and forecasting a time series
%DFPVALUE   computes probabilities for Dickey-Fuller test statistics
%DFTEST     performs Dickey-Fuller tests for unit roots in a time series process
%LOGTEST    tests to determine whether a log transformation is appropriate for modeling and forecasting a time series
%MA         generates statements to define moving-average error models for the MODEL procedure
%PDL        generates statements to define polynomial distributed lag models for the MODEL procedure
These macros are part of the SAS AUTOCALL facility and are automatically available for use in your SAS program. Refer to SAS Macro Language: Reference for information about the SAS macro facility.
Access Interfaces to Economic and Financial Databases
In addition to PROC DATASOURCE, these SAS/ETS access interfaces provide seamless access to financial and economic databases:
SASECRSP   LIBNAME engine for accessing time series and event data residing in a CRSPAccess database.
SASEFAME   LIBNAME engine for accessing time or case series data residing in a FAME database.
SASEHAVR   LIBNAME engine for accessing time series residing in a HAVER ANALYTICS Data Link Express (DLX) database.
The Time Series Forecasting System
SAS/ETS software includes an interactive forecasting system, described in Part IV. This graphical user interface to SAS/ETS forecasting features was developed with SAS/AF software and uses PROC ARIMA and other internal routines to perform time series forecasting. The Time Series Forecasting System makes it easy to forecast time series and provides many features for graphical data exploration and graphical comparisons of forecasting models and forecasts. (You must have SAS/GRAPH installed to use the graphical features of the system.)
The Investment Analysis System
The Investment Analysis System, described in Part V, is an interactive environment for analyzing the time-value of money in a variety of investments. Various analyses are provided to help analyze the value of investment alternatives: time value, periodic equivalent, internal rate of return, benefit-cost ratio, and break-even analysis.
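To make two of these analyses concrete, the following Python sketch (not SAS code, and not the Investment Analysis System's own implementation) computes a net present value, an internal rate of return by bisection, and a benefit-cost ratio. The function names, the search bounds, and the assumption that cash flows occur at evenly spaced periods starting at period 0 are all illustrative choices.

```python
def npv(rate, cash_flows):
    """Net present value of cash flows at periods 0, 1, 2, ... at a per-period rate."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))

def irr(cash_flows, lo=-0.99, hi=10.0, tol=1e-10):
    """Internal rate of return by bisection: the rate at which NPV is zero.

    Assumes NPV changes sign exactly once between `lo` and `hi`.
    """
    f_lo, f_hi = npv(lo, cash_flows), npv(hi, cash_flows)
    if f_lo * f_hi > 0:
        raise ValueError("NPV does not change sign on the search interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = npv(mid, cash_flows)
        if f_lo * f_mid <= 0:
            hi = mid              # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid  # root lies in [mid, hi]
    return (lo + hi) / 2.0

def benefit_cost_ratio(rate, benefits, costs):
    """Present value of benefits divided by present value of costs."""
    return npv(rate, benefits) / npv(rate, costs)
```

For example, an outlay of 1000 followed by a single payment of 1100 one period later has an internal rate of return of 10 percent, because that is the rate at which the discounted payment exactly offsets the outlay.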
About This Book
This book is a user's guide to SAS/ETS software. Since SAS/ETS software is a part of the SAS System, this book assumes that you are familiar with Base SAS software and have the books SAS Language Reference: Dictionary and Base SAS Procedures Guide available for reference. It also assumes that you are familiar with SAS data sets, the SAS DATA step, and with basic SAS procedures such as PROC PRINT and PROC SORT. Chapter 3, "Working with Time Series Data," in this book summarizes the aspects of Base SAS software that are most relevant to the use of SAS/ETS software.
Chapter Organization
Following a brief "What's New," this book is divided into five major parts. Part I contains general information to aid you in working with SAS/ETS Software. Part II explains the SAS procedures of SAS/ETS software. Part III describes the available data access interfaces for economic and financial databases. Part IV is the reference for the Time Series Forecasting System, an interactive forecasting menu system that uses PROC ARIMA and other routines to perform time series forecasting. Finally, Part V is the reference for the Investment Analysis System.
The new features added to SAS/ETS software since the publication of SAS/ETS Software: Changes and Enhancements for Release 8.2 are summarized in Chapter 1, "What's New in SAS/ETS 9.22." If you have used SAS/ETS software in the past, you may want to skim this chapter to see what's new.
Part I contains the following chapters.
Chapter 2, the current chapter, provides an overview of SAS/ETS software and summarizes related SAS publications, products, and services.
Chapter 3, "Working with Time Series Data," discusses the use of SAS data management and programming features for time series data.
Chapter 4, "Date Intervals, Formats, and Functions," summarizes the time intervals, date and datetime informats, date and datetime formats, and date and datetime functions available in the SAS System.
Chapter 5, "SAS Macros and Functions," documents SAS macros and DATA step financial functions provided with SAS/ETS software. The macros use SAS/ETS procedures to perform Dickey-Fuller tests, test for the need for log transformations, or select optimal Box-Cox transformation parameters for time series data.
Chapter 6, "Nonlinear Optimization Methods," documents the NonLinear Optimization subsystem used by some ETS procedures to perform nonlinear optimization tasks.
Part II contains chapters that explain the SAS procedures that make up SAS/ETS software. These chapters appear in alphabetical order by procedure name.
Part III contains chapters that document the ETS access interfaces to economic and financial databases.
Each of the chapters that document the SAS/ETS procedures (Part II) and the SAS/ETS access interfaces (Part III) is organized as follows:
1. The "Overview" section gives a brief description of the procedure.
2. The "Getting Started" section provides a tutorial introduction on how to use the procedure.
3. The "Syntax" section is a reference to the SAS statements and options that control the procedure.
4. The "Details" section discusses various technical details.
5. The "Examples" section contains examples of the use of the procedure.
6. The "References" section contains technical references on methodology.
Part IV contains the chapters that document the features of the Time Series Forecasting System.
Part V contains chapters that document the features of the Investment Analysis System.
Typographical Conventions
This book uses several type styles for presenting information. The following list explains the meaning of the typographical conventions used in this book:
roman            is the standard type style used for most text.
UPPERCASE ROMAN  is used for SAS statements, options, and other SAS language elements when they appear in the text. However, you can enter these elements in your own SAS programs in lowercase, uppercase, or a mixture of the two.
UPPERCASE BOLD   is used in the "Syntax" sections' initial lists of SAS statements and options.
oblique          is used for user-supplied values for options in the syntax definitions. In the text, these values are written in italic.
helvetica        is used for the names of variables and data sets when they appear in the text.
bold             is used to refer to matrices and vectors and to refer to commands.
italic           is used for terms that are defined in the text, for emphasis, and for references to publications.
bold monospace   is used for example code. In most cases, this book uses lowercase type for SAS statements.
Where to Turn for More Information
This section describes other sources of information about SAS/ETS software.
Accessing the SAS/ETS Sample Library
The SAS/ETS Sample Library includes many examples that illustrate the use of SAS/ETS software, including the examples used in this documentation. To access these sample programs, select Help from the menu and then select SAS Help and Documentation. From the Contents list, select the section Sample SAS Programs under Learning to Use SAS.
Online Help System
You can access online help information about SAS/ETS software in two ways, depending on whether you are using the SAS windowing environment in the command line mode or the pull-down menu mode.
If you are using a command line, you can access the SAS/ETS help menus by typing help on the SAS windowing environment command line. Or you can issue the command help ARIMA (or another procedure name) to display the help for that particular procedure.
If you are using the SAS windowing environment pull-down menus, you can pull down the Help menu and make the following selections:
SAS Help and Documentation
Learning to Use SAS in the Contents list
SAS Products
SAS/ETS
The content of the Online Help System follows closely that of this book.
SAS Short Courses
The SAS Education Division offers a number of training courses that might be of interest to SAS/ETS users. Please check the SAS web site for the current list of available training courses.
SAS Technical Support Services
As with all SAS products, the SAS Technical Support staff is available to respond to problems and answer technical questions regarding the use of SAS/ETS software.
Major Features of SAS/ETS Software
The following sections briefly summarize major features of SAS/ETS software. See the chapters on individual procedures for more detailed information.
Discrete Choice and Qualitative and Limited Dependent Variable Analysis
The MDC procedure provides maximum likelihood (ML) or simulated maximum likelihood estimates of multinomial discrete choice models in which the choice set consists of unordered multiple alternatives.
The MDC procedure supports the following models and features:
conditional logit
nested logit
heteroscedastic extreme value
multinomial probit
mixed logit
pseudo-random or quasi-random numbers for simulated maximum likelihood estimation
bounds imposed on the parameter estimates
linear restrictions imposed on the parameter estimates
SAS data set containing predicted probabilities and linear predictor (x'β) values
decision tree and nested logit
model fit and goodness-of-fit measures including
likelihood ratio
Aldrich-Nelson
Cragg-Uhler 1
Cragg-Uhler 2
Estrella
Adjusted Estrella
McFadden's LRI
Veall-Zimmermann
Akaike Information Criterion (AIC)
Schwarz Criterion or Bayesian Information Criterion (BIC)
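Several of the fit measures in this list are simple functions of the maximized log likelihood. The Python sketch below (an illustration, not SAS code) uses the common textbook definitions of AIC, the Schwarz criterion, and McFadden's likelihood ratio index; the function and argument names are hypothetical, and the exact scaling that a given procedure reports should be checked in that procedure's chapter.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 ln L + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params

def bic(log_likelihood, n_params, n_obs):
    """Schwarz (Bayesian) criterion: -2 ln L + k ln n."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def mcfadden_lri(log_likelihood, log_likelihood_null):
    """McFadden's likelihood ratio index: 1 - ln L / ln L0,
    where ln L0 is the log likelihood of the restricted (null) model."""
    return 1.0 - log_likelihood / log_likelihood_null
```

Smaller AIC and BIC values indicate a better trade-off between fit and the number of parameters, while a McFadden index closer to 1 indicates a larger improvement over the null model.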
The QLIM procedure analyzes univariate and multivariate limited dependent variable models where dependent variables take discrete values or dependent variables are observed only in a limited range of values. This procedure includes logit, probit, Tobit, and general simultaneous equations models. The QLIM procedure supports the following models:
linear regression model with heteroscedasticity
probit with heteroscedasticity
logit with heteroscedasticity
Tobit (censored and truncated) with heteroscedasticity
Box-Cox regression with heteroscedasticity
bivariate probit
bivariate Tobit
sample selection models
multivariate limited dependent models
The COUNTREG procedure provides regression models in which the dependent variable takes nonnegative integer count values. The COUNTREG procedure supports the following models:
Poisson regression
negative binomial regression with quadratic and linear variance functions
zero inflated Poisson (ZIP) model
zero inflated negative binomial (ZINB) model
fixed and random effect Poisson panel data models
fixed and random effect NB (negative binomial) panel data models
The PANEL procedure deals with panel data sets that consist of time series observations on each of several cross-sectional units.
The PANEL procedure supports the following models and methods:
one-way and two-way models
fixed and random effects
autoregressive models
the Parks method
dynamic panel estimator
the Da Silva method for moving-average disturbances
Regression with Autocorrelated and Heteroscedastic Errors
The AUTOREG procedure provides regression analysis and forecasting of linear models with autocorrelated or heteroscedastic errors. The AUTOREG procedure includes the following features:
estimation and prediction of linear regression models with autoregressive errors
any order autoregressive or subset autoregressive process
optional stepwise selection of autoregressive parameters
choice of the following estimation methods:
exact maximum likelihood
exact nonlinear least squares
Yule-Walker
iterated Yule-Walker
tests for any linear hypothesis that involves the structural coefficients
restrictions for any linear combination of the structural coefficients
forecasts with confidence limits
estimation and forecasting of ARCH (autoregressive conditional heteroscedasticity), GARCH (generalized autoregressive conditional heteroscedasticity), I-GARCH (integrated GARCH), E-GARCH (exponential GARCH), and GARCH-M (GARCH in mean) models
combination of ARCH and GARCH models with autoregressive models, with or without regressors
estimation and testing of general heteroscedasticity models
variety of model diagnostic information including the following:
autocorrelation plots
partial autocorrelation plots
Durbin-Watson test statistic and generalized Durbin-Watson tests to any order
Durbin h and Durbin t statistics
Akaike information criterion
Schwarz information criterion
tests for ARCH errors
Ramsey's RESET test
Chow and PChow tests
Phillips-Perron stationarity test
CUSUM and CUSUMSQ statistics
exact significance levels (p-values) for the Durbin-Watson statistic
embedded missing values
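As an illustration of one diagnostic in this list, the Python sketch below (not SAS code, and not the AUTOREG implementation) computes the Durbin-Watson statistic from a list of regression residuals, with an `order` argument for the generalized statistic at higher lags. The function name is hypothetical, and the sketch assumes the residuals contain no missing values.

```python
def durbin_watson(residuals, order=1):
    """Generalized Durbin-Watson statistic of the given order.

    d_j = sum_{t > j} (e_t - e_{t-j})**2 / sum_t e_t**2
    Values near 2 suggest no autocorrelation at lag j, values near 0
    suggest positive autocorrelation, and values near 4 suggest negative
    autocorrelation.
    """
    if order < 1 or order >= len(residuals):
        raise ValueError("order must be between 1 and len(residuals) - 1")
    numerator = sum(
        (residuals[t] - residuals[t - order]) ** 2
        for t in range(order, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator
```

Residuals that alternate in sign, such as 1, -1, 1, -1, give a first-order statistic well above 2, which points to negative first-order autocorrelation.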
Simultaneous Systems Linear Regression
The SYSLIN and ENTROPY procedures provide regression analysis of a simultaneous system of linear equations.
The SYSLIN procedure includes the following features:
estimation of parameters in simultaneous systems of linear equations
full range of estimation methods including the following:
ordinary least squares (OLS)
two-stage least squares (2SLS)
three-stage least squares (3SLS)
iterated 3SLS (IT3SLS)
seemingly unrelated regression (SUR)
iterated SUR (ITSUR)
limited-information maximum likelihood (LIML)
full-information maximum likelihood (FIML)
minimum expected loss (MELO)
general K-class estimators
weighted regression
any number of restrictions for any linear combination of coefficients, within a single model or across equations
tests for any linear hypothesis, for the parameters of a single model or across equations
wide range of model diagnostics and statistics including the following:
usual ANOVA tables and R-square statistics
Durbin-Watson statistics
standardized coefficients
test for overidentifying restrictions
residual plots
standard errors and t tests
covariance and correlation matrices of parameter estimates and equation errors
predicted values, residuals, parameter estimates, and variance-covariance matrices saved in output SAS data sets
other features of the SYSLIN procedure that enable you to do the following:
impose linear restrictions on the parameter estimates
test linear hypotheses about the parameters
write predicted and residual values to an output SAS data set
write parameter estimates to an output SAS data set
write the crossproducts matrix (SSCP) to an output SAS data set
use raw data, correlations, covariances, or cross products as input
The ENTROPY procedure supports the following models and features:
generalized maximum entropy (GME) estimation
generalized cross entropy (GCE) estimation
normed moment generalized maximum entropy
maximum entropy-based seemingly unrelated regression (MESUR) estimation
pure inverse estimation
estimation of parameters in simultaneous systems of linear equations
Markov models
unordered multinomial choice problems
weighted regression
any number of restrictions for any linear combination of coefficients, within a single model or across equations
tests for any linear hypothesis, for the parameters of a single model or across equations
Linear Systems Simulation
The SIMLIN procedure performs simulation and multiplier analysis for simultaneous systems of linear regression models. The SIMLIN procedure includes the following features:
reduced form coefficients
interim multipliers
total multipliers
dynamic multipliers
multipliers for higher order lags
dynamic forecasts and simulations
goodness-of-fit statistics
acceptance of the equation system coefficients estimated by the SYSLIN procedure as input
Polynomial Distributed Lag Regression
The PDLREG procedure provides regression analysis for linear models with polynomial distributed (Almon) lags. The PDLREG procedure includes the following features:
entry of any number of regressors as a polynomial lag distribution and the use of any number of covariates
use of any order lag length and degree polynomial for lag distribution
optional upper and lower endpoint restrictions
specification of any number of linear restrictions on covariates
option to repeat analysis over a range of degrees for the lag distribution polynomials
support for autoregressive errors to any lag
forecasts with confidence limits
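The idea behind a polynomial distributed (Almon) lag is that the lag weights are not estimated freely but are constrained to lie on a low-degree polynomial in the lag index. The Python sketch below (an illustration, not SAS code) builds lag weights from a plain power-basis polynomial and applies them to a series. The function names are hypothetical, and the simple power basis is an assumption made for clarity; it is not a description of how PROC PDLREG parameterizes the polynomial internally.

```python
def almon_weights(poly_coefs, n_lags):
    """Lag weights b_i = sum_j a_j * i**j for lags i = 0, ..., n_lags.

    `poly_coefs` holds a_0, a_1, ..., a_d for a polynomial of degree d.
    """
    return [
        sum(a * i ** j for j, a in enumerate(poly_coefs))
        for i in range(n_lags + 1)
    ]

def distributed_lag(x, weights):
    """Apply lag weights: y_t = sum_i b_i * x_{t-i}.

    Only time points with a full lag history are returned.
    """
    p = len(weights) - 1
    return [
        sum(w * x[t - i] for i, w in enumerate(weights))
        for t in range(p, len(x))
    ]
```

With a degree-2 polynomial and 3 lags, four lag weights are determined by only three polynomial coefficients, which is the source of the parsimony of this kind of model.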
Nonlinear Systems Regression and Simulation
The MODEL procedure provides parameter estimation, simulation, and forecasting of dynamic nonlinear simultaneous equation models. The MODEL procedure includes the following features:
nonlinear regression analysis for systems of simultaneous equations, including weighted nonlinear regression
full range of parameter estimation methods including the following:
nonlinear ordinary least squares (OLS)
nonlinear seemingly unrelated regression (SUR)
nonlinear two-stage least squares (2SLS)
nonlinear three-stage least squares (3SLS)
iterated SUR
iterated 3SLS
generalized method of moments (GMM)
nonlinear full-information maximum likelihood (FIML)
simulated method of moments (SMM)
supports dynamic multi-equation nonlinear models of any size or complexity
uses the full power of the SAS programming language for model definition, including left-hand-side expressions
hypothesis tests of nonlinear functions of the parameter estimates
linear and nonlinear restrictions of the parameter estimates
bounds imposed on the parameter estimates
computation of estimates and standard errors of nonlinear functions of the parameter estimates
estimation and simulation of ordinary differential equations (ODEs)
vector autoregressive error processes and polynomial lag distributions easily specified for the nonlinear equations
variance modeling (ARCH, GARCH, and others)
computation of goal-seeking solutions of nonlinear systems to find input values needed to produce target outputs
dynamic, static, or n-period-ahead-forecast simulation modes
simultaneous solution or single equation solution modes
Monte Carlo simulation using parameter estimate covariance and across-equation residuals covariance matrices or user-specified random functions
a variety of diagnostic statistics including the following:
model R-square statistics
general Durbin-Watson statistics and exact p-values
asymptotic standard errors and t tests
first-stage R-square statistics
covariance estimates
collinearity diagnostics
simulation goodness-of-fit statistics
Theil inequality coefficient decompositions
Theil relative change forecast error measures
heteroscedasticity tests
Godfrey test for serial correlation
Hausman specification test
Chow tests
block structure and dependency structure analysis for the nonlinear system
listing and cross-reference of fitted model
automatic calculation of needed derivatives by using exact analytic formulas
efficient sparse matrix methods used for model solution; choice of other solution methods
Model definition, parameter estimation, simulation, and forecasting can be performed interactively in a single SAS session, or models can be stored in files and reused and combined in later runs.
ARIMA (Box-Jenkins) and ARIMAX (Box-Tiao) Modeling and Forecasting
The ARIMA procedure provides the identification, parameter estimation, and forecasting of autoregressive integrated moving-average (Box-Jenkins) models, seasonal ARIMA models, transfer function models, and intervention models. The ARIMA procedure includes the following features:
complete ARIMA (Box-Jenkins) modeling with no limits on the order of autoregressive or moving-average processes
model identification diagnostics including the following:
autocorrelation function
partial autocorrelation function
inverse autocorrelation function
cross-correlation function
extended sample autocorrelation function
minimum information criterion for model identification
squared canonical correlations
stationarity tests
outlier detection
intervention analysis
regression with ARMA errors
transfer function modeling with fully general rational transfer functions
seasonal ARIMA models
ARIMA model-based interpolation of missing values
several parameter estimation methods including the following:
exact maximum likelihood
conditional least squares
exact nonlinear unconditional least squares (ELS or ULS)
prewhitening transformations
forecasts and confidence limits for all models
forecasting tied to parameter estimation methods: finite memory forecasts for models estimated by maximum likelihood or exact nonlinear least squares methods and infinite memory forecasts for models estimated by conditional least squares
diagnostic statistics to help judge the adequacy of the model including the following:
Akaike's information criterion (AIC)
Schwarz's Bayesian criterion (SBC or BIC)
Box-Ljung chi-square test statistics for white-noise residuals
autocorrelation function of residuals
partial autocorrelation function of residuals
inverse autocorrelation function of residuals
automatic outlier detection
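Two of the residual diagnostics listed above can be sketched in a few lines. The Python code below (an illustration, not SAS code, and not PROC ARIMA's implementation) computes the sample autocorrelation function of a residual series and the Ljung-Box Q statistic built from it. The function names are hypothetical; the formulas are the usual textbook ones, and the degrees of freedom used for the chi-square comparison are left out because they depend on the number of estimated ARMA parameters.

```python
def sample_acf(residuals, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag.

    r_k = sum_t (e_t - mean)(e_{t+k} - mean) / sum_t (e_t - mean)**2
    """
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(1, max_lag + 1)
    ]

def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_k r_k**2 / (n - k)."""
    n = len(residuals)
    acf = sample_acf(residuals, max_lag)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(acf, start=1))
```

For white-noise residuals the autocorrelations should be close to zero at every lag, so a large Q relative to the chi-square reference distribution suggests that the fitted model has left some serial dependence unexplained.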
Vector Time Series Analysis
The VARMAX procedure enables you to model the dynamic relationship both between the dependent variables and between the dependent and independent variables. The VARMAX procedure includes the following features:
several modeling features:
vector autoregressive model
vector autoregressive model with exogenous variables
vector autoregressive and moving-average model
Bayesian vector autoregressive model
vector error correction model
Bayesian vector error correction model
GARCH-type multivariate conditional heteroscedasticity models
criteria for automatically determining AR and MA orders:
Akaike information criterion (AIC)
corrected AIC (AICC)
Hannan-Quinn (HQ) criterion
final prediction error (FPE)
Schwarz Bayesian criterion (SBC), also known as Bayesian information criterion (BIC)
AR order identification aids:
partial cross-correlations
Yule-Walker estimates
partial autoregressive coefficients
partial canonical correlations
testing the presence of unit roots and cointegration:
Dickey-Fuller tests
Johansen cointegration test for nonstationary vector processes of integrated order one
Stock-Watson common trends test for the possibility of cointegration among nonstationary vector processes of integrated order one
Johansen cointegration test for nonstationary vector processes of integrated order two
model parameter estimation methods:
least squares (LS)
maximum likelihood (ML)
model checks and residual analysis using the following tests:
Durbin-Watson (DW) statistics
F test for autoregressive conditional heteroscedastic (ARCH) disturbance
F test for AR disturbance
Jarque-Bera normality test
Portmanteau test
seasonal deterministic terms
subset models
multiple regression with distributed lags
dead-start model that does not have present values of the exogenous variables
Granger-causal relationships between two distinct groups of variables
infinite order AR representation
impulse response function (or infinite order MA representation)
decomposition of the predicted error covariances
roots of the characteristic functions for both the AR and MA parts to evaluate the proximity of the roots to the unit circle
contemporaneous relationships among the components of the vector time series
forecasts
conditional covariances for GARCH models
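To make the unit-circle check mentioned in the list above concrete, the following Python sketch (an illustration only, not SAS syntax) examines a bivariate VAR(1) model y_t = A y_{t-1} + e_t. It assumes the standard result that such a model is stationary when every eigenvalue of A has modulus less than one, which is equivalent to the roots of det(I - Az) = 0 lying outside the unit circle. The function names are hypothetical, and which of the two conventions PROC VARMAX reports is not stated here.

```python
import cmath


def eigenvalues_2x2(a):
    """Eigenvalues of a 2x2 matrix given as [[a11, a12], [a21, a22]]."""
    (a11, a12), (a21, a22) = a
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    # Roots of lambda^2 - trace*lambda + det = 0 (may be complex).
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + disc) / 2.0, (trace - disc) / 2.0


def is_stable_var1(a):
    """True if all eigenvalues of the VAR(1) coefficient matrix lie
    strictly inside the unit circle."""
    return all(abs(lam) < 1.0 for lam in eigenvalues_2x2(a))
```

An eigenvalue whose modulus is close to one signals a process that is close to nonstationary, which is the situation the unit root and cointegration tests are designed to examine.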
State Space Modeling and Forecasting
The STATESPACE procedure provides automatic model selection, parameter estimation, and forecasting of state space models. (State space models encompass an alternative general formulation of multivariate ARIMA models.) The STATESPACE procedure includes the following features:
multivariate ARIMA modeling by using the general state space representation of the stochastic process
automatic model selection using Akaike's information criterion (AIC)
user-specified state space models including restrictions
transfer function models with random inputs
any combination of simple and seasonal differencing; input series can be differenced to any order for any lag lengths
forecasts with confidence limits
ability to save selected and fitted model in a data set and reuse for forecasting
wide range of output options including the ability to print any statistics concerning the data and their covariance structure, the model selection process, and the final model fit
Spectral Analysis
The SPECTRA procedure provides spectral analysis and cross-spectral analysis of time series. TheSPECTRA procedure includes the following features:
efficient calculation of periodogram and smoothed periodogram using fast finite Fourier transform and Chirp-Z algorithms
multiple spectral analysis, including raw and smoothed spectral and cross-spectral function estimates, with user-specified window weights
choice of kernel for smoothing
output of the following spectral estimates to a SAS data set:
Fourier sine and cosine coefficients
periodogram
smoothed periodogram
cospectrum
quadrature spectrum
amplitude
phase spectrum
squared coherency
Fisher's Kappa and Bartlett's Kolmogorov-Smirnov test statistic for testing a null hypothesis of white noise
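As a rough illustration of the periodogram and of Fisher's Kappa, the Python sketch below (not SAS syntax) computes periodogram ordinates with a direct Fourier sum and then takes the ratio of the largest ordinate to the mean ordinate. The scaling of the ordinates and the set of frequencies included are assumptions made for this example and can differ from what PROC SPECTRA uses; the direct sum is also far slower than the fast Fourier transform named above.

```python
import math


def periodogram(x):
    """Periodogram ordinates of the mean-centered series x at the Fourier
    frequencies 2*pi*k/n for k = 1..n//2 (one common scaling)."""
    n = len(x)
    mean = sum(x) / n
    ordinates = []
    for k in range(1, n // 2 + 1):
        w = 2.0 * math.pi * k / n
        c = sum((v - mean) * math.cos(w * t) for t, v in enumerate(x))
        s = sum((v - mean) * math.sin(w * t) for t, v in enumerate(x))
        ordinates.append((c * c + s * s) / n)
    return ordinates


def fisher_kappa(ordinates):
    """Ratio of the largest periodogram ordinate to the average ordinate."""
    return max(ordinates) / (sum(ordinates) / len(ordinates))
```

For a white noise series no single ordinate dominates, so the ratio stays small; a strong periodic component concentrates the spectrum at one frequency and makes the ratio large.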
Seasonal Adjustment
The X11 procedure provides seasonal adjustment of time series by using the Census X-11 or X-11 ARIMA method. The X11 procedure is based on the U.S. Bureau of the Census X-11 seasonal adjustment program and also supports the X-11 ARIMA method developed by Statistics Canada. The X11 procedure includes the following features:
decomposition of monthly or quarterly series into seasonal, trend, trading day, and irregular components
both multiplicative and additive form of the decomposition
all the features of the Census Bureau program
support of the X-11 ARIMA method
support of sliding spans analysis
processing of any number of variables at once with no maximum length for a series
computation of tests for stable, moving, and combined seasonality
optional printing or storing in SAS data sets of the individual X11 tables that show the various components at different stages of the computation; full control over what is printed or output
ability to project seasonal component one year ahead, which enables reintroduction of seasonal factors for an extrapolated series
The X12 procedure provides seasonal adjustment of time series using the X-12 ARIMA method. The X12 procedure is based on the U.S. Bureau of the Census X-12 ARIMA seasonal adjustment program (version 0.3). It also supports the X-11 ARIMA method developed by Statistics Canada and the previous X-11 method of the U.S. Census Bureau. The X12 procedure includes the following features:
decomposition of monthly or quarterly series into seasonal, trend, trading day, and irregular components
support of multiplicative, additive, pseudo-additive, and log additive forms of decomposition
support of the X-12 ARIMA method
support of regARIMA modeling
automatic identification of outliers
support of TRAMO-based automatic model selection
use of regressors to process missing values within the span of the series
processing of any number of variables at once with no maximum length for a series
computation of tests for stable, moving, and combined seasonality
spectral analysis of original, seasonally adjusted, and irregular series
optional printing or storing in a SAS data set of the individual X11 tables that show the various components at different stages of the decomposition; full control over what is printed or output
optional projection of seasonal component one year ahead, which enables reintroduction of seasonal factors for an extrapolated series
Structural Time Series Modeling and Forecasting
The UCM procedure provides a flexible environment for analyzing time series data using structural time series models, also called unobserved components models (UCM). These models represent the observed series as a sum of suitably chosen components such as trend, seasonal, cyclical, and regression effects. You can use the UCM procedure to formulate comprehensive models that bring out all the salient features of the series under consideration. Structural models are applicable in the same situations where Box-Jenkins ARIMA models are applicable; however, the structural models tend to be more informative about the underlying stochastic structure of the series. The UCM procedure includes the following features:
general unobserved components modeling where the models can include trend, multiple seasons and cycles, and regression effects
maximum-likelihood estimation of the model parameters
model diagnostics that include a variety of goodness-of-fit statistics and extensive graphical diagnosis of the model residuals
forecasts and confidence limits for the series and all the model components
model-based seasonal decomposition
extensive plotting capability that includes the following:
forecast and confidence interval plots for the series and model components such as trend, cycles, and seasons
diagnostic plots such as residual plot, residual autocorrelation plots, and so on
seasonal decomposition plots such as trend, trend plus cycles, trend plus cycles plus seasons, and so on
model-based interpolation of series missing values
full sample (also called smoothed) estimates of the model components
Time Series Cross-Sectional Regression Analysis
The TSCSREG procedure provides combined time series cross-sectional regression analysis. The TSCSREG procedure includes the following features:
estimation of the regression parameters under several common error structures:
Fuller and Battese method (variance component model)
Wansbeek-Kapteyn method
Parks method (autoregressive model)
Da Silva method (mixed variance component moving-average model)
one-way fixed effects
two-way fixed effects
one-way random effects
two-way random effects
any number of model specifications
unbalanced panel data for the fixed or random-effects models
variety of estimates and statistics including the following:
underlying error components estimates
regression parameter estimates
standard errors of estimates
t-tests
R-square statistic
correlation matrix of estimates
covariance matrix of estimates
autoregressive parameter estimate
cross-sectional components estimates
autocovariance estimates
F tests of linear hypotheses about the regression parameters
specification tests
Automatic Time Series Forecasting
The ESM procedure provides a quick way to generate forecasts for many time series or transactional data in one step by using exponential smoothing methods. All parameters associated with the forecasting model are optimized based on the data.
You can use the following smoothing models:
simple
double
linear
damped trend
seasonal
Winters' method (additive and multiplicative)
Additionally, PROC ESM can transform the data before applying the smoothing methods using any of these transformations:
log
square root
logistic
Box-Cox
In addition to forecasting, the ESM procedure can also produce graphic output.
The ESM procedure can forecast both time series data, whose observations are equally spaced at a specific time interval (for example, monthly, weekly), and transactional data, whose observations are not spaced with respect to any particular time interval. (Internet, inventory, sales, and similar data are typical examples of transactional data. For transactional data, the data are accumulated based on a specified time interval to form a time series.)
The ESM procedure is a replacement for the older FORECAST procedure. ESM is often more convenient to use than PROC FORECAST, but it supports only exponential smoothing models.
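The idea of a smoothing model whose parameter is optimized from the data can be sketched for the simple (single) exponential smoothing case. The Python below is an illustration only, not SAS syntax: it assumes the level is initialized to the first observation and that the smoothing weight is chosen by a grid search that minimizes the sum of squared one-step-ahead errors. PROC ESM's own initialization and optimizer are not described here, and all names are hypothetical.

```python
def ses_forecasts(series, alpha):
    """Simple exponential smoothing.

    Returns (forecasts, next_forecast), where forecasts[i] is the
    one-step-ahead forecast of series[i + 1] and next_forecast is the
    forecast for the first period after the end of the series.
    """
    level = series[0]          # assumed initialization
    forecasts = []
    for value in series[1:]:
        forecasts.append(level)
        level = alpha * value + (1.0 - alpha) * level
    return forecasts, level


def sum_squared_errors(series, alpha):
    """Sum of squared one-step-ahead forecast errors for a given weight."""
    forecasts, _ = ses_forecasts(series, alpha)
    return sum((actual - f) ** 2
               for actual, f in zip(series[1:], forecasts))


def fit_alpha(series, steps=100):
    """Pick the smoothing weight on a grid in (0, 1] with the smallest
    sum of squared one-step-ahead errors."""
    grid = [i / steps for i in range(1, steps + 1)]
    return min(grid, key=lambda a: sum_squared_errors(series, a))
```

For example, `fit_alpha(sales)` followed by `ses_forecasts(sales, alpha)` yields the fitted weight and the forecast for the next period, where `sales` is any list of equally spaced observations.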
The FORECAST procedure provides forecasting of univariate time series using automatic trend extrapolation. PROC FORECAST is an easy-to-use procedure for automatic forecasting and uses simple popular methods that do not require statistical modeling of the time series, such as exponential smoothing, time trend with autoregressive errors, and the Holt-Winters method.
The FORECAST procedure supplements the powerful forecasting capabilities of the econometric and time series analysis procedures described previously. You can use PROC FORECAST when you have many series to forecast and you want to extrapolate trends without developing a model for each series.
The FORECAST procedure includes the following features:
choice of the following forecasting methods:
EXPO method: single, double, triple, or Holt two-parameter exponential smoothing
exponential smoothing as an ARIMA model
WINTERS method: uses updating equations similar to exponential smoothing to fit model parameters
ADDWINTERS method: like the WINTERS method except that the seasonal parameters are added to the trend instead of multiplied with the trend
STEPAR method: stepwise autoregressive models with constant, linear, or quadratic trend and autoregressive errors to any order
Holt-Winters forecasting method with constant, linear, or quadratic trend
additive variant of the Holt-Winters method
support for up to three levels of seasonality for Holt-Winters method: time-of-year, day-of-week, or time-of-day
ability to forecast any number of variables at once
forecast confidence limits for all methods
Time Series Interpolation and Frequency Conversion
The EXPAND procedure provides time interval conversion and missing value interpolation for time series. The EXPAND procedure includes the following features:
conversion of time series frequency; for example, constructing quarterly estimates from annual series or aggregating quarterly values to annual values
conversion of irregular observations to periodic observations
interpolation of missing values in time series
conversion of observation types; for example, estimate stocks from flows and vice versa. All possible conversions are supported between any of the following:
beginning of period
end of period
period midpoint
period total
period average
conversion of time series phase shift; for example, conversion between fiscal years and calendar years
identifying observations including the following:
identification of the time interval of the input values
validation of the input data set observations
computation of the ID values for the observations in the output data set
choice of four interpolation methods:
cubic splines
linear splines
step functions
simple aggregation
ability to perform extrapolation by a linear projection of the trend of the cubic spline curve fit to the input data
ability to transform series before and after interpolation (or without interpolation) by using any of the following:
constant shift or scale
sign change or absolute value
logarithm, exponential, square root, square, logistic, inverse logistic
lags, leads, differences
classical decomposition
bounds, trims, reverse series
centered moving, cumulative, or backward moving average
centered moving, cumulative, or backward moving range
centered moving, cumulative, or backward moving geometric mean
centered moving, cumulative, or backward moving maximum
centered moving, cumulative, or backward moving median
centered moving, cumulative, or backward moving minimum
centered moving, cumulative, or backward moving product
centered moving, cumulative, or backward moving corrected sum of squares
centered moving, cumulative, or backward moving uncorrected sum of squares
centered moving, cumulative, or backward moving rank
centered moving, cumulative, or backward moving standard deviation
centered moving, cumulative, or backward moving sum
centered moving, cumulative, or backward moving t-value
centered moving, cumulative, or backward moving variance
support for a wide range of time series frequencies:
YEAR
SEMIYEAR
QUARTER
MONTH
SEMIMONTH
TENDAY
WEEK
WEEKDAY
DAY
HOUR
MINUTE
SECOND
support for repeating or shifting the basic interval types to define a great variety of different frequencies, such as fiscal years, biennial periods, work shifts, and so forth
Refer to Chapter 3, "Working with Time Series Data," and Chapter 4, "Date Intervals, Formats, and Functions," for more information about time series data transformations.
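Two of the simpler operations described above, aggregating to a lower frequency and filling missing values by linear interpolation, can be sketched as follows. This Python is an illustration only, not SAS syntax; the function names and the `observed` keyword values are hypothetical, and the sketch does not reproduce the cubic spline method.

```python
def aggregate(values, periods_per_group, observed="total"):
    """Convert a series to a lower frequency, for example quarterly to
    annual with periods_per_group=4."""
    if len(values) % periods_per_group != 0:
        raise ValueError("series length must be a multiple of the group size")
    groups = [values[i:i + periods_per_group]
              for i in range(0, len(values), periods_per_group)]
    if observed == "total":
        return [sum(g) for g in groups]
    if observed == "average":
        return [sum(g) / len(g) for g in groups]
    if observed == "beginning":
        return [g[0] for g in groups]
    if observed == "end":
        return [g[-1] for g in groups]
    raise ValueError("unknown observation type: " + observed)


def interpolate_linear(values):
    """Fill interior missing values (None) on the straight line between
    the nearest known neighbors; leading and trailing gaps are left as is."""
    result = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = values[left] + step * (i - left)
    return result
```

The choice of observation type matters: a flow such as sales is naturally aggregated as a period total, whereas a stock such as inventory is usually taken at the beginning or end of the period.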
Trend and Seasonal Analysis on Transaction Databases
The TIMESERIES procedure can accumulate transactional data to time series and perform trend and seasonal analysis on the accumulated time series.
Time series analyses performed by the TIMESERIES procedure include the following:
descriptive statistics relevant for time series data
seasonal decomposition and seasonal adjustment analysis
correlation analysis
cross-correlation analysis
The TIMESERIES procedure includes the following features:
ability to process large amounts of time-stamped transactional data
statistical methods useful for large-scale time series analysis or (temporal) data mining
output data sets stored in either a time series format (default) or a coordinate format (transposed)
The TIMESERIES procedure is normally used to prepare data for subsequent analysis that uses other SAS/ETS procedures or other parts of the SAS system. The time series format is most useful when the data are to be analyzed with SAS/ETS procedures. The coordinate format is most useful when the data are to be analyzed with SAS/STAT procedures or SAS Enterprise Miner. (For example, clustering time-stamped transactional data can be achieved by using the results of the TIMESERIES procedure with the clustering procedures of SAS/STAT and the nodes of SAS Enterprise Miner.)
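Accumulating time-stamped transactions into an equally spaced series can be pictured with the following Python sketch (an illustration only, not SAS syntax). It assumes monthly accumulation and marks months that have no transactions as missing (`None`); how the TIMESERIES procedure treats empty intervals is governed by its own options, which are not covered here. The function name is hypothetical.

```python
from collections import defaultdict
from datetime import date


def accumulate_monthly(transactions, how="total"):
    """Accumulate (date, amount) pairs into a monthly series.

    Returns a list of ((year, month), value) pairs covering every month
    from the first to the last transaction; months without transactions
    get the value None.
    """
    if how not in ("total", "average"):
        raise ValueError("unknown accumulation method: " + how)
    buckets = defaultdict(list)
    for day, amount in transactions:
        buckets[(day.year, day.month)].append(amount)
    if not buckets:
        return []
    (year, month), last = min(buckets), max(buckets)
    series = []
    while (year, month) <= last:
        amounts = buckets.get((year, month))
        if amounts is None:
            value = None
        elif how == "total":
            value = sum(amounts)
        else:
            value = sum(amounts) / len(amounts)
        series.append(((year, month), value))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return series
```

The result is an equally spaced monthly series that can then be passed to descriptive, seasonal, or correlation analysis.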
Access to Financial and Economic Databases
The DATASOURCE procedure and the SAS/ETS data access interface LIBNAME Engines (SASECRSP, SASEFAME and SASEHAVR) provide seamless, efficient access to time series data from data files supplied by a variety of commercial and governmental data vendors.
The DATASOURCE procedure includes the following features:
support for data files distributed by the following data vendors:
DRI/McGraw-Hill
FAME Information Services
HAVER ANALYTICS
Standard & Poor's Compustat Service
Center for Research in Security Prices (CRSP)
International Monetary Fund
U.S. Bureau of Labor Statistics
U.S. Bureau of Economic Analysis
Organization for Economic Cooperation and Development (OECD)
ability to select the series, frequency, time range, and cross sections of extracted data
ability to create an output data set containing descriptive information on the series available in the data file
ability to read EBCDIC data on ASCII systems and vice versa
The SASECRSP interface LIBNAME engine includes the following features:
enables random access to time series data residing in CRSPAccess databases
provides a seamless interface between CRSP and SAS data processing
uses the LIBNAME statement to enable you to specify which time series you would like to read from the CRSPAccess database, and how you would like to perform selection
enables access to CRSP Stock, CRSP/COMPUSTAT Merged (CCM), or CRSP Indices Data
provides convenient formats, informats, and functions for CRSP and SAS datetime conversions
The SASEFAME interface LIBNAME engine includes the following features:
provides SAS and FAME users flexibility in accessing and processing time series data, case series, and formulas that reside in either a FAME database or a SAS data set
provides a seamless interface between FAME and SAS data processing
uses the LIBNAME statement to enable you to specify which time series you would like to read from the FAME database
enables you to convert the selected time series to the same time scale
works with the SAS DATA step to perform further subsetting and to store the resulting time series into a SAS data set
performs more analysis, if desired, either in the same SAS session or in another session at a later time
supports the FAME CROSSLIST function for subsetting via BYGROUPS using the CROSSLIST= option
you can use a FAME namelist that contains your BY variables for selection in the CROSSLIST
you can use a SAS input data set, INSET, that contains the BY selection variables along with the WHERE= option in your SASEFAME libref
supports the use of FAME in a client/server environment that uses the FAME CHLI capability on your FAME server
enables access to your FAME remote data when you specify the port number of the TCP/IP service that is defined for your FAME server and the node name of your FAME master server in your SASEFAME libref's physical path
The SASEHAVR interface LIBNAME engine includes the following features:
enables Windows users random access to economic and financial data residing in a HAVER ANALYTICS Data Link Express (DLX) database
the following types of HAVER data sets are available:
United States Economic Indicators
Specialized Databases
Financial Indicators
Industry
Industrial Co