11
The New SAS Engine For CRSP Benefits of using SAS engine for CRSP Setting up the interface between SAS and CRSP Examples of use Performance Benchmarks

The New SAS Engine For CRSP Benefits of using SAS engine for CRSP Setting up the interface between SAS and CRSP Examples of use Performance

Embed Size (px)

Citation preview

Page 1: The New SAS Engine For CRSP   Benefits of using SAS engine for CRSP   Setting up the interface between SAS and CRSP   Examples of use   Performance

The New SAS Engine For CRSP

Benefits of using SAS engine for CRSP Setting up the interface between SAS and

CRSP Examples of use Performance Benchmarks

Page 2: The New SAS Engine For CRSP   Benefits of using SAS engine for CRSP   Setting up the interface between SAS and CRSP   Examples of use   Performance

Benefits

Access native CRSPAccess Databases. Most SAS Dataset options work without modification

(including keep, rename, and drop). Native SAS datasets and SASECRSP dataset can be

used in the same statements. Procedures can directly access SASECRSP datasets. Supports random access on PERMNO. Supports global & PERMNO specific date restrictions. Supports functions and formats specific to CRSP

trading calendars. Often requires no intermediate files.

Saving coding time, CPU time, I/O, & Disk space.

Page 3: The New SAS Engine For CRSP   Benefits of using SAS engine for CRSP   Setting up the interface between SAS and CRSP   Examples of use   Performance

Setting up the Interface

After CRSP Stock Data and SAS/ETSTM 8.2 are installed independently. Two notes on installation -1. CRSPDB_SASCAL environment variable must be set to you latest CRSP data.2. SASECRSP is currently only available for Windows and HP/UX.

After that, the interface can be accessed using simple extensions to the SAS libname statement. For instance,

To access CRSP daily or monthly sample database respectivelylibname dstk sasecrsp ‘c:\dasample\' setid=10;libname mstk sasecrsp ‘c:\masample\' setid=20;

The data files being accessed are the same ones that are accessed by FORTRAN, ‘C’, ts_print, or stk_print. Therefore disk space is only needed for one installation regardless of how users access the data. This saves not only disk space, but labor each time new data is shipped.

Page 4: The New SAS Engine For CRSP   Benefits of using SAS engine for CRSP   Setting up the interface between SAS and CRSP   Examples of use   Performance

Printing HeadersExample 1: Use of keep statement, and direct use of the proc print procedure

proc print data = mstk.stkhead (keep = permno hcomnam begdt enddt);

Obs PERMNO BEGDT ENDDT HCOMNAM

1 10107 19860331 20010731 MICROSOFT CORP 2 10823 19251231 19980529 KEYSPAN ENERGY CORP 3 12079 19251231 20010731 GENERAL MOTORS CORP 4 12490 19251231 20010731 INTERNATIONAL BUSINESS MACHS COR 5 14218 19251231 19890428 R J R NABISCO INC 6 14322 19251231 20010731 SEARS ROEBUCK & CO 7 15069 19251231 20010731 U S X MARATHON GROUP 8 18948 19300430 19961231 ROCKWELL INTERNATIONAL CORP 9 24360 19500731 20010731 KEYSPAN CORP10 28353 19620131 20010731 RALSTON PURINA GROUP11 52070 19710331 19841031 ELECTRONIC DATA SYS CORP12 66931 19841031 19960628 GENERAL MOTORS CORP13 68451 19851231 20010731 GENERAL MOTORS CORP14 75228 19850830 20010731 IVAX CORP15 75390 19881031 19990831 DEVON ENERGY CORP16 76597 19910328 20001229 NABISCO GROUP HOLDING CORP17 78946 19930226 19970530 DEAN WITTER DISCOVER & CO18 79323 19930630 20010731 ALLSTATE CORP19 79348 19930831 19950531 RALSTON CONTINENTAL BAKING GROUP20 81191 19950131 20001229 NABISCO HOLDINGS CORP21 83596 19960628 20010731 ELECTRONIC DATA SYSTEMS CORP22 84381 19961231 20010731 ROCKWELL INTERNATIONAL CORP NEW23 85349 19971031 20010731 ARVINMERITOR INC24 86496 19990129 20010731 CONEXANT SYSTEMS INC25 87137 19990831 20010731 DEVON ENERGY CORP NEW

Page 5: The New SAS Engine For CRSP   Benefits of using SAS engine for CRSP   Setting up the interface between SAS and CRSP   Examples of use   Performance

Average Returns Example 2: Demonstrates date restriction on a libname, direct use of proc means, and output of native SAS datasets

It is easy to restrict data to a specified date range before passing it to SAS for calculation.

To Calculate Average monthly returns for all the securities in the sample database for the period of January, 1996 – December, 2000, use the following commands.

libname mth5yr sasecrsp 'c:\masample\' range='19960131-20001229' setid=20;

proc means data=mth5yr.ret noprint; var ret; by permno; output mean=avgret out=avgret; run;

Page 6: The New SAS Engine For CRSP   Benefits of using SAS engine for CRSP   Setting up the interface between SAS and CRSP   Examples of use   Performance

Works with Proc SQLExample 3: Demonstrates SASECRSP and native SAS dataset in the same statement.

/* Join with the company names and restrict to securities with exactly five years (60 months) of non-missing returns, and sort output by the avgret. */

proc sql;select a.permno, a.hcomnam, b.avgretfrom mth5yr.stkhead a, avgret bwhere a.permno=b.permno order and b._FREQ_ = 60 by avgret;run;

PERMNO Company Name Header Returns

__________________________________________________________________________ 14322 SEARS ROEBUCK & CO 0.004601 12079 GENERAL MOTORS CORP 0.010279 28353 RALSTON PURINA GROUP 0.013447 15069 U S X MARATHON GROUP 0.013855 79323 ALLSTATE CORP 0.018369 68451 GENERAL MOTORS CORP 0.022518 24360 KEYSPAN CORP 0.024243 75228 IVAX CORP 0.026936 12490 INTERNATIONAL BUSINESS MACHS COR 0.027325 10107 MICROSOFT CORP 0.031976

Page 7: The New SAS Engine For CRSP   Benefits of using SAS engine for CRSP   Setting up the interface between SAS and CRSP   Examples of use   Performance

Performance MetricsExample 4: The SASECRSP engines performs at near CRSPAccess ‘C’ native speed. This was run on the full December 2000 daily database on a local drive of Pentium III 733 MHz running Windows 2000. This example reads over 50 million return observations, and calculates the minimum, maximum, mean, and standard deviation in approximately 2 ½ minutes.

libname dlyall sasecrsp 'c:\da200012\' setid=10;proc means data=dlyall.ret;var ret;run;

N Mean Std Dev Minimum Maximum___________________________________________________________________51398971 0.000845599 0.0442346 -0.9716981 12.7777777___________________________________________________________________

(Reads 205 MB and over 51 million observations in 2 minutes and 34 seconds)

WARNING: Defaulting to selecting all PERMNOs in CRSP database.NOTE: There were 52365201 observations read from the dataset DLYALL.RET.NOTE: PROCEDURE MEANS used: real time 2:34.78 cpu time 2:31.12

Page 8: The New SAS Engine For CRSP   Benefits of using SAS engine for CRSP   Setting up the interface between SAS and CRSP   Examples of use   Performance

Performance MetricsExample 5: This example demonstrates that random access by PERMNO is very efficient. Accessing the same underlying 51,000,000 observation dataset as the previous example, accessing just a single permno, takes just a faction of a second.

libname dlyibm sasecrsp 'c:\da200012\' setid=10 permno=12490;proc means data=dlyibm.ret;var ret;run;

N Mean Std Dev Minimum Maximum__________________________________________________________________9691 0.000533690 0.0159335 -0.2296296 0.1316364__________________________________________________________________ (Reads one security (IBM) and the 9,695 observations in 0.07 seconds)

NOTE: There were 9695 observations read from the dataset DLYIBM.RET.NOTE: PROCEDURE MEANS used: real time 0.07 seconds cpu time 0.04 seconds

Page 9: The New SAS Engine For CRSP   Benefits of using SAS engine for CRSP   Setting up the interface between SAS and CRSP   Examples of use   Performance

Performance MetricsExample 6: This example demonstrates how a full pass of the header data can be made to restrict the data and then used to create an input list with date ranges. This input list is then used to restrict access to the 51,000,000 observation dataset from the first performance example, and it access just the securities and dates requested. Creating the list of securities, takes less than two seconds. The restricted list takes a little more than ½ minute.

/* This example has two parts. First it does a subset to just companies that delist due to mergers and creates a dataset of PERMNO, delist date, and 200 trading days before the delist date. The second part on the next screen, then then subsets based on that universe of perms */

data mrgdt (keep=permno date1 date2);set dlyall.stkhead; where dlstcd >= 200 and dlstcd <= 299; crspenddt = crspdicd(enddt,0); /* convert enddt to a CRSP calendar date */ date1 = crspdcid(crspenddt - 200); /* calculate 200 trading days prior to enddt

and store in YYYYMMDD format */ if date1 ne . and date1 >= begdt; /* subsetting if - ignores companies that

do not have 200 trading days of CRSP data available. */ date2 = enddt;run;

NOTE: There were 7790 observations read from the dataset DLYALL.STKHEAD.

WHERE ((dlstcd>=200 and dlstcd<=299));

NOTE: The dataset WORK.MRGDT has 7640 observations and 3 variables.

NOTE: DATA statement used:

real time 1.21 seconds

cpu time 0.42 seconds

Page 10: The New SAS Engine For CRSP   Benefits of using SAS engine for CRSP   Setting up the interface between SAS and CRSP   Examples of use   Performance

libname dlymrg sasecrsp 'c:\da200012\' setid=10 inset=‘mrgdt,PERMNO,PERMNO,date1,date2’;

proc means data=dlymrg.ret;

var ret;

run;

N Mean Std Dev Minimum Maximum

______________________________________________________

1532194 0.0022860 0.0400804 -0.8750000 2.5000000

ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ

NOTE: There were 1535640 observations read from the dataset DLYMRG.RET.

NOTE: PROCEDURE MEANS used:

real time 34.18 seconds

cpu time 11.19 seconds

Performance MetricsExample 6 – Continued

A new libname needs to be created using the mrgdt dataset created in the previous step. Any datasets accessed using the dlymrg libname will automatically be restricted using random access to PERMNOs in the mrgdt dataset, and then after being read, but before being processed by SAS, restricted to the security specific date range defined by date1 and date2.

Page 11: The New SAS Engine For CRSP   Benefits of using SAS engine for CRSP   Setting up the interface between SAS and CRSP   Examples of use   Performance

11 04/19/23

CRSP User Support

Phone: 773.834.1025

9:30 A.M. – 2:00 P.M. Central Time Monday-Friday, Business Days

Email: [email protected]

User Support Web Page Form:

http://www.crsp.uchicago.edu/support/tech_hlp.html