
OPTIONS option; - University Of Maryland · Options Statement OPTIONS option...; Example: options nocenter nodate nonumber ps = 20000 ls=72; There are many options , these are ones


Options Statement

OPTIONS option...;

Example:

options nocenter nodate nonumber ps=20000 ls=72;

There are many options; these are the ones I typically use for SAS on a Unix-based computer. See the SAS manual for more complete information and many more options.
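As a quick gloss on the example above, the same statement can be written with one option per line and a comment on each. The comments give the usual meaning of each option; the values are simply the ones used in the example.

```sas
options
    nocenter     /* left-align procedure output instead of centering it */
    nodate       /* do not print the date and time at the top of each page */
    nonumber     /* do not print page numbers */
    ps=20000     /* PAGESIZE: lines per page; a large value avoids most page breaks */
    ls=72;       /* LINESIZE: maximum number of characters per output line */
```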


Filename Statement

FILENAME fileref 'filename';

The fileref provides SAS a name to use when referencing the external data file. The filename provides the complete path and filename of the external file that the fileref points to. The first example uses a Unix style of pointing to a file; the second is for a DOS-based operating system.

Example:

filename file1 '/staff/research/gdead/data/cansum96/hstap1_96.tap';
filename readraw1 'c:\win95\sas\data\rawdata1.dat';


Libname Statement

LIBNAME libref 'directory';

The libname statement defines a directory to be used by SAS to read or create a permanent SAS data file. The libref is the first part of a two-part name. The first example is for Unix, the second for DOS.

Example:

libname perm '/staff/research/gdead/data';

libname dir1 'c:\sas\data\';


Data Statement

DATA SASdataset (options);

SASdataset provides a name for the SAS dataset being created. If you use a one-level name, the dataset is temporary and is deleted after your SAS program ends. If you use a two-level name, SAS attempts to create a permanent SAS dataset. Two-level names use a libref and a dataset name separated by a period. Here are two examples.

Examples:

data fecdata;

libname perm 'c:\win95\desktop\';
data perm.wbank97;

The second example tells SAS to create a permanent SAS data file called "c:\win95\desktop\wbank97.XXX". This data file will reflect all the programming in the data step. The XXX is an extension that can differ across versions of SAS and operating systems.


Infile Statement

INFILE fileref;

The infile statement is used by SAS to read an external raw data file. The fileref used here must match the one defined in your filename statement. In other words, when you read raw data in SAS, you need both a filename and an infile statement, and the filerefs must be the same.

Example:

infile file1;
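To show how the filename and infile statements fit together, here is a minimal sketch of a complete data step. The file path, the dataset name grades, and the variable layout are hypothetical placeholders; substitute your own.

```sas
/* Hypothetical file location; point this at your own raw data file. */
filename file1 '/path/to/rawdata.dat';

data grades;                  /* one-level name, so the dataset is temporary */
    infile file1;             /* same fileref as in the filename statement */
    input v1 v2 v3 name $;    /* assumed layout: three numbers, then a name */
run;
```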


Input Statement

INPUT specification;

The input statement describes the location and type of data to SAS for each input record. In other words, this is where you define your variables for SAS. This statement can take many forms, and is too complicated to fully explore in a class like this one. If you want to be a SAS programmer, this is an excellent area to study more thoroughly.

We discussed two types of input statements: list (or free-format) and column. List format assumes that 1) variables are separated by at least one blank character, 2) periods represent missing values, and 3) character variables have a maximum length of eight columns. Column input assumes that variables are 1) in the same columns on all the data records, and 2) in a standard numeric or character form.

Example:

List Format:

input v1 v2 v3 name $;

Column Format:

input v1 1-2 v2 4 v3 8-12 .2 name $ 15-25;
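The two input styles can be tried without an external file by placing the data in the program itself. The sketch below uses made-up records and hypothetical dataset names (listdemo, coldemo). In the column version, the .2 after the column range tells SAS to assume two decimal places when the field contains no decimal point, so 12345 is read as 123.45.

```sas
/* List input: values separated by blanks, a period marks a missing value */
data listdemo;
    input v1 v2 v3 name $;
    datalines;
12 3 123.45 Smith
7 . 88.10 Jones
;
run;

/* Column input: each value must sit in its assigned columns */
data coldemo;
    input v1 1-2 v2 4 v3 8-12 .2 name $ 15-25;
    datalines;
12 3   12345  Smith
 7 1    8810  Jones
;
run;
```

The data lines start in column 1 on purpose: with column input, any extra indentation would shift every value out of its expected columns.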


Label Statement

LABEL variable = 'label';

Variable labels attach a descriptive tag to variables that are printed by many SAS procedures. You can enter several distinct labels, or put them in a long list.

Example:

label v1 = 'Student Name';

label v1 = 'Student Name'
      v2 = 'Examination One Grade'
      v3 = 'Examination Two Grade'
      v4 = 'Final Average Grade';


Comment Statements

*comment;
or
/* comment */

Comments are used to add descriptive text to your programs or to make SAS skip or pass over certain parts of your program. Use them copiously to document your computer code, as well as for debugging.

Examples:

total = (v1 + v2 + v3); /* Sum the three examination grades */

/* Sum the three examination grades */
total = (v1 + v2 + v3);

*This line will not be executed;

/****************************************************
 *                                                  *
 * Add a nice box to make your comments stand out!  *
 *                                                  *
 ****************************************************/


IF-THEN/ELSE

IF expression THEN statement; <ELSE statement;>

Example:

if 0 <= age < 10 then agegroup = 0;
else if 10 <= age < 20 then agegroup = 10;
else if 20 <= age < 30 then agegroup = 20;
else if 30 <= age < 40 then agegroup = 30;
else if 40 <= age < 50 then agegroup = 40;

agegroup = 40;
if age < 40 then agegroup = 30;
if age < 30 then agegroup = 20;
if age < 20 then agegroup = 10;
if age < 10 then agegroup = 0;


SELECT

SELECT <(select-expression)>;
   WHEN-1 (when-expression-1 <, ... when-expression-n>) statement;
   <... WHEN-n (when-expression-1 <, ... when-expression-n>) statement;>
   <OTHERWISE statement;>
END;

Example:

select (a);
   when (1) x=x*100;
   when (2);
   when (3, 4, 5) x=x*100;
   otherwise;
end;

select;
   when (age >= 0 & age < 10) agegroup = 0;
   when (age >= 10 & age < 20) agegroup = 10;
   when (age >= 20 & age < 30) agegroup = 20;
   when (age >= 30 & age < 40) agegroup = 30;
   when (age >= 40 & age < 50) agegroup = 40;
   otherwise;
end;


PROC FORMAT

PROC FORMAT;
   VALUE name
      range-1 = 'formatted-value-1'
      range-2 = 'formatted-value-2'
      . . .
      range-n = 'formatted-value-n';

Example:

proc format;
   value agefmt
       0-9  = ' 0'
      10-19 = '10'
      20-29 = '20'
      30-39 = '30'
      40-49 = '40';

proc print;
   var age;
   format age agefmt.;


Logical Expressions

This method of recoding data takes advantage of the fact that SAS evaluates logical expressions as either true or false, 1 or 0. For example, if a respondent's age was coded as equal to 13 then:

(10 <= age < 20)

evaluates to "true" or 1.

Example:

Agegroup = 10*(10 <= age < 20) + 20*(20 <= age < 30) + 30*(30 <= age < 40) + 40*(40 <= age < 50);
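The recode can be checked on a handful of made-up ages. In the sketch below (the dataset name recode is hypothetical), exactly one of the four comparisons is true for any age from 10 to 49, so the sum collapses to that group's value. Ages 0 to 9 give 0 because every comparison is false, and so does any age of 50 or more, which is worth keeping in mind when using this trick.

```sas
data recode;
    input age;
    /* each parenthesized comparison is 1 when true and 0 when false */
    agegroup = 10*(10 <= age < 20) + 20*(20 <= age < 30)
             + 30*(30 <= age < 40) + 40*(40 <= age < 50);
    datalines;
5
13
27
45
;
run;

/* agegroup comes out as 0, 10, 20 and 40 for the four records */
proc print data=recode;
run;
```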


Subsetting Records

DELETE;

The delete statement is typically used with an if/then statement to selectively remove observations from the data set created in the data step.

Example:

if sex = 'M' then delete;

if contrib <= 1000 then delete;


Subsetting Variables

DROP variables;

KEEP variables;

To keep the dataset created in the data step small, you can choose to use a subset of all the variables in your dataset. Do not use these two statements in the same data step since they lead to an obvious conflict. However, the drop statement is preferred if fewer variables are being dropped than kept. The keep statement is preferred if fewer variables are being kept than dropped. In the following examples, assume a dataset with five variables, v1 to v5. Both statements produce the same dataset with variables v1, v2, v3. Dropping variables is preferred since there is less typing.

Examples:

keep v1 v2 v3;

drop v4 v5;
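A small self-contained sketch shows the two statements in context. The dataset names (five, viakeep, viadrop) and the data values are made up for illustration; both of the later data steps end up with only v1, v2 and v3.

```sas
data five;
    input v1 v2 v3 v4 v5;
    datalines;
1 2 3 4 5
6 7 8 9 10
;
run;

data viakeep;
    set five;
    keep v1 v2 v3;    /* name the variables to retain */
run;

data viadrop;
    set five;
    drop v4 v5;       /* name the variables to remove */
run;
```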


Proc Contents

PROC CONTENTS options;

The contents procedure prints descriptions of the contents (e.g. number of observations, variable names, variable formats, etc.) for a SAS dataset. The most used option is position, which produces a second listing of variables, sorted by position in the dataset. The default is to sort the variables alphabetically.

Examples:

proc contents;

proc contents position;


Proc Print

PROC PRINT options;
   VAR variables;
   ID variables;
   BY variables;
   PAGEBY byvariable;
   SUM variables;
   SUMBY byvariable;
   LABEL variable = 'variable title';

The print procedure is quite powerful for producing reports, but it is most used to make lists of the values of variables. The sum statement, for example, provides totals for the numeric variables it names and, combined with a by statement, subtotals for each by group. Remember, when using a by statement, the dataset must be sorted by the variables named in the by statement.

Examples:

proc print;
   var v1 v2 v3 v4;

proc sort;
   by sex;
proc print;
   var v1 v2 v3 v4;
   by sex;
   sum v1;


PROC FREQ

PROC FREQ options;
   TABLES requests / options;
   WEIGHT variable;
   BY variables;

The frequency procedure produces one-way to n-way frequency and crosstabulation tables.

Examples:

proc freq; tables race sex;

proc freq; tables race*sex;


Arrays

ARRAY array-name<{subscript}> <$> <length> <<array-elements> <(initial-values)>>;

array-name names the array for SAS to reference. When subsequently used, SAS substitutes one of the array elements for the array name, based on the index variable.

subscript is either an asterisk (*), a number, or a range of numbers used to describe the number and arrangement of elements in an array.

$ indicates that the array elements are characters.

length indicates the length of array elements, if they have not been previously assigned.

array-elements names the elements that make up the array.

Examples:

array simple{3} red green yellow;

array test{3} t1 t2 t3 (90 80 70);

array test2{*} $ a1 a2 a3 ('A' 'B' 'C');

array miss{*} _numeric_;
do i=1 to dim(miss);
   if miss{i} < 0 then miss{i} = .;
end;
drop i;


DO STATEMENTS

DO;
   more SAS statements
END;

The DO statement is often combined with the if/then/else programming construct to designate a group of statements to be executed depending on the resolution of the logical expression(s) listed in the if statement.

Examples:

if office = 'H' then do;
   nterm = term + 2;
end;
else nterm = term + 6;


DO STATEMENT, Iterative

DO index-variable=specification-1<,. . . specification-n>;
   more SAS statements
END;

The iterative do statement causes the SAS statements between the DO and END statements to be executed repetitively based on the value of the index variable.

Examples:

do i=1 to 10;
   SAS statements
end;

do i=1 to 10 until(flag);
   SAS statements
   if expression then flag=1;
   SAS statements
end;

array miss{*} _numeric_;
do i=1 to dim(miss);
   if miss{i} < 0 then miss{i} = .;
end;
drop i;
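For a loop that can be run as is, the following sketch builds a small table of squares. It relies on the OUTPUT statement, which is not covered in these notes: it writes the current values to the dataset on each pass, so the loop produces five observations instead of one. The dataset and variable names are made up.

```sas
data squares;
    do i = 1 to 5;
        isq = i**2;   /* 1, 4, 9, 16, 25 */
        output;       /* write one observation per pass through the loop */
    end;
run;

proc print data=squares;
run;
```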


SET Statement

SET <data-set-name-1> <data-set-name-n>;

The SET statement is used to access SAS system data files. These are files that have been saved in a proprietary binary format readable by the SAS software. The data files can be either temporary SAS files, created in a data step, or permanent data files, existing on disk somewhere. The SET statement can also be used to concatenate data files. Concatenation means to append one file to another (bottom of the first file to the top of the second file). More than one file may be concatenated.

Examples:

This example creates a temporary SAS data file called "file1" in the first data step. Once created, "file1" is copied to a new temporary SAS data file called "file2" in the second data step. A new variable is also created in the file "file2."

data file1;
input x y;
cards;
1 23
2 34
3 78
;
run;

data file2;
set file1;
ysqr=y**2;
run;

This example uses the libname statement to associate the directory where a permanent SAS data set is located with the SAS internal name "perm." Then, the SET statement is used to read a file named "census" into a temporary SAS data file called "step1." The actual name of the file on disk will be "census.XXX" where XXX is an extension that is different for different operating systems.

libname perm '/staff/research/gdead/data';
data step1;
set perm.census;
run;

Finally, the SET statement can be used to append or concatenate SAS data files. In this case, three files are appended together: a permanent file (file1) and two temporary files (file2 and file3).

data append;
set perm.file1 file2 file3;
run;


Merge Statement

MERGE data-set-name-1 <data-set-options>
      data-set-name-2 <data-set-options>
      data-set-name-n <data-set-options>;

The MERGE statement joins corresponding observations from two or more SAS data sets into single observations in a new SAS data set. The way the SAS system joins the observations depends on whether a BY statement accompanies the MERGE statement.

One-to-One Merging

If the SAS data sets you are merging have the same number of observations in exactly the same order, you do not need to use a BY statement. For example, assume that you have three SAS data sets, each containing data based on all the states. In this example, the first data file contains information on personal income, the second on population data, and the third on welfare expenditures and poverty rates. You could merge all of them with the following SAS code:

data merged;
merge income popdata poverty;
run;

Note that all the data files are temporary SAS data files. You could also "mix-and-match" temporary and permanent data files.
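To make the one-to-one case concrete, here is a minimal sketch with two states per dataset. The income figures are the per capita values for AK and AL used elsewhere in these notes; the population and poverty values, and the variable names, are placeholders rather than real data.

```sas
data income;
    input state $ pcincome;
    datalines;
AK 19051
AL 12846
;
run;

data popdata;
    input population;    /* placeholder values, not real data */
    datalines;
100
200
;
run;

data poverty;
    input povrate;       /* placeholder values, not real data */
    datalines;
1.5
2.5
;
run;

data merged;
    merge income popdata poverty;   /* no BY statement: rows are paired by position */
run;
```

Because rows are paired purely by position, a one-to-one merge silently mismatches records if any of the datasets is in a different order.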


Match-Merging

Match-merging assumes that you have a key variable, a variable common to all the data sets that are to be merged. You do not need to have the same number of cases in both data sets. Consider the following two data sets:

Candidate Name              Office  State  Total Contributions
LINCOLN, GEORGIANNA         H       AK     239874
YOUNG, DONALD E             H       AK     1131527
STEVENS, THEODORE F (TED)   S       AK     2737381
BACHUS, SPENCER T III       H       AL     427395
CRAMER, ROBERT E "BUD" JR   H       AL     965365
EVERETT, TERRY              H       AL     472643
HILLIARD, EARL FREDERICK    H       AL     229731
LITTLE, T D (TED)           H       AL     777695
DUPWE, WARREN E             H       AR     537311
HENRY, ANN                  H       AR     451391
BONO, SONNY                 H       CA     463422
CAMPBELL, THOMAS J          H       CA     1668897
DELLUMS, RONALD V           H       CA     412657
DIXON, JULIAN C             H       CA     93120
DORNAN, ROBERT KENNETH      H       CA     747181

State  Income Per Capita
AK     19051
AL     12846
AR     12216
CA     18763

The obvious matching key is the variable STATE. Match-merging allows us to produce the following data set:

Candidate Name              Office  State  Total Contributions  Income Per Capita
LINCOLN, GEORGIANNA         H       AK     239874               19051
YOUNG, DONALD E             H       AK     1131527              19051
STEVENS, THEODORE F (TED)   S       AK     2737381              19051
BACHUS, SPENCER T III       H       AL     427395               12846
CRAMER, ROBERT E "BUD" JR   H       AL     965365               12846
EVERETT, TERRY              H       AL     472643               12846
HILLIARD, EARL FREDERICK    H       AL     229731               12846
LITTLE, T D (TED)           H       AL     777695               12846
DUPWE, WARREN E             H       AR     537311               12216
HENRY, ANN                  H       AR     451391               12216
BONO, SONNY                 H       CA     463422               18763
CAMPBELL, THOMAS J          H       CA     1668897              18763
DELLUMS, RONALD V           H       CA     412657               18763
DIXON, JULIAN C             H       CA     93120                18763
DORNAN, ROBERT KENNETH      H       CA     747181               18763


The following SAS code produces the desired results:

data cand;
set perm.canddata;
proc sort;
   by state;
run;

data state;
set perm.statdata;
proc sort;
   by state;
run;

data perm.merged;
merge cand state;
   by state;
proc contents;
run;

This program reads in two permanent SAS data sets (canddata and statdata), creating two temporary SAS data sets (cand and state). The two temporary SAS data sets (cand and state) are merged together and saved as a permanent SAS data set (merged).
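The same match-merge can be tried without the permanent files by typing a few rows of each table into the program. This is only a sketch: it uses four of the candidates from the tables above, and the variable names (name, office, contrib, pcincome) and the temporary output dataset merged are assumptions made for the illustration. State is read from fixed columns in both data steps so that it has the same length in each dataset.

```sas
data cand;
    /* name in columns 1-20, office in 22, state in 24-25, then contributions */
    input name $ 1-20 office $ 22 state $ 24-25 contrib;
    datalines;
YOUNG, DONALD E      H AK 1131527
EVERETT, TERRY       H AL 472643
HENRY, ANN           H AR 451391
BONO, SONNY          H CA 463422
;
run;

data state;
    input state $ 1-2 pcincome;
    datalines;
AK 19051
AL 12846
AR 12216
CA 18763
;
run;

/* both datasets must be sorted by the key before a match-merge */
proc sort data=cand;
    by state;
run;

proc sort data=state;
    by state;
run;

data merged;
    merge cand state;
    by state;          /* pcincome is attached to every candidate in that state */
run;
```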


Writing and Reading Export Files -- The SAS Export Engine

Something very useful to understand is how to write a SAS data file (permanent or temporary) as an export file. An export file is a SAS data set written to disk in such a way that it may be transported (e.g. by diskette, tape, ftp, or some other way) to a completely different computer, operating system, or version of SAS, or any combination of these things.

To create a transport file you use a combination of a special LIBNAME statement and PROC COPY. Consider the following SAS code:

SAS Program to Export a SAS Data Set

libname read '~gdead/fall97/sas_class';
libname write xport '~gdead/fall97/sas_class/cand9296.xpt';

data _null_;
proc copy in=read out=write memtype=data;
   select final;
run;

The first libname statement refers to a location on a hard drive that contains at least one permanent SAS data set. In this example, it points to a subdirectory on my account. Remember, you will need to point SAS to locations based on your account and needs. The second libname statement refers to a new file, the transport or export file. Notice the xport option. This tells SAS to use the xport engine to write this file.

PROC COPY will copy either all SAS data files, or selected ones, from the libname "read" and write them to the libname "write."

Once the export file has been moved to another computer system, you need to reverse the process.

SAS Program to Import a SAS Data Set

options nocenter nonumber nodate formdlim=' ' ps=20000;

libname write '~gdead/fall97/sas_class';
libname read xport '~gdead/fall97/sas_class/cand9296.xpt';

data _null_;
proc copy in=read out=write memtype=data;
run;

In this program, the transport file cand9296.xpt (associated with "read") will be converted to a SAS system file located in the directory associated with "write."