Upload
leonard-horton
View
222
Download
0
Embed Size (px)
Citation preview
YET ANOTHER TIPS, TRICKS,TRAPS, TECHNIQUES
PRESENTATION:
A Random Selection of What I Learned From 15+
Years of SAS Programming
John Pirnat Kaiser Permanente
Oct 15, 2009
2
ARE THESE EQUIVALENT?
data allschools1;set school1 school2;keep id lname fname;run;
#1
#2
data allschools2(keep = id lname fname);set school1 school2;run;
#3
data allschools3;set school1 (keep = id lname fname)school2 (keep = id lname fname);run;
3
EXAMPLE OF WHY HOW YOU WRITE THE CODE MIGHT MATTER:
data school1;
input id 8. region $1. lname $25. fname $10. pass_math $1. pass_eng $1. pass_span $1.;
datalines;
72186 4Mouse Minnie YYN
51682 8Duck Donald NNN
368882 2Hall Annie YYY
45111 2Sawyer Tom NYN
;
run;data school2;
input id 8. region 8. lname $25. fname $10.pass_math $1. pass_eng $1. pass_span $1.;
datalines;
46631 3 Lane Lois NYN
866322 7 Mouse Mickey YYN
63358 6 Kent Clark NYY
42643 1 Bunker Edith YYY
;
run;
4
CODE #1
99 data allschools1;
100 set school1 school2;
ERROR: Variable region has been defined as both character and numeric.
101 keep id lname fname;
102 run;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.ALLschoolS1 may be incomplete. When this step was stopped there
were 0 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
5
CODE #2
105 data allschools2
106 (keep = id lname fname);
107 set school1 school2;
ERROR: Variable region has been defined as both character and numeric.
108 run;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.ALLschoolS2 may be incomplete. When this step was stopped there
were 0 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
6
CODE #3
110 data allschools3;
111 set school1 (keep = id lname fname)
112 school2 (keep = id lname fname);
113 run;
NOTE: There were 4 observations read from the data set WORK.school1.
NOTE: There were 4 observations read from the data set WORK.school2.
NOTE: The data set WORK.ALLschoolS3 has 8 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
7
IN CODE #3 EXAMPLE, ONLY THE COLUMNS IN THE KEEP COMMAND ARE BEING LOADED INTO PROGRAM DATA VECTOR (PDV)
IN CODE #1 AND #2 EXAMPLES, ALL COLUMNS OF THE TWO SCHOOL DATASETS ARE BEING READ IN THE PDV, BUT ONLY COLUMNS INCLUDED IN KEEP STATEMENT ARE READ TO THE OUTPUT DATASET
8
DON’T “FREQ” OUT
proc freq data = school1;tables pass_math pass_eng pass_span /out = grade_results;run;
proc print data = grade_results;run;
The SAS System pass_ Obs span COUNT PERCENT
1 N 3 75 2 Y 1 25
9
USE ODS INSTEAD
proc freq data = school1;tables pass_math pass_eng pass_span;ods output OneWayFreqs = grade_results;run;
proc print data = grade_results;var pass_eng pass_math pass_span frequency percent ;run;
The SAS System pass_ pass_ Obs pass_eng math span Frequency Percent 1 N 2 50.00 2 Y 2 50.00 3 N 1 25.00 4 Y 3 75.00 5 N 3 75.00 6 Y 1 25.00
SAS DATASET NAME
ODS TABLE NAME
10
YOU CAN BE TOO CLEVER IN YOUR CODE
data school1;
input schoolid 8. lname $25. fname $10. pass_math $1. pass_eng $1. pass_span $1.;
datalines;
72186 Mouse Minnie YYN
51682 Duck Donald NNN
368882 Hall Annie YYY
45111 Sawyer Tom NYN
;
run;
data school2;
input id 8. lname $25. fname $10. pass_math $1. pass_eng $1. pass_span $1.;
datalines;
46631 Lane Lois NYN
866322 Mouse Mickey YYN
63358 Kent Clark NYY
42643 Bunker Edith YYY
;
run;
11
lname fname id schoolid
Mouse Minnie 72186 72186
Duck Donald 72186 51682
Hall Annie 72186 368882
Sawyer Tom 72186 45111
Lane Lois 46631 .
Mouse Mickey 866322 .
Kent Clark 63358 .
Bunker Edith 42643 .
HUH?
data allschools;set school1 school2;if id = . then id = schoolid;run;
proc print data = allschools noobs;var lname fname id schoolid;run;
12
“When variables are read with a SET, MERGE, or UPDATE statement, the SAS System sets the value to missing only before the first iteration of the DATA step…Thereafter, the variables retain their values until new values become available …”
-SAS® Language
Reference
Version 6
First Edition
13
ONE WORKAROUND:
data allschools;set school1 (rename = (schoolid = id))school2;run;
proc print data = allschools;var lname fname id;run;
Obs lname fname id
1 Mouse Minnie 72186
2 Duck Donald 51682
3 Hall Annie 368882
4 Sawyer Tom 45111
5 Lane Lois 46631
6 Mouse Mickey 866322
7 Kent Clark 63358
8 Bunker Edith 42643
14
1. TS486 – Quick Reference Guide to SAS Functions Informats Formats
Updated at:
http://www.sascommunity.org/wiki/TS_486_Functions,_Informats,_and_Formats
data school1;
input id 8. region $1. lname $25. fname $10. pass_math $1. pass_eng $1. pass_span $1.;
datalines;
72186 4Mouse Minnie YYN
51682 8Duck Donald NNN
368882 2Hall Annie YYY
45111 2Sawyer Tom NYN
;
run;
2. =:
QUICK HITTERS:
15
data test1;set school1;if lname =: 'M';run;
proc print data = test1 noobs;var lname;run;
lname
Mouse
data test2;set school1;if lname>=: 'M';run;
proc print data = test2 noobs;var lname;run;
lname
Mouse
Sawyer
16
data test3;set school1;if lname<=: 'M';run;
proc print data = test3 noobs;var lname;run;
Lname
Mouse DuckHall
17
OFFBEAT MACRO USES
%macro skip;
Lotsa comments /* */ in code
%mend;
Use when debugging program and don’t want to run heavily commented code
Let %abc =
Big chunk of code that you will repeat often ;
18
AND THEN THERE IS PROC IMPORT
How many of you receive spreadsheets like this?
SOUTHLAND SCHOOL
ID LNAME FNAMEPASS_MATH
PASS_ENG
PASS_SPAN
38793 NEWMAN N N N
5763 GEKKO GORDON Y Y Y
414452 SPAULDING GEOFFREY Y Y N
916547 O'HARA SCARLETT N N Y
43256 MARPLE JANE Y N Y
602345 CHARLES NORA N Y Y
402395 MASON PERRY Y Y Y
19
proc importdatafile = 'P:\My Documents\CO DAY 2009\SOUTHLAND SCHOOL.xls'out = school3dbms = excelreplace;run;
proc print data = school3;run;
SOUTHLAND_ Obs SCHOOL F2 F3 F4 F5 F6 1 . 2 . LNAME FNAME PASS_MATH PASS_ENG PASS_SPAN 3 38793 NEWMAN N N N 4 5763 GEKKO GORDON Y Y Y 5 414452 SPAULDING GEOFFREY Y Y N 6 916547 O'HARA SCARLETT N N Y 7 43256 MARPLE JANE Y N Y 8 602345 CHARLES NORA N Y Y 9 . MASON PERRY Y Y Y
20
CORRECTION ATTEMPT 1proc importdatafile = 'P:\My Documents\CO DAY 2009\SOUTHLAND SCHOOL.xls'out = school3dbms = excelreplace;mixed = yes;run;
proc print data = school3;run;
SOUTHLAND_ Obs SCHOOL F2 F3 F4 F5 F6 1 2 ID LNAME FNAME PASS_MATH PASS_ENG PASS_SPAN 3 38793 NEWMAN N N N 4 5763 GEKKO GORDON Y Y Y 5 414452 SPAULDING GEOFFREY Y Y N 6 916547 O'HARA SCARLETT N N Y 7 43256 MARPLE JANE Y N Y 8 602345 CHARLES NORA N Y Y 9 402395 MASON PERRY Y Y Y
INSERT MIXED COMMAND
21
TO GET MIXED RESULTS BEYOND 8 OBS NEED TO ADJUST YOUR WINDOWS REGISTRY – I WOULDN’T TRY IT EVEN IF I COULD
CORRECTION ATTEMPT 2proc importdatafile = 'P:\My Documents\CO DAY 2009\SOUTHLAND SCHOOL NEW.xls'out = school3dbms = excel replace;mixed = yes;run;proc print data = school3;run;
PASS_ PASS_ Obs ID LNAME FNAME MATH PASS_ENG SPAN 1 38793 NEWMAN N N N 2 5763 GEKKO GORDON Y Y Y 3 414452 SPAULDING GEOFFREY Y Y N 4 916547 O'HARA SCARLETT N N Y 5 43256 MARPLE JANE Y N Y 6 602345 CHARLES NORA N Y Y 7 3958475 DOODY HOWDY Y Y N 8 5457346 BOND JAMES N Y Y 9 . MASON PERRY Y Y Y
22
THIS COULD HAPPEN TO YOU IF YOU ARE NOT CAREFUL – ANOTHER PROC IMPORT EXAMPLE
proc importdatafile = 'P:\My Documents\CO DAY 2009\ACCEPTED_PROCEDURES.xls'out = proceduresdbms = excelreplace;run;
data patients;input id 8. lname $25. fname $10. procedure 8.;datalines;72186 Mouse Minnie 9920151682 Duck Donald 99204368882 Hall Annie45111 Sawyer Tom 99402;run;
proc sql;select id, lname, procedurefrom patientswhere procedure in(select procedurefrom procedures);quit;
23
id lname procedure
72186 Mouse 99201
51682 Duck 99204
368882 Hall .
WHY DID THIS HAPPEN?
SPREADSHEET “ACCEPTED PROCEDURES” CONTAINED BLANK CELLS IN “PROCEDURE” COLUMN
24
WORKAROUNDS:
1. Use DDE
cf. www.lexjansen.com and search on “DDE”
2. Save EXCEL spreadsheet in CSV or tab-delimitted format and input into SAS through your code
3. Put criteria in code to ensure you input what you really desire
25
QUICK REPORT TIPS FOR THE HARD-TO-SATISFY CLIENT
FOR ASSEMBLING SUMMARY DATA FROM VARIOUS DATASETS INTO A SPECIFIED LAYOUT
SCHOOL EXAMPLE:
DISTRICT SCHOOL SUPERINTENDENT WANTS COUNTS AND PERCENTAGES FOR PASSING MATH IN A SPREADSHEET IN A LAYOUT ONLY HE/SHE WOULD COME UP WITH
%macro pass(num);%global tot&num pass#proc sql;select count(*) into :tot&numfrom school#quit;
proc sql;select count(*) into :pass&numfrom school&numwhere pass_math = 'Y';quit;
%mend;
CREATING MACROVARIABLE IN PROC SQL
SAVE MACRO VARIABLESOUTSIDE MACRO
26
%pass(1);%pass(2);%pass(3);run;
data _NULL_;file 'P:\My Documents\CO DAY 2009\MATH COUNTS.txt';
pct1 = &pass1 / &tot1;pct2 = &pass2 / &tot2;pct3 = &pass3 / &tot3;
retain t '09'x;
if _N_ = 1 thenput "MATH PASS RESULTS" / /"SCHOOL" t "STUDENTS" t "NUMBER PASSED" t "PERCENT PASSED" /"SCHOOL1" t "&tot1" t "&pass1" /t t t pct1/"SCHOOL2" t "&tot2" t "&pass2" /t t t pct2 /"SOUTHLAND SCHOOL" t "&tot3" t "&pass3" /t t t pct3;format pct1 pct2 pct3 percent10.2;options missing = 0;run;
TAB DELIMITTED
27
AFTER BRINGING INTO EXCEL AS TAB-DELIMITTED AND SOME MANUAL ADJUSTMENTS TO COLUMN WIDTH:
CLIENT GETS DATA IN LAYOUT HE/SHE DESIRES
MATH PASS RESULTS
SCHOOL STUDENTS NUMBER PASSED PERCENT PASSEDSCHOOL1 4 2
50.00%SCHOOL2 4 2
50.00%SOUTHLAND SCHOOL 8 5
62.50%
CLIENT GETS DATA IN LAYOUT HE/SHE DESIRES