Introduction to SAS with Examples
Presented by
Economics Experimental Teaching Center, Business Data Mining Center
Raw Data
Data Step
• Read in Data
• Process Data (create new variables)
• Output Data (create SAS dataset)
PROCs
• Analyze Data Using Statistical Procedures
Structure of Data
• Made up of rows and columns
• Rows in SAS are called observations
• Columns in SAS are called variables
An observation is all the information for one entity (patient, patient visit, clinical center, county).
SAS processes data one observation at a time.
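A minimal sketch of what "one observation at a time" means in practice. The dataset name visits, its variables, and its three rows are made up for illustration. The PUT statement runs once per observation, so each row produces its own line in the log.

```sas
* Hypothetical three-row sample, made up for illustration;
DATA visits;
  INPUT patient $ visit sbp;
  * _N_ counts DATA step iterations and PUT writes to the log;
  PUT 'Processing observation ' _N_ patient= sbp=;
DATALINES;
A01 1 120
A01 2 118
B02 1 135
;
RUN;
```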
Example of Data
12 observations and 5 variables
F 23 S 15 MN
F 21 S 15 WI
F 22 S 09 MN
F 35 M 02 MN
F 22 M 13 MN
F 25 S 13 WI
M 20 S 13 MN
M 26 M 15 WI
M 27 S 05 MN
M 23 S 14 IA
M 21 S 14 MN
M 29 M 15 MN
Example of Data
12 observations and 5 variables?
F23S15MN
F21S15WI
F22S09MN
F35M02MN
F22M13MN
F25S13WI
M20S13MN
M26M15WI
M27S05MN
M23S14IA
M21S14MN
M29M15MN
Without spaces between values, you need to know the starting and ending position for each variable.
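One way to read data that has no delimiters is column input, where each variable name in the INPUT statement is followed by its column position(s). The sketch below uses the first three records from the slide. The dataset name demo_cols is an assumption, not part of the original deck.

```sas
DATA demo_cols;
  INFILE DATALINES;
  * Column input: each variable is followed by its start and end column;
  INPUT gender  $ 1
        age       2-3
        marstat $ 4
        credits   5-6
        state   $ 7-8;
DATALINES;
F23S15MN
F21S15WI
F22S09MN
;
RUN;

PROC PRINT DATA=demo_cols;
RUN;
```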
Types of Data
• Numeric (e.g. age, blood pressure)
• Character (patient ID, diagnosis)
You need to tell SAS if the data is character. The default is numeric.
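As a small illustration, a dollar sign after a variable name in the INPUT statement marks that variable as character, and any variable without one is read as numeric. The dataset name types, its variables, and its values are hypothetical.

```sas
DATA types;
  * The $ after patid marks it as character while age and sbp stay numeric;
  INPUT patid $ age sbp;
DATALINES;
P001 54 132
P002 61 145
;
RUN;

* PROC CONTENTS lists each variable with its type (Char or Num);
PROC CONTENTS DATA=types;
RUN;
```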
Rules for SAS Statements and Variables
• SAS statements end with a semicolon (;)
• SAS statements can be entered in lowercase or uppercase
• Multiple SAS statements can appear on one line
• A SAS statement can use multiple lines
• Variable names can be 1 to 32 characters long and must begin with a letter (A-Z) or an underscore (_)
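The rules above can be seen together in a short sketch. The dataset name rules and the variables x, y, and _total are made up for illustration.

```sas
* Lowercase and uppercase keywords are equivalent;
data rules; x = 1; y = 2;   * several statements on one line;
  _total =
      x
    + y;                    * one statement spread over several lines;
RUN;

proc print data=rules; run;
```

The DATA step here reads no input, so it runs once and writes a single observation with x, y, and _total.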
* This is a short example program to demonstrate what a
  SAS program looks like. This is a comment statement because
  it begins with a * and ends with a semi-colon ;

DATA demo;
  INFILE DATALINES;
  INPUT gender $ age marstat $ credits state $ ;

  if credits > 12 then fulltime = 'Y'; else fulltime = 'N';
  if state = 'MN' then resid = 'Y'; else resid = 'N';
DATALINES;
F 23 S 15 MN
F 21 S 15 WI
F 22 S 09 MN
F 35 M 02 MN
F 22 M 13 MN
F 25 S 13 WI
M 20 S 13 MN
M 26 M 15 WI
M 27 S 05 MN
M 23 S 14 IA
M 21 S 14 MN
M 29 M 15 MN
;
RUN;
TITLE 'Running the Example Program';
PROC PRINT DATA=demo ;
  VAR gender age marstat credits fulltime state ;
RUN;
1 DATA demo;             Create a SAS dataset called demo
2 INFILE DATALINES;      Where is the data?
3 INPUT gender $         What are the variable names and types?
        age
        marstat $
        credits
        state $ ;
4 if credits > 12 then fulltime = 'Y'; else fulltime = 'N';
5 if state = 'MN' then resid = 'Y'; else resid = 'N';
Statements 4 and 5 create 2 new variables
6 DATALINES;             Tells SAS the data is coming
F 23 S 15 MN
F 21 S 15 WI
F 22 S 09 MN
F 35 M 02 MN
F 22 M 13 MN
F 25 S 13 WI
M 20 S 13 MN
M 26 M 15 WI
M 27 S 05 MN
M 23 S 14 IA
M 21 S 14 MN
M 29 M 15 MN
;                        Tells SAS the data is ending
7 RUN;                   Tells SAS to run the statements above
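Statement 2 answers "where is the data?" with DATALINES, meaning the records sit inside the program itself. As a sketch of the other common case, INFILE can name an external file instead, and the DATALINES block is then dropped. The file name students.txt is hypothetical and assumes the same space-separated layout as the records above.

```sas
DATA demo;
  * students.txt is a hypothetical file name to be replaced with a real path;
  INFILE 'students.txt';
  INPUT gender $ age marstat $ credits state $;
  if credits > 12 then fulltime = 'Y'; else fulltime = 'N';
  if state = 'MN' then resid = 'Y'; else resid = 'N';
RUN;
```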
Main SAS Windows (PC)
• Editor Window – where you type your program
• Log Window – lists the program statements processed, with notes, warnings and errors.
Always look at the log window! It tells you how SAS understood your program.
• Output Window – gives the output generated from the PROCs
Submit the program by clicking on the run icon.
[Screenshot: PC SAS windows (Output window is hidden)]
Main SAS Files
• Program file – type your program in a text editor – fname.sas
• Log file – lists the program statements processed, with notes, warnings and errors – fname.log
• Output file – gives the output generated from the PROCs – fname.lst
Submit the program by typing: sas fname.sas
Messages in SAS Log
• Notes – messages that may or may not be important
• Warnings – messages that are usually important
• Errors – fatal: the program will abort
(Notes and warnings do not abort your program.)
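A small sketch of why notes deserve attention even though they do not stop the program. The dataset name logdemo and the misspelled variable credit are made up for illustration. The step runs to completion, but the log carries a note that credit is uninitialized, and fulltime ends up as 'N' on every row because a missing value is never greater than 12.

```sas
DATA logdemo;
  INPUT credits;
  * Deliberate mistake: credit is never assigned, only credits is;
  if credit > 12 then fulltime = 'Y'; else fulltime = 'N';
DATALINES;
15
9
;
RUN;

PROC PRINT DATA=logdemo;
RUN;
```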
LOG WINDOW (or file)
NOTE: Copyright (c) 1999-2001 by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS (r) Proprietary Software Release 8.2 (TS2M0)
      Licensed to UNIVERSITY OF MINNESOTA, Site 0009012001.
NOTE: This session is executing on the WIN_NT platform.

NOTE: SAS initialization used:
      real time           7.51 seconds
      cpu time            0.89 seconds

1    * This is a short example program to demonstrate what a
2      SAS program looks like. This is a comment statement because
3      it begins with a * and ends with a semi-colon ;
4
5    DATA demo;
6      INFILE DATALINES;
7      INPUT gender $ age marstat $ credits state $ ;
8
9      if credits > 12 then fulltime = 'Y'; else fulltime = 'N';
10     if state = 'MN' then resid = 'Y'; else resid = 'N';
11   DATALINES;

NOTE: The data set WORK.DEMO has 12 observations and 7 variables.
NOTE: DATA statement used:
      real time           0.38 seconds
      cpu time            0.06 seconds

25   RUN;
26   TITLE 'Running the Example Program';
27   PROC PRINT DATA=demo ;
28     VAR gender age marstat credits fulltime state ;
29   RUN;

NOTE: There were 12 observations read from the data set WORK.DEMO.
NOTE: PROCEDURE PRINT used:
      real time           0.19 seconds
      cpu time            0.02 seconds

30   PROC MEANS DATA=demo N SUM MEAN;
31     VAR age credits ;
32   RUN;

NOTE: There were 12 observations read from the data set WORK.DEMO.
NOTE: PROCEDURE MEANS used:
      real time           0.25 seconds
      cpu time            0.03 seconds

33   PROC FREQ DATA=demo; TABLES gender;
34   RUN;

NOTE: There were 12 observations read from the data set WORK.DEMO.
NOTE: PROCEDURE FREQ used:
      real time           0.15 seconds
      cpu time            0.03 seconds
OUTPUT WINDOW (OR LST FILE)

Running the Example Program

Obs    gender    age    marstat    credits    fulltime    state
  1      F        23       S          15         Y         MN
  2      F        21       S          15         Y         WI
  3      F        22       S           9         N         MN
  4      F        35       M           2         N         MN
  5      F        22       M          13         Y         MN
  6      F        25       S          13         Y         WI
  7      M        20       S          13         Y         MN
  8      M        26       M          15         Y         WI
  9      M        27       S           5         N         MN
 10      M        23       S          14         Y         IA
 11      M        21       S          14         Y         MN
 12      M        29       M          15         Y         MN

The MEANS Procedure

Variable     N             Sum            Mean
----------------------------------------------
age         12     294.0000000      24.5000000
credits     12     143.0000000      11.9166667
----------------------------------------------

The FREQ Procedure

                                  Cumulative    Cumulative
gender    Frequency    Percent     Frequency       Percent
-----------------------------------------------------------
F                 6      50.00             6         50.00
M                 6      50.00            12        100.00
Some common procedures
PROC PRINT
• print out your data - always a good idea!
PROC MEANS
• descriptive statistics for continuous data
PROC FREQ
• descriptive statistics for categorical data
PROC UNIVARIATE
• very detailed descriptive statistics for continuous data
PROC TTEST
• performs t-tests (continuous data)
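PROC PRINT, PROC MEANS and PROC FREQ already appear in the example log. As a sketch of the remaining two procedures, the block below rebuilds the demo dataset from the slide data and runs PROC UNIVARIATE and PROC TTEST on it. Using age as the analysis variable and gender as the grouping variable is an assumption made for illustration, not something the original deck does.

```sas
DATA demo;
  INPUT gender $ age marstat $ credits state $;
DATALINES;
F 23 S 15 MN
F 21 S 15 WI
F 22 S 09 MN
F 35 M 02 MN
F 22 M 13 MN
F 25 S 13 WI
M 20 S 13 MN
M 26 M 15 WI
M 27 S 05 MN
M 23 S 14 IA
M 21 S 14 MN
M 29 M 15 MN
;
RUN;

* Detailed descriptive statistics for one continuous variable;
PROC UNIVARIATE DATA=demo;
  VAR age;
RUN;

* Two-sample t-test comparing mean age between the two gender groups;
PROC TTEST DATA=demo;
  CLASS gender;
  VAR age;
RUN;
```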