SAS® 101
Based on
Learning SAS by Example:
A Programmer's Guide
Chapters 9, 11 & 12
By Tasha Chapman, Oregon Health Authority
Topics covered…
SAS dates
Date functions
Numeric functions
Character functions
Special functions
PUT and INPUT
SAS dates
Dates are stored as the number of days from
January 1st, 1960
Dates before 1960 are negative numbers
Dates after 1960 are positive numbers
Examples:
1/1/1960 = 0
1/2/1960 = 1
10/21/1950 = -3,359
12/31/2012 = 19,358
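The day counts above can be checked in a short DATA step. This is a minimal sketch; the variable names are made up for illustration, and it uses date constants ('DDMONCCYY'd), which are covered later in this deck.

```sas
data _null_;
    day_zero  = '01jan1960'd;
    pre_1960  = '21oct1950'd;
    post_1960 = '31dec2012'd;
    /* Unformatted, a date is just a count of days from 1/1/1960 */
    put day_zero= pre_1960= post_1960=;   /* day_zero=0 pre_1960=-3359 post_1960=19358 */
    /* A format changes only how the value is displayed */
    put post_1960= mmddyy10.;             /* post_1960=12/31/2012 */
run;
```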
SAS times
Times are stored as the number of seconds from
midnight
Times are always positive
Examples:
Midnight = 0
Midnight and 36 seconds = 36
5:38am and 9 seconds = 20,289
12:51pm and 2 seconds = 46,262
SAS datetimes
Datetime values are stored as the number of
seconds from 1/1/1960
Datetimes before 1960 are negative numbers
Datetimes after 1960 are positive numbers
Examples:
1/1/1960 at midnight and 36 seconds = 36
1/2/1960 at 5:38am and 9 seconds = 106,689
10/21/1950 at 12:51pm & 2 secs = -290,171,338
12/31/2012 at midnight = 1,672,531,200
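Time and datetime values can be inspected the same way. A minimal sketch with illustrative variable names; the time constant ('HH:MM:SS't) is not shown on the slides and is used here only to build a time value.

```sas
data _null_;
    t  = '12:51:02't;                /* seconds since midnight */
    dt = '02jan1960:05:38:09'dt;     /* seconds since 1/1/1960 at midnight */
    put t= dt=;                      /* t=46262 dt=106689 */
    /* Same values, displayed with time and datetime formats */
    put t= time8. dt= datetime20.;
run;
```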
Common date formats/informats
If you input 14686:
Format        Result
MMDDYY10.     03/17/2000
MMDDYY8.      03/17/00
MONNAME.      March
YEAR.         2000
MONYY7.       MAR2000
WORDDATE.     March 17, 2000
WEEKDATE.     Friday, March 17, 2000
DOWNAME.      Friday
QTR.          1
YYQ.          2000Q1
Examples of date formats
SAS Documentation
Date constants can be used in IF and WHERE
statements, functions, and other portions of your
SAS code
These are written as: 'DDMONCCYY'd
Using dates
Datetime constants can also be used in your SAS code
These are written as: 'DDMONCCYY:HH:MM:SS'dt
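As a sketch of a date constant in a WHERE statement, the step below builds a small hypothetical dataset (visits, with made-up values) and keeps only the rows on or after September 1, 2005.

```sas
/* Hypothetical sample data for illustration only */
data visits;
    input id visit_date :date9.;
    format visit_date date9.;
    datalines;
1 15aug2005
2 30sep2005
3 02oct2005
;
run;

/* The date constant is compared against the stored day count */
proc print data=visits;
    where visit_date ge '01sep2005'd;   /* keeps ids 2 and 3 */
run;
```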
Using dates
BEWARE!
If your database stores datetime variables…
'30sep2005:11:23:57'dt
is not the same thing as
'30sep2005'd
Using dates
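'30sep2005:11:23:57'dt = 1,443,698,637 (seconds since 1/1/1960)
'30sep2005'd = 16,709 (days since 1/1/1960)
As a datetime, the start of that day is '30sep2005:00:00:00'dt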
BEWARE!
If your database stores datetime variables…
'01sep2005'd le mod_date le '30sep2005'd
may ignore datetime values on September 30th with non-zero times
Using dates
Alternatives:
'01sep2005:00:00:00'dt le mod_date le '30sep2005:23:59:59'dt
Using dates
Alternatives:
'01sep2005'd le mod_date lt '01oct2005'd
* This option does not work with a SAS dataset
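When the datetime variable lives in a SAS dataset, one possible approach (not from the slides) is to convert it to a date with DATEPART before comparing it to date constants. A minimal sketch with a hypothetical dataset named mods:

```sas
/* Hypothetical sample data for illustration only */
data mods;
    input id mod_date :datetime18.;
    format mod_date datetime20.;
    datalines;
1 31aug2005:23:59:59
2 30sep2005:11:23:57
3 01oct2005:00:00:00
;
run;

/* DATEPART drops the time, so September 30th with a non-zero time is kept */
proc print data=mods;
    where '01sep2005'd le datepart(mod_date) le '30sep2005'd;   /* keeps id 2 only */
run;
```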
Using dates
Reading dates from raw data
Informat works fine if you know the format of the incoming data
and the data is consistent…
…but what if the data is wonky?
Dates come in all shapes and sizes, sometimes even
in the same data file…
How do we deal with them all?
Use the anydtdte. informat to deal with wonky date data.
Reading dates from raw data
Informat      Purpose
anydtdte.     Extracts the date portion
anydtdtm.     Extracts the datetime portion
anydttme.     Extracts the time portion
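A minimal sketch of reading mixed date layouts with anydtdte. follows. The dataset name and values are made up, and the three rows are expected to resolve to the same date. Ambiguous values such as 03/04/2000 depend on the DATESTYLE= system option, so they are left out here.

```sas
/* Hypothetical raw data with three different date layouts */
data mixed_dates;
    input raw_date :anydtdte20.;
    format raw_date date9.;
    datalines;
03/17/2000
17MAR2000
2000-03-17
;
run;

proc print data=mixed_dates;
run;
```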
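When reading a two-digit year, SAS uses a 100-year interval to determine which century the year belongs to
The interval starts at the yearcutoff= value
Default is 1920 (newer SAS releases use a later default; check with PROC OPTIONS)
Use the yearcutoff= system option to change the default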
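The effect of the 100-year interval can be sketched by setting yearcutoff= explicitly, which also makes the result independent of the installed default. The cutoff of 1950 is an arbitrary choice for illustration.

```sas
options yearcutoff=1950;   /* two-digit years now map to 1950 through 2049 */

data _null_;
    d49 = input('03/17/49', mmddyy8.);
    d50 = input('03/17/50', mmddyy8.);
    y49 = year(d49);
    y50 = year(d50);
    put y49= y50=;          /* y49=2049 y50=1950 */
run;
```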
Reading dates from raw data
What are functions?
function(argument-1, argument-2)
Functions return a value from a computation or
system manipulation based on supplied arguments
Arguments can be constants, variables, or
expressions, depending on the function
Can nest functions within functions
Most often use functions to create new variables or
as part of conditional logic
Functions vs. PROCs
Functions calculate across variables
(one value per observation)
Example: a function would provide a single score per student that averages all of his/her test scores
PROCs calculate down observations
(one value per dataset)
Example: a PROC would provide an average score for all students for test 1
Graphic from “Introduction to SAS Functions” by Neil Howard
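The contrast can be sketched with a small hypothetical dataset of test scores (names and values are made up): the MEAN function works across variables within each row, while PROC MEANS works down the rows of one variable.

```sas
/* Hypothetical sample data for illustration only */
data scores;
    input student $ test1 test2 test3;
    /* Function: one average per student, missing values are ignored */
    student_avg = mean(of test1-test3);   /* Ann: 80, Bob: 80 */
    datalines;
Ann 90 80 70
Bob 60 . 100
;
run;

/* PROC: one average for test1 across all students (75) */
proc means data=scores mean;
    var test1;
run;
```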
Examples of functions
SAS Documentation
Date functions
Function                  Purpose                                           Sample Output
datepart(datetime);       Extract date portion from a datetime value        17905
timepart(datetime);       Extract time portion from a datetime value        46262
year(date);               Extract year from a date value                    2009
month(date);              Extract month from a date value                   12
day(date);                Extract day of the month from a date value        31
weekday(date);            Extract day of the week from a date value         2
mdy(month, day, year);    Create a date using month, day and year values    17905
INTCK function
INTCK(interval, from, to)
Returns a count of the number of interval boundaries
between two dates, times, or datetime values
interval : desired interval
Examples: 'YEAR', 'MONTH', 'DAY', 'QTR'
from : Starting date, time, or datetime value
to : Ending date, time, or datetime value
Interval counted each time a boundary is crossed
INTCK function
Interval      Boundary
'YEAR'        January 1
'MONTH'       First of any month
'DAY'         Start of each day
'QTR'         January 1, April 1, July 1, and October 1
INTCK function
Examples of INTCK
Function                                        Result
intck('DAY', '01jan2013'd, '01jan2013'd)        0
intck('DAY', '01jan2013'd, '02jan2013'd)        1
intck('YEAR', '01jan2013'd, '01apr2013'd)       0
intck('YEAR', '01dec2012'd, '01apr2013'd)       1
intck('YEAR', '01jan2013'd, '01apr2014'd)       1
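Because INTCK counts boundaries rather than elapsed time, two dates one day apart can be one month apart. A minimal sketch with illustrative variable names:

```sas
data _null_;
    /* One day apart, but the first of February is crossed */
    one_day  = intck('MONTH', '31jan2013'd, '01feb2013'd);   /* 1 */
    /* Thirty days apart, but no month boundary is crossed */
    same_mon = intck('MONTH', '01jan2013'd, '31jan2013'd);   /* 0 */
    /* A "to" value earlier than "from" gives a negative count */
    backward = intck('DAY', '02jan2013'd, '01jan2013'd);     /* -1 */
    put one_day= same_mon= backward=;
run;
```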
INTCK function
[Figure: Hospital Admissions by Quarter, January 1, 2003 through June 30, 2006. Frequency of admissions (0 to 1,200) by calendar quarter (0 to 10).]
Calculating age
Different methods produce different results…
Most commonly
recognized
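As an illustration of how methods can disagree, the sketch below computes age three ways for a made-up birth date. These are common approaches and are not necessarily the ones pictured on the slide.

```sas
data _null_;
    dob   = '30aug1981'd;   /* hypothetical date of birth */
    as_of = '25jan2013'd;   /* hypothetical reference date */

    /* Counts January 1 boundaries, so it overstates age before the birthday */
    age_year  = intck('YEAR', dob, as_of);                     /* 32 */

    /* Elapsed days divided by an average year length */
    age_days  = floor((as_of - dob) / 365.25);                 /* 31 */

    /* Whole months elapsed, less one if the day of month is not yet reached */
    age_month = floor((intck('MONTH', dob, as_of)
                       - (day(as_of) < day(dob))) / 12);       /* 31 */

    put age_year= age_days= age_month=;
run;
```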
INTNX function
INTNX(interval, from, n, <alignment>)
Increments a date, time, or datetime value by a given interval
interval : desired interval
from : Starting date, time, or datetime value
n : number of interval increments
Can be positive, negative, or zero
alignment : optional argument that controls the alignment of the date
'SAMEDAY', 'BEGINNING', 'MIDDLE', or 'END'
Default is 'BEGINNING'
INTNX function
Examples of INTNX
Function                                          Result
intnx('DAY', '01jan2013'd, 3)                     01/04/2013
intnx('YEAR', '25jan2013'd, 1)                    01/01/2014
intnx('YEAR', '30aug2012'd, -31)                  01/01/1981
intnx('YEAR', '30aug2012'd, -31, 'SAMEDAY')       08/30/1981
INTNX function
Can use INTNX function to help automate reports
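One way this might look in practice: deriving the first and last day of the previous month from the run date, so a monthly report never needs its dates edited by hand. A minimal sketch; run_date is hard-coded here as a stand-in for today().

```sas
data _null_;
    run_date   = '15mar2013'd;   /* stand-in for today() */
    start_date = intnx('MONTH', run_date, -1, 'BEGINNING');
    end_date   = intnx('MONTH', run_date, -1, 'END');
    put start_date= date9. end_date= date9.;
    /* start_date=01FEB2013 end_date=28FEB2013 */
run;
```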
ROUND function
ROUND(argument, <rounding-unit>)
Rounds a number to the selected rounding unit
argument : numeric value to be rounded
rounding-unit : optional increment for rounding
Default is 1 (nearest integer)
Function                 Result
round(156.826, 1)        157
round(156.826, 10)       160
round(156.826, .01)      156.83
INT function
INT(argument)
Truncates argument to an integer value (drops the fractional part)
argument : numeric value to be truncated
Function Result
int(156.826) 156
Descriptive stats functions
Two methods:
function(var1, var2,… varN)
function(of var1-varN)
Can use these methods separately or together
Function (example)      Result
N(Q1, Q2, Q3, Q4)       Returns the number of non-missing numeric values
Mean(of Q1-Q4)          Returns the arithmetic mean (average)
Sum(0, of Q1-Q4)        Returns the sum of nonmissing values
                        (in this example, if all arguments are missing, returns 0)
Min(of Q1-Q4)           Returns the smallest value
Max(of Q1-Q4)           Returns the largest value
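A minimal sketch of these functions on one made-up observation with a missing value, showing both calling styles:

```sas
data _null_;
    q1 = 10; q2 = .; q3 = 30; q4 = 20;   /* q2 is missing */
    n_vals = n(q1, q2, q3, q4);          /* 3 */
    avg    = mean(of q1-q4);             /* 20, missing value ignored */
    total  = sum(0, of q1-q4);           /* 60 */
    lowest = min(of q1-q4);              /* 10 */
    highest = max(of q1-q4);             /* 30 */
    put n_vals= avg= total= lowest= highest=;
run;
```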
RANUNI function
RANUNI(seed)
Generates a random number between 0 and 1
seed : a number which is used to generate the first
number in the random sequence
Can be any integer less than 2**31-1
If you use a seed of 0 or a negative number, SAS will use
the system clock to supply the seed
Function Result
ranuni(7815985) Returns a random number
LAG function
LAG<n>(argument)
Returns a value from a previous observation (generally)
n : specifies the number of lagged values
argument : number, expression or variable to be
lagged
LAG function
The DIF function is similar to LAG
Returns the difference between the current
value and the previous value
i.e. DIF(x) = x - lag(x)
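A minimal sketch of LAG and DIF on a hypothetical dataset of monthly sales (values are made up). Both are called on every observation here; the "generally" above matters because LAG returns the value from the last time the function itself ran, which differs from the previous observation when the call sits inside conditional logic.

```sas
/* Hypothetical sample data for illustration only */
data sales;
    input month sales;
    prev_sales = lag(sales);   /* missing, 100, 120 */
    change     = dif(sales);   /* missing, 20, -30 */
    datalines;
1 100
2 120
3 90
;
run;

proc print data=sales;
run;
```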
CASE functions
Converts character string to upper or lower case
Useful when using comparison operators
Function (example)             Result
upcase('sas rocks')            SAS ROCKS
lowcase('I AM WHISPERING')     i am whispering
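Character comparisons are case sensitive, which is why these functions help with comparison operators. A minimal sketch with a made-up value:

```sas
data _null_;
    length answer $ 3;
    answer = 'Yes';
    /* Fails: 'Yes' and 'YES' are different strings */
    if answer = 'YES' then put 'Raw comparison matched';
    /* Succeeds: both sides are upper case */
    if upcase(answer) = 'YES' then put 'UPCASE comparison matched';
run;
```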
PROPCASE function
PROPCASE(argument, <delimiters>)
Converts all words to proper case
argument : character value that will be converted
delimiters : specifies one or more delimiters that indicate the beginning of a new word
Enclose delimiters in quotation marks
Default delimiters are blank, forward slash, hyphen, open parenthesis, period, and tab
Use of option overrides all default delimiters
Function                                    Result
propcase('SAINT KITTS/NEVIS(U.S.)')         Saint Kitts/Nevis(U.S.)
propcase("GEORGE O'KEEFE-BAIN", " '-")      George O'Keefe-Bain
SUBSTR function
SUBSTR(string, position, <length>)
Extracts a substring (portion) of text
string : character value that text will be extracted from
position : beginning character position for substring
length : length of substring to extract
If omitted, SAS extracts the remainder of the string
Function                            Result
substr('JEFF STREET', 6)            STREET
substr('621110', 1, 2)              62
substr('Dr. Paul Jones', 5, 4)      Paul
SCAN function
SCAN(string, count, <charlist>, <modifiers>)
Returns the nth word from a character string
string : character value that words will be pulled from
count : integer that specifies the number of the word to select
Positive: counts from left to right
Negative: counts from right to left
charlist : optional list of delimiter characters
modifiers : optional modifier that changes the behavior of the SCAN function
Function                    Result
scan('JEFF STREET', 2)      STREET
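A minimal sketch of the count and charlist arguments, using illustrative variable names:

```sas
data _null_;
    length last_word area $ 20;
    /* Negative count: first word from the right */
    last_word = scan('JEFF STREET', -1);        /* STREET */
    /* Explicit delimiter list: only the hyphen separates words */
    area = scan('503-373-1793', 1, '-');        /* 503 */
    put last_word= area=;
run;
```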
CAT function
CAT(item1, item2,… itemN)
Concatenates items into a single character string
items : value to be concatenated
If numeric, value is first converted to character
Can also be written as:
CAT(of item1-itemN)
Function                                        Result
cat('John', 'Smith')                            JohnSmith
cat('I', ' ', 'Heart', ' ', 'You')              I Heart You
cat('(', area, ')', prefix, '-', suffix)        (503)373-1793
cat(1225, ' Ferry St.')                         1225 Ferry St.
COMPRESS function
COMPRESS(<source>, <chars>, <modifiers>)
Removes specified characters from a string
source : character value from which characters will be removed
chars : characters to be removed
If omitted, only blanks will be removed
modifiers : optional modifier that changes the behavior of the COMPRESS function
Function                                  Result
compress('Mc Cartney')                    McCartney
compress('(503) 373-1793', '()- ')        5033731793
PUT function
PUT(source, format.)
Returns a value using a specified format
Most often used for converting numeric data to character
source : value to be reformatted
format : the format to be applied
The format must be the same type as the source (character/numeric)
Function                    Result
put(patient_id, 8.)         A patient ID that is character
put(32000, dollar24.)       $32,000
put(16739, date9.)          30OCT2005
put('M', $gender.)          Male
PUT function
PUT function vs. Format statements:
Format statements change the appearance of the value
PUT functions change the value itself
Date and Char_Date look the same,
but one is character and the other is
numeric.
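The difference can be sketched with a one-row hypothetical dataset: both columns print as 30OCT2005, but PROC CONTENTS shows one as numeric and the other as character.

```sas
data dates;
    date      = 16739;
    char_date = put(date, date9.);   /* new character variable '30OCT2005' */
    format date date9.;              /* date stays numeric, only its display changes */
run;

proc print data=dates;       /* both columns look the same */
run;

proc contents data=dates;    /* date is Num, char_date is Char */
run;
```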
INPUT function
INPUT(source, <?|??> informat.)
Converts a value using a specified informat
Most often used for converting character data to numeric
source : value to be reformatted
? | ?? : Optional modifiers that suppress error messages
informat : the informat to be applied
Function                             Result
input('32000', 5.)                   32000
input('32000', 5.2)                  320.00
input('$2432.99', dollar24.2)        2432.99
input('30oct2005', date9.)           16739
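A minimal sketch of the ?? modifier with made-up values: the invalid string converts to a missing value without an invalid-data note in the log.

```sas
data _null_;
    good = input('32000', ?? 5.);   /* 32000 */
    bad  = input('abc', ?? 5.);     /* missing, and no error message is written */
    put good= bad=;                 /* good=32000 bad=. */
run;
```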