SAS® 101: Based on Learning SAS by Example: A Programmer's Guide, Chapters 9, 11 & 12. By Tasha Chapman, Oregon Health Authority


SAS® 101

Based on

Learning SAS by Example:

A Programmer's Guide

Chapters 9, 11 & 12

By Tasha Chapman, Oregon Health Authority

Topics covered…

SAS dates

Date functions

Numeric functions

Character functions

Special functions

PUT and INPUT

SAS dates

SAS dates

Dates are stored as the number of days from January 1, 1960

Dates before January 1, 1960 are negative numbers

Dates after January 1, 1960 are positive numbers

Examples:

1/1/1960 = 0

1/2/1960 = 1

10/21/1950 = -3,359

12/31/2012 = 19,358
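
The stored values can be checked by writing date constants to the log without a format. This is a minimal sketch; the variable names are illustrative only.

```sas
data _null_;
    d0 = '01jan1960'd;
    d1 = '02jan1960'd;
    d2 = '21oct1950'd;
    d3 = '31dec2012'd;
    /* No format is attached, so the log shows the raw day counts */
    put d0= d1= d2= d3=;
run;
```

The log should show d0=0 d1=1 d2=-3359 d3=19358, matching the examples above.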

SAS dates

Times are stored as the number of seconds from midnight

Times of day are never negative (midnight is 0)

Examples:

Midnight = 0

Midnight and 36 seconds = 36

5:38am and 9 seconds = 20,289

12:51pm and 2 seconds = 46,262

SAS dates

Datetime values are stored as the number of seconds from midnight on 1/1/1960

Datetimes before 1/1/1960 are negative numbers

Datetimes after midnight on 1/1/1960 are positive numbers

Examples:

1/1/1960 at midnight and 36 seconds = 36

1/2/1960 at 5:38am and 9 seconds = 106,689

10/21/1950 at 12:51pm and 2 seconds = -290,171,338

12/31/2012 at midnight = 1,672,531,200
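
Time and datetime values can be checked the same way. The sketch below uses a time constant, a datetime constant, and the DHMS function (which builds a datetime from a date plus hours, minutes, and seconds); the variable names are illustrative only.

```sas
data _null_;
    t   = '05:38:09't;                   /* seconds since midnight */
    dt  = '02jan1960:05:38:09'dt;        /* seconds since 01JAN1960 00:00:00 */
    dt2 = dhms('02jan1960'd, 5, 38, 9);  /* same datetime, built from its parts */
    put t= dt= dt2=;
run;
```

The log should show t=20289 dt=106689 dt2=106689: one full day (86,400 seconds) plus the 20,289 seconds of the time portion.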

Common date formats/informats

If the stored date value is 14686:

Format Result

MMDDYY10. 03/17/2000

MMDDYY8. 03/17/00

MONNAME. March

YEAR. 2000

MONYY7. MAR2000

WORDDATE. March 17, 2000

WEEKDATE. Friday, March 17, 2000

DOWNAME. Friday

QTR. 1

YYQ. 2000Q1
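
A short sketch of how these formats might be tried out: the PUT statement writes the same stored value to the log under several formats. Only the displayed text changes; the stored number stays 14686.

```sas
data _null_;
    d = 14686;
    put d mmddyy10.;   /* 03/17/2000 */
    put d mmddyy8.;    /* 03/17/00 */
    put d monyy7.;     /* MAR2000 (a width of 7 is needed for a 4-digit year) */
    put d worddate.;   /* March 17, 2000, padded to the default width */
    put d weekdate.;   /* Friday, March 17, 2000, padded to the default width */
    put d yyq.;        /* 2000Q1 */
run;
```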

Using dates

Date constants can be used in IF and WHERE statements, functions, and other portions of your SAS code

These are written as: 'DDMONCCYY'd

Using dates

The current date can be computed using the TODAY() function

Using dates

Datetime constants can also be used in your SAS code

These are written as: 'DDMONCCYY:HH:MM:SS'dt
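
The constants and TODAY() can be combined in ordinary expressions. The following is a minimal sketch with made-up variable names, not code from the original slides.

```sas
data _null_;
    start    = '17mar2000'd;             /* date constant */
    days_ago = today() - start;          /* TODAY() returns the current date as a day count */
    stamp    = '17mar2000:08:30:00'dt;   /* datetime constant */
    /* A date constant used in conditional logic */
    if start < today() then put 'Start date is in the past: ' start date9.;
    put days_ago= stamp= datetime20.;
run;
```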

Using dates

BEWARE!

If your database stores datetime variables…

'30sep2005:11:23:57'dt is not the same thing as '30sep2005'd

'30sep2005:11:23:57'dt = 1,443,698,637 (seconds)

'30sep2005'd = 16,709 (days), which the database treats as '30sep2005:00:00:00'dt

'01sep2005'd le mod_date le '30sep2005'd

may ignore datetime values on September 30th with non-zero times

Using dates

BEWARE!

If your database stores datetime variables…

Alternatives:

'01sep2005:00:00:00'dt le mod_date le '30sep2005:23:59:59'dt

'01sep2005'd le mod_date lt '01oct2005'd *

* This option does not work with a SAS dataset
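
In a SAS dataset a datetime variable holds seconds and a date constant holds days, so the two are never converted for you. A minimal sketch of a filter for September 2005 on a SAS dataset, using a half-open datetime range, might look like this. The dataset names, the variable `id`, and the sample rows are hypothetical; `mod_date` follows the slides.

```sas
data mods;
    input id mod_date :datetime20.;
    format mod_date datetime20.;
    datalines;
1 31aug2005:23:59:59
2 01sep2005:00:00:00
3 30sep2005:11:23:57
4 01oct2005:00:00:00
;
run;

data sept;
    set mods;
    /* Start of September up to, but not including, the start of October */
    where mod_date ge '01sep2005:00:00:00'dt
      and mod_date lt '01oct2005:00:00:00'dt;
run;
```

With these rows, `sept` should keep ids 2 and 3. An alternative under the same assumptions is `where '01sep2005'd le datepart(mod_date) le '30sep2005'd;`, which compares days with days.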


Reading dates from raw data

Informat works fine if you know the format of the incoming data and the data is consistent…

…but what if the data is wonky?

Dates come in all shapes and sizes, sometimes even in the same data file…

How do we deal with them all?

Reading dates from raw data

Use the anydtdte. informat to deal with wonky date data.

Reading dates from raw data

Informat Purpose

anydtdte. Extracts the date portion

anydtdtm. Extracts the datetime portion

anydttme. Extracts the time portion

When reading a two-digit year, SAS uses a 100-year interval to determine which century the year belongs to

Default first year of the interval is 1920 (newer releases use a later default, such as 1926 in SAS 9.4)

Use the yearcutoff= system option to change the default
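
The sketch below reads three differently formatted versions of the same date with anydtdte., then shows how the yearcutoff= interval decides the century of a two-digit year. The dataset name, variable names, and sample values are made up for illustration.

```sas
options yearcutoff=1920;   /* two-digit years map to 1920-2019 */

data mixed_dates;
    /* anydtdte. tries several common layouts for each value */
    input raw_date :anydtdte20.;
    format raw_date mmddyy10.;
    datalines;
03/17/2000
17MAR2000
2000-03-17
;
run;

data _null_;
    y19 = input('01/01/19', mmddyy8.);   /* 19 falls in 1920-2019, so 2019 */
    y20 = input('01/01/20', mmddyy8.);   /* 20 falls in 1920-2019, so 1920 */
    put y19= year4. y20= year4.;
run;
```

All three rows of `mixed_dates` should come out as 03/17/2000. Values such as 03/04/2000 are ambiguous, and how anydtdte. resolves them depends on the session's date style settings, so it is worth spot-checking the results.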


Functions

What are functions?

function(argument-1, argument-2)

Functions return a value from a computation or system manipulation based on supplied arguments

Arguments can be constants, variables, or expressions, depending on the function

Can nest functions within functions

Most often use functions to create new variables or as part of conditional logic
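
A small sketch of both uses, with a made-up input value: one function nested inside another to create a new variable, and a function used in conditional logic.

```sas
data _null_;
    raw_name = '  tasha CHAPMAN ';
    /* Nested functions: STRIP runs first, then PROPCASE */
    clean_name = propcase(strip(raw_name));
    /* A function used as part of conditional logic */
    if length(clean_name) > 5 then put clean_name=;
run;
```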

Functions vs. PROCs

Functions calculate across variables (one value per observation)

A function would provide a single score per student that averages all of his/her test scores

PROCs calculate down observations (one value per dataset)

A PROC would provide an average score for all students for test 1

Graphic from “Introduction to SAS Functions” by Neil Howard
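
The difference can be sketched with a tiny, made-up dataset of test scores. The MEAN function works across variables within each observation, while PROC MEANS works down the observations for one variable.

```sas
data scores;
    input student $ test1 test2 test3;
    /* Across variables: one average per student (missing scores are ignored) */
    student_avg = mean(of test1-test3);
    datalines;
Ann 90 80 70
Bob 60 . 80
;
run;

/* Down observations: one average of test1 for the whole dataset */
proc means data=scores mean;
    var test1;
run;
```

Here `student_avg` is 80 for Ann and 70 for Bob, and PROC MEANS reports a mean of 75 for `test1`.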

Date functions

Date functions

Function                 Purpose                                           Sample Output
datepart(datetime)       Extract date portion from a datetime value        17905
timepart(datetime)       Extract time portion from a datetime value        46262
year(date)               Extract year from a date value                    2009
month(date)              Extract month from a date value                   12
day(date)                Extract day of the month from a date value        31
weekday(date)            Extract day of the week from a date value         2
mdy(month, day, year)    Create a date using month, day and year values    17905
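
A minimal sketch that applies these functions to a single datetime value, so the pieces can be seen fitting together. The datetime is chosen for illustration and is not the one behind the sample outputs in the table.

```sas
data _null_;
    dt = '31dec2009:12:51:02'dt;
    d  = datepart(dt);          /* 18262 */
    t  = timepart(dt);          /* 46262 */
    y  = year(d);               /* 2009 */
    m  = month(d);              /* 12 */
    dd = day(d);                /* 31 */
    wd = weekday(d);            /* 5 (1=Sunday, so 5=Thursday) */
    rebuilt = mdy(m, dd, y);    /* 18262 again */
    put d= t= y= m= dd= wd= rebuilt=;
run;
```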


Date functions

Many uses of functions…

INTCK function

INTCK(interval, from, to)

Returns a count of the number of interval boundaries between two dates, times, or datetime values

interval : desired interval

Examples: 'YEAR', 'MONTH', 'DAY', 'QTR'

from : Starting date, time, or datetime value

to : Ending date, time, or datetime value

Interval counted each time a boundary is crossed

INTCK function

Interval Boundary

'YEAR' January 1

'MONTH' First of any month

'DAY' Start of each day

'QTR' January 1, April 1, July 1, and October 1

INTCK function

Function Result

intck('DAY', '01jan2013'd, '01jan2013'd) 0

intck('DAY', '01jan2013'd, '02jan2013'd) 1

intck('YEAR', '01jan2013'd, '01apr2013'd) 0

intck('YEAR', '01dec2012'd, '01apr2013'd) 1

intck('YEAR', '01jan2013'd, '01apr2014'd) 1

Examples of INTCK


Can use INTCK function to classify observations

INTCK function

[Figure: Hospital Admissions by Quarter, January 1, 2003 through June 30, 2006. X-axis: calendar quarter; y-axis: frequency of admissions.]
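
One way such a classification might be built is to count the quarter boundaries crossed since the start of the study period. This is a sketch only; the dataset name, variable names, and sample dates are hypothetical.

```sas
data admissions;
    input admit_date :date9.;
    format admit_date date9.;
    /* 0 = first quarter of 2003, 1 = second quarter of 2003, and so on */
    quarter_index = intck('QTR', '01jan2003'd, admit_date);
    datalines;
15jan2003
01apr2003
30jun2006
;
run;

proc freq data=admissions;
    tables quarter_index;
run;
```

The three sample rows should get quarter_index values of 0, 1, and 13.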

Calculating age

Different methods produce different results…

Most commonly recognized
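
The methods shown on the original slide are not recoverable from the text, so the sketch below simply compares a few widely used approaches on one made-up birth date. The 'AGE' basis of YRDIF assumes SAS 9.3 or later.

```sas
data _null_;
    dob   = '15jun1980'd;
    as_of = '01jan2013'd;
    /* Counts January 1 boundaries, so it overstates age before the birthday */
    age_intck = intck('YEAR', dob, as_of);                 /* 33 */
    /* Approximation based on the average length of a year */
    age_days  = int((as_of - dob) / 365.25);               /* 32 */
    /* YRDIF with the 'AGE' basis */
    age_yrdif = int(yrdif(dob, as_of, 'AGE'));             /* 32 */
    /* Whole months elapsed, adjusted when the day of month has not been reached */
    age_month = floor((intck('MONTH', dob, as_of)
                       - (day(as_of) < day(dob))) / 12);   /* 32 */
    put age_intck= age_days= age_yrdif= age_month=;
run;
```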

INTNX function

INTNX(interval, from, n, <alignment>)

Increments a date, time, or datetime value by a given interval

interval : desired interval

from : Starting date, time, or datetime value

n : number of interval increments

Can be positive, negative, or zero

alignment : optional argument that controls the alignment of the date

'SAMEDAY', 'BEGINNING', 'MIDDLE', or 'END'

Default is 'BEGINNING'

INTNX function

Function Result

intnx('DAY', '01jan2013'd, 3) 01/04/2013

intnx('YEAR', '25jan2013'd, 1) 01/01/2014

intnx('YEAR', '30aug2012'd, -31) 01/01/1981

intnx('YEAR', '30aug2012'd, -31, 'SAMEDAY') 08/30/1981

Examples of INTNX


INTNX function

Can use INTNX function to help automate reports
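
As one possible illustration of report automation, the sketch below derives the first and last day of the previous month from the run date and stores them in macro variables. The macro variable names are hypothetical.

```sas
data _null_;
    run_date   = today();
    /* First and last day of the month before the run date */
    prev_start = intnx('MONTH', run_date, -1, 'BEGINNING');
    prev_end   = intnx('MONTH', run_date, -1, 'END');
    call symputx('prev_start', put(prev_start, date9.));
    call symputx('prev_end',   put(prev_end,   date9.));
run;

%put Reporting period: &prev_start through &prev_end;
```

Because the dates are recomputed on every run, a monthly report built on `&prev_start` and `&prev_end` would not need its date range edited by hand.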

Numeric functions

ROUND function

ROUND(argument, <rounding-unit>)

Rounds a number to the selected rounding unit

argument : numeric value to be rounded

rounding-unit : optional increment for rounding

Default is 1 (nearest integer)

Function Result

round(156.826, 1) 157

round(156.826, 10) 160

round(156.826, .01) 156.83

INT function

INT(argument)

Truncates argument to an integer value (drops the fractional part)

argument : numeric value to be truncated

Function Result

int(156.826) 156
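
A short sketch contrasting ROUND and INT on the same value, including a negative number to show that INT truncates toward zero.

```sas
data _null_;
    x = 156.826;
    r_int  = round(x);        /* 157 */
    r_ten  = round(x, 10);    /* 160 */
    r_cent = round(x, .01);   /* 156.83 */
    t_pos  = int(x);          /* 156 */
    t_neg  = int(-x);         /* -156 */
    put r_int= r_ten= r_cent= t_pos= t_neg=;
run;
```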

Descriptive stats functions

Two methods:

function(var1, var2,… varN)

function(of var1-varN)

Can use these methods separately or together

Function (example) Result

N(Q1, Q2, Q3, Q4) Returns the number of non-missing numeric values

Mean(of Q1-Q4) Returns the arithmetic mean (average)

Sum(0, of Q1-Q4) Returns the sum of nonmissing values (in this example, if all arguments are missing, returns 0)

Min(of Q1-Q4) Returns the smallest value

Max(of Q1-Q4) Returns the largest value
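
The functions above can be tried on one observation with a missing value. The quarterly values here are made up.

```sas
data _null_;
    q1 = 10; q2 = .; q3 = 30; q4 = 20;
    n_vals = n(q1, q2, q3, q4);   /* 3 non-missing values */
    avg    = mean(of q1-q4);      /* 20 (missing q2 is ignored) */
    total  = sum(0, of q1-q4);    /* 60 */
    low    = min(of q1-q4);       /* 10 */
    high   = max(of q1-q4);       /* 30 */
    put n_vals= avg= total= low= high=;
run;
```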

RANUNI function

RANUNI(seed)

Generates a random number between 0 and 1

seed : a number which is used to generate the first number in the random sequence

Can be any integer less than 2**31 - 1

If you use a seed of 0 or negative number, SAS will use the system clock to supply the seed

Function Result

ranuni(7815985) Returns a random number
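
A minimal sketch that generates five random numbers. With a positive seed, rerunning the step should reproduce the same sequence; the dataset and variable names are illustrative.

```sas
data randoms;
    do i = 1 to 5;
        /* The seed only initializes the stream on the first call */
        x = ranuni(7815985);
        output;
    end;
run;

proc print data=randoms;
run;
```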

LAG function

LAG<n>(argument)

Returns a value from a previous observation (generally; strictly, from the previous time the function was executed)

n : specifies the number of lagged values

argument : number, expression or variable to be lagged

LAG function

DIF function similar to LAG

Returns the difference between current value and previous value

i.e. DIF(x) = x - lag(x)
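
A small sketch of LAG and DIF on made-up monthly sales figures.

```sas
data sales;
    input month sales;
    prev_sales = lag(sales);   /* value from the previous observation */
    change     = dif(sales);   /* sales - lag(sales) */
    datalines;
1 100
2 120
3 90
;
run;

proc print data=sales;
run;
```

The first observation has missing `prev_sales` and `change`; the next two show 100 and 20, then 120 and -30. Calling LAG inside an IF-THEN block means it is not executed on every observation, so it is usually safer to compute the lagged value unconditionally and test it afterwards.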

Character functions

CASE functions

Converts character string to upper or lower case

Useful when using comparison operators

Function (example) Result

upcase('sas rocks') SAS ROCKS

lowcase('I AM WHISPERING') i am whispering

PROPCASE function

PROPCASE(argument, <delimiters>)

Converts all words to proper case

argument : character value that will be converted

delimiters : specifies one or more delimiters that indicate the beginning of a new word

Enclose delimiters in quotation marks

Default delimiters are blank, forward slash, hyphen, open parenthesis, period, and tab

Specifying delimiters overrides all default delimiters

Function Result

propcase('SAINT KITTS/NEVIS(U.S.)') Saint Kitts/Nevis(U.S.)

propcase("GEORGE O'KEEFE-BAIN", " '-") George O'Keefe-Bain
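
A brief sketch of the case functions in use: normalizing case before a comparison, and applying PROPCASE with custom delimiters. The variable names and values are illustrative.

```sas
data _null_;
    length state $ 10;
    state = 'Oregon';
    /* Case-insensitive comparison: normalize before comparing */
    if upcase(state) = 'OREGON' then put 'Match found';
    /* Blank, apostrophe, and hyphen each start a new word */
    name = propcase("GEORGE O'KEEFE-BAIN", " '-");
    put name=;
run;
```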

SUBSTR function

SUBSTR(string, position, <length>)

Extracts a substring (portion) of text

string : character value that text will be extracted from

position : beginning character position for substring

length : length of substring to extract

If omitted, SAS extracts the remainder of the string

Function Result

substr('JEFF STREET', 6) STREET

substr('621110', 1, 2) 62

substr('Dr. Paul Jones', 5, 4) Paul

SCAN function

SCAN(string, count, <charlist, <modifiers>>)

Returns the nth word from a character string

string : character value that words will be pulled from

count : integer that specifies the number of the word to select

Positive - counts from left to right

Negative - counts from right to left

charlist : optional delimiter identifier

modifiers : optional modifier that changes the behavior of the SCAN function

Function Result

scan('JEFF STREET', 2) STREET
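
SUBSTR works by character position and SCAN works by word. A minimal sketch on one of the strings used above:

```sas
data _null_;
    full_name = 'Dr. Paul Jones';
    title  = substr(full_name, 1, 3);   /* Dr.   (positions 1 through 3) */
    first  = scan(full_name, 2, ' ');   /* Paul  (second word) */
    last   = scan(full_name, -1, ' ');  /* Jones (first word from the right) */
    naics2 = substr('621110', 1, 2);    /* 62 */
    put title= first= last= naics2=;
run;
```

SCAN is often the easier choice when word lengths vary, since no positions need to be counted.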

CAT function

CAT(item1, item2,… itemN)

Concatenates items into a single character string

items : value to be concatenated

If numeric, value is first converted to character

Can also be written as:

CAT(of item1-itemN)

Function Result

cat('John', 'Smith') JohnSmith

cat('I', ' ', 'Heart', ' ', 'You') I Heart You

cat('(', area, ')', prefix, '-', suffix) (503)373-1793

cat(1225, ' Ferry St.') 1225 Ferry St.

COMPRESS function

COMPRESS(source, <chars>, <modifiers>)

Removes specified characters from a string

source : character value from which characters will be removed

chars : characters to be removed

If omitted, only blanks will be removed

modifiers : optional modifier that changes the behavior of the COMPRESS function

Function Result

compress('Mc Cartney') McCartney

compress('(503) 373-1793', '()- ') 5033731793
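
COMPRESS and CAT can be combined to clean and rebuild a phone number. This is a sketch with illustrative variable names. The LENGTH statement matters here: CAT keeps trailing blanks in character variables, so the pieces are declared at their exact widths.

```sas
data _null_;
    length area prefix $ 3 suffix $ 4;
    raw_phone = '(503) 373-1793';
    digits    = compress(raw_phone, '()- ');   /* 5033731793 */
    digits_kd = compress(raw_phone, , 'kd');   /* same result: k = keep, d = digits */
    area   = substr(digits, 1, 3);
    prefix = substr(digits, 4, 3);
    suffix = substr(digits, 7, 4);
    phone  = cat('(', area, ')', prefix, '-', suffix);   /* (503)373-1793 */
    put digits= digits_kd= phone=;
run;
```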

Special functions

PUT and INPUT

PUT function

PUT(source, format.)

Returns a value using a specified format

Most often used for converting numeric data to character

source : value to be reformatted

format : the format to be applied

The format must be the same type as the source (character/numeric)

Function Result

put(patient_id, 8.) A patient ID that is character

put(32000, dollar24.) $32,000

put(16739, date9.) 30OCT2005

put('M', $gender.) Male

PUT function

PUT function vs. Format statements:

Format statements change the appearance of the value

PUT functions change the value itself

Date and Char_Date look the same, but one is character and the other is numeric.
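
The contrast can be sketched as follows. The `$gender` format is user-defined, so it is created here with PROC FORMAT; its labels are an assumption for illustration.

```sas
proc format;
    value $gender 'M' = 'Male'
                  'F' = 'Female';
run;

data _null_;
    date = 16739;
    format date date9.;                       /* still numeric 16739, only displayed as 30OCT2005 */
    char_date = put(date, date9.);            /* a new character value '30OCT2005' */
    sex    = put('M', $gender.);              /* Male */
    amount = strip(put(32000, dollar24.));    /* $32,000 with the leading blanks removed */
    put date= char_date= sex= amount=;
run;
```

In the log, `date` and `char_date` both print as 30OCT2005, but only `date` can be used in date arithmetic.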

INPUT function

INPUT(source, <?|??> informat.)

Converts a value using a specified informat

Most often used for converting character data to numeric

source : value to be reformatted

? | ?? : Optional modifiers that suppress error messages

informat : the informat to be applied

Function Result

input('32000', 5.) 32000

input('32000', 5.2) 320.00

input('$2432.99', dollar24.2) 2432.99

input('30oct2005', date9.) 16739
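
A minimal sketch of the INPUT examples above, plus the `??` modifier applied to a value that cannot be read as a number. The variable names are illustrative.

```sas
data _null_;
    n1  = input('32000', 5.);               /* 32000 */
    n2  = input('32000', 5.2);              /* 320: no decimal point in the data, so two decimals are implied */
    n3  = input('$2432.99', dollar24.2);    /* 2432.99 */
    d   = input('30oct2005', date9.);       /* 16739 */
    bad = input('abc', ?? 8.);              /* missing; ?? suppresses the error messages */
    put n1= n2= n3= d= bad=;
run;
```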

For next week…

Read chapters 14 & 19