Week 05 Class Activities

File: week-05-24sep07.doc
Directory (hp/compaq): C:\baileraj\Classes\Fall 2007\sta402\handouts
Directory: \\Muserver2\USERS\B\\baileraj\Classes\sta402\handouts

Better Graphics in SAS via SAS/Graph

HIGH-RESOLUTION GRAPHICS AND FORMATS

* Introduce concepts related to high-resolution graphs

* PROC GCHART and PROC GPLOT for producing high-resolution graphs

other SAS high resolution graphics procedures …

GCHART (bar charts, pie charts, star charts)

GPLOT (line plot, scatter plot, regression plot, high-low plots, bubble plots)

G3D (3-dimensional surface plots)

GCONTOUR, G3GRID (interpolate/smooth data)

GMAP (block, choropleth, prism, surface)

GSLIDE (create text slide)

GPRINT (display as a graphic SAS procedure output that has been saved in a text file)

GREPLAY (combine several graphs into a single output)

To list all graphics devices in the current catalog …

proc gdevice catalog=sashelp.devices nofs; list; run;

* SAS-supplied formats and PROC FORMAT for user-defined formats
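As a minimal sketch of the user-defined formats mentioned above: the format name TOTFMT appears later in this handout only in a commented-out line, and its definition is not given, so the cutpoints and labels below are assumptions chosen purely for illustration. The data set name NITROFEN2 and the variable CTOTAL are likewise hypothetical.

```sas
/* Hypothetical cutpoints: the handout does not define TOTFMT. */
proc format;
  value totfmt
    low -< 10 = 'low'
    10  -< 25 = 'medium'
    25 - high = 'high';
run;

/* Assumes the CLASS library and CLASS.NITROFEN from later in this handout. */
data nitrofen2;
  set class.nitrofen;
  length ctotal $6;
  ctotal = put(total, totfmt.);  /* character variable built from the format */
run;

proc print data=nitrofen2;
  var conc total ctotal;
  format conc 3.;                /* SAS-supplied numeric format */
run;
```

A FORMAT statement (for example `format total totfmt.;` inside a procedure) would apply the same grouping for display only, without creating a new variable.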

References

http://support.sas.com/onlinedoc/912/docMainpage.jsp (follow SAS/GRAPH links)

or

http://www.units.muohio.edu/doc/sassystem/sasonlinedocv8/sasdoc/sashtml/main.htm

GPLOT figures

/* example sas program that does simple linear regression */

/* defines library location for permanently storing a SAS data set */
libname class "\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples";

options ls=75;

data class.manatee;
  input year nboats manatees;
  cards;
77 447 13
78 460 21
79 481 24
80 498 16
81 513 24
82 512 20
83 526 15
84 559 34
85 585 33
86 614 33
87 645 39
88 675 43
89 711 50
90 719 47
;

proc print data=class.manatee;
  title "Manatee mortality data";
run;

/* Line printer scatterplot */
proc plot data=class.manatee;
  title2 "PROC PLOT: manatees killed plotted vs. # boats registered";
  plot manatees*nboats;
run;

/*
SAS has experimental ODS support for statistical graphics
*/

ODS GRAPHICS (experimental in SAS 9) From http://support.sas.com/rnd/base/topics/statgraph/ …

On an experimental basis in SAS 9.1, a number of SAS/STAT procedures support an extension to the Output Delivery System (ODS) to create statistical graphics as automatically as tables. This facility is referred to as ODS Statistical Graphics (or ODS Graphics for short).

With ODS Graphics, a procedure creates the graphs that are most commonly needed for a particular analysis. Using ODS Graphics eliminates the need to save numerical results in an output data set, manipulate them with a DATA step program, and display them with a graphics procedure.

http://support.sas.com/91doc/getDoc/statug.hlp/odsgraph_sect4.htm

ODS Graph produces graphs similar in nature to what Minitab does. [Thanks to Dr. Schaefer for the links]

SAS Procedures with ODS Graphics support

Base SAS

• CORR

SAS/ETS

• ARIMA • AUTOREG • ENTROPY • EXPAND • MODEL • SYSLIN • TIMESERIES • UCM • VARMAX • X12

SAS High-Performance Forecasting

• HPF

SAS/STAT

• ANOVA • CORRESP • GAM • GENMOD • GLM • KDE • LIFETEST • LOESS • LOGISTIC • MI • MIXED • PHREG • PRINCOMP • PRINQUAL • REG • ROBUSTREG

(from http://support.sas.com/91doc/docMainpage.jsp?_topic=statug.hlp/odsgraph_index.htm)

In many ways, creating graphics with ODS is analogous to creating tables with ODS. You use

• procedure options and defaults to determine which graphs are created
• ODS destination statements (such as ODS HTML) to specify the output destination for graphics

Additionally, you can use

• graph names in ODS SELECT and ODS EXCLUDE statements to select or exclude graphs from your output
• ODS styles to control the general appearance and consistency of all graphs
• ODS templates to control the layout and details of individual graphs. A default template is provided by SAS for each graph.

In SAS 9.1, the ODS destinations that support ODS Graphics include HTML, LATEX, PRINTER, and RTF.
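The selection mechanism described above can be sketched as follows. ODS TRACE writes the name of every output object (tables and, when ODS Graphics is on, graphs) to the log, and those names can then be used in ODS SELECT or ODS EXCLUDE. The two table names used here (ANOVA, ParameterEstimates) are standard PROC REG table names; graph names vary by procedure and release, so the sketch relies on the trace to discover them and does not assume any. It also assumes the CLASS.MANATEE data set created earlier in this handout.

```sas
/* Step 1: list the names of all output objects in the log */
ods trace on;
ods html;
ods graphics on;
proc reg data=class.manatee;
  model manatees = nboats;
run;
quit;
ods graphics off;
ods html close;
ods trace off;

/* Step 2: keep only selected objects, using names reported by the trace */
ods html;
ods select ANOVA ParameterEstimates;
proc reg data=class.manatee;
  model manatees = nboats;
run;
quit;
ods html close;
```

Replacing ODS SELECT with ODS EXCLUDE and the same names would show everything except those two tables.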

ODS html;
ODS graphics on;
proc reg data=class.manatee plots(unpack);
  title2 "Stat graphics with ODS";
  model manatees=nboats;
run;
quit;
ODS graphics off;
ODS html close;

Manatee mortality data
Stat graphics with ODS

The REG Procedure
Model: MODEL1
Dependent Variable: manatees

Number of Observations Read    14
Number of Observations Used    14

Analysis of Variance

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               1        1711.97866     1711.97866      93.61    <.0001
Error              12         219.44991       18.28749
Corrected Total    13        1931.42857

Root MSE           4.27639    R-Square    0.8864
Dependent Mean    29.42857    Adj R-Sq    0.8769
Coeff Var         14.53141

Parameter Estimates

Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept     1             -41.43044           7.41222      -5.59      0.0001
nboats        1               0.12486           0.01290       9.68      <.0001

[Figure: ODS Graphics plots from PROC REG (Model: MODEL1, dependent variable manatees)]

/*

Now look at what ODS graphics does with GLM

*/

data meat;
  input condition $ logcount @@;
  datalines;
Plastic 7.66 Plastic 6.98 Plastic 7.80
Vacuum 5.26 Vacuum 5.44 Vacuum 5.80
Mixed 7.41 Mixed 7.33 Mixed 7.04
Co2 3.51 Co2 2.91 Co2 3.66
;

title bacteria growth under 4 packaging conditions;
ods html;
ods graphics on;

proc glm data=meat order=data;
  title2 fitting the one-way anova model via GLM;
  class condition;
  model logcount = condition;
run;
quit;

ods graphics off;
ods html close;

bacteria growth under 4 packaging conditions
fitting the one-way anova model via GLM

The GLM Procedure

Class Level Information

Class        Levels    Values
condition         4    Plastic Vacuum Mixed Co2

Number of Observations Read    12
Number of Observations Used    12

Dependent Variable: logcount

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               3       32.87280000    10.95760000      94.58    <.0001
Error               8        0.92680000     0.11585000
Corrected Total    11       33.79960000

R-Square    Coeff Var    Root MSE    logcount Mean
0.972580     5.768940    0.340367         5.900000

Source       DF      Type I SS    Mean Square    F Value    Pr > F
condition     3    32.87280000    10.95760000      94.58    <.0001

Source       DF    Type III SS    Mean Square    F Value    Pr > F
condition     3    32.87280000    10.95760000      94.58    <.0001

/*

Now how about specifying graphs in SAS/Graph

*/

ODS RTF file="\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week05-fig1A.rtf";

proc gplot data=class.manatee;
  title 'Number of Manatees killed regressed on the number of boats registered in Florida';
  plot manatees*nboats;
run;

ODS RTF CLOSE;

/*

Let’s now add bells and whistles to this plot

*/

ODS RTF file="\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week05-fig2.rtf";

proc gplot data=class.manatee;
  title h=1.5 'A plot of the number of manatee deaths versus the number of boats registered in Florida';
  title2 h=1 '[Regression line with CI for mean along with smoothing spline fit displayed]';
  symbol1 interpol=rlclm95   /* r=regression, l=linear (q,c also possible),
                                clm=conf. int. mean (cli option), 95= conf. level */
          value=diamond height=3 cv=red ci=blue co=green width=2;
  symbol2 interpol=SM55s;    /* smoothing spline (SM) that first sorts the x-axis data */
  plot manatees*nboats manatees*nboats / hminor=1 overlay
       regeqn;               /* adds regression eqn to bottom left of plot */
run;

ODS RTF CLOSE;

/*

The NITROFEN data set revisited …

*/

data class.nitrofen;
  infile 'M:\public.www\classes\sta402\SAS-programs\ch2-dat.txt'
         firstobs=16 expandtabs missover pad;
  input @9 animal 2. @17 conc 3. @25 brood1 2. @33 brood2 2.
        @41 brood3 2. @49 total 2.;
  /* creates character variable based on format */
  * cbrood3 = put(brood3,totfmt.);
  label animal = animal ID number;
  label conc = Nitrofen concentration;
  label brood1 = number of young in first brood;
  label brood2 = number of young in 2nd brood;
  label brood3 = number of young in 3rd brood;
  *label total = total young produced in three broods;
run;

proc print data=class.nitrofen;
run;

* ODS RTF file='D:\baileraj\Classes\Fall 2003\sta402\SAS-programs\week5-fig2.rtf';
/* note: this reuses the file name week05-fig2.rtf from the manatee plot above */
ODS RTF file="\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week05-fig2.rtf";

/* SYMBOL = defines characteristics of plotted symbols */
proc gplot data=class.nitrofen;
  title h=1.5 'A plot of the number of C. dubia produced at different Nitrofen concentrations';
  title2 h=1 '[mean +/- 2SD is plotted for each concentration]';
  symbol1 interpol=STD2T   /* plots +/- 2 SD from the mean at each conc */
                           /* T= add top and bottom to each 2 SD diff */
          value=dot;
  plot total*conc / hminor=1   /* hminor = number of minor tick marks between major x-axis ticks */
                    haxis=0 to 350 by 50;
run;

ODS RTF CLOSE;

proc means data=class.nitrofen;
  class conc;
  var total;
  output out=nitromean mean=n_mean stddev=n_sd;
run;

proc print;
run;

Obs    conc    _TYPE_    _FREQ_    n_mean       n_sd
  1       .         0        50     22.88    10.7241
  2       0         1        10     31.40     3.5963
  3      80         1        10     31.50     3.2745
  4     160         1        10     28.30     2.3594
  5     235         1        10     17.20     5.9029
  6     310         1        10      6.00     3.7118

* ODS RTF file='D:\baileraj\Classes\Fall 2003\sta402\SAS-programs\week5-fig3.rtf';
ODS RTF file="\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week05-fig3.rtf";

proc gplot data=nitromean;
  title h=1.5 'Plot of mean number of C. dubia young produced at different Nitrofen concentrations';
  title2 h=1 '[bubble area proportional to std dev.]';
  bubble n_mean*conc=n_sd / bsize=15;   /* bsize helps resize bubble for display */
run;

ODS RTF CLOSE;
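The PROC MEANS listing above includes an overall summary row (_TYPE_=0, conc missing) in addition to the five concentration groups. If only the per-concentration rows are wanted in the output data set, one option is the NWAY option, sketched below. The output data set name NITROMEAN_NWAY is a hypothetical name chosen so the original NITROMEAN is not overwritten; NOPRINT simply suppresses the printed summary.

```sas
/* NWAY keeps only the rows for the full CLASS combination (here _TYPE_=1) */
proc means data=class.nitrofen nway noprint;
  class conc;
  var total;
  output out=nitromean_nway mean=n_mean std=n_sd;
run;

proc print data=nitromean_nway;
run;
```

With a single CLASS variable this drops just the overall row; with several CLASS variables it would also drop the marginal summaries.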

GCHART figures

* ODS RTF file='D:\baileraj\Classes\Fall 2003\sta402\SAS-programs\week5-fig4.rtf';
ODS RTF file="\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week05-fig4.rtf";

goptions reset=global gunit=pct border cback=white
         colors=(black blue green red)
         ftext=swiss ftitle=swissb htitle=5 htext=3.5;
title1 'Average Total Young by Nitrofen concentration';
axis1 label=('Total Young' j=c 'Error Bar Confidence Limits: 95%')
      minor=(number=1);
axis2 label=('Nitrofen' j=r 'Concentration');
pattern1 color=cyan;

proc gchart data=class.nitrofen;
  hbar conc / type=mean sumvar=total
              /* freqlabel='Number in Group' */
              /* meanlabel='Mean Number Young' */
              errorbar=bars clm=95
              midpoints=(0 80 160 235 310)
              raxis=axis1 maxis=axis2
              noframe coutline=black;
run;

ODS RTF CLOSE;

ODS RTF file="\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week05-fig5.rtf";
* ODS RTF file='D:\baileraj\Classes\Fall 2003\sta402\SAS-programs\week5-fig5.rtf';

proc gchart data=class.nitrofen;
  title1 'Total Young by Nitrofen concentration';
  star conc / sumvar=total;
run;

proc gchart data=class.nitrofen;
  title1 'Total Young by Nitrofen concentration';
  pie conc / sumvar=total;
run;

ODS RTF CLOSE;

GREPLAY figures

/* now trying something fancy using templates and GREPLAY

to get multiple figures on a page

REF:

http://www.units.muohio.edu/doc/sassystem/sasonlinedocv8/sasdoc/sashtml/gref/z61-ex.htm

*/

* libname class 'D:\baileraj\Classes\Fall 2003\sta402\data';
libname class "\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples";

/* libname class previously defined */
goptions reset=global gunit=pct border cback=white
         ftext=swissb htitle=6 htext=3;

proc greplay tc=class.tempcat nofs;
  tdef newtemp des='Five panel template'
    1/llx=0  lly=10 ulx=0  uly=50 urx=50  ury=50 lrx=50  lry=10 color=blue
    2/llx=0  lly=50 ulx=0  uly=90 urx=50  ury=90 lrx=50  lry=50 color=red
    3/llx=50 lly=50 ulx=50 uly=90 urx=100 ury=90 lrx=100 lry=50 color=green
    4/llx=50 lly=10 ulx=50 uly=50 urx=100 ury=50 lrx=100 lry=10 color=cyan;
  template newtemp;
  list template;
quit;

proc gplot data=class.nitrofen gout=class.graph;
  title c=red 'Brood 1';
  plot brood1*conc;
run;
  title 'Brood 2';
  plot brood2*conc;
run;
  title 'Brood 3';
  plot brood3*conc;
run;
  title 'TOTAL';
  plot total*conc;
run;

goptions hsize=0in vsize=0in;
proc gslide gout=class.graph;
  title 'PLOT of brood and total responses versus nitrofen concentration';
run;

goptions display;
ODS RTF file="\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week05-fig6.rtf";
* ODS RTF file='D:\baileraj\Classes\Fall 2003\sta402\SAS-programs\week5-fig6.rtf';

/* panel positions follow the TDEF coordinates above */
proc greplay igout=class.graph gout=class.excat tc=class.tempcat nofs template=newtemp;
  treplay 3:gplot2    /* panel 3 = top right    - brood 3 */
          1:gplot     /* panel 1 = bottom left  - brood 1 */
          2:gplot1    /* panel 2 = top left     - brood 2 */
          4:gplot3;   /* panel 4 = bottom right - total   */
quit;

ODS RTF CLOSE;

G3D figures

libname class "\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples";

* libname class 'D:\baileraj\Classes\Fall 2003\sta402\data';

data new;
  set class.nitrofen;
  retain conc;
  brood=1; young=brood1; output;
  brood=2; young=brood2; output;
  brood=3; young=brood3; output;
  keep conc brood young;

data new2;
  set new;
  jconc  = conc  + (10-20*ranuni(0));
  jbrood = brood + (1-2*ranuni(0));
run;

ODS RTF file="\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week05-fig7.rtf";
* ODS RTF file='D:\baileraj\Classes\Fall 2003\sta402\SAS-programs\week5-fig7.rtf';

proc g3d data=new2;
  title h=1 'Scatter plot of # young by conc. and brood (jittered)';
  scatter jconc*jbrood=young / xticknum=2 yticknum=2;
run;
quit;

ODS RTF CLOSE;

proc means data=new;

class conc brood;

var young;

output out=new3 mean=ymean;

run;

ODS RTF file="\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week05-fig8.rtf";
* ODS RTF file='D:\baileraj\Classes\Fall 2003\sta402\SAS-programs\week5-fig8.rtf';

proc g3d data=new3;
  title h=1 'Surface plot of mean # young by conc. and brood';
  plot conc*brood=ymean / xticknum=2 yticknum=2 tilt=80;
run;
quit;

ODS RTF CLOSE;

/*

Collection of graphics used to compare distributions

Stefan Stanev contributed to the code presented below as part of an independent study

*/

/* =================================================================

Enter a simple two group comparison data set

REF:

================================================================= */

options formdlim='-';

data one;
  /* the colon modifier reads treat as a blank-delimited value, so the
     data lines do not have to be column-aligned */
  input value treat :$9. @@;
  datalines;
18 Drug 40 Untreated
43 Drug 54 Untreated
28 Drug 26 Untreated
50 Drug 63 Untreated
16 Drug 21 Untreated
32 Drug 37 Untreated
13 Drug 39 Untreated
35 Drug 23 Untreated
38 Drug 48 Untreated
33 Drug 58 Untreated
6 Drug 28 Untreated
7 Drug 39 Untreated
;

/* =================================================================
   Construct summary statistics for the two groups
   ================================================================= */
proc sort data=one;
  by treat;

proc means mean std data=one;
  by treat;
  output out=bar mean=mean std=std;
run;

options nocenter nodate nonumber;
proc print data=bar;
run;

Obs    treat        _TYPE_    _FREQ_       mean        std
  1    Drug              0        12    26.5833    14.3619
  2    Untreated         0        12    39.6667    13.8586

/* =================================================================
   Dynamite/"bar" graphs - not necessarily best data display but often constructed
   ================================================================= */
proc gchart data = bar;
  vbar treat / sumvar=mean type=mean;
run;

[Figure: vertical bar chart of mean (response axis "mean MEAN", 0 to 40) by treat (Drug, Untreated)]

goptions reset;
axis1 label=none value=('Drug' 'Untreated');
axis2 label=('Number of Tapeworms') order=(0, 10, 20, 30, 40, 50, 60);
proc gchart data = bar;
  vbar treat / maxis=axis1 raxis=axis2 sumvar=mean type=mean;
run;

[Figure: vertical bar chart; response axis "Number of Tapeworms" (0 to 60), bars for Drug and Untreated]

goptions reset;
axis1 offset = (2 cm) label=none value=('Drug' 'Untreated');
axis2 label=(angle =90 'Number of Tapeworms') order=(0, 10, 20, 30, 40, 50, 60);
proc gchart data = bar;
  vbar treat / maxis=axis1 raxis=axis2 sumvar=mean type=mean;
run;

[Figure: vertical bar chart with rotated response-axis label; response axis 0 to 60, bars for Drug and Untreated]

[*] So, how can we add whiskers?

/*
creating an ANNOTATE data set …
  Can label points on a graph …
  Can create custom graphs …

From SAS help >>>>
* annotate data set
    - each observation = command to draw a graphic element
                       = command to perform an action
    - annotate variables = action/position/attribute ("what"/"where"/"how")
    - variable types
        * drawing/programming action = FUNCTION
            POLY / DRAW / MOVE / POINT / BAR / LABEL (draw text)
        * positioning
            GROUP (GCHART)
            MIDPOINT (GCHART)
            SUBGROUP (GCHART)
            X, Y, Z (G3D)
        * attributes
            ANGLE (pie) / CBORDER / COLOR / LINE
            POSITION (placement and alignment for text strings)
            TEXT (text to use in label, symbol, comment)
*/

data myanno;
  retain xsys ysys '2';
  set bar;
  function='move'; midpoint=treat; y=mean; output;
  function='draw'; y=mean+std; width=2; output;
run;

proc print data=myanno;
run;

Obs  xsys  ysys  treat      _TYPE_  _FREQ_     mean      std  function  midpoint         y  width
  1  2     2     Drug            0      12  26.5833  14.3619  move      Drug       26.5833      .
  2  2     2     Drug            0      12  26.5833  14.3619  draw      Drug       40.9453      2
  3  2     2     Untreated       0      12  39.6667  13.8586  move      Untreated  39.6667      .
  4  2     2     Untreated       0      12  39.6667  13.8586  draw      Untreated  53.5253      2

goptions reset;
axis1 offset = (2 cm) label=none value=('Drug' 'Untreated');
axis2 label=(angle =90 'Number of Tapeworms') order=(0, 10, 20, 30, 40, 50, 60);
proc gchart data = bar;
  vbar treat / anno=myanno space=2 maxis=axis1 raxis=axis2 sumvar=mean type=mean;
run;

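[Figure: vertical bar chart with annotated upper whiskers (mean + std); response axis 0 to 60, bars for Drug and Untreated]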
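The MYANNO data set above draws a whisker from the top of each bar up to mean + std. A possible variant, sketched here under stated assumptions, draws the full mean - std to mean + std interval. The data set name MYANNO2 is hypothetical. Two choices differ from the handout and are assumptions of this sketch: the Annotate variable SIZE is used for line thickness, and WHEN='a' asks for the line to be drawn after the bars so the lower half is not hidden behind them. Whether the lower half is visible against the bar fill depends on the colors and patterns in effect. The sketch assumes the BAR data set created earlier.

```sas
/* Annotate data set: one MOVE and one DRAW observation per bar */
data myanno2;
  length function $8;
  retain xsys ysys '2'    /* '2' = data coordinates, as in MYANNO */
         when 'a';        /* draw after the chart itself (assumption of this sketch) */
  set bar;
  midpoint = treat;
  function = 'move'; y = mean - std; output;
  function = 'draw'; y = mean + std; size = 2; output;
run;

goptions reset;
axis1 offset = (2 cm) label=none value=('Drug' 'Untreated');
axis2 label=(angle=90 'Number of Tapeworms') order=(0, 10, 20, 30, 40, 50, 60);

proc gchart data=bar;
  vbar treat / anno=myanno2 space=2 maxis=axis1 raxis=axis2
               sumvar=mean type=mean;
run;
quit;
```

With the summary statistics printed earlier, both intervals (about 12.2 to 40.9 for Drug and 25.8 to 53.5 for Untreated) fall inside the 0 to 60 axis range.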

/* =================================================================
   Side-by-side boxplots
   ================================================================= */
proc sort data=one;
  by treat;

proc boxplot data=one;
  plot value*treat;
run;

[Figure: side-by-side boxplots of value (axis 0 to 80) by treat (Drug, Untreated)]

goptions reset;
axis1 offset = (2 cm) label=none value=('Drug' 'Untreated');
axis2 label=(angle =90 'Number of Tapeworms') order=(0, 10, 20, 30, 40, 50, 60);
proc boxplot data = one;
  plot value*treat=' ' / haxis=axis1 vaxis=axis2;
run;

[Figure: side-by-side boxplots; vertical axis "Number of Tapeworms" (0 to 80), groups Drug and Untreated]

/* =================================================================
   Stacked histograms
   ================================================================= */
axis3 label=none;
axis4 order = (0 1 2 3 4) label=(angle =90 'Frequency');
proc capability data=one;
  by treat;
  histogram value / endpoints = 0 10 20 30 40 50 60 70
                    haxis = axis3 vaxis = axis4;
run;

/* =================================================================
   Scatterplot with superimposed mean
   ================================================================= */
axis5 offset = (2 cm)
      label = ('Points Jittered horizontally')
      order = (-.2 .167 .54 .917 1.29)
      value = (' ' 'Drug' ' ' 'Untreated' ' ')
      major=none minor=none;

data jitter;
  set one;
  if treat = 'Drug' then x = ranuni(0)/3;
  else x = .75 + ranuni(0)/3;

data anno2;
  retain xsys ysys '2';
  set bar;
  if treat = 'Drug' then do;
    function='move'; x=0; y=mean; output;
    function='draw'; x=0.33; width=2; output;
  end;
  if treat = 'Untreated' then do;
    function='move'; x=0.75; y=mean; output;
    /* the source is cut off at this point; the DRAW step and END below are
       reconstructed by symmetry with the Drug block (x=1.08 is an assumed endpoint) */
    function='draw'; x=1.08; width=2; output;
  end;
run;

proc gplot data=jitter;
  plot value*x / haxis=axis5 vaxis=axis2 anno=anno2;
run;
