Week 05 Class Activities
File: week-05-24sep07.doc Directory (hp/compaq): C:\baileraj\Classes\Fall 2007\sta402\handouts Directory: \\Muserver2\USERS\B\\baileraj\Classes\sta402\handouts
Better Graphics in SAS via SAS/Graph
HIGH-RESOLUTION GRAPHICS AND FORMATS
* Introduce concepts related to high-resolution graphs
* PROC GCHART and PROC GPLOT for producing high-resolution graphs
other SAS high resolution graphics procedures …
GCHART (bar charts, pie charts, star charts)
GPLOT (line plot, scatter plot, regression plot, high-low plots, bubble plots)
G3D (3-dimensional surface plots)
GCONTOUR, G3GRID (interpolate/smooth data)
GMAP (block, choropleth, prism, surface)
GSLIDE (create text slide)
GPRINT (display as a graphic SAS procedure output that has been saved in a text file)
GREPLAY (combine several graphs into a single output)
To list all graphics devices in the current catalog …
proc gdevice catalog=sashelp.devices nofs; list; run;
* SAS-supplied formats and PROC FORMAT for user-defined formats
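A minimal sketch of a user-defined format, offered as an illustration rather than the course's own definition. The format name totfmt echoes the one referenced in a commented-out line of the nitrofen program in this handout, but its definition is not shown there, so the cut points, labels, and the small demo data set below are all assumptions.

```sas
/* Hypothetical user-defined format: cut points and labels are illustrative only */
proc format;
  value totfmt low -< 10  = 'low'
               10  -< 25  = 'medium'
               25  - high = 'high';
run;

/* Hypothetical demo data, not from the handout */
data demo;
  input total @@;
  ctotal = put(total, totfmt.);   /* character variable built from the format */
  datalines;
6 17 31
;
run;

proc print data=demo;
  format total totfmt.;           /* display the numeric variable with the format */
run;
```

The FORMAT statement changes only how total is displayed, while PUT stores the formatted label in a new character variable.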
References
http://support.sas.com/onlinedoc/912/docMainpage.jsp (follow SAS/GRAPH links)
or
http://www.units.muohio.edu/doc/sassystem/sasonlinedocv8/sasdoc/sashtml/main.htm
GPLOT figures
/* example SAS program that does simple linear regression */
/* defines library location for permanently storing a SAS data set */
libname class "\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples";

options ls=75;

data class.manatee;
  input year nboats manatees;
  cards;
77 447 13
78 460 21
79 481 24
80 498 16
81 513 24
82 512 20
83 526 15
84 559 34
85 585 33
86 614 33
87 645 39
88 675 43
89 711 50
90 719 47
;

proc print data=class.manatee;
  title "Manatee mortality data";
run;

/* Line printer scatterplot */
proc plot data=class.manatee;
  title2 "PROC PLOT: manatees killed plotted vs. # boats registered";
  plot manatees*nboats;
run;
/*
SAS has experimental support for statistical graphics using ODS
*/
ODS GRAPHICS (experimental in SAS 9) From http://support.sas.com/rnd/base/topics/statgraph/ …
On an experimental basis in SAS 9.1, a number of SAS/STAT procedures support an extension to the Output Delivery System (ODS) to create statistical graphics as automatically as tables. This facility is
referred to as ODS Statistical Graphics (or ODS Graphics for short).
With ODS Graphics, a procedure creates the graphs that are most commonly needed for a particular analysis. Using ODS Graphics eliminates the need to save numerical results in an output data set,
manipulate them with a DATA step program, and display them with a graphics procedure.
http://support.sas.com/91doc/getDoc/statug.hlp/odsgraph_sect4.htm
ODS Graphics produces graphs similar in nature to those that Minitab produces. [Thanks to Dr. Schaefer for the links]
SAS Procedures with ODS Graphics support
Base SAS
• CORR
SAS/ETS
• ARIMA • AUTOREG • ENTROPY • EXPAND • MODEL • SYSLIN • TIMESERIES • UCM • VARMAX • X12
SAS/STAT
• ANOVA • CORRESP • GAM • GENMOD • GLM • KDE • LIFETEST • LOESS • LOGISTIC • MI • MIXED • PHREG • PRINCOMP • PRINQUAL • REG • ROBUSTREG
SAS High-Performance Forecasting
• HPF
(from http://support.sas.com/91doc/docMainpage.jsp?_topic=statug.hlp/odsgraph_index.htm)
In many ways, creating graphics with ODS is analogous to creating tables with ODS. You use
• procedure options and defaults to determine which graphs are created
• ODS destination statements (such as ODS HTML) to specify the output destination for graphics
Additionally, you can use
• graph names in ODS SELECT and ODS EXCLUDE statements to select or exclude graphs from your output
• ODS styles to control the general appearance and consistency of all graphs
• ODS templates to control the layout and details of individual graphs. A default template is provided by SAS for each graph.
In SAS 9.1, the ODS destinations that support ODS Graphics include HTML, LATEX, PRINTER, and
RTF.
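The selection mechanism described above can be sketched as follows. This is a hedged illustration, not part of the original handout: it assumes the class.manatee data set from the GPLOT section exists, uses ODS TRACE to write the names of the output objects to the log, and then uses ODS SELECT with the PROC REG table name ParameterEstimates. Graph names are selected the same way once ODS TRACE has reported them.

```sas
/* Step 1: list the names of the output objects in the log */
ods trace on;
proc reg data=class.manatee;
  model manatees=nboats;
run;
quit;
ods trace off;

/* Step 2: keep only one named object (here the parameter estimates table) */
ods select ParameterEstimates;
proc reg data=class.manatee;
  model manatees=nboats;
run;
quit;
```

Running the trace step first avoids guessing object names, which can differ between procedures and SAS releases.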
ODS html;
ODS graphics on;
proc reg data=class.manatee plots(unpack);
  title2 "Stat graphics with ODS";
  model manatees=nboats;
run;
quit;
ODS graphics off;
ODS html close;
Manatee mortality data
Stat graphics with ODS

The REG Procedure
Model: MODEL1
Dependent Variable: manatees

Number of Observations Read    14
Number of Observations Used    14

Analysis of Variance
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               1        1711.97866     1711.97866      93.61    <.0001
Error              12         219.44991       18.28749
Corrected Total    13        1931.42857

Root MSE           4.27639    R-Square    0.8864
Dependent Mean    29.42857    Adj R-Sq    0.8769
Coeff Var         14.53141

Parameter Estimates
Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept     1             -41.43044           7.41222      -5.59      0.0001
nboats        1               0.12486           0.01290       9.68      <.0001
/*
Now look at what ODS graphics does with GLM
*/
data meat;
  input condition $ logcount @@;
  datalines;
Plastic 7.66 Plastic 6.98 Plastic 7.80
Vacuum 5.26 Vacuum 5.44 Vacuum 5.80
Mixed 7.41 Mixed 7.33 Mixed 7.04
Co2 3.51 Co2 2.91 Co2 3.66
;

title bacteria growth under 4 packaging conditions;
ods html;
ods graphics on;
proc glm data=meat order=data;
  title2 fitting the one-way anova model via GLM;
  class condition;
  model logcount = condition;
run;
quit;
ods graphics off;
ods html close;
bacteria growth under 4 packaging conditions
fitting the one-way anova model via GLM

The GLM Procedure

Class Level Information
Class        Levels    Values
condition         4    Plastic Vacuum Mixed Co2

Number of Observations Read    12
Number of Observations Used    12

Dependent Variable: logcount

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               3       32.87280000    10.95760000      94.58    <.0001
Error               8        0.92680000     0.11585000
Corrected Total    11       33.79960000

R-Square    Coeff Var    Root MSE    logcount Mean
0.972580     5.768940    0.340367         5.900000

Source       DF      Type I SS    Mean Square    F Value    Pr > F
condition     3    32.87280000    10.95760000      94.58    <.0001

Source       DF    Type III SS    Mean Square    F Value    Pr > F
condition     3    32.87280000    10.95760000      94.58    <.0001
/*
Now how about specifying graphs in SAS/Graph
*/
ODS RTF file="\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week05-fig1A.rtf";
proc gplot data=class.manatee;
  title 'Number of Manatees killed regressed on the number of boats registered in Florida';
  plot manatees*nboats;
run;
ODS RTF CLOSE;
/*
Let’s now add bells and whistles to this plot
*/
ODS RTF file="\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week05-fig2.rtf";
proc gplot data=class.manatee;
  title h=1.5 'A plot of the number of manatee deaths versus the number of boats registered in Florida';
  title2 h=1 '[Regression line with CI for mean along with smoothing spline fit displayed]';
  symbol1 interpol=rlclm95  /* r=regression, l=linear (q,c also possible),
                               clm=conf. int. for mean (cli also possible), 95=conf. level */
          value=diamond height=3 cv=red ci=blue co=green width=2;
  symbol2 interpol=SM55s;   /* smoothing spline (SM) that first sorts the x-axis data */
  plot manatees*nboats manatees*nboats / hminor=1 overlay
       regeqn;              /* adds regression eqn to bottom left of plot */
run;
ODS RTF CLOSE;
/*
The NITROFEN data set revisited …
*/
data class.nitrofen;
  infile 'M:\public.www\classes\sta402\SAS-programs\ch2-dat.txt'
         firstobs=16 expandtabs missover pad;
  input @9 animal 2. @17 conc 3. @25 brood1 2. @33 brood2 2.
        @41 brood3 2. @49 total 2.;
  /* creates character variable based on format */
  * cbrood3 = put(brood3,totfmt.);
  label animal = 'animal ID number';
  label conc = 'Nitrofen concentration';
  label brood1 = 'number of young in first brood';
  label brood2 = 'number of young in 2nd brood';
  label brood3 = 'number of young in 3rd brood';
  *label total = 'total young produced in three broods';

proc print data=class.nitrofen;
run;

* ODS RTF file='D:\baileraj\Classes\Fall 2003\sta402\SAS-programs\week5-fig2.rtf';
ODS RTF file="\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week05-fig2.rtf";
/* SYMBOL = defines characteristics of plotted symbols */
proc gplot data=class.nitrofen;
  title h=1.5 'A plot of the number of C. dubia produced at different Nitrofen concentrations';
  title2 h=1 '[mean +/- 2SD is plotted for each concentration]';
  symbol1 interpol=STD2T  /* plots +/- 2 SD from the mean at each conc */
                          /* T = add top and bottom bars to each 2 SD line */
          value=dot;
  plot total*conc / hminor=1   /* hminor = number of minor tick marks between major x-axis values */
                    haxis=0 to 350 by 50;
run;
ODS RTF CLOSE;
proc means data=class.nitrofen;
  class conc;
  var total;
  output out=nitromean mean=n_mean stddev=n_sd;
run;
proc print;
run;

Obs    conc    _TYPE_    _FREQ_    n_mean       n_sd
  1       .         0        50     22.88    10.7241
  2       0         1        10     31.40     3.5963
  3      80         1        10     31.50     3.2745
  4     160         1        10     28.30     2.3594
  5     235         1        10     17.20     5.9029
  6     310         1        10      6.00     3.7118

* ODS RTF file='D:\baileraj\Classes\Fall 2003\sta402\SAS-programs\week5-fig3.rtf';
ODS RTF file="\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week05-fig3.rtf";
proc gplot data=nitromean;
  title h=1.5 'Plot of mean number of C. dubia young produced at different Nitrofen concentrations';
  title2 h=1 '[bubble area proportional to std dev.]';
  bubble n_mean*conc=n_sd / bsize=15;   /* bsize helps resize bubble for display */
run;
ODS RTF CLOSE;
GCHART figures
* ODS RTF file='D:\baileraj\Classes\Fall 2003\sta402\SAS-programs\week5-fig4.rtf';
ODS RTF file="\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week05-fig4.rtf";

goptions reset=global gunit=pct border cback=white
         colors=(black blue green red)
         ftext=swiss ftitle=swissb htitle=5 htext=3.5;
title1 'Average Total Young by Nitrofen concentration';
axis1 label=('Total Young' j=c 'Error Bar Confidence Limits: 95%')
      minor=(number=1);
axis2 label=('Nitrofen' j=r 'Concentration');
pattern1 color=cyan;

proc gchart data=class.nitrofen;
  hbar conc / type=mean sumvar=total
              /* freqlabel='Number in Group' */
              /* meanlabel='Mean Number Young' */
              errorbar=bars clm=95
              midpoints=(0 80 160 235 310)
              raxis=axis1 maxis=axis2
              noframe coutline=black;
run;
ODS RTF CLOSE;

ODS RTF file="\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week05-fig5.rtf";
* ODS RTF file='D:\baileraj\Classes\Fall 2003\sta402\SAS-programs\week5-fig5.rtf';
proc gchart data=class.nitrofen;
  title1 'Total Young by Nitrofen concentration';
  star conc / sumvar=total;
run;
proc gchart data=class.nitrofen;
  title1 'Total Young by Nitrofen concentration';
  pie conc / sumvar=total;
run;
ODS RTF CLOSE;
GREPLAY figures
/* now trying something fancy using templates and GREPLAY
to get multiple figures on a page
REF:
http://www.units.muohio.edu/doc/sassystem/sasonlinedocv8/sasdoc/sashtml/gref/z61-ex.htm
*/
* libname class 'D:\baileraj\Classes\Fall 2003\sta402\data';
libname class "\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples";
/* libname class previously defined */

goptions reset=global gunit=pct border cback=white
         ftext=swissb htitle=6 htext=3;

proc greplay tc=class.tempcat nofs;
  tdef newtemp des='Five panel template'
    1/llx=0  lly=10 ulx=0  uly=50 urx=50  ury=50 lrx=50  lry=10 color=blue
    2/llx=0  lly=50 ulx=0  uly=90 urx=50  ury=90 lrx=50  lry=50 color=red
    3/llx=50 lly=50 ulx=50 uly=90 urx=100 ury=90 lrx=100 lry=50 color=green
    4/llx=50 lly=10 ulx=50 uly=50 urx=100 ury=50 lrx=100 lry=10 color=cyan;
  template newtemp;
  list template;
quit;

proc gplot data=class.nitrofen gout=class.graph;
  title c=red 'Brood 1';
  plot brood1*conc;
run;
  title 'Brood 2';
  plot brood2*conc;
run;
  title 'Brood 3';
  plot brood3*conc;
run;
  title 'TOTAL';
  plot total*conc;
run;

goptions hsize=0in vsize=0in;
proc gslide gout=class.graph;
  title 'PLOT of brood and total responses versus nitrofen concentration';
run;

goptions display;
ODS RTF file="\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week05-fig6.rtf";
* ODS RTF file='D:\baileraj\Classes\Fall 2003\sta402\SAS-programs\week5-fig6.rtf';
/* panel positions follow the TDEF coordinates above:
   1 = bottom left, 2 = top left, 3 = top right, 4 = bottom right */
proc greplay igout=class.graph gout=class.excat tc=class.tempcat nofs template=newtemp;
  treplay 3:gplot2   /* top right - brood 3 */
          1:gplot    /* bottom left - brood 1 */
          2:gplot1   /* top left - brood 2 */
          4:gplot3;  /* bottom right - total */
quit;
ODS RTF CLOSE;
G3D figures
libname class "\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples";
* libname class 'D:\baileraj\Classes\Fall 2003\sta402\data';

/* reshape to one record per animal-brood combination */
data new;
  set class.nitrofen;
  retain conc;
  brood=1; young=brood1; output;
  brood=2; young=brood2; output;
  brood=3; young=brood3; output;
  keep conc brood young;

/* jitter conc and brood so overlapping points can be seen */
data new2;
  set new;
  jconc = conc + (10-20*ranuni(0));
  jbrood = brood + (1-2*ranuni(0));
run;

/* the file name below is garbled in the source; reconstructed from the pattern of the other figures */
ODS RTF file="\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week05-fig7.rtf";
* ODS RTF file='D:\baileraj\Classes\Fall 2003\sta402\SAS-programs\week5-fig7.rtf';
proc g3d data=new2;
  title h=1 'Scatter plot of # young by conc. and brood (jittered)';
  scatter jconc*jbrood=young / xticknum=2 yticknum=2;
run;
quit;
ODS RTF CLOSE;
proc means data=new;
class conc brood;
var young;
output out=new3 mean=ymean;
run;
ODS RTF file="\\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week05-fig8.rtf";
* ODS RTF file='D:\baileraj\Classes\Fall 2003\sta402\SAS-programs\week5-fig8.rtf';
proc g3d data=new3;
  title h=1 'Surface plot of mean # young by conc. and brood';
  plot conc*brood=ymean / xticknum=2 yticknum=2 tilt=80;
run;
quit;
ODS RTF CLOSE;
/*
Collection of graphics used to compare distributions
Stefan Stanev contributed to the code presented below as part of an independent study
*/
/* =================================================================
Enter a simple two group comparison data set
REF:
================================================================= */
options formdlim='-';
data one;
  input value treat $9. @@;
  datalines;
18 Drug 40 Untreated 43 Drug 54 Untreated
28 Drug 26 Untreated 50 Drug 63 Untreated
16 Drug 21 Untreated 32 Drug 37 Untreated
13 Drug 39 Untreated 35 Drug 23 Untreated
38 Drug 48 Untreated 33 Drug 58 Untreated
6 Drug 28 Untreated 7 Drug 39 Untreated
;

/* =================================================================
   Construct summary statistics for the two groups
   ================================================================= */
proc sort data=one;
  by treat;
proc means mean std data=one;
  by treat;
  output out=bar mean=mean std=std;
run;

options nocenter nodate nonumber;
proc print data=bar;
run;
Obs    treat        _TYPE_    _FREQ_       mean        std
  1    Drug              0        12    26.5833    14.3619
  2    Untreated         0        12    39.6667    13.8586
/* =================================================================
   Dynamite/"bar" graphs - not necessarily best data display but often constructed
   ================================================================= */
proc gchart data = bar;
  vbar treat / sumvar=mean type=mean;
run;

[Figure: bar chart of the mean for each treat group (Drug, Untreated); vertical axis labeled "mean MEAN", ticks 0 to 40]
goptions reset;
axis1 label=none value=('Drug' 'Untreated');
axis2 label=('Number of Tapeworms') order=(0, 10, 20, 30, 40, 50, 60);
proc gchart data = bar;
  vbar treat / maxis=axis1 raxis=axis2 sumvar=mean type=mean;
run;

[Figure: bar chart for Drug and Untreated; vertical axis labeled "Number of Tapeworms", ticks 0 to 60]
goptions reset;
axis1 offset = (2 cm) label=none value=('Drug' 'Untreated');
axis2 label=(angle =90 'Number of Tapeworms') order=(0, 10, 20, 30, 40, 50, 60);
proc gchart data = bar;
  vbar treat / maxis=axis1 raxis=axis2 sumvar=mean type=mean;
run;

[Figure: bar chart for Drug and Untreated with offset bars and rotated vertical axis label; ticks 0 to 60]
[*] So, how can we add whiskers?
/* creating an ANNOTATE data set …
   Can label points on a graph …
   Can create custom graphs …
   From SAS help >>>>
   * annotate data set
     - each observation = command to draw a graphic element
                        = command to perform an action
     - annotate variables = action/position/attribute ("what"/"where"/"how")
     - variable types
       * drawing/programming action = FUNCTION
           POLY / DRAW / MOVE / POINT / BAR / LABEL (draw text)
       * positioning
           GROUP (GCHART)
           MIDPOINT (GCHART)
           SUBGROUP (GCHART)
           X, Y, Z (G3D)
       * attributes
           ANGLE (pie) / CBORDER / COLOR / LINE
           POSITION (placement and alignment for text strings)
           TEXT (text to use in label, symbol, comment)
*/
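data myanno;
  retain xsys ysys '2';    /* '2' = use the data values as the coordinate system */
  set bar;
  /* move to the top of the bar (the mean) */
  function='move'; midpoint=treat; y=mean; output;
  /* draw a whisker up to mean + 1 SD */
  function='draw'; y=mean+std; width=2; output;
run;
proc print data=myanno;
run;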
Obs  xsys  ysys  treat      _TYPE_  _FREQ_     mean      std  function  midpoint         y  width
  1  2     2     Drug            0      12  26.5833  14.3619  move      Drug       26.5833      .
  2  2     2     Drug            0      12  26.5833  14.3619  draw      Drug       40.9453      2
  3  2     2     Untreated       0      12  39.6667  13.8586  move      Untreated  39.6667      .
  4  2     2     Untreated       0      12  39.6667  13.8586  draw      Untreated  53.5253      2
goptions reset;
axis1 offset = (2 cm) label=none value=('Drug' 'Untreated');
axis2 label=(angle =90 'Number of Tapeworms') order=(0, 10, 20, 30, 40, 50, 60);
proc gchart data = bar;
  vbar treat / anno=myanno space=2 maxis=axis1 raxis=axis2 sumvar=mean type=mean;
run;

[Figure: bar chart for Drug and Untreated with annotated whiskers (mean + 1 SD); vertical axis ticks 0 to 60]

/* =================================================================
   Side-by-side boxplots
   ================================================================= */
proc sort data=one;
  by treat;
proc boxplot data=one;
  plot value*treat;
run;
[Figure: side-by-side boxplots of value by treat (Drug, Untreated); vertical axis "value", ticks 0 to 80]

goptions reset;
axis1 offset = (2 cm) label=none value=('Drug' 'Untreated');
axis2 label=(angle =90 'Number of Tapeworms') order=(0, 10, 20, 30, 40, 50, 60);
proc boxplot data = one;
  plot value*treat=' ' / haxis=axis1 vaxis=axis2;
run;

[Figure: side-by-side boxplots for Drug and Untreated; vertical axis "Number of Tapeworms", ticks 0 to 80]
/* =================================================================
   Stacked histograms
   ================================================================= */
axis3 label=none;
axis4 order = (0 1 2 3 4) label=(angle =90 'Frequency');
proc capability data=one;
  by treat;
  histogram value / endpoints = 0 10 20 30 40 50 60 70
                    haxis = axis3 vaxis = axis4;
run;
/* =================================================================
   Scatterplot with superimposed mean
   ================================================================= */
axis5 offset = (2 cm) label = ('Points Jittered horizontally')
      order = (-.2 .167 .54 .917 1.29)
      value = (' ' 'Drug' ' ' 'Untreated' ' ')
      major=none minor=none;

/* jitter the x position within each group */
data jitter;
  set one;
  if treat = 'Drug' then x = ranuni(0)/3;
  else x = .75 + ranuni(0)/3;

/* annotate data set: horizontal line at each group mean */
data anno2;
  retain xsys ysys '2';
  set bar;
  if treat = 'Drug' then do;
    function='move'; x=0; y=mean; output;
    function='draw'; x=0.33; width=2; output;
  end;
  if treat = 'Untreated' then do;
    function='move'; x=0.75; y=mean; output;
    /* the source is cut off here; the DRAW step and END below are
       assumed by symmetry with the Drug group (0.75 + 0.33 = 1.08) */
    function='draw'; x=1.08; width=2; output;
  end;
run;

/* axis2 is the 'Number of Tapeworms' axis defined for the boxplot above */
proc gplot data=jitter;
  plot value*x / haxis=axis5 vaxis=axis2 anno=anno2;
run;