14
1 Integrating SAS and Excel: an Overview and Comparison of Three Methods for Using SAS ® to Create and Access Data in Excel Nathan Clausen, U.S. Bureau of Labor Statistics, Washington, DC Edmond Cheng, U.S. Bureau of Labor Statistics, Washington, DC ABSTRACT This paper looks at several ways integrating Excel with SAS. While SAS is powerful data manipulation and analy- sis tool, there are times when it may be better to view and interact with data in Excel. SAS easily facilitates this interaction in several ways. Only a basic knowledge of SAS is assumed, with references to more in-depth SAS resources provided. First, we look at three methods for creating/consuming Excel data in SAS: PROC EXPORT, ODS with the Ex- celXP tagset, and the Excel libname engine. For each method, we give a brief overview of usage and an outline of features that can be advantageous depending on the use case. Second, we outline a practical scenario for implementing each of the interaction methods, with some basic exam- ples of the resulting solution. These include creating custom reports, distributing SAS data to non-SAS users, creating interactive Excel analysis tools, and incorporating Excel inputs and logic in SAS programs. Finally, we summarize the three methods using a feature comparison matrix to highlight the strengths and weak- nesses of each method such as the ability to customize output, data integration, effort to create/maintain code, platform limitations, and technical considerations. INTRODUCTION SAS/BASE comes with a range of methods for working with data between SAS and Microsoft Excel. SAS users sometimes run into occasions when the data they are working with come and go in formats other than SAS data- sets. These data might be ASCII files, some form of delimited text files, Microsoft files, XML, HTML, or variety types of proprietary database formats. Unless the customers have some knowledge of SAS, they most likely need their data in a format other than a SAS datasets. Or they might prefer the data be summarized in customized re- ports on a recurring basis. Although SAS is a powerful software package for running statistical procedures, per- forming routine data manipulation tasks, and handling large data, external customers will need a data format which they can easily access and perform independent analysis. For such cases, this paper focuses on a com- mon format in most work environment: Microsoft Excel spreadsheets. Many SAS user papers and SAS documents cover techniques for working with data beyond SAS data sets. This paper does not focus on one particular method, but rather compares at three different methods PROC IM- PORT/EXPORT, ODS with ExcelXP tagset, and Excel LIBNAME. The first section provides an overview on each method. The next section demonstrates how the three methods are applicable in four common work scenarios. Users who are already familiar with working Excel and SAS can skip to the last section which shows a summary matrix on features and usages among the three methods. OVERVIEW OF THE THREE METHODS SAS software provides various methods in working with Microsoft Excel data. The following list identifies the common modes of interacting with Excel. The paper will go in details for the three methods that are highlighted. PROC ACCESS, PROC IMPORT/EXPORT ODS CSV, MSOFFICE2K, ExcelXP, HTML SAS/ACCESS LIBNAME engine, SAS/ACCESS SQL Pass-Through facility, PROC DBLOAD Dynamic Data Exchange (DDE) METHOD 1 PROC EXPORT Description The EXPORT procedure is part of the SAS/ACCESS package. It reads a SAS data set and writes the data to an external file, including not only Excel files, but also ACCESS, delimited files (CSV), Lotus 1-2-3, and a number of other database formats. Foundations and Fundamentals NESUG 2012

Integrating SAS and Excel: An overview and comparison of ... · change the content in an Excel file through the libname engine, a temporary dataset must first be created with the

  • Upload
    others

  • View
    7

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Integrating SAS and Excel: An overview and comparison of ... · change the content in an Excel file through the libname engine, a temporary dataset must first be created with the

1

Integrating SAS and Excel: an Overview and Comparison of Three Methods for Using SAS® to Create and Access Data in Excel

Nathan Clausen, U.S. Bureau of Labor Statistics, Washington, DC Edmond Cheng, U.S. Bureau of Labor Statistics, Washington, DC

ABSTRACT

This paper looks at several ways integrating Excel with SAS. While SAS is powerful data manipulation and analy-sis tool, there are times when it may be better to view and interact with data in Excel. SAS easily facilitates this interaction in several ways. Only a basic knowledge of SAS is assumed, with references to more in-depth SAS resources provided.

First, we look at three methods for creating/consuming Excel data in SAS: PROC EXPORT, ODS with the Ex-celXP tagset, and the Excel libname engine. For each method, we give a brief overview of usage and an outline of features that can be advantageous depending on the use case.

Second, we outline a practical scenario for implementing each of the interaction methods, with some basic exam-ples of the resulting solution. These include creating custom reports, distributing SAS data to non-SAS users, creating interactive Excel analysis tools, and incorporating Excel inputs and logic in SAS programs.

Finally, we summarize the three methods using a feature comparison matrix to highlight the strengths and weak-nesses of each method such as the ability to customize output, data integration, effort to create/maintain code, platform limitations, and technical considerations.

INTRODUCTION

SAS/BASE comes with a range of methods for working with data between SAS and Microsoft Excel. SAS users sometimes run into occasions when the data they are working with come and go in formats other than SAS data-sets. These data might be ASCII files, some form of delimited text files, Microsoft files, XML, HTML, or variety types of proprietary database formats. Unless the customers have some knowledge of SAS, they most likely need their data in a format other than a SAS datasets. Or they might prefer the data be summarized in customized re-ports on a recurring basis. Although SAS is a powerful software package for running statistical procedures, per-forming routine data manipulation tasks, and handling large data, external customers will need a data format which they can easily access and perform independent analysis. For such cases, this paper focuses on a com-mon format in most work environment: Microsoft Excel spreadsheets.

Many SAS user papers and SAS documents cover techniques for working with data beyond SAS data sets. This paper does not focus on one particular method, but rather compares at three different methods – PROC IM-PORT/EXPORT, ODS with ExcelXP tagset, and Excel LIBNAME. The first section provides an overview on each method. The next section demonstrates how the three methods are applicable in four common work scenarios. Users who are already familiar with working Excel and SAS can skip to the last section which shows a summary matrix on features and usages among the three methods.

OVERVIEW OF THE THREE METHODS

SAS software provides various methods in working with Microsoft Excel data. The following list identifies the common modes of interacting with Excel. The paper will go in details for the three methods that are highlighted.

• PROC ACCESS, PROC IMPORT/EXPORT

• ODS CSV, MSOFFICE2K, ExcelXP, HTML

• SAS/ACCESS LIBNAME engine, SAS/ACCESS SQL Pass-Through facility, PROC DBLOAD

• Dynamic Data Exchange (DDE)

METHOD 1 – PROC EXPORT

Description The EXPORT procedure is part of the SAS/ACCESS package. It reads a SAS data set and writes the data to an external file, including not only Excel files, but also ACCESS, delimited files (CSV), Lotus 1-2-3, and a number of other database formats.

Foundations and FundamentalsNESUG 2012

Page 2: Integrating SAS and Excel: An overview and comparison of ... · change the content in an Excel file through the libname engine, a temporary dataset must first be created with the

2

Usage The EXPORT procedure consists of a single statement identifying the input data set, the output file, and some optional arguments. Options include whether or not to replace existing data when exporting the data (de-faults to not replace), and whether or not to use variable labels or names (defaults to names). The EXPORT pro-cedure has an option (DBMS) to specify the database identifier, identifying the format of the output file. For pur-poses of this analysis, the relevant values are either EXCEL or XLS. If you specify DBMS=EXCEL, the exported file will use either the current Excel file format (.xlsx) or the previous Excel file format (.xls) depending on the file name in the OUTFILE option.

The following code shows how to export the shoes data set to Excel, using variable labels as column headers:

PROC EXPORT DATA=togo DBMS=EXCEL

OUTFILE='C:\path\t1.xlsx'

REPLACE label;

RUN;

It should be noted that while the DBMS=EXCEL setting can create either .xls or .xlsx files and the DBMS=XLS setting creates only .xls files, the XLS database identifier retains more formatting, such as dollar signs. It is also significantly faster.

Advantages and Limitations The EXPORT procedure is a quick and convenient method for exporting data from SAS into Excel files. It works both on Windows as well as UNIX environments, though only the XLS data-base identifier is supported with a standard installation.

METHOD 2 – ODS EXCELXP TAGSET

Description The ODS ExcelXP tagset leverages the SAS ODS to produce XML formatted output that can be opened and displayed in Excel. The customizability of the ODS tagset allows for a high level of precision in the layout and presentation of customized reports.

Usage To direct report output to the .xml file using the ODS ExcelXP tagset, simply open the ODS tag-sets.excelxp destination and specify the output file name. There are a large number of options than can be speci-fied in the ODS statement to control the output formatting. For example, the „sheet_name=‟ option allows the user to specify the name of the spreadsheet in the workbook to which the output will be saved.

ods tagsets.excelxp file="C:\path\rpt.xml";

PROC PRINT…

PROC REPORT…

ods tagsets.excelxp close;

Advantages and Limitations The ODS ExcelXP tagset is by far the most powerful of the options presented in this paper for generating customized reports in Excel. The ability to control layout, colors, and fonts as well as page orientation and print options gives the report designer a large range of tools for designing reports. And, be-cause the output is XML, this method is platform independent. The XML file format does have some drawbacks, however. Code to create customized reports can quickly become quite complex, and the output is not compatible with further automated processing in Excel. (See http://support.sas.com/rnd/base/ods/odsmarkup/excltags.tpl)

METHOD 3 – EXCEL LIBNAME ENGINE

Description The Excel Libname Engine allows SAS to access Excel worksheets within an Excel file as if they were datasets within a SAS library.

Usage The Excel libname engine can be accessed by simply indicating an Excel file name in the physical path when assigning a library. The SAS log output from the libname statement indicates that the libname engine is EXCEL. For example:

LIBNAME ex "C:\path\L1.xlsx";

Advantages and Limitations Interacting with Excel through the EXCEL libname engine provides an easy way to read and create Excel data from SAS as if it were SAS data. However, it is not without some drawbacks. A minor drawback is that much formatting is lost as variables are converted either to numeric or general Excel for-

Foundations and FundamentalsNESUG 2012

Page 3: Integrating SAS and Excel: An overview and comparison of ... · change the content in an Excel file through the libname engine, a temporary dataset must first be created with the

3

mats. A major drawback is that the engine does not allow for directly updating data in the Excel sheet. In order to change the content in an Excel file through the libname engine, a temporary dataset must first be created with the original and revised data values, the Excel table must then be deleted, and finally the revised data can be saved to the Excel libname.

FOUR PRACTICAL SCENARIOS

The typical interactions between SAS and Microsoft Excel are concentrated in three areas: data import/export, analysis, and reporting. Furthermore, the role of analysis and reporting could either be handled in SAS or Excel. This section presents examples of each interaction in work scenario. Each scenario will be solved by using one or more of the three methods – PROC EXPORT, ExcelXP tagset, and Excel LIBNAME. With each method‟s unique functionalities and different limitations, this section shows how one might be more practical than others in various tasks.

The following examples use the PRSALE, PRSAL2, and PRSAL3 dataset available in SASHELP. The dataset contains two years of furniture sales by products and geographies. The multiple categorical dependent variables (country, region, division, prodtype, product), periodicity (year, quarter, month), and sales (actual, predicted) make this a good dataset for tabulation and reporting purposes.

SCENARIO 1 – DISTRIBUTING DATA

Let‟s begin with a simple, yet common, task of exporting data from SAS to Excel. In many work environments, users might not have access to the SAS software. This scenario addresses a situation where data needs to be exported from SAS as an Excel spreadsheet. Working with the SASHELP.PRSALE, PRSAL2, and PRSAL3 data sets, the task is to export every observation to multiple worksheets in one Excel file. The variable names stay the same and no special values formatting are necessary.

PROC EXPORT

In the PROC EXPORT procedure, the DATA parameter is set to the SAS data set name SASHELP.PRSALE, the DBMS engine is set to EXCEL, and the OUTFILE is set to the destination path and filename. Options such as LABEL, REPLACE, SHEET, and others could be used for additional, but limited, customization for the export.

PROC EXPORT DATA=sashelp.prdsale

OUTFILE="C:\SAS\prdsale.xlsx"

DBMS=EXCEL REPLACE;

SHEET="PRSALE";

RUN;

PROC EXPORT DATA=sashelp.prdsal2

OUTFILE="C:\SAS\prdsale.xlsx"

DBMS=EXCEL REPLACE;

SHEET="PRDSAL2";

RUN;

PROC EXPORT DATA=sashelp.prdsal3

OUTFILE="C:\SAS\prdsale.xlsx"

DBMS=EXCEL REPLACE;

SHEET="PRDSAL3";

RUN;

The procedure reads every variables and observations from the SAS data sets and writes to a native Excel spreadsheet. Including the REPLACE option overwrites existing Excel worksheet if the same SHEET name al-ready exists in the workbook.

Foundations and FundamentalsNESUG 2012

Page 4: Integrating SAS and Excel: An overview and comparison of ... · change the content in an Excel file through the libname engine, a temporary dataset must first be created with the

4

EXCELXP Tagset

The ExcelXP Tagset method involves more coding compares to the two other methods. The method calls for ODS with ExcelXP tagset destination, and then uses a PROC PRINT procedure to insert each observation in the XML file. The ExcelXP tagset file needs to be loaded prior to the ODS statements in order for SAS to the XML markup language for outputs.

%INCLUDE 'C:\SAS\excltags.tpl';

ODS LISTING CLOSE;

ODS tagsets.excelxp FILE="C:\SAS\prdsale_excelxp.xls";

ODS tagsets.excelxp options(sheet_name="PRDSALE");

PROC PRINT LABEL DATA=sashelp.prdsale NOOBS;

RUN;

ODS tagsets.excelxp options(sheet_name="PRDSAL2");

PROC PRINT LABEL DATA=sashelp.prdsal2 NOOBS;

RUN;

ODS tagsets.excelxp options(sheet_name="PRDSAL3");

PROC PRINT LABEL DATA=sashelp.prdsal3 NOOBS;

RUN;

ODS tagSETs.excelxp CLOSE;

ODS LISTING;

Comparing to the PROC EXPORT procedure, the ODS ExcelXP tagset demands more CPU resources and real processing time. The output file took approximately 60 seconds and 33MB file size versus 5 seconds and 2MB with PROC EXPORT.

In this example, the file is saved with the .XLS extension to make the XML file more recognizable for users to open in Microsoft Excel. Each time when the files opens, Excel will ask for a confirmation before opening the file.

Opening the prdsale_excelxp.xls spreadsheet, the three worksheets are formatted with SAS default style, as the code did not specify any style template in the ODS. The variable labels are used as column headers. The AC-TUAL and PREDICT values are dollar formatted. The MONTH values are formatted to the abbreviated name of the month.

Foundations and FundamentalsNESUG 2012

Page 5: Integrating SAS and Excel: An overview and comparison of ... · change the content in an Excel file through the libname engine, a temporary dataset must first be created with the

5

EXCEL LIBNAME

Using the Excel LIBNAME engine for exporting data is much like the PROC EXPORT procedure. There are mul-tiple ways of putting SAS data to Excel by setting an Excel workbook as a LIBNAME. This example assigns a new LIBNAME calls EXCELIB using the EXCEL engine directing to the Excel file path and workbook name. It uses the COPY function within the PROC DATASET procedure to copy PRDSALE, PRDSAL1, and PRDSAL2 data from SASHELP library to the Excel library. Lastly, the Excel LIBNAME statement is cleared, so the workbook is un-locked from SAS and accessible to other software.

LIBNAME excelib EXCEL "C:\SAS\prdsale_all.xlsx";

PROC COPY IN=sashelp OUT=excelib;

SELECT prdsale prdsal2 prdsal3;

RUN;

LIBNAME excelib CLEAR;

Note: the Excel libname method does not allow overwriting existing data if the worksheet already contains data. This either requires re-creating the workbook, or deleting the content from the worksheet. The example below uses PROC DATASETS with DELETE function to remove the existing data.

LIBNAME excelib EXCEL "C:\SAS\prdsale_all.xlsx";

PROC DATASETS LIB=excelib;

DELETE PRDSALE PRDSAL2 PRDSAL3;

QUIT;

LIBNAME excelib CLEAR;

The exported spreadsheets look identical to the outputs from PROC EXPORT.

Foundations and FundamentalsNESUG 2012

Page 6: Integrating SAS and Excel: An overview and comparison of ... · change the content in an Excel file through the libname engine, a temporary dataset must first be created with the

6

SCENARIO 2 – CREATING CUSTOM REPORTS

Another common request of non-SAS users is displaying SAS results in customized reports. The second scenario shows how to use ODS ExcelXP tagset method to create highly customized Excel reports. SAS performs the tasks of data processing, running statistical models, formatting reports, and exporting the results in Excel format. Often times, SAS could be automated to generate these customized reports on a scheduled basis or in batches.

EXCEL Tagset

ODS with ExcelXP tagset destination has the ability of combining the flexibility of SAS ODS, built-in style tem-plates, and customizable styling elements to generate visually pleasing outputs from SAS to Excel. One advan-tage of using ODS is that all formatting and layout are handled by SAS. The ExcelXP tagset includes the markup language to put SAS data in ODS appearance in XML format readable by Excel. The tagset offers a high degree of customization on style elements and attributes, in addition to controls over view and print settings in Excel.

Using the SASHELP.PRDALE, this example produces a summarized report on the sales of furniture by country. To illustrate the customization level available, the ODS uses the built-in BARRETTSBLUE style, formatted col-umns, computed totals, summary totals, and spanned headers. To make the report printer-friendly, the print op-tions are set to fit the report to print in one page.

ODS LISTING CLOSE;

ODS PDF FILE = "C:\SAS\report.pdf" NOTOC STYLE=STYLES.BARRETTSBLUE;

ODS TAGSETS.EXCELXP

FILE="C:\SAS\report.xml"

STYLE=BARRETTSBLUE

OPTIONS(FROZEN_HEADERS='3' FITTOPAGE = 'YES'

DEFAULT_COLUMN_WIDTH='20,10,10,10,10,10,10'

CENTER_HORIZONTAL = 'YES' CENTER_VERTICAL = 'YES'

PRINT_HEADER = "Furnitures Sales Report 1993Q1 to 1994Q4"

PRINT_FOOTER = "Printed on &sysdate");

PROC REPORT DATA=sashelp.prdsale NOWINDOWS CENTER;

COLUMN country product year quarter ('Sales Amount' actual predict diff);

DEFINE country / ORDER GROUP NOPRINT;

DEFINE product / ORDER GROUP;

DEFINE year / ORDER GROUP;

DEFINE quarter / ORDER GROUP;

DEFINE actual / SUM 'Actual' FORMAT=DOLLAR10.;

DEFINE predict / SUM 'Predicted' FORMAT=DOLLAR10.;

DEFINE diff / COMPUTED 'Difference' FORMAT=DOLLAR10.;

Foundations and FundamentalsNESUG 2012

Page 7: Integrating SAS and Excel: An overview and comparison of ... · change the content in an Excel file through the libname engine, a temporary dataset must first be created with the

7

COMPUTE diff;

diff = actual.SUM - predict.SUM;

ENDCOMP;

BREAK AFTER year / SUMMARIZE skip ol STYLE={BACKGROUND=SKYBLUE

FONT_WEIGHT=BOLD};

BREAK AFTER product / SUMMARIZE skip ol STYLE={BACKGROUND=SKYBLUE

FONT_WEIGHT=BOLD};

BREAK AFTER country / SUMMARIZE skip ol STYLE={BACKGROUND=SKYBLUE

FONT_WEIGHT=BOLD};

BY country;

RUN;

QUIT;

ODS TAGSETS.EXCELXP CLOSE;

ODS PDF CLOSE;

ODS LISTING;

Between the opening and closing of the ODS statements, the PROC REPORT procedure summarizes the actual and predicted sales for each product by years and quarters in report.xls workbook. A new column, DIFF, is calcu-lated on based on the ACTUAL – PREDICT sales. A summary total is computed after each year and each furni-ture product. A grand total is computed for each country after all the furniture products (not shown). Lastly, the workbook contains three worksheets with one for each country.

Whether a report is generated with PDF, HTML, Office2k, or ExcelXP tagset, or other destinations the output ap-pearance and layouts are consistent among ODS outputs. To illustrate this example, the example codes include an ODS PDF destination statement which produces the same report in PDF format.

EXCEL LIBNAME

The Excel LIBNAME method does not produce customized report within SAS. While any compatible SAS proce-dures can be directed to an Excel output thru Excel LIBNAME engine, but only the data values will be stored in the Excel file. Formatting or styles are lost when the file is saved to Excel. Using Excel LIBNAME method, the fur-niture sales report with the same PROC REPORT example are stored and displayed in default formatting after output to Excel.

Excel Output

PDF Output

Foundations and FundamentalsNESUG 2012

Page 8: Integrating SAS and Excel: An overview and comparison of ... · change the content in an Excel file through the libname engine, a temporary dataset must first be created with the

8

LIBNAME excelib EXCEL "C:\SAS\report_excelib.xlsx";

PROC REPORT DATA=sashelp.prdsale NOWINDOWS CENTER OUT=excelib.prdsale;

COLUMN country product year quarter ('Sales Amount' actual predict diff);

DEFINE country / ORDER GROUP NOPRINT;

DEFINE product / ORDER GROUP;

DEFINE year / ORDER GROUP;

DEFINE quarter / ORDER GROUP;

DEFINE actual / SUM 'Actual' FORMAT=DOLLAR10.;

DEFINE predict / SUM 'Predicted' FORMAT=DOLLAR10.;

DEFINE diff / COMPUTEd 'Difference' FORMAT=DOLLAR10.;

COMPUTE diff;

diff = actual.SUM - predict.SUM;

ENDCOMP;

BREAK AFTER year / SUMMARIZE skip ol STYLE={BACKGROUND=SKYBLUE

FONT_WEIGHT=BOLD};

BREAK AFTER product / SUMMARIZE skip ol STYLE={BACKGROUND=SKYBLUE

FONT_WEIGHT=BOLD};

BREAK AFTER country / SUMMARIZE skip ol STYLE={BACKGROUND=SKYBLUE

FONT_WEIGHT=BOLD};

BY country;

RUN;

QUIT;

LIBNAME excelib CLEAR;

The furniture sales report with Excel LIBNAME looks like:

PROC EXPORT

The functionality of PROC EXPORT procedure to export raw data does not make this a good method for produc-ing customized report.

SCENARIO 3 – CREATING INTERACTIVE EXCEL ANALYSIS TOOLS

This scenario looks at using interactive Excel analysis tools based on data updated from SAS. Most people have learned or are familiar with how to use Microsoft Excel. Even without formal training, Excel‟s point-and-click fea-

Foundations and FundamentalsNESUG 2012

Page 9: Integrating SAS and Excel: An overview and comparison of ... · change the content in an Excel file through the libname engine, a temporary dataset must first be created with the

9

tures offer intuitive ways to perform data analysis, summary data, and graphical presentations. With Pivot Tables, large data can be quickly summarized in a tabularized form with multiple categorized levels. The Excel LIBNAME engine is the best method to harness the analytical functions and data presentation available in Excel.

EXCEL LIBNAME

The Excel LIBNAME method allows SAS to replace existing Excel spreadsheets within a workbook without delet-ing or modifying existing ones. This functionality allows making updates to workbooks, spreadsheets, and named ranges. In order for SAS to create an Excel spreadsheet with built-in Excel analysis tools, an empty spreadsheet with analysis capability must already exist. First, an empty Excel spreadsheet will have to be created with the de-sired data layouts, whether they are individual worksheets or named range, for SAS to load the data into. Then add the analysis features such as formulas, tables, charts, or Pivot tables into the spreadsheet. The source data for the analysis would use the cell references of the empty data layouts. Once SAS loads the data into the spreadsheet, Excel will display the data in the designated data layout and perform the analytical functions.

To make it clear, this example loads the SASHELP.PRDSALE dataset into a preformatted Excel spreadsheet, analysis.xlsx. The analysis.xlsx is prepared with three worksheets – 1) an empty worksheet PRDSALE for SAS to load, 2) a PIVOT TABLE worksheet to summarize the furniture sales from PRDSALE, and 3) a PIVOT CHART worksheet to graph the total sales from PRDSALE. The data source for PIVOT CHART and PIVOT TABLE is set to use the cell references in PRDSALE. When the DATA step writes data into the PRDSALE worksheet, the pivot table and chart will be updated when the workbook opens next time in Excel.

Note: a PROC DATASETS statement is placed prior to the DATA STEP to remove existing data in the PRDSALE worksheet prior to loading of new data.

LIBNAME excelib EXCEL "C:\SAS\analysis.xlsx";

PROC DATASETS LIB=excelib;

DELETE PRDSALE;

QUIT;

DATA excelib.PRDSALE;

SET sashelp.prdsale;

IF actual => predict THEN performance=1.00;

ELSE performance=0.00;

RUN;

LIBNAME excelib CLEAR;

The option for the pivot table and pivot chart is set to refresh data when opening file; it is necessary for the table and chart to be updated after each time SAS puts new data into the file. Excel automatically recalculates the summary table and chart reflecting new information.

Foundations and FundamentalsNESUG 2012

Page 10: Integrating SAS and Excel: An overview and comparison of ... · change the content in an Excel file through the libname engine, a temporary dataset must first be created with the

10

EXCEL Tagset

The ODS ExcelXP tagset destination is designed to create a new XML file each time when outputs are generated. This limitation does not allow updating data into a preformatted spreadsheet. Therefore, the ExcelXP tagset me-thod will not easily be able to take advantage of the built-in Excel analysis tools. Yet, ExcelXP tagset allows writ-ing Excel formulas as outputs. When the spreadsheet is opened, Excel automatically runs any included formula entries and performs calculations. This might be an important feature if one is interested in using Excel rather than SAS for calculations.

PROC EXPORT

The functionality of PROC EXPORT procedure to export raw data does not make this a good method for creating Excel spreadsheet using built-in analysis tools.

SCENARIO 4 – IMPORTING EXCEL WORKSHEET AND APPLYING SAS LOGICS

The final scenario covers techniques for running SAS DATA steps and PROC procedures on data stored in Excel. This scenario looks at using Excel as a data collection instrument. The data is imported into SAS for editing and analysis. Before version SAS 9, a common way would be to import from Excel, then run the desired SAS func-tions on the locally stored SAS dataset. With Excel LIBNAME engine, SAS directly connects to the data in Excel for seamless access.

PROC IMPORT

The PROC IMPORT method is a dependable, quick, and practical method in getting data from Excel into SAS. The data are imported as SAS dataset with reliable degree of proper variable attributes and value formats. The procedure has limited, yet sufficient, options controlling the data import.

PROC IMPORT OUT=WORK.prdsale

DATAFILE="C:\SAS\prdsale_original.xlsx"

DBMS=EXCEL REPLACE;

RANGE="PRSALE";

RUN;

Once the data are stored as a SAS dataset in the WORK directory, any SAS data steps or procedures can be done.

DATA STEPS ...;

PROC STEPS ...;

EXCEL LIBNAME

Through setting an Excel LIBNAME directly to the source data stored in Excel spreadsheets, SAS accesses the data as a database with its internal ACCESS engine. This feature allows SAS to access the data as if they are any other datasets. The user can view the data in SAS Explorer, as long as the LIBNAME remains active for the SAS session.

The advantage of establishing direct access between SAS and Excel files is that it allows DATA steps and PROC procedures to conveniently run on the Excel data without extra conversion or local storage. This example shows how simple it is to run PROC MEANS and PROC SGPLOT procedures to produce summarized outputs on Excel

Foundations and FundamentalsNESUG 2012

Page 11: Integrating SAS and Excel: An overview and comparison of ... · change the content in an Excel file through the libname engine, a temporary dataset must first be created with the

11

data without creating extra copies of the data. This method is especially useful when the source data size is large and when interest is focused on the analysis results, not the actual raw data.

LIBNAME excelib EXCEL "C:\SAS\prdsale.xlsx";

PROC MEANS DATA= excelib.prdsale SUM;

CLASS year quarter;

VAR actual predict;

RUN;

PROC SGPLOT DATA= excelib.prdsale (WHERE=(country='GERMANY' AND prod-

uct='SOFA'));

TITLE "Germany: Actual vs. Predict Sales for Sofa";

VBAR month / RESPONSE=actual;

VLINE month / RESPONSE=predict Y2AXIS;

BY region;

RUN;

LIBNAME excelib CLEAR;

The quarterly actual and predicted sales using PROC MEANS on the prdsale.xlsx spreadsheet is:

And the regional sales charts using PROC SGPLOT on Germany sofa are:

Foundations and FundamentalsNESUG 2012

Page 12: Integrating SAS and Excel: An overview and comparison of ... · change the content in an Excel file through the libname engine, a temporary dataset must first be created with the

12

EXCEL Tagset

The ExcelXP tagset can import XML files. The steps to import Excel spreadsheets thru ExcelXP tagset method involve converting the native Excel file to XML format and running a customized macro available from SAS. After the files are exported to SAS, the data can run with any SAS data or procedure steps.

COMPARISON MATRIX

Having outlined the various functionalities, advantages, and limitations of these three methods of interacting with Excel from SAS, and having walked through several scenarios of how to implement each method, we now want to present a matrix comparing these three methods. In a glance, the table summarizes the general description for PROC EXPORT, ODS ExcelXP tagset, and Excel LIBNAME.

Category Details PROC EXPORT ExcelXP tagset Excel LIBNAME

General Description Writes SAS data to an external file such as Excel files and others.

Generate a XML file readable in Ex-cel.

Access worksheets within an Excel file as datasets within a SAS library.

Technology Thru SAS/ ACCESS, SAS reads SAS mem-ber and write to standard Excel ta-ble

SAS uses ODS output with Ex-celXP tagset desti-nation to generate XML compatible codes readable in Excel.

Thru SAS/ ACCESS, SAS allows reading and writing of Excel file with a LIBNAME engine.

Requirements SAS version SAS 7.0 SAS 9.1 SAS 9.0

SAS products SAS/Base, SAS/ACCESS

SAS/Base SAS/Base, SAS/ACCESS

Excel version 4.0, 5.0, 97, 2000, 2002, 2003, 2007

2002, 2003, 2007 5.0, 97, 2000, 2002, 2003, 2007

Excel installed No No No

Compatibility Input format .xls, .xlsb, .xlsx .xml .xls, .xlsb, .xlsm, .xlsx

Output format .xls, .xlsb, .xlsx .xml .xls, .xlsb, .xlsm, .xlsx

Naive Excel format Yes No Yes

Platform Windows, Unix*

Unix, Windows Windows

* On Unix, only DBMS=XLS is supported

Foundations and FundamentalsNESUG 2012

Page 13: Integrating SAS and Excel: An overview and comparison of ... · change the content in an Excel file through the libname engine, a temporary dataset must first be created with the

13

All three methods have their distinctive ways to perform SAS and Excel functions on different usages. A method might be more suitable for exporting tasks, while another might be practical at generating reports. There are vary-ing strengths and limitations depending on the tasks. Users with different degree of experience in Excel and SAS might have their own preference in the amount of work spread between the two software. Choosing the suitable method to perform certain tasks could be dangling at first. Based on summary from the four scenarios presented, the detailed usage notes are put together in the comparison table.

Category Details PROC EXPORT ExcelXP tagset Excel LIBNAME

Usages Import Yes No (.xml only) Yes

Export Yes Yes Yes

2007 or newer exten-sion .xlsx

Yes No Yes

Direct access to workbook

No No Yes

Data storage, file size Small Large Small

CPU time Fast Slow Fast

Create customized Excel outputs

No Good Limited

Use Excel analysis tool

No Limited Good

SAS related features

SAS DATA steps No Yes Yes

SAS PROC steps No Yes Yes

SAS functions No Yes Yes

SAS template No Yes No

SAS styles No Yes No

Excel related features

Create workbook Yes Yes Yes

Create single work-sheet

Yes Yes Yes

Create multiple worksheets

Yes Yes Yes

Update existing workbook

Add, replace Replace Add, delete, re-place

Update existing worksheets

Replace No Add, delete, re-place

Update named range No No Yes

Formatting Default SAS defined Excel defined

Data format Default SAS defined Excel defined

Data display Default SAS defined Excel defined

View settings Default SAS defined Excel defined

Formulas (write) No Yes No

Formulas (execute) No Yes Yes

Excel tables No No Yes

Excel graphs No No Yes

Pivot table and charts

No No Yes

Print settings No Yes Yes

Foundations and FundamentalsNESUG 2012

Page 14: Integrating SAS and Excel: An overview and comparison of ... · change the content in an Excel file through the libname engine, a temporary dataset must first be created with the

14

CONCLUSION

In many work environments, Excel spreadsheets are a common medium of exchanging information. This paper presented three different ways of integrating Excel with SAS. Each method has its own strengths and limitations that make it more or less desirable in different use cases. PROC EXPORT has been an excellent method import-ing and exporting raw data between SAS and Excel for a long time. Starting with SAS 9 releases, the introduction of ExcelXP tagset in ODS and the Excel LIBNAME thru SAS/ACCESS bring new efficient ways of working with Excel, especially in creating customized reports. Beyond the implementations to Excel, the three presented me-thods are also applicable to other database formats; this topic is left to the readers to explore.

REFERENCES

Andrews, Rick. 2008. “Printable Spreadsheets Made Easy: Utilizing the SAS® Excel XP Tagset”.

Choate, Paul and Carol Martell. 2006. “De-Mystifying the SAS LIBNAME Engine in Microsoft Excel: A Practical Guide”.

DelGobbo, Vincent. 2007. “Creating AND Importing Multi-Sheet Excel Workbooks the Easy Way with SAS”.

Droogendyk, Harry. 2008. “Customized Excel Output Using the Excel Libname”.

Ford, Jason. 2012. “Getting Good-looking Output Data in Excel and Word for Special Requests”. U.S. Bureau of Labor Statistics SAS User Group.

SAS Institute Inc. “SAS/ACCESS: The LIBNAME Statement for PC Files on Windows”. Available http://support.sas.com/91doc/getdoc/acpcref.hlp/pclibname.htm

SAS Institute Inc. “Base SAS: Quick Reference for the TAGSETS.EXCELXP Tagset”. Available http://support.sas.com/rnd/base/ods/odsmarkup/excelxp_help.html

ACKNOWLEDGMENTS

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.

Other brand and product names are registered trademarks or trademarks of their respective companies.

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the authors at:

Nathan Clausen

U.S. Bureau of Labor Statistics

2 Massachusetts Avenue, NE

Washington, DC 20212

202-691-5470

[email protected]

Edmond Cheng

U.S. Bureau of Labor Statistics

2 Massachusetts Avenue, NE

Washington, DC 20212

202-691-5458

[email protected]

Foundations and FundamentalsNESUG 2012