1月 292014

SAS users are big data consumers and big data creators. Often, we have to deal in large data files (or many smaller files) -- and that means ZIP compression. ZIP compression tools such as gzip, 7-Zip, and WinZip are ubiquitous, but they aren't always convenient to use from within a SAS program. To use an external ZIP utility you must issue a shell command via the X command or SYSTASK function, and that's not always possible within today's complex SAS environments.

Fortunately, SAS can read and write ZIP files directly. Ever since SAS 9.2, we've been able to create ZIP files with ODS PACKAGE. Beginning with SAS 9.4, we can read ZIP content by using FILENAME ZIP.

In this post, I'll review how to create ZIP files using ODS PACKAGE. I'll cover reading ZIP files with FILENAME ZIP in a future post.

Let's pretend that I'm working for a government agency, and that part of my job is to crunch some government data and publish it for the public. Of course, I'm using SAS for the analysis, but I need to publish the data in a non-proprietary format such as CSV. (It seems unbelievable, I know, but not every citizen is lucky enough to have access to SAS.)

First, I'll set up the output directory for this project. Since the ZIP file will contain a couple of files, including a subfolder, I want to mirror that structure here. The FEXIST and FDELETE functions will delete an existing ZIP file (perhaps left over from the last time I ran the process). The DLCREATEDIR option will create a "data" subfolder as needed. All of these mechanisms interact with the file system, but do not require XCMD privileges. This means that they'll work in SAS Enterprise Guide and stored processes.

%let projectDir = c:\projects\sgf2013\filenamezip;
/* Clean slate! */
filename newfile "&projectDir./carstats.zip";
data _null_;
  if (fexist('newfile')) then 
  	rc = fdelete('newfile');
filename newfile clear;
/* Create folder if it doesn't exist */
options dlcreatedir;
libname out "&projectDir./data";

Next, I need to create the content to include in the ZIP file. In this scenario, I'm crunching some heavy-duty numbers about Cars data, and then putting the results into a CSV file. Then I'm creating a README file in RTF format; the document contains a simple data dictionary plus instructions (such as they are) for using the data. I used ODS TEXT to throw in some ad-hoc text among the SAS output.

/* Create some data */
filename newcsv "&projectDir./data/pct.csv";
proc means noprint data=sashelp.cars;
var msrp;
output out=out.pct median=p50 p95=p95 p99=p99;
ods csv file=newcsv;
proc print data=out.pct;
format _all_; /* clear the formats */
ods csv close;
/* Create an informative document about this package */
filename rm "&projectDir./readme.rtf";
ods rtf(readme) 
  file="&projectDir./readme.rtf" style=Printer;
ods rtf(readme) 
  text="These are some instructions for what to do next";
proc datasets lib=out nolist;
contents data=pct;
ods rtf(readme) close;

Finally, I'm going to take those results and package them in a ZIP file. The ODS PACKAGE mechanism was originally designed to share results from a SAS stored process. By default, it adds a PackageMetaData entry that a consuming SAS application could use to interpret the result. In this case we don't need this entry; the NOPF option suppresses it.

Notice that I specify the PATH= option to place the CSV file in the "data" folder within the archive. As soon as the ODS PACKAGE CLOSE statement executes, the ZIP file is created.

/* Creating a ZIP file with ODS PACKAGE */
ods package(newzip) open nopf;
ods package(newzip) add file=newcsv path="data/";
ods package(newzip) add file=rm;
ods package(newzip) publish archive 
ods package(newzip) close;

Here's a screen shot of the ZIP file opened in WinZip:

That's it! I can add any file that I want to the ZIP archive; I'm not restricted to files that were created by SAS. This makes it easy to use SAS as an automated method to update data archives regularly, creating user-friendly packages for consumers to make use of our data.

Note: A common question: does ODS PACKAGE (and FILENAME ZIP) support password-protected ZIP files (encryption)? The answer is No. If that's a requirement, you'll need to use an external package such as 7-Zip.

Download the complete program (SAS 9.3 or later): createZipODSPackage.sas

You might also enjoy:

7月 032013

When I work on SAS projects that create lots of files as results, it's often a requirement that those files be organized in a certain folder structure. The exact structure depends on the project, but here's an example:

   |__ html
       |__ images
   |__ xls
   |__ data

Before you can have SAS populate these file folders, the folders have to actually exist. Traditionally, SAS programmers have handled this by doing one of the following:

  • Simply require that the folders exist before you run through the project. (This is the SEP method: Somebody Else's Problem.)
  • Use SAS statements and shell commands (via SYSTASK or other method) to create the folders as needed. The SAS-related archives are full of examples of this. It can get complex when you have to account for operating system differences, and whether operating system commands are even permitted (NOXCMD system option).

In SAS 9.3 there is a new system option that simplifies this: DLCREATEDIR. When this option is in effect, a LIBNAME statement that points to a non-existent folder will take matters into its own hands and create that folder.

Here's a simple example, along with the log messages:

options dlcreatedir;
libname newdir "/u/sascrh/brand_new_folder";

NOTE: Library NEWDIR was created.
NOTE: Libref NEWDIR was successfully assigned as follows: 
      Engine:        V9 
      Physical Name: /u/sascrh/brand_new_folder

You might be thinking, "Hey, SAS libraries are for data, not for other junk like ODS results." Listen: we've just tricked the LIBNAME statement into making a folder for you -- you can use it for whatever you want. I won't tell.

In order to create a series of nested folders, you'll have to create each folder level in top-down order. For example, if you need a "results" and a "results/images" folder, you can do this:

%let outdir=%sysfunc(getoption(work));
/* create a results folder in the WORK area, with images subfolder */
options dlcreatedir;
libname res "&outdir./results";
libname img "&outdir./results/images";
/* clear the librefs - don't need them */
libname res clear;
libname img clear;

Or (and this is a neat trick) you can use a single concatenated LIBNAME statement to do the job:

libname res ("&outdir./results", "&outdir./results/images");
libname res clear;

NOTE: Libref RES was successfully assigned as follows: 
      Levels:           2
      Engine(1):        V9 
      Physical Name(1): /saswork/SAS_workC1960000554D_gsf0/results
      Engine(2):        V9 
      Physical Name(2): /saswork/SAS_workC1960000554D_gsf0/results/images

If you feel that folder creation is best left to the card-carrying professionals, don't worry! It is possible for a SAS admin to restrict use of the DLCREATEDIR option. That means that an admin can set the option (perhaps to NODLCREATEDIR to prohibit willy-nilly folder-making) and prevent end users from changing it. Just let them try, and they'll see:

13         options dlcreatedir;
WARNING 36-12: SAS option DLCREATEDIR is restricted by your Site 
Administrator and cannot be updated.

That's right -- DENIED! Mordac the Preventer would be proud. Job security: achieved!

Exact documentation for how to establish Restricted Options can be a challenge to find. You'll find it within the Configuration Guides for each platform in the Install Center. Here are quick links for SAS 9.3: Windows x64, Windows x86, and UNIX.

tags: DLCREATEDIR, Restricted Options, sas administration, SAS libraries, SAS programming, SAS tips