FILENAME ZIP

October 10, 2017
 

Remember when 100MB was large?

SAS 9.4 Maintenance 5 includes new support for reading and writing GZIP files directly. GZIP files, usually found with a .gz file extension, are a different format than ZIP files. Although both are forms of compressed files, a GZIP file is usually a compressed copy of a single file, whereas a ZIP file is an "archive" -- a collection of files in a compressed virtual folder. GZIP tools are built into Unix/Linux platforms and are commonly used to save space when storing large text-based files that you're not ready to part with: log files, csv files, and more. The algorithm used to compress GZIP files performs especially well with text files, although you can technically GZIP any file that you want.

I've written extensively about using FILENAME ZIP to read and write ZIP archives with SAS. The latest version of SAS extends the same access method with a GZIP option:

filename my_gz ZIP "path-to-file/compressedfile.txt.gz" GZIP;

Here's an example that creates a compressed version of a log file:

filename source "C:\Logs\SEGuide_log.10168.txt";
filename tozip ZIP "C:\Logs\SEGuide_log.10168.txt.gz" GZIP;
 
data _null_;   
   infile source;
   file tozip ;
   input;
   put _infile_ ;
run;

In my test here, the result represents a significant size difference, with the compressed file occupying just 14% of the space.
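
If you want SAS to report that ratio for you, here's a rough sketch that reads the file sizes with FINFO on two plain (non-ZIP) filerefs. The fileref names orig and gz are my own placeholders, and I'm assuming that your host exposes an attribute named "File Size (bytes)" for disk files; check the FOPTNAME output on your system if it doesn't.

```sas
/* Sketch: compare the original and compressed file sizes with FINFO.   */
/* Assumption: the host reports an attribute named "File Size (bytes)". */
filename orig "C:\Logs\SEGuide_log.10168.txt";
filename gz   "C:\Logs\SEGuide_log.10168.txt.gz";

data _null_;
  length origsize gzsize 8;

  /* size of the original file */
  fid = fopen('orig');
  if fid then do;
    origsize = input(finfo(fid, 'File Size (bytes)'), 15.);
    rc = fclose(fid);
  end;

  /* size of the compressed file, read as an ordinary disk file */
  fid = fopen('gz');
  if fid then do;
    gzsize = input(finfo(fid, 'File Size (bytes)'), 15.);
    rc = fclose(fid);
  end;

  /* report only when both sizes were found */
  if origsize > 0 and gzsize > 0 then do;
    ratio = gzsize / origsize;
    put 'Compressed file is ' ratio percent8.1 ' of the original size.';
  end;
run;

filename orig clear;
filename gz clear;
```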


To "re-inflate" the compressed file, we can perform the opposite operation. (I added the ENCODING option here because I know my log file was UTF-8 encoded.)

filename target "C:\LogsExpanded\SEGuide_log.10168.txt" encoding='utf-8';
filename fromzip ZIP "C:\Logs\SEGuide_log.10168.txt.gz" GZIP;
 
data _null_;   
   infile fromzip;
   file target ;
   input;
   put _infile_ ;
run;

You don't have to explicitly expand a compressed text file in order to read it with SAS. You can use the GZIP method to read and parse a .gz file directly, similar to the zcat command that you might be familiar with from the Unix shell:

filename fromzip ZIP "C:\Logs\SEGuide_log.10168.txt.gz" GZIP;
data logdata;   
   infile fromzip; /* read directly from compressed file */
   input  date : yymmdd10. time : anydttme. ;
   format date date9. time timeampm.;
run;

If your file is in a binary format such as a SAS data set (sas7bdat) or Excel (XLS or XLSX), you probably will need to expand the file completely before reading it as data. These files are read using special drivers that don't process the bytes sequentially, so you need the entire file available on disk.

Note: Because each GZIP file represents just one compressed file, the MEMBER= option doesn't apply. When dealing with ZIP file archives that contain multiple files, you can use the MEMBER= option on FILENAME ZIP to address a specific file that you want. My recent example about FINFO and file details relies heavily on that approach. However, the GZIP and MEMBER= options are mutually exclusive. In that way, it's much simpler...just like its Unix shell equivalent.


* ZIP drive image by © Raimond Spekking / CC BY-SA 4.0 (via Wikimedia Commons)

The post Reading and writing GZIP files with SAS appeared first on The SAS Dummy.

September 8, 2017
 

It's time to share another tip about working with ZIP files in SAS. Since I first wrote about FILENAME ZIP to list and extract files from a ZIP archive, readers have been asking for more. Specifically, they want additional details about the files that are contained in a ZIP, including the original file datetime stamps, file size, and compressed size. Thanks to a feature that was quietly added into SAS 9.4 Maintenance 3, you can use the FINFO function to retrieve these details. In this article, I share a SAS macro program that does the job.

Here's an abridged example of the output. If you need to create something like this without the use of external ZIP tools like 7-Zip or WinZip (which are often unavailable in controlled environments), read on.

FILENAME ZIP output

You can download the full program from my public gist on GitHub: zipfiles_list_details.sas

ZIPpy details: a solution in three macros

Here's my basic approach to this problem:

  • First, create a list of all of the ZIP files in a directory and all of the file "members" that are compressed within. I've already shared this technique in a previous article. Like an efficient (or lazy) programmer, I'm just reusing that work. That's macro routine #1 (%listZipContents).
  • With this list in hand, iterate through each ZIP file member, "open" the file with FOPEN, and gather all of the available file attributes with FINFO. I've divided this into two macros for readability. %getZipMemberInfo (macro routine #2) retrieves all of the file details for a single member and stores them in a data set. %getZipDetails (macro routine #3) iterates through the list of ZIP file members, calls %getZipMemberInfo on each, and concatenates the results into a single output data set.

Here's a sample usage:

  %listzipcontents (targdir=C:\Projects\ZIPPED_Examples, outlist=work.zipfiles);
  %getZipDetails (inlist=work.zipfiles, outlist=work.zipdetails);

I tried to add decent comments to my program so that interested coders can study and adapt as needed. Here's a snippet of code that uses the FINFO function, which is really the important part for retrieving these file details.

/*
 Assumes an assignment like:
  FILENAME F ZIP "C:\ZIPPED_Examples\SudokuSolver_src.zip" member="src/AboutThisProject.txt";
 with &f resolving to that fileref name. This snippet runs inside a DATA step.
*/
fId = fopen("&f","S");          /* open the ZIP member for sequential input */
if fId then
  do;
    infonum=foptnum(fId);       /* number of attributes available */
    do i=1 to infonum;
      infoname=foptname(fId,i);
      /* keep only the attributes needed for the report */
      select (infoname);
        when ('Filename') filename=finfo(fId,infoname);
        when ('Member Name') membername=finfo(fId,infoname);
        when ('Size') filesize=input(finfo(fId,infoname),15.);
        when ('Compressed Size') compressedsize=input(finfo(fId,infoname),15.);
        when ('CRC-32') crc32=finfo(fId,infoname);
        when ('Date/Time') filetime=input(finfo(fId,infoname),anydtdtm.);
        otherwise;              /* ignore any other attribute */
      end;
    end;
    compressedratio = compressedsize / filesize;
    output;
    fId = fclose(fId);
  end;

The FINFO function in SAS provides access to file attributes and their values for a given file that you've accessed using the FOPEN function. The available file attributes can differ according to the type of file (FILENAME access method) that is used. ZIP files, as you can guess, have some attributes that are specific to them: "Compressed Size", "CRC-32", and others. This code checks for all of the available attributes and keeps those that we need for our detailed output. (And see the use of the SELECT/WHEN statement? So much more readable than a bunch of IF/THEN/ELSEs.)
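
If you're curious about which attributes your own SAS session reports for a ZIP member, here's a minimal sketch that dumps every attribute name and value into a data set. The fileref name zm and the output data set work.zipattrs are placeholders that I made up for this illustration, and the archive path is just the example from the comment above; substitute a ZIP file that you actually have.

```sas
/* Hypothetical archive and member names; substitute your own */
filename zm ZIP "C:\ZIPPED_Examples\SudokuSolver_src.zip"
    member="src/AboutThisProject.txt";

data work.zipattrs (keep=infoname infovalue);
  length infoname $ 40 infovalue $ 256;
  fid = fopen('zm', 'S');
  if fid then do;
    /* one output row per attribute that this access method exposes */
    do i = 1 to foptnum(fid);
      infoname  = foptname(fid, i);
      infovalue = finfo(fid, infoname);
      output;
    end;
    rc = fclose(fid);
  end;
run;

proc print data=work.zipattrs noobs;
run;

filename zm clear;
```

Because the list comes from FOPTNUM and FOPTNAME at run time, the same step should also work for a plain disk file fileref, where you would see a different set of attributes.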

Look, I'm not going to claim that my approach to this problem is the most elegant or most efficient -- but it works. If it can be improved, then I'm sure I'll hear from a few of you experts out there. Bring it on!


The post Using FILENAME ZIP and FINFO to list the details in your ZIP files appeared first on The SAS Dummy.

October 17, 2016
 

SAS programmers often resort to using the X command to list the contents of file directories and to process the contents of ZIP files (or gz files on UNIX). In centralized SAS environments, the X command is unavailable to most programmers. NOXCMD is the default setting for these environments (disallowing shell commands), and SAS admins are reluctant to change it.
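
Not sure which setting applies to your session? Here's a quick sketch that checks the XCMD system option with GETOPTION. The macro name %checkXcmd is my own invention for this example.

```sas
/* Sketch: report whether shell commands are allowed in this session */
%macro checkXcmd;
  %if %sysfunc(getoption(xcmd)) = NOXCMD %then
    %put NOTE: Shell commands are disabled (NOXCMD) in this session.;
  %else
    %put NOTE: Shell commands are allowed (XCMD) in this session.;
%mend;

%checkXcmd
```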

In this article, I'll share a SAS program that can retrieve the contents of a file directory (all of the file names), and then also report on the contents of every ZIP file within that directory -- without using any shell commands. The program uses two lesser-known tricks to retrieve the information:

  1. The FILENAME statement can be applied to a directory, and then the DOPEN, DNUM, DREAD, and DCLOSE functions can be used to retrieve information about that directory. (Check SAS Note 45805 for a better example of just this - click the Full Code tab.)
  2. The FILENAME ZIP method (added in SAS 9.4) can retrieve the names of the files within a compressed archive (ZIP or gz files). For more information, see all of my previous articles about the FILENAME ZIP access method.

I wrote the program as a SAS macro so that it should be easy to reuse. And I tried to be liberal with the comments, providing a view into my thinking and maybe some opportunities for improvement.

%macro listzipcontents (targdir=, outlist=);
  /* keep the loop counter and file count local to this macro */
  %local i zipcount;
  filename targdir "&targdir";
 
  /* Gather all ZIP files in a given folder                */
  /* Searches just one folder, not subfolders              */
  /* for a fancier example see                             */
  /* http://support.sas.com/kb/45/805.html (Full Code tab) */
  data _zipfiles;
    length fid 8;
    fid=dopen('targdir');
 
    if fid=0 then
      stop;
    memcount=dnum(fid);
 
    /* Save just the names ending in ZIP*/
    do i=1 to memcount;
      memname=dread(fid,i);
      /* combo of reverse and =: to match ending string */
      /* Looking for *.zip and *.gz files */
      if (reverse(lowcase(trim(memname))) =: 'piz.') OR
         (reverse(lowcase(trim(memname))) =: 'zg.') then
        output;
    end;
 
    rc=dclose(fid);
  run;
 
  filename targdir clear;
 
  /* get the memnames into macro vars */ 
  proc sql noprint;
    select memname into: zname1- from _zipfiles;
    %let zipcount=&sqlobs;
  quit;
 
  /* for all ZIP files, gather the members */
  %do i = 1 %to &zipcount;
    %put &targdir/&&zname&i;
    filename targzip ZIP "&targdir/&&zname&i";
 
    data _contents&i.(keep=zip memname);
      length zip $200 memname $200;
      zip="&targdir/&&zname&i";
      fid=dopen("targzip");
 
      if fid=0 then
        stop;
      memcount=dnum(fid);
 
      do i=1 to memcount;
        memname=dread(fid,i);
 
        /* save only full file names, not directory names */
        if (first(reverse(trim(memname))) ^='/') then
          output;
      end;
 
      rc=dclose(fid);
    run;
 
    filename targzip clear;
  %end;
 
  /* Combine the member names into a single data set        */
  /* the colon notation matches all files with "_contents" prefix */
  data &outlist.;
    set _contents:;
  run;
 
  /* cleanup temp files */
  proc datasets lib=work nodetails nolist;
    delete _contents:;
    delete _zipfiles;
  run;
 
%mend;

Use the macro like this:

%listzipcontents(targdir=c:\temp,
 outlist=work.allfiles);

Here's an example of the output.
zip file contents within the target directory

Experience has taught me that savvy SAS programmers will scrutinize my example code and offer improvements. For example, they might notice my creative use of the REVERSE function and "=:" operator to simulate an "ends with" comparison function -- and then suggest something better. If I don't receive at least a few suggestions for improvements, I'll know that no one has read the post. I hope I'm not disappointed!
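
To get the suggestions started, here's one alternative for the "ends with" test: a regular expression with PRXMATCH. This is only a sketch of a different technique, not what the macro uses. The data set name work.endswith_demo and the sample file names are made up for the illustration.

```sas
/* Sketch: "ends with .zip or .gz" via a regular expression */
data work.endswith_demo;
  length memname $ 200;
  input memname;
  /* 1 when the trimmed name ends in .zip or .gz (any case), else 0 */
  isArchive = (prxmatch('/\.(zip|gz)$/i', trim(memname)) > 0);
  datalines;
project.ZIP
server.log.gz
notes.txt
zipper.doc
;
run;

proc print data=work.endswith_demo noobs;
run;
```

The pattern anchors on the end of the string with $, so a name like zipper.doc that merely contains "zip" is not flagged. The TRIM call matters because character variables are padded with trailing blanks.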

tags: FILENAME ZIP, SAS programming, xcmd, ZIP files

The post List the contents of your ZIP and gz files using SAS appeared first on The SAS Dummy.

March 5, 2016
 

In previous articles, I've shared tips about how you can work with SAS and ZIP files without requiring an external tool like WinZip, gzip, or 7-Zip.

But a customer approached me the other day with one scenario I missed: how to add SAS data sets to an existing ZIP file. It's a variation of a tip that I've already shared, but with two differences. First, in order to add a data set to a ZIP file, you have to know its physical filename -- not just the LIBNAME.MEMBER reference that you use in SAS procedure steps. And second, I had not shown how to add a new file to an existing ZIP archive -- though it turns out that's pretty simple.

Find the file name for a SAS data set

There are several ways to do this. For my approach, I used the output from PROC CONTENTS. Notice that I had to capture the ODS output (not the OUT= data set) to grab the file name. I wrapped it in a macro for easy reuse. And since I ultimately need a SAS fileref to map to the path, I've assigned one (data_fn) in my macro.

/* macro to assign a fileref to a SAS data set in a Base library */
%macro assignFilerefToDataset(_dataset_name);
    %local outDsName;
    ods output EngineHost=File;
    proc contents data=&_dataset_name.;
    run;
    proc sql noprint;
        select cValue1 into: outDsName 
            from work.file where Label1="Filename";
    quit;
    filename data_fn "&outDsName.";
%mend;

How to add a new member to a ZIP file

Now that I have the source file, I need to designate a destination file in a ZIP archive. The FILENAME ZIP method will create a new ZIP file if one does not yet exist, or it can add to an existing ZIP. To ensure I'm starting from scratch, I assign a simple fileref to my target destination and then delete the file.

/* Assign the fileref - basic file method */
filename projzip "&projectDir./project.zip";
/* Start with a clean slate - delete ZIP if it exists */
data _null_;
    rc=fdelete('projzip');
run;

To create a new ZIP file and designate a path and file name within it, I used the FILENAME ZIP method with the MEMBER= option. Note that I specified the "data/" subfolder in the MEMBER= value; this will place the file into a named subfolder within the archive.

/* Use FILENAME ZIP to add a new member -- CLASS */
/* Put it in the data subfolder */
filename addfile zip "&projectDir./project.zip" 
    member='data/class.sas7bdat';

Then finally, I need to actually "copy" the file into the archive. I do this by streaming the source file into the target fileref byte-by-byte:

/* byte-by-byte copy */
/* "copies" the new file into the ZIP archive */
data _null_;
    infile data_fn recfm=n;
    file addfile recfm=n;
    input byte $char1. @;
    put  byte $char1. @;
run;
 
filename addfile clear;

That's it! I now have a ZIP file with one member entry. Now I can "press repeat" to add a second entry:

%assignFilerefToDataset(sashelp.cars);
/* Use FILENAME ZIP to add a new member -- CARS */
/* Put it in the data subfolder */
filename addfile zip "&projectDir./project.zip" 
    member='data/cars.sas7bdat';
/* byte-by-byte copy */
/* "copies" the new file into the ZIP archive */
data _null_;
    infile data_fn recfm=n;
    file addfile recfm=n;
    input byte $char1. @;
    put  byte $char1. @;
run;
 
filename addfile clear;
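
Since the second pass repeats the first one almost line for line, the steps could be folded into a single macro. Here's a sketch; %addDatasetToZip and its parameter names are hypothetical, and it relies on the %assignFilerefToDataset macro defined earlier to set the data_fn fileref.

```sas
/* Sketch: wrap the "find the file, then copy it into the ZIP" steps. */
/* Hypothetical macro; requires %assignFilerefToDataset from above.   */
%macro addDatasetToZip(dataset=, zipfile=, member=);
  /* sets the DATA_FN fileref to the physical data set file */
  %assignFilerefToDataset(&dataset.);

  /* target member inside the archive */
  filename addfile zip "&zipfile." member="&member.";

  /* byte-by-byte copy, same technique as above */
  data _null_;
    infile data_fn recfm=n;
    file addfile recfm=n;
    input byte $char1. @;
    put  byte $char1. @;
  run;

  filename addfile clear;
  filename data_fn clear;
%mend;

%addDatasetToZip(dataset=sashelp.class,
    zipfile=&projectDir./project.zip,
    member=data/class.sas7bdat);
```

The call assumes that the projectDir macro variable is already defined, as in the snippets above.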

Optional: Report on the ZIP file contents

If I want to report on the total contents of the ZIP file now, here's a DATA step and PROC CONTENTS step that does the job:

/* OPTIONAL for reporting */
/* Report on the contents of the ZIP file */
/* Assign a fileref with the ZIP method */
filename inzip zip "&projectDir./project.zip";
/* Read the "members" (files) from the ZIP file */
data contents(keep=memname);
    length memname $200;
    fid=dopen("inzip");
    if fid=0 then
        stop;
    memcount=dnum(fid);
    do i=1 to memcount;
        memname=dread(fid,i);
        output;
    end;
    rc=dclose(fid);
run;
/* create a report of the ZIP contents */
title "Files in the ZIP file";
proc print data=contents noobs N;
run;

Result:

Files in the ZIP file 

memname
---------------------
data/class.sas7bdat
data/cars.sas7bdat 
N = 2

I hope that this helps to make the FILENAME ZIP method more useful to those who want to try it out. I'm sure that there will be more scenarios that people will ask about; someday, if I write enough blog posts, I'll have it all covered!

Sample program: You can view/download the entire SAS program (containing the snippets I've featured and more) from my GitHub profile.

tags: FILENAME ZIP, SAS 9.4, SAS programming, ZIP files

The post Add files to a ZIP archive with FILENAME ZIP appeared first on The SAS Dummy.

May 11, 2015
 

I've written about how to use the FILENAME ZIP method to read and update ZIP files in your SAS programs. The ZIP method was added in SAS 9.4, and its advantage is that you can accomplish more in SAS without having to launch external utilities such as WinZip, gunzip, or 7-Zip.

Several readers replied with questions about how you can use the content of these ZIP files within your SAS program. The basic scenario is: "I've got some data files in my ZIP archive. I want to use SAS to unzip these and then use them as data within my SAS process. Can I do this?"

Yes, you can -- but it does require an extra step. Even though FILENAME ZIP can show you the contents and structure of your ZIP file, most SAS procedures cannot access the content directly while it's in the archive. So, the additional step is to copy the file to another location, effectively extracting it from the ZIP file.

As an example, I created a ZIP file with two files and a subfolder:

data.zip
  |__ sas_tech_talks_15.xlsx
  |__ sas/
      |__ instanttitles.sas7bdat

This SAS program helps me to discover how FILENAME ZIP sees the file:

filename inzip ZIP "c:\projects\data.zip";
 
/* Read the "members" (files) from the ZIP file */
data contents(keep=memname isFolder);
 length memname $200 isFolder 8;
 fid=dopen("inzip");
 if fid=0 then
  stop;
 memcount=dnum(fid);
 do i=1 to memcount;
  memname=dread(fid,i);
  /* check for trailing / in folder name */
  isFolder = (first(reverse(trim(memname)))='/');
  output;
 end;
 rc=dclose(fid);
run;
 
/* create a report of the ZIP contents */
title "Files in the ZIP file";
proc print data=contents noobs N;
run;

Output:

        Files in the ZIP file                                         
 memname                       isFolder
 sas/                             1  
 sas/instanttitles.sas7bdat       0  
 sas_tech_talks_15.xlsx           0  
                N = 3

With this information, I can now "copy" the XLSX file out of the ZIP file and then import it into a SAS data set. Notice how I can use the "member" syntax (fileref with the file I want in parentheses) to address a specific file in the ZIP archive. I want to copy just from the actual files, and not the folder-level entries.

/* identify a temp folder in the WORK directory */
filename xl "%sysfunc(getoption(work))/sas_tech_talks_15.xlsx" ;
 
/* hat tip: "data _null_" on SAS-L */
data _null_;
   /* using member syntax here */
   infile inzip(sas_tech_talks_15.xlsx) 
       lrecl=256 recfm=F length=length eof=eof unbuf;
   file   xl lrecl=256 recfm=N;
   input;
   put _infile_ $varying256. length;
   return;
 eof:
   stop;
run;
 
proc import datafile=xl dbms=xlsx out=confirmed replace;
  sheet=confirmed;
run;

Sample output from my SAS log:

NOTE: The infile INZIP(sas_tech_talks_15.xlsx) is:
      Filename=c:\projects\data.zip,
      Member Name=sas_tech_talks_15.xlsx

NOTE: UNBUFFERED is the default with RECFM=N.
NOTE: The file XL is:
      Filename=C:\SAS Temporary Files\_TD396_\Prc2\sas_tech_talks_15.xlsx,
      RECFM=N,LRECL=256,File Size (bytes)=0,
      Last Modified=11May2015:11:38:59,
      Create Time=11May2015:11:20:23

NOTE: A total of 55 records were read from the infile library INZIP.
NOTE: 55 records were read from the infile INZIP(sas_tech_talks_15.xlsx).
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

To use the SAS data set in the file, I need to copy it into a location shared by a SAS library. In this example, I will again use the WORK location. Because my SAS data set is in a logical subfolder (named "sas") within the archive, I need to include that path as part of the member syntax on the INFILE statement.

/* Copy a zipped data set into the WORK library */
filename ds "%sysfunc(getoption(work))/instanttitles.sas7bdat" ;
 
data _null_;
   /* reference the member name WITH folder path */
   infile inzip(sas/instanttitles.sas7bdat) 
	  lrecl=256 recfm=F length=length eof=eof unbuf;
   file   ds lrecl=256 recfm=N;
   input;
   put _infile_ $varying256. length;
   return;
 eof:
   stop;
run;
 
proc contents data=work.instanttitles;
run;

Partial output in my example:

                             Files in the ZIP file                          
                             The CONTENTS Procedure

 Data Set Name        WORK.INSTANTTITLES            Observations          1475
 Member Type          DATA                          Variables             6   
 Engine               V9                            Indexes               0   
 Created              01/29/2015 15:09:54           Observation Length    248 
 Last Modified        01/29/2015 15:09:54           Deleted Observations  0   
 Protection                                         Compressed            NO  
 Data Set Type                                      Sorted                NO  
 Label                                                                        
 Data Representation  WINDOWS_64                                              
 Encoding             wlatin1  Western (Windows)                              

Of course, all of this can be automated even further by writing SAS code that automatically iterates through the ZIP file member names and copies/imports each of the members as needed.
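
Here's a rough sketch of what that automation might look like. The macro name %extractAllMembers and its parameters are hypothetical. It assumes that the inzip fileref and the contents data set (with the isFolder flag) from the earlier steps still exist, and it reuses the same copy technique shown above. I haven't tried it with member names that contain spaces, which might need extra quoting.

```sas
/* Sketch: copy every file member of a ZIP archive into a folder.   */
/* Hypothetical macro; relies on the INZIP fileref and the CONTENTS */
/* data set (memname, isFolder) created earlier.                    */
%macro extractAllMembers(zipref=, list=, outdir=);
  %local i n;

  /* collect the names of the real files, skipping folder entries */
  proc sql noprint;
    select memname into :mem1- from &list. where isFolder=0;
  quit;
  %let n=&sqlobs;

  %do i=1 %to &n;
    /* target keeps only the file name, dropping any folder path */
    filename _out "&outdir./%scan(&&mem&i,-1,/)";

    data _null_;
      infile &zipref.(&&mem&i)
          lrecl=256 recfm=F length=length eof=eof unbuf;
      file _out lrecl=256 recfm=N;
      input;
      put _infile_ $varying256. length;
      return;
     eof:
      stop;
    run;

    filename _out clear;
  %end;
%mend;

%extractAllMembers(zipref=inzip, list=work.contents,
    outdir=%sysfunc(getoption(work)));
```

After the files land in the WORK folder, you can follow up with PROC IMPORT or a data set reference for each one, as in the examples above.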

tags: Copy Files, FILENAME ZIP, SAS 9.4, ZIP files

The post Using FILENAME ZIP to unzip and read data files in SAS appeared first on The SAS Dummy.

January 30, 2014
 

In a previous post, I shared an example of using ODS PACKAGE to create ZIP files. But what if you need to read a ZIP file within your SAS program? In SAS 9.4, you can use the FILENAME ZIP access method to do the job.

In this example, let's pretend that I need to analyze data that a government agency published (maybe by using SAS!) into a ZIP file. I've selected an exciting data source (found via data.gov) about Large Truck Crash Causation.

First, I need to download the latest version of the data file. I'll use PROC HTTP to do that job:

/* detect proper delim for UNIX vs. Windows */
%let delim=%sysfunc(ifc(%eval(&sysscp. = WIN),\,/));
 
/* create a name for our downloaded ZIP */
%let ziploc = %sysfunc(getoption(work))&delim.datafile.zip;
filename download "&ziploc";
 
/* Download the ZIP file from the Internet*/
proc http
 method='GET'
 url="http://ai.fmcsa.dot.gov/ltccs/Data/TEXT/Public/LTCCS_db_txt_public_01.zip"
 out=download;
run;

Next, I need to discover what files are within the ZIP file. I'll assign a fileref using the new FILENAME ZIP method. FILENAME ZIP is a directory-based access method, similar to the CATALOG access method or to using FILENAME to map to a folder. You can use functions such as DOPEN and DREAD to treat the ZIP file as if it's a file directory (since that's what it is, in concept).

/* Assign a fileref with the ZIP method */
filename inzip zip "&ziploc";
 
/* Read the "members" (files) from the ZIP file */
data contents(keep=memname);
 length memname $200;
 fid=dopen("inzip");
 if fid=0 then
  stop;
 memcount=dnum(fid);
 do i=1 to memcount;
  memname=dread(fid,i);
  output;
 end;
 rc=dclose(fid);
run;
 
/* create a report of the ZIP contents */
title "Files in the ZIP file";
proc print data=contents noobs N;
run;

Here's the report of files within the ZIP archive:


I've identified the HAZMAT.TXT file as the one that I want to analyze. I peeked at the first couple of records and was able to scratch out a simple DATA step to read the data. Notice how I don't need to explicitly extract the HAZMAT.TXT file -- I can simply reference it as a "member" of the INZIP fileref. The ZIP access method does the rest.

/* Import a text file directly from the ZIP */
data hazmat;
 infile inzip(hazmat.txt) 
   firstobs=2 dsd dlm='09'x;
 input 
  CaseID $10.
  VehicleNumber 
  Material 
  Reportable 
  Waiver 	
  PSU	 
  PSUStrata	
  RATWeight;
run;
 
title "Box plot of Vehicles # per incident";
ods graphics / height=200 width=450;
proc sgplot data=hazmat;
	hbox vehiclenumber;
	label VehicleNumber="# of vehicles";
	xaxis labelattrs=(size=12) valueattrs=(size=12);
run;

SAS reads my data file successfully, and yields this interesting box plot from the SGPLOT step:


(It looks like most "hazardous materials" accidents involved just 2 or 3 vehicles, except for one messy outlier that had nearly 30. Imagine the cleanup effort on that one!)

As an alternative, if I know exactly which file I need, I can assign a direct fileref by using the MEMBER= syntax:

filename inzip zip "&ziploc" member="hazmat.txt";
 
/* then my INFILE references the file directly, no parenthesized-member */
data hazmat;
 infile inzip
   firstobs=2 dsd dlm='09'x;
/* ...  */

The ZIP access method isn't just for reading. I can also use it to create and update ZIP files. For creating ZIP files, I prefer to use ODS PACKAGE. But it's very handy to be able to update ZIP files from a SAS program without using an external tool. For example, here's a program that deletes an extraneous file from an existing ZIP file:

/* Remove the PackageMetadata piece that ODS PACKAGE creates */
filename pkg ZIP "c:\projects\filenamezip\new.zip" member="PackageMetaData";
data _null_;
 if (fexist('pkg')) then 
  rc = fdelete('pkg');
run;

Note: Like ODS PACKAGE, the FILENAME ZIP method does not support encrypted (password-protected) ZIP archives.

Download the complete SAS 9.4 program: filenameZipHttpExample.sas

Thanks to the growing size of data files, ZIP files are created and consumed by SAS users everywhere. Between ODS PACKAGE and FILENAME ZIP, you can teach your SAS programs to build and read the files without having to rely on external tools. The more that you can use native SAS methods for this work, the more portable your SAS programs will be.

tags: FILENAME ZIP, PROC HTTP, SAS 9.4, ZIP files
January 29, 2014
 

SAS users are big data consumers and big data creators. Often, we have to deal with large data files (or many smaller files) -- and that means ZIP compression. ZIP compression tools such as gzip, 7-Zip, and WinZip are ubiquitous, but they aren't always convenient to use from within a SAS program. To use an external ZIP utility, you must issue a shell command via the X command or SYSTASK function, and that's not always possible within today's complex SAS environments.

Fortunately, SAS can read and write ZIP files directly. Ever since SAS 9.2, we've been able to create ZIP files with ODS PACKAGE. Beginning with SAS 9.4, we can read ZIP content by using FILENAME ZIP.

In this post, I'll review how to create ZIP files using ODS PACKAGE. I'll cover reading ZIP files with FILENAME ZIP in a future post.

Let's pretend that I'm working for a government agency, and that part of my job is to crunch some government data and publish it for the public. Of course, I'm using SAS for the analysis, but I need to publish the data in a non-proprietary format such as CSV. (It seems unbelievable, I know, but not every citizen is lucky enough to have access to SAS.)

First, I'll set up the output directory for this project. Since the ZIP file will contain a couple of files, including a subfolder, I want to mirror that structure here. The FEXIST and FDELETE functions will delete an existing ZIP file (perhaps left over from the last time I ran the process). The DLCREATEDIR option will create a "data" subfolder as needed. All of these mechanisms interact with the file system, but do not require XCMD privileges. This means that they'll work in SAS Enterprise Guide and stored processes.

%let projectDir = c:\projects\sgf2013\filenamezip;
 
/* Clean slate! */
filename newfile "&projectDir./carstats.zip";
data _null_;
  if (fexist('newfile')) then 
  	rc = fdelete('newfile');
run;
filename newfile clear;
 
/* Create folder if it doesn't exist */
options dlcreatedir;
libname out "&projectDir./data";

Next, I need to create the content to include in the ZIP file. In this scenario, I'm crunching some heavy-duty numbers about Cars data, and then putting the results into a CSV file. Then I'm creating a README file in RTF format; the document contains a simple data dictionary plus instructions (such as they are) for using the data. I used ODS TEXT to throw in some ad-hoc text among the SAS output.

/* Create some data */
filename newcsv "&projectDir./data/pct.csv";
proc means noprint data=sashelp.cars;
var msrp;
output out=out.pct median=p50 p95=p95 p99=p99;
run;
ods csv file=newcsv;
proc print data=out.pct;
format _all_; /* clear the formats */
run;
ods csv close;
 
/* Create an informative document about this package */
filename rm "&projectDir./readme.rtf";
ods rtf(readme) 
  file="&projectDir./readme.rtf" style=Printer;
ods rtf(readme) 
  text="These are some instructions for what to do next";
proc datasets lib=out nolist;
contents data=pct;
quit;
ods rtf(readme) close;

Finally, I'm going to take those results and package them in a ZIP file. The ODS PACKAGE mechanism was originally designed to share results from a SAS stored process. By default, it adds a PackageMetaData entry that a consuming SAS application could use to interpret the result. In this case we don't need this entry; the NOPF option suppresses it.

Notice that I specify the PATH= option to place the CSV file in the "data" folder within the archive. As soon as the ODS PACKAGE CLOSE statement executes, the ZIP file is created.

/* Creating a ZIP file with ODS PACKAGE */
ods package(newzip) open nopf;
ods package(newzip) add file=newcsv path="data/";
ods package(newzip) add file=rm;
ods package(newzip) publish archive 
  properties(
   archive_name="carstats.zip" 
   archive_path="&projectDir."
  );
ods package(newzip) close;

Here's a screen shot of the ZIP file opened in WinZip:

That's it! I can add any file that I want to the ZIP archive; I'm not restricted to files that were created by SAS. This makes it easy to use SAS as an automated method to update data archives regularly, creating user-friendly packages for consumers to make use of our data.

Note: A common question: does ODS PACKAGE (and FILENAME ZIP) support password-protected ZIP files (encryption)? The answer is No. If that's a requirement, you'll need to use an external package such as 7-Zip.

Download the complete program (SAS 9.3 or later): createZipODSPackage.sas


tags: DLCREATEDIR, FILENAME ZIP, ODS PACKAGE, ZIP files