Macro Programming

7月 062018
 

SAS programmers have long wanted the ability to control the flow of their SAS programs without having to resort to complex SAS macro programming. With SAS 9.4 Maintenance 5, it's now supported! You can now recently came to light on SAS Support Communities. (Thanks to Super User Tom for asking about it.)

Prior to this change, if you wanted to check a condition -- say, whether a data set exists -- before running a PROC, you had to code it within a macro routine. It would look something like this:

/* capture conditional logic in macro */
%macro PrintIfExists();
 %if %sysfunc(exist(work.result)) %then
  %do;
    proc means data=work.result;
    run;
  %end; 
%else
  %do;
    %PUT WARNING: Missing WORK.RESULT - report process skipped.;
  %end;
%mend;
 
/* call the macro */
%PrintIfExists();

Now you can simplify this code to remove the %MACRO/%MEND wrapper and the macro call:

/* If a file exists, take an action */
/* else fail gracefully */
%if %sysfunc(exist(work.result)) %then
  %do;
    proc means data=work.result;
    run;
  %end;
%else
  %do;
    %PUT WARNING: Missing WORK.RESULT - report process skipped.;
  %end;

Here are some additional ideas for how to use this feature. I'm sure you'll be able to think of many more!

Run "debug-level" code only when in debug mode

When developing your code, it's now easier to leave debugging statements in and turn them on with a simple flag.

/* Conditionally produce debugging information */
%let _DEBUG = 0; /* set to 1 for debugging */
%if &_DEBUG. %then
  %do;
    proc print data=sashelp.class(obs=10);
    run;
  %end;

If you have code that's under construction and should never be run while you work on other parts of your program, you can now "IF 0" out the entire block. As a longtime C and C++ programmer, this reminds me of the "#if 0 / #endif" preprocessor directives as an alternative for commenting out blocks of code. Glad to see this in SAS!

/* skip processing of blocks of code */
/* like #if 0 / #endif in C/C++      */
%if 0 %then
  %do;
    proc ToBeDetermined;
      READMYMIND = Yes;
    run;
  %end;

Run code only on a certain day of the week

I have batch jobs that run daily, but that send e-mail to people only one day per week. Now this is easier to express inline with conditional logic.

/*If it's Monday, send a weekly report by email */
%if %sysfunc(today(),weekday1.)=2 %then
  %do;
    options emailsys=smtp emailhost=myhost.company.com;
    filename output email
      subject = "Weekly report for &SYSDATE."
      from = "SAS Dummy <sasdummy@sas.com>"
      to = "knowledgethirster@curious.net"
      ct ='text/html';
 
  ods tagsets.msoffice2k(id=email) 
    file=OUTPUT(title="Important Report!")
    style=seaside;
   title "The Weekly Buzz";
   proc print data=amazing.data;
   run;
  ods tagsets.msoffice2k(id=email) close;
  %end;

Check a system environment variable before running code

For batch jobs especially, system environment variables can be a rich source of information about the conditions under which your code is running. You can glean user ID information, path settings, network settings, and so much more. If your SAS program needs to pick up cues from the running environment, this is a useful method to accomplish that.

/* Check for system environment vars before running code */
%if %sysfunc(sysexist(ORACLE_HOME)) %then
  %do;
    %put NOTE: ORACLE client is installed.;
    /* assign an Oracle library */
    libname ora oracle path=corp schema=alldata authdomain=oracle;
  %end;

Limitations of %IF/%THEN in open code

As awesome as this feature is, there are a few rules that apply to the use of the construct in open code. These are different from what's allowed within a %MACRO wrapper.

First rule: your %IF/%THEN must be followed by a %DO/%END block for the statements that you want to conditionally execute. The same is true for any statements that follow the optional %ELSE branch of the condition.

And second: no nesting of multiple %IF/%THEN constructs in open code. If you need that flexibility, you can do that within a %MACRO wrapper instead.

And remember, this works only in SAS 9.4 Maintenance 5 and later. That includes the most recent release of SAS University Edition, so if you don't have the latest SAS release in your workplace, this gives you a way to kick the tires on this feature if you can't wait to try it.

The post Using %IF-%THEN-%ELSE in SAS programs appeared first on The SAS Dummy.

7月 122016
 

I've been working on a SAS program that can add content to the SAS Support Communities (more on that in a future post). Despite my 20+ years of SAS experience, there are a lot of SAS programming tricks that I don't know. Or that I use so infrequently that I always need to remind myself how to accomplish them.

Here's one. I needed to read the contents of an external text file into a SAS macro variable, so that I could then use that value (a very long text string) as part of an API call. In searching for a technique that would work for me, I came across a similar question on SAS Support Communities -- one that had been solved by our resident SASJedi, Mark Jordan. Perfect!

Here's the solution that worked for me:

FILENAME msghtml "path-to-text-file" ;
data _null_;
   length text $32767;
   retain text '';
   infile msghtml flowover dlmstr='//' end=last;
   input;
   text=cats(text,_infile_);
   if last then call symput('MSGBODY',text);
run;

The RETAIN statement allows me to build up the "text" variable as the DATA step processes multiple lines. The END=last on the INFILE statement sets a flag when we hit end-of-file, so I know that we're done and I can CALL SYMPUT the macro value. The FLOWOVER option tells the INPUT statement to keep reading even if no input values are found in the current record. (FLOWOVER is the default behavior, so the option probably isn't needed here.) DLMSTR allows you to specify a multichar delimiter string that's different than the default delimiter (a space character). We're using the CATS function to concatenate a trimmed version of the input buffer (_INFILE_) to the RETAINed "text" variable.

For my project I needed to URL-encode the text value for use in an HTTP-based REST API. So for me, the last line is really:

if last then call symput('MSGBODY',urlencode(trim(text)));

The SAS Support Communities has been a big help to me during this project -- a project that is designed to improve the communities even more. It's a virtuous cycle! I hope that this helps some of you out there, too.

tags: macro programming, SAS Communities, SAS programming

The post How to read the contents of a file into a SAS macro variable appeared first on The SAS Dummy.

1月 262015
 

splitdataBack in the day when the prison system forced inmates to perform "hard labor", folks would say (of someone in prison): "He's busy making little ones out of big ones." This evokes the cliché image of inmates who are chained together, forced to swing a chisel to break large rocks into smaller rocks. (Yes, it seems like a pointless chore. Here's a Johnny Cash/Tony Orlando collaboration that sets it to music.)

SAS programmers are often asked to break large data sets into smaller ones. Conventional wisdom says that this is also a pointless chore, since you can usually achieve what you want (that is, process a certain subset of data) by applying a WHERE= option or FIRSTOBS=/OBS= combination. Splitting a data set creates more files, which occupy more disk space and forces more I/O operations. I/O and disk access is often the most expensive part of your SAS processing, performance-wise.

But if the boss asks for broken-up data sets, you might as well spend the least possible effort on the task. Let's suppose that you need to break up a single data set into many based on the value of one of the data columns. For example, if you need to break SASHELP.CARS into different tables based on the value of Origin, the SAS program would look like:

DATA out_Asia;
 set sashelp.cars(where=(origin='Asia'));
run;
DATA out_Europe;
 set sashelp.cars(where=(origin='Europe'));
run;
DATA out_USA;
 set sashelp.cars(where=(origin='USA'));
run;

I'm going to admit right now that this isn't the most efficient or elegant method, but it's something that most beginning SAS programmers could easily come up with.

Writing the above program is easy, especially since there are only 3 different values for Origin and I've memorized their values. But if there are more discrete values for the "split-by" column, the task could involve much more typing and has a high possibility for error. This is when I usually use PROC SQL to generate the code for me.

If you've read my article about implementing BY processing for an entire SAS program, you know that you can use PROC SQL and SELECT INTO to place data values from a data set into a macro variable. For example, consider this simple program:

proc sql;
 select distinct ORIGIN into :valList separated by ',' from SASHELP.CARS;
quit;

It creates a macro variable VALLIST that contains the comma-separated list: "Asia,Europe,USA".

But we can use SAS functions to embellish that output, and create additional code statements that weave the data values into SAS program logic. For example, we can use the CAT function to combine the values that we query from the data set with SAS keywords. The results are complete program statements, which can then be referenced/executed in a SAS macro program. I'll share my final program, and then I'll break it down a little bit for you. Here it is:

/* define which libname.member table, and by which column */
%let TABLE=sashelp.cars;
%let COLUMN=origin;
 
proc sql noprint;
/* build a mini program for each value */
/* create a table with valid chars from data value */
select distinct 
   cat("DATA out_",compress(&COLUMN.,,'kad'),
   "; set &TABLE.(where=(&COLUMN.='", &COLUMN.,
   "')); run;") into :allsteps separated by ';' 
  from &TABLE.;
quit;
 
/* macro that includes the program we just generated */
%macro runSteps;
 &allsteps.;
%mend;
 
/* and...run the macro when ready */
%runSteps;

Here are the highlights from the PROC SQL portion of the program:

  • SELECT DISTINCT ensures that the results include just one record for each unique value of the variable.
  • The CAT function concatenates a set of string values together. Note that CATX and CATS and CATT -- other variations of this function -- will trim out white space from the various string elements. In this case I want to keep any blank characters that occur in the data values because we're using those values in an equality check.
  • The program calculates a name for each output data set by using each data value as a suffix ("OUT_dataValue"). SAS data set names can contain only numbers and letters, so I use the COMPRESS function to purge any invalid characters from the data set name. The 'kad' options on COMPRESS tell it to keep only alpha and digit characters.
  • The resulting program statements all end up in the &ALLSTEPS macro variable. I could just reference the &ALLSTEPS variable in the body of the SAS program, and SAS would run it as-is. Instead I chose to wrap it in the macro %runSteps. This makes it a little bit easier to control the scope and placement of the executable SAS program statements.

"By each value of a variable" is just one criterion that you might use for splitting a data set. I've seen cases where people want to split the data based on other rules, such as:

  • Quantity of observations (split a 3-million-record table into 3 1-million-record tables)
  • Rank or percentiles (based on some measure, put the top 20% in its own data set)
  • Time span (break up a data set by year or month, assuming the data records contain a date or datetime variable)

With a small modification, my example program can be adapted to serve any of these purposes. What about you? Are you ever asked to split up SAS data sets, and if so, based on what criteria? Leave a comment and tell us about it.

tags: data management, macro programming, PROC SQL, SAS programming
10月 022013
 

Has this ever happened to you? You have a SAS program with statements that you wrote, then you make use of a macro function or %include file supplied by a helpful colleague, and when your SAS code resumes, you find that all of your SAS titles have been changed out from under you!

To be a "good citizen" within a macro function, you should return the SAS environment to the way you found it. You'll occasionally see constructs like this to store and restore SAS option values:

%let _currValidName = %sysfunc(getoption(validvarname)); 
%put Overriding current VALIDVARNAME setting of &_currValidName; 
options validvarname=v7;
 
/* some statements that rely on VALIDVARNAME */
 
%put Restoring previous VALIDVARNAME setting to &_currValidName; 
options validvarname=&_currValidName.;

(Even more is possible with PROC OPTSAVE; see Denise Poll's paper on this topic.)

But what about the titles and footnotes? Is there a way to squirrel those values away before usurping them, and then put them back the way that you found them?

Yes, there is. Even though TITLE and FOOTNOTE are global SAS statements and not system options per se, you can programmatically query the current settings from one of the SAS dictionary tables: SASHELP.VTITLE. (Despite the "VTITLE" name, this table also contains FOOTNOTE settings.)

You can use these macros (or the code within them) to save and restore the current settings for TITLE and FOOTNOTE statements:

/* Define macro to save titles */
%macro saveTitles;
  data _savedTitles;
    set sashelp.vtitle;
  run;
%mend;
 
/* Define macro to restore previously saved titles */
%macro restoreTitles;
  proc sql noprint;
    /* Using a SAS 9.3 feature that allows open-ended macro range */
    select text into :SavedTitles1- from _savedTitles where type="T";
    %let SavedTitlesCount = &sqlobs.;
 
    /* and footnotes */
    select text into :SavedFootnotes1- from _savedTitles where type="F";
    %let SavedFootnotesCount = &sqlobs.;
 
    /* remove data set that stored our titles*/
    drop table _savedTitles;
  quit;
 
  /* emit statements to reinstate the titles */
  TITLE; /* clear interloping titles */
  %do i = 1 %to &SavedTitlesCount.;
    TITLE&i. "&&SavedTitles&i.";
  %end;
 
  FOOTNOTE; /* clear interloping footnotes */
  %do i = 1 %to &SavedFootnotesCount.;
    FOOTNOTE&i. "&&SavedFootnotes&i.";
  %end;
%mend;

Sample use:

title "This is my tremendous title";
title2 "and this is a subtitle";
footnote "Created by ME";
proc means data=sashelp.class;
  var weight;
run;
%saveTitles;
 
%someMacroThatChangesTitles();
 
/* restore the old titles */
%restoreTitles;
proc means data=sashelp.class;
  var height;
run;

There is at least one idiosyncrasy of this approach: if your title or footnote includes a macro variable or expression, that macro will be resolved when the title is stored. So when you restore to the original value with %restoreTitles(), the value will be "stuck" as it was when you used %saveTitles(). If you dislike that limitation, then perhaps some hotshot macro programmer will add a solution here in the comments.

tags: macro programming, PROC OPTSAVE, SAS programming
9月 182013
 

A couple of years ago I shared a method for copying any file within a SAS program. It was a simple approach, copying the file byte-by-byte from one fileref (SAS file reference) to another.

My colleague Bruno Müller, a SAS trainer in Switzerland, has since provided a much more robust method. Bruno's method has several advantages:

  • It's coded as a SAS macro, so it is simple to reuse -- similar to a function.
  • It copies the file content in chunks rather than byte-by-byte, so it's more efficient.
  • It provides good error checks and reports any errors and useful diagnostics to the SAS log.
  • It's an excellent example of a well-documented SAS program!

Bruno tells me that "copying files" within a SAS program -- especially from nontraditional file systems such as Web sites -- is a common need among his SAS students. I asked Bruno for his permission to share his solution here, and he agreed.

To use the macro, you simply define two filerefs: _bcin (source) and _bcout (target), then call the %binaryFileCopy() macro. Here is an example use that copies a file from my Dropbox account:

filename _bcin TEMP;
filename _bcout "C:\temp\streaming.sas7bdat";
proc http method="get" 
 url="https://dl.dropbox.com/s/pgo6ryv8tfjodiv/streaming.sas7bdat" 
 out=_bcin
;
run;
 
%binaryFileCopy()
%put NOTE: _bcrc=&_bcrc;
 
filename _bcin clear;
filename _bcout clear;

The following is partial log output from the program:

NOTE: BINARYFILECOPY start  17SEP2013:20:50:33
NOTE: BINARYFILECOPY infile=_bcin C:\SASTempFiles\_TD5888\#LN00066
NOTE: BINARYFILECOPY outfile=_bcout C:\temp\streaming.sas7bdat

NOTE: BINARYFILECOPY processed 525312 bytes
NOTE: DATA statement used (Total process time):
      real time           0.20 seconds
      cpu time            0.07 seconds    

NOTE: BINARYFILECOPY end  17SEP2013:20:50:34
NOTE: BINARYFILECOPY processtime 00:00:00.344

You can download the program -- which should work with SAS 9.2 and later -- from here: binaryfilecopy.sas

Update: using FCOPY in SAS 9.4

Updated: 18Sep2013
Within hours of my posting here, Vince DelGobbo reminded me about the new FCOPY function SAS 9.4. With two filerefs assigned to binary-formatted files, you can use FCOPY to copy the content from one to the other. When I first tried it with my examples, I had problems because of the way FCOPY treats logical record lengths. However, Jason Secosky (the developer for FCOPY and tons of other SAS functions) told me that if I use RECFM=N on each FILENAME statement, the LRECL would not be a problem. And of course, he was correct.

Here's my example revisited:

filename _bcin TEMP recfm=n /* RECFM=N needed for a binary copy */;
filename _bcout "C:\temp\streaming.sas7bdat" recfm=n;
 
proc http method="get" 
 url="https://dl.dropbox.com/s/pgo6ryv8tfjodiv/streaming.sas7bdat" 
 out=_bcin
;
run;
 
data _null_;
   length msg $ 384;
   rc=fcopy('_bcin', '_bcout');
   if rc=0 then
      put 'Copied _bcin to _bcout.';
   else do;
      msg=sysmsg();
      put rc= msg=;
   end;
run;
 
filename _bcin clear;
filename _bcout clear;
tags: Copy Files, FCOPY, macro programming, SAS 9.4, SAS programming
6月 052013
 

One of the great things about SAS libraries is that you can write your programs to read and write data without having to worry about where the data lives. SAS data set on a file system? Oracle table in a database server? Hadoop data in Hive? For many SAS applications, the programmer can treat these very different data sources in exactly the same way.

...except for the cases where you can't treat them the same, because they are different. These different databases have different capabilities and behaviors, and sometimes you need to optimize your SAS programs to take advantage of those differences.

Recently, a colleague at SAS needed a reliable SAS macro-based method to detect the library engine for a given libref. Here's what we came up with:

%macro getEngine(libref);
 %global ENGINE;
 %let dsid=%sysfunc(open(sashelp.vlibnam(where=(libname="&libref.")),i));
 %if (&dsid ^= 0) %then %do;  
   %let engnum=%sysfunc(varnum(&dsid,ENGINE));
   %let rc=%sysfunc(fetch(&dsid));
   %let engine=%sysfunc(getvarc(&dsid,&engnum));
       %put &libref. ENGINE is &engine.;
   %let rc= %sysfunc(close(&dsid.));
 %end;
%mend;

This simple macro peeks inside SASHELP.VLIBNAM, a virtual table that maintains the vital characteristics of all of the active SAS libraries in a session. The macro queries the table for the engine name for the given library, and places the result in a macro variable named &ENGINE.

Here are some example uses and results. The first two calls are for built-in SAS libraries, which use the BASE engine (aliased to "V9"). The third call is for a MySQL library that I use for reporting on our WordPress database.

34         %getEngine(SASHELP);
SASHELP ENGINE is V9
35         %getEngine(WORK);
WORK ENGINE is V9
36         %getEngine(WPBLOGS);
WPBLOGS ENGINE is MYSQL
tags: data access, macro programming, SAS libraries, SAS programming
5月 172013
 

A few months ago I released the Copy Files task for use with SAS Enterprise Guide. The task allows you to transfer any files between your PC and a SAS Workspace session, much like an FTP process. It doesn't rely on FTP though; it uses a combination of SAS code, Windows APIs, and SAS Integration Technologies to get the job done.

It's proven to be a very popular task, because it can be useful in so many situations. It even earned a mention in a SAS Global Forum paper this year (and no, it wasn't a paper that I wrote).

Today I'm going to point out the things that the task doesn't do so well. Or at least, that it didn't do well until I made some updates. My changes were based on two "complaints" from several SAS users.

Read on for the details. But if you don't care and you just want the latest version of the task, you can download it from here.

Complaint #1: Wildcards that are a little too "wild"

The task allows you to use wildcard characters in your file specifications so that you can match multiple files to transfer. A problem occurs though, when your file specification looks like this:

/usr/local/data/*.xls

Can you guess the problem? What if I told you that the task stores your file specification in a SAS macro variable? Yep, it's that "/*" sequence in the value that trips things up, because SAS interprets it as the start of a comment. Left unchecked, this sabotages the remainder of the SAS code that is included in the process.

The SAS macro experts are already shouting out the answer to fix this: use %STR to wrap the slash and "hide" the token from the SAS parser. That's a great idea! Except that the task relies on the SAS "internal" value for this value --and not the displayed value -- when it comes time to process. These values are different when %STR wraps a special character like the forward slash. The macro facility changes out this character with a hexadecimal character called a delta character.

To illustrate, I used another popular custom task -- the SAS Macro Variable Viewer -- to show the inner value of a SAS macro variable:

Notice the funky arrow characters. Is that what you were expecting?

Now the task detects the presence of a forward slash (and some other special characters) and will automatically add the %STR so you don't have to. (But you can still use %STR if you want to.) And it correctly detects the delta characters, if present, to convert them back to their correct form before trying to use the value.

Complaint #2: Fixing line-ending characters but breaking other stuff

Users of FTP might be familiar with binary versus ASCII mode for file transfers. Because UNIX line-endings are different than Windows line-endings for text files, transferring a file in ASCII mode helps to ensure proper line-ending behavior for the target host.

The Copy Files task transfers ALL files using a binary mode. Why? Because in today's global workplace even text-based files often don't adhere to the limited English-centric ASCII standard. Attempting a text-based file transfer could result in encoding mismatches, so it's much safer to transfer content as "binary blobs".

But you still want your text files to have the proper line endings for the target host. To answer that, the Copy Files task offers a "Fix line-ending characters" option that does the following:

  • Scans the file to determine whether it's a text file. (This relies on the file content and not on special file extensions such as .TXT or .CSV.).
  • Rewrites the file and replaces the line-ending characters as needed for the target file system (Windows or UNIX).

The problem was that in rewriting the file (using Windows-based StreamReader and StreamWriter functions), the Copy Files task was changing the file encoding to UTF-8. That encoding works fine on Windows and most users didn't even notice. But some users sent me output from file dump tools and comparisons that showed the byte-order mark characters that were added to the file. (SAS users: I knew I could count on you!)

To address this, I changed the "fix line endings" process to use lower level I/O functions that simply scan through the text files as a binary stream, byte-for-byte, and change the line endings as needed. Trying to decide on proper encoding is risky business, so I decided to leave the character encoding untouched.

In addition to my own testing, a couple of users out there have confirmed that my changes fix the issues -- at least for now. Thanks for that! If you want to try the latest, get it now from here:

>> Download the Copy Files task

Related articles

Copying files in SAS Enterprise Guide
Inspecting SAS macro variables in SAS Enterprise Guide

tags: FTP, macro programming, SAS custom tasks, SAS Enterprise Guide
10月 092012
 

You might know about the many automatic macro variables that are available in SAS. They provide plenty of information about the SAS environment, such as the current user (SYSUSERID), the SAS version (SYSVER and SASVLONG), and the operating system where SAS is running (SYSCP and SYSCPL). That information is often useful for your SAS programs and reporting output.

Did you know that SAS Enterprise Guide also generates several macro variables? These provide information about the SAS Enterprise Guide environment, which is often different than the SAS environment when you're connected to a remote SAS workspace.

You can use these macro variables to code various logic tricks into your programs, such as to determine where the current project file resides and assign a library to its path.

Or, even if you don't write SAS programs, you can use the macro variables in task titles and footnotes. For example:

Generated by &_CLIENTUSERNAME (&_CLIENTUSERID) with SAS Enterprise Guide &_CLIENTVERSION

For me, this yields:

Generated by 'Chris Hemedinger' ('sascrh') with SAS Enterprise Guide '5.100.0.12019'

Here's the complete list of the generated macro variables (as of SAS Enterprise Guide 4.3 and 5.1). These macro variables are also generated in the SAS Add-In for Microsoft Office. You can always find this list in the SAS Enterprise Guide online help, which you access from the help menu. Simply search the keyword index for "macro".

Macro Variable Description
_CLIENTAPP Name of the client application.
_CLIENTMACHINE Client machine node name.
_CLIENTPROJECTNAME The filename for the project.
_CLIENTPROJECTPATH The full path and filename for the project.
_CLIENTTASKFILTER The filter that is defined for the task. You can use this macro variable in the titles and footnotes of the task, so that the filter information is displayed in the title or footnote of your results.
_CLIENTTASKLABEL Label for the current task. This is the text label that is displayed in the project tree and the process flow.
_CLIENTUSERID User ID of the client user.
_CLIENTUSERNAME Full user name, if that information is available.
_CLIENTVERSION Application version, including build number.
_SASHOSTNAME Server node name (IP address or DNS name).
_SASPROGRAMFILE The full path and filename of the SAS program that is currently being run. This macro variable is available only for SAS program files that are saved on the same server on which your SAS Enterprise Guide code is being run.
_SASSERVERNAME Name of the logical server.

And remember, you can easily see all of the available macro variables, along with their current values, by using the SAS Macro Variable viewer custom task.

tags: macro programming, SAS Enterprise Guide
3月 232012
 

Earlier this week I described a common programming pattern in the SAS macro language. The pattern sets up a loop for processing each distinct value of a classification variable. The program uses the PROC SQL SELECT INTO feature to populate SAS macro variables. The effect: you can roll your own BY processing in SAS.

I posted the example and received many useful comments, including news about new features in SAS 9.3 that allow us to improve this pattern. Instead of two SELECT statements (one to gather a count, and a second to populate a range of variables), we can now achieve the same with just one SELECT statement (and thus, just one pass through the data):

/* SAS 9.3 approach */
/* Create macro vars with values and a total count of distinct values */
proc sql noprint;
  select distinct TYPE into :varVal1- from SASHELP.CARS;
  %let varCount = &SQLOBS.;
quit;

With the new SAS 9.3 syntax, we can now specify an open-ended range for macro variable names. SAS will allocate as many macro variables as are needed to store each value. The values will be TRIMMED of whitespace by default, but you can control this by specifying NOTRIM before the FROM keyword.

Since we don't need to know the count ahead of time, we can then safely use &SQLOBS after the SELECT to determine the upper bound of our index.

You can also specify leading zeros in the macro variable range, which can help with sorting the ordinals later. Here's a revised example:

proc sql noprint;
  /* using leading zero */
  select distinct MAKE into :varVal01- from SASHELP.CARS;
  %let varCount = &SQLOBS.;
quit;

The accompanying image shows the resulting macro variables in the SAS Macro Variable Viewer, neatly collated in numerical order. (Remember that if you do add the leading zeros, you may need to use the Zw.d format when building up the macro variable names within a loop.)

tags: macro programming, sas 9.3, SAS programming, sql
3月 202012
 

Most SAS procedures support the BY statement, which allows you to create a report or analysis for each distinct value of a variable in your data set. The syntax is simple, and SAS procedures are usually tuned to do a good job of processing the data efficiently.

However, the BY statement approach has some limitations:

For example, suppose you want to show a PROC PRINT output followed by a PROC SGPLOT chart for each value of a variable in a data set. Using the BY statement in each of these two steps would produce a series of PROC PRINT results, followed by a series of PROC SGPLOT results. (Oh, and don't forget that you can use SGPLOT and SGPANEL for classification plots without explicit BY processing.)

There is a SAS programming pattern that allows you to extend the concept of BY processing to larger segments of your SAS program. In pseudo-code, this allows you to implement program logic such as:

for each value BY_VAL of VAR in DATASET 
  do;
    PROC step1 (for VAR=BY_VAL)
    run;
    PROC step2 (for VAR=BY_VAL)
    run;
    /* other PROC or DATA steps as needed */
    /* each for the case of VAR=BY_VAL    */
  end;

In real SAS code, the programming pattern uses the SAS macro language. (Of course! Because with SAS macro, there is almost nothing that you can't do.)

Let's take a simple code example through this transformation. Let's combine a PROC FREQ step with a PROC SGPLOT step. Here's the example without BY processing or classification. The results show a report and plot for ALL values of Type within SASHELP.CARS. It's okay, but it doesn't provide much insight into the different classifications of car Type (Hybrid, Truck, Sedan, and so on).

title "Type by Origin in SASHELP.CARS";
proc freq data=sashelp.cars;
  table origin*type /nocum nopercent list;
run;
 
proc sgplot data=sashelp.cars;
  histogram mpg_city;
  density mpg_city / type=kernel;
run;

Step 1. Get your program working for one value using WHERE=

Before introducing any macro statements into the mix, it's a good idea to get your non-macro logic correct. Things are much easier to debug and troubleshoot before you add any macro processing. This modification shows what the program would look like with one known value of Type:

title "HYBRID by Origin in SASHELP.CARS";
proc freq data=sashelp.cars (where=(type="Hybrid"));
  table origin*type /nocum nopercent list;
run;
 
proc sgplot data=sashelp.cars(where=(type="Hybrid"));
  histogram mpg_city;
  density mpg_city / type=kernel;
run;

Example output:

Step 2. Count the distinct values and create macro variables for each

Once you're happy with the output for the single distinct value in you created in Step 1, it's time to gather the information you need to repeat that step for each distinct value in the data. Before you can achieve this with a SAS macro loop, you're going to need two bits of information: how many distinct values are there (for the loop index), and what ARE those distinct values (for each iteration). Here's the pattern of code to figure this out:

/* create macro vars with values and a total count of distinct values */
proc sql noprint;
  select strip(put(count(distinct VAR),15.)) into :varCount from DATASET;
  select distinct VAR into :varVal1- :varVal&varCount  from DATASET;
quit;

This pattern uses the SELECT INTO feature of PROC SQL, which allows you to populate SAS macro variables with the output of any query operation. I use this code pattern often, so I created an editor abbreviation to make it easy to insert into any SAS program as I work. Then I can simply replace the key parts of the statements with the values I need from the program, such as for this example:

proc sql noprint;
  select strip(put(count(distinct TYPE),15.)) into :varCount from SASHELP.CARS;
  select distinct TYPE into :varVal1- :varVal&varCount  from SASHELP.CARS;
quit;

I replaced "VAR" with my class variable, "TYPE". And I replaced "DATASET" with "SASHELP.CARS". That's it.

UPDATE 26Mar2012: With new features in SAS 9.3, the above code pattern becomes even simpler. See how to improve on this SAS programming pattern.

When I run just these statements in SAS Enterprise Guide, I can use my SAS Macro Variable Viewer to see how my macro variables were initialized.

Step 3. Wrap your program logic in a macro function, and add a %DO loop

Now it's time to wrap the program segment in a %MACRO statement, and then add a %DO loop so that the segment is processed varCount times. (That's going to be 6 times, in our example.)

%macro ReportOnEachType;
  %do index = 1 %to &varCount;
    title "HYBRID by Origin in SASHELP.CARS";
    proc freq data=sashelp.cars (where=(type="Hybrid"));
      table origin*type /nocum nopercent list;
    run;
 
    proc sgplot data=sashelp.cars(where=(type="Hybrid"));
      histogram mpg_city;
      density mpg_city / type=kernel;
    run;
  %end;
%mend;
%ReportOnEachType;

Step 4. Fix your WHERE= processing to reference the macro variables

Now we've got the correct number of steps running, but the output is the same for each step -- not very interesting! We need to customize the program statements to use each distinct value of Type.

%macro ReportOnEachType;
  %do index = 1 %to &varCount;
    title "&&varVal&index. by Origin in SASHELP.CARS";
    proc freq data=sashelp.cars (where=(type="&&varVal&index."));
      table origin*type /nocum nopercent list;
    run;
 
    proc sgplot data=sashelp.cars(where=(type="&&varVal&index."));
      histogram mpg_city;
      density mpg_city / type=kernel;
    run;
  %end;
%mend;
%ReportOnEachType;

Note how we need to reference our macro variable that contains the distinct value: "&&varVal&index." You need a "double &" to dereference (fancy word for reference the reference) the correct macro variable for the current index value.

Step 5. (OPTIONAL) Keep tinkering towards perfection

Now you've "rolled your own" BY group processing, but don't stop there! With a few more tweaks we can make it even better:

/* create macro vars with values and a total count of distinct values */
proc sql noprint;
  select strip(put(count(distinct TYPE),15.)) into :varCount from SASHELP.CARS;
  select distinct TYPE into :varVal1- :varVal&varCount  from SASHELP.CARS;
quit;
 
%macro ReportOnEachType;
  title;
  ods noproctitle;
  ods layout start columns=2;
  %do index = 1 %to &varCount;
    ods region;
    proc freq data=sashelp.cars (where=(type="&&varVal&index."));
      SYSECHO "Processing freq for &&varVal&index., &index. of &varCount.";
      table origin*type /nocum nopercent list;
    run;
    ods region;
    ods graphics / width=500 height=400 imagename="plot&&varVal&index.";
    proc sgplot data=sashelp.cars(where=(type="&&varVal&index."));
      SYSECHO "Processing plot for &&varVal&index., &index. of &varCount.";
      histogram mpg_city;
      density mpg_city / type=kernel;
      xaxis label="MPG (City) for &&varVal&index.";
    run;
  %end;
  ods layout end;
%mend;
%ReportOnEachType;

I added ODS LAYOUT statements (officially experimental, but works well in HTML) to generate two-column output, with tables and charts side-by-side. I added SYSECHO statements so that I can track progress of the program as it runs in SAS Enterprise Guide. If you've got lots of data with lots of distinct values, it can take a while. You might appreciate the running status message. And I added some ODS GRAPHICS options to control the name of the output image files, just to make my results easier to track on the file system.

Here's an example of my final report. Can you take it even further? I'll bet that you can!

tags: BY group, macro programming, ods layout, SAS programming, SGPLOT