SAS tips

3月 292011
Many SAS users love "undocumented features" within SAS software that they have found or heard about. Sometimes they can be really useful, and the fact that they are undocumented adds to the mystique. Some users have written entire conference papers on the subject.

After 35 years of evolution, SAS contains a lot of undocumented features. Visit and search for "undocumented" and you'll see what I mean.

Actually, at SAS we don't like to call them features; we call them "behaviors". And the reason that we don't document them is not because we are ashamed or that we want to make customers hunt for them. It's just that we know that the behavior is likely to change (based on our knowledge of the technology or the roadmap for the product), so we don't want you to get too accustomed to the behavior.

Here's an example. Suppose you know a customer who wants to limit the number of output data sets that he sees in the SAS Enterprise Guide process flow when he runs a SAS program. Did you know that if you name your output data set with a prefix of "_TO" that SAS will "hide" the data from the list of data members in a library? It's completely stealth, and won't show up in your process flow, in PROC DATASETS output, SAS EXPLORER window in Display Manager, or anywhere. (This is on SAS for Windows only -- not UNIX.) But you can still reference it in code and use it for other DATA steps and procedures. For example:

libname stealth "c:\output";
data stealth._toInvisible; /* won't show up in your windows */
  set sashelp.class;
proc means data=stealth._toInvisible; /* valid - will work fine */
This is an undocumented behavior (although Wei Cheng describes it in the paper I referenced). I didn't even know about it until one of our stellar tech support consultants mentioned it as a possible workaround for the SAS Enterprise Guide customer who wanted to declutter his process flow view.

This revelation motivated me to learn more about the behavior. Here's what I learned:

  • Prior to SAS 9.1, some applications within SAS for Windows could generate temporary output files, prefixed with the "_TO" name.
  • There was logic added to SAS to "hide" these artifact files, so that they didn't clutter up the view of files within the SAS EXPLORER or LIBRARIES windows.
  • In SAS 9.1, the code in SAS was changed to no longer generate these "_TO"-prefixed files. (I don't know why exactly -- but I do know that SAS Version 9 architectural changes helped tremendously to streamline processes such as this, and SAS 9.2 makes it even better).
  • However, the behavior to "mask" out the "_TO"-prefixed data members remains in SAS 9.1 and SAS 9.2.
  • I've been testing in SAS 9.3 (under development now), I see that this behavior has been "fixed," and these files are no longer hidden.
The original reason for the behavior is long gone. And since SAS never documented this as an officially supported feature, the behavior has been "cleaned up" and no one will be the wiser. No one, that is, except for those of you who dive into the undocumented.
3月 022011
The ODS Graphics Designer allows you to design and build your own statistical graphs in SAS, without having to learn how to program in the new graph template language (GTL). The ODS Graphics Designer is a rich user interface that allows you to design these graphs based upon your own data, and then save those designs as graph definitions that you can "replay" in a SAS program later. For more information about the ODS Graphics Designer and the incredible graphs that you can create by using it, see this SAS Global Forum paper by Sanjay Matange.

In SAS 9.2, ODS Graphics Designer is installed only with SAS/GRAPH in your SAS environment, so it can be run only on a machine that has SAS installed. But while the designer application can be run only with a local SAS installation, the graph definitions that you create can be replayed in any SAS session, local or remote. You can even render these graphs within SAS stored processes.

We often receive requests to make it easier to access the power of the ODS Graphics Designer from within SAS Enterprise Guide. To support this request, we’ve created a series of custom tasks that allow you to:

  1. Launch the ODS Graphics Designer from within SAS Enterprise Guide, if you have SAS and SAS/GRAPH installed on the same machine with SAS Enterprise Guide.
  2. Replay the graph definitions (SGD files) that you can create with the ODS Graphics Designer. This requires only the SGD files you created and appropriate data to render the graph, plus access to a SAS session, local or remote.

You can download the task from this location. The custom tasks are included within a single DLL inside a ZIP file. The ZIP archive also contains documentation for how to install and use the tasks. (Note: you do not need administrative privileges on your PC to install a custom task! You can drop it into your user profile area on Windows, and SAS Enterprise Guide automatically detects it. See the PDF documentation with the download for details.)

The features offered by these custom tasks are likely to appear in future versions of SAS Enterprise Guide. If you try it and like it, provide feedback in the comments. If you think of ways to improve the tasks, tell us that as well.

2月 172011
Put your data on a dietI've known several people who were raised during the Great Depression, and I've observed that they are very mindful of waste. My wife's grandmother used to save plastic bags, twist ties, and relatively clean aluminum foil for potential reuse in the household -- because such materials were once scarce. The youth of today, to their credit, are also mindful of waste, but the concern is for the environment and recycling effort, not necessarily material shortages.

In a similar way, those of us who have been in the computer industry for a while can remember when it was critical to scrimp and save for every byte in memory and sector on disk, because storage was scarce in both mediums. But unlike the youth in the real world, the new recruits in the computer industry do not have the same frugality when it comes to use of system resources. Machines have fast-growing capacities for disk space and memory, and not everyone sees the incentive to optimize their use of these resources.

But SAS programmers do. I know this because this SAS note about "shrinking character variables to minimum length required" is popular and highly rated.

I decided to take the popular sample program and extend it into a custom task for SAS Enterprise Guide. The SAS program is a macro that examines each character variable in the data set, measures the length to the longest value within the data, and then adjusts the data set to "shrink" the length of each character variable to just the size that is needed to fit the data. For data sets with lots of observations and grossly overallocated storage, it can result in a significant reduction in the file size.

I also added an option to compress the data set using the COMPRESS= option, which can reduce the data set file size even further. (Because there is overhead associated with compression, it might not always make the data smaller; in fact, it could make it larger.)

Here is an example report that the task will generate as a result, so you can get an idea of the benefit.
Sample report

The task can work with SAS Enterprise Guide 4.2 or 4.3. To download the task, click here to save a ZIP file with the task and a README file. It's simple to deploy and use, even if you're not an administrator on your PC. See the README instructions for details.

If you find the task to be useful, let me know here in the comments section. And if you can suggest changes/improvements, let me know that too.

1月 102011
AUTOEXEC.SAS wasn't enough for you. Yes, it's a sure-fire way to run SAS statements (such as LIBNAME assignments or macro definitions) whenever you start your SAS session, but you found it has limitations when used in configurations with lots of users who connect with SAS Enterprise Guide. Limitations such as:
  • The statements in the Autoexec file will be run for everyone who connects to the server, whether they need them or not. It doesn't allow for very fine control.
  • Only an administrator can change the content of the Autoexec file. If a mere-mortal end user wants to inject his own autoexec processing, he has to petition the administrator to add it on his behalf. The admin then needs to consider the impact on the other users of the system.
You were clever though: you discovered an undocumented hook in SAS Enterprise Guide 3.0 and 4.1, where you could inject your own statements and have them submitted when you connected to a SAS session. By adding content to a not-so-secret file named, you could "fool" the system into doing your bidding.

Impressed with your ingenuity, the product team removed in SAS Enterprise Guide 4.2. In its place they established a formal option that does the same thing, complete with support and documentation. Within Tools -> Options -> SAS Programs you can select Submit SAS code when server is connected, then click Edit, then enter the SAS code you want to execute on the SAS server when you connect. It's all described for you in this tip about "user-specific autoexecs", and it's all you would ever need. Right?

Wrong! It wasn't enough to provide user-specific autoexec processing. You then asked for project-specific autoexecs, so that you could specify different actions that would happen automatically when you open different SAS Enterprise Guide projects.

SAS Enterprise Guide 4.3 adds this support, and it's easy to use. Simply add a new process flow to your project (File->New->Process Flow) and name it "Autoexec". You can put anything that you want in this process flow: a program with library assignments or macro definitions, one or more "Assign Library" tasks, Upload or Download data set tasks, even custom tasks. The Autoexec process flow helps you to get everything ready for you to work with your project.

an example project with autoexec stuff

When you open a project that contains an Autoexec process flow, SAS Enterprise Guide can run that flow automatically. By default, you'll see a prompt asking if that's what you want to do.

an example project with autoexec stuff

If you don't want to see the prompt, but just want that autoexec to run, well, automatically, you can change that behavior in Tools->Options.

There. That should keep you happy for a while. Right?

12月 242010
A customer phoned up SAS Technical Support the other day with a problem. (Yes, that happens occasionally...) When he ran a certain program through SAS Enterprise Guide, the program didn't seem to complete and no results were returned. Here is a snippet of the program (modified a bit to make it less proprietary) :
  1. proc sql noprint;
  2.   connect to teradata as tera2
  3.   (user=&user pw=&password server=&server
  4.   mode=teradata FASTEXPORT=YES);
  5.   create table buyers as
  6.     select * from connection to tera2
  7.      (select id,
  8.         trxn_dt,
  9.         sum (ln_itm_qty) as Items,
  10.         sum (ext_prc_amt as Sales
  11.         from store.linetrxn a
  12.           join cust.indv_xref b
  13.           on a.rid = b.rid
  14. /* remainder removed for brevity */
Can you spot the problem? Other than the use of a user ID and password, which can be avoided, what else is wrong with this program? Why does it fail to finish?

The SAS code formatter in SAS Enterprise Guide 4.3 can tell you. Click Ctrl+I, and the code formatter scans your program and attempts to indent it for proper readability. But for this program, it stops short of that with this message:

Formatting Error
Please check for an unmatched parenthesis on line 7.

Now you know: there is an unmatched parenthesis! And every SAS programmer knows that a missing parenthesis, unmatched quote, or missing semicolon can cause problems for the SAS language parser. (Rick Wicklin calls this a "parse-time error" in his post about finding and fixing programming errors.)

But where is the unmatched paren in this program? The message says "line 7", but line 7 looks okay. There is an open paren for the inner SELECT clause. But starting there, there must be a missing matching parenthesis within the clause. To find it, use the arrow keys in the program editor to navigate the program and visit every parenthesis character in the region. The program editor in SAS Enterprise Guide 4.3 highlights the matching parentheses pairs as you navigate:

See the matches

You can also use the Ctrl+[ (open bracket) key to move the cursor from one parenthesis to its match, if there is one. If no match can be found, the program editor emits a friendly beep to tell you, "sorry, can't find a match".

In this example, you don't have to look further than line 10, where the programmer has left off a closing parenthesis. The line should read: sum (ext_prc_amt) as Sales.

People are really starting to appeciate the new SAS program editor (including the formatter). Earlier this week I received an unsolicited phone call from a SAS programmer (who happens to work at SAS). She was calling just to say that this feature is "AWESOME" and that it has saved her countless hours as she maintains legacy SAS programs. Angela Hall also highlights it in her blog about real-world BI experiences.

I love to hear these stories. Have you started using SAS Enterprise Guide 4.3 yet? What do you like about it? And what could be improved? Let us know, either here in the comments or via SAS Technical Support.

12月 182010
You might be too young to remember Clara Peller. She was the star of a series of fast-food burger commercials in the 1980s, in which she demanded meatier meals by shouting "Where's the beef?" at the pickup counter or drive-through window. Alas, the competitor restaurant meals were afflicted with "Fluffy bun", meaning that it was difficult to find the all-beef patty because it was dwarfed by the bread in which it was served.

SAS Enterprise Guide can also serve up some meaty content in the SAS log when you run a SAS program. But in order get that content to you from your SAS session, SAS Enterprise Guide wraps your program in what we call the "ODS sandwich" -- SAS statements that open and close one or more ODS destinations around your program, so that your results can be packaged from SAS and delivered into your project view.

Sometimes, the ODS sandwich can obscure the content that you're really interested in. Consider this log:

1   ;*';*";*/;quit;run;
NOTE: Procedures may not support all options or statements for all devices. For details, see the documentation for each procedure.
12    ODS tagsets.sasreport12(ID=EGSR) FILE=EGSR STYLE=Analysis
12  ! STYLESHEET=(URL="file:///C:/Projects/f2ec43/winclient/Build/Debug/Styles/Analysis.css") NOGTITLE NOGFOOTNOTE
12  ! GPATH=&sasworklocation ENCODING=UTF8 options(rolap="on");
15   proc means;
16     class origin;
17   run;

NOTE: There were 428 observations read from the data set SASHELP.CARS.
NOTE: PROCEDURE MEANS used (Total process time):
      real time           0.16 seconds
      cpu time            0.04 seconds
25   ;*';*";*/;quit;run;
29   QUIT; RUN;

There are only a few lines of interest here, but lots of fluff. It's necessary fluff, mind you. Just like you wouldn't want the fast-food restaurant to hand over your burger (and all the fixins) without a bun, you need to use ODS to deliver your output. But it's not what you came for.

In SAS Enterprise Guide 4.2 and 4.3, you can change an option to hide the fluff in your log. Select Tools->Options and go to the "Results General" page. See the checkbox option that says "Show generated wrapper code in SAS log"? If you clear that checkbox, the ODS sandwich will be hidden from you when you run your programs. (If you're curious how this works, this is the technique that we use.)

Here's the previous example with this option cleared, sans the fluffy ODS bun:

1    ;*';*";*/;quit;run;
3    %_eg_hidenotesandsource;
19   proc means;
20     class origin;
21   run;

NOTE: There were 428 observations read from the data set SASHELP.CARS.
NOTE: PROCEDURE MEANS used (Total process time):
      real time           0.04 seconds
      cpu time            0.06 seconds
23   %_eg_hidenotesandsource;
37   QUIT; RUN;

Look carefully at the log. You'll notice that the line numbers are not consecutive -- it skips a few! That's because the ODS statements are still submitted and processed, but the log output for those statements is suppressed. The result is a cleaner log, with more meat than fluff.

We know that many programmers work in regulated industries, and aren't permitted to suppress portions of the SAS log like this. That's why the option is set to "show all" by default. But if you're in search of "just the beef", give this handy option a try.

11月 242010
Update 25Nov2010: I've updated this example to correct the code so that it works correctly for positive UTC offsets. Thanks to Bruno Müller, my colleague at SAS, for finding my mistakes.

One of my SAS colleagues was recently working on a project where she had to create reports that include the UTC offset encoded into the report. The UTC offset is that part of your computer time zone notation that indicates how far off the local time is from "Coordinated Universal Time" (which, in practice, is the same as Greenwich Mean Time). For example, I live on the east coast of USA, which puts me in UTC -05:00. My colleagues in Sydney, Australia are in UTC +11:00 at the moment (since they are enjoying their daylight savings time).

Prior to SAS 9.2, gathering this information required that you use some type of shell command to get the UTC offset value from your host machine (using systeminfo on Windows or date -u on Unix). But with SAS 9.2, you can use the new E8601LZ format to help calculate the correct value in a platform-independent way. (This SAS format is part of a group of formats and informats that support ISO 8601 date and time notations.)

There is also an undocumented function, gmtoff(), which returns the raw number for the offset in seconds. Even though the function is not documented, it does appear in some examples on I don't feel that I'm revealing too many secrets by sharing it with you.

Here is a SAS program example that calculates the UTC offset for the machine that hosts your SAS session:

data _null_;
     utc_offset_display $ 7
     utc_offset 8;

  /* E8601LZ formats as a UTC value and converts to a local time */
  utc_offset_display=substr( put('00:00:00't,E8601LZ.), 9);
  utc_offset=gmtoff()/3600; /* undocumented */
  call symput('utc_offset',utc_offset);
  call symput('utc_offset_display',utc_offset_display);

%put &utc_offset_display;
%put &utc_offset;

When I run this on my machines in Cary, NC, the output log looks like this:

27         %put &utc_offset_display;
28         %put &utc_offset;
Special thanks to one of my colleagues in SAS Japan, Shin Kayano, who is one of our software globalization experts. He's the one who pointed me to this method of using the E8601LZ format.

Update: you can achieve the same result in SAS 9.1.3, but the format name was IS8601LZ instead of E8601LZ. In SAS 9.1.3, these formats were added mainly in support of the XML LIBNAME engine.

11月 232010
Many people mistakenly assume that just because you want to use a SAS program to access a protected resource (such as a database table), you must include the credentials for the resource inside your program.

Few things cause a database administrator to lose more sleep than coming across this within a SAS program:

 libname ora10 oracle
    path=ora10g2 schema=PAYROLL
It doesn't have to be this way! SAS 9.2 offers so many security-related features that you should never have to code user IDs and passwords in your SAS programs again. This blog post summarizes my five favorite approaches.

1. Use the AUTHDOMAIN= option to access SAS libraries
The AUTHDOMAIN option allows you to delegate the authentication by allowing SAS to "look up" credentials as needed using the SAS metadata environment. For each domain that has a unique set of credentials, your SAS administrator can create an "auth domain" in metadata and associate it with a database server (or other resource). Every SAS/ACCESS database engine supports AUTHDOMAIN=, in LIBNAME statements as well as in PROC SQL CONNECT statements.

With AUTHDOMAIN, the above LIBNAME example becomes:

 libname ora10 oracle
    path=ora10g2 schema=PAYROLL
The beauty of this solution is that SAS can resolve the database credentials differently for each user or group who runs this program, using the credentials that are defined in metadata for those identities.

Bonus: AUTHDOMAIN works for other resources too, such as FTP connections.

2. Use the META engine or INFOMAPS engine to access data sources
The META engine provides a layer of indirection in front of a SAS library, so that not only do you not need credentials to access it, you don't even need to know the details of how the data tables are stored. A LIBNAME statement can be as simple as:

libname MYLIB Library="My SAS Library";
where "My SAS Library" is an administered library that a SAS administrator defined in SAS Management Console. MYLIB might resolve to a folder with SAS data sets, or it might be a set of tables in a Teradata database. The implementation details are hidden from the program and the programmer.

Sometimes you might see an example that looks like this:

libname MYLIB Library="My SAS Library"
   metaserv="" /* don't want this */
   metaport=8561   /* don't want this */
   user="userid"      /* don't want this */
   pw="mySecret";  /* don't want this */
If you run your programs in SAS Enterprise Guide or within a SAS stored process, you don't need (and don't want) the connection/credential information! The SAS session you're connected to already knows who you are, and the extra connection information isn't necessary. (If you make use of the META engine in client applications, you'll want to read up on the nuances described in this tech support paper -- based on 9.1.3 but still mostly relevant for SAS 9.2.)

The INFOMAPS engine provides an administered view of data, further abstracted from the physical structure of the data tables. Check out my previous post for a detailed example of programming with Information Maps.

3. Use SAS Token Authentication
SAS Token Authentication uses your established SAS metadata connection to generate and validate single-use tokens for every other SAS-related resource that you might need access to. To put this another way, once the SAS metadata server knows who you are, it "vouches for you" and facilitates connections to anything else you might need, including SAS workspace servers, OLAP servers, database servers (using AUTHDOMAIN) and more.

A big advantage of SAS Token Authentication is that you don't actually need a host account for all of the resources that you might connect to. This cuts down on the sys admin tasks required to get a group of users up and running. It's a best-practice alternative to using group host accounts on a SAS workspace server; configure the SAS workspace server to use SAS token authentication instead.

4. Use Integrated Windows Authentication
The best way to hide passwords? Don't have them in the first place. That's what many customers do when they implement fingerprint readers or retina scanners in their corporate workstations. The users of these workstations wouldn't know how to supply a password if you asked for one.

Even if your users need a password to log in to your workstation, you can use Integrated Windows Authentication to prevent them from seeing another password challenge as they connect to their SAS environment.

SAS Token Authentication and Integrated Windows Authentication are examples of authentication mechanisms that SAS 9.2 supports. Using these authentication mechanisms can reduce the "authentication friction" that results when your SAS applications must hop among different protected resources that would traditionally require a user ID and password to access.

5. If you must, use PROC PWENCODE to obscure passwords
Sometimes, despite your best efforts, you cannot avoid the odd password in your programs. For example, if you've got to access password-protected SAS data sets, you need to specify the password. But there is no need to have the clear-text password appear in your code. You can use the PWENCODE procedure to encode the password so that prying eyes cannot guess at it. For example:

proc pwencode in="ItzaSecret"
yields this in the log:

18         proc pwencode in=XXXXXXXXXXXX
19           method=sas002;
20         run;


NOTE: PROCEDURE PWENCODE used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
You can then use the encoded password in your program:

proc means
WARNING: Using an encoded password does not prevent someone else from running your program, encoded-password intact, and accessing the same protected resource. So you still need to protect the content of your program. However, at least someone glancing over your shoulder won't be able to guess your password, and most likely won't be able to memorize the encoded gobbledy-gook that appears in the password field.
11月 222010
It was over two years ago that I shared the tip of using Process Explorer to look under the covers at which SAS processes were running on your Windows environment.

Process Explorer recently received a refresh, and it's worth downloading the new version. (Yes, it's free.)

If you want to know how influential this tool has been, take a look at the built-in Task Manager in Windows 7 or Windows 2008. The Task Manager now has many of the features that previously only an advanced tool like Process Explorer could offer. But in the meantime, Process Explorer (also funded by Microsoft) has advanced even further, so don't deny yourself the super-duper advanced view into your running processes.

11月 162010
Are you afraid of big commitments? Do you like to shop around thoroughly before making a purchase decision? I can't help you with most of Life's Big Choices, but I can help you in one way: I can show you how to learn more about your data set before you open it.

In SAS Enterprise Guide 4.2, we added a feature that allows you to view the details of your data sets (such as the data set labels, how many records they contain, and more) while you browse the contents of a SAS library. What's more, you can also see the column names and column types right from within the file dialog, providing you with that peace of mind that comes only with knowing that, yes, you've selected the correct data set.

But, the feature is not turned on by default. Why not? Because gathering this information incurs a slight performance hit as you browse your libraries. For local file-based SAS data sets you probably won't notice much of a slowdown.

Here's how you turn it on:

  1. Select Tools->Options. The Options window appears.
  2. On the General page, check the box that reads, "Show full details when viewing SAS libraries in File Dialog details view."

The next time that you select File->Open->Data and navigate to the Libraries view, select the Details option from the toolbar in the file dialog, as shown here:

Toolbar for selecting details


When you drill down into a SAS library, you'll be overwhelmed with a tremendous set of new details, as shown here:

What the list looks like with details


But wait -- there's more! Go back up to the toolbar and tick the Show detail pane option, and on the right-hand side you'll see the list of SAS variables that lurk inside a SAS data set, as you select it within the list.

And with the column list in view


Try it! Turn on the option and give it a whirl. It will help you to eliminate "buyer's remorse" (or at least "data-opener's remorse") from your daily life.