sas programming

January 16, 2017
 

For SAS programmers, the PUT statement in the DATA step and the %PUT macro statement are useful statements that enable you to display the values of variables and macro variables, respectively. By default, the output appears in the SAS log. This article shares a few tips that help you to use these statements more effectively.

Tip 1: Display the name and value of a variable

The PUT statement supports a "named output" syntax that enables you to easily display a variable name and value. The trick is to put an equal sign immediately after the name of a variable: PUT varname=; For example, the following statement displays the text "z=" followed by the value of z:

data _null_;
x = 9.1; y = 6; z = sqrt(x**2 + y**2);
put z=;           /* display variable and value */
run;
z=10.9

Tip 2: Display values of arrays

You can extend the previous tip to arrays and to sets of variables. The PUT statement enables you to display elements of an array (or multiple variables) by specifying the array name in parentheses, followed by an equal sign in parentheses, as follows:

data _null_;
array x[5];
do k = 1 to dim(x);
   x[k] = k**2;
end;
put (x[*]) (=);     /* put each element of the array */
put (x1 x3 x5) (=); /* put each listed variable and value */
run;
x1=1 x2=4 x3=9 x4=16 x5=25
x1=1 x3=9 x5=25

This syntax is not supported for _TEMPORARY_ arrays. However, as a workaround, you can use the CATQ function to concatenate array values into a character variable, as follows:

temp = catq('d', ',', of x[*]);         /* x can be _TEMPORARY_ array */
put temp=;

Incidentally, if you ever want to apply a format to the values, the format name goes inside the second set of parentheses, after the equal sign: put (x1 x3 x5) (=6.2);
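
The CATQ workaround above is shown as a fragment. Here is a minimal sketch of how it might look in a complete DATA step. The array name t, its initial values, and the length of temp are assumptions chosen for illustration; the last statement repeats the format syntax on an ordinary (non-temporary) array.

```sas
data _null_;
array t[5] _temporary_ (1 4 9 16 25);   /* hypothetical temporary array */
array x[3] (1.5 2.25 3);                /* ordinary array, for comparison */
length temp $ 200;
temp = catq('d', ',', of t[*]);         /* 'd' means the next argument is the delimiter */
put temp=;                              /* named output of the concatenated values */
put (x[*]) (=6.2);                      /* named output with a format */
run;
```

Because temp is an ordinary character variable, the named-output syntax from Tip 1 applies to it even though the array itself has no variable names.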

Tip 3: Display values on separate lines

The previous tip displayed all values on a single line. Sometimes it is useful to display each value on its own line. To do that, put a slash after the equal sign, as follows:

...
put (x[*]) (=/);                   /* put each element on separate lines */
...
x1=1
x2=4
x3=9
x4=16
x5=25

Tip 4: Display all name-value pairs

You can display all values of all variables by using the _ALL_ keyword, as follows:

data _null_;
x = 9.1; y = 6; z = sqrt(x**2 + y**2);
A = "SAS"; B = "Statistics";
put _ALL_;              /* display all variables and values */
run;
x=9.1 y=6 z=10.9 A=SAS B=Statistics _ERROR_=0 _N_=1

Notice that in addition to the user-defined variables, the _ALL_ keyword also prints the values of two automatic variables named _ERROR_ and _N_.

Tip 5: Display the name and value of a macro variable

Just as the PUT statement displays the value of an ordinary variable, you can use the %PUT statement to display the value of a macro variable. If you use the special "&=" syntax, SAS will display the name and value of a macro variable. For example, to display your SAS version, you can display the value of the SYSVLONG automatic system macro variable, as follows:

%put &=SYSVLONG;
SYSVLONG=9.04.01M4P110916

The results above are for my system, which is running SAS 9.4M4. Your SAS version might be different.

Tip 6: Display all name-value pairs for macros

You can display the name and value of all user-defined macro variables by using the _USER_ keyword. You can display the values of all SAS automatic system macro variables by using the _AUTOMATIC_ keyword.

%let N = 50;
%let NumSamples = 1e4;
%put _USER_;
GLOBAL N 50
GLOBAL NUMSAMPLES 1e4
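
As a small sketch that combines Tips 5 and 6, the following statements reuse the two macro variables defined above. Listing several &= items in one %PUT statement is a convenience, not something the examples above require.

```sas
%let N = 50;
%let NumSamples = 1e4;
%put &=N &=NumSamples;   /* several name-value pairs in one statement */
%put _AUTOMATIC_;        /* all automatic system macro variables */
```

The _AUTOMATIC_ list is long, so it is usually easier to display a single automatic macro variable with the &= syntax when you know its name.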

Conclusion and References

There you have it: six tips to make it easier to display the values of SAS variables and macro variables. Thanks to Jiangtang Hu, who pointed out the %PUT &=var syntax in his blog in 2012. For additional features of the PUT and %PUT statements, see the SAS documentation for those statements.

tags: SAS Programming, Tips and Techniques

The post PUT it there! Six tips for using PUT and %PUT statements in SAS appeared first on The DO Loop.

January 3, 2017
 

How many of you have been given a SAS data set with variables such as Age, Height, and Weight and some or all of them were stored as character values instead of numeric?  Probably EVERYONE! Yes, we all know how to do the old "swap and drop" (rename and convert), but […]

The post Character to Numeric Conversion in SAS appeared first on SAS Learning Post.

December 29, 2016
 

This SAS Jedi is very excited about the SAS 9.4 M4 release, which brought many wonderful gifts just in time for Christmas. So in the interest of extending the Christmas spirit, I'm going to blog about some of my favorites! I've long loved the SAS DO statement variant which allows […]

The post SAS Jedi Christmas - SAS 9.4 M4 DS2 Do Loop Upgrade appeared first on SAS Learning Post.

December 28, 2016
 

SAS temporary arrays are an underutilized jewel in the SAS toolbox. I find that many beginning to intermediate SAS programmers are not familiar with temporary arrays. The good news is that there is nothing complicated about them and they are very useful. First of all, what is a temporary array? […]

The post SAS Temporary Arrays, Not Just for Experts appeared first on SAS Learning Post.

December 22, 2016
 

TL;DR

Free training from SAS: "SAS Programming for R Users." The schedule of Live Web offerings is here. If you prefer self-study, the complete course materials are on the SAS Software GitHub space and you can practice with the free SAS University Edition software.


The details: how R programmers can learn SAS for free

As much as I would love for SAS customers to use SAS to the exclusion of everything else, that rarely happens. Every time I visit a SAS customer, I hear about the other non-SAS tools that they use alongside SAS and their integration points. The most popular of these include desktop tools such as Microsoft Excel, or enterprise databases from other vendors. But increasingly, I hear from users who dabble in open source tools such as Python and R, or who work with other teams that use those tools.

Programmers tend to favor the programming languages that they know. When you learn a new programming language, your experience is colored by inevitable comparisons with the languages you've already mastered. If you work with R coders who want to learn SAS, you should consider that they probably won't learn SAS the same way that you did.

A SAS programming course for experienced programmers

The traditional way to learn SAS begins with the DATA step, where you learn how to read files, how to write files, about the program data vector, and basically how the DATA step "thinks". Then you move on to the various procedures for descriptive stats, reporting, and maybe even some graphing. While this approach can make you productive with simple tasks quickly, to an R coder this might feel too much like "starting over." That's why R programmers (or even MATLAB or Stata users) need an approach that leverages what they already know to hit the ground running.

That's the thinking behind the new SAS Programming for R Users course. This course does not start with the basics about statistics or the importance of data prep -- the assumption is that you already know that. Instead, you'll get hands-on experience with SAS/IML -- a statistical matrix language that will certainly feel familiar to R users. You'll eventually get to the DATA step and other procedures, of course -- and these will open new worlds for you -- but you'll learn to be productive quickly using the skills you already have. (You can read more about the genesis of the course from its creator and main instructor, Jordan Bakerman.)

The course centers around classic and real statistical problems, from Bayesian logistic regression to the Monty Hall problem. If you don't know your statistics, you might feel that you're swimming in waters over your head. But if you're comfortable with the concepts, you should feel right at home. (If you're just beginning with statistics, SAS offers this different free e-learning course.)

Figure: The classic game show proof

"SAS Programming for R Users" also shows you how to use SAS and R together, submitting R code from within your SAS program. That's made possible by a special connection between SAS/IML and R -- something that SAS has supported for years.
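
For readers who have not seen the SAS/IML connection to R, here is a minimal sketch of what submitting R code from a SAS program can look like. It assumes that SAS/IML is licensed, that R is installed, and that SAS was started with the RLANG system option; the data frame name df is arbitrary. It is an illustration, not an excerpt from the course materials.

```sas
proc iml;
/* copy a SAS data set into an R data frame (the name "df" is an assumption) */
call ExportDataSetToR("Sashelp.Class", "df");
submit / R;
  print(summary(df$Height))
endsubmit;
quit;
```

The statements between SUBMIT / R and ENDSUBMIT are passed to R, and the R output is returned to the SAS session.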

This is a free instructor-led course that's offered in Live Web format. "Live Web" means that you connect from your desk at home or work, tune into the lecture and demos, and then practice your skills on a hosted classroom environment. And this course is free -- costing you only your time (5 half-day sessions). Check out the SAS Training site to see when the next offering might meet your schedule.

Find the course materials on GitHub, right now

What if you can't find a Live Web offering that meets your schedule? In the spirit of openness, the SAS Training team has published the complete course materials on GitHub. You'll find the course notes (over 600 pages), data sets, and over 80 SAS programs to support the course exercises. You can use the free SAS University Edition to try the course exercises yourself and practice with the software. (The only part that you can't practice is the "submit to R" lessons, because the SAS University Edition doesn't support the connection to R.)

tags: open source, SAS programming, sas training

The post Learning SAS programming for R users appeared first on The SAS Dummy.

December 8, 2016
 

As technology expands, we have a similarly increasing need to create programs that can be handed off – to clients, to regulatory agencies, to parent companies, or to other projects – and handed off with little or no modification needed by the recipient. Minimizing modification by the recipient often requires […]

The post Using the SAS Macro Language to Create Portable Programs appeared first on SAS Learning Post.

December 3, 2016
 

JSON is the new XML. The number of SAS users who need to access JSON data has skyrocketed, thanks mainly to the proliferation of REST-based APIs and web services. Because JSON is structured data in text format, we've been able to offer simple parsing techniques that use DATA step and most recently PROC DS2. But finally*, with SAS 9.4 Maintenance 4, we have a built-in LIBNAME engine for JSON.

Simple JSON example: Who is in space right now?

Speaking of skyrocketing, I discovered a cool web service that reports who is in space right now (at least on the International Space Station). It's actually a perfect example of a REST API, because it does just that one thing and it's easily integrated into any process, including SAS. It returns a simple stream of data that can be easily mapped into a tabular structure. Here's my example code and results, which I produced with SAS 9.4 Maintenance 4.

filename resp temp;
 
/* Neat service from Open Notify project */
proc http 
 url="http://api.open-notify.org/astros.json"
 method= "GET"
 out=resp;
run;
 
/* Assign a JSON library to the HTTP response */
libname space JSON fileref=resp;
 
/* Print result, dropping automatic ordinal metadata */
title "Who is in space right now? (as of &sysdate)";
proc print data=space.people (drop=ordinal:);
run;

JSON who is in space
But what if your JSON data isn't so simple? JSON can represent information in nested structures that can be many layers deep. These cases require some additional mapping to transform the JSON representation to a rectangular data table that we can use for reporting and analytics.

JSON map example: Most recent topics from SAS Support Communities

In a previous post I shared a PROC DS2 program that uses the DS2 JSON package to call and parse our SAS Support Communities API. The parsing process is robust, but it requires quite a bit of foreknowledge about the structure and fields within the JSON payload. It also requires many lines of code to extract each field that I want.

Here's a revised pass that uses the JSON engine:

/* split URL for readability */
%let url1=http://communities.sas.com/kntur85557/restapi/vc/categories/id/bi/topics/recent;
%let url2=?restapi.response_format=json%str(&)restapi.response_style=-types,-null,view;
%let url3=%str(&)page_size=100;
%let fullurl=&url1.&url2.&url3;
 
filename topics temp;
 
proc http
 url= "&fullurl."
 method="GET"
 out=topics;
run;
 
/* Let the JSON engine do its thing */
libname posts JSON fileref=topics;
title "Automap of JSON data";
 
/* examine resulting tables/structure */
proc datasets lib=posts; quit;
proc print data=posts.alldata(obs=20); run;

Thanks to the many layers of data in the JSON response, here are the tables that SAS creates automatically.

json Auto tables
There are 12 tables that contain various components of the message data that I want, plus the ALLDATA member that contains everything in one linear table. ALLDATA is good for examining structure, but not for analysis. You can see that it's basically name-value pairs with no data types/formats assigned.

json ALLDATA
I could use DATA steps or PROC SQL to merge the various tables into a single denormalized table for my reporting purposes, but there is a better way: define and apply a JSON map for the libname engine to use.
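
For comparison, here is a rough sketch of what the PROC SQL approach could look like. The table name posts.message_post_time and the assumption that it shares the ordinal_message key with posts.messages_message are hypothetical; check the PROC DATASETS listing for the actual member names and ordinal columns before adapting it.

```sas
proc sql;
  create table work.flat as
  select m.href, m.view_href, t.*
  from posts.messages_message as m
       left join posts.message_post_time as t   /* hypothetical child table */
       on m.ordinal_message = t.ordinal_message; /* assumed shared ordinal key */
quit;
```

Each additional nested object needs another join of this kind, which is why the hand-written approach grows quickly as the JSON gets deeper.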

To get started, I need to rerun my JSON libname assignment with the AUTOMAP option. This creates an external file with the JSON-formatted mapping that SAS generates automatically. In my example here, the file lands in the WORK directory with the name "top.map".

filename jmap "%sysfunc(GETOPTION(WORK))/top.map";
 
proc http
 url= "&fullurl."
 method="GET"
 out=topics;
run;
 
libname posts JSON fileref=topics map=jmap automap=create;

This generated map is quite long -- over 400 lines of JSON metadata. Here's a snippet of the file that describes a few fields in just one of the generated tables.

"DSNAME": "messages_message",
"TABLEPATH": "/root/response/messages/message",
"VARIABLES": [
{
  "NAME": "ordinal_messages",
  "TYPE": "ORDINAL",
  "PATH": "/root/response/messages"
},
{
  "NAME": "ordinal_message",
  "TYPE": "ORDINAL",
  "PATH": "/root/response/messages/message"
},
{
  "NAME": "href",
  "TYPE": "CHARACTER",
  "PATH": "/root/response/messages/message/href",
  "CURRENT_LENGTH": 19
},
{
  "NAME": "view_href",
  "TYPE": "CHARACTER",
  "PATH": "/root/response/messages/message/view_href",
  "CURRENT_LENGTH": 134
},
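
To look at the whole generated map rather than a snippet, one option is to echo the file to the SAS log. This is a minimal sketch that assumes the jmap fileref from the previous step is still assigned.

```sas
data _null_;
  infile jmap;      /* fileref that points to the generated map file */
  input;            /* read each line into the input buffer */
  put _infile_;     /* write the line to the log */
run;
```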

By using this map as a starting point, I can create a new map file -- one that is simpler, much smaller, and defines just the fields that I want. I can reference each field by its "path" in the JSON nested structure, and I can also specify the types and formats that I want in the final data.

In my new map, I eliminated many of the tables and fields and ended up with a file that was just about 60 lines long. I also applied sensible variable names, and I even specified SAS formats and informats to transform some columns during the import process. For example, instead of reading the message "datetime" field as a character string, I coerced the value into a numeric variable with a DATETIME format:

{
  "NAME": "datetime",
  "TYPE": "NUMERIC",
  "INFORMAT": [ "IS8601DT", 19, 0 ],
  "FORMAT": ["DATETIME", 20],
  "PATH": "/root/response/messages/message/post_time/_",
  "CURRENT_LENGTH": 8
},

I called my new map file 'minmap.map' and then re-issued the libname without the AUTOMAP option:

filename minmap 'c:\temp\minmap.map';
 
proc http
 url= "&fullurl."
 method="GET"
 out=topics;
run;
 
libname posts json fileref=topics map=minmap;
proc datasets lib=posts; quit;
 
data messages;
 set posts.messages;
run;

Here's a snapshot of the single data set as a result.

JSON final data
I think you'll agree that this result is much more usable than what my first pass produced. And the amount of code is much smaller and easier to maintain than any previous SAS-based process for reading JSON.

Here's the complete program in a public GitHub gist, including my custom JSON map.


* By the way, […]

tags: JSON, REST API, SAS programming

The post Reading data with the SAS JSON libname engine appeared first on The SAS Dummy.

December 1, 2016
 

In my earlier post about WHERE and IF statements, I announced that the DATA step debugger has finally arrived in SAS Enterprise Guide. (I admit that I might have buried the lead in that post.) Let's use this post to talk about the new debugger and how it works.

First, let's address some important limitations. This tool is for debugging DATA step code. It can't be used to debug PROC SQL or PROC IML or SAS macro programs. Next, it can't be used to debug DATA steps that read data from CARDS or DATALINES. That's an unfortunate limitation, but it's a side effect of the way the DATA step "debug" mode works with client applications like SAS Enterprise Guide. (Workaround: load your data in a separate step, then debug your more complex DATA step logic in a subsequent step.)

Ye olde DATA step debugger

1986 called; they want their debugger back.

If you've been around SAS programs for a while then you might remember the full-screen DATA step debugger in the SAS windowing environment. Introduced as production in SAS 6.09E (E="enhanced!"), it was basic but it did the job, relying on command-line processing to direct the debugger actions. It had only two windows: one for the source, and one for the "log", meaning the debugger console log. You could set breakpoints, variable watch conditions, examine variables and calculate values -- all with commands that you typed. (Even though I'm writing this in the past tense and it seems like I'm eulogizing, this debugger still lives on in Base SAS!)
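
For reference, the classic debugger is invoked by adding the DEBUG option to the DATA statement. The sketch below assumes an interactive SAS windowing environment session; the output data set name is arbitrary.

```sas
data work.males / debug;   /* DEBUG option opens the full-screen debugger */
  set sashelp.class;
  if sex = 'M';
run;
```

Once the debugger windows open, you type commands such as step, examine _all_, and quit on the debugger command line.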

The new DATA step debugger

The new debugging environment, introduced in SAS Enterprise Guide 7.13, has all of the features of its ancestor. And it's much more usable, with toolbars and windows that allow you to control its behavior. But keyboard junkies, don't worry -- that command line is still there too!

To activate the debugger, click the new "bug" toolbar icon in the program editor window. Once activated, you can click the bug in the left "gutter" of the program editor to begin a debug session. (You can also press F5 to debug the active DATA step.)
Starting the Debugger
Examine the screenshot below. You see the source window on top and the console window at the bottom, plus a convenient "watch" window that shows much of the content in the program data vector (PDV). That's all of the variables defined in the DATA step, plus automatic variables like _N_ and _ERROR_.

EG debugger
As you step through the DATA step, the line pointer in the source window advances to show the next line that will execute. You can use keyboard shortcuts (F10), the toolbar, or a typed command ("step") to execute that line and advance. With every step, the watch window is updated with the latest values of the variables in your step. When a variable changes value, it's colored red. If you want the DATA step to break processing when a certain variable changes value, check the Watch box for that variable.

Diving deeper with advanced debugging

Here's another example of debugging a different DATA step program. This program uses a BY statement and FIRST.variable logic, and you can see the additional automatic variables (FIRST.Make and LAST.Make) that the debugger is tracking. I also used END=eof on the SET statement; that adds the eof "flag" variable into the mix during run time.

egdebug_adv
In the Debug Console window you can see that I've issued some pretty fancy commands. The DATA step debugger allows you to set breakpoints that trigger on specific conditions. For example, "b 8 when (running_price > 10000)" will break on Line 8 when the value of running_price exceeds 10,000. "b 8 after 5" will break on Line 8 after 5 passes through the DATA step. You can set and clear line-specific breakpoints by clicking in the "gutter" (that left-hand margin next to the line numbers).
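
The program in the screenshot is not reproduced here. As a stand-in, this is a sketch of a DATA step with the same ingredients (a BY statement, FIRST. and LAST. logic, END=eof, and a running_price accumulator). The choice of sashelp.cars and MSRP is an assumption, and the line numbers in the breakpoint commands above would need to match your own program.

```sas
proc sort data=sashelp.cars out=work.cars;
  by Make;
run;

data work.make_totals;
  set work.cars end=eof;
  by Make;
  if first.Make then running_price = 0;   /* reset at the start of each BY group */
  running_price + MSRP;                   /* sum statement retains the total */
  if last.Make then output;               /* one row per Make */
  if eof then put "Finished: " _N_=;
  keep Make running_price;
run;
```

With a step like this, a conditional breakpoint on the line that contains the sum statement stops only when the accumulated total crosses the threshold you give in the WHEN clause.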

The "list _all_" command reveals the details about your open data sets and files. Here's what I see during the run of my program.

list command
Other commands let you SET variable values, EXAMINE variables, CALCulate expressions, GO and JUMP to specific lines, and more. The SAS documentation contains a complete reference for DATA step debugger commands, and most of those work exactly as documented, even within SAS Enterprise Guide.

This old-but-still-relevant SAS Global Forum paper (written by a SAS user) also covers some useful debugging concepts in SAS which you can apply in this new environment.

A personal note: eating my words

I've presented "SAS Enterprise Guide for SAS programmers" as a topic in one form or another for the past 15 years. Every so often the topic of the DATA step debugger comes up, and I've said "don't look for it anytime soon." Knowing how the full-screen debugger is closely tied to the SAS windowing environment, I didn't hold out hope for a client application like SAS Enterprise Guide to get it working. Kudos to the R&D team! They creatively found a solution with the "/ldebug" option, an even more obscure debugging approach that works in SAS batch mode. I think this feature will be a tremendous productivity boost for experienced SAS programmers, and a useful learning and teaching tool for those just getting started with the DATA step.

tags: SAS Enterprise Guide, SAS programming

The post Using the DATA step debugger in SAS Enterprise Guide appeared first on The SAS Dummy.

November 30, 2016
 

Do you want to create customized SAS graphs by using PROC SGPLOT and the other ODS graphics procedures? An essential skill that you need to learn is how to merge, join, append, and concatenate SAS data sets that come from different sources. The SAS statistical graphics procedures (SG procedures) enable you to overlay all kinds of customized curves, markers, and bars. However, the SG procedures expect all the data for a graph to be in a single SAS data set. Therefore it is often necessary to append two or more data sets before you can create a complex graph.

This article discusses two ways to combine data sets in order to create ODS graphics. An alternative is to use the SG annotation facility to add extra curves or markers to the graph. Personally, I prefer to use the techniques in this article for simple features, and reserve annotation for adding highly complex and non-standard features.

Overlay curves

sgplotoverlay

In a previous article, I discussed how to structure a SAS data set so that you can overlay curves on a scatter plot.

The diagram at the right shows the main idea of that article. The X and Y variables contain the original data, which are the coordinates for a scatter plot. Secondary information was appended to the end of the data. The X1 and Y1 variables contain the coordinates of a custom scatter plot smoother. The X2 and Y2 variables contain the coordinates of a different scatter plot smoother.

This structure enables you to use the SGPLOT procedure to overlay two curves on the scatter plot. You use a SCATTER statement and two SERIES statements to create the graph. See the previous article for details.
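
As a rough sketch of that structure, the following statements build a small example. The data come from Sashelp.Class, and the two "curves" are arbitrary straight lines invented for illustration; they are not the smoothers from the previous article.

```sas
data Orig;                       /* original data: X and Y */
  set sashelp.class(keep=Height Weight rename=(Height=X Weight=Y));
run;
data Curve1;                     /* hypothetical first curve: X1 and Y1 */
  do X1 = 50 to 72 by 2;  Y1 = -140 + 3.9*X1;  output;  end;
run;
data Curve2;                     /* hypothetical second curve: X2 and Y2 */
  do X2 = 50 to 72 by 2;  Y2 = 100;  output;  end;
run;
data Combined;                   /* append; unmatched variables are missing */
  set Orig Curve1 Curve2;
run;

proc sgplot data=Combined;
  scatter x=X y=Y;
  series x=X1 y=Y1 / legendlabel="Curve 1";
  series x=X2 y=Y2 / legendlabel="Curve 2";
run;
```

Each plot statement ignores the observations for which its own variables are missing, which is why the three pieces can share one data set.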

Overlay markers: Wide form

In addition to overlaying curves, I sometimes want to add special markers to the scatter plot. This section shows how to add a marker at the location of the sample mean: use PROC MEANS to create an output data set that contains the coordinates of the sample mean, then append that data set to the original data.


The following statements use PROC MEANS to compute the sample mean for four variables in the SasHelp.Iris data set, which contains the measurements for 150 iris flowers. To emphasize the general syntax of this computation, I use macro variables, but that is not necessary:

%let DSName = Sashelp.Iris;
%let VarNames = PetalLength PetalWidth SepalLength SepalWidth;
 
proc means data=&DSName noprint;
var &VarNames;
output out=Means(drop=_TYPE_ _FREQ_) mean= / autoname;
run;

The AUTONAME option on the OUTPUT statement tells PROC MEANS to append the name of the statistic to the variable names. Thus the output data set contains variables with names like PetalLength_Mean and SepalWidth_Mean. As shown in the diagram in the previous section, this enables you to append the new data to the end of the old data in "wide form" as follows:

data Wide;
   set &DSName Means; /* add four new variables; pad with missing values */
run;
 
ods graphics / attrpriority=color subpixel;
proc sgplot data=Wide;
scatter x=SepalWidth y=PetalLength / legendlabel="Data";
ellipse x=SepalWidth y=PetalLength / type=mean;
scatter x=SepalWidth_Mean y=PetalLength_Mean / 
         legendlabel="Sample Mean" markerattrs=(symbol=X color=firebrick);
run;
Scatter plot with markers for sample means

The first SCATTER statement and the ELLIPSE statement use the original data. Recall that the ELLIPSE statement draws an approximate confidence ellipse for the mean of the population. The second SCATTER statement uses the sample means, which are appended to the end of the original data. The second SCATTER statement draws a red marker at the location of the sample mean.

You can use this same method to plot other sample statistics (such as the median) or to highlight special values such as the origin of a coordinate system.
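
For example, here is a sketch of the same wide-form approach for the sample median. The data set names Medians and WideMedian and the marker symbol are choices made for this illustration; the macro variables are redefined so that the block stands on its own.

```sas
%let DSName = Sashelp.Iris;
%let VarNames = PetalLength PetalWidth SepalLength SepalWidth;

proc means data=&DSName noprint;
  var &VarNames;
  output out=Medians(drop=_TYPE_ _FREQ_) median= / autoname;
run;

data WideMedian;
  set &DSName Medians;   /* adds variables such as SepalWidth_Median */
run;

proc sgplot data=WideMedian;
  scatter x=SepalWidth y=PetalLength / legendlabel="Data";
  scatter x=SepalWidth_Median y=PetalLength_Median /
          legendlabel="Sample Median" markerattrs=(symbol=Plus color=firebrick);
run;
```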

Overlay markers: Long form

In some situations it is more convenient to append the secondary data in "long form." In the long form, the secondary data set contains the same variable names as in the original data. You can use the SAS data step to create a variable that identifies the original and supplementary observations. This technique can be useful when you want to show multiple markers (sample mean, median, mode, ...) by using the GROUP= option on one SCATTER statement.

The following call to PROC MEANS does not use the AUTONAME option. Therefore the output data set contains variables that have the same names as the variables in the input data. You can use the IN= data set option to create an ID variable that distinguishes the original data from the computed statistics:

/* Long form. New data has same name but different group ID */
proc means data=&DSName noprint;
var &VarNames;
output out=Means(drop=_TYPE_ _FREQ_) mean=;
run;
 
data Long;
set &DSName Means(in=newdata);
if newdata then 
   GroupID = "Mean";
else GroupID = "Data";
run;

The DATA step creates the GroupID variable, which has the value "Data" for the original observations and the value "Mean" for the appended observations. This data structure is useful for PROC SGSCATTER, which supports the GROUP= option but does not support multiple PLOT statements. The call is as follows:

ods graphics / attrpriority=none;
proc sgscatter data=Long 
   datacontrastcolors=(steelblue firebrick)
   datasymbols=(Circle X);
plot (PetalLength PetalWidth)*(SepalLength SepalWidth) / group=groupID;
run;
Scatter plot matrix with markers for sample means
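
The long form also makes it straightforward to show several statistics with one SCATTER statement, as mentioned at the start of this section. The following is a sketch under the assumption that you want the mean and the median; the data set names and the length of GroupID are choices made for this illustration.

```sas
%let DSName = Sashelp.Iris;
%let VarNames = PetalLength PetalWidth SepalLength SepalWidth;

proc means data=&DSName noprint;
  var &VarNames;
  output out=Means(drop=_TYPE_ _FREQ_)   mean=;
  output out=Medians(drop=_TYPE_ _FREQ_) median=;
run;

data Long2;
  length GroupID $ 6;                 /* long enough for "Median" */
  set &DSName(in=isData) Means(in=isMean) Medians;
  if isData then GroupID = "Data";
  else if isMean then GroupID = "Mean";
  else GroupID = "Median";
run;

proc sgplot data=Long2;
  scatter x=SepalWidth y=PetalLength / group=GroupID;
run;
```

The LENGTH statement matters here: without it, GroupID would take its length from the first assignment and "Median" would be truncated.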

In conclusion, this article demonstrates a useful technique for adding markers to a graph. The technique requires that you concatenate the original data with supplementary data. Appending and merging data is a technique that is used often when creating ODS statistical graphics in SAS. It is a great technique to add to your programming toolbox.

tags: SAS Programming, Statistical Graphics, Tips and Techniques

The post Append data to add markers to SAS graphs appeared first on The DO Loop.

November 27, 2016
 

In the DATA step, the WHERE statement and the IF statement (a.k.a. the "subsetting IF") have similar functions. In many scenarios, they produce identical results. But new SAS programmers are taught early on that these two statements work very differently, and in important ways. To understand the differences, it helps to step through the program line-by-line to see how SAS "thinks." Fortunately, the new DATA step debugger in SAS Enterprise Guide 7.13 makes this really easy to do.

Difference between WHERE statement and IF statement

Here are the basics: the WHERE statement is applied when the DATA step is compiled. Incoming data (from a SET or MERGE statement) is filtered immediately to just those records that match the WHERE condition, so only those records are ever loaded into the program data vector (PDV). This results in fewer iterations through DATA step code, but provides no opportunity for "dynamic" decisions about which records to examine.

In contrast, the IF statement is evaluated at run time, and operates on the variables in the PDV. When the IF condition is met, the current observation is kept for eventual output. Unlike the WHERE statement, the IF statement can examine values of new variables that are defined within the step.

Consider these two DATA steps. They produce identical output of 10 records, but the first one processes only those 10 records whereas the second step processes all 19 records from the input.

data results1;
  set sashelp.class;
  /* WHERE applied at compile time  */
  /* Processes ONLY matching obs    */
  where sex='M';
run;
 
data results2;
  set sashelp.class;
  /* IF evaluated at run time  */
  /* Processes EVERY obs       */
  if sex='M';
run;
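
The earlier point about new variables can be sketched with a computed value. The variable name bmi and the cutoff are arbitrary choices for illustration.

```sas
data results3;
  set sashelp.class;
  bmi = 703 * Weight / Height**2;   /* new variable created in this step */
  if bmi > 18;                      /* subsetting IF can test the new variable */
  /* A WHERE statement that references bmi would fail here, */
  /* because bmi is not a variable in the input data set.   */
run;
```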

Using the DATA step debugger to understand the DATA step

The new DATA step debugger in SAS Enterprise Guide makes it very easy to illustrate how WHERE is processed differently from IF. I loaded each of the above programs into my session, then clicked the new "bug" toolbar icon to activate the debugger. Once activated, you can click the bug in the left "gutter" of the program editor to begin a debug session. (You can also press F5 to debug the active DATA step.)
Starting the Debugger
Watch this first animation of a debugger session and see what you notice about the WHERE statement.

Debugger with WHERE
Watching this little movie, I notice a few revealing details.

  • The statement pointer never lands on Line 5 (the WHERE statement). That's because the WHERE statement isn't processed at run time.
  • Even though the CLASS data contains 19 records, the value of the _N_ automatic variable reaches only 11, indicating that only 10 records were processed.
  • The variable watch window uses red to indicate when a variable changes between iterations. The Sex variable never changes from 'M', and thus stays colored black through the entire session.

Let's compare that to the IF statement. Study this animation and see what stands out to you.

Debugger with IF
Here's what I see:

  • The statement pointer begins at Line 2, then 5, and moves to Line 6 (the RUN statement) only when the record has made it past the IF condition and into the output. For each observation where Sex='F', the DATA step stops processing the record and the RUN statement is skipped.
  • In this program, _N_ reaches 20 -- that's because all 19 records in SASHELP.CLASS are processed and the step exits at the end-of-file condition.
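
If you don't have the debugger handy, one way to check the iteration counts is to write _N_ to the log when the last record is read. This is a sketch, not part of the original programs. Note that the debugger shows values one higher (11 and 20) because it also displays the final iteration that hits end-of-file.

```sas
data _null_;
  set sashelp.class end=eof;
  where sex='M';
  if eof then put "WHERE step: " _N_=;   /* iterations over matching records only */
run;

data _null_;
  set sashelp.class end=eof;
  if eof then put "IF step: " _N_=;      /* placed before the subsetting IF */
  if sex='M';
run;
```

The PUT statement in the second step comes before the subsetting IF so that it executes on the last record regardless of that record's value of Sex.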

Learning more about subsetting IF, IF-THEN, WHERE, and debugging

There are several good articles about how the IF statement works, on its own and in combination with IF-THEN-ELSE constructs. Here's a recent article by SAS trainer Charu Shankar. And here's another reference that's included in a piece about the Top 10 SAS coding efficiencies.

The new DATA step debugger in SAS Enterprise Guide opens a new world of understanding for beginner and veteran SAS programmers. It has all of the functions of the "classic" debugger available in the Base SAS windowing environment, but with a much friendlier user interface, keyboard shortcuts, and useful watch windows. In a future post, I'll cover the debugging functions in more detail.

tags: SAS Enterprise Guide, SAS programming

The post Debugging the difference between WHERE and IF in SAS appeared first on The SAS Dummy.