Call Execute

6月 252021
 

In many programming languages, there is a function named eval() that can be used to evaluate an expression and return the result at run time. For example, in Python, the eval() function parses the expression passed to it and runs a Python expression or code within the program. Even Python eval() supports more optional parameters, such as global/local dictionary for runtime context, but the goal for evaluation remains the same.

When an expression is stored as a character variable of an SAS observation, it suggests the running context is the SAS program execution phase, so the internal data structure like Input Buffer (IB), Program Data Vectors (PDV) and Descriptor Information of the output database are all accessible. Furthermore, all SAS system functions are also accessible in an expression, just like the raw SAS code exposed to SAS compiler and runtime. In the following SAS data set, what the user wants is to get the real value from the expression in variable C. It is different from the commonly used Calculated Column, which uses the same computing rule to expand a new variable for all observations. Here the column C has a different computing rule for each observation; the user's expected result is the column D.

%let MVAR=2;
data a; 
  length a b 8 c $ 255;
  a=3; b=4; c='a**2+b**2'; output; /* Arithmetic or Logical */  
  a=7; b=22; c='b/a * &MVAR'; output; /* SAS Macro Var*/
  a=113; b=355; c='cos(b/a)'; output; /* Trigonometric */
  a=0; b=1; c='cdf("NORMAL", 0, a, b)'; output; /* Probability */
run;
proc print;run;

What solutions ahead?

Someone might want a solution that parses the expression in variable C, and then try to rebuild the abstract syntax tree and try to evaluate it from bottom to top. I can’t say this solution is totally wrong but it’s very complex and too hard to generate a general solution. If we have an SAS function eval() in DATA Step, then we can easily use the following code to achieve the goal. Unfortunately, SAS did not provide the eval() function.

 
data b; 
  set a;
  d=eval(c);
run;

SAS provides Macro function %eval, %sysevalf to evaluate arithmetic and logical expressions using integer or floating-point arithmetic. SAS also provides function resolve() (not supported in the DATA Step that runs in CAS) to return the resolved value of the argument after it has been processed by the macro facility. Anyway, resolve() can’t access the values of macro variable assigned by symput() or symputn() at program execution phase. So, for the SAS code as below, it outputs c=150 and d=200. For simple arithmetic and logical expression, we can use the tranwrd() function to replace all variable names with real values, and then use resolve('%sysevalf('|| cats(expression_with_real_values) || ')') to get evaluated result, but it limited to only SAS functions supported by %eval() and %sysevalf() macro functions.

%let a=100;
%let b=50;
data _null_;
  call symput("b", 200);  
  c=resolve("%eval(&a+&b)");
  put c=;
  d=symget("b");
  put d=; 
run;

Now let’s return to the key question we raised: How can we implement the eval() function in SAS Data Step? The best solution we found so far is to use the dynamic code generation mechanism in SAS, and then submit the code to SAS for parsing and execution. In this way, we retrieve string expression from a variable to a real expression in SAS code successfully. We don’t care what the valid SAS expression is, we totally convert it to the code snippets and submit it for execution. Syntax check, reference to PDV variables, internal functions, and even SAS Macro variables are all supported. Yes, it’s a smart and concise implementation for general purposes.

Due to each observation having a different computing rule, we need to use _N_ to control the observation mapping. So, for the existing dataset A, we can use the following code to achieve the goal. The key secret is to use CALL EXECUTE to generate SAS code dynamically and delay execution after the code is ready. In the output dataset B, we have a numeric column D with the value evaluated with the character expression from column C. You can see the output on the RIGHT part of the Figure 1.

data _null_;
  set a end=last;
  if _n_=1 then call execute("data b; set a;");
  call execute( "if _N_=" || compress(_N_) || " then d=" || trim(c) || "; ");
  if last then call execute("run;");
run;
proc print; run;

Wrap up as a reusable SAS macro

We also can wrap up the above SAS code in a reusable SAS macro, so this logic can be used anywhere you want. The %evalvar macro has four arguments: ds= and var= are the input dataset and columns with expression to be evaluated. And outds= and outvar= are the output dataset and columns to store result. We also can specify same value for ds= and outds=, so the user just expands the existing dataset with an additional evaluated column.

%macro EvalVar(ds=, var=, outds=, outvar=);
data _null_;
  set &ds end=last;
  if _n_=1 then call execute("data &outds; set &ds;");
  call execute( "if _n_=" || compress(_n_) || " then &outvar=" || &var || ";");
  if last then call execute("run;");
 run;
%mend; 
 
%EvalVar(ds=a, var=c, outds=b, outvar=d);
proc print; run;

By the way, I also had a temporary SAS code generation solution implemented via the %include SAS macro. The temporary SAS code will be deleted automatically when the SAS session is closed. The sample code is also attached here for your information.

filename tmpfile temp;
data _null_;
  file tmpfile;
  set a end=last;  
  if _N_=1 then put "data b; set a;";
  put "if _N_ =" _N_ "then d=" c ";";
  if last then 	put "run;";
run;
%include tmpfile;
proc print; run;

Summary

In this article, we talk about how to evaluate SAS expressions in Data Step dynamically. The expression parsing and execution are totally handled by SAS at program execution phase. It avoids handling abstract syntax tree parsing and evaluation ourselves on it. We introduce two dynamic code generation implementations via call execute or %include. We also use _N_ to control observation mapping due to each observation has different computing rules. This implementation can reflect the beauty of simplicity in SAS before SAS provides system function eval() one day.

How to evaluate SAS expression in DATA Step dynamically was published on SAS Users.

8月 022017
 

In my prior posts (Data-driven SAS macro loops, Modifying variable attributes in all datasets of a SAS library, Automating the loading of multiple database tables into SAS tables), I presented various data-driven applications using SAS macro loops.

However, macro loops are not the only tools available in SAS for developing data-driven programs.

CALL EXECUTE is one of them. The CALL EXECUTE routine accepts a single argument that is a character string or character expression. The character expression is usually a concatenation of strings containing SAS code elements to be executed after they have been resolved. Components of the argument expression can be character constants, data step variables, macro variable reference, as well as macro references. CALL EXECUTE dynamically builds SAS code during DATA step iterations; that code executes after the DATA step’s completion outside its boundary. This makes a DATA step iterating through a driver table an effective SAS code generator similar to that of SAS macro loops.

However, the rather peculiar rules of the CALL EXECUTE argument resolution may make its usage somewhat confusing. Let’s straighten things out.

Argument string has no macro or macro variable reference

If an argument string to the CALL EXECUTE contains SAS code without any macro or macro variable references, that code is simply pushed out (of the current DATA step) and appended to a queue after the current DATA step. As the DATA step iterates, the code is appended to the queue as many times as there are iterations of the DATA step. After the DATA step completes, the code in the queue gets executed in the order of its creation (First In First Out).

The beauty of this process is that the argument string can be a concatenation of character constants (in single or double quotes) and SAS variables which get substituted with their values by CALL EXECUTE for each DATA step iteration. This will produce data-driven, dynamically generated SAS code just like an iterative SAS macro loop.

Let’s consider the following example. Say we need to load multiple Oracle tables into SAS tables.

Step 1. Creating a driver table

In order to make our process data-driven, let’s first create a driver table containing a list of the table names that needed to be extracted and loaded:

/* create a list of tables to extract & load */
libname parmdl '/sas/data/parmdata';
data parmdl.tablelist;
        length tname $8;
        input tname;
        datalines;
ADDRESS
ACCOUNT
BENEFIT
FINANCE
HOUSING
;

This program runs just once to create the driver table parmdl.tablelist.

Step 2. Loading multiple tables

Then, you can use the following data-driven program that runs each time you need to reload Oracle tables into SAS:

/* source ORACLE library */
libname oralib oracle path="xxx" schema="yyy" user="uuu"
 	PASSWORD="{SAS002}ABCDEFG12345678RTUR" access=readonly;
 
/* target SAS library */
libname sasdl '/sas/data/appdata';
 
/* driver table SAS library */
libname parmdl '/sas/data/parmdata';
 
data _null_;
   set parmdl.tablelist;
   call execute(cats(
      'data sasdl.',tname,';',
         'set oralib.',tname,';',
      'run;'));
run;

In order to concatenate the components of the CALL EXECUTE argument I used the cats() SAS function which returns a concatenated character string removing leading and trailing blanks.

When this program runs, the SAS log indicates that after the data _null_ step the following statements are added and executed:

NOTE: CALL EXECUTE generated line.
1   + data sasdl.ADDRESS;set oralib.ADDRESS;run;
2   + data sasdl.ACCOUNT;set oralib.ACCOUNT;run;
3   + data sasdl.BENEFIT;set oralib.BENEFIT;run;
4   + data sasdl.FINANCE;set oralib.FINANCE;run;
5   + data sasdl.HOUSING;set oralib.HOUSING;run;

In this example we use data _null_ step to loop through the list of tables (parmdl.tablelist) and for each value of the tname column a new data step gets generated and executed after the data _null_ step. The following diagram illustrates the process:

Diagram explaining CALL EXECUTE for SAS data-driven programming

Argument string has macro variable reference in double quotes

If an argument to the CALL EXECUTE has macro variable references in double quotes, they will be resolved by the SAS macro pre-processor during the DATA step compilation. Nothing unusual. For example, the following code will execute exactly as the above, and macro variable references &olib and &slib will be resolved to oralib and sasdl before CALL EXECUTE takes any actions:

%let olib = oralib;
%let slib = sasdl;
 
data _null_;
   set parmdl.tablelist;
   call execute (
      "data &slib.."!!strip(tname)!!';'
         "set &olib.."!!strip(tname)!!';'!!
      'run;'
   );
run;

Argument string has macro or macro variable reference in single quotes

Here comes the interesting part. If the argument to CALL EXECUTE has macro or macro variable references in single quotes, they still will be resolved before the code is pushed out of the DATA step, but not by the SAS macro pre-processor during the DATA step compilation as it was in the case of double quotation marks. Macro or macro variable references in single quotes will be resolved by CALL EXECUTE itself. For example, the following code will execute exactly as the above, but macro variable references &olib and &slib will be resolved by CALL EXECUTE:

%let olib = oralib;
%let slib = sasdl;
 
data _null_;
   set parmdl.tablelist;
   call execute('data &slib..'!!strip(tname)!!';'!!
                'set &olib..'!!strip(tname)!!';'!!
                'run;'
               );
run;

Timing considerations

CAUTION: If your macro contains some non-macro language constructs for assigning macro variables during run time, such as a CALL SYMPUT or SYMPUTX statement (in a DATA step) or an INTO clause (in PROC SQL), resolving those macro variable references by CALL EXECUTE will happen too soon, before your macro-generated code gets pushed out and executed. This will result in unresolved macro variables. Let’s run the following code that should extract Oracle tables into SAS tables as above, but also re-arrange column positions to be in alphabetical order:

%macro onetable (tblname);
   proc contents data=oralib.&tblname out=one(keep=name) noprint;
   run;
 
   proc sql noprint;
      select name into :varlist separated by ' ' from one;
   quit;
   %put &=varlist;
 
   data sasdl.&tblname;
      retain &varlist;
      set oralib.&tblname end=last nobs=n;
      if last then call symput('n',strip(put(n,best.)));
   run;
   %put Table &tblname has &n observations.;
%mend onetable;
 
data _null_;
   set parmdl.tablelist;
   call execute('%onetable('!!strip(tname)!!');');
run;

Predictably, the SAS log will show unresolved macro variable references, such as:

WARNING: Apparent symbolic reference VARLIST not resolved.
WARNING: Apparent symbolic reference N not resolved.
Table ADDRESS has &n observations.

SOLUTION: To avoid the timing issue when a macro reference gets resolved by CALL EXECUTE too soon, before macro variables have been assigned during macro-generated step execution, we can strip CALL EXECUTE of the macro resolution privilege. In order to do that, we can mask & and % characters using the %nrstr macro function, thus making CALL EXECUTE “macro-blind,” so it will push the macro code out without resolving it. In this case, macro resolution will happen after the DATA step where CALL EXECUTE resides. If an argument to CALL EXECUTE has a macro invocation, then including it in the %nrstr macro function is the way to go. The following code will run just fine:

data _null_;
   set parmdl.tablelist;
   call execute('%nrstr(%onetable('!!strip(tname)!!'));');
run;

When this DATA step runs, the SAS log indicates that the following statements are added and executed:

NOTE: CALL EXECUTE generated line.
1   + %onetable(ADDRESS);
2   + %onetable(ACCOUNT);
3   + %onetable(BENEFIT);
4   + %onetable(FINANCE);
5   + %onetable(HOUSING);

CALL EXECUTE argument is a SAS variable

The argument to CALL EXECUTE does not necessarily have to contain or be a character constant. It can be a SAS variable, a character variable to be precise. In this case, the behavior of CALL EXECUTE is the same as when the argument is a string in single quotes. It means that if a macro reference is part of the argument value it needs to be masked using the %nrstr() macro function in order to avoid the timing issue mentioned above.

In this case, the argument to the CALL EXECUTE may look like this:

arg = '%nrstr(%mymacro(parm1=VAL1,parm2=VAL2))';
call execute(arg);

Making CALL EXECUTE totally data-driven

In the examples above we used the tablelist driver table to retrieve values for a single macro parameter for each data step iteration. However, we can use a driver table not only to dynamically assign values to one or more macro parameters, but also to control which macro to execute in each data step iteration. The following diagram illustrates the process of totally data-driven SAS program:

Diagram explaining using CALL EXECUTE for SAS data-driven programming

Conclusion

CALL EXECUTE is a powerful tool for developing dynamic data-driven SAS applications. Hopefully, this blog post demonstrates its benefits and clearly explains how to avoid its pitfalls and use it efficiently to your advantage. I welcome your comments, and would love to hear your experiences with CALL EXECUTE.

CALL EXECUTE made easy for SAS data-driven programming was published on SAS Users.

4月 182014
 

Some very common questions we receive into Technical Support are, “I need to be able to send the data for every hospital listed in my data set to a separate .csv file. How can I do that?” Or, “How can I run a PROC REPORT for each region listed in my data set?” And, “How can I send an email to each manager listed in my data set?”

Each one of these questions has something in common. Each question is asking how to execute a task based on a data set variable. Within SAS, the best way to execute a repetitive task is to use the SAS Macro language. The best way to invoke a macro based on a data set variable is to use CALL EXECUTE. The two most important aspects of CALL EXECUTE are that it allows you to:

  • conditionally execute a macro based on DATA step logic
  • pass in a data set variable’s value in as the value to a macro parameter

Two common errors when invoking macro statements based on DATA step logic
Because CALL EXECUTE is a DATA step call routine, it can be conditionally executed using DATA step logic. It is important to understand the basic timing of a macro statement. A macro statement executes a DATA step at compile time. This means that macro code executes before any DATA step logic is ever evaluated. Here is an example of logic that will not work. Because of the timing of macro statements, this macro invocation will always execute, regardless of the IF condition. CALL EXECUTE can be used to accommodate for this difference in timing.

If first.hospital_id then do;
  %mymacro(hospital_id)
End;

The above example also illustrates another common mistake when attempting to invoke a macro based on DATA step logic. This example is attempting to pass in the value of the data set variable HOSPITAL_ID as the parameter to the macro. It is important to know that the macro language is simply a text language. When the MYMACRO macro executes, the text ‘hospital_id’ is passed in as the value of the macro parameter, not the value of the data set variable HOSPITAL_ID. Again the desired result can be achieved using CALL EXECUTE.

How to invoke a macro using CALL EXECUTE
So now let’s see the correct way to invoke the macro using CALL EXECUTE. In the following example, we are going to dynamically create a .csv file for each hospital listed in our data set. This will be accomplished by placing a PROC EXPORT within a macro. CALL EXECUTE will allow to invoke the macro for each hospital listed in the HOSPITALS data set.

/* sample data set that contains the hospital ids */ 
data hospitals;                                                
  input hospital_id $ quarter number_of_patients;                               
  datalines;                                      
A100 1 125                                    
A100 2 115                                                                                                                              
A100 3 130                                                                                                                              
A100 4 110                                                                                                                              
A200 1 200                                                                        
A200 2 195 
A200 3 180                                                        
A200 4 190                                       
;                                                

/* this macro uses PROC EXPORT to write data to a .csv file */

%macro mymacro(hosp_id);                                                                                                                

proc export data=hospitals(where=(hospital_id="&hosp_id"))                                                                                  
  outfile="c:\hospitals\&hosp_id..csv"                                                                                                  
  dbms=csv                                                                                                                              
  replace;                                                                                                                              
run;                                                                                                                                    

%mend;                                                                                                                                  

data _null_;                                                                                                                            
  set hospitals;                                                                                                                        
  by hospital_id;

/* this logic allows us to call the MYMACRO macro 
each time the value of HOSPITAL_ID changes */   

  if first.hospital_id then do;
    call execute(cats('%mymacro(',hospital_id,');')); 
  end;                                                                  
run;

This logic will work regardless of the number of hospitals listed and of the number of quarters listed for each hospital. When this code is executed, the following macro invocations are generated.

First time the IF condition is true:
%mymacro(A100)
Second time IF condition is true:
%mymacro(A200)

How the CALL EXECUTE routine processes macros
It is important to know what happens behind the scenes when using CALL EXECUTE. When the argument to CALL EXECUTE is a macro invocation, the macro will execute immediately. This means that all macro statements are executed and all macro variable references are resolved. All non-macro code generated by the macro is stored and does not execute until after the step boundary that terminates the DATA step that contains the CALL EXECUTE is encountered. The timing of when statements are executed can sometimes lead to unresolved macro variables, which in turn can lead to unexpected syntax errors. This issue is discussed in detail in SAS Usage Note 23134. The note discusses the cause of the potential problem as well as the solution.

CALL EXECUTE is an invaluable tool that should be added to your macro programming toolbox. CALL EXECUTE allows you to make your macro invocations data driven, which can make your programming more efficient and adaptable.

tags: CALL EXECUTE, macro, Problem Solvers, SAS Programmers
11月 072013
 

I use relatively fixed patterns in my SAS  programming life. For so called data driven programming (or dynamic programming), I used the following 4 techniques, chronologically:

  • macro array
  • call execute
  • list processing
  • for each loop

For a quick demo, I will start with a simple scenario in which the data set sashelp.zipcode should be spitted to pieces of datasets by states (in real projects, the codes would be more complicated but share the simple atom structure). For example, the dataset for North Carolina:

data zipcode_NC;
    set sashelp.zipcode(where=(statecode="NC"));
run;

There are 50 states plus territories like Puerto Rico in the source data, so you won’t just use a simple string replacement by macro variable. My first SAS dynamic programming technique is using macro array, learned from Chapter 6 of Carpenter’s Complete Guide to the SAS Macro Language:

1. Macro array

proc sort data=sashelp.zipcode (keep=statecode) nodupkey out=statecode;
    by statecode;
run;

data _null_;
    set statecode end=eof;
    i+1;
    II=left(put(i,2.));
    call symputx(‘statecode’||II,statecode);
    if eof then call symputx(‘n’,II);
run;

%macro doit;
%do i=1 %to &n;
    data class_&&statecode&i;
        set sashelp.zipcode(where=(statecode="&&statecode&i"));
    run;
%end;
%mend;
%doit

Macro array is still my favorite and I use it everywhere. It creates multiple macros variables by sequence (macro array) from the control file, then apply a do loop over each macro variable to get job done: straightforward while robust(verbose somehow).

2. Call Execute

Call Execute is a power tool. It will make your codes much concise and efficient. I love it but with little bit of reluctance: regardless the potential timing issue, for me, it is not aesthetically readable in most cases. To make a relatively enjoyable Call Execute, I prefer to enclose the atom part of the program to a macro then use a single Call Execute to resolve it:

%macro break(statecode);
   data class_&statecode;
      set sashelp.zipcode(where=(statecode="&statecode"));
   run;
%mend;

data _null_;
  set statecode;
  call execute(‘%break(‘||trim(statecode)||’)’);
run;

Heavily interacting with data step will just make Call Execute like black magic (this snippet comes from Mike Molter’s 2013 paper, Coding For the Long Haul With Managed Metadata and Process Parameters; sorry Mike, I know it works):

data _null_ ;
length lastds $4 ;
set meta2 end=thatsit ;
if _n_ eq 1 then do;
     call execute (‘proc sql; ‘) ;
     lastds=’ae’;
end;
retain lastds ;
 
call execute (‘create table ‘||data set||’ as select a.*,’||left(keepvars)||’ from
‘||lastds a left join suppae_tran(where=(idvar eq "’||compress(idvar)||’")) b on
a.usubjid=b.usubjid and a.’||compress(idvar)||’=’||left(joincond)||’;’);

 
if thatsit then call execute(‘ quit; ‘) ;
lastds=data set ;
run;

Well there are always trade-offs. Separating Call Execute with data steps (my preference) will make it much more readable, but it is not cool anymore(compared to Mike’s style)! Coolness deserves the efforts and I know it’s part of programmers proud.

3. List Processing

I started to use list in SAS since 2011 and now I have a big collection:

https://github.com/Jiangtang/SAS_ListProcessing

Macro array approach will create series of macro variables, while in list method, a single macro variable will be generated which hold series of values:

proc sql noprint;
    select statecode into:statecode separated by " "
    from statecode
    ;
quit;

%macro doit;
%do i=1 %to %sysfunc(countw(&statecode));
    %let a=%sysfunc(scan(&statecode,&i,’ ‘));
    data class_&a;
        set sashelp.zipcode(where=(statecode="&a"));
    run;
%end;
%mend;
%doit

4. For each

A foreach operation is dream for SAS programmers. Now I have one, for.sas by Jim Anderson:

filename list url "https://raw.github.com/Jiangtang/SAS_ListProcessing/master/_ListProcessing";
%inc list;

data %for(statecode, in=[statecode], do=%nrstr(class_&statecode(where=(statecode="&statecode"))));
    set sashelp.zipcode;
run;

Actually this %for is much more versatile than it appears in this simple demo. It can proceeds sequentially against SAS datasets, value list, number range, along with dataset contents and directory contents. Check it out and you will definitely love it:

http://www.sascommunity.org/wiki/Streamlining_Data-Driven_SAS_With_The_%25FOR_Macro

4月 172010
 


When printing multiple files, or sorting multiple datasets, the traditional method is to write multiple steps as below. Proc print data=libref.ae; var _all_; run; Proc print data=libref.conmed; var _all_; run; Proc print data=libref.demog; var _all_; run; Proc print data=libref.lab; var _all_; run; Proc print data=libref.medhist; var _all_; run; If you are like me who likes to simplify the traditional SAS code here is the tip. CALL EXECUTE comes to rescue here. *Using Disctionary Tables and Call Execute; proc sql; create table dsn as select distinct memname from dictionary.tables where libname="LIBREF" and memtype="DATA"; quit; *Sorts all the datasets using Call Execute; data _null_; set dsn; call execute ("proc sort data=final.||'memname||';by usubjid; run;"); run; *Prints all the datasets using Call Execute; data _null_; set dsn; call execute ("proc print data=final.||'trim(memname)||';var _all_; run;"); run; *Using Proc Contents and Call...

[[ This is a content summary only. Visit my website for full links, other content, and more! ]]
 Posted by at 8:45 下午
4月 172010
 


When printing multiple files, or sorting multiple datasets, the traditional method is to write multiple steps as below. Proc print data=libref.ae; var _all_; run; Proc print data=libref.conmed; var _all_; run; Proc print data=libref.demog; var _all_; run; Proc print data=libref.lab; var _all_; run; Proc print data=libref.medhist; var _all_; run; If you are like me who likes to simplify the traditional SAS code here is the tip. CALL EXECUTE comes to rescue here. *Using Disctionary Tables and Call Execute; proc sql; create table dsn as select distinct memname from dictionary.tables where libname="LIBREF" and memtype="DATA"; quit; *Sorts all the datasets using Call Execute; data _null_; set dsn; call execute ("proc sort data=final.||'memname||';by usubjid; run;"); run; *Prints all the datasets using Call Execute; data _null_; set dsn; call execute ("proc print data=final.||'trim(memname)||';var _all_; run;"); run; *Using Proc Contents and Call...

[[ This is a content summary only. Visit my website for full links, other content, and more! ]]
 Posted by at 8:45 下午