SAS programming

November 20, 2020
 

If you’re like me and the rest of the conference team, you’ve probably attended more virtual events this year than you ever thought possible. You can see the general evolution of virtual events by watching the early ones from April or May and comparing them to the recent ones. We at SAS Global Forum are studying the virtual event world, and we’re learning what works and what needs to be tweaked. We’re using that knowledge to plan the best possible virtual SAS Global Forum 2021.

Everything is virtual these days, so what do we mean by virtual?

Planning a good virtual event takes time, and we’re working through the process now. One thing is certain -- we know the importance of providing quality content and an engaging experience for our attendees. We want to give attendees the opportunity, as always but now virtually, to continue to learn from other SAS users, hear about new and exciting developments from SAS, and connect and network with experts, peers, partners and SAS. Yes, I said network. We realize it won’t be the same as a live event, but we are hopeful we can provide attendees with an incredible experience where you connect, learn and share with others.

Call for content is open

One of the differences between SAS Global Forum and other conferences is that SAS users are front and center, and the soul of the conference. We can’t have an event without user content. And that’s where you come in! The call for content opened November 17 and lasts through December 21, 2020. Selected presenters will be notified in January 2021. Presentations will be different in 2021; they will be 30 minutes in length, including time for Q&A where possible. And since everything is virtual, video is a key component of your content submission. We ask for a 3-minute video along with your title and abstract.

The Student Symposium is back

Calling all postsecondary students -- there’s still time to build a team for the Student Symposium. If you are interested in data science and want to showcase your skills, grab a teammate or two and a faculty advisor and put your thinking caps on. Applications are due by December 21, 2020.

Learn more

I encourage you to visit the SAS Global Forum website for up-to-date information, follow #SASGF on social channels and join the SAS communities group to engage with the conference team and other attendees.

Connect, learn and share during virtual SAS Global Forum 2021 was published on SAS Users.

November 10, 2020
 

The code and data that drive analytics projects are important assets to the organizations that sponsor them. As such, there is a growing trend to manage these items in the source management systems of record. For most companies these days, that means Git. The specific system might be GitHub Enterprise, GitLab, or Bitbucket -- all platforms that are based on Git.

Many SAS products support direct integration with Git. This includes SAS Studio, SAS Enterprise Guide, and the SAS programming language. (That last one checks a lot of boxes for ways to use Git and SAS together.) While we have good documentation and videos to help you learn about Git and SAS, we often get questions around "best practices" -- what is the best/correct way to organize your SAS projects in Git?

In this article I'll dodge that question, but I'll still try to provide some helpful advice in the process.

Ask the Expert resource: Using SAS® With Git: Bring a DevOps Mindset to Your SAS® Code

Guidelines for managing SAS projects in Git

It’s difficult for us to prescribe exactly how to organize project repositories in source control. Your best approach will depend so much on the type of work, the company organization, and the culture of collaboration. But I can provide some guidance -- mainly things to do and things to avoid -- based on experience.

Do not create one huge repository

DO NOT build one huge repository that contains everything you currently maintain. Your work only grows over time and you'll come to regret/revisit the internal organization of a huge project. Once established, it can be tricky to change the folder structure and organization. If you later try to break a large project into smaller pieces, it can be difficult or impossible to maintain the integrity of source management benefits like file histories and differences.

Design with collaboration in mind

DO NOT organize projects based only on the teams that maintain them. And of course, don't organize projects based on individual team members.

  • Good repo names: risk-adjustment-model, engagement-campaigns
  • Bad repo names: joes-code, claims-dept

All teams reorganize over time, and you don't want to have to reorganize all of your code each time that happens. And code projects change hands, so keep the structure personnel-agnostic if you can. Major refactoring of code can introduce errors, and you don't want to risk that just because you got a new VP or someone changed departments.

Instead, DO organize projects based on function/work that the code accomplishes. Think modular...but don't make projects too granular (or you'll have a million projects). I personally maintain several SAS code projects. The one thing they have in common is that I'm the main contributor -- but I organize them into functional repos that theoretically (oh please oh please) someone else could step in to take over.

The Git view of my YouTube API project in SAS Enterprise Guide

Up with reuse, down with ownership

This might seem a bit communist, but collaboration works best when we don't regard code that we write as "our turf." DO NOT cling to notions of code "ownership." It makes sense for teams/subject-matter experts to have primary responsibility for a project, but systems like Git are designed to help with transparency and collaboration. Be open to another team member suggesting and merging (with review and approval) a change that improves things. GitHub, GitLab, and Bitbucket all support mechanisms for issue tracking and merge requests. These allow changes to be suggested, submitted, revised, and approved in an efficient, transparent way.

DO use source control to enable code reuse. Many teams have foundational "shared code" for standard operations, coded in SAS macros or shared statements. Consider placing these into their own project that other projects and teams can import. You can even use Git functions within SAS to fetch and include this code directly from your Git repository:

/* create a temp folder to hold the shared code */
options dlcreatedir;
%let repoPath = %sysfunc(getoption(WORK))/shared-code;
libname repo "&repoPath.";
libname repo clear;
 
/* Fetch latest code from Git */
data _null_;
 rc = git_clone( 
   "https://gitlab.mycompany.com/sas-projects/shared-code/",
   "&repoPath.");
run;
 
options source2;
/* run the code in this session */
%include "&repoPath./bootstrap-macros.sas";

If you rely on a repository for shared code and components, make sure that tests are in place so changes can be validated and will not break downstream systems. You can even automate tests with continuous integration tools like Jenkins.

DO document how projects relate to each other and what their dependencies are, and prepare guidance so new team members can get started quickly. Most of us feel more accountable when we know that our code will be placed in central repositories visible to our peers. It may inspire cleaner code, more complete documentation, and a robust on-boarding process for new team members. Use the Markdown files (README.md and others) in a repository to keep your documentation close to the code.

My SAS code to check Pagespeed Insights, with documentation

Work with Git features (and not against them)

Once your project files are in a Git repository, you might need to change your way of working so that you aren't going against the grain of Git benefits.

DO NOT work on code changes in a shared directory with multiple team members -- you'll step on each other. The advantage of Git is its distributed workflow: each developer can work with their own copy of the repository and merge/accept changes from others at their own pace.

DO use Git branching to organize and isolate changes until you are ready to merge them with the main branch. It takes a little bit of learning and practice, but when you adopt a branching approach you'll find it much easier to manage -- it beats keeping multiple copies of your code with slightly different file and folder names to mark "works in progress."

DO consider learning and using Git tools such as Git Bash (command line), Git GUI, and a code IDE like VS Code. These don't replace the SAS-provided coding tools with their Git integration, but they can supplement your workflow and make it easier to manage content among several projects.

Learning more

When you're ready to learn more about working with Git and SAS, we have many webinars, videos, and documentation resources:

The post How to organize your SAS projects in Git appeared first on The SAS Dummy.

November 10, 2020
 

As a SAS consultant I have been an avid user of SAS Enterprise Guide for as long as I can remember. It has been not just my go-to tool, but that of many of the SAS customers I have worked with over the years.

It is easy to use, the interface is intuitive, and it is a Swiss Army knife when it comes to data analysis. Whether you’re looking to access SAS data or import good old Excel locally, join data together or perform data analysis, a few clicks and ta-dah, you’re there! Alternatively, if you insist on coding or, like me, use a bit of both, the ta-dah point still holds.

SAS Enterprise Guide, or EG as it is commonly known, is a mature SAS product with many years of R&D, an established user base, and a reputation as a reliable and trusted product. So why move to SAS Studio? Why should I leave the comfort of what works?

For the last nine months I have been working with one of the UK’s largest supermarkets, answering that exact question as they make the journey from SAS Enterprise Guide to SAS Studio. EG is used widely across several of the supermarket's operations, including:

  • supply chain (to look at wastage and stock availability)
  • marketing analytics (to look at customer behaviour and build successful campaigns)
  • fraud detection (to detect misuse of vouchers).

What is SAS Studio?

Firstly, let's answer the "what is SAS Studio" question. It is the browser-based interface for SAS programmers to run code or use predefined tasks to automatically generate SAS code. Since there is nothing to install on your desktop, you can access it from almost any machine: Windows or Mac. And SAS Studio is brought to you by the same SAS R&D developers who maintain SAS Enterprise Guide.

SAS Studio with Ignite (dark) theme

1. Still does the regular stuff

It allows you to access your data, libraries and existing programs and import a range of data sources including Excel and CSV. You can code or use the tasks to perform analysis. You can build queries to join data, create simple and complex expressions, filter and sort data.
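
For instance, whether you point and click through the Import Data task or write the step yourself, the result is ordinary SAS code. The sketch below shows the kind of program involved; the file path and sheet name are only placeholders.

/* a sketch of importing an Excel file; the path and sheet name are hypothetical */
proc import datafile="/home/myuser/data/sales.xlsx"
            out=work.sales
            dbms=xlsx
            replace;
   sheet="Sheet1";
run;
 
proc print data=work.sales(obs=5); run;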

But it does much more than that... So what cool things can you do with SAS Studio?

2. Use the processing power of SAS Viya

SAS Studio (v5.2 onwards) works on SAS Viya. Previously, SAS 9 had the compute server (also known as the workspace server) as its processing engine. SAS Viya has CAS, the next-generation SAS run-time environment, which makes use of both memory and disk. It is distributed, fault tolerant and elastic, and it can work on problems larger than the available RAM. It is all centrally managed, secure, auditable and governed.
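
To give a feel for what that looks like in code, here is a minimal sketch that loads a table into CAS memory and summarizes it there. It assumes a SAS Viya environment where a CAS server is available; the session name and target caslib are illustrative.

cas mysess;                           /* start a CAS session (name is arbitrary) */
caslib _all_ assign;                  /* assign librefs to the available caslibs */
 
proc casutil;
   load data=sashelp.cars outcaslib="casuser" casout="cars" replace; /* lift the table into CAS memory */
quit;
 
proc mdsummary data=casuser.cars;     /* CAS-enabled summary runs in the CAS server */
   var msrp;
   output out=casuser.cars_summary;
run;
 
cas mysess terminate;                 /* end the CAS session */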

3. Cool new functionality

SAS Studio comes with many enhancements and cool new functionality:

  • Custom tasks. You can easily build your own custom tasks (software developer skills not required) so others without SAS coding skills can utilise them. Learn more in this Ask the Expert session.
  • Code snippets. It comes with pre-defined code snippets, commonly used bits of code that you can save and reuse. Additionally, you can create your own which you can share with colleagues. Coders love that these code snippets can be used with keystroke abbreviations.
  • Background submit.  This allows you to run code in the background whilst you continue to work.
  • DATA step debugger. First added into SAS Enterprise Guide, SAS Studio now offers an interactive DATA step debugger as well.
  • Flexible layout for your workspace. You can have multiple tabs open for each program, and open multiple datasets and items.
  • FEDSQL. The query window

    DATA step debugger in SAS Studio

4. Seamlessly access the full suite of SAS Viya capabilities

A key benefit of SAS Studio is the ease with which you can move from writing code to doing some data discovery, visualisation and model building. Previously, in the SAS 9 world, you may have used EG to access and join your data and then moved to SAS Enterprise Miner, a different interface installed separately, to build a model. Those days are long gone.

To illustrate the point, if I wanted to build a campaign to see who would respond to a supermarket voucher, I could access my customer data and join that to my transaction and products data in SAS Studio. I could then move into SAS Visual Analytics to identify the key variables I would need to build an analytical model, and even the best model to build. From there I would move to SAS Visual Data Mining and Machine Learning to build the model. I could very easily use the intuitive point-and-click pipeline interface to build several models, incorporating R or Python models to find the best one. This would all be done within one browser-based interface, with the data loaded only once.

This tutorial from Christa Cody illustrates this coding workflow in action.

The Road to SAS Studio

SAS Studio clearly has a huge number of benefits: it does the regular stuff you would expect, but it also brings a host of cool new functionality and the processing power of SAS Viya, not to mention letting you move seamlessly to the next steps of the analytical and decisioning journey, including model building, creating visualisations, and more.

Change management + technical enablement = success

Though adoption of modern technology can bring significant benefits to enterprise organisations, as this supermarket is seeing, it is not without its challenges. Change is never easy, and the transition from EG to SAS Studio will take time, especially with a mature, well-liked and versatile product like EG.

The cultural challenge that new technology presents should not be underestimated and can be a barrier to adoption. Newer technology requires new approaches and a different way of working across diverse user communities, many of whom have well-established working practices and may, in some cases, resist change. The key is to invest time with those communities, explain how newer technology can support their activities more efficiently, and provide them with broader capability.

Learn more

Visit the Learn and Support center for SAS Studio.

Moving from SAS Enterprise Guide to SAS Studio was published on SAS Users.

November 2, 2020
 

When there are two equivalent ways to do something, I advocate choosing the one that is simpler and more efficient. Sometimes, I encounter a SAS program that simulates random numbers in a way that is neither simple nor efficient. This article demonstrates two improvements that you can make to your SAS code if you are simulating binary variables or categorical variables.

Simulate a random binary variable

The following DATA step simulates N random binary (0 or 1) values for the X variable. The probability of generating a 1 is p=0.6 and the probability of generating a 0 is 0.4. Can you think of ways that this program can be simplified and improved?

%let N = 8;
%let seed = 54321;
 
data Binary1(drop=p);
call streaminit(&seed);
p = 0.6;
do i = 1 to &N;
   u = rand("Uniform");
   if (u < p) then
      x=1;
   else
      x=0;
   output;
end;
run;
 
proc print data=Binary1 noobs; run;

The goal is to generate a 1 or a 0 for X. To accomplish this, the program generates a random uniform variate, u, which is in the interval (0, 1). If u < p, it assigns the value 1, otherwise it assigns the value 0.

Although the program is mathematically correct, the program can be simplified. It is not necessary for this program to generate and store the u variable. Yes, you can use the DROP statement to prevent the variable from appearing in the output data set, but a simpler way is to use the Bernoulli distribution to generate X directly. The Bernoulli(p) distribution generates a 1 with probability p and generates a 0 with probability 1-p. Thus, the following DATA step is equivalent to the first, but is both simpler and more efficient. Both programs generate the same random binary values.

data Binary2(drop=p);
call streaminit(&seed);
p = 0.6;
do i = 1 to &N;
   x = rand("Bernoulli", p);     /* Bern(p) returns 1  with probability p */
   output;
end;
run;
 
proc print data=Binary2 noobs; run;

Simulate a random categorical variable

The following DATA step simulates N random categorical values for the X variable. If p = {0.1, 0.1, 0.2, 0.1, 0.3, 0.2} is a vector of probabilities, then the probability of generating the value i is p[i]. For example, the probability of generating a 3 is 0.2. Again, the program generates a random uniform variate and uses the cumulative probabilities ({0.1, 0.2, 0.4, 0.5, 0.8, 1}) as cutpoints to determine what value to assign to X, based on the value of u. (This is called the inverse CDF method.)

/* p = {0.1, 0.1, 0.2, 0.1, 0.3, 0.2} */
/* Use the cumulative probability as cutpoints for assigning values to X */
%let c1 = 0.1;
%let c2 = 0.2;
%let c3 = 0.4;
%let c4 = 0.5;
%let c5 = 0.8;
%let c6 = 1;
 
data Categorical1;
call streaminit(&seed);
do i = 1 to &N;
   u = rand("Uniform");
   if (u <=&c1) then
      x=1;
   else if (u <=&c2) then
      x=2;
   else if (u <=&c3) then
      x=3;
   else if (u <=&c4) then
      x=4;
   else if (u <=&c5) then
      x=5;
   else
      x=6;
   output;
end;
run;
 
proc print data=Categorical1 noobs; run;

If you want to generate more than six categories, this indirect method becomes untenable. Suppose you want to generate a categorical variable that has 100 categories. Do you really want to use a super-long IF-THEN/ELSE statement to assign the values of X based on some uniform variate, u? Of course not! Just as you can use the Bernoulli distribution to directly generate a random variable that has two levels, you can use the Table distribution to directly generate a random variable that has k levels, as follows:

data Categorical2;
call streaminit(&seed);
array p[6] _temporary_ (0.1, 0.1, 0.2, 0.1, 0.3, 0.2);
do i = 1 to &N;
   x = rand("Table", of p[*]); /* Table(p) returns i with probability p[i] */
   output;
end;
run;
 
proc print data=Categorical2 noobs; run;

Summary

In summary, this article shows two tips for simulating discrete random variables:

  1. Use the Bernoulli distribution to generate random binary variates.
  2. Use the Table distribution to generate random categorical variates.

These distributions enable you to directly generate categorical values based on supplied probabilities. They are more efficient than the oft-used method of assigning values based on a uniform random variate.
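
If you want to convince yourself that the generated frequencies match the supplied probabilities, a quick check (not part of the programs above, and using an arbitrary seed and sample size) is to simulate a large sample and compare the empirical percentages with PROC FREQ:

data CheckTable;
call streaminit(12345);
array p[6] _temporary_ (0.1, 0.1, 0.2, 0.1, 0.3, 0.2);
do i = 1 to 10000;
   x = rand("Table", of p[*]);
   output;
end;
run;
 
proc freq data=CheckTable;
   tables x / nocum;   /* the percentages should be close to 10, 10, 20, 10, 30, 20 */
run;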

The post Tips to simulate binary and categorical variables appeared first on The DO Loop.

October 5, 2020
 

Finite-precision computations can be tricky. You might know, mathematically, that a certain result must be non-negative or must be within a certain interval. However, when you actually compute that result on a computer that uses finite-precision, you might observe that the value is slightly negative or slightly outside of the interval. This frequently happens when you are adding numbers that are not exactly representable in base 2. One of the most famous examples is the sum
    0.1 + 0.1 + 0.1 ≠ 0.3 (finite precision),
which is shown to every student in Computer Science 101. Other examples include:

  • If x is in the interval [0,1], then y = 1 - x¹⁰ is also in the interval [0,1]. That is true in exact precision, but not necessarily true in finite-precision computations when x is close to 1.
  • Although sin²(t) + cos²(t) = 1 in exact arithmetic for all values of t, the equation might not be true in finite precision.

SAS programs that demonstrate the previous finite-precision computations are shown at the end of this article. Situations like these can cause problems when you want to use the result in a function that has a restricted domain. For example, the SQRT function cannot operate on negative numbers. For probability distributions, the quantile function cannot operate on numbers outside the interval [0,1].

This article discusses how to "trap" results that are invalid because they are outside of a known interval. You can use IF-THEN/ELSE logic to catch an invalid result, but SAS provides more compact syntax. This article discusses using the IFN function in Base SAS and using the "elementwise minimum" (<>) and "elementwise maximum" (><) operators in the SAS/IML language.

Trap and map

I previously wrote about the "trap and cap" technique for handling functions like log(x). The idea is to "trap" invalid input values (x ≤ 0) and "cap" the output when you are trying to visualize the function. For the present article, the technique is "trap and map": you need to detect invalid computations and then map the result back into the interval that you know is mathematically correct. For example, if the argument is supposed to represent a probability in the interval [0,1], you must map negative numbers to 0 and map numbers greater than 1 to 1.

How to use the IFN function in SAS

Clearly, you can "trap and map" by using IF-THEN/ELSE logic. For example, suppose you know that a variable, x, must be in the interval [0,1]. You can use the SAS DATA step to trap invalid values and map them into the interval, as follows:

/* trap invalid values of x and map them into [0,1]. Store result in y */
data Map2;
input x @@;
if x<0 then 
   y = 0;
else if x>1 then 
   y = 1;
else 
   y = x;
datalines;
1.05 1 0.9 0.5 0 -1e-16 -1.1
;

This program traps the invalid values, but it requires six lines of IF-THEN/ELSE logic. A more compact syntax is to use the IFN function. The IFN function has three arguments:

  • The first argument is a logical expression, such as x < 0.
  • The second argument is the value to return when the logical expression is true.
  • The third argument is the value to return when the logical expression is false.

For example, you can use the function call IFN(x<0, 0, x) to trap negative values of x and map those values to 0. Similarly, you can use the function call IFN(x>1, 1, x) to trap values of x that are greater than 1 and map those values to 1. To perform both of the trap-and-map operations on one line, you can nest the function calls so that the third argument to the IFN function is itself a function call, as follows:

data Map2;
input x @@;
y = ifn(x<0, 0, ifn(x>1, 1, x));   /* trap and map: force x into [0,1] */
 
/* or, for clarity, split into two simpler calls */
temp = ifn(x>1, 1, x);             /* force x <= 1 */
z = ifn(x<0, 0, temp);             /* force x >= 0 */
datalines;
1.05 1 0.9 0.5 0 -1e-16 -1.1
;
 
proc print noobs; run;

The output shows that the program performed the trap-and-map operation correctly. All values of y are in the interval [0,1].

If you can't quite see how the logic works in the statement that nests the two IFN calls, you can split the logic into two steps as shown later in the program. The first IFN call traps any variables that are greater than 1 and maps them to 1. The second IFN call traps any variables that are less than 0 and maps them to 0. The z variable has the same values as the y variable.

How to use the element min/max operators in SAS/IML

The SAS/IML language contains operators that perform similar computations. If x is a vector of numbers, then

  • The expression (x <> 0) returns a vector that is the elementwise maximum between the elements of x and 0. In other words, any elements of x that are less than 0 get mapped to 0.
  • The expression (x >< 1) returns a vector that is the elementwise minimum between the elements of x and 1. In other words, any elements of x that are greater than 1 get mapped to 1.
  • You can combine the operators. The expression (x <> 0) >< 1 forces all elements of x into the interval [0,1]. Because the elementwise operators are commutative, you can also write the expression as 0 <> x >< 1.

These examples are shown in the following SAS/IML program:

proc iml;
x = {1.05, 1, 0.9, 0.5, 0, -1e-16, -1.1};
y0 = (x <> 0);                     /* force x >= 0 */
y1 = (x >< 1);                     /* force x <= 1 */
y01 = (x <> 0) >< 1;               /* force x into [0,1] */
print x y0 y1 y01;

Summary

In summary, because of finite-precision computations (or just bad input data), it is sometimes necessary to trap invalid values and map them into an interval. Using IF-THEN/ELSE statements is clear and straightforward, but it requires multiple lines of code. You can perform the same logical trap-and-map calculations more compactly by using the IFN function in Base SAS or by using the elementwise minimum and maximum operators in SAS/IML.
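
As a final illustration, here is a small sketch (with a made-up value) that ties the technique back to the SQRT example from the introduction: trap a slightly negative argument and map it to 0 before taking the square root.

data _null_;
   x = -1e-16;                   /* mathematically 0, but slightly negative in finite precision */
   y = sqrt( ifn(x<0, 0, x) );   /* trap and map before calling SQRT */
   put y=;
run;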

Appendix: Examples of floating-point computations to study

The following SAS DATA step programs show typical computations for which the floating-point computation can produce results that are different from the expected mathematical (full-precision) computation.

data FinitePrecision;
z = 0.1 + 0.1 + 0.1;
if z^=0.3 then 
   put 'In finite precision: 0.1 + 0.1 + 0.1 ^= 0.3';
run;
 
data DomainError1;
do t = -4 to 4 by 0.1;
   x = cos(t);
   y = sin(t);
   z = x**2 + y**2;  /* always = 1, mathematically */
   s = sqrt(1 - z);  /* s = sqrt(0) */
   output;
end;
run;
 
data DomainError2;
do x = 0 to 1 by 0.01;
   z = 1 - x**10;  /* always in [0,1], mathematically */
   s = sqrt(z);  
   output;
end;
run;

September 23, 2020
 

Many textbooks and research papers present formulas that involve recurrence relations. Familiar examples include:

  • The factorial function: Set Fact(0)=1 and define Fact(n) = n*Fact(n-1) for n > 0.
  • The Fibonacci numbers: Set Fib(0)=1 and Fib(1)=1 and define Fib(n) = Fib(n-1) + Fib(n-2) for n > 1.
  • The binomial coefficients (combinations of "n choose k"): For a set that has n elements, set Comb(n,0)=1 and Comb(n,n)=1 and define Comb(n,k) = Comb(n-1, k-1) + Comb(n-1, k) for 0 < k < n.
  • Time series models: In many time series models (such as the ARMA model), the response and error terms depend on values at previous time points. This leads to recurrence relations.

Sometimes the indices begin at 0; other times they begin at 1. Working efficiently with sequences and recurrence relations can be a challenge. The SAS DATA step is designed to process one observation at a time, but a recurrence relation requires looking at past values. If you use the SAS DATA step to work with recurrence relations, you need to use tricks to store and refer to previous values. Another challenge is indexing: most SAS programmers are used to one-based indexing, whereas some formulas use zero-based indexing.

This article uses the Fibonacci numbers to illustrate how to deal with some of these issues in SAS. There are several definitions of the Fibonacci numbers, but we'll look at the definition that uses zero indexing: F(0) = 1, F(1) = 1, and F(n) = F(n-1) + F(n-2) for n > 1. In this article, I will use n ≤ 7 to demonstrate the concepts.

Recurrence relations by using arrays

The SAS DATA step supports arrays, and you can specify the indices of the arrays. Therefore, you can specify that an array index starts with zero. The following DATA step generates "wide" data. There is one observation, and the variables are named F0-F7.

data FibonacciWide;
array F[0:7] F0-F7;  /* index starts at 0 */
F[0] = 1;            /* initialize first values */
F[1] = 1;
do i = 2 to 7;       /* apply recurrence relation */
   F[i] = F[i-1] + F[i-2];
end;
drop i;
run;
 
proc print noobs; run;

You can mentally check that each number is the sum of the two previous numbers. Using an array is convenient and easy to program. Unfortunately, this method is not very useful in practice. In practice, you usually want the data in long form rather than wide form. For example, if you simulate time series data, each observation represents a time point and the values depend on previous time points.

Recurrence relations by using extra variables

In the DATA step, you can use the concept of "lags" to implement a recurrence relation. A lag is any previous value. If we are currently computing the nth value, the first lag is the (n-1)th value, which is the previous observation. The second lag is the (n-2)th value, and so forth. One way to look at previous values is to save them in an extra variable. The following DATA step uses F1 to hold the first lag of F and F2 to hold the second lag of F. Recall that the SUM function will return the sum of the nonmissing values, so sum(F1, F2) is nonmissing if F1 is nonmissing.

data FibonacciLong;
F1 = 1; F2 = .;        /* initialize lags */
i=0; F = F1; output;   /* manually output F(0) */
do i = 1 to 7;         /* now iterate: F(i) = F(i-1) + F(i-2) */
   F = sum(F1, F2);
   output;
   F2 = F1;            /* update lags for the next iteration */
   F1 = F;
end;
run;
 
proc print noobs; 
   var i F F1 F2; 
run;

Recurrence relations by using the LAG function

The DATA step supports a LAGn function. The LAGn function maintains a queue of length n, which initially contains missing values. Every time you call the LAGn function, it pops the top of the queue, returns that value, and adds the current value of its argument to the end of the queue. The LAGn functions can be surprisingly difficult to use in a complicated program because the queue is only updated when the function is called. If you have conditional IF-THEN/ELSE logic, you need to be careful to keep the queues up to date. I confess that I am often frustrated when I try to use the LAG function with recurrence relations. I find it easier to use extra variables.
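
To see why conditional logic and the LAG function do not mix well, consider this small sketch (a contrived example of my own, not from the documentation). Because the queue is updated only when LAG executes, calling it inside an IF branch returns the value from the last time that branch ran, not the value from the previous observation.

data LagPitfall;
   input x @@;
   if mod(_N_, 2) = 0 then xPrev = lag(x);  /* LAG executes only on even-numbered rows */
   datalines;
10 20 30 40 50
;
 
proc print data=LagPitfall noobs; run;
/* On the 4th observation, xPrev is 20 (the value from row 2), not 30 (the previous row) */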

But since the documentation for the LAGn function includes the Fibonacci numbers among its examples, I include it here. Notice the clever use of calling LAG(F) and immediately assigning the new value of F, so that the program only needs to compute the first lag and does not use any extra variables:

/* modified from documentation for the LAG function */
data FibonacciLag;
i=0; F=1; output;       /* initialize and output F[0] */
/* lag1(F) is missing when _N_=1, but equals F[i-1] in later iters */
do i = 1 to 7;
   F = sum(F, lag(F));  /* iterate: F(i) = F(i-1) + F(i-2) */
   output;
end;
run;
 
proc print noobs; run;

The values are the same as for the previous program, but notice that this program does not use any extra variables to store the lags.

Recurrence relations in SAS/IML

Statistical programmers use the SAS/IML matrix language because of its power and flexibility, but also because you can vectorize computations. A vectorized computation is one in which an iterative loop is replaced by a matrix-vector computation. Unfortunately, for many time series computations, you cannot completely get rid of the loops because the series is defined in terms of a recurrence relation. You cannot compute the nth value until you have computed the previous values.

The easiest way to implement a recurrence relation in the SAS/IML language is to use the DATA step program for generating the "wide form" sequence. The PROC IML code is almost identical to the DATA step code:

proc iml;
N = 8;
F = j(1, N, 1);            /* initialize F[1]=F[2]=1 */
do i = 3 to N;             
   F[i] = F[i-1] + F[i-2]; /* overwrite F[i] with the sum of previous terms, i > 2 */
end;
labls = 'F0':'F7';
print F[c=labls];

The program is efficient and straightforward. The output is not shown. The SAS/IML language does not support 0-based indexing, so for recurrence relations that start at 0, I usually define the indexes to match the recurrence relation and add 1 to the subscripts of the IML vectors. For example, if you index from 0 to 7 in the previous program, the body of the loop becomes F[i+1] = F[i] + F[i-1].
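
For the Fibonacci example, that shifted-subscript version looks like the following sketch; the vector element F[i+1] holds the zero-indexed value F(i).

proc iml;
N = 8;
F = j(1, N, 1);             /* F[1] and F[2] hold F(0) and F(1) */
do i = 2 to N-1;            /* i is the 0-based index of the recurrence */
   F[i+1] = F[i] + F[i-1];  /* vector subscript = 0-based index + 1 */
end;
labls = 'F0':'F7';
print F[c=labls];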

Linear recurrence relations and matrix iteration

The Fibonacci sequence is an example of a linear recurrence relation. For a linear recurrence relation, you can use matrices and vectors to generate values. You can define the Fibonacci matrix to be the 2 x 2 matrix with values {0 1, 1 1}. The Fibonacci matrix transforms a vector {x1, x2} into the vector {x2, x1+x2}. In other words, it moves the second element to the first element and replaces the second element with the sum of the elements. If you iterate this transformation, the iterates generate the Fibonacci numbers, as shown in the following statements:

M = {0 1,                  /* Fibonacci matrix */
     1 1};
v = {1, 1};                /* initial state */
F = j(1, N, 1);            /* initialize F[1]=1 */
do i = 2 to N;             
   v = M*v;                /* generate next Fibonacci number */
   F[i] = v[1];            /* save the number in an array */
end;
print F[c=labls];

You can use this method to generate sequences for any linear recurrence relation. The previous methods work for both linear and nonlinear recurrence relations. The SAS/IML language also supports a LAG function, although it works differently from the DATA step function of the same name.
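
For completeness, here is a small sketch of the SAS/IML LAG function, which operates on an entire vector at once rather than on one value per DATA step iteration (the data values are arbitrary):

proc iml;
x = {1, 2, 3, 4, 5};
lag1 = lag(x);         /* first lag:  { ., 1, 2, 3, 4 } */
lag2 = lag(x, 2);      /* second lag: { ., ., 1, 2, 3 } */
print x lag1 lag2;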

Summary

This article shows a few ways to work with recurrence relations in SAS. You can use DATA step arrays to apply relations for "wide form" data. You can use extra variables or the LAG function for long form data. You can perform similar computations in PROC IML. If the recurrence relation is linear, you can also use matrix-vector computations to apply each step of the recurrence relation.

The post Working with recurrence relations in SAS appeared first on The DO Loop.

September 17, 2020
 

Unquote by removing matching quotes

Before we delve into unquoting SAS character variables, let’s briefly review existing SAS functionality related to quoting and unquoting character strings.

%QUOTE and %UNQUOTE macro functions

Don’t be fooled by these macro functions’ names. They have nothing to do with quoting or un-quoting character variables’ values. Moreover, they have nothing to do with quoting or un-quoting even macro variables’ values. According to the %QUOTE macro function documentation, it masks special characters and mnemonic operators in a resolved value at macro execution. The %UNQUOTE macro function unmasks all special characters and mnemonic operators so they are interpreted as macro language elements instead of as text. There are many other SAS “macro quoting functions” (%SUPERQ, %BQUOTE, %NRBQUOTE, all macro functions whose names start with %Q: %QSCAN, %QSUBSTR, %QSYSFUNC, etc.) that perform some action that includes masking.

Historically, however, SAS Macro Language uses terms “quote” and “unquote” to denote “mask” and “unmask”. Keep that in mind when reading SAS Macro documentation.

QUOTE function

Most SAS programmers are familiar with the QUOTE function that adds quotation marks around a character value. It can add double quotation marks (by default) or single quotation marks if you specify that in its second argument.

This function goes even further as it doubles any quotation mark that already existed within the value to make sure that an embedded quotation mark is escaped (not treated as an opening or closing quotation mark) during parsing.
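
For example, this short sketch (my own illustration) shows both behaviors: the default double quotation marks, and the doubling of an embedded quotation mark when you ask for single quotation marks:

data _null_;
   s = "Tom's dog";
   q1 = quote(s);          /* "Tom's dog"  -- wrapped in double quotation marks    */
   q2 = quote(s, "'");     /* 'Tom''s dog' -- embedded single quotation is doubled */
   put q1= / q2=;
run;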

DEQUOTE function

There is also a complementary DEQUOTE function that removes matching quotation marks from a character string that begins with a quotation mark. But be warned that it also deletes all characters to the right of the first matching quotation mark. In my view, deleting those characters is overkill because, when writing a SAS program, we may not know what is going to be in the data and whether it’s okay to delete the part of a value outside the first matching quotation marks. That is why you need to be extra careful if you decide to use this function. Here is an example of what I mean. If you run the following code:

data a;
   input x $ 1-50;
   datalines;
'This is what you get. Let's be careful.'
;
 
data _null_;
   set a;
   y = dequote(x);
   put x= / y=;
run;

you will get the following in the SAS log:

y=This is what you get. Let

This is hardly what you really wanted, as you have just lost valuable information -- part of the y character value got deleted: 's be careful. I would rather not remove the quotation marks at all than remove them at the expense of losing meaningful information.

$QUOTE informat

The $QUOTE informat does exactly what the DEQUOTE() function does; that is, it removes matching quotation marks from a character string that begins with a quotation mark. You can use it in the example above by replacing

y = dequote(x);

with the INPUT() function

y = input(x, $quote50.);

Or you can use it directly in the INPUT statement when reading raw data from datalines or an external file:

input x $quote50.;

Both the $QUOTE informat and the DEQUOTE() function, in addition to removing all characters to the right of the closing quotation mark, do the following unconventional, peculiar things:

  • Remove a lone quotation mark (either double or single) when it’s the only character in the string; apparently, the lone quotation mark is matched to itself.
  • Match single quotation mark with double quotation mark as if they are the same.
  • Remove matching quotation marks from a character string that begins with a quotation mark; if your string has one or more leading blanks (that is, a quotation mark is not the first character), nothing gets removed (un-quoted).

If the described behavior matches your use case, you are welcome to use either $QUOTE informat or DEQUOTE() function. Otherwise, please read on.

UNQUOTE function definition

Up to this point such a function did not exist, but we are about to create one to justify the title. Let’s keep it simple and straightforward. Here is what I propose our new UNQUOTE() function should do:

  • If the first and last non-blank characters of a character string value are matching quotation marks, we will remove them. We will not consider quotation marks matching if one of them is a single quotation mark and the other is a double quotation mark.
  • We will remove those matching quotation marks whether they are both single quotation marks OR both double quotation marks.
  • We are not going to remove or change any other quotation marks that may be present within those matching quotation marks that we remove.
  • We will remove leading and trailing blanks outside the matching quotation marks that we delete.
  • However, we will not remove any leading or trailing blanks within the matching quotation marks that we delete. You may additionally apply the STRIP() function if you need to do that.

To summarize these specifications, our new UNQUOTE() function will extract a character substring within matching quotation marks if they are the first and the last non-blank characters in a character string. Otherwise, it returns the character argument unchanged.

UNQUOTE function implementation

Here is how such a function can be implemented using PROC FCMP:

libname funclib 'c:\projects\functions';
 
proc fcmp outlib=funclib.userfuncs.v1; /* outlib=libname.dataset.package */
   function unquote(x $) $32767;
      pos1 = notspace(x); *<- first non-blank character position;
      if pos1=0 then return (x); *<- empty string;
 
      char1 = char(x, pos1); *<- first non-blank character;
      if char1 not in ('"', "'") then return (x); *<- first non-blank character is not " or ' ;
 
      posL = notspace(x, -length(x)); *<- last non-blank character position;
 
      if pos1=posL then return (x); *<- single character string;
 
      charL = char(x, posL); *<- last non-blank character;
      if charL^=char1 then return (x); *<- last non-blank character does not match first;
 
      /* at this point we should have matching quotation marks */
      return (substrn(x, pos1 + 1, posL - pos1 - 1)); *<- remove first and last quotation character;
   endfunc; 
run;

Here are the highlights of this implementation:

We use multiple RETURN statements: we sequentially check for different special conditions, and if one of them is met, we return the argument value intact. The RETURN statement does not just return the value, but also stops any further function execution.

At the very end, after making sure that none of the special conditions is met, we strip the argument value from the matching quotation marks along with the leading and trailing blanks outside of them.

NOTE: SAS user-defined functions are stored in a SAS data set specified in the OUTLIB= option of PROC FCMP. It requires a 3-level name (libref.datasetname.packagename) for the function definition location to allow for several versions of the same-name function to be stored there.

However, when a user-defined function is used in a SAS DATA Step, only a 2-level name can be specified (libref.datasetname). If that data set has several same-name functions stored in different packages the DATA Step uses the latest function definition (found in a package closest to the bottom of the data set).

UNQUOTE function results

Let’s use the following code to test our newly minted user-defined function UNQUOTE():

libname funclib 'c:\projects\functions';
options cmplib=funclib.userfuncs;
 
data A;
   infile datalines truncover;
   input @1 S $char100.;
   datalines;
'
"
How about this?
    How about this?
"How about this?"
'How about this?'
"How about this?'
'How about this?"
"   How about this?"
'      How about this?'
'      How "about" this?'
'      How 'about' this?'
   "     How about this?"
   "     How "about" this?"
   "     How 'about' this?"
   '     How about this?'
;
 
data B;
   set A;
   length NEW_S $100;
   label NEW_S = 'unquote(S)';
   NEW_S = unquote(S);
run;

This code produces the following output table:

Example of character string unquoting
As you can see, it does exactly what we wanted it to do -- removing matching first and last quotation marks as well as stripping out blanks outside the matching quotation marks.

DSD (Delimiter-Sensitive Data) option

This INFILE statement option is particularly useful when using LIST input to read and un-quote comma-delimited raw data. In addition to removing enclosing quotation marks from character values, the DSD option specifies that when data values are enclosed in quotation marks, delimiters within the value are masked, that is, treated as character data (not as delimiters). It also sets the default delimiter to a comma and treats two consecutive delimiters as a missing value.

In contrast with the UNQUOTE() function above, the DSD option will not remove enclosing quotation marks if the same quotation mark also appears inside the character value. When the DSD option does strip enclosing quotation marks, it also strips leading and trailing blanks outside and within the removed quotation marks.
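
Here is a small sketch (with made-up data) that shows the DSD behavior described above: a delimiter inside quotation marks is kept as data, doubled quotation marks survive as a single embedded quotation mark, and two consecutive commas produce a missing value.

data dsd_demo;
   infile datalines dsd truncover;   /* DSD: default delimiter is a comma */
   input name :$20. city :$15. note :$30.;
   datalines;
"Smith, John",Cary,"He said ""hello"""
Jones,,Raleigh
;
 
proc print data=dsd_demo noobs; run;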

Additional Resources

Your thoughts?

Have you found this blog post useful? Please share your use cases, thoughts and feedback in the comments below.

How to unquote SAS character variable values was published on SAS Users.

September 2, 2020
 

SAS offering free learning resources in celebration of programmers

For more than 40 years, SAS programmers have crafted software and solutions that transform the world. From statistics to data science, to analytics and artificial intelligence, people writing code have architected a new economy with incredible opportunities. SAS Programmer Week honors those people by offering free learning resources available for everyone, from students to early career professionals to SAS veterans.

Running from Sept. 7-11, SAS Programmer Week leads up to the international Day of the Programmer on Saturday, Sept. 12. Training resources will be available for free in a variety of formats, including YouTube video tutorials, webinars, blogs and documentation.

There will be three different tracks for new, experienced and analytics-focused users, with new content released each day. The week culminates with SAS certification prep content that will have participants ready to pursue a valuable SAS credential.

For instance, Tech Republic named SAS as one of 7 data science certifications to boost your resume and salary. CIO Magazine puts SAS among the top 11 big data and analytics certifications for 2020.

Since SAS programmers are busy and may not have all day to engage with the materials, SAS Programmer Week is flexible. Participants can access the material when they want to learn a specific skill related to the day’s topic or consume the material in snippets when they have time.

Interested participants can visit the SAS Programmer Week website to register today, preview the materials and schedule, and jump-start their career journeys.

 

All hail the SAS programmer! was published on SAS Users.

August 24, 2020
 

I got a lot of feedback about my recent article about how to find roots of nonlinear functions by using the SOLVE function in PROC FCMP. A colleague asked how the FCMP procedure stores the functions. Specifically, why does the OUTLIB= option on the PROC FCMP statement use a three-level syntax: OUTLIB=libref.DataSetName.PackageName? The three levels are a libref, a data set name, and a package name. The documentation is terse about what the third level (the package name) is used for and why it is important. This article describes how the FCMP-defined functions are stored, and how you can use the package name to call different versions of a function.

This article is my attempt to "reverse engineer" how PROC FCMP stores functions based on what I have read and observed. In addition to the FCMP documentation, I recommend reading Secosky (2007) and Eberhardt (2009). Feel free to add your own knowledge in the comments.

How FCMP functions are stored

I started writing about the capabilities of the FCMP procedure in 2012, but the procedure itself goes back to SAS 9.2. Modern versions of SAS store functions in an analytic store (which is read by using PROC ASTORE) or in an item store (which is read by using PROC PLM). But these binary storage formats had not yet been developed back in the pre-9.2 days. So PROC FCMP stores functions in a SAS data set. That means you can use PROC PRINT to investigate how PROC FCMP stores functions.

When you use the OUTLIB= option in PROC FCMP, you specify a three-level name: OUTLIB=libref.DataSetName.PackageName. The first two levels specify the name of a SAS data set. This data set is created if it doesn't exist, or it is modified if it already exists. The third level is used as a text field in a variable named _KEY_, which enables one data set to contain functions that belong to different packages. The package name becomes important if two packages define a function that has the same name.

To demonstrate, let's define some functions and store them in a data set named Work.MyFuncs. The following statements create two functions (A and B) that belong to the 'PROD' (for 'Production') package and one function (A) that belongs to the 'DEV' (for 'Development') package. Notice that both packages have a function named 'A'. The following statements define the functions and use PROC PRINT to display a portion of the Work.MyFuncs data set:

/* Store all functions in the data set WORK.MyFuncs */
/* Define functions in 'PROD' package */
proc fcmp outlib=work.MyFuncs.Prod;
   function A(x);
      return( x );      /* in the 'Prod' pkg, A(x) equals x */
   endsub;
   function B(x);
      return( x > 0 );
   endsub;
quit;
 
/* Define functions in 'DEV' package */
proc fcmp outlib=work.MyFuncs.Dev;
   function A(x);
      return( 2*x );    /* the 'Dev' pkg uses a different definition for A(x) */
   endsub;
quit;
 
proc print data=work.MyFuncs;
   var _Key_ Sequence Type Subtype Name;
run;

The output from PROC PRINT is shown. The data set contains 20 rows. I have put a red rectangle around rows 1–13 and another around rows 14–20. Each rectangle defines a package. The names of the packages are defined by the observations where Subtype='Package', which are highlighted in yellow. The Type, Subtype, and Name columns indicate how the FCMP statements that define the functions are stored in the data set. The _KEY_ column identifies which rows define which functions. There are other columns (not shown) that store the actual content of each function.

This output shows how the third level of the OUTLIB= option is used. The _KEY_ column records the package name and appends each function name ('A' or 'B') to the name of the package. So PROC FCMP knows that there are three stored functions whose full names are PROD.A, PROD.B, and DEV.A.

Calling a function from the DATA step

Since there are two functions called 'A', what happens if I call 'A' from a SAS DATA step? The answer is that the DATA step uses the most recent definition, which in this example is DEV.A. To alert you to the fact that calling 'A' is ambiguous, the SAS log displays a warning. You can also use the _DISPLAYLOC_ flag on the CMPLIB= system option to display the origin of each call to an FCMP function, as follows:

/* Tell the DATA step where to look for unresolved functions.
   The _DISPLAYLOC_ flag shows the full name for each call to an FCMP function */
options cmplib=(work.MyFuncs _DISPLAYLOC_); 
data Want;
   x = 1; y = A(x);   /* y is the result of the latest definition */
run;
 
proc print data=Want noobs; run;
WARNING: Function 'A' was defined in a previous package. 'A' in current
         package DEV will be used as default when the package name is not
         specified.
 
NOTE: Function 'A' loaded from work.MyFuncs.DEV.

The values of the X and Y variables make it clear that the function DEV.A was called (because A(x)=2*x in that definition). The WARNING and NOTE in the SAS log reinforce this fact.

Choosing which package to call

The WARNING in the previous section says that the current (most recent) package "will be used as default when the package name is not specified." This message seems to imply that you can somehow call PROD.A, which is the other stored function that is named 'A'. This is, in fact, true. The PROC FCMP documentation states, "to select a specific subroutine when there is ambiguity, use the package name and a period as the prefix to the subroutine name."

You cannot specify the package name directly in the DATA step, but you can specify the package name in an FCMP function. So, for example, you can define a function called 'ChooseA' that includes a flag that indicates which package to use. The following PROC FCMP statements define a function that will call either PROD.A or DEV.A, depending on the value of a parameter. This wrapper function can then be called in the DATA step:

/* In PROC FCMP, you can "dis-ambiguate" by using a two-level function name */
proc fcmp outlib=work.MyFuncs.Choose;
   function ChooseA(x, choice $);
      if upcase(choice)="DEV" then
        return( Dev.A(x) );
      else
        return( Prod.A(x) );
   endsub;
quit;
 
data WantChoice;
   x = 1;
   y_Dev  = ChooseA(x, "Dev");   /* call Dev.A */
   y_Prod = ChooseA(x, "Prod");  /* call Prod.A */
run;
 
proc print data=WantChoice noobs; run;

From the definitions of DEV.A and PROD.A, you can verify that each function was called correctly. Because the _DISPLAYLOC_ option is still active, the SAS log also indicates that each function was called.

Summary

This article was motivated by a question about how the FCMP procedure stores functions. The answer is that the OUTLIB= option on the PROC FCMP statement requires a libref, a data set name, and a package name. In most circumstances, you do not need to use the package name. The package name becomes important only if two different packages each support a function that has the same name. In that case, you can use the package name to disambiguate the function call.

Personally, I prefer to avoid having two packages that define the same function, but if you cannot avoid it, this trick shows you how to handle it. Eberhardt (2009, p. 15) discusses a related issue, which is how to call functions that are stored in two (or more) different data sets.
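
Although that topic deserves its own discussion, the basic mechanism is the CMPLIB= system option, which accepts a list of function data sets to search. A minimal sketch follows; the second data set name is hypothetical.

/* search more than one FCMP function data set; work.TeamFuncs is a hypothetical second library */
options cmplib=(work.MyFuncs work.TeamFuncs);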

The post How does PROC FCMP store functions? appeared first on The DO Loop.