March 21, 2019

SAS Decision Manager enables you to build and test decisions to use in batch processes, real-time web applications or with SAS ESP.

In this blog, I explain how to use Rulesets in an Event Stream Processing (ESP) project. If you are streaming data using SAS ESP and your data stream involves making decisions, you can build Rulesets in SAS Decision Manager and use them in your event stream project. ESP can invoke the code generated by SAS Decision Manager and execute it in its Micro Analytic Service (MAS) engine.

Receiving code for Rulesets

To use a Ruleset in Decision Manager within an event stream project in ESP, you need to export the DS2 code generated by Decision Manager and point ESP towards the code to execute it. To export code from Decision Manager, we use the SAS Decision Manager Viya REST API to:
• Obtain an access token to SAS Viya
• Receive the ID for the required Ruleset
• Receive the Decision Manager DS2 code via the Ruleset ID

Obtain an access token to SAS Viya

Before using SAS Viya APIs, your SAS administrator must register a client identifier. The SAS Logon OAuth API uses OAuth2 to securely identify your application before it connects to the SAS Viya platform. See Registering clients for information on how clients are registered. Once a client is successfully registered, the SAS administrator provides you with the client identifier and client secret to authenticate an API request.
To obtain an access token call:

If successfully executed, you will get an access token for all further REST calls.
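As a rough sketch of what such a token request can look like, the snippet below posts to the SAS Logon OAuth token endpoint with curl and then pulls the token out of the JSON response. The host name, client identifier, client secret and the sample response are all placeholders of mine, and the sed-based extraction assumes a single-line JSON response, so treat it as an illustration rather than the exact call.

```shell
#!/bin/sh
# All values below are placeholders, not values from this article.
VIYA_HOST="https://viya.example.com"
CLIENT_ID="myclient"
CLIENT_SECRET="mysecret"

# Request a token with the OAuth2 password grant.
# The function is only defined here; call it as: request_token USER PASSWORD
request_token() {
    curl -s -X POST "$VIYA_HOST/SASLogon/oauth/token" \
        -u "$CLIENT_ID:$CLIENT_SECRET" \
        -H "Content-Type: application/x-www-form-urlencoded" \
        -d "grant_type=password" \
        --data-urlencode "username=$1" \
        --data-urlencode "password=$2"
}

# Pull the access_token field out of a single-line JSON response.
extract_access_token() {
    printf '%s\n' "$1" |
        sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example with a made-up response body:
extract_access_token '{"access_token":"abc123","token_type":"bearer"}'
```

In practice you would pass the output of request_token to extract_access_token and send the result in an "Authorization: Bearer" header on the later calls.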

Receive the Ruleset ID

We need the ID for the Ruleset we want to use in ESP. The REST Endpoint requires the ID to receive the DS2 code.
To get the ID, call the Endpoint that lists all available Rulesets:

If successfully executed, you will receive the Ruleset ID in the field “id” in the “items” list.

Receive Ruleset code

With this new ID, we can export the DS2 code for the Ruleset.
To get the code, call the appropriate Ruleset Endpoint:
Set ID to the value of this new Ruleset ID.

If executed successfully, you will receive the DS2 code for the Ruleset.

Preparing the code

Copy the DS2 code from the REST call into a file, save the file with a descriptive name (e.g. the name of the Ruleset) and move it to a location where ESP can access it.

Invoke Decision Manager Code in ESP

Now that we have saved the code into a file and moved it to a location that ESP can access, we can invoke the code from our ESP project.

We need to register the ruleset code file we saved.
Open the ESP project and go to Micro Analytic Service Modules at the project level.

Add a new Micro Analytic Service Module for the ruleset code file and fill in all required fields.

To invoke the code in the event stream, add a Calculate Window.
In Settings choose Calculation = User-specified.

Under Handlers, select the source and ensure the field values are set correctly.

Set the fields for the output schema of the calculate window. Note that the field names and types must match the names and types used in the Ruleset.
Save the project.

You are now ready to run your project in Test mode to check if it works.


SAS Decision Manager allows you to build decisions in an environment independent of ESP. This gives you the freedom to design and test decisions in a less technical environment without touching the event stream. After testing the decision, you can simply “hook it in” to your event stream.
Other users can work on and update decisions by just applying a new or updated code file. This makes your event stream more flexible and easier to maintain. To learn more, please check out these sources.

Video: SAS Decision Manager
Article: Using SAS Decision Manager to enrich the data prep process

Calling SAS Decision Manager Rulesets in ESP was published on SAS Users.

March 7, 2019

As of December 2018, any customer with a valid SAS Viya order is able to package and deploy their SAS Viya software in Docker containers. SAS has provided a fully documented and supported project (or “recipe”) for easily building these containers. So how can you start? You can simply stop reading this article and go directly to the GitHub repository and follow the instructions there. Otherwise, in this article, Jeff Owens, a solutions architect at SAS, provides a little color commentary around the process in case it is helpful…

First of all, what is the point of these containers?

Well, at its core, remember that SAS and its massively parallel, in-memory counterpart, Cloud Analytic Services (CAS), form a powerful runtime for data processing and analytics. A runtime is simply an engine responsible for processing and executing a particular type of code (i.e. SAS code). Traditionally, the SAS runtime would live on a centralized server somewhere and users would submit their “jobs” to that SAS runtime (server) in a variety of ways. The SAS server supports a number of different products, tasks, etc., but for this discussion let’s just focus on the scenario where a job is a “.sas” file, perhaps developed in an IDE like Enterprise Guide or SAS Studio, and submitted to the SAS runtime engine via the IDE itself, a bash shell, or maybe even SAS’ enterprise-grade scheduler and job management solution, SAS Grid. In these cases, the SAS and CAS servers are on dedicated, always-on physical servers.

The brave new containerized world in which we live provides us a new deployment model: submit the job and create the runtime server at the same time. Plus, only consume the exact resources from the host machine or the Kubernetes cluster the specific job requires. And when the job finishes, release those resources for others to use. Kubernetes and PaaS clusters are quite likely shared environments, and one of the major themes in the rise of the containers is the further abstraction between hardware and software. Some of that may be easier said than done, particularly for customers with very large volumes of jobs to manage, but it is indeed possible today with SAS Viya on Docker/Kubernetes.

Another effective (and more immediate) usage of this containerized version of SAS Viya is simply an ad hoc, on-demand, temporary development environment. The container package includes SAS Studio, so one can quickly spin up a full SAS Viya programming sandbox: SAS Studio as well as the SAS & CAS runtimes. Here they can develop and test SAS code, and just as quickly tear the environment down when it is no longer needed. This is useful for users who: (a) don’t have access to an “always-on” environment for whatever reason, (b) want to try out experimental code that could potentially consume resources from a shared “always-on” SAS environment, and/or (c) have a Kubernetes cluster with many more resources available than their always-on environment and want to try a BIG job.

Yes, it is possible to deploy the entire SAS Viya stack (microservices and all) via Kubernetes but that discussion is for another day. This post focuses strictly on the SAS Viya programming components and running on a single machine Docker host rather than a Kubernetes cluster.

Build the container image

I will begin here with a fresh single-machine RHEL 7.5 server running on OpenStack. But this machine could have been running on any cloud or VM platform, and I could use any (modern enough) flavor of Linux thanks to how Docker works. My machine here has 8 CPUs, 16GB RAM, and a 50GB root volume. Less or more is fine. A couple of notes to help understand how to configure an instance:

  • The final Docker container image we will end up with will be ~10GB in size and, like all Docker images, will live under /var/lib/docker by default.
    • Yes, that is large for a container. Most of this size is just static bins and libs that support the very developed SAS language. Compare to an Anaconda image which is ~3.6GB.
  • As for RAM, remember any tables loaded to CAS are loaded to memory (and will swap to disk as needed). So, your memory choice should be directly dependent on the data sizes you expect to work with.
  • Similar story for cores – CAS code is multithreaded, so more cores = more parallelization.

The first step is to install Docker.

Following along with sas-container-recipes now, the first thing I should do is mirror the repo for my order. Note, this is not a required step – you could build this container directly from SAS repos if you wanted, but we’ll mirror as a best practice. We could simply mirror and serve it over the local filesystem of our build host, but since I promised color I’ll serve it over the web instead. So, these commands run on a separate RHEL server. If you choose to mirror on your build host, make sure you have the disk space (~30GB should be plenty). You will also need your file available on the SAS Customer Support site. Run the following code to execute the setup.

$ wget
$ tar xf mirrormgr-linux.tgz
$ rm -f mirrormgr-linux.tgz
$ mkdir -p /repos/viyactr
$ mirrormgr mirror --deployment-data --path /repos/viyactr --platform x64-redhat-linux-6 --latest
$ yum install httpd -y
$ systemctl start httpd
$ systemctl enable httpd
$ ln -s /repos/viyactr /var/www/html/sas_repo

Next, I go ahead and clone the sas-container-recipes repo locally and upload my file and I am ready to run the build command. As a bonus, I am also going to use my site’s (SAS’) sssd.conf file so my container will use our corporate Active Directory for authentication. If you do not need or want that integration you can skip the “vi addons/sssd.conf” line and change the “--addons” option to “addons/auth-demo” so your container seeds with a single “sasdemo:sasdemo” user:password instead.

$ # upload to this machine somehow
$ git clone
$ cd sas-container-recipes/
$ vi addons/sssd.conf # <- paste in your site’s sssd.conf file
$ \
--type single \
--zip ~/ \
--mirror-url \
--addons "addons/auth-sssd"

The build should take about 45 minutes and produce a single container image for you (there might be a few images, but it is just one with a thin layer or two on top). You might want to give this image a new name (docker tag) and push it into your own private registry (docker push). Aside from that, we are ready to run it.
If you are curious, look in the addons directory for the other optional layers you can add to your container. Several tools are available for easily configuring connections to external databases.

Run the container

Here is the run command we can use to launch the container. Note the image name I use here is “sas-viya-programming:xxxxxx” – this is the image that has my sssd layer built on top of it.

$ docker run \
--detach \
--rm \
--env CASENV_CAS_HOST=$(hostname -f) \
--publish 5570:5570 \
--publish 8081:80 \
--name sas-viya-programming \
--hostname sas-viya-programming \
sas-viya-programming:xxxxxx

Connect to the container

And now, in a web browser, I can go to :8081/SASStudio and I will end up in SAS Studio, where I can sign in with my internal SAS credentials. To stop the container, use the name you gave it: “docker stop sas-viya-programming”. Because we used the “--rm” flag the container will be removed (completely destroyed) when we stop it.

Note we are explicitly mapping in the HTTP port (8081:80) so we easily know how to get to SAS Studio. If you want to start up another container here on the same host, you will need to use a different port or else you’ll get an “address already in use” error. Also note we might be interested in connecting directly to this CAS server from something other than SAS Studio (localhost), a remote Python client for example. We can use the other port we mapped in (5570:5570) to connect to the CAS server.

Persist the data

Running this container with the above command means anything and everything done inside the container (configuration changes, code, data) will not persist if the container stops and a new one is started later. Luckily this is a very standard and easy-to-solve scenario with Docker and Kubernetes. Here are a couple of targets inside the container you might be interested in mounting a volume to:

  • /tmp – this is where CAS_DISK_CACHE is by default, not to mention SASWORK. Those are scratch space used by the runtimes. If you are working with small data and don’t care too much about performance, no need to worry about this. But to optimize your container we would suggest mounting a Docker volume to this location (or, ideally, bind mount a high-performance storage device here). Note that generally Docker prefers us to use Docker volumes in lieu of bind mounts, but that is more for manageability, security, and portability than performance.
  • /data – this directory doesn’t necessarily exist but when you mount a volume into a container the target location will be created. So, you could call this target whatever you want, assuming it doesn’t exist yet. Bind mounting is tempting here and OK to do, but consider the scenario when another user wants to run your container following instructions you provided them: better to use a Docker volume than force them to create the directory on the host. If you have an NFS location, bind mounting that makes sense.
  • /code – same spiel as with /data. Once you are in the container you can save your work and it will persist in the docker volume from run to run of your container.

Here is what an updated docker run command might look like with these volumes included:

$ # /nfsdata below is example syntax for a bind mount instead of a Docker volume mount
$ docker run \
--detach \
--rm \
--env CASENV_CAS_VIRTUAL_HOST=$(hostname -f) \
--volume mydata:/data \
--volume /nfsdata:/nfsdata \
--volume mycode:/code \
--volume sastmp:/tmp \
--publish 5570:5570 \
--publish 8081:80 \
--name sas-viya-programming \
--hostname sas-viya-programming \
sas-viya-programming:xxxxxx

Can I run this on my laptop?

Yes. You would just need to install Docker on your laptop (go to for that). You can certainly follow the instructions from the top to build and run locally. You can even push this container image out to an internal registry so other users could skip the build and just run.

So far, we have only talked about the “ad-hoc” or “sandbox” dev type of use case for this container. A later article may cover how to run in batch mode or maybe we will move straight to multi-containers & Kubernetes. In the meantime though, here is how to submit a .sas program as a batch job to this single container we have built.

Give it a try!

Try creating your own image and deploying a container. Feel free to comment on your experience.

More info:

SAS Communities Article- Running SAS Analytics in a Docker container
SAS Global Forum Paper- Docker Toolkit for Data Scientists – How to Start Doing Data Science in Minutes!
SAS Global Forum Tech Talk Video- Deploying and running SAS in Containers

Getting Started with SAS Containers was published on SAS Users.

March 6, 2019

In automated production (or business operations) environments, we often run SAS job flows in batch mode and on schedule. A SAS job flow is a collection of several inter-dependent SAS programs executed as a single process.

In my earlier posts, Running SAS programs in batch under Unix/Linux and Let SAS write batch scripts for you, I described how you can run SAS programs in batch mode by creating UNIX/Linux scripts that in turn incorporate other script invocations.

In this scenario you can run multiple SAS programs sequentially or in parallel, all while having a single root script kicked off on schedule. The whole SAS processing flow runs like a chain reaction.
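To make the "in parallel" part concrete, here is a minimal sketch of a root script that starts two jobs in the background and then collects each exit code. The job function is a hypothetical stand-in of mine for a real script that would launch a SAS program; it simply returns the code it is given.

```shell
#!/bin/sh
# Hypothetical stand-in for a batch job: returns the exit code it is given.
job() {
    sleep 0
    return "$1"
}

# Start both jobs in the background so they run in parallel,
# remembering each process ID.
job 0 &
pid1=$!
job 1 &
pid2=$!

# wait <pid> blocks until that job ends and returns its exit code.
wait "$pid1"; rc1=$?
wait "$pid2"; rc2=$?

echo "job1 exit code: $rc1"
echo "job2 exit code: $rc2"
```

Running the jobs one after another instead just means dropping the & and the wait calls and reading $? after each job.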

Why and when to stop SAS batch flow process

However, sometimes we need to automatically stop and terminate that chain job flow execution if certain criteria are met (or not met) in a program of that process flow.
Let’s say our first job in a batch flow is a data preparation step (ETL) where we extract data tables from a database and prepare them for further processing. The rest of the batch process is dependent on successful completion of that critical first job. The process is kicked off at 3:00 a.m. daily. However, sometimes we run into a situation where the database connection is unavailable, or the database itself has not finished refreshing, or something else happens that results in the ETL program completing with ERRORs.

This failure means that our data has not updated properly and there is no reason to continue running the remainder of the job flow process as it might lead to undesired or even disastrous consequences. In this situation we want to automatically terminate the flow execution and send an e-mail notification to the process owners and/or SAS administrators informing them about the mishap.

How to stop SAS batch flow process in UNIX/Linux

Suppose we run the following script on UNIX/Linux:

#1 extract data from a database
#2 run the rest of processing flow

The script runs the SAS ETL process as follows:

dtstamp=$(date +%Y.%m.%d_%H.%M.%S)
/sas/SASHome/SASFoundation/9.4/sas $pgmname -log $logname

We want to run the second shell script (which itself runs multiple other scripts) only if the first one completes successfully, that is, if the SAS ETL process it runs completes with no ERRORs or WARNINGs. Otherwise, we want to terminate the script and not run the rest of the processing flow.

To do this, we re-write our script as:

#1 extract data from a database
exitcode=$?
echo "Status=$exitcode (0=SUCCESS,1=WARNING,2=ERROR)"
if [ $exitcode -eq 0 ]
   then
      #2 run the rest of processing flow
fi

In this code, we use a special shell script variable ($? for the Bourne and Korn shells, $STATUS for the C shell) to capture the exit status code of the previously executed OS command, /sas/code/etl/


Then the optional echo command just prints the captured value of that status for our information.

Every UNIX/Linux command executed by the shell script or user has an exit status represented by an integer number in the range of 0-255. The exit code of 0 means the command executed successfully without any errors; a non-zero value means the command was a failure.

The SAS System plays nicely with the UNIX/Linux Operating System. According to the SAS documentation, the exit status code that SAS returns is stored in a shell variable ($? for the Bourne and Korn shells, and $STATUS for the C shell). A value of 0 indicates successful termination. SAS sets the exit status code as follows (the ABORT RETURN and ABORT ABEND statements can also pass a user-chosen code n):

Condition                                             Exit Status Code
All steps terminated normally                         0
SAS issued WARNINGs                                   1
SAS issued ERRORs                                     2
User issued ABORT statement                           3
User issued ABORT RETURN statement                    4
User issued ABORT ABEND statement                     5
SAS could not initialize because of a severe error    6
User issued ABORT RETURN - n statement                n
User issued ABORT ABEND - n statement                 n
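If you want the log or notification to say more than a bare number, the exit status codes listed above translate directly into a small lookup. This is only a sketch: describe_sas_status is a helper name I made up, and treating every other value as a user-chosen ABORT code is my simplification.

```shell
#!/bin/sh
# Map a SAS exit status code to the condition from the list above.
# Any other value is assumed to be a user-chosen code n from
# ABORT RETURN n or ABORT ABEND n (a simplification of this sketch).
describe_sas_status() {
    case "$1" in
        0) echo "All steps terminated normally" ;;
        1) echo "SAS issued WARNINGs" ;;
        2) echo "SAS issued ERRORs" ;;
        3) echo "User issued ABORT statement" ;;
        4) echo "User issued ABORT RETURN statement" ;;
        5) echo "User issued ABORT ABEND statement" ;;
        6) echo "SAS could not initialize because of a severe error" ;;
        *) echo "User-chosen code from ABORT RETURN n or ABORT ABEND n" ;;
    esac
}

describe_sas_status 2
```

A script could then print something like "Status=$exitcode ($(describe_sas_status $exitcode))" instead of a hard-coded legend.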

Since our ETL script executes SAS code, the exit status code is passed by the SAS System to that script and, consequently, to the shell script that calls it.

Then, in the script, we check whether that exit code equals 0 and, only then, run the remaining flow by executing the second shell script. Otherwise, we skip it, and the script exits on reaching its end.

Alternatively, the script can be implemented with an explicit exit as follows:

#1 extract data from a database
exitcode=$?
echo "Status=$exitcode (0=SUCCESS,1=WARNING,2=ERROR)"
if [ $exitcode -ne 0 ]
   then exit
fi
#2 run the rest of processing flow

In this shell script code example, we check the exit return code value, and if it is NOT equal to 0, we explicitly terminate the shell script using the exit command, which gets us out of the script immediately without executing the subsequent commands. In this case, the #2 command that invokes the rest of the flow never gets executed, which effectively stops the batch flow process.
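Here is a self-contained sketch of that pattern that you can run without SAS installed. The run_etl function is a hypothetical stand-in of mine that returns whatever exit code it is given, and the flow is wrapped in a function (using return where a top-level script would use exit) so that both outcomes can be tried in one run.

```shell
#!/bin/sh
# Hypothetical stand-in for the ETL step: returns the exit code it is
# given, the way the SAS executable would (0, 1, 2, ...).
run_etl() {
    return "$1"
}

# Runs the flow, prints which steps ran, and returns the ETL exit code.
run_flow() {
    #1 extract data from a database (simulated)
    run_etl "$1"
    exitcode=$?
    echo "Status=$exitcode (0=SUCCESS,1=WARNING,2=ERROR)"
    if [ "$exitcode" -ne 0 ]
    then
        echo "ETL failed; stopping the flow"
        return "$exitcode"   # in a top-level script: exit $exitcode
    fi
    #2 run the rest of processing flow (simulated)
    echo "running the rest of the flow"
}

run_flow 0
run_flow 2 || echo "flow stopped with code $?"
```

Passing the captured code to exit (rather than a bare exit) lets whatever scheduler launched the root script see why the flow stopped.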

If you also need to automatically send an e-mail notification to the designated people about the failed batch flow process, you can do it in a separate SAS job that runs right before exit command. Then the if-statement will look something like this:

if [ $exitcode -ne 0 ]
   then
      # send an email and exit
      exit
fi

That is, immediately after the email is sent, the shell script and the whole batch flow process are terminated by the exit command; no shell script commands beyond that if-statement will be executed.

A word of caution

Be extra careful if you use the special script variable $? directly in a script's logical expression, without assigning it to an interim variable. For example, you could use the following script command sequence:

if [ $? -ne 0 ]
. . .

However, let’s say you insert another script command between them, for example:

echo "Status=$? (0=SUCCESS,1=WARNING,2=ERROR)"
if [ $? -ne 0 ]
. . .

Then the $? variable in the if [ $? -ne 0 ] statement will have the value of the previous echo command, not the /sas/code/etl/ command as you might expect.

Hence, I suggest capturing the $? value in an interim variable (e.g. exitcode=$?) right after the command whose exit code you are going to inspect, and then referencing that interim variable (as $exitcode) in your subsequent script statements. That will save you from the trouble of inadvertently referring to the wrong exit code when you insert some additional commands during your script development.
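You can see the difference for yourself with a few lines of shell. In this sketch, failing_job is a made-up stand-in that always returns exit code 2, and the two result variables exist only to show what each check concluded.

```shell
#!/bin/sh
# Hypothetical stand-in for a failing job: always returns exit code 2.
failing_job() {
    return 2
}

# Fragile: echo runs between the job and the test, so $? in the test
# is the exit code of echo (0), not of the job.
failing_job
echo "Status=$? (0=SUCCESS,1=WARNING,2=ERROR)"
if [ $? -ne 0 ]; then fragile="failure detected"; else fragile="failure missed"; fi

# Robust: capture the code immediately, then use the copy everywhere.
failing_job
exitcode=$?
echo "Status=$exitcode (0=SUCCESS,1=WARNING,2=ERROR)"
if [ "$exitcode" -ne 0 ]; then robust="failure detected"; else robust="failure missed"; fi

echo "fragile check: $fragile"
echo "robust check:  $robust"
```

Both echo lines report Status=2, yet only the second if-statement reacts to it, which is exactly the trap described above.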

Your thoughts

What do you think about this approach? Did you find this blog post useful? Did you ever need to terminate your batch job flow? How did you go about it? Please share with us.

How to conditionally terminate a SAS batch flow process in UNIX/Linux was published on SAS Users.

July 25, 2018

I recently joined SAS in a brand new role: I'm a Developer Advocate.  My job is to help SAS customers who want to access the power of SAS from within other applications, or who might want to build their own applications that leverage SAS analytics.  For my first contribution, I decided to write an article about a quick task that would interest developers and that isn't already heavily documented. So was born this novice's experience in using R (and RStudio) with SAS Viya. This writing will chronicle my journey from the planning stages, all the way to running commands from RStudio on the data stored in SAS Viya. This is just the beginning; we will discuss at the end where I should go next.

Why use SAS Viya with R?

From the start, I asked myself, "What's the use case here? Why would anyone want to do this?" After a bit of research and discussion with my SAS colleagues, the answer became clear. R is a popular programming language used by data scientists, developers, and analysts – even within organizations that also use SAS. However, R has some well-known limitations when working with big data, and our SAS customers are often challenged to combine the work of a diverse set of tools into a well-governed analytics lifecycle. Combining the developers' familiarity with R programming and the power and flexibility of SAS Viya for data storage, analytical processing, and governance seemed like a perfect exercise. For the purpose of this scenario, think of SAS Viya as the platform and Cloud Analytic Services (CAS) as where all the data is stored and processed.

How I got started with SAS Viya

I did not want to start with the task of deploying my own SAS Viya environment. This is a non-trivial activity, and not something an analyst would tackle, so the major pre-req here is you'll need access to an existing SAS Viya setup.  Fortunately for me, here at SAS we have preconfigured SAS Viya environments available on a private cloud that we can use for demos and testing.  So, SAS Viya is my server-side environment. Beyond that, a client is all I needed. I used a generic Windows machine and got busy loading some software.

What documentation did I use/follow?

I started with the official SAS documentation: SAS Scripting Wrapper for Analytics Transfer (SWAT) for R.

The Process

The first two things I installed were R and RStudio, which I found at these locations:

The installs were uneventful, so I won't list all those steps here. Next, I installed a couple of pre-req R packages and attempted to install the SAS Scripting Wrapper for Analytics Transfer (SWAT) package for R. Think of SWAT as what allows R and SAS to work together. In an R command line, I entered the following commands:

> install.packages('httr')
> install.packages('jsonlite')
> install.packages('> 
  linux64.tar.gz', repos=NULL, type='file')

When attempting the last command, I hit an error:

ERROR: dependency 'dplyr' is not available for package 'swat'
* removing 'C:/Program Files/R/R-3.5.1/library/swat'
Warning message:
In install.packages("",  :
installation of package 'C:/Users/sas/AppData/Local/Temp/2/RtmpEXUAuC/downloaded_packages/R-swat-1.2.1-linux64.tar.gz'
  had non-zero exit status

The install failed. Based on the error message, it turns out I had forgotten to install another R package:

> install.packages("dplyr")

(This dependency is documented in the R SWAT documentation, but I missed it. Since this could happen to anyone – right? – I decided to come clean here. Perhaps you'll learn from my misstep.)

After installing the dplyr package in the R session, I reran the swat install and was happy to hit a return code of zero. Success!

For the sake of brevity, I decided not to configure an authentication file in this post, so I am required to pass user credentials when making connections. I will configure authinfo in a follow-up post.

Testing my RStudio->SAS Viya connection

From RStudio, I ran the following command to connect to the CAS server:

> library(swat)
> conn <- CAS("", 8777, protocol='http', user='user', password='password')

Now that I succeeded in connecting my R client to the CAS server, I was ready to load data and start making API calls.

How did I decide on a use case?

I'm in the process of moving houses, so I decided to find a data set on property values in the area to do some basic analysis, to see if I was getting a good deal. I did a quick Google search and downloaded a .csv from a local government site. At this point, I was all set up, connected, and had data. All I needed now was to run some CAS Actions from RStudio.

CAS actions are commands that you submit through RStudio to tell the CAS server to 'do' something. One or more objects are returned to the client -- for example, a collection of data frames. CAS actions are organized into action sets and are invoked via APIs. You can find

> citydata <-, "C:\\Users\\sas\\Downloads\\property.csv", sep=';')
NOTE: Cloud Analytic Services made the uploaded file available as table PROPERTY in caslib CASUSER(user).

What analysis did I perform?

I purposefully kept my analysis brief, as I just wanted to make sure that I could connect, run a few commands, and get results back.

My RStudio session, including all of the things I tried

Here is a brief series of CAS action commands that I ran from RStudio:

Get the mean value of a variable:

> cas.mean(citydata$TotalSaleValue)
          Column     Mean
1 TotalSaleValue 343806.5

Get the standard deviation of a variable:

          Column      Std
1 TotalSaleValue 185992.9

Get boxplot data for a variable:

> cas.percentile.boxPlot(citydata$TotalSaleValue)
          Column     Q1     Q2     Q3     Mean WhiskerLo WhiskerHi Min     Max      Std    N
1 TotalSaleValue 239000 320000 418000 343806.5         0    685000   0 2318000 185992.9 5301

Get boxplot data for another variable:

> cas.percentile.boxPlot(citydata$TotalBldgSqFt)
         Column   Q1   Q2   Q3     Mean WhiskerLo WhiskerHi Min   Max      Std    N
1 TotalBldgSqFt 2522 2922 3492 3131.446      1072      4943 572 13801 1032.024 5301

Did I succeed?

I think so. Let's say the house I want is 3,000 square feet and costs $258,000. As you can see in the box plot data, I'm getting a good deal. The house size is above the median (between Q2 = 2,922 and Q3 = 3,492 square feet), while the house cost is below the median (between Q1 = $239,000 and Q2 = $320,000). Yes, this is not the most in-depth statistical analysis, but I'll get more into that in a future article.
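The comparison is simple enough to check by hand, but here it is as a tiny shell sketch using the Q1, Q2 and Q3 values from the two box plot outputs. The quartile_band helper is my own naming and works on whole numbers only.

```shell
#!/bin/sh
# Report where a value falls relative to the quartile cut points
# Q1, Q2 (the median) and Q3. Integer inputs only.
# Usage: quartile_band VALUE Q1 Q2 Q3
quartile_band() {
    if   [ "$1" -le "$2" ]; then echo "at or below Q1"
    elif [ "$1" -le "$3" ]; then echo "between Q1 and the median"
    elif [ "$1" -le "$4" ]; then echo "between the median and Q3"
    else echo "above Q3"
    fi
}

echo "size:  $(quartile_band 3000 2522 2922 3492)"
echo "price: $(quartile_band 258000 239000 320000 418000)"
```

A house that is larger than the median but priced below the median is the pattern I was hoping to see.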

What's next?

This activity has really sparked my interest to learn more and I will continue to expand my analysis, attempt more complex statistical procedures and create graphs. A follow-up blog is already in the works. If this article has piqued your interest in the subject, I'd like to ask you: What would you like to see next? Please comment and I will turn my focus to those topics for a future post.

Using RStudio with SAS Viya was published on SAS Users.