Good news, learners! SAS University Edition has gone back to school and learned some new tricks.

With the December 2017 update, SAS University Edition now includes the SASPy package, available in its Jupyter Notebook interface. If you're keeping track, you know that SAS University Edition has long had support for Jupyter Notebook. With that, you can write and run SAS programs in a notebook-style environment. But until now, you could not use that Jupyter Notebook to run Python programs. With the latest update, you can -- and you can use the SASPy library to drive SAS features like a Python coder.

Oh, and there's another new trick that you'll find in this version: you can now use SAS (and Python) to access data from HTTPS websites -- that is, sites that use SSL encryption. Previous releases of SAS University Edition did not include the components that are needed to support these encrypted connections. That's going to make downloading web data much easier, not to mention using REST APIs. I'll show one HTTPS-enabled example in this post.

How to create a Python notebook in SAS University Edition

When you first access SAS University Edition in your web browser, you'll see a colorful "Welcome" window. From here, you can (A) start SAS Studio or (B) start Jupyter Notebook. For this article, I'll assume that you select choice (B). However, if you want to learn to use SAS and all of its capabilities, SAS Studio remains the best method for doing that in SAS University Edition.

When you start the notebook interface, you're brought into the Jupyter Home page. To get started with Python, select New->Python 3 from the menu on the right. You'll get a new empty Untitled notebook. I'm going to assume that you know how to work with the notebook interface and that you want to use those skills in a new way...with SAS. That is why you're reading this, right?

Move data from a pandas data frame to SAS

pandas is the standard for Python programmers who work with data. The pandas module is included in SAS University Edition -- you can use it to read and manipulate data frames (which you can think of like a table). Here's an example of retrieving a data file from GitHub and loading it into a data frame. (Read more about this particular file in this article. Note that GitHub uses HTTPS -- now possible to access in SAS University Edition!)

import saspy
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/zonination/perceptions/master/probly.csv')

Here's the result. This is all straight Python stuff; we haven't started using any SAS yet.

Before we can use SAS features with this data, we need to move the data into a SAS data set. SASPy provides a dataframe2sasdata() method (shorter alias: df2sd) that can import your Python pandas data frame into a SAS library and data set. The method returns a SASdata object. This example copies the data into WORK.PROBLY in the SAS session:

sas = saspy.SASsession()
probly = sas.df2sd(df,'PROBLY')

The SASdata object also includes a describe() method that yields a result that's similar to what you get from pandas:

Drive SAS procedures with Python

SASPy includes a collection of built-in objects and methods that provide APIs to the most commonly used SAS procedures. The APIs present a simple "Python-ic" style approach to the work you're trying to accomplish. For example, to create a SAS-based histogram for a variable in a data set, simply use the hist() method.
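The two calls mentioned so far, describe() and hist(), could be combined roughly like this. Treat it as a minimal sketch: the helper name summarize_and_plot and its column argument are made up for illustration, and it expects the SASdata object that df2sd() returned (probly in the example above).

```python
def summarize_and_plot(sas_data, column):
    """Summarize a SASdata object, then draw a histogram of one variable.

    sas_data: the object returned by sas.df2sd(), for example `probly`.
    column:   hypothetical placeholder; pass the name of a numeric
              variable that actually exists in your data set.
    """
    # Summary statistics, similar in spirit to pandas' describe().
    # Depending on the session's results setting, this may display a
    # table and return None rather than return a data frame.
    summary = sas_data.describe()

    # SAS-generated histogram for the chosen variable.
    sas_data.hist(column)
    return summary
```

In a notebook cell you would call it as summarize_and_plot(probly, '<your column>'), substituting a real variable name.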

SASPy offers dozens of simple API methods that represent statistics, machine learning, time series, and more. You can find them documented on the GitHub project page. Note that since SAS University Edition does not include all SAS products, some of these API methods might not work for you. For example, the SASml.forest() method (representing

In SASPy, all methods generate SAS program code behind the scenes. If you like the results you see and want to learn the SAS code that was used, you can flip on the "teach me SAS" mode in SASPy.
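Flipping the mode on and off might look like the following. This is a sketch, not the code from the original post: the helper name show_generated_sas and its column argument are hypothetical. The teach_me_SAS() call is the SASsession method; passing True asks SASPy to show the generated code instead of submitting it.

```python
def show_generated_sas(session, sas_data, column):
    """Print the SAS code behind describe() and hist() without running it.

    session:  a saspy.SASsession object (called `sas` in this article).
    sas_data: a SASdata object, such as the one returned by df2sd().
    column:   hypothetical placeholder for a variable in that data set.
    """
    session.teach_me_SAS(True)       # True: show generated code, do not submit
    try:
        sas_data.describe()
        sas_data.hist(column)
    finally:
        session.teach_me_SAS(False)  # always switch back to normal execution
```

The try/finally is a design choice: if one of the calls fails, the session still returns to normal mode, so later cells run SAS code as expected.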


Here's what SASPy reveals about the describe() and hist() methods we've already seen:

Interesting code, right? Does it make you want to learn more about the SCALE= option on PROC SGPLOT?

If you want to experiment with SAS statements that you've learned, you don't need to leave the current notebook and start over. There's also a built-in %%SAS "magic command" that you can use to try out a few of these SAS statements.

%%SAS
proc means data=sashelp.cars stackodsoutput n nmiss median mean std min p25 p50 p75 max;
run;
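If you would rather stay in a regular Python cell, SASPy's SASsession object also has a submit() method that accepts SAS statements as a string. Here is a small sketch under that assumption. The run_sas helper and the PROC_MEANS constant are names invented for this example, and the statistics list is shortened from the one above.

```python
def run_sas(session, code):
    """Submit SAS statements through SASPy and return (log, listing) as text.

    session: a saspy.SASsession object (called `sas` in this article).
    code:    a string of SAS statements.
    """
    # submit() returns a dictionary with the SAS log under 'LOG'
    # and the procedure output under 'LST'.
    result = session.submit(code, results="text")
    return result["LOG"], result["LST"]


# Hypothetical example program; any SAS statements would do.
PROC_MEANS = """
proc means data=sashelp.cars n nmiss mean std min max;
run;
"""
```

With a live session you could then write log, listing = run_sas(sas, PROC_MEANS) and print either piece.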

Python limitations in SAS University Edition

SAS University Edition includes over 300 Python modules to support your work in Jupyter Notebook. To see a complete list, run the help('modules') command from within a Python notebook. This list includes the common Python packages required to work with data, such as pandas and NumPy. However, it does not include any of the popular Python-based machine learning modules, nor any modules to support data visualization. Of course, SASPy has support for most of this within its APIs, so why would you need anything else...right?
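The help('modules') listing is long. If you only want to know whether a few specific packages are present, a quick check like this may be easier. It is a sketch that uses only the Python standard library; the wish list is hypothetical, so edit it to match the packages you care about.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical wish list of top-level module names.
wanted = ["pandas", "numpy", "sklearn", "matplotlib"]
print("Not available:", missing_modules(wanted))
```

The check looks up each module without importing it, so it is fast and has no side effects. Stick to top-level names such as sklearn; dotted names behave differently when the parent package is missing.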

Because SAS University Edition is packaged in a virtual machine that you cannot alter, you don't have the option of installing additional Python modules. You also don't have access to the Jupyter terminal, which would allow you to control the system from a shell-like interface. All of this is possible (and encouraged) when you have your own SAS installation with your own instance of SASPy. It's all waiting for you when you've outgrown the learning environment of SAS University Edition and you're ready to apply your SAS skills and tech to your official work!

Learn more

Rick Wicklin showed us how to visualize the ages of US Presidents at the time of their inaugurations. That's a pretty relevant thing to do, as the age of the incoming president can indirectly influence aspects of the president's term, thanks to health and generational factors.

As part of his post, Rick supplied the complete data set for US Presidents and their birthdays. He challenged his readers to create their own interesting visualizations, and that's what I'm going to do here. I'm going to show you the distribution of US Presidents by their astrological signs.

Now, you might think that "your sign" is not as relevant a factor as age, and I certainly hope that you're correct about that. But past presidents have sought the advice of astrologers, and zodiac signs can influence the counsel such astrologers might offer. (Famously, Richard Nixon took advice from celebrity psychic Jeane Dixon. First Lady Nancy Reagan also sought her advice, and we know that Mrs. Reagan in turn influenced President Reagan.)

Like any good analyst, I mostly reused existing work to produce my results. First, I used the DATA step that Rick provided to create the data set of presidents and birthdays. Next, I reused my own work to create a SAS format that displays a zodiac sign for each date. And finally, I wrote a tiny bit of PROC FREQ code to create my table and frequency plot.

data signs;
 /* So this column appears first */
 retain President;
 length sign 8;
 /* SIGN. format created earlier with PROC FORMAT */
 format sign sign.;
 set presidents (keep=President BirthDate InaugurationDate);
 /* convert birthday to our normalized SIGN date */
 sign = mdy(month(birthdate),day(birthdate),2000);
run;

ods graphics on;
proc freq data=signs order=freq;
 tables sign / plots=freqplot;
run;
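For readers who think in Python, the same idea (ignore the birth year, map the month and day to a sign, then count) can be sketched as follows. This is not the SIGN. format from my earlier post: the boundary dates below follow one common convention and are an assumption, and the three birthdays are just the Gemini presidents mentioned in this article rather than Rick's full data set.

```python
from collections import Counter
from datetime import date

# First (month, day) of each sign, in calendar order. Assumed boundaries:
# sources differ by a day or so on some of them.
SIGN_STARTS = [
    ((1, 20), "Aquarius"), ((2, 19), "Pisces"), ((3, 21), "Aries"),
    ((4, 20), "Taurus"), ((5, 21), "Gemini"), ((6, 21), "Cancer"),
    ((7, 23), "Leo"), ((8, 23), "Virgo"), ((9, 23), "Libra"),
    ((10, 23), "Scorpio"), ((11, 22), "Sagittarius"), ((12, 22), "Capricorn"),
]


def zodiac_sign(birthdate):
    """Map a date to a zodiac sign, ignoring the year."""
    key = (birthdate.month, birthdate.day)
    sign = "Capricorn"  # January 1-19 belongs to the sign that began December 22
    for start, name in SIGN_STARTS:
        if key >= start:
            sign = name  # keep the latest sign whose start date has passed
    return sign


# Illustrative subset only.
birthdays = {
    "John F. Kennedy": date(1917, 5, 29),
    "George H. W. Bush": date(1924, 6, 12),
    "Donald Trump": date(1946, 6, 14),
}

# Frequency count, most common first (the role PROC FREQ ORDER=FREQ plays above).
counts = Counter(zodiac_sign(d) for d in birthdays.values())
for sign, n in counts.most_common():
    print(sign, n)
```

Comparing (month, day) pairs plays the same role as the mdy(..., 2000) trick in the DATA step: both throw away the birth year so that a single set of date ranges covers everyone.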

To keep things a bit fresh, I did all of this work in SAS University Edition using the Jupyter Notebook interface. Here's a glimpse of what it looks like:

And here's the distribution you've all been waiting to see. When he takes office, Donald Trump will join George H. W. Bush and JFK in the Gemini column.

I've shared the Jupyter Notebook file as a public gist on GitHub. You can download and import into your own instance if you have SAS and Jupyter Notebook working together. (Having trouble rendering the notebook file? Try looking at it through the nbviewer service. That usually works.)

A few months ago I shared the news about Jupyter notebook support for SAS. If you have SAS for Linux, you can install a free open-source project called sas_kernel and begin running SAS code within your Jupyter notebooks. In my post, I hinted that support for this might be coming in SAS University Edition. I'm pleased to say that this is one time where my crystal ball actually worked -- Jupyter support has arrived!

(Need to learn more about SAS and Jupyter? Watch this 7-minute video from SAS Global Forum.)

Start coding in the notebook format

If you download or update your instance of SAS University Edition, you'll be able to point your browser to a slightly different URL and begin running SAS programs in Jupyter. Of course, you can continue to use SAS Studio to learn SAS programming skills. Having trouble deciding which to use? You don't have to choose: you can use both!

If you've started SAS University Edition within Oracle VirtualBox, you can find SAS Studio at its familiar address: http://localhost:10080/. And you can find the Jupyter notebook environment at: http://localhost:18888/. (If you're using VMware, the URLs are slightly different. Check the documentation.)

Why did SAS add support for Jupyter notebooks? The answer is simple: you asked for it. While we believe that SAS Studio provides a better environment for producing and managing SAS code, Jupyter notebooks are widely used by students and data scientists who want to package their code, results, and documentation in the convenient notebook format. Notebook files (*.ipynb format) are even supported on GitHub, easily shareable and viewable by others.

Now, what are the limitations?

Within SAS University Edition, the Jupyter environment supports only SAS programs. The Jupyter project can support other languages, including Python, Julia, and R (the namesake languages) and dozens of others with published language kernels. However, because of the virtual-machine core of SAS University Edition, those other languages are not available.

Support for other languages (as well as for the Jupyter console) is available when you use Jupyter in a standalone SAS environment. In fact, the sas_kernel project recently received some exciting updates. You can now host the Jupyter environment on a different machine than your SAS server (although Linux is still the only supported SAS host), and the installation process has been streamlined. See more on the sassoftware GitHub home for the sas_kernel project.

Where can you learn more about Jupyter in SAS University Edition?

Check out the help topics for SAS University Edition, beginning with this one: What is Jupyter Notebook in SAS University Edition?

And if you need help or advice about how to make the best use of SAS University Edition, check out the SAS Analytics U community. There are plenty of experts in the forum who would love to help you learn!

We've just celebrated Earth Day, but I'm here to talk about Jupyter -- and the SAS open source project that opens the door for more learning. With this new project on the github.com/sassoftware page, SAS contributes new support for running SAS from within Jupyter Notebooks -- a popular browser-based environment used by professors and data scientists.

My colleague Amy Peters announced this during a SAS Tech Talk show at SAS Global Forum 2016. If you want to learn more about Jupyter and see the SAS support in action, then you can watch the video here.

Visit the project on GitHub: sas_kernel by sassoftware

Within Jupyter, the sas_kernel provides multiple ways to access SAS programming methods. The most natural method is to create a new SAS notebook, available from the New menu in the Jupyter Home window and from the File menu in an active notebook:

From a SAS notebook, you can enter and run SAS code directly from a cell:

There is even a Notebook extension (./nbextensions/showSASLog) that can show you the SAS log.

The second way that you can run SAS code is by using special Python "magics" supported by the sas_kernel. These magic commands look almost just like SAS macro calls (imagine that!). From within a Python language notebook, you can inject your SAS program code and pull in SAS results. This allows you to move easily between Python and SAS in a single environment. Here's a simple example:

%%SAS
proc means data=sashelp.cars;
run;
ods graphics / height=500 width=800;
proc sgplot data=sashelp.cars;
 histogram msrp;
run;

How to get started

Currently, to run SAS with Jupyter you need:

  • SAS 9.4 or later running on Linux
  • Python 3 installed on the same machine (that's basically part of Linux)
  • Admin rights to be able to install/configure the Jupyter Notebook infrastructure and the sas_kernel.

End users of Jupyter Notebook do not need special privileges -- you need those only to install and configure the pieces that make it work. The GitHub project has all of the doc and step-by-step instructions for installation.

What's next for SAS and Jupyter?

This is just the start for SAS in the Jupyter world. Amy says that she has already received lots of interest and feedback, and SAS is working to make the Jupyter Notebook approach available in something like SAS University Edition and SAS OnDemand for Academics. Stay tuned!

