Chris Hemedinger

7月 172017
 

DMS keysSAS power users (and actually, power users of any application) like to customize their environment for maximimum productivity. Long-time SAS users remember the KEYS window in SAS display manager, which allows you to assign SAS commands to "hot keys" in your SAS session. These users will invest many hours to come up with the perfect keyboard mappings to suit the type of work that they do.

When using SAS Enterprise Guide, these power users often lament the lack of a similar KEYS window. But these people needn't suffer with the default keys -- a popular tool named AutoHotkey can fill the gap for this and for any other Windows application. I've recommended it to many SAS users over the years, and I've heard positive feedback from those who have adopted it. AutoHotkey is free, and it's lightweight and portable; even users with "locked-down" systems can usually make use of it.

AutoHotkey provides its own powerful scripting language, which allows you define new behaviors for any key combination that you want to repurpose. When you activate these scripts, AutoHotkey gets first crack at processing your keypress, so you can redirect the built-in key mappings for any Windows application. I'll share two examples of different types of scripts that users have found helpful.

"Unmap" a key that you don't like

In SAS Enterprise Guide, F3 and F8 are both mapped to "Run program". A newer user found the F8 mapping confusing because she had a habit of using that key for something else, and so became quite annoyed when she kept accidentally running her process before she was ready.

The following AutoHotkey script "eats" the F8 keypress. The logic first checks to see if the running process is SAS Enterprise Guide (seguide.exe), and if so, it simply stops processing the action, effectively vetoing the F8 action.

F8::
WinGet, Active_ID, ID, A
WinGet, Active_Process, ProcessName, ahk_id %Active_ID%
if ( Active_Process ="seguide.exe" ) {
  ;eat the keystroke
} 

Map a single key to an action that requires multiple keys or clicks

I recently shared a tip to close all open data sets in SAS Enterprise Guide. It's a feature on the Tools menu that launches a special window, and some readers wished for a single key mapping to get the job done. Using AutoHotkey, you can map a series of clicks/keystrokes to a single key.

The following script will select the menu item, activate the "View Open Data Sets" window, and then select Close All.

F12::
WinGet, Active_ID, ID, A
WinGet, Active_Process, ProcessName, ahk_id %Active_ID%
if ( Active_Process ="seguide.exe" ) 
{
  Sleep, 100
  Send {Alt Down}{Alt Up}{t}
  Sleep, 100  
  Send, {v}
  WinActivate, View Open Data Sets ahk_class WindowsForms10.Window.8.app.0.143a722_r12_ad1
  Send, {Tab}
  Sleep, 100  
  Send, {Space}
  Sleep, 500  
  Send, {Esc}
}

You'll see that one of the script commands activates the "View Open Data Sets" window. The window "class" is referenced, and the class name is hardly intuitive. AutoHotkey includes a "Window spy" utility called "Active Window Info" that can help you to find the exact name of the window you need to activate.

Window Spy

AutoHotkey can direct mouse movements and clicks, but those directives might not be reliable in different Windows configurations. In my scripts, I rely on simulated keyboard commands. This script activates the top-level menu with Alt+"t" (for Tools), then "v" (for the "View Open Data Sets" window), then TAB to the "Close All" button, space bar to press the button, then Escape to close the window. Each action takes some time to take effect, so "Sleep" commands are inserted (with times in milliseconds) to allow the actions to complete.

Every action in SAS Enterprise Guide is accessible by the keyboard (even if several keystrokes are required). If you want to see all of the already-defined keyboard mappings, search the SAS Enterprise Guide help for "keyboard shortcuts."

Key help

Automate more with AutoHotkey

In this article, I've only just scratched the surface of how you can customize keys and automate actions in SAS Enterprise Guide. Some of our users have asked us to build in the ability to customize key actions within the application. While that might be a good enhancement within the boundaries of your SAS applications, a tool like AutoHotkey can help you to automate your common tasks within SAS and across other applications that you use. The scripting language presents a bit of a learning curve, but the online help is excellent. And there is a large community of AutoHotkey users who have published hundreds of useful examples.

Have you used AutoHotkey to automate any SAS tasks? If so, please share your tips here in the comments or within the SAS Enterprise Guide community.

The post Customize your keys in SAS Enterprise Guide with AutoHotkey appeared first on The SAS Dummy.

7月 032017
 

If you spend a lot of time in SAS Enterprise Guide (as I do), you probably get to know its features pretty well. But we don't always take the time to explore as we should, so there might be a few golden nuggets of editor knowledge that have escaped you so far. Here are 10 program editor features that I've found essential while writing, editing, and debugging SAS programs. How many of these do you already know and use?

1. Turn on the line numbers

We programmers like to count lines of code. The SAS log often uses line numbers to reference problems in WARNINGs and ERRORs. So of course, you should have line numbers displayed in the program editor. But they aren't on by default. Go to Program → Editor Options and select "Show line numbers" to turn them on.

line numbers

2. Get the tabs out (or leave them in)

Tabs or spaces? Your choice here can have a significant effect on your earning potential, and perhaps even on your love life. Most code editors have options that support your choice, regardless of which camp you choose. SAS Enterprise Guide offers these:

  • Tab size - width of a tab character, represented in number of spaces. Default is 4, but I like to use 2 as it makes my program lines less wide.
  • Insert spaces for tabs - when you press the TAB key, don't add a TAB character but instead add the specified number of space characters.
  • Replace tabs with spaces on file open - a perfect passive-aggressive option when working with team members who disagree with your TAB world view. This option will change TAB characters to spaces when you open the program file. If you must retain the TAB characters...well, my main advice is do not rely on TAB characters in your code file. But if you must for some crazy reason, don't select this option and sign a pact with your teammates for the same.

tabs vs spaces

3. Define abbreviations for commonly used code

The most common code snippet that I reuse is a LIBNAME statement that points to my project data. To save on typing and mistakes, I've saved this as an editor abbreviation. When I begin to type the alias I've assigned to the snippet, the program editor offers to complete it for me. The custom abbreviation is presented along side all of the other built-in syntax suggestions.

abbrev animation

See more about editor abbreviations in this article.

4. Let the editor format your code

As shown in the vigorous "TABS vs spaces" debate, programmers care deeply about how their code is formatted. Individuals and teams adopt various standards for line breaks and indenting, and these are usually particular to the programming language. In general, SAS doesn't care how your code is laid out -- statements are delimited by semicolons, and that's the only cue that SAS needs. However, your teammates (and your future YOU, rereading your code) might appreciate something a little more readable.

Press Ctrl+I to format your entire program, applying some reasonable readability rules to indent code lines with conditionals and looping logic. Or select just a portion of the program and press Ctrl+I to affect a smaller part of the program. You can adjust some of the formatting rules by visiting Program → Editor Options, the Indenter tab.

autoformat code

5. Zoom out for the big picture

Some SAS programs are long -- hundreds (or thousands!) of lines of code. Sometimes it's helpful to get a birds-eye view of your code to understand its structure and to help you navigate. The Zoom feature is super helpful for this. Simply press Ctrl+- (control-minus) until you get the view you need. Press Ctrl++ (control-plus) to zoom back in, or press Ctrl+0 to get to the 100% view.

This trick works for SAS logs as well, and also data sets and ODS output (including text listing, which uses the program editor in a special mode for viewing SAS output).

zoom out

6. Change the program editor font

Want to waste an afternoon? Search the Internet for "best font for programmers" and experiment with all of the results that you find. I discovered Consolas (built into Microsoft Windows) a decade ago, and I've yet to find anything better. I use it for all of my "fixed font" needs: programming, terminal windows, command consoles, etc. But you can choose your own favorite -- just don't feel that you're stuck with the default "Courier" that seems to be standard issue.

Change your font in Programs → Editor Options, Appearance tab. You'll find lots of elements that you can tweak for typeface, size and color.

7. Select columns of content with block selection

Even though column block selection -- also known as "Alt+Select" -- is a standard feature in most advanced text editors, many programmers don't know about it. It's the perfect trick for selecting just a few columns of your text without including the content that's on the rest of the line. In SAS programming, this can be handy for selecting columns of values from the text listing output and pasting somewhere else, such as into a DATALINES block. It takes a little practice to master the Alt+Select, but once you do you'll find all sorts of uses for it. To get started, simply hold down the Alt key and click-drag to highlight a vertical column of text within the editor.

column selection animation

8. Find (and replace) using regular expressions

Regular expressions are a powerful, if confusing, method for finding and replacing text that matches certain patterns. The Find/Find and Replace window in SAS Enterprise Guide supports "Regular expression search" as a checkbox option.

Here's an example. Suppose I wanted to find all occurrences of 3 numbers after the thousands separator (comma) at the end of each data line -- and I wanted to turn those digits into zeros. (I don't know why--but just stick with me here.) A regex pattern to match this is ",\d\d\d\n" (comma, followed by 3 occurrences of numeric digits, followed by a line ending). Here's an animation of this in action.

regex replace animation

For more, select Help→SAS Enterprise Guide help and search for "regular expressions". The help topics contain several examples of useful patterns.

9. Scroll just part of your document using a split view

Do you find yourself scrolling back and forth in your program view? Trying to remember what was in that DATA step at the top of your program so you can reference the proper variable in another part of your code? Instead of dealing with "scrolling whiplash", you can split the program editor view to keep one part of your code always visible while you work on another code segment that's hundreds of lines away from it.

split view

There are several ways to split your view of SAS code, log output, and listing. Check out the article here for details.

10. Break out to your other favorite editor

Please don't tell anyone, but I have a secret: SAS Enterprise Guide is not my default application associated with .SAS files. When I double-click on a .SAS file in Windows Explorer, I like to use Notepad++ to provide a quick view of what's in that program file. Don't get me wrong: I use SAS Enterprise Guide for all of my serious SAS programming work. With syntax suggestions, color coding, built-in DATA step debugger, and more -- there just isn't a better, more full-featured environment. (No, I'm not trying to troll you, diehard SAS display manager users -- you keep using what makes you happy.) But Notepad++ has a deep set of text editing features, and sometimes I like to use it for hardcore find/replace functions, deeper inspection of special characters in my files, and more.

You can launch your program into your other favorite editor from SAS Enterprise Guide. Simply right-click on the program node in your process flow, select Open → Open <program name> with Windows Default. And make sure your other editor is registered in Windows as the default "Open with" action for SAS programs. Note: this trick works only with SAS programs that you've saved locally on your Windows file system.

Open with default

More than editing -- this is your workbench

The program editor isn't just about "editing programs." It's also the launchpad for several other programmer-centric features, such as debugging your DATA step, comparing your SAS programs, viewing program history and source control, and more. If you use SAS Enterprise Guide, take the time to learn about all of its programming features -- you'll become a more productive programmer as a result.

The post Ten SAS Enterprise Guide program editor tricks appeared first on The SAS Dummy.

6月 292017
 

One of the big benefits of the SAS Viya platform is how approachable it is for programmers of other languages. You don't have to learn SAS in order to become productive quickly. We've seen a lot of interest from people who code in Python, maybe because that language has become known for its application in machine learning. SAS has a new product called SAS Visual Data Mining and Machine Learning. And these days, you can't offer such a product without also offering something special to those Python enthusiasts.

Introducing Python SWAT

And so, SAS has published the Python SWAT project (where "SWAT" stands for the SAS scripting wapper for analytical transfer. The project is a Python code library that SAS released using an open source model. That means that you can download it for free, make changes locally, and even contribute those changes back to the community (as some developers have already done!). You'll find it at github.com/sassoftware/python-swat.

SAS developer Kevin Smith is the main contributor on Python SWAT, and he's a big fan of Python. He's also an expert in SAS and in many programming languages. If you're a SAS user, you probably run Kevin's code every day; he was an original developer on the SAS Output Delivery System (ODS). Now he's a member of the cloud analytics team in SAS R&D. (He's also the author of more than a few conference papers and SAS books.)

Kevin enjoys the dynamic, fluid style that a scripting language like Python affords - versus the more formal "code-compile-build-execute" model of a compiled language. Watch this video (about 14 minutes) in which Kevin talks about what he likes in Python, and shows off how Python SWAT can drive SAS' machine learning capabilities.

New -- but familiar -- syntax for Python coders

The analytics engine behind the SAS Viya platform is called CAS, or SAS Cloud Analytic Services. You'll want to learn that term, because "CAS" is used throughout the SAS documentation and APIs. And while CAS might be new to you, the Python approach to CAS should feel very familiar for users of Python libraries, especially users of pandas, the Python Data Analysis Library.

CAS and SAS' Python SWAT extends these concepts to provide intuitive, high-performance analytics from SAS Viya in your favorite Python environment, whether that's a Jupyter notebook or a simple console. Watch the video to see Kevin's demo and discussion about how to get started. You'll learn:

  • How to connect your Python session to the CAS server
  • How to upload data from your client to the CAS server
  • How SWAT extends the concept of the DataFrame API in pandas to leverage CAS capabilities
  • How to coax CAS to provide descriptive statistics about your data, and then go beyond what's built into the traditional DataFrame methods.

Learn more about SAS Viya and Python

There are plenty of helpful resources to help you learn about using Python with SAS Viya:

And finally, what if you don't have SAS Viya yet, but you're interested in using Python with SAS 9.4? Check out the SASPy project, which allows you to access your traditional SAS features from a Jupyter notebook or Python console. It's another popular open source project from SAS R&D.

The post Using Python to work with SAS Viya and CAS appeared first on The SAS Dummy.

5月 312017
 

SAS programs in Excel, finally!When SAS Add-In for Microsoft Office was first created over a decade ago, SAS programmers were told to check their skills at the door. This new product was for non-programmers only. SAS programmers were invited to contribute to the experience by packaging their code in SAS stored processes, which end users would then run using point-and-click menus. But there was no way to write and run your SAS programs directly in Microsoft Excel or Word or PowerPoint.

This was a comfort to many SAS administrators, who wanted to provide SAS analytics to their end users but didn't want them to have to learn to program. Or perhaps to even allow them to program. But, times have changed. Citizen data scientists have been practicing their coding skills, and now they want to mix it up in Microsoft Office. In response to this demand, SAS R&D has added the SAS programming environment -- the parts that make sense, anyway -- into SAS Add-In for Microsoft Office. You can write programs, run them, and drop the results into any part of your Microsoft Office document.

Here's a short screencast of how it works in Microsoft Excel:

SAS programming in AMO

If you want to learn more and see a narrated demo from the principal developer, watch this interview with Tim Beese from SAS R&D. Tim shows the coding feature along with several other cool enhancements to SAS Add-In for Microsoft Office. As Tim explains, SAS administrators still have the final say when it comes to allowing Excel users to let loose with SAS code -- they can enable the feature by role, and so grant this as a privilege at their discretion.

The first few minutes of this video shows some impressive integration with SAS Visual Analytics and Microsoft Excel. The SAS programming demo begins around the 5-minute mark.

The post Create and run SAS code in SAS Add-In for Microsoft Office appeared first on The SAS Dummy.

5月 312017
 

SAS programs in Excel, finally!When SAS Add-In for Microsoft Office was first created over a decade ago, SAS programmers were told to check their skills at the door. This new product was for non-programmers only. SAS programmers were invited to contribute to the experience by packaging their code in SAS stored processes, which end users would then run using point-and-click menus. But there was no way to write and run your SAS programs directly in Microsoft Excel or Word or PowerPoint.

This was a comfort to many SAS administrators, who wanted to provide SAS analytics to their end users but didn't want them to have to learn to program. Or perhaps to even allow them to program. But, times have changed. Citizen data scientists have been practicing their coding skills, and now they want to mix it up in Microsoft Office. In response to this demand, SAS R&D has added the SAS programming environment -- the parts that make sense, anyway -- into SAS Add-In for Microsoft Office. You can write programs, run them, and drop the results into any part of your Microsoft Office document.

Here's a short screencast of how it works in Microsoft Excel:

SAS programming in AMO

If you want to learn more and see a narrated demo from the principal developer, watch this interview with Tim Beese from SAS R&D. Tim shows the coding feature along with several other cool enhancements to SAS Add-In for Microsoft Office. As Tim explains, SAS administrators still have the final say when it comes to allowing Excel users to let loose with SAS code -- they can enable the feature by role, and so grant this as a privilege at their discretion.

The first few minutes of this video shows some impressive integration with SAS Visual Analytics and Microsoft Excel. The SAS programming demo begins around the 5-minute mark.

The post Create and run SAS code in SAS Add-In for Microsoft Office appeared first on The SAS Dummy.

5月 082017
 

In his recent article Perceptions of probability, Rick Wicklin explores how vague statements about "likeliness" translate into probabilities that we can express numerically. It's a fun, informative post -- I recommend it! You'll "Almost Certainly" enjoy it.

To prepare the article, Rick first had to download the source data from the study he cited. The data was shared as a CSV file on GitHub. Rick also had to rename the variables (column names) from the data table so that they are easier to code within SAS. Traditionally, SAS variable names must adhere to a few common programming rules: they must be alphanumeric, begin with a letter, and contain no spaces or special characters. The complete rules are documented in the this method for reading data from a cloud service like DropBox and GitHub. It's still my favorite technique for reading data from the Internet. You'll find lots of papers and examples that use FILENAME URL for the same job in fewer lines of code, but PROC HTTP is more robust. It runs faster, and it allows you to separate the step of fetching the file from the subsequent steps of processing that file.

You can see the contents of the CSV file at this friendly URL: https://github.com/zonination/perceptions/blob/master/probly.csv. But that's not the URL that I need for PROC HTTP or any programmatic access. To download the file via a script, I need the "Raw" file URL, which I can access via the Raw button on the GitHub page.

GitHub preview

In this case, that's https://raw.githubusercontent.com/zonination/perceptions/master/probly.csv. Here's the PROC HTTP step to download the CSV file into a temporary fileref.

/* Fetch the file from the web site */
filename probly temp;
proc http
 url="https://raw.githubusercontent.com/zonination/perceptions/master/probly.csv"
 method="GET"
 out=probly;
run;

A note for SAS University Edition users: this step won't work for you, as the free software does not support access to secure (HTTPS) sites. You'll have to manually download the file via your browser and then continue with the remaining steps.

Step 2. Import the data into SAS with PROC IMPORT

SAS can process data with nonstandard variable names, including names that contain spaces and special characters. You simply have to use the VALIDVARNAME= system option to put SAS into the right mode (oops, almost wrote "mood" there but it's sort of the same thing).

With 'crime against nature'n.)

For this step, I'll set VALIDVARNAME=ANY to allow PROC IMPORT to retain the original column names from the CSV file. The same trick would work if I was importing from an Excel file, or any other data source that was a little more liberal in its naming rules.

/* Tell SAS to allow "nonstandard" names */
options validvarname=any;
 
/* import to a SAS data set */
proc import
  file=probly
  out=work.probly replace
  dbms=csv;
run;

Step 3. Create RENAME and LABEL statements with PROC SQL

This is one of my favorite SAS tricks. You can use PROC SQL SELECT INTO to create SAS programming statements for you, based on the data you're processing. Using this technique, I can build the parts of the LABEL statement and the RENAME statement dynamically, without knowing the variable names ahead of time.

The LABEL statement is simple. I'm going to build a series of assignments that look like this:

  'original variable name'n = 'original variable name'

I used the SELECT INTO clause to build a label assignment for each variable name. I used the CAT function to assemble the label assignment piece-by-piece, including the special literal syntax, the variable name, the assignment operator, and the label value within quotes. I'm fetching the variable names from SASHELP.VCOLUMN, one of the built-in dictionary tables that SAS provides to surface table and column metadata.

  select cat("'",trim(name),"'n","=","'",trim(name),"'") 
     into :labelStmt separated by ' '  
  from sashelp.vcolumn where memname="PROBLY" and libname="WORK";

Here's part of the value of &labelStmt:

'Almost Certainly'n='Almost Certainly' 
'Highly Likely'n='Highly Likely' 
'Very Good Chance'n='Very Good Chance' 
'Probable'n='Probable' 
'Likely'n='Likely' 
'Probably'n='Probably' 
'We Believe'n='We Believe' 

The RENAME statement is a little trickier, because I have to calculate a new valid variable name. For this specific data source that's easy, because the only SAS "rule" that these column names violate is the ban on space characters. I can create a new name by using the COMPRESS function to remove the spaces. To be a little safer, I used the "kn" modifier on the COMPRESS function to keep only English letters, numbers, and underscores. That should cover all cases except for variable names that are too long (greater than 32 characters) or that begin with a number (or that don't contain any valid characters to begin with).

Some of the column names are one-word names that are already valid. If I include those in the RENAME statement, SAS will generate an error (you cannot "rename" a variable to its current name). I used the

/* Generate new names to comply with SAS rules.                          */
/* Assumes names contain spaces, and can fix with COMPRESS               */
/* Other deviations (like special chars, names that start with a number) */
/* would need different adjustments                                      */
/* NVALID() function can check that a name is a valid V7 name           */
proc sql noprint;
 
  /* retain original names as labels */
  select cat("'",trim(name),"'n","=","'",trim(name),"'") 
     into :labelStmt separated by ' '  
  from sashelp.vcolumn where memname="PROBLY" and libname="WORK";
 
  select cat("'",trim(name),"'n","=",compress(name,,'kn')) 
     into :renameStmt separated by ' '  
  from sashelp.vcolumn where memname="PROBLY" and libname="WORK"
  /* exclude those varnames that are already valid */
  AND not NVALID(trim(name),'V7');
quit;

Step 4. Modify the data set with new names and labels using PROC DATASETS

With the body of the LABEL and RENAME statements built, it's time to plug them into a PROC DATASETS step. PROC DATASETS can change data set attributes such as variable names, labels, and formats without requiring a complete rewrite of the data -- it's a very efficient operation.

I include the LABEL statement first, since it references the original variable names. Then I include the RENAME statement, which changes the variable names to their new V7-compliant values.

Finally, I reset the VALIDVARNAME= option to the normal V7 sanity. (Unless you're running in SAS Enterprise Guide, in which case the option is already set to ANY by default. Check this blog post for a less disruptive method of setting/restoring options.)

proc datasets lib=work nolist ;
  modify probly / memtype=data;
  label &labelStmt.;
  rename &renameStmt.;
  /* optional: report on the var names/labels */
  contents data=probly nodetails;
quit;
 
/* reset back to the old rules */
options validvarname=v7;

Here's the CONTENTS output from the PROC DATASETS step, which shows the final variable attributes. I now have easy-to-code variable names, and they still have their descriptive labels. My data dictionary dreams are coming true!

DATASETS rename output

Download the entire program example from my public Gist: import_renameV7.sas.

The post How to download and convert CSV files for use in SAS appeared first on The SAS Dummy.

5月 082017
 

In his recent article Perceptions of probability, Rick Wicklin explores how vague statements about "likeliness" translate into probabilities that we can express numerically. It's a fun, informative post -- I recommend it! You'll "Almost Certainly" enjoy it.

To prepare the article, Rick first had to download the source data from the study he cited. The data was shared as a CSV file on GitHub. Rick also had to rename the variables (column names) from the data table so that they are easier to code within SAS. Traditionally, SAS variable names must adhere to a few common programming rules: they must be alphanumeric, begin with a letter, and contain no spaces or special characters. The complete rules are documented in the this method for reading data from a cloud service like DropBox and GitHub. It's still my favorite technique for reading data from the Internet. You'll find lots of papers and examples that use FILENAME URL for the same job in fewer lines of code, but PROC HTTP is more robust. It runs faster, and it allows you to separate the step of fetching the file from the subsequent steps of processing that file.

You can see the contents of the CSV file at this friendly URL: https://github.com/zonination/perceptions/blob/master/probly.csv. But that's not the URL that I need for PROC HTTP or any programmatic access. To download the file via a script, I need the "Raw" file URL, which I can access via the Raw button on the GitHub page.

GitHub preview

In this case, that's https://raw.githubusercontent.com/zonination/perceptions/master/probly.csv. Here's the PROC HTTP step to download the CSV file into a temporary fileref.

/* Fetch the file from the web site */
filename probly temp;
proc http
 url="https://raw.githubusercontent.com/zonination/perceptions/master/probly.csv"
 method="GET"
 out=probly;
run;

A note for SAS University Edition users: this step won't work for you, as the free software does not support access to secure (HTTPS) sites. You'll have to manually download the file via your browser and then continue with the remaining steps.

Step 2. Import the data into SAS with PROC IMPORT

SAS can process data with nonstandard variable names, including names that contain spaces and special characters. You simply have to use the VALIDVARNAME= system option to put SAS into the right mode (oops, almost wrote "mood" there but it's sort of the same thing).

With 'crime against nature'n.)

For this step, I'll set VALIDVARNAME=ANY to allow PROC IMPORT to retain the original column names from the CSV file. The same trick would work if I was importing from an Excel file, or any other data source that was a little more liberal in its naming rules.

/* Tell SAS to allow "nonstandard" names */
options validvarname=any;
 
/* import to a SAS data set */
proc import
  file=probly
  out=work.probly replace
  dbms=csv;
run;

Step 3. Create RENAME and LABEL statements with PROC SQL

This is one of my favorite SAS tricks. You can use PROC SQL SELECT INTO to create SAS programming statements for you, based on the data you're processing. Using this technique, I can build the parts of the LABEL statement and the RENAME statement dynamically, without knowing the variable names ahead of time.

The LABEL statement is simple. I'm going to build a series of assignments that look like this:

  'original variable name'n = 'original variable name'

I used the SELECT INTO clause to build a label assignment for each variable name. I used the CAT function to assemble the label assignment piece-by-piece, including the special literal syntax, the variable name, the assignment operator, and the label value within quotes. I'm fetching the variable names from SASHELP.VCOLUMN, one of the built-in dictionary tables that SAS provides to surface table and column metadata.

  select cat("'",trim(name),"'n","=","'",trim(name),"'") 
     into :labelStmt separated by ' '  
  from sashelp.vcolumn where memname="PROBLY" and libname="WORK";

Here's part of the value of &labelStmt:

'Almost Certainly'n='Almost Certainly' 
'Highly Likely'n='Highly Likely' 
'Very Good Chance'n='Very Good Chance' 
'Probable'n='Probable' 
'Likely'n='Likely' 
'Probably'n='Probably' 
'We Believe'n='We Believe' 

The RENAME statement is a little trickier, because I have to calculate a new valid variable name. For this specific data source that's easy, because the only SAS "rule" that these column names violate is the ban on space characters. I can create a new name by using the COMPRESS function to remove the spaces. To be a little safer, I used the "kn" modifier on the COMPRESS function to keep only English letters, numbers, and underscores. That should cover all cases except for variable names that are too long (greater than 32 characters) or that begin with a number (or that don't contain any valid characters to begin with).

Some of the column names are one-word names that are already valid. If I include those in the RENAME statement, SAS will generate an error (you cannot "rename" a variable to its current name). I used the

/* Generate new names to comply with SAS rules.                          */
/* Assumes names contain spaces, and can fix with COMPRESS               */
/* Other deviations (like special chars, names that start with a number) */
/* would need different adjustments                                      */
/* NVALID() function can check that a name is a valid V7 name           */
proc sql noprint;
 
  /* retain original names as labels */
  select cat("'",trim(name),"'n","=","'",trim(name),"'") 
     into :labelStmt separated by ' '  
  from sashelp.vcolumn where memname="PROBLY" and libname="WORK";
 
  select cat("'",trim(name),"'n","=",compress(name,,'kn')) 
     into :renameStmt separated by ' '  
  from sashelp.vcolumn where memname="PROBLY" and libname="WORK"
  /* exclude those varnames that are already valid */
  AND not NVALID(trim(name),'V7');
quit;

Step 4. Modify the data set with new names and labels using PROC DATASETS

With the body of the LABEL and RENAME statements built, it's time to plug them into a PROC DATASETS step. PROC DATASETS can change data set attributes such as variable names, labels, and formats without requiring a complete rewrite of the data -- it's a very efficient operation.

I include the LABEL statement first, since it references the original variable names. Then I include the RENAME statement, which changes the variable names to their new V7-compliant values.

Finally, I reset the VALIDVARNAME= option to the normal V7 sanity. (Unless you're running in SAS Enterprise Guide, in which case the option is already set to ANY by default. Check this blog post for a less disruptive method of setting/restoring options.)

proc datasets lib=work nolist ;
  modify probly / memtype=data;
  label &labelStmt.;
  rename &renameStmt.;
  /* optional: report on the var names/labels */
  contents data=probly nodetails;
quit;
 
/* reset back to the old rules */
options validvarname=v7;

Here's the CONTENTS output from the PROC DATASETS step, which shows the final variable attributes. I now have easy-to-code variable names, and they still have their descriptive labels. My data dictionary dreams are coming true!

DATASETS rename output

Download the entire program example from my public Gist: import_renameV7.sas.

The post How to download and convert CSV files for use in SAS appeared first on The SAS Dummy.

4月 152017
 

Every day before I even wake up, I have little "SAS robots" that do work for me. These are SAS batch jobs that gather data from external services and build data marts, generate reports, and send e-mail. One of those SAS jobs gathers Google Analytics data about our SAS blogs at blogs.sas.com.

With SAS 9.4 Maintenance 4, it's finally easy (well, relatively speaking) to connect to complicated APIs like those supported by Google, and to gather information with an unattended batch process. It's made possible by recent enhancements in PROC HTTP and the new JSON library engine in SAS. The PROC HTTP enhancements make it easier to negotiate multi-step authentication schemes like OAuth2. And of course, the JSON engine makes it easier to parse JSON results into SAS data sets. If you scour the Internet and SAS conference papers, you might find examples that use PROC GROOVY or other tricks to call outside of SAS to drive these OAuth2 APIs and parse JSON. The recent enhancements make such tricks unnecessary, and thus provide a cleaner approach with fewer moving parts.

Joseph Henry, a senior software developer at SAS (and primary developer for PROC HTTP), wrote a useful SAS Global Forum paper about PROC HTTP and OAuth2. Joseph definitely understands this better than I do, so I recommend that you read his paper first. Joseph's example uses Google Drive. Come back to this post when you want to see the instructions I've hacked together -- it's what I've got working for Google Analytics.

How to use the Google APIs

There are four main steps to use the Google APIs. Two of these steps need to be performed just once, and don't need to be automated. The other two steps are necessary each time you access a Google API from a program, and they can be scripted. Here are the steps:

  1. Grant permission and obtain an authorization token for your "app" (a SAS program, in our case). This must be performed in a browser while logged into your Google account.
  2. Obtain an access token and refresh token. You can accomplish this with a SAS program that you run just once. You then save the refresh token (in a safe place!) for subsequent runs.
  3. Convert your saved refresh token into an access token. Do this at the start of every SAS job that needs to get Google Analytics data.
  4. And finally, use the Google Analytics API to get actual data!

There is a lot to learn about how Google APIs work and how to provision your Google account with access to resources like Google Analytics. That's more than I can go into here, so I'm going to assume a few things and skip ahead. I'm going to assume that you already have an API project created, and that you have a "client ID" and "client secret". The Google API documentation is very good on this topic, and there are many Internet resources to help you learn this part of the process.

Also, some Google APIs are free to use, while others cost money. And some are rate-limited -- that is, you can call the APIs only so many times within a given period. Your specific limits might differ depending on the type of accounts you have with Google. If you have a corporate relationship with Google, you might have an admin who has to provision your specific Google account to get access to the APIs. I'm glossing over all of these nuances -- this post is simply about the code.

Step 1. Grant permission and retrieve auth code

Do this step only once per account, per API permission. You perform this step in the browser while logged into your Google account. Enter the following URL, substituting your client-id as indicated. The URL needs to be all on one line, but I've broken it up here for readability.

 https://accounts.google.com/o/oauth2/v2/auth?
         scope=https://www.googleapis.com/auth/analytics.readonly
         &redirect_uri=urn:ietf:wg:oauth:2.0:oob
         &response_type=code
         &client_id=<your-client-id>.apps.googleusercontent.com 

You will be prompted to allow the "app" access to your Google Analytics data (read only). This is similar to allowing another app to access your Facebook or Twitter profile -- you've probably seen similar prompts during your own Internet citizenship. Permissions are an important component of allowing apps to act on your behalf via REST APIs.

Then, you'll be redirected to a web page with an auth code that you should copy and save. We'll use it in the next step. Setting the redirect_uri properly is very important: redirect_uri=urn:ietf:wg:oauth:2.0:oob. Otherwise the API won't generate a code that you can use in a tool-based app like SAS.

Step 2. Exchange the auth code for an access token

Next, run this PROC HTTP step in SAS with the POST method to exchange that auth code from Step 1 for an access token. It will return a JSON response with a valid Bearer access token. That token expires in 3600 seconds (1 hour). It also returns a refresh_token, which you can exchange again for a new access token after the first one expires. The refresh_token never expires (though it can be revoked via the developer console or API). Thus, you usually need to perform this step just once, unless your token is revoked for some reason.

/* file to store your result */
filename token "c:\temp\token.json";
%let code_given =<code-returned-from-step-1> ;
%let oauth2=https://www.googleapis.com/oauth2/v4/token;
%let client_id=<your-client-id>.apps.googleusercontent.com;
%let client_secret=<your-client-secret>;
proc http
/* put this all on one line! */
 url="&oauth2.?client_id=&client_id.%str(&)code=&code_given.
      %str(&)client_secret=&client_secret.%str(&)redirect_uri=urn:ietf:wg:oauth:2.0:oob
      %str(&)grant_type=authorization_code%str(&)response_type=token"
 method="POST"
 out=token
;
run;

Note: the refresh_token and client-id/secret values should be protected! Anyone who has access to these can get to your Google API data as if they were you. Consider storing them in a file that only you have read access to, and programmatically pull them in when running your SAS program under your host account.

Step 3. Exchange the refresh_token for a valid access token

Typically, you'll include this step just once at the beginning of your SAS job. This takes your saved refresh_token value and asks Google to grant an access token for use in the rest of your program. The Google APIs run very fast -- you should not need to renew the token again within the same job.

/* STEP 3. Do this every time you want to use the GA API */
/* Turn in a refresh-token for a valid access-token      */
/* Should be good for 60 minutes                         */
/* So typically run once at beginning of the job.        */
%let oauth2=https://www.googleapis.com/oauth2/v4/token;
%let client_id=<your-client-id>.apps.googleusercontent.com;
%let client_secret=<your-client-secret>;
%let refresh_token=<refresh-token-from-step-2>;
 
filename rtoken temp;
proc http
 method="POST"
 /* Again, put this all on one line */
 /* broken here for readability */
 url="&oauth2.?client_id=&client_id.
     %str(&)client_secret=&client_secret.
     %str(&)grant_type=refresh_token%str(&)refresh_token=&refresh_token."
 out=rtoken;
run;
 
/* Read the access token out of the refresh response  */
/* Relies on the JSON libname engine (9.4m4 or later) */
libname rtok json fileref=rtoken;
data _null_;
 set rtok.root;
 call symputx('access_token',access_token);
run;

Step 4. Use the Google Analytics API to gather data

Finally, we're to the point where we can retrieve data from this service! And this is where those new features in PROC HTTP come into play. In SAS 9.4m3, Joseph added support for an inline HEADERS statement, which is perfect for injecting OAuth2 information like "Bearer" token into the HTTP header.

proc http
  url="<REST-api-url-call>"
  method="GET" out=ga_resp;
  /* Headers statement makes this easy */
  headers 
    "Authorization"="Bearer &access_token."
    "client-id:"="&client_id.";
 run;

I've noticed that a lot of REST API docs use cUrl (command-line URL) as in their examples. In my experience, PROC HTTP can do just about anything that cUrl can do, and these new options make it even easier to make cUrl-based examples work in SAS.

The Google Analytics API is complex on its own, so I suggest that you spend some time in the Google API Explorer to see what types of requests yield what sort of results. Google Analytics allows you to fetch metrics across different dimensions and dates, so the results can come back summarized across the dimensions that you request. Since we're bringing this data into SAS and I probably want to further summarize in my reports, I try to get as much "unsummarized" data as possible, and I avoid aggregations such as "Average views" or "Average time on page" -- as I don't want to risk calculating new summaries based on these.

I've included my program for retrieving the page views, total time on page, entrances and exits at the daily level. This is granular enough for me to create meaningful reports all of our blog authors. The program is a SAS macro that issues one API call per day in the range specified.

/* Metrics and dimensions are defined in the Google Analytics doc */
/* Experiment in the developer console for the right mix          */
/* Your scenario might be different and would require a different */
/* type of query                                                  */
/* The GA API will "number" the return elements as                */
/* element1, element2, element3 and so on                         */
/* In my example, path and title will be 1 and 2 */
%let dimensions=  %sysfunc(urlencode(%str(ga:pagePath,ga:pageTitle)));
/* then pageviews, uniquepageviews, timeonpage will be 3, 4, 5, etc. */
%let metrics=     %sysfunc(urlencode(%str(ga:pageviews,ga:uniquePageviews,ga:timeOnPage,ga:entrances,ga:exits)));
/* this ID is the "View ID" for the GA data you want to access   */
%let id=          %sysfunc(urlencode(%str(ga:<your-view-ID>)));
 
%macro getGAdata;
%do workdate = &enddate %to &startdate %by -1;
	%let urldate=%sysfunc(putn(&workdate.,yymmdd10.));
	filename ga_resp temp;
	proc http
         /* again, put this url= value ALL on a single line! */
	 url="https://www.googleapis.com/analytics/v3/data/ga
            ?ids=&id.%str(&)start-date=&urldate.
            %str(&)end-date=&urldate.
            %str(&)metrics=&metrics.
            %str(&)dimensions=&dimensions.%str(&)max-results=20000"
	 method="GET" out=ga_resp;
	 headers 
	   "Authorization"="Bearer &access_token."
	   "client-id:"="&client_id.";
	run;
 
        /* Love this JSON libname engine! */
	libname garesp json fileref=ga_resp;
 
	data ga.ga_daily%sysfunc(compress(&urldate.,'-')) (drop=element:);
		set garesp.rows;
		drop ordinal_root ordinal_rows;
		length date 8 url $ 300 title $ 250 
	          views 8 unique_views 8 time_on_page 8 entrances 8 exits 8
	          ;
		format date yymmdd10.;
		date=&workdate.;
		/* Corerce the elements into data variables */
		/* Basic on expected sequence               */
		url = element1;
		title = element2;
		views = input(element3, 5.);
		unique_views = input(element4, 6.);
		time_on_page=input(element5, 7.2);
		entrances = input(element6, 6.);
		exits = input(element7, 6.);
	run;
%end;
%mend;
 
%getGAdata;
 
/* Assemble the daily files into one data set */
data alldays_gadata;
  set ga.ga_daily:;
run;

When it's all finally working, SAS pulls in data that looks like this:
Google Analytics data via API

Re-entry and disclaimers

This has been a pretty deep post about Google APIs and how to make them work from SAS programs. If you have experience with the Google APIs already, then I'm optimistic that this article is enough for you to get things working in SAS. If you've never used OAuth2 or any of the Google APIs, then this post might not be quite enough. Don't forget to read Joseph's paper about OAuth2 and PROC HTTP. You might also like his paper about using RESTful APIs.

Please leave a comment if you have questions or if you think that I left something out. I'll do my best to correct and clarify.

The post Using SAS to access Google Analytics APIs appeared first on The SAS Dummy.

4月 092017
 

Thanks to a new open source project from SAS, Python coders can now bring the power of SAS into their Python scripts. The project is SASPy, and it's available on the SAS Software GitHub. It works with SAS 9.4 and higher, and requires Python 3.x.

I spoke with Jared Dean about the SASPy project. Jared is a Principal Data Scientist at SAS and one of the lead developers on SASPy and a related project called Pipefitter. Here's a video of our conversation, which includes an interactive demo. Jared is obviously pretty excited about the whole thing.

Use SAS like a Python coder

SASPy brings a "Python-ic" sensibility to this approach for using SAS. That means that all of your access to SAS data and methods are surfaced using objects and syntax that are familiar to Python users. This includes the ability to exchange data via pandas, the ubiquitous Python data analysis framework. And even the native SAS objects are accessed in a very "pandas-like" way.

import saspy
import pandas as pd
sas = saspy.SASsession(cfgname='winlocal')
cars = sas.sasdata("CARS","SASHELP")
cars.describe()

The output is what you expect from pandas...but with statistics that SAS users are accustomed to. PROC MEANS anyone?

In[3]: cars.describe()
Out[3]: 
       Variable Label    N  NMiss   Median          Mean        StdDev  
0         MSRP     .   428      0  27635.0  32774.855140  19431.716674   
1      Invoice     .   428      0  25294.5  30014.700935  17642.117750   
2   EngineSize     .   428      0      3.0      3.196729      1.108595   
3    Cylinders     .   426      2      6.0      5.807512      1.558443   
4   Horsepower     .   428      0    210.0    215.885514     71.836032   
5     MPG_City     .   428      0     19.0     20.060748      5.238218   
6  MPG_Highway     .   428      0     26.0     26.843458      5.741201   
7       Weight     .   428      0   3474.5   3577.953271    758.983215   
8    Wheelbase     .   428      0    107.0    108.154206      8.311813   
9       Length     .   428      0    187.0    186.362150     14.357991   

       Min       P25      P50      P75       Max  
0  10280.0  20329.50  27635.0  39215.0  192465.0  
1   9875.0  18851.00  25294.5  35732.5  173560.0  
2      1.3      2.35      3.0      3.9       8.3  
3      3.0      4.00      6.0      6.0      12.0  
4     73.0    165.00    210.0    255.0     500.0  
5     10.0     17.00     19.0     21.5      60.0  
6     12.0     24.00     26.0     29.0      66.0  
7   1850.0   3103.00   3474.5   3978.5    7190.0  
8     89.0    103.00    107.0    112.0     144.0  
9    143.0    178.00    187.0    194.0     238.0  

SASPy also provides high-level Python objects for the most popular and powerful SAS procedures. These are organized by SAS product, such as SAS/STAT, SAS/ETS and so on. To explore, issue a dir() command on your SAS session object. In this example, I've created a sasstat object and I used dot<TAB> to list the available SAS analyses:

SAS/STAT object in SASPy

The SAS Pipefitter project extends the SASPy project by providing access to advanced analytics and machine learning algorithms. In our video interview, Jared presents a cool example of a decision tree applied to the passenger survival factors on the Titanic. It's powered by PROC HPSPLIT behind the scenes, but Python users don't need to know all of that "inside baseball."

Installing SASPy and getting started

Like most things Python, installing the SASPy package is simple. You can use the pip installation manager to fetch the latest version:

pip install saspy

However, since you need to connect to a SAS session to get to the SAS goodness, you will need some additional files to broker that connection. Most notably, you need a few Java jar files that SAS provides. You can find these in the SAS Deployment Manager folder for your SAS installation:
../deploywiz/sas.svc.connection.jar
..deploywiz/log4j.jar
../deploywiz/sas.security.sspi.jar
../deploywiz/sas.core.jar

The jar files are compatible between Windows and Unix, so if you find them in a Unix SAS install you can still copy them to your Python Windows client. You'll need to modify the sascgf.py file (installed with the SASPy package) to point to where you've stashed these. If using local SAS on Windows, you also need to make sure that the sspiauth.dll is in your Windows system PATH. The easiest method to add SASHOMESASFoundation9.4coresasexe to your system PATH variable.

All of this is documented in the "Installation and Configuration" section of the project documentation. The connectivity options support an impressively diverse set of SAS configs: Windows, Unix, SAS Grid Computing, and even SAS on the mainframe!

Download, comment, contribute

SASPy is an open source project, and all of the Python code is available for your inspection and improvement. The developers at SAS welcome you to give it a try and enter issues when you see something that needs to be improved. And if you're a hotshot Python coder, feel free to fork the project and issue a pull request with your suggested changes!

The post Introducing SASPy: Use Python code to access SAS appeared first on The SAS Dummy.

3月 282017
 

Did you know that PROC SQL captures the record count for a result set in a special automatic macro variable? When you create a subset of data to include in a report, it's a nice touch to add a record count and other summaries as an eye-catcher to the report title. I often see the following pattern in SAS programs, which adds an extra step to get a record count:

proc sql noprint;
 
 create table result 
  as select * from sashelp.cars
  where origin='Asia';
 
 /* count the records in the result */
 select count(model) into :resultcount
  from result;
quit;
 
title "Summary of Cars from Asia: &resultcount. records";
proc means data=result;
run;

This creates a report with an informative title like this:

Here's the tip. Instead of including a SELECT INTO step that's going to make another pass through the data, you can rely on the &SQLOBS automatic macro variable. This variable holds the record "result set" count from the most recent SELECT clause.

proc sql noprint;
 create table result 
  as select * from sashelp.cars
  where origin='Asia';
 
 %let resultcount=&sqlobs;
quit;
 
title "Summary of Cars from Asia: &resultcount. records";
proc means data=result;
run;

Because SAS replaces the value with each subsequent SELECT clause, it's important to assign it to another macro variable immediately if you intend to use it later. Here's the result:

Not only is this more efficient, but SAS automatically trims the whitespace from the SQLOBS variable so that it looks better in a TITLE statement. If you're using SELECT INTO to populate macro variables for other reasons, you can use the TRIMMED keyword to achieve the same effect.

proc sql noprint;
 
 create table result 
  as select * from sashelp.cars
  where origin='Asia';
 
 %let resultcount=&sqlobs;
 
 select avg(mpg_highway) into: AvgMpg TRIMMED
  from result;
 
quit;
 
title "Summary of Cars from Asia: &resultcount. records, &AvgMpg. MPG Average";
proc means data=result;
run;

See also

The post How many records are in that PROC SQL result? appeared first on The SAS Dummy.