During my morning commute I heard an interesting news story about the merits and risks of the \$100 bill. Apparently there are a lot of them in circulation, but no one knows exactly where they are. They are seldom used for legitimate business transactions because when a transaction reaches into the hundreds of dollars, the business parties tend to resort to a digital transaction (or maybe a good-old-fashioned personal check).

One stat from the story stuck with me: 80 percent of the currency value that is in circulation right now is in \$100 bills. When I hear a number like that my first instinct is to find the data and verify it. It was easy to find on the US Federal Reserve: Currency in Circulation web site.

I could have copied and pasted the table for use in SAS, but the Fed already makes an ASCII version of the table available for nerds like me. With a bit of SAS code I was able to download the data, transpose it for analysis, calculate the currency values from the counts of the various notes, and create a graph that verifies what I heard on the news. Indeed, much of our nation's cash wealth is floating around on a bunch of Benjamins.

The percentage of wealth represented in the \$100 bill has grown over the past 20 years, perhaps at the expense of the far-more-ATM-friendly \$20 bill. In 1996, the ratio of hundreds-to-twenties was 60/20. Today it's 79/12.

If we look at the percent breakdown by bill counts (instead of value), you can see the shifts a bit differently.

I created those two plots by calculating the percentages with PROC FREQ, and then using PROC SGPLOT to graph Year, using the percentages as the response value and stacking the data by Denomination. I used GROUPORDER=Data to keep the data colors and legend consistent across the different graphs. The raw values are interesting to examine as well, as they more clearly show the trends for the past 20 years. Here are the two representations (cash value and bill counts) with their actual values and not just the percentages.
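For readers who want to try the PROC FREQ plus PROC SGPLOT pattern without the Fed data, here is a minimal sketch. The data set name, variable names, and every number in it are made-up placeholders (not real currency figures), and it is not the program used for the plots in this post.

```sas
/* Placeholder data: one row per year and denomination. */
/* The Value numbers are invented for illustration only. */
data work.currency;
  input Year Denomination :$4. Value;
  datalines;
1996 $20 20
1996 $50 10
1996 $100 70
2016 $20 10
2016 $50 10
2016 $100 80
;
run;

/* Row percentages: each denomination's share within its year. */
/* OUTPCT adds PCT_ROW and PCT_COL to the OUT= data set.       */
proc freq data=work.currency noprint;
  tables Year*Denomination / out=work.pct outpct;
  weight Value;
run;

/* Stacked bars by year; GROUPORDER=DATA keeps the group order stable */
proc sgplot data=work.pct;
  vbar Year / response=pct_row group=Denomination
              groupdisplay=stack grouporder=data;
  yaxis label="Percent of total";
run;
```

Using the WEIGHT statement lets PROC FREQ treat each row's Value as the count for that cell, so the same step would work for bill counts or for dollar values.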

An interesting aside: the \$2 bill remains a novelty. I had the opportunity to spend a few of these recently, and the person receiving them had to be convinced that they were real money. It made me want to secure more \$2 bills (a.k.a. "some Jeffersons") for future transactions -- that's the sort of trouble-making consumer that I like to be.

In case you're interested, here's the SAS program that I used -- try it yourself! (If using SAS University Edition, you'll have to download the text data first instead of relying on the FILENAME URL statement. Notes in the program.)


The post Visualized: US Currency in circulation, past and present appeared first on The SAS Dummy.

Colors are the subject of many romantic poems and songs, but there isn't much romance to be found in their hexadecimal values. With apologies to Van Morrison:

...Skipping and a jumping
In the misty morning fog with
Our hearts a thumpin' and you
My cx662F14 eyed girl

When it comes to specifying colors within a SAS program, you can always rely on the simple color names: red, blue, yellow, and so on. (You know, the colors you might remember from your first box of Crayola crayons.) You can even predict a few more exotic names such as "lightgreen" and "darkyellow" and even "olivedrab". Are you familiar with HTML color name standards? Most of those names work as well. But for true color precision, you might want to use the hexadecimal values or at least the super-descriptive SAS color names.
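As a small illustration of mixing the two styles, the sketch below uses a hex value for the markers and a color name for a reference line. The hex form is cx followed by two hex digits each for red, green, and blue. The data set (SASHELP.CLASS) and the reference value of 100 are arbitrary choices for this example.

```sas
/* cx662F14 = red 66, green 2F, blue 14 (the shade from the lyric above) */
proc sgplot data=sashelp.class;
  scatter x=height y=weight /
    markerattrs=(symbol=circlefilled color=cx662F14);
  /* A named color works in the same kind of option */
  refline 100 / axis=y lineattrs=(color=olivedrab);
run;
```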

In SAS Enterprise Guide, when you type a piece of SAS syntax that expects a color value, you'll find that the program editor pops up a helpful "color picker," displaying a long list of acceptable color names and their hex values. You can scroll through the list or use "type ahead" to find the color you want, then click or press Enter to accept it.

There's a keyboard shortcut that will invoke the color picker at any time: Ctrl+Shift+C. Use that when working on a SAS macro program or at any place the SAS program editor might not otherwise predict. By default, the editor will drop in the color name. You can change that behavior by visiting Program->Editor Options, Autocomplete tab. Select between the SAS color name or the more-obscure hex value. (Guaranteed to make your program more difficult to read, and thus helpful for job security.)

Are you using SAS Studio? You can also color your world with just a few keystrokes. This screenshot is from SAS Studio 3.6:

### More colorful resources

The post Tip for coding your color values in SAS Enterprise Guide appeared first on The SAS Dummy.

At SAS, we've published more repositories on GitHub as a way to share our open source projects and examples. These "repos" (that's Git lingo) are created and maintained by experts in R&D, professional services (consulting), and SAS training. Some recent examples include:

With dozens of repositories under the sassoftware account, it becomes a challenge to keep track of them all. So, I've built a process that uses SAS and the GitHub APIs to create reports for my colleagues.

## Using the GitHub API

GitHub APIs are robust and well-documented. Like most APIs these days, you access them using HTTP and REST. Most of the API output is returned as JSON. With PROC HTTP and the JSON libname engine (new in SAS 9.4 Maint 4), using these APIs from SAS is a cinch.

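The two API calls that we'll use for this basic report are:

• https://api.github.com/orgs/sassoftware, which returns metadata about the account, including the count of public repositories
• https://api.github.com/orgs/sassoftware/repos, which returns the details for each repository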

## Fetching the GitHub account metadata

The following SAS program calls the first API to gather some account metadata. Then, it stores a selection of those values in macro variables for later use.

```
/* Establish temp file for HTTP response */
filename resp temp;

/* Get Org metadata, including repo count */
proc http
  url="https://api.github.com/orgs/sassoftware"
  method="GET"
  out=resp;
run;

/* Read response as JSON data, extract select fields */
/* It's in the ROOT data set, found via experiment */
libname ss json fileref=resp;

data meta;
  set ss.root;
  call symputx('repocount',public_repos);
  call symputx('acctname',name);
  call symputx('accturl',html_url);
run;

/* log results */
%put &=repocount;
%put &=acctname;
%put &=accturl;
```

Here is the output of this program (as of today):

```
REPOCOUNT=66
ACCTNAME=SAS Software
ACCTURL=https://github.com/sassoftware
```

The important piece of this output is the count of repositories. We'll need that number in order to complete the next step.

## Fetching the repositories and stats

It turns out that the /repos API call returns the details for 30 repositories at a time. For accounts with more than 30 repos, we need to call the API multiple times with a page= query parameter to iterate through each batch. I've wrapped this process in a short macro that repeats the calls as many times as needed to gather all of the data. This snippet calculates the upper bound of my loop index:

```
/* Number of repos / 30, rounded up to next integer */
%let pages=%sysfunc(ceil(%sysevalf(&repocount / 30)));
```

Given the 66 repositories on the SAS Software account right now, that results in 3 API calls.

Each API call creates verbose JSON output with dozens of fields, only a few of which we care about for this report. To simplify things, I've created a JSON map that defines just the fields that I want to capture. I came up with this map by first allowing the JSON libname engine to "autocreate" a map file with the full response. I edited that file and whittled the result to just 12 fields. (Read my previous blog post about the JSON engine to learn more about JSON maps.)

The multiple API calls create multiple data sets, which I must then concatenate into a single output data set for reporting. Then to clean up, I used PROC DATASETS to delete the intermediate data sets.

First, here's the output data:

Here's the code segment, which is rather long because I included the JSON map inline.

```
/* This trimmed JSON map defines just the fields we want */
/* Created by using AUTOMAP=CREATE on JSON libname       */
/* then editing the generated map file to reduce to      */
/* minimum number of fields of interest                  */
filename repomap temp;
data _null_;
  infile datalines;
  file repomap;
  input;
  put _infile_;
  datalines;
{
  "DATASETS": [
    {
      "DSNAME": "root",
      "TABLEPATH": "/root",
      "VARIABLES": [
        {
          "NAME": "id",
          "TYPE": "NUMERIC",
          "PATH": "/root/id"
        },
        {
          "NAME": "name",
          "TYPE": "CHARACTER",
          "PATH": "/root/name",
          "CURRENT_LENGTH": 50,
          "LENGTH": 50
        },
        {
          "NAME": "html_url",
          "TYPE": "CHARACTER",
          "PATH": "/root/html_url",
          "CURRENT_LENGTH": 100,
          "LENGTH": 100
        },
        {
          "NAME": "language",
          "TYPE": "CHARACTER",
          "PATH": "/root/language",
          "CURRENT_LENGTH": 20,
          "LENGTH": 20
        },
        {
          "NAME": "description",
          "TYPE": "CHARACTER",
          "PATH": "/root/description",
          "CURRENT_LENGTH": 300,
          "LENGTH": 500
        },
        {
          "NAME": "created_at",
          "TYPE": "NUMERIC",
          "INFORMAT": [ "IS8601DT", 19, 0 ],
          "FORMAT": ["DATETIME", 20],
          "PATH": "/root/created_at",
          "CURRENT_LENGTH": 20
        },
        {
          "NAME": "updated_at",
          "TYPE": "NUMERIC",
          "INFORMAT": [ "IS8601DT", 19, 0 ],
          "FORMAT": ["DATETIME", 20],
          "PATH": "/root/updated_at",
          "CURRENT_LENGTH": 20
        },
        {
          "NAME": "pushed_at",
          "TYPE": "NUMERIC",
          "INFORMAT": [ "IS8601DT", 19, 0 ],
          "FORMAT": ["DATETIME", 20],
          "PATH": "/root/pushed_at",
          "CURRENT_LENGTH": 20
        },
        {
          "NAME": "size",
          "TYPE": "NUMERIC",
          "PATH": "/root/size"
        },
        {
          "NAME": "stars",
          "TYPE": "NUMERIC",
          "PATH": "/root/stargazers_count"
        },
        {
          "NAME": "forks",
          "TYPE": "NUMERIC",
          "PATH": "/root/forks"
        },
        {
          "NAME": "open_issues",
          "TYPE": "NUMERIC",
          "PATH": "/root/open_issues"
        }
      ]
    }
  ]
}
;
run;

/* GETREPOS: iterate through each "page" of repositories   */
/* and collect the GitHub data                             */
/* Output: SASSOFTWARE_ALLREPOS, a data set with all basic */
/* data about the account's public repositories            */
%macro getrepos;
  %do i = 1 %to &pages;
    proc http
      url="https://api.github.com/orgs/sassoftware/repos?page=&i."
      method="GET"
      out=resp;
    run;

    /* Use JSON engine with defined map to capture data */
    libname repos json map=repomap fileref=resp;
    data _repos&i.;
      set repos.root;
    run;
  %end;

  /* Concatenate all pages of data */
  data sassoftware_allrepos;
    set _repos:;
  run;

  /* delete intermediate repository data */
  proc datasets nolist nodetails;
    delete _repos:;
  quit;
%mend;

/* Run the macro */
%getrepos;
```

## Creating a simple report

Finally, I want to create a simple report listing all of the repositories and their top-level stats. I'm using PROC SQL without a CREATE TABLE statement, which will create a simple ODS listing report for me. I use this approach instead of PROC PRINT because I transformed a couple of the columns in the same step. For example, I created a new variable with a fully formed HTML link, which ODS HTML will render as an active link in the browser. Here's a snapshot of the output, followed by the code.

```
/* Best with ODS HTML output */
title "github.com/sassoftware (&acctname.): Repositories and stats";
title2 "ALL &repocount. repos, Data pulled with GitHub API as of &SYSDATE.";
title3 height=1 link="&accturl." "See &acctname. on GitHub";
proc sql;
  select
    catt('<a href="',t1.html_url,'">',t1.name,"</a>") as Repository,
    case
      when length(t1.description)>50
        then cat(substr(t1.description,1,49),'...')
      else t1.description
    end as Description,
    t1.language as Language,
    t1.created_at format=dtdate9. as Created,
    t1.pushed_at format=dtdate9. as Last_Update,
    t1.stars as Stars,
    t1.forks as Forks,
    t1.open_issues as Open_Issues
  from sassoftware_allrepos t1
  order by t1.pushed_at desc;
quit;
```

## Get the entire example

Not wanting to get too meta on you here, but I've placed the entire program on my own GitHub account. The program I've shared has a few modifications that make it easier to adapt for any organization or user on GitHub. As you play with this, keep in mind that the GitHub API is "rate limited" -- they allow only so many API calls from a single IP address in a certain period of time. That's to ensure that the APIs perform well for all users. You can use authenticated API calls to increase the rate-limit threshold for yourself, and I do that for my own production reporting process. But...that's a blog post for a different day.
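If you want to see where you stand against the rate limit, GitHub exposes a rate_limit endpoint that reports your current quota. The sketch below is one way to look at it from SAS, and it is not part of the shared program. The fileref and libref names are arbitrary, and MYTOKEN in the commented-out HEADERS statement is a hypothetical macro variable that you would have to define with your own token.

```sas
filename ratelim temp;

proc http
  url="https://api.github.com/rate_limit"
  method="GET"
  out=ratelim;
  /* For an authenticated call, a HEADERS statement could carry a token. */
  /* MYTOKEN is a hypothetical macro variable, not defined here:         */
  /* headers "Authorization"="token &mytoken.";                          */
run;

/* ALLDATA lists every name-value pair in the response */
libname rl json fileref=ratelim;
proc print data=rl.alldata;
run;
```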


The post Reporting on GitHub accounts with SAS appeared first on The SAS Dummy.

### TL; DR

Free training from SAS: "SAS Programming for R Users." The schedule of Live Web offerings is here. If you prefer self-study, the complete course materials are on the SAS Software GitHub space and you can practice with the free SAS University Edition software.

## The details: how R programmers can learn SAS for free

As much as I would love for SAS customers to use SAS to the exclusion of everything else, that rarely happens. Every time I visit a SAS customer, I hear about the other non-SAS tools that they use alongside SAS and their integration points. The most popular of these include desktop tools such as Microsoft Excel, or enterprise databases from other vendors. But increasingly, I hear from users who dabble in open source tools such as Python and R, or who work with other teams that use those tools.

Programmers tend to favor the programming languages that they know. When you learn a new programming language, your experience is colored by inevitable comparisons with the languages you've already mastered. If you work with R coders who want to learn SAS, you should consider that they probably won't learn SAS the same way that you did.

### A SAS programming course for experienced programmers

The traditional way to learn SAS begins with the DATA step, where you learn how to read files, how to write files, about the program data vector, and basically how the DATA step "thinks". Then you move on to the various procedures for descriptive stats, reporting, and maybe even some graphing. While this approach can make you productive with simple tasks quickly, to an R coder this might feel too much like "starting over." That's why R programmers (or even MATLAB or Stata users) need an approach that leverages what they already know to hit the ground running.

That's the thinking behind the new SAS Programming for R Users course. This course does not start with the basics about statistics or the importance of data prep -- the assumption is that you already know that. Instead, you'll get hands-on experience with SAS/IML -- a statistical matrix language that will certainly feel familiar to R users. You'll eventually get to the DATA step and other procedures, of course -- and these will open new worlds for you -- but you'll learn to be productive quickly using the skills you already have. (You can read more about the genesis of the course from its creator and main instructor, Jordan Bakerman.)

The course centers around classic and real statistical problems, from Bayesian logistic regression to the Monty Hall problem. If you don't know your statistics, you might feel that you're swimming in waters over your head. But if you're comfortable with the concepts, you should feel right at home. (If you're just beginning with statistics, SAS offers this different free e-learning course.)


"SAS Programming for R Users" also shows you how to use SAS and R together, submitting R code from within your SAS program. That's made possible by a special connection between SAS/IML and R -- something that SAS has supported for years.

This is a free instructor-led course that's offered in Live Web format. "Live Web" means that you connect from your desk at home or work, tune into the lecture and demos, and then practice your skills on a hosted classroom environment. And this course is free -- costing you only your time (5 half-day sessions). Check out the SAS Training site to see when the next offering might meet your schedule.

### Find the course materials on GitHub, right now

What if you can't find a Live Web offering that meets your schedule? In the spirit of openness, the SAS Training team has published the complete course materials on GitHub. You'll find the course notes (over 600 pages), data sets, and over 80 SAS programs to support the course exercises. You can use the free SAS University Edition to try the course exercises yourself and practice with the software. (The only part that you can't practice is the "submit to R" lessons, because the SAS University Edition doesn't support the connection to R.)

The post Learning SAS programming for R users appeared first on The SAS Dummy.

JSON is the new XML. The number of SAS users who need to access JSON data has skyrocketed, thanks mainly to the proliferation of REST-based APIs and web services. Because JSON is structured data in text format, we've been able to offer simple parsing techniques that use DATA step and most recently PROC DS2. But finally*, with SAS 9.4 Maintenance 4, we have a built-in LIBNAME engine for JSON.

### Simple JSON example: Who is in space right now?

Speaking of skyrocketing, I discovered a cool web service that reports who is in space right now (at least on the International Space Station). It's actually a perfect example of a REST API, because it does just that one thing and it's easily integrated into any process, including SAS. It returns a simple stream of data that can be easily mapped into a tabular structure. Here's my example code and results, which I produced with SAS 9.4 Maintenance 4.

```
filename resp temp;

/* Neat service from Open Notify project */
proc http
  url="http://api.open-notify.org/astros.json"
  method="GET"
  out=resp;
run;

/* Assign a JSON library to the HTTP response */
libname space JSON fileref=resp;

/* Print result, dropping automatic ordinal metadata */
title "Who is in space right now? (as of &sysdate)";
proc print data=space.people (drop=ordinal:);
run;
```

But what if your JSON data isn't so simple? JSON can represent information in nested structures that can be many layers deep. These cases require some additional mapping to transform the JSON representation to a rectangular data table that we can use for reporting and analytics.

### JSON map example: Most recent topics from SAS Support Communities

In a previous post I shared a PROC DS2 program that uses the DS2 JSON package to call and parse our SAS Support Communities API. The parsing process is robust, but it requires quite a bit of foreknowledge about the structure and fields within the JSON payload. It also requires many lines of code to extract each field that I want.

Here's a revised pass that uses the JSON engine:

```
/* split URL for readability */
%let url1=http://communities.sas.com/kntur85557/restapi/vc/categories/id/bi/topics/recent;
%let url2=?restapi.response_format=json%str(&)restapi.response_style=-types,-null,view;
%let url3=%str(&)page_size=100;
%let fullurl=&url1.&url2.&url3;

filename topics temp;

proc http
  url="&fullurl."
  method="GET"
  out=topics;
run;

/* Let the JSON engine do its thing */
libname posts JSON fileref=topics;
title "Automap of JSON data";

/* examine resulting tables/structure */
proc datasets lib=posts;
quit;
proc print data=posts.alldata(obs=20);
run;
```

Thanks to the many layers of data in the JSON response, here are the tables that SAS creates automatically.

There are 12 tables that contain various components of the message data that I want, plus the ALLDATA member that contains everything in one linear table. ALLDATA is good for examining structure, but not for analysis. You can see that it's basically name-value pairs with no data types/formats assigned.

I could use DATA steps or PROC SQL to merge the various tables into a single denormalized table for my reporting purposes, but there is a better way: define and apply a JSON map for the libname engine to use.

To get started, I need to rerun my JSON libname assignment with the AUTOMAP option. This creates an external file with the JSON-formatted mapping that SAS generates automatically. In my example here, the file lands in the WORK directory with the name "top.map".

```
filename jmap "%sysfunc(GETOPTION(WORK))/top.map";

proc http
  url="&fullurl."
  method="GET"
  out=topics;
run;

libname posts JSON fileref=topics map=jmap automap=create;
```

This generated map is quite long -- over 400 lines of JSON metadata. Here's a snippet of the file that describes a few fields in just one of the generated tables.

```
"DSNAME": "messages_message",
"TABLEPATH": "/root/response/messages/message",
"VARIABLES": [
  {
    "NAME": "ordinal_messages",
    "TYPE": "ORDINAL",
    "PATH": "/root/response/messages"
  },
  {
    "NAME": "ordinal_message",
    "TYPE": "ORDINAL",
    "PATH": "/root/response/messages/message"
  },
  {
    "NAME": "href",
    "TYPE": "CHARACTER",
    "PATH": "/root/response/messages/message/href",
    "CURRENT_LENGTH": 19
  },
  {
    "NAME": "view_href",
    "TYPE": "CHARACTER",
    "PATH": "/root/response/messages/message/view_href",
    "CURRENT_LENGTH": 134
  },
```

By using this map as a starting point, I can create a new map file -- one that is simpler, much smaller, and defines just the fields that I want. I can reference each field by its "path" in the JSON nested structure, and I can also specify the types and formats that I want in the final data.

In my new map, I eliminated many of the tables and fields and ended up with a file that was just about 60 lines long. I also applied sensible variable names, and I even specified SAS formats and informats to transform some columns during the import process. For example, instead of reading the message "datetime" field as a character string, I coerced the value into a numeric variable with a DATETIME format:

```
{
  "NAME": "datetime",
  "TYPE": "NUMERIC",
  "INFORMAT": [ "IS8601DT", 19, 0 ],
  "FORMAT": ["DATETIME", 20],
  "PATH": "/root/response/messages/message/post_time/_",
  "CURRENT_LENGTH": 8
},
```

I called my new map file 'minmap.map' and then re-issued the libname without the AUTOMAP option:

```
filename minmap 'c:\temp\minmap.map';

proc http
  url="&fullurl."
  method="GET"
  out=topics;
run;

libname posts json fileref=topics map=minmap;
proc datasets lib=posts;
quit;

data messages;
  set posts.messages;
run;
```

Here's a snapshot of the single data set as a result.

I think you'll agree that this result is much more usable than what my first pass produced. And the amount of code is much smaller and easier to maintain than any previous SAS-based process for reading JSON.
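If you'd like to experiment with JSON maps without calling a live API, the following self-contained sketch writes a tiny JSON payload and a matching map to temporary files, then reads them with the JSON engine. The payload, its field names (items, title, post_time), the dates, and all fileref/libref names are invented for this illustration; only the map keywords mirror the ones shown above.

```sas
/* A made-up JSON payload, written to a temp file */
filename demojson temp;
data _null_;
  file demojson;
  put '{"items":[';
  put '{"title":"First","post_time":"2017-01-05T10:15:00"},';
  put '{"title":"Second","post_time":"2017-01-06T08:30:00"}';
  put ']}';
run;

/* A hand-written map: one table, two variables.            */
/* post_time is read as a numeric datetime, not as a string. */
filename demomap temp;
data _null_;
  file demomap;
  put '{"DATASETS":[{"DSNAME":"items","TABLEPATH":"/root/items",';
  put '"VARIABLES":[';
  put '{"NAME":"title","TYPE":"CHARACTER",';
  put ' "PATH":"/root/items/title","LENGTH":20},';
  put '{"NAME":"posted","TYPE":"NUMERIC",';
  put ' "INFORMAT":["IS8601DT",19,0],"FORMAT":["DATETIME",20],';
  put ' "PATH":"/root/items/post_time"}';
  put ']}]}';
run;

/* No AUTOMAP option: the engine uses the map exactly as written */
libname demolib json fileref=demojson map=demomap;

proc print data=demolib.items;
run;
```

Because the map names the table and variables explicitly, the resulting ITEMS table should contain only title and posted, with none of the automatic ordinal columns.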

Here's the complete program in a public GitHub gist, including my custom JSON map.

* By the way, tags: JSON, REST API, SAS programming

The post Reading data with the SAS JSON libname engine appeared first on The SAS Dummy.

In my earlier post about WHERE and IF statements, I announced that the DATA step debugger has finally arrived in SAS Enterprise Guide. (I admit that I might have buried the lead in that post.) Let's use this post to talk about the new debugger and how it works.

First, let's address some important limitations. This tool is for debugging DATA step code. It can't be used to debug PROC SQL or PROC IML or SAS macro programs. Next, it can't be used to debug DATA steps that read data from CARDS or DATALINES. That's an unfortunate limitation, but it's a side effect of the way the DATA step "debug" mode works with client applications like SAS Enterprise Guide. (Workaround: load your data in a separate step, then debug your more complex DATA step logic in a subsequent step.)
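The workaround might look like the sketch below: one small step that only reads the DATALINES, and a second step that holds the logic you actually want to debug. The data set names, variables, and values are invented for illustration.

```sas
/* Step 1: read the inline data. Nothing to debug here. */
data work.raw;
  input name $ score;
  datalines;
Ann 90
Bob 72
Cy 85
;
run;

/* Step 2: the logic worth stepping through reads from a data set, */
/* so this is the step to run in the debugger.                     */
data work.graded;
  set work.raw;
  length grade $1;
  if score >= 80 then grade = 'A';
  else grade = 'B';
run;
```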

### Ye olde DATA step debugger

1986 called; they want their debugger back.

If you've been around SAS programs for a while then you might remember the full-screen DATA step debugger in the SAS windowing environment. Introduced as production in SAS 6.09E (E="enhanced!"), it was basic but it did the job, relying on command-line processing to direct the debugger actions. It had only two windows: one for the source, and one for the "log", meaning the debugger console log. You could set breakpoints, variable watch conditions, examine variables and calculate values -- all with commands that you typed. (Even though I'm writing this in the past tense and it seems like I'm eulogizing, this debugger still lives on in Base SAS!)

### The new DATA step debugger

The new debugging environment, introduced in SAS Enterprise Guide 7.13, has all of the features of its ancestor. And it's much more usable, with toolbars and windows that allow you to control its behavior. But keyboard junkies, don't worry -- that command line is still there too!

To activate the debugger, click the new "bug" toolbar icon in the program editor window. Once activated, you can click the bug in the left "gutter" of the program editor to begin a debug session. (You can also press F5 to debug the active DATA step.)

Examine the screenshot below. You see the source window on top and the console window at the bottom, plus a convenient "watch" window that shows much of the content in the program data vector (PDV). That's all of the variables defined in the DATA step, plus automatic variables like _N_ and _ERROR_.

As you step through the DATA step, the line pointer in the source window advances to show the next line that will execute. You can use keyboard shortcuts (F10), the toolbar, or a typed command ("step") to execute that line and advance. With every step, the watch window is updated with the latest values of the variables in your step. When a variable changes value, it's colored red. If you want the DATA step to break processing when a certain variable changes value, check the Watch box for that variable.

### Diving deeper with advanced debugging

Here's another example of debugging a different DATA step program. This program uses a BY statement and FIRST.variable logic, and you can see the additional automatic variables (FIRST.Make and LAST.Make) that the debugger is tracking. I also used END=eof on the SET statement; that adds the eof "flag" variable into the mix during run time.

In the Debug Console window you can see that I've issued some pretty fancy commands. The DATA step debugger allows you to set breakpoints that trigger on specific conditions. For example, "b 8 when (running_price > 10000)" will break on Line 8 when the value of running_price exceeds 10,000. "b 8 after 5" will break on Line 8 after 5 passes through the DATA step. You can set and clear line-specific breakpoints by clicking in the "gutter" (that left-hand margin next to the line numbers).
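The program from the screenshot isn't reproduced in this post, but a DATA step of the kind described (BY-group processing, a running total, and an END= flag) could look like the sketch below. It uses SASHELP.CARS and its MSRP variable as stand-in data, which is an assumption on my part, so the line numbers in the breakpoint commands above won't necessarily line up with it.

```sas
/* Sort a copy so that BY-group processing is valid */
proc sort data=sashelp.cars out=work.cars;
  by Make;
run;

data work.make_totals;
  set work.cars end=eof;
  by Make;
  /* Reset the running total at the start of each Make */
  if first.Make then running_price = 0;
  /* Sum statement: running_price is retained across iterations */
  running_price + MSRP;
  /* Write one record per Make */
  if last.Make then output;
  if eof then put "NOTE: Reached the last observation.";
  keep Make running_price;
run;
```

Stepping through the second step in the debugger should show FIRST.Make, LAST.Make, and eof in the watch window alongside running_price.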

The "list _all_" command reveals the details about your open data sets and files. Here's what I see during the run of my program.

Other commands let you SET variable values, EXAMINE variables, CALCulate expressions, GO and JUMP to specific lines, and more. The SAS documentation contains a complete reference for DATA step debugger commands, and most of those work exactly as documented, even within SAS Enterprise Guide. Here's the list:

This old-but-still relevant SAS Global Forum paper (written by a SAS user) also covers some useful debugging concepts in SAS which you can apply in this new environment.

### A personal note: eating my words

I've presented "SAS Enterprise Guide for SAS programmers" as a topic in one form or another for the past 15 years. Every so often the topic of the DATA step debugger comes up, and I've said "don't look for it anytime soon." Knowing how the full-screen debugger is closely tied to the SAS windowing environment, I didn't hold out hope for a client application like SAS Enterprise Guide to get it working. Kudos to the R&D team! They creatively found a solution with the "/ldebug" option, an even more obscure debugging approach that works in SAS batch mode. I think this feature will be a tremendous productivity boost for experienced SAS programmers, and a useful learning and teaching tool for those just getting started with the DATA step.

The post Using the DATA step debugger in SAS Enterprise Guide appeared first on The SAS Dummy.

In the DATA step, the WHERE statement and the IF statement (a.k.a. the "subsetting IF") have similar functions. In many scenarios, they produce identical results. But new SAS programmers are taught early on that these two statements work very differently, and in important ways. To understand the differences, it helps to step through the program line-by-line to see how SAS "thinks." Fortunately, the new DATA step debugger in SAS Enterprise Guide 7.13 makes this really easy to do.

### Difference between WHERE statement and IF statement

Here are the basics: the WHERE statement is applied when the DATA step is compiled. Incoming data (from a SET or MERGE statement) is filtered immediately to just those records that match the WHERE condition, so only those records are ever loaded into the program data vector (PDV). This results in fewer iterations through DATA step code, but provides no opportunity for "dynamic" decisions about which records to examine.

In contrast, the IF statement is evaluated at run time, and operates on the variables in the PDV. When the IF condition is met, the current observation is kept for eventual output. Unlike the WHERE statement, the IF statement can examine values of new variables that are defined within the step.
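To make that last point concrete, here is a minimal sketch that filters on a variable computed inside the step. The variable name ratio and the cutoff of 1.5 are arbitrary choices for illustration.

```sas
data work.heavier;
  set sashelp.class;
  /* ratio does not exist in SASHELP.CLASS; it is created here */
  ratio = weight / height;
  /* The subsetting IF can test it, because it lives in the PDV */
  if ratio > 1.5;
run;
```

Replacing the IF with `where ratio > 1.5;` would produce an error instead, because WHERE can reference only variables that already exist in the input data set.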

Consider these two DATA steps. They produce identical output of 10 records, but the first one processes only those 10 records whereas the second step processes all 19 records from the input.

```
data results1;
  set sashelp.class;
  /* WHERE applied at compile time */
  /* Processes ONLY matching obs */
  where sex='M';
run;

data results2;
  set sashelp.class;
  /* IF evaluated at run time */
  /* Processes EVERY obs */
  if sex='M';
run;
```

### Using the DATA step debugger to understand the DATA step

The new DATA step debugger in SAS Enterprise Guide makes it very easy to illustrate how WHERE is processed differently from IF. I loaded each of the above programs into my session, then clicked the new "bug" toolbar icon to activate the debugger. Once activated, you can click the bug in the left "gutter" of the program editor to begin a debug session. (You can also press F5 to debug the active DATA step.)

Watch this first animation of a debugger session and see what you notice about the WHERE statement.

Watching this little movie, I see a few things that reveal some insights.

• The statement pointer never lands on Line 5 (the WHERE statement). That's because the WHERE statement isn't processed at run time.
• Even though the CLASS data contains 19 records, the value of the _N_ automatic variable reaches only 11, indicating that only 10 records were processed.
• The variable watch window uses red to indicate when a variable changes between iterations. The Sex variable never changes from 'M', and thus stays colored black through the entire session.

Let's compare that to the IF statement. Study this animation and see what stands out to you.

Here's what I see:

• The statement pointer begins at Line 2, then 5, and moves to Line 6 (the RUN statement) only when the record has made it past the IF condition and into the output. For each observation where Sex='F', the DATA step stops processing the record and the RUN statement is skipped.
• In this program, _N_ reaches 20 -- that's because all 19 records in SASHELP.CLASS are processed and the step exits at the end-of-file condition.

### Learning more about subsetting IF, IF-THEN, WHERE, and debugging

There are several good articles about how the IF statement works, on its own and in combination with IF-THEN-ELSE constructs. Here's a recent article by SAS trainer Charu Shankar. And here's another reference that's included in a piece about the Top 10 SAS coding efficiencies.

The new DATA step debugger in SAS Enterprise Guide opens a new world of understanding for beginner and veteran SAS programmers. It has all of the functions of the "classic" debugger available in the Base SAS windowing environment, but with a much friendlier user interface, keyboard shortcuts, and useful watch windows. In a future post, I'll cover the debugging functions in more detail.

The post Debugging the difference between WHERE and IF in SAS appeared first on The SAS Dummy.

Rick Wicklin showed us how to visualize the ages of US Presidents at the time of their inaugurations. That's a pretty relevant thing to do, as the age of the incoming president can indirectly influence aspects of the president's term, thanks to health and generational factors.

As part of his post, Rick supplied the complete data set for US Presidents and their birthdays. He challenged his readers to create their own interesting visualizations, and that's what I'm going to do here. I'm going to show you the distribution of US Presidents by their astrological signs.

Now, you might think that "your sign" is not as relevant of a factor as Age, and I certainly hope that you're correct about that. But past presidents have sought the advice of astrologers, and zodiac signs can influence the counsel such astrologers might offer. (Famously, Richard Nixon took advice from celebrity psychic Jeane Dixon. First Lady Nancy Reagan also sought her advice, and we know that Mrs. Reagan in turn influenced President Reagan.)

Like any good analyst, I mostly reused existing work to produce my results. First, I used the DATA step that Rick provided to create the data set of presidents and birthdays. Next, I reused my own work to create a SAS format that displays a zodiac sign for each date. And finally, I wrote a tiny bit of PROC FREQ code to create my table and frequency plot.

```
data signs;
  /* So this column appears first */
  retain President;
  length sign 8;
  /* SIGN. format created earlier with PROC FORMAT */
  format sign sign.;
  set presidents (keep=President BirthDate InaugurationDate);
  /* convert birthday to our normalized SIGN date */
  sign = mdy(month(birthdate),day(birthdate),2000);
run;

ods graphics on;
proc freq data=signs order=freq;
  tables sign / plots=freqplot;
run;
```
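
The SIGN. format itself isn't shown here, so here is a rough sketch of what such a format could look like. It is an assumption, not the original definition. It maps date ranges within the year 2000 (the year the DATA step normalizes every birthday to) onto sign names. The boundary dates follow one common convention and vary by a day or so between sources. A format like this would need to be defined before the DATA step runs.

```sas
/* Sketch only: cutoff dates are an assumed convention,   */
/* not taken from the original SIGN. format definition.   */
proc format;
  value sign
    '01JAN2000'd - '19JAN2000'd = 'Capricorn'
    '20JAN2000'd - '18FEB2000'd = 'Aquarius'
    '19FEB2000'd - '20MAR2000'd = 'Pisces'
    '21MAR2000'd - '19APR2000'd = 'Aries'
    '20APR2000'd - '20MAY2000'd = 'Taurus'
    '21MAY2000'd - '20JUN2000'd = 'Gemini'
    '21JUN2000'd - '22JUL2000'd = 'Cancer'
    '23JUL2000'd - '22AUG2000'd = 'Leo'
    '23AUG2000'd - '22SEP2000'd = 'Virgo'
    '23SEP2000'd - '22OCT2000'd = 'Libra'
    '23OCT2000'd - '21NOV2000'd = 'Scorpio'
    '22NOV2000'd - '21DEC2000'd = 'Sagittarius'
    /* Capricorn wraps around the end of the year */
    '22DEC2000'd - '31DEC2000'd = 'Capricorn';
run;
```

Because 2000 is a leap year, a February 29 birthday still converts to a valid date when it is normalized with the MDY function.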

To keep things a bit fresh, I did all of this work in SAS University Edition using the Jupyter Notebook interface. Here's a glimpse of what it looks like:

And here's the distribution you've all been waiting to see. When he takes office, Donald Trump will join George H. W. Bush and JFK in the Gemini column.

I've shared the Jupyter Notebook file as a public gist on GitHub. You can download it and import it into your own instance if you have SAS and Jupyter Notebook working together. (Having trouble rendering the notebook file? Try looking at it through the nbviewer service. That usually works.)

The post Zodiac signs of US Presidents appeared first on The SAS Dummy.

Because "Copy files" is a custom task, you have to download the task package (from this blog) and follow a few steps to install the task into your SAS Enterprise Guide environment. When installed, the task can be found in the Tools → Add-In menu.

### Copy Files task moves to the Tasks → Data menu

SAS Enterprise Guide 7.13 is set to release within the next couple of weeks (near the end of November 2016), and it contains several exciting new features that I'll describe in this blog. Many of you will see it immediately when SAS Enterprise Guide prompts you to update. Stay tuned!

The post The Copy Files task is going legit (and moving) appeared first on The SAS Dummy.

SAS Community member @tc (a.k.a. Ted Conway) has found a new toy: ODS Graphics. Using PROC SGPLOT and GTL (Graph Template Language), along with some creative data prep steps, Ted has created several fun examples that show off what you can do with a bit of creativity, some math knowledge, and open data.

And bonus -- since most of his examples work with SAS University Edition, it's easy for you to try them yourself. Here are some of my favorites.

### Learn to draw a Jack-O-Lantern

Using the GIF output device and free data from Math-Aids.com, Ted shows how to use GTL (PROC TEMPLATE and PROC SGRENDER) to animate this Halloween icon.

### The United Polygons of America

Usually map charts with SAS require specialized procedures and map data, but here's a technique that can plot a stylized version of the USA and convey some interesting data. (You might have seen this one featured in a SAS Tech Report newsletter. Do you subscribe?)

### A look at Katie Ledecky's dominance

Using a vector plot, Ted shows how this championship swimmer dominated her event during the summer games in Rio. This example also contains a lot of text information, a cool trick made possible by the AXISTABLE statement in PROC SGPLOT. Click on the image for a closer look.

### Demonstrating the Bublé Sort

This example is nerdy on so many levels. It's a take on the Computer Science 101 concept of "bubble sort," an algorithm for placing a collection of items in a desired order. In this case, the items consist of Christmas songs recorded by Michael Bublé, that dreamy crooner from Canada.

Ted posts these examples (and more) in the SAS/GRAPH and ODS Graphics section of SAS Support Communities. That's a great place to learn SAS graphing techniques, from simple to advanced, and to see what other practitioners are doing. Experts like Ted hang out there, and the SAS visualization developers often post answers to the tricky questions.

#### More from @tc

In addition to his community posts, Ted is an award-winning contributor to SAS Global Forum with some very popular presentations. Here are a few of his papers.

The post Binge on this series: Fun with ODS Graphics appeared first on The SAS Dummy.