Chris Hemedinger

January 31, 2023

SAS has released SAS 9.4 Maintenance 8, a major update to SAS 9.4.

Security is the primary focus of the Maintenance 8 update. This release contains updates for many of the third-party technologies that are used by the platform, including the Java runtime environment (JRE) and many of the third-party JAR files. This release also adds support for major releases of supported operating systems, while limiting support for operating systems that are no longer supported by their respective suppliers. My colleague Margaret Crevar has summarized these changes in this SAS Communities post.

As with all SAS maintenance releases, this release "rolls up" the hotfixes and enhancements delivered since the last major update (SAS 9.4 Maintenance 7). Most SAS platform products and solutions have also been updated to remain compatible with this release and take advantage of enhancements. However, there are some products and solutions that will not be available immediately, or that will not deliver support for SAS 9.4 Maintenance 8.

Because SAS 9.4 Maintenance 8 is a major software release for SAS 9.4, it is covered by the SAS Support policy for the "Standard Support" timeframe according to its general availability date: Jan 31, 2023. (Current policy offers Standard Support for 5 years from the GA date.)

While this maintenance release doesn't contain new features, it does demonstrate the commitment of SAS to support users of the SAS 9.4 platform for many years to come. (See the section "Your platform, your way" in "Your analytics, your way" from Shadi Shahin.) New data and analytics capabilities are delivered in the SAS Viya platform, which delivers releases on a monthly cadence via its continuous delivery model.

For an overview of all product changes and updates in SAS 9.4 Maintenance 8, see the What's New topic in the SAS documentation.

The post SAS 9.4 Maintenance 8 is available was published on SAS Users.

March 17, 2022

I created this project as a fun exercise to emulate the popular game Wordle in the SAS language. I was inspired by this story about a GitHub user who implemented Wordle in a Bash shell. (See the Bash script here. Read the comments -- it's an amazing stream of improved versions, takes in different programming languages, and "code golf" attempts to reduce the lines of code.)

While developing my SAS solution, I created a Twitter poll to ask how other SAS programmers might approach it.

twitter poll
For me it was always going to be arrays, since that's what I know best. I'd love to be able to dash out a hash object approach or even use SAS/IML, but it would take me too long to wrap my brain around these. The PRX* regex function choice is a bit of a feint -- regular expressions seem like a natural fit (pattern matching!), but Wordle play has nuances that make regex less elegant than you might guess. Prove me wrong!
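
To see why regex falls short, here's a quick sketch of Wordle's scoring rule -- in Python just to keep it language-neutral (my actual game is pure SAS, and this is not code from that project). The trick is duplicate letters: a letter scores yellow only while an unmatched copy remains in the solution, which calls for two passes rather than a single pattern match.

```python
# Greens are claimed first; a letter scores yellow only while
# unmatched copies remain in the solution word.
def score(guess, solution):
    result = ["gray"] * len(guess)
    remaining = []  # solution letters not matched in place
    # Pass 1: exact-position matches (green)
    for i, (g, s) in enumerate(zip(guess, solution)):
        if g == s:
            result[i] = "green"
        else:
            remaining.append(s)
    # Pass 2: right letter, wrong position (yellow), honoring letter counts
    for i, g in enumerate(guess):
        if result[i] != "green" and g in remaining:
            result[i] = "yellow"
            remaining.remove(g)
    return result

print(score("crane", "crane"))  # all green
print(score("eerie", "crane"))  # only one 'e' can score, and it's the green one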

Two SAS users from Japan, apparently inspired by my poll, each implemented their own games! I've added links to their work at the end of this article.

How to code Wordle in SAS

You can see my complete SAS code for the game here: sascommunities/wordle-sas.

The interesting aspects of my version of the game include:

  • Uses the "official" word lists from the New York Times as curated by cfreshman. I found these while examining the Bash script version. I used PROC HTTP to pull this list dynamically.
  • Also verifies guesses as "valid" using the list of allowed guesses, again curated by cfreshman. You know you can't submit just any 5 characters as a guess, right? That's an important part of the game.
  • Uses a DATA step array to verify guesses against the solution word.
  • Uses a DATA step object method to create a gridded output of the game board. Credit to my SAS friends in Japan for this idea!

I've added a screenshot of example game play below. This was captured in SAS Studio running in SAS Viya.

Example game with output

Example game play for my Wordle in SAS

How to play Wordle in SAS

To play:

  1. Submit the program in your SAS session. This program should work in PC SAS, SAS OnDemand for Academics, SAS Enterprise Guide, and SAS Viya.

    The program will fetch word lists from GitHub and populate into data sets. It will also define two macros you will use to play the game.

  2. Start a game by running:

    %startGame;

    This will select a random word from the word list as the "puzzle" word and store it in a SAS macro variable (don't peek!).
  3. Optionally, seed a game with a known word by using an optional 5-character word parameter:

    %startGame(crane);

    This will seed the puzzle word ("crane" in this example). It's useful for testing. See a battery of test "sessions" in the GitHub project.
  4. Submit a first guess by running the %guess() macro with your word, for example:

    %guess(adieu);

    This will check the guess against the puzzle word, and it will output a report with the familiar "status": letters that appear in the word (yellow) and letters that are in the correct position (green). It will also report if the guess is not a valid guess word, and an invalid guess won't count against you as one of your 6 permitted guesses.

Use the %guess() macro (one at a time) to submit additional guesses. The program keeps track of your guesses and when you solve it, it shares the familiar congratulatory message that marks the end of a Wordle session. Ready to play again? Use the %startGame macro to reset.

Start a fresh game using Git functions

If you don't want to look at or copy/paste the game code, you can use Git functions in SAS to bring the program into your SAS session and play. (These Git functions require at least SAS 9.4 Maint 6 or SAS Viya.)

options dlcreatedir;
%let repopath=%sysfunc(getoption(WORK))/wordle-sas;
libname repo "&repopath.";
data _null_;
    rc = gitfn_clone("https://github.com/sascommunities/wordle-sas", "&repopath.");
    put 'Git repo cloned ' rc=; 
run;
/* program file name assumed -- check the repo for the exact name */
%include "&repopath./wordle-sas.sas";
/* start a game and submit first guess */
%startGame;
%guess(adieu);


I know that my program could be more efficient...perhaps at the cost of readability. Also it's possible that I have some lingering bugs, although I did quite a bit of testing and bug-fixing along the way. Pro tip: The DATA step debugger (available in SAS Enterprise Guide and in SAS Viya version of SAS Studio) was a very useful tool for me!

Your feedback/improvements are welcome. Feel free to comment here or on the GitHub project.

Wordle games in SAS by other people

SAS users in Japan quickly implemented their own versions of Wordle-like games. Check them out:

The post Programming the Wordle game in SAS appeared first on The SAS Dummy.

July 7, 2021

When I was a computer science student in the 1980s, our digital alphabet was simple and small. We could express ourselves with the letters A..Z (and lowercase a..z) and numbers (0..9) and a handful of punctuation and symbols. Thanks to the ASCII standard, we could represent any of these characters in a single byte (actually just 7 bits). This allowed for a generous 128 different characters, and we had character slots to spare. (Of course for non-English and especially non-latin characters we had to resort to different code pages...but that was before the Internet forced us to work together. Before Unicode, we lived in a digital Tower of Babel.)

Even with the limited character set, pictorial communication was possible with ASCII through the fun medium of "ASCII art." ASCII art is basically the stone-age version of emojis. For example, consider the shrug emoji: 🤷

Its ASCII-art ancestor is this: ¯\_(ツ)_/¯ While ASCII art currently enjoys a retro renaissance, the emoji has become indispensable in our daily communications.

Emojis before Unicode

Given the ubiquity of emojis in every communication channel, it's sometimes difficult to remember that just a few years ago emoji characters were devised and implemented in vendor-specific offerings. As the sole Android phone user in my house, I remember a time when my iPhone-happy family could express themselves in emojis that I couldn't read in the family group chat. Apple would release new emojis for their users, and then Android (Google) would leapfrog with another set of their own fun symbols. But if you weren't trading messages with users of the same technology, then chunks of your text would be lost in translation.

Enter Unicode. A standard system for encoding characters that allows for multiple bytes of storage, Unicode has seemingly endless runway for adding new characters. More importantly, there is a standards body that sets revisions for Unicode characters periodically so everyone can use the same huge alphabet. In 2015, emoji characters were added into Unicode and have been revised steadily with universal agreement.
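
The "multiple bytes of storage" point is easy to see for yourself. Here's a small Python sketch (illustrative only -- nothing to do with SAS session encoding yet) showing how UTF-8 spends one byte on an ASCII letter but up to four on an emoji:

```python
# How many bytes UTF-8 needs for characters mentioned in this post
for ch in ["A", "é", "ツ", "🤷"]:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex()}")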

This standardization has helped to propel emojis as a main component of communication in every channel. Text messages, Twitter threads, Venmo payments, Facebook messages, Slack messages, GitHub comments -- everything accepts emojis. (Emojis are so ingrained and expected that if you send a Venmo payment without using an emoji and just use plain text, it could be interpreted as a slight or at the least as a miscue.)

For more background about emojis, read How Emojis Work (source: How Stuff Works).

Unicode is essential for emojis. In SAS, the use of Unicode is possible by way of UTF-8 encoding. If you work in a modern SAS environment with a diverse set of data, you should already be using ENCODING=UTF8 as your SAS session encoding. If you use SAS OnDemand for Academics (the free environment for any learner), this is already set for you. And SAS Viya offers only UTF-8 -- which makes sense, because it's the best for most data and it's how most apps work these days.

Emojis as data and processing in SAS

Emojis are everywhere, and their presence can enrich (and complicate) the way that we analyze text data. For example, emojis are often useful cues for sentiment (smiley face! laughing-with-tears face! grimace face! poop!). It's not unusual for a text message to be ALL emojis with no "traditional" words.

The website Unicode.org maintains the complete compendium of emojis as defined in the latest standards. They also provide the emoji definitions as data files, which we can easily read into SAS. This program reads all of the data as published and adds features for just the "basic" emojis:

/* MUST be running with ENCODING=UTF8 */
filename raw temp;
/* URL assumed: the published emoji sequence data from Unicode.org */
proc http
 url="https://unicode.org/Public/emoji/latest/emoji-sequences.txt"
 method="GET"
 out=raw;
run;
ods escapechar='~';
data emojis (drop=line);
length line $ 1000 codepoint_range $ 45 val_start 8 val_end 8 type $ 30 comments $ 65 saschar $ 20 htmlchar $ 25;
infile raw;
input;
line = _infile_;
/* skip comment and blank lines */
if substr(line,1,1)^='#' and line ^= ' ' then do;
 /* read the raw codepoint value - could be single, a range, or a combo of several */
 codepoint_range = scan(line,1,';');
 /* read the type field */
 type = compress(scan(line,2,';'));
 /* text description of this emoji */
 comments = scan(line,3,'#;');
 /* for those emojis that have a range of values */
 val_start = input(scan(codepoint_range,1,'. '), hex.);
 if find(codepoint_range,'..') > 0 then
  val_end = input(scan(codepoint_range,2,'.'), hex.);
 else val_end=val_start;
 if type = "Basic_Emoji" then do;
  saschar = cat('~{Unicode ',scan(codepoint_range,1,' .'),'}');
  htmlchar = cats('<span>&#x',scan(codepoint_range,1,' .'),';</span>');
 end;
 output;
end;
run;
proc print data=emojis; run;

(As usual, all of the SAS code in this article is available on GitHub.)

The "features" I added include the Unicode representation for an emoji character in SAS, which could then be used in any SAS report in ODS or any graphics produced in the SG procedures. I also added the HTML-encoded representation of the emoji, which uses the form &#xNNNN; where NNNN is the Unicode value for the character. Here's the raw data view:

When you PROC PRINT to an HTML destination, here's the view in the results browser:

In search of structured emoji data

The site can serve up the emoji definitions and codes, but this data isn't exactly ready for use within applications. One could work through the list of emojis (thousands of them!) and tag these with descriptive words and meanings. That could take a long time and to be honest, I'm not sure I could accurately interpret many of the emojis myself. So I began the hunt for data files that had this work already completed.

I found the GitHub/gemoji project, a Ruby-language code repository that contains a structured JSON file that describes a recent collection of emojis. From all of the files in the project, I need only one JSON file. Here's a SAS program that downloads the file with PROC HTTP and reads the data with the JSON libname engine:

filename rawj temp;
/* file path assumed from the gemoji project layout */
proc http
 url="https://raw.githubusercontent.com/github/gemoji/master/db/emoji.json"
 method="GET"
 out=rawj;
run;
libname emoji json fileref=rawj;

Upon reading these data, I quickly realized the JSON text contains the actual Unicode character for the emoji, and not the decimal or hex value that we might need for using it later in SAS.

I wanted to convert the emoji character to its numeric code. That's when I discovered the UNICODEC function, which can "decode" the Unicode sequence into its numeric values. (Note that some characters use more than one value in a sequence).
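
Outside of SAS, the same decomposition is easy to sketch. Here's a Python version, modeled on the escape-sequence output that UNICODEC produces (the exact padding of the escape format here is my own assumption, not the documented SAS output):

```python
# Decompose a string into one escape code per Unicode codepoint
def unicode_escape(s):
    return "".join(f"\\u{ord(c):08X}" for c in s)

print(unicode_escape("👪"))          # the basic family emoji: one codepoint
print(unicode_escape("👨\u200d👧"))   # a multi-codepoint ZWJ sequence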

Here's my complete program, which includes some reworking of the tags and aliases attributes so I can have one record per emoji:

filename rawj temp;
/* file path assumed from the gemoji project layout */
proc http
 url="https://raw.githubusercontent.com/github/gemoji/master/db/emoji.json"
 method="GET"
 out=rawj;
run;
libname emoji json fileref=rawj;
/* reformat the tags and aliases data for inclusion in a single data set */
data tags;
 length ordinal_root 8 tags $ 60;
 set emoji.tags;
 tags = catx(', ',of tags:);
 keep ordinal_root tags;
run;
data aliases;
 length ordinal_root 8 aliases $ 60;
 set emoji.aliases;
 aliases = catx(', ',of aliases:);
 keep ordinal_root aliases;
run;
/* Join together in one record per emoji */
proc sql;
 create table full_emoji as 
 select  t1.emoji as emoji_char, 
    unicodec(t1.emoji,'esc') as emoji_code, 
    t1.description, t1.category, t1.unicode_version, 
    case
     when t1.skin_tones = 1 then t1.skin_tones
     else 0
    end as has_skin_tones,
    t2.tags, t3.aliases
  from emoji.root t1
  left join tags t2 on (t1.ordinal_root = t2.ordinal_root)
  left join aliases t3 on (t1.ordinal_root = t3.ordinal_root);
quit;
proc print data=full_emoji; run;

Here's a snippet of the report that includes some of the more interesting sequences:

The diversity and inclusion aspect of emoji glyphs is ever-expanding. For example, consider the emoji for "family":

  • The basic family emoji code is \u0001F46A (👪)
  • But since families come in all shapes and sizes, you can find a family that better represents you. For example, how about "family: man, man, girl, girl"? The code is \u0001F468\u200D\u0001F468\u200D\u0001F467\u200D\u0001F467, which includes the codes for each component "member" all smooshed together with a "zero-width joiner" (ZWJ) code in between (👨‍👨‍👧‍👧)
  • All of the above, but with a dark-skin-tone modifier (\u0001F3FF) for 2 of the family members: \u0001F468\u0001F3FF\u200D\u0001F468\u200D\u0001F467\u200D\u0001F467\u0001F3FF (👨🏿‍👨‍👧‍👧🏿)
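
You can verify the mechanics of these ZWJ sequences yourself. Here's a short Python sketch (not part of my SAS program) that assembles the "family: man, man, girl, girl" glyph from its component codepoints:

```python
# Join the member emojis with the zero-width joiner (U+200D)
ZWJ = "\u200d"
man, girl = "\U0001F468", "\U0001F467"
family = ZWJ.join([man, man, girl, girl])
print(family)       # renders as a single family glyph in capable viewers
print(len(family))  # 7 codepoints: 4 members + 3 joiners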

Conclusion: Emojis reflect society, and society adapts to emojis

As you might have noticed from that last sequence I shared, a single concept can call for many different emojis. As our society becomes more inclusive around gender, skin color, and differently capable people, emojis are keeping up. Everyone can express the concept in the way that is most meaningful for them. This is just one way that the language of emojis enriches our communication, and in turn our experience feeds back into the process and grows the emoji collection even more.

As emoji-rich data is used for reporting and for training of AI models, it's important for our understanding of emoji context and meaning to keep up with the times. Already we know that emoji use differs among different age generations and across other demographic groups. The use and application of emojis -- separate from the definition of emoji codes -- is yet another dimension to the data.

Our task as data scientists is to bring all of this intelligence and context into the process when we parse, interpret and build training data sets. The mechanics of parsing and producing emoji-rich data is just the start.

If you're encountering emojis in your data and considering them in your reporting and analytics, please let me know how! I'd love to hear from you in the comments.

The post How to work with emojis in SAS appeared first on The SAS Dummy.

January 18, 2021

Recommended soundtrack for this blog post: Netflix Trip by AJR.

This week's news confirms what I already knew: The Office was the most-streamed television show of 2020. According to reports that I've seen, the show was streamed for 57 billion minutes during this extraordinary year. I'm guessing that's in part because we've all been shut in and working from home; we crave our missing office interactions. We lived vicariously (and perhaps dysfunctionally) through watching Dunder Mifflin staff. But another major factor was the looming deadline of the departure of The Office from Netflix as of January 1, 2021. It was a well-publicized event, so Netflix viewers had to get their binge on while they could.

People in my house are fans of the show, and they account for nearly 6,000 of those 57 billion streaming minutes. I can be this precise (nerd alert!) because I'm in the habit of analyzing our Netflix activity by using SAS. In fact, I can tell you that since late 2017, we've streamed 576 episodes of The Office. We streamed 297 episodes in 2020. (Since the show has only 201 episodes, clearly we have a few repeats in there.)

I built a heatmap that shows the frequency and intensity of our streaming of this popular show. In this graph each row is a month, each square is a day. White squares are Office-free. A square with any red indicates at least one virtual visit with the Scranton crew; the darker the shade, the more episodes streamed during that day. You can see that Sept 15, 2020 was a particularly big binge with 17 episodes. (Each episode is about 20-21 minutes, so it's definitely achievable.)

netflix trip through The Office

Heatmap of our household streaming of The Office

How to build the heatmap

To build this heatmap, I started with my Netflix viewing history (downloaded from my Netflix account as CSV files). I filtered to just "The Office (U.S.)" titles, and then merged with a complete "calendar" of dates between late 2017 and the start of 2021. Summarized and merged, the data looks something like this:

With all of the data summarized in this way such that there is only one observation per X and Y value, I can use the HEATMAPPARM statement in PROC SGPLOT to visualize it. (If I needed the procedure to summarize/bin the data for me, I would use the HEATMAP statement. Thanks to Rick Wicklin for this tip!)

proc sgplot data=ofc_viewing;
 title height=2.5 "The Office - a Netflix Journey";
 title2 height=2 "&episodes. episodes streamed on &days. days, over 3 years";
 label Episodes="Episodes per day";
 format monyear monyy7.;
 heatmapparm x=day y=monyear 
   colorresponse=episodes / x2axis
   colormodel=(white  CXfcae91 CXfb6a4a CXde2d26 CXa50f15);
 yaxis minor reverse display=(nolabel);
 x2axis values=(1 to 31 by 1) 
   display=(nolabel);
run;

You can see the full code -- with all of the data prep -- on my GitHub repository here. You may even run the code in your own SAS environment -- it will fetch my Netflix viewing data from another GitHub location where I've stashed it.
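
If you're curious about the summarization step apart from the SAS specifics, here's a minimal Python sketch of the same idea -- collapsing raw viewing events into one record per month/day cell, which is the shape HEATMAPPARM expects. The sample dates are made up for illustration:

```python
from collections import Counter
from datetime import date

# Hypothetical viewing events standing in for rows of the Netflix CSV
views = [date(2020, 9, 15)] * 3 + [date(2020, 9, 16)]

# One record per (month, day) cell: X = day-of-month, Y = month, value = count
cells = Counter(((d.year, d.month), d.day) for d in views)
for (monyear, day), episodes in sorted(cells.items()):
    print(monyear, day, episodes)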

Distribution of Seasons (not "seasonal distribution")

If you examine the heatmap I produced, you can almost see our Office enthusiasm in three different bursts. These relate directly to our 3 children and the moments they discovered the show. First was early 2018 (middle child), then late 2019 (youngest child), then late 2020 (oldest child, now 22 years old, striving to catch up).

The Office ran for 9 seasons, and our kids have their favorite seasons and episodes -- hence the repeated viewings. I used PROC FREQ to show the distribution of episode views across the seasons:

Season 1 is remarkably low for two reasons. First and most importantly, it contains the fewest episodes. Second, many viewers agree that Season 1 is the "cringiest" content, and can be uncomfortable to watch. (This Reddit user leaned into the cringe with his data visualization of "that's what she said" jokes.)

From the data (and from listening to my kids), I know that Season 2 is a favorite. Of the 60 episodes we streamed at least 4 times, 19 of them were in Season 2.

More than streaming, it's an Office lifestyle

Office fandom goes beyond just watching the show. Our kids continue to embrace The Office in other mediums as well. We have t-shirts depicting the memes for "FALSE." and "Schrute Farms." We listen to The Office Ladies podcast, hosted by two stars of the show. In 2018 our daughter's Odyssey of the Mind team created a parody skit based on The Office (a weather-based office named Thunder Mifflin) -- and advanced to world finals.

Rarely does a day go by without some reference to an iconic phrase or life lesson that we gleaned from The Office. We're grateful for the shared experience, and we'll miss our friends from the Dunder Mifflin Paper Company.

The post Visualizing our Netflix Trip through The Office appeared first on The SAS Dummy.

November 10, 2020

The code and data that drive analytics projects are important assets to the organizations that sponsor them. As such, there is a growing trend to manage these items in the source management systems of record. For most companies these days, that means Git. The specific system might be GitHub Enterprise, GitLab, or Bitbucket -- all platforms that are based on Git.

Many SAS products support direct integration with Git. This includes SAS Studio, SAS Enterprise Guide, and the SAS programming language. (That last one checks a lot of boxes for ways to use Git and SAS together.) While we have good documentation and videos to help you learn about Git and SAS, we often get questions around "best practices" -- what is the best/correct way to organize your SAS projects in Git?

In this article I'll dodge that question, but I'll still try to provide some helpful advice in the process.

Ask the Expert resource: Using SAS® With Git: Bring a DevOps Mindset to Your SAS® Code

Guidelines for managing SAS projects in Git

It’s difficult for us to prescribe exactly how to organize project repositories in source control. Your best approach will depend so much on the type of work, the company organization, and the culture of collaboration. But I can provide some guidance -- mainly things to do and things to avoid -- based on experience.

Do not create one huge repository

DO NOT build one huge repository that contains everything you currently maintain. Your work only grows over time and you'll come to regret/revisit the internal organization of a huge project. Once established, it can be tricky to change the folder structure and organization. If you later try to break a large project into smaller pieces, it can be difficult or impossible to maintain the integrity of source management benefits like file histories and differences.

Design with collaboration in mind

DO NOT organize projects based only on the teams that maintain them. And of course, don't organize projects based on individual team members.

  • Good repo names: risk-adjustment-model, engagement-campaigns
  • Bad repo names: joes-code, claims-dept

All teams reorganize over time, and you don't want to have to reorganize all of your code each time that happens. And code projects change hands, so keep the structure personnel-agnostic if you can. Major refactoring of code can introduce errors, and you don't want to risk that just because you got a new VP or someone changed departments.

Instead, DO organize projects based on function/work that the code accomplishes. Think modular...but don't make projects too granular (or you'll have a million projects). I personally maintain several SAS code projects. The one thing they have in common is that I'm the main contributor -- but I organize them into functional repos that theoretically (oh please oh please) someone else could step in to take over.

The Git view of my YouTube API project in SAS Enterprise Guide

Up with reuse, down with ownership

This might seem a bit communist, but collaboration works best when we don't regard code that we write as "our turf." DO NOT cling to notions of code "ownership." It makes sense for teams/subject-matter experts to have primary responsibility for a project, but systems like Git are designed to help with transparency and collaboration. Be open to another team member suggesting and merging (with review and approval) a change that improves things. GitHub, GitLab, and Bitbucket all support mechanisms for issue tracking and merge requests. These allow changes to be suggested, submitted, revised, and approved in an efficient, transparent way.

DO use source control to enable code reuse. Many teams have foundational "shared code" for standard operations, coded in SAS macros or shared statements. Consider placing these into their own project that other projects and teams can import. You can even use Git functions within SAS to fetch and include this code directly from your Git repository:

/* create a temp folder to hold the shared code */
options dlcreatedir;
%let repoPath = %sysfunc(getoption(WORK))/shared-code;
libname repo "&repoPath.";
libname repo clear;
/* Fetch latest code from Git -- substitute your own repository URL */
data _null_;
 rc = git_clone("https://mygitserver/shared-code", "&repoPath.");
 put 'Git repo cloned ' rc=;
run;
options source2;
/* run the code in this session -- file name is a placeholder */
%include "&repoPath./shared-macros.sas";

If you rely on a repository for shared code and components, make sure that tests are in place so changes can be validated and will not break downstream systems. You can even automate tests with continuous integration tools like Jenkins.

DO document how projects relate to each other, dependencies, and prepare guidance for new team members to get started quickly. For most of us, we feel more accountable when we know that our code will be placed in central repositories visible to our peers. It may inspire cleaner code, more complete documentation, and a robust on-boarding process for new team members. Use the Markdown files (README.md and others) in a repository to keep your documentation close to the code.

My SAS code to check Pagespeed Insights, with documentation

Work with Git features (and not against them)

Once your project files are in a Git repository, you might need to change your way of working so that you aren't going against the grain of Git benefits.

DO NOT work on code changes in a shared directory with multiple team members -- you'll step on each other. The advantage of Git is that it's a distributed workflow and each developer can work with their own copy of the repository, and merge/accept changes from others at their own pace.

DO use Git branching to organize and isolate changes until you are ready to merge them with the main branch. It takes a little bit of learning and practice, but when you adopt a branching approach you'll find it much easier to manage -- it beats keeping multiple copies of your code with slightly different file and folder names to mark "works in progress."

DO consider learning and using Git tools such as Git Bash (command line), Git GUI, and a code IDE like VS Code. These don't replace the SAS-provided coding tools with their Git integration, but they can supplement your workflow and make it easier to manage content among several projects.

Learning more

When you're ready to learn more about working with Git and SAS, we have many webinars, videos, and documentation resources:

The post How to organize your SAS projects in Git appeared first on The SAS Dummy.

September 6, 2019

A few years ago I shared a method to publish content from SAS to a Slack channel. Since that time, our teams at SAS have gone "all in" on collaboration with Microsoft Office 365, including Microsoft Teams. Microsoft Teams is the Office suite's answer to Slack, and it's not a coincidence that it works in nearly the same way.

The lazy method: send e-mail to the channel

Before I cover the "deluxe" method for sending content to a Microsoft Teams channel, I want to make sure you know that there is a simple method that involves no coding, and no need for APIs. The message experience isn't as nice, but it does the job. You can simply "send e-mail" to the channel. If you're automating output from SAS, it's a simple, well-documented process to send e-mail from a SAS program. (Here's an example from me, using FILENAME EMAIL.)

When you send e-mail to a Microsoft Teams channel, the message notice includes the message subject line, sender, and the first bit of the message content. To see the entire message, you must click on the "View original e-mail" link in the notice. This "downloads" the message to your device so that you can open it with a local tool (such as your e-mail reader, Microsoft Outlook). My team uses this method to receive certain alerts from our platform. Here's an example:

To get the unique e-mail address for a channel, right-click on the channel name and select Get email address. Any message that you send to that e-mail address will be distributed to the team.

Getting started with a Microsoft Teams webhook

In order to provide a richer, more integrated experience with Microsoft Teams, you can publish content using a webhook. A webhook is a REST API endpoint that allows you to post messages and notifications with more control over the appearance and interactive options within the messages. In SAS, you can publish to a webhook by using PROC HTTP.
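
Conceptually, a webhook post is nothing SAS-specific: you send an HTTP POST with a JSON body to the connector's URL. Here's a minimal Python sketch of the same call that PROC HTTP makes later in this article -- the URL is a placeholder you'd replace with the one your connector generates:

```python
import json
import urllib.request

# Placeholder -- substitute the URL generated by your Teams connector
WEBHOOK_URL = "https://example.webhook.office.com/your-unique-webhook-address"

def build_payload(text):
    """JSON body for a minimal Teams message, encoded for the POST."""
    return json.dumps({"text": text}).encode("utf-8")

def post_message(text):
    """POST the message to the channel webhook (requires network access)."""
    req = urllib.request.Request(
        WEBHOOK_URL,
        data=build_payload(text),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

print(build_payload("This message was sent by **SAS**!"))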

To get started, you need to add and configure a webhook for your Microsoft Teams channel:

  1. Right-click on the channel name and select Connectors.
  2. Microsoft Teams offers built-in connectors for many different applications. To find the connector for Incoming Webhook, use the search field to narrow the list. Then click Add to add the connector to the channel.
  3. You must grant certain permissions to the connector to interact with your channel. In this case, you need to allow the webhook to send messages and notifications. Review the permissions and click Install.
  4. On the Configuration page, assign a name to this connector and optionally customize the image. The image will be the avatar that's used when the connector posts content to the channel. When you've completed these changes, select Create.
  5. The connector generates a unique (and very long) URL that serves as the REST API endpoint. You can copy the URL from this field -- you will need it later in your SAS program. You can always come back to these configuration settings to change the connector avatar or re-copy the URL.

At this point, it's a good idea to test that you can publish a basic message from SAS. The "payload" for a Teams message is a JSON-formatted structure, and you can find examples in the Microsoft Teams reference doc. Here's a SAS program that publishes the simplest message. Add your webhook URL and run the code to verify the connector is working for your channel.

filename resp temp;
options noquotelenmax;
proc http
  /* Substitute your webhook URL here */
  url="https://yourteam.webhook.office.com/your-unique-webhook-address"
  method="POST"
  in=
  '{
      "$schema": "",
      "type": "AdaptiveCard",
      "version": "1.0",
      "summary": "Test message from SAS",
      "text": "This message was sent by **SAS**!"
  }'
  out=resp;
run;

If successful, this step will post a simple message to your Teams channel:

Design a message card for Microsoft Teams

Now that we have the basic plumbing working, it's time to add some bells and whistles. Microsoft Teams calls these notifications "message cards", which are messages that can include interactive features such as images, data, action buttons, and more.

Designing a simple message

Microsoft Teams supports a large palette of building blocks (expressed in JSON) to create different card experiences. You can experiment with these cards in the MessageCard Playground that Microsoft hosts. The tool provides templates for several card varieties, and you can edit the JSON definitions to tweak and design your own.

For one of my use cases, I designed a simple card to show the status of our recommendation engine on SAS Support Communities. (Read this article for more information about how we built and monitor the recommendation engine.) The engine runs as a service and is accessed with its own API. I wanted a periodic "health check" to post to our internal team that would alert us to any problems. Here's the JSON that I used in the MessageCard Playground to design it.

Much of the JSON is boilerplate for the message. I drew the green blocks to indicate the areas that need to be dynamic -- that is, replaced with values from the real-time API call. Here's what the card looks like when rendered in the Microsoft Teams channel.

Since my API call to the recommendation engine service creates a data set, I can run that data through PROC JSON to create the JSON segment I need:

    /* reading the results from my API call to the engine */
    libname results json fileref=resp;

    /* Prep a simple name-value data set with the results */
    data segment (keep=name value);
     set results.root;
     name="Score data updated (UTC)";
     value= astore_creation;
     output;
     name="Topics scored";
     value= left(num_topics); /* variable name assumed */
     output;
     name="Number of users";
     value= left(num_users);
     output;
     name="Process time";
     value= process_time;
     output;
    run;

    /* use PROC JSON to create the segment */
    filename segment temp;
    proc json out=segment nosastags pretty;
     export segment;
    run;

    I shared a version of the complete program on GitHub. It should run as is -- but you would need to supply your own webhook endpoint for a channel that you can publish to.

    Design a message with actions

    I also use Microsoft Teams to share updates about the SAS Software GitHub organization. In a previous article I discussed how I use GitHub APIs to gather data from the GitHub service. Each day, my program summarizes the recent activity from the organization's repositories and publishes a message card to the team. Here's an example of a daily update:

    This card is fancier than my first example. I added action buttons that can direct the team members to the internal reports for more details and to the GitHub site itself. I used the Microsoft Teams documentation and the MessageCard Playground to design the experience:
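    The action buttons come from the potentialAction array in the MessageCard format. Here's a minimal sketch of a card with one button. The title, text, and report URL are placeholder values, not the actual card my program sends:

    ```json
    {
      "@type": "MessageCard",
      "@context": "https://schema.org/extensions",
      "summary": "Daily GitHub activity",
      "title": "GitHub activity update (placeholder title)",
      "text": "Summary of new issues, pull requests, and stars for the day.",
      "potentialAction": [
        {
          "@type": "OpenUri",
          "name": "View full report",
          "targets": [
            { "os": "default", "uri": "https://example.com/your-report" }
          ]
        }
      ]
    }
    ```

    Each entry in potentialAction renders as a button; an OpenUri action simply directs the reader to a link, which is all my daily update needs.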

    Messaging apps as part of a DevOps strategy

    Like many organizations, we (SAS) invest a considerable amount of time and energy into gathering metrics and building reports about our operations. However, reports are useful only when the intended audience is tuned in and refers to them regularly. With a small additional step, you can use SAS to bring your most interesting data forward to your team -- automatically.

    Whether you use Microsoft Teams or Slack, automated alerting and updates are a great opportunity to keep your teams informed. Each of these tools offers fit-for-purpose connectors that can tie in with information from other popular operational systems (Salesforce, GitHub, Yammer, JIRA, and many more). For cases where a built-in connector is not available, the webhook approach allows you to easily create your own.

The post How to publish to a Microsoft Teams channel using SAS appeared first on The SAS Dummy.

Aug 19, 2019

I'm old enough to remember when USA Today began publication in the early 1980s. As a teenager who was not particularly interested in current events, I remember scanning each edition for the USA Today Snapshots, a mini infographic feature that presented some statistic in a fun and interesting way. Back then, I felt that these stats made me a little bit smarter for the day. I had no reason to question the numbers I saw, nor did I have the tools, skill or data access to check their work.

Today I still enjoy the USA Today Snapshots feature, but for a different reason. An interesting infographic will spark curiosity. And provided that I have time and interest, I can use the tools of data science (SAS, in my case) and public data to pursue more answers.

In the August 7, 2019 issue, USA Today published this graphic about marijuana use in Colorado. Before reading on, I encourage you to study the graphic for a moment and see what questions arise for you.

Source: USA Today Snapshot from Aug 7 2019

I have some notes

For me, as I studied this graphic, several questions came to mind immediately.

  • Why did they publish this graphic? USA Today Snapshots are usually offered without explanation or context -- that's sort of their thing. So why did the editors choose to share these survey results about marijuana use in Colorado? As readers, we must supply our own context. Most of us know that Colorado recently legalized marijuana for recreational use. The graphic seems to answer the question, "Has marijuana use among certain age groups increased since the law changed?" And a much greater leap: "Has marijuana use increased because of the legal change?"
  • Just Colorado? We see trend lines here for Colorado, but there are other states that have legalized marijuana. How does this compare to Maine or Alaska or California? And what about those states where it's not yet legal, like North Carolina?
  • People '26 and older' are also '18 and older' The reported age categories overlap. '18 and older' includes '18 to 25' and '26 and older'. I believe that the editors added this combined category by aggregating the other two. Why did they do that?
  • Isn't '26 and older' a wide category? '12 to 17' is a 6-year span, and '18 to 25' is an 8-year span. But '26 and older' covers what? 60-plus years?
  • "Coloradoans?" Is that really how people from Colorado refer to themselves? Turns out that's a matter of style preference.

The vagaries of survey results

To its credit, the infographic cites the source for the original data: the National Survey on Drug Use and Health (NSDUH). The organization that conducts the annual survey is the Substance Abuse and Mental Health Services Administration (SAMHSA), which is under the US Department of Health and Human Services. From the survey description: The data provides estimates of substance use and mental illness at the national, state, and sub-state levels. NSDUH data also help to identify the extent of substance use and mental illness among different sub-groups, estimate trends over time, and determine the need for treatment services.

This provides some insight into the purpose of the survey: to help policy makers plan for mental health and substance abuse services. "How many more people are using marijuana for fun?" -- the question I've inferred from the infographic choices -- is perhaps tangential to that charter.

Due to privacy concerns, SAMHSA does not provide the raw survey responses for us to analyze. The survey collects details about the respondent's drug use and mental health treatment, as well as demographic information about gender, age, income level, education level, and place of residence. For a deep dive into the questions and survey flow, you can review the 2019 questionnaire here. SAMHSA uses the survey responses to extrapolate to the overall population, producing weighted counts for each response across recoded categories, and imputing counts and percentages for each aspect of substance use (which drugs, how often, how recent).

SAMHSA provides these survey data in two channels: the Public-use Data Analysis System and the Restricted-use Data Analysis System. The "Public-use" data provides annualized statistics about the substance use, mental health, and demographics responses across the entire country. If you want data that includes locale information (such as the US state of residence), then you have to settle for the "Restricted-use" system -- which does not provide annual data, but instead provides data summarized across multi-year study periods. In short, if you want more detail about one aspect of the survey responses, you must sacrifice detail across other facets of the data.

My version of the infographic

I spent hours reviewing the available survey reports and data, and here's what I learned: I am an amateur when it comes to understanding health survey reports. However, I believe that I successfully reverse-engineered the USA Today Snapshot data source so that I could produce my own version of the chart. I used the "Restricted-use" version of the survey reports, which allowed access to imputed data values across two-year study periods. My version shows the same data points, but with these formatting changes:

  • I set the Y axis range as 0% to 100%, which provides a less-exaggerated slope of the trend lines.
  • I did not compute the "18 and over" data point.
  • I added reference lines (dashed blue) to indicate the end of each two-year study period for which I have data points.

Here's one additional data point that's not in the survey or in the USA Today graphic. Colorado legalized marijuana for recreational use in 2012. In my chart, you can see that marijuana use was on the rise (especially among 18-25 year-olds) well before that, especially since 2009. Medical use was already permitted then (see Robert Allison's chart of the timeline), and we can presume that Coloradoans (!) were warming up to the idea of recreational use before the law was passed. But the health survey measures only reported use, and does not measure the user's purpose (recreational, medical, or otherwise) or attitudes toward the substance.

Limitations of the survey data and this chart

Like the USA Today version, my graph has some limitations.

  • My chart shows the same three broad age categories as the original. These are recoded age values from the study data. For some of the studies it was possible to get more granular age categories (5 or 6 bins instead of 3), but I could not get this for all years. Again, when you push for more detail on one aspect, the "Restricted-use" version of the data pushes back.
  • The "used in the past 12 months" indicator is computed. The survey report doesn't offer this as a binary value. Instead it offers "Used in the past 30 days" and "Used more than 30 days ago but less than 12 months." So, I added those columns together, and I assume that the USA Today editors did the same.
  • I'm not showing the confidence intervals for the imputed survey responses. Since this is survey data, the data values are not absolute but instead are estimates accompanied by a percent-confidence that the true values fall in this certain range. The editors probably decided that this is too complex to convey in your standard USA Today Snapshot -- and it might blunt the potential drama of the graphic. Here's what it would look like for the "Used marijuana in past 30 days" response, with the colored band indicating the 95% confidence interval.
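That "add the columns together" step is a one-line computation in a DATA step. A sketch, where the data set and variable names are hypothetical placeholders rather than the actual NSDUH column names:

```sas
/* Derive "used in past 12 months" from the two reported recency measures.
   Data set and variable names are hypothetical placeholders. */
data use_12mo;
  set survey_summary;
  used_past_12mo = sum(used_past_30days, used_30days_to_12mo);
run;
```

Using the SUM function (rather than the + operator) has the nice side effect of treating a missing value in one column as zero instead of propagating it.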

Beyond Colorado: what about other states?

Having done the work to fetch the survey data for Colorado, it was simple to gather and plot the same data for other states. Here's the same graph with data from North Carolina (where marijuana use is illegal) and Maine and California.

While I was limited to the two-year study reports for data at the state level, I was able to get the corresponding data points for every year for the country as a whole:

I noticed that the reported use among those 12-17 years old declined slightly across most states, as well as across the entire country. I don't know what the logistics are for administering such a comprehensive survey to young people, but this made me wonder if something about the survey process had changed over time.

The survey data also provides results for other drugs, like alcohol, tobacco, cocaine, and more. Alcohol has been legal for much longer and is certainly widely used. Here are the results for Alcohol use (imputed 12 months recency) in Colorado. Again I see a decline in the self-reported use among those 12-17 years old. Are fewer young people using alcohol? If true, we don't usually hear about that. Or has something changed in the survey methods with regard to minors?

SAS programs to access NSDUH survey data

On my GitHub repo, you can find my SAS programs to fetch and chart the NSDUH data. The SAMHSA website offers a point-and-click method to select your dimensions: a row, column, and control variable (like a BY group).

I used the interactive report tool to navigate to the data I wanted. After some experimentation, I settled on the "Imputed Marijuana Use Recency" (IRMJRC) value for the report column -- I think that's what USA Today used. Also, I found other public reports that referenced it for similar purposes. The report tool generates a crosstab report and an optional chart, but it also then offers a download option for the CSV version of the data.

I was able to capture that download directive as a URL, and then used PROC HTTP to download the data for each study period. This made it possible to write SAS code to automate the process -- much less tedious than clicking through reports for each study year.

%macro fetchStudy(state=,year=);
  filename study "&workloc./&state._&year..csv";
  proc http
   method="GET"
   /* base report URL elided here; the query string selects the report dimensions */
   url="" ||
       "row=CATAG2%str(&)column=IRMJRC%str(&)control=STNAME%str(&)weight=DASWT_1"
   out=study;
  run;
%mend fetchStudy;

%let state=COLORADO;
/* Download data for each 2-year study period */
%fetchStudy(state=&state., year=2016-2017);
%fetchStudy(state=&state., year=2015-2016);
%fetchStudy(state=&state., year=2014-2015);
%fetchStudy(state=&state., year=2012-2013);
%fetchStudy(state=&state., year=2010-2011);
%fetchStudy(state=&state., year=2008-2009);
%fetchStudy(state=&state., year=2006-2007);

Each data file represents one two-year study period. To combine these into a single SAS data set, I use the INFILE-with-a-wildcard technique that I've shared here.

 data combined;
   INFILE "&workloc./&state._*.csv"
     /* and so on */
 run;

The complete programs are in GitHub -- one version for the state-level two-year study data, and one version for the annual data for the entire country. These programs should work as-is within SAS Enterprise Guide or SAS Studio, including in SAS University Edition. Grab the code and change the STATE macro variable to find the results for your favorite US state.

Conclusion: maintain healthy skepticism

News articles and editorial pieces often use simplified statistics to convey a message or support an argument. There is just something about including numbers that lends credibility to reporting and arguments. Citing statistics is a time-honored and effective method to inform the public and persuade an audience. Responsible journalists will always cite their data sources, so that those with time and interest can fact-check and find additional context beyond what the media might share.

I enjoy features like the USA Today Snapshot, even when they send me down a rabbit hole as this one did. As I tell my children often (and they are weary of hearing it), statistics in the media should not be accepted at face value. But if they make you curious about a topic so that you want to learn more, then I think the editors should be proud of a job well done. It's on the rest of us to follow through to find the deeper answers.

The post A skeptic's guide to statistics in the media appeared first on The SAS Dummy.

Jul 25, 2019

Recommendations on SAS Support Communities

If you visit the SAS Support Communities and sign in with your SAS Profile, you'll experience a little bit of SAS AI with every topic that you view.

While it appears to be a simple web site widget, the "Recommended by SAS" sidebar is made possible by an application of the full Analytics Life Cycle. This includes data collection and prep, model building and test, API wrappers with a gateway for monitoring, model deployment in containers with orchestration in Kubernetes, and model assessment using feedback from click actions on the recommendations. We built this by using a combination of SAS analytics and open source tools -- see the SAS Global Forum paper by my colleague, Jared Dean, for the full list of ingredients.

Jared and I have been working for over a year to bring this recommendation engine to life. We discussed it at SAS Global Forum 2018, and finally near the end of 2018 it went into production on the SAS Support Communities site. The engine scores user visits for new recommendations thousands of times per day. The engine is updated each day with new data and a new scoring model.

Now that the recommendation engine is available, Jared and I met again in front of the camera. This time we discussed how the engine is working and the efforts required to get into production. Like many analytics projects, the hardest part of the journey was that "last mile," but we (and the entire company, actually) were very motivated to bring you a live example of SAS analytics in action. You can watch the full video at (where else?) the SAS Support Communities. The video is 17 minutes long -- longer than most "explainer"-type videos. But there was a lot to unpack here, and I think you'll agree there is much to learn from the experience. Not ready to binge on our video? I'll use the rest of this article to cover some highlights.

Good recommendations begin with clean data

The approach of our recommendation engine is based upon your viewing behavior, especially as compared to the behavior of others in the community. With this approach, we don't need to capture much information about you personally, nor do we need information about the content you're reading. Rather, we just need the unique IDs (numbers) for each topic that is viewed, and the ID (again, a number) for the logged-in user who viewed it. One benefit of this approach is that we don't have to worry about surfacing any personal information in the recommendation API that we'll ultimately build. That makes the conversation with our IT and Legal colleagues much easier.

Our communities platform captures details about every action -- including page views -- that happens on the site. We use SAS and the community platform APIs to fetch this data every day so that we can build reports about community activity and health. We now save off a special subset of this data to feed our recommendation engine. Here's an example of the transactions we're using. It's millions of records, covering nearly 100,000 topics and nearly 150,000 active users.

Sample data records for the model

Building user item recommendations with PROC FACTMAC

Starting with these records, Jared uses SAS DATA step to prep the data for further analysis and a pass through the algorithm he selected: factorization machines. As Jared explains in the video, this algorithm shines when the data are represented in sparse matrices. That's what we have here. We have thousands of topics and thousands of community members, and we have a record for each "view" action of a topic by a member. Most members have not viewed most of the topics, and most of the topics have not been viewed by most members. With today's data, that results in a 13 billion cell matrix, but with only 3.3 million view events. Traditional linear algebra methods don't scale to this type of application.
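To put that sparsity in numbers, here's a quick back-of-the-envelope check using the figures above:

```sas
/* Sparsity of the user-by-topic view matrix */
data _null_;
  cells  = 13e9;   /* topic count x user count, per the figures above */
  events = 3.3e6;  /* observed view events */
  pct = 100 * events / cells;
  put "Percent of cells with data: " pct best8. "%"; /* about 0.025% */
run;
```

Only a tiny fraction of the matrix holds any data, which is exactly the regime where factorization machines outperform traditional linear algebra approaches.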

Jared uses PROC FACTMAC (part of SAS Visual Data Mining and Machine Learning) to create an analytics store (ASTORE) for fast scoring. Using the autotuning feature, the FACTMAC selects the best combination of values for factors and iterations. And Jared caps the run time to 3600 seconds (1 hour) -- because we do need this to run in a predictable time window for updating each day.

proc factmac data=mycas.weighted_factmac  outmodel=mycas.factors_out;
   autotune maxtime=3600 objective=MSE 
       TUNINGPARAMETERS=(nfactors(init=20) maxiter(init=200) learnstep(init=0.001) ) ;
   input user_uid conversation_uid /level=nominal;
   target rating /level=interval;
   savestate rstore=mycas.sascomm_rstore;
run;

Using containers to build and containers to score

To update the model with new data each day and then deploy the scoring model as an ASTORE, Jared uses multiple SAS Viya environments. These SAS Viya environments need to "live" only for a short time -- for building the model and then for scoring data. We use Docker containers to spin these up as needed within the cloud environment hosted by SAS IT.

Jared makes the distinction between the "building container," which hosts the full stack of SAS Viya and everything that's needed to prep data and run FACTMAC, and the "scoring container", which contains just the ASTORE and enough code infrastructure (including the SAS Micro Analytic Service, or MAS) to score recommendations. This scoring container is lightweight and is actually run on multiple nodes so that our engine scales to lots of requests. And the fact that it does just the one thing -- score topics for user recommendations -- makes it an easier case for SAS IT to host as a service.

DevOps flow for the recommendation engine

Monitoring API performance and alerting

To access the scoring service, Jared built a simple API using a Python Flask app. The API accepts just one input: the user ID (a number). It returns a list of recommendations and scores. Here's my Postman snippet for testing the engine.

To provision this API as a hosted service that can be called from our community web site, we use an API gateway tool called Apigee. Apigee allows us to control access with API keys, and also monitors the performance of the API. Here's a sample performance report for the past 7 days.

In addition to this dashboard for reporting, we have integrated proactive alerts into Microsoft Teams, the tool we use for collaboration on this project. I scheduled a SAS program that tests the recommendations API daily, and the program then posts to a Teams channel (using the Teams API) with the results. I want to share the specific steps for this Microsoft Teams integration -- that's a topic for another article. But I'll tell you this: the process is very similar to the technique I shared about publishing to a Slack channel with SAS.

Are visitors selecting recommended content?

To make it easier to track recommendation clicks, we added special parameters to the recommended topics URLs to capture the clicks as Google Analytics "events." Here's what that data looks like within the Google Analytics web reporting tool:

You might know that I use SAS with the Google Analytics API to collect web metrics. I've added a new use case for that trick, so now I collect data about the "SAS Recommended Click" events. Each click event contains the unique ID of the recommendation score that the engine generated. Here's what that raw data looks like when I collect it with SAS:

With the data in SAS, we can use that to monitor the health/success of the model in SAS Model Manager, and eventually to improve the algorithm.

Challenges and rewards

This project has been exciting from Day 1. When Jared and I saw the potential for using our own SAS Viya products to improve visitor experience on our communities, we committed ourselves to see it through. Like many analytics applications, this project required buy-in and cooperation from other stakeholders, especially SAS IT. Our friends in IT helped with the API gateway and it's their cloud infrastructure that hosts and orchestrates the containers for the production models. Putting models into production is often referred to as "the last mile" of an analytics project, and it can represent a difficult stretch. It helps when you have the proper tools to manage the scale and the risks.

We've all learned a lot in the process. We learned how to ask for services from IT and to present our case, with both benefits and risks. And we learned to mitigate those risks by applying security measures to our API, and by limiting the execution scope and data of the API container (which lives outside of our firewall).

Thanks to extensive preparation and planning, the engine has been running almost flawlessly for 8 months. You can experience it yourself by visiting SAS Support Communities and logging in with your SAS Profile. The recommendations that you see will be personal to you (whether they are good recommendations...that's another question). We have plans to expand the engine's use to anonymous visitors as well, which will significantly increase the traffic to our little API. Stay tuned!

The post Building a recommendation engine with SAS appeared first on The SAS Dummy.

Apr 20, 2019

Do you have a favorite television show? Or a favorite movie franchise that you follow? If you call yourself a "fan," just how much of a fan are you? Are you merely a spectator, or do you take your fanaticism to the next level by creating something new?

When it comes to fandom for franchises like Game of Thrones, the Marvel movies, or Stranger Things, there's a new kind of nerd in town. And this nerd brings data science skills. You've heard of the "second screen" experience for watching television, right? That's where fans watch a show (or sporting event or awards ceremony), but also keep up with Twitter or Facebook so they can commune with other fans of the show on social media. These fan-data-scientists bring a third screen: their favorite data workbench IDE.

I was recently lured into a rabbit hole of Game of Thrones data by a tweet. The Twitter user was reacting to a data visualization of character screen time during the show. The visualization was built in a different tool, but the person was wondering whether it could be done in SAS. I knew the answer was yes -- as long as we could get the data. That turned out to be the easiest part.

WARNING: While this blog post does not reveal any plot points from the show, the data does contain spoilers! No spoilers in what I'm showing here, but if you run my code examples there might be data points that you cannot "unsee." I was personally conflicted about this, since I'm a fan of the show but I'm not yet keeping up with the latest episodes. I had to avert my eyes for the most recent data.

Data is Coming

A GitHub user named Jeffrey Lancaster has shared a repository for all aspects of data around Game of Thrones. He also has similar repos for Stranger Things and Marvel universe. Inside that repo there's a JSON file with episode-level data for all episodes and seasons of the show. With a few lines of code, I was able to read the data directly from the repo into SAS:

filename eps temp;
/* Big thanks to this GoT data nerd for assembling this data */
proc http
 url="https://raw.githubusercontent.com/jeffreylancaster/game-of-thrones/master/data/episodes.json"
 out=eps;
run;

/* slurp this in with the JSON engine */
libname episode JSON fileref=eps;

Note that I've shared all of my code for my steps in my own GitHub repo (just trying to pay it forward). Everything should work in Base SAS, including in SAS University Edition.

The JSON library reads the data into a series of related tables that show all of the important things that can happen to characters within a scene. Game of Thrones fans know that death, sex, and marriage (in that order) make up the inflection points in the show.

Building the character-scene data

With a little bit of data prep using SQL, I was able to show the details of the on-screen time per character, per scene. These are the basis of the visualization I was trying to create.

/* Build details of scenes and characters who appear in them */
proc sql;
   CREATE TABLE WORK.character_scenes AS 
   SELECT t1.seasonNum, 
          t2.ordinal_scenes as scene_id, 
          input(t2.sceneStart,time.) as time_start format=time., 
          input(t2.sceneEnd,time.) as time_end format=time., 
          (calculated time_end) - (calculated time_start) as duration format=time.,
          t3.name
      /* table names as created by the JSON libname engine */
      FROM episode.episodes t1, episode.episodes_scenes t2, episode.scenes_characters t3
      WHERE (t1.ordinal_episodes = t2.ordinal_episodes AND 
             t2.ordinal_scenes = t3.ordinal_scenes);
quit;

With a few more data prep steps (see my code on GitHub), I was able to summarize the screen time for scene locations:

You can see that The Crownlands dominate as a location. In the show that's a big region and a sort of headquarters for The Seven Kingdoms, and the show data actually includes "sub-locations" that can help us to break that down. Here's the makeup of that 18+ hours of time in The Crownlands:

Screen time for characters

My goal is to show how much screen time each of the major characters receives, and how that changes over time. I began by creating a series of charts using PROC SGPLOT. These were created in a single SGPLOT step with a BY group, segmented by show episode. They appear in a grid because I used ODS LAYOUT GRIDDED to arrange them.

Here's the code segment that creates these dozens of charts. Again, see my GitHub for the intermediate data prep work.

/* Create a gridded presentation of Episode graphs CUMULATIVE timings */
ods graphics / width=500 height=300 imagefmt=svg noborder;
ods layout gridded columns=3 advance=bygroup;
proc sgplot data=all_times noautolegend ;
  hbar name / response=cumulative 
    colorresponse=total_screen_time dataskin=crisp
    datalabel=name datalabelpos=right datalabelattrs=(size=10pt)
    seglabel seglabelattrs=(weight=bold size=10pt color=white) ;
  by epLabel notsorted;
  format cumulative time.;
  label epLabel="Ep";
  where rank<=10;
  xaxis display=(nolabel)  grid ;
  yaxis display=none grid ;
run;
ods layout end;
ods html5 close;

Creating an animated timeline

The example shared on Twitter showed an animation of screen time, per character, over the complete series of episodes. So instead of a huge grid with many plots, we need to produce a single file with layers for each episode. In SAS we can produce an animated GIF or animated SVG (scalable vector graphics) file. The SVG is a much smaller file format, but you need a browser or a special viewer to "play" it. Still, that's the path I followed:

/* Create a single animated SVG file for all episodes */
options printerpath=svg animate=start animduration=1 
  svgfadein=.25 svgfadeout=.25 svgfademode=overlap
  nodate nonumber; 
/* change this file path to something that works for you */
ODS PRINTER file="c:\temp\got_cumulative.svg" style=daisy;
/* For SAS University Edition:
ODS PRINTER file="/folders/myfolders/got_cumulative.svg" style=daisy;
*/
proc sgplot data=all_times noautolegend ;
  hbar name / response=cumulative 
    colorresponse=total_screen_time dataskin=crisp
    datalabel=name datalabelpos=right datalabelattrs=(size=10pt)
    seglabel seglabelattrs=(weight=bold size=10pt color=white) ;
  by epLabel notsorted;
  format cumulative time.;
  label epLabel="Ep";
  where rank<=10;
  xaxis label="Cumulative screen time (HH:MM:SS)" grid ;
  yaxis display=none grid ;
run;
options animation=stop;
ods printer close;

Here's the result (hosted on my GitHub repo -- but as a GIF for compatibility.)

I code and I know things

Like the Game of Thrones characters, my visualization is imperfect in many ways. As I was just reviewing it I discovered a few data prep missteps that I should correct. I used some features of PROC SGPLOT that I've learned only a little about, and so others might suggest improvements. And my next mission should be to bring this data into SAS Visual Analytics, where the real "data viz maesters" who work with me can work their magic. I'm just hoping that I can stay ahead of the spoilers.

The post Deeper enjoyment of your favorite shows -- through data appeared first on The SAS Dummy.