Chris Hemedinger

July 12, 2016

I've been working on a SAS program that can add content to the SAS Support Communities (more on that in a future post). Despite my 20+ years of SAS experience, there are a lot of SAS programming tricks that I don't know. Or that I use so infrequently that I always need to remind myself how to accomplish them.

Here's one. I needed to read the contents of an external text file into a SAS macro variable, so that I could then use that value (a very long text string) as part of an API call. In searching for a technique that would work for me, I came across a similar question on SAS Support Communities -- one that had been solved by our resident SASJedi, Mark Jordan. Perfect!

Here's the solution that worked for me:

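FILENAME msghtml "path-to-text-file" ;
data _null_;
   length text $32767;
   retain text '';
   infile msghtml flowover dlmstr='//' end=last;
   /* read the next record into the input buffer */
   input;
   /* append the trimmed buffer to the retained value */
   text=cats(text,_infile_);
   if last then call symput('MSGBODY',text);
run;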

The RETAIN statement allows me to build up the "text" variable as the DATA step processes multiple lines. The END=last on the INFILE statement sets a flag when we hit end-of-file, so I know that we're done and I can CALL SYMPUT the macro value. The FLOWOVER option tells the INPUT statement to keep reading even if no input values are found in the current record. (FLOWOVER is the default behavior, so the option probably isn't needed here.) DLMSTR allows you to specify a multichar delimiter string that's different than the default delimiter (a space character). We're using the CATS function to concatenate a trimmed version of the input buffer (_INFILE_) to the RETAINed "text" variable.
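
To try the technique without an external file, here is a minimal, self-contained sketch. The fileref demo and the macro variable DEMOTEXT are hypothetical names, and the sketch swaps in CATX and CALL SYMPUTX (instead of CATS and CALL SYMPUT) so that line breaks become single spaces and trailing blanks are trimmed. Treat it as an illustration under those assumptions, not as the exact program from my project.

```sas
/* Write two lines to a temporary file (hypothetical fileref: demo) */
filename demo temp;
data _null_;
   file demo;
   put 'Hello,';
   put 'world!';
run;

/* Read the file back and accumulate its lines in one variable */
data _null_;
   length text $32767;
   retain text '';
   infile demo end=last;
   input;                              /* load the record into _INFILE_ */
   text = catx(' ', text, _infile_);   /* join lines with a single space */
   if last then call symputx('DEMOTEXT', text);
run;

%put NOTE: DEMOTEXT=&DEMOTEXT;
filename demo clear;
```

If the two lines should run together with no space between them, CATS (as in the program above) is the better fit.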

For my project I needed to URL-encode the text value for use in an HTTP-based REST API. So for me, the last line is really:

if last then call symput('MSGBODY',urlencode(trim(text)));
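
As a quick way to see what URLENCODE does to a value, this small sketch encodes a hard-coded string and writes the result to the log. The sample text and the variable names raw and encoded are made up for illustration.

```sas
data _null_;
   length raw encoded $200;
   /* single quotes keep the macro processor away from & and % */
   raw = 'Tips & tricks: 100% SAS';
   /* TRIM drops trailing blanks so they are not encoded as %20 */
   encoded = urlencode(trim(raw));
   put encoded=;
run;
```

Spaces, ampersands and percent signs come back as percent-escaped sequences, which is what makes the value safe to place in a URL or a form-encoded request body.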

The SAS Support Communities has been a big help to me during this project -- a project that is designed to improve the communities even more. It's a virtuous cycle! I hope that this helps some of you out there, too.

tags: macro programming, SAS Communities, SAS programming

The post How to read the contents of a file into a SAS macro variable appeared first on The SAS Dummy.

June 28, 2016

Would you like to see the latest features of SAS Enterprise Guide in action? Of course you would! That's why it's well worth the 12 minutes of your time to watch this video from SAS Global Forum 2016.

In the video, Casey Smith (R&D manager for the SAS Enterprise Guide team) shows off some favorite new features.

Casey also talks about his unique perspective as a second-generation SAS user. His Mom is a long-time SAS user; Casey was raised with SAS in the house! It's only appropriate that Casey went on to join SAS as an employee. He frequently presents for user groups and you can often find Casey (as CaseyS_SAS) on the SAS Enterprise Guide discussion board in SAS Support Communities.

tags: SAS Enterprise Guide, SAS global forum, SAS GloFo

The post Video: Demonstrating the new features in SAS Enterprise Guide 7.1 appeared first on The SAS Dummy.

June 24, 2016

One thing that we have a lot of at SAS: installations of SAS software that we can run. I have SAS for Windows on my laptop, and I have access to many centralized instances of SAS that run on Linux and Windows servers. (I also have access to mainframe SAS, though it's been a while since I've used it. When I log in, I picture a Rube-Goldberg style mechanism that pokes an intern to mount a tape so my profile can be reloaded.)

I often develop programs using my local instance of SAS and SAS Enterprise Guide, but deploy them for use on a central server. I might run them as batch jobs or interactively with SAS Enterprise Guide or SAS Studio or even in SAS/IntrNet.

Our IT department wants SAS employees to have seamless access to their files whether on Windows or on Unix-style file systems, and so they make it easy to access the same network path from Windows (using UNC notation, or "\\server\path" syntax) and Unix (using "/node/usr/path" syntax). As I develop my SAS programs, I want the programs to work the same whether run from Windows or Unix, and I don't want to have to change LIBNAME paths each time. Fortunately, SAS programs are usually portable across different operating systems, and while SAS data sets might have different encodings across systems, SAS can always read a data set that was created by a different version.

I have a simple technique that references the proper path for the operating system that I'm using. I build a SAS macro variable by using the IFC function and the &SYSSCP automatic variable to check whether I'm running on Windows, then assign the path accordingly.
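
Before relying on that check, it may be worth confirming what &SYSSCP resolves to on each host where the program will run. This one-liner is only a sketch; to my understanding the value is WIN on Windows and something like LIN X64 on 64-bit Linux, but verify it in your own log. &SYSSCPL is a second automatic variable that gives a more specific operating system name.

```sas
/* Show the host identifiers that SAS sets automatically */
%put NOTE: SYSSCP=&SYSSCP SYSSCPL=&SYSSCPL;
```

Because the comparison is against WIN only, every non-Windows host falls through to the Unix-style path.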

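/* Use the IFC function as a shorthand for if-then, returning a character string */
/* First path is used on Windows, second path everywhere else */
%let tgtpath = %sysfunc(
  ifc(&SYSSCP. = WIN,
       \\sasprod\root\dept\mydept\project,
       /r/node/vol/vol01/mydept/project
     )
   );
libname tgt "&tgtpath.";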

When I run this on SAS for Linux, I see this in the log:

NOTE: Libref TGT was successfully assigned as follows: 
      Engine:        V9 
      Physical Name: /r/node/vol/vol01/mydept/project

And on Windows:

NOTE: Libref TGT was successfully assigned as follows: 
      Engine:        V9 
      Physical Name: \\sasprod\root\dept\mydept\project

tags: SAS programming, SAS tips

The post Assign a SAS library to a different path depending on your OS appeared first on The SAS Dummy.

June 3, 2016

SAS Technical Support has earned a wonderful reputation for being friendly, knowledgeable, and thorough. Every customer that I talk to is delighted by the experience. That's why what I'm about to say might be heresy, but here it goes. If you have a question about how to accomplish a task using SAS software, you can probably find your answer faster on SAS Support Communities.

With over 50,000 topics already cataloged, the chances are high that your question has already been asked and answered on the communities. And if you need to post a new question, the peer network of SAS communities members represents thousands of experts who can respond to your question immediately.

Search Find solutions.

Self-service is self-satisfying

No matter how pleasant the experience, calling customer support is a last resort for many people. Studies show that millennials are much happier when they can solve a question themselves by using a self-serve option (like an online community!). I'm not a millennial (missed that cutoff by a decade or two) -- but I feel exactly the same way.

There are additional benefits to using SAS Support Communities. For starters, you'll come to learn who the experts are in your field. By following them and reading their work, you can learn more about questions that you don't even have yet. In addition, many SAS employees read and reply to topics in the communities. For example, you might read an answer about SAS Enterprise Miner from a developer who actually works on the product. That direct line of communication is relatively rare in the software industry, especially for a company like SAS that has so many products and customers worldwide.

SAS Support Communities DO work for SAS users

Because we're SAS and we measure everything, we have ways of measuring user success on our communities. First, we can look at the data around the topics viewed. Solved topics make up 30% of all page views on site. Many users who visit the communities site look at just one topic per visit -- the one that solves their immediate issue. That tells me that they found what they needed right away, and then moved on with their lives.

The second way we measure success: we ask. In the past few months, nearly 4000 people completed our "Tell us what you think" questionnaire. 72% of survey respondents say they found what they were looking for on SAS Support Communities. That's a solid benchmark that we strive to improve -- but our industry experts tell us that our success rate compares very favorably to other communities sites.

Communities respond fast -- no SLA needed

Communities reply time, past 90 days

If you've used SAS Technical Support, you might be aware of their policies around response times. Support tracks can have different levels of severity, and the more severe tracks have a quicker SLA (service-level agreement, or promised response time). In practice, most customers experience much faster service than the SLA policies promise, but that's not guaranteed. While SAS Support Communities don't offer a service-level agreement, they are "open" 24 hours a day, every day, around the world. Our data show that community members respond quickly, often within minutes of your question. 92% of well-phrased questions receive a reply within a day. For questions that eventually show as solved, the reply that solves the question arrives within 8 hours 72% of the time. Can you see how tapping a community of thousands of experts can expedite your path to learning and to a solution?

How to ask a good question and receive a fast reply

Experts who respond on the communities have tremendous experience and intuition, but they aren't mind readers. You have to form good questions if you want to receive a helpful response. Here are some tips for success:

  • Use a precise subject line. Try to include your goal, error message, SAS procedure name, function -- whatever keywords will help an expert to "pick up" your question as something he/she could answer. (Pro tip: "Urgent help needed!" or "SAS question" are not effective subject lines.)
  • Share example data. Many questions can be answered properly only when the responders can see the "shape" and characteristics of your data. Don't share anything proprietary, of course.
  • Show what you have tried. Community members love to nudge you towards the proper solution, but it helps if you share what you've already tried and explain any walls that you hit. If you have special constraints (must use an older version of SAS, product set, etc.), share that too.
  • Search the communities site first, before posting a new question. The act of entering a new question helps with this because you'll see the subject line "autocomplete" with suggested matching topics, even before you post. That's another reason that the first tip (precise subject line) is so important.
  • Post to the most appropriate board. There are boards for most SAS products, and these are monitored regularly by experts who specialize. Posting on the correct board helps your topic to be seen by the best experts.
  • When you receive a helpful reply, come back and mark it as an Accepted Solution, or at least click Like for the replies that are helpful. This action will help you and others to find the answer in the future.

When to call SAS Technical Support

Your peers in the community cannot solve every problem that you encounter. If you're experiencing slow performance that you can't explain, or having installation troubles, or seeing "crashes" -- you probably need to open a track with SAS Technical Support. The support consultants are experts in diagnosing and getting to the bottom of such issues. You'll most likely need to share details and logs that you would not typically share in a public forum. However, the sweet spot of the communities is the "how do I" question -- a syntax, best practice, or simple usage query that you encounter as you learn to use the software. And SAS users never stop learning -- even those of us who have decades of SAS experience.


tags: SAS Communities, SAS Technical Support

The post How SAS Support Communities can expedite your tech support experience appeared first on The SAS Dummy.

May 19, 2016

Yesterday a frustrated SAS user complained on Twitter. He's working with a database that stores an ID field as a big long number (perhaps using the database BIGINT type), and SAS can't display a number greater than 15 digits. Well, it's actually 16 digits, depending on the value:

%put Biggest Exact Int = %sysfunc(constant(EXACTINT,8));
>> Biggest Exact Int = 9007199254740992
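
To see the precision problem for yourself, here is a small sketch that stores a value one larger than that limit, first as a number and then as a character string. The variable names and the sample value are made up for illustration; the numeric version will typically come back altered, while the character version is stored exactly as typed.

```sas
data _null_;
   /* 2**53 + 1: one past the largest integer a SAS numeric stores exactly */
   id_num  = 9007199254740993;
   /* the same digits kept as text are not subject to numeric precision */
   id_char = '9007199254740993';
   put id_num= 20.;
   put id_char=;
run;
```

Compare the two lines in the log: any difference in the last digit is the precision loss described above.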

It's a controversial design decision to use an integer to represent an ID value in a database. You might save a few bytes of storage, but it limits your ability to write programs (not just SAS programs) that have to store and manipulate that value. And if you don't need to do math operations with the ID, your data consumers would rather see a nice character value there.

Fortunately, when working with databases, you can tell SAS to read numeric values as character values into your SAS data sets. In addition to solving the precision problem I've just described, this can also help when you need to join database fields with other source systems that store their key fields differently. It's usually much easier to convert the field "on the way in" rather than try to mangle it after you've already read in the records. Use the DBSASTYPE= data set option to tell SAS how to read database fields. Here's a sample SAS program that shows how I access a table using ODBC, one step without and one step with the DBSASTYPE= option.

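libname wpblogs odbc datasrc="wpblogs";
options obs=10;

/* Step 1: ID arrives as a SAS numeric (the default) */
data users_IDint (keep=ID display_name);
  set wpblogs.wp_users;
run;

/* Step 2: ID arrives as character; CHAR(20) is an assumed length, wide enough for a BIGINT */
data users_IDchar (keep=ID display_name);
  set wpblogs.wp_users 
    (dbsastype=(ID='char(20)'));
run;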

Here are the resulting tables; you can see the simple difference. One has ID as a number, and one has it as a character. Magic!

The DBSASTYPE= option is supported for virtually all SAS/ACCESS database engines, including the ubiquitous SAS/ACCESS to ODBC.

Oh, and you might be wondering how things turned out for our frustrated user on Twitter. Our SAS Cares social media team heard his plea and responded -- as they always do. And our user not only found the information useful, he took it a step further by replying back with an additional syntax tip.

tags: DBSASTYPE option, PROC SQL, sas/access

The post Tell SAS to read a database field as CHAR instead of INT or BIGINT appeared first on The SAS Dummy.

May 18, 2016

As a parent of children who love books, I can tell you that there is something humorous about taking a first name, adding a "Mc" and then a rhyming surname to make up a brand new character name. My daughters always loved to read the adventures of Hairy Maclary from Donaldson's Dairy, and we loved to read it aloud to them. It was just fun.

The Boaty McBoatface phenomenon has taken this to the next level by adding "face" as a suffix, which often has a funny punctuating effect ("silly face", "Chu chi face", "doody face," etc. Hilarious!).

I thought that I was done writing blogs about Boaty McBoatface, but I've been hearing from so many people about this topic that I need at least this one more to finish it off.

Name our ship: final results

Spoiler: NERC is not going to christen the new vessel "Boaty McBoatface." Instead the name comes from the 4th-highest vote-getter, "David Attenborough." The famous explorer earned over 11,000 votes, or 2.78% of all votes cast. However, as a crowd-pleasing nod to the plebeians, NERC will name one of the ship's remotely operated submarines "Boaty McBoatface." Hooray! I grabbed the final voting results from the NameOurShip website and re-ran my analysis. Here are the final top 10 standings.


Many imitators, but original stays on top

The original entry of "Boaty McBoatface" inspired many copycats who submitted names with a similar formula. None of them seemed to have the wide appeal of Boaty, probably because they weren't first and original, but here they are with their vote counts.

I found these in the data with a simple SQL LIKE operator, finding those names that had the pattern "-y Mcy".

proc sql;
   create table work.TheMCs as 
   select t1.title, 
            (sum(t1.likes)) format=comma20. as totalVotes
      from work.votes t1
      where t1.title like '%y Mc%'
      group by t1.title
      order by totalVotes desc;
quit;

Boaty Mac: start of a popular movement

Silly names are not limited to research vessels. The world has embraced the Boaty McBoatface pattern with much enthusiasm. A colleague sent me news about Parsey McParseface, an open-source project from Google. Grumpy McNoisybutt was proposed as a name for a rattlesnake. Even my own daughter has created Rocky McRockface, a major character in her rock cycle project.

I won't say that this is my final Boaty post. Who knows? In a couple of years I might be reporting on "Boaty McBoatface"-inspired baby names. I'm confident that at least one poor child will bear the name; that's the sort of world we live in. Fortunately, children usually find a way to have revenge on their parents (which is why I have nothing but praise for Rocky McRockface).

tags: Boaty McBoatface, PROC SQL

The post Copy McCopyface and the new naming revolution appeared first on The SAS Dummy.

May 15, 2016

What's the most common data reporting mechanism? Is it web-based reporting? PDFs? How about spreadsheets? Maybe, but in my experience many reports are delivered using a less-scalable and transient mechanism: e-mail.

I'm a data steward at SAS. Specifically, I look after the operational data around our blogging program and our online communities. Even though we have many self-serve reports already set up for these programs, I'm often approached with specific one-off questions. When I get asked the same question multiple times, I'll create a report that runs automatically and stays current for future requests. But for most jobs I find myself running queries and pasting the result into e-mail. Just because the request is ad-hoc doesn't mean it has to take a lot of time or appear as "quick and dirty." Here's my process for creating solid e-mail responses that please my stakeholders.

Step 1: Select data values, then Copy with Headers

This is one of my favorite new features in SAS Enterprise Guide 7.1. I use this technique several times per week. It's a big time saver because it grabs the selected data values and their column names. It places them onto the Windows clipboard as tab-delimited data.

If you don't have version 7.1 then you can approximate this technique by selecting Send To->Microsoft Excel. This launches Microsoft Excel, opens a new sheet and populates it with all of the data in the data grid, including the headers. You can then copy your selection from the sheet and continue.

Step 2. Paste selection into new e-mail message

When you paste tab-delimited data into Microsoft Outlook, you get a raggedy-looking set of lines. But don't worry -- Step 3 will take care of that. Other e-mail programs might actually create a table for you automatically.


Step 3. Apply "Convert text to table" action

In Microsoft Outlook, this option is on the Message menu, under the Table pulldown. It converts your selected text into a true table, offering you options to confirm the number of columns and rows. Shortcut: simply select Insert Table with your data selected, and Outlook creates the table.


Optional Step 4: Beautify

Microsoft Outlook offers a number of canned crowd-pleasing table layouts that will format your headings, rows and columns so they look fancy yet readable. Pick your favorite and apply.

You can see me stumble through all of the steps in this animation:

The result is an attractive table that answers a question. It looks like something that might have taken you hours to prepare. There is one danger though: because you accomplished the task so quickly, your constituents might feel emboldened to ask more difficult questions, and with greater frequency.

The post Ad-hoc reporting with SAS: Tips for the e-mail jockey appeared first on The SAS Dummy.

April 25, 2016

We've just celebrated Earth Day, but I'm here to talk about Jupyter -- and the SAS open source project that opens the door for more learning. With this new project on the page, SAS contributes new support for running SAS from within Jupyter Notebooks -- a popular browser-based environment used by professors and data scientists.

My colleague Amy Peters announced this during a SAS Tech Talk show at SAS Global Forum 2016. If you want to learn more about Jupyter and see the SAS support in action, then you can watch the video here.

Visit the project on GitHub: sas_kernel by sassoftware

Within Jupyter, the sas_kernel provides multiple ways to access SAS programming methods. The most natural method is to create a new SAS notebook, available from the New menu in the Jupyter Home window and from the File menu in an active notebook:

From a SAS notebook, you can enter and run SAS code directly from a cell:

There is even a Notebook extension (./nbextensions/showSASLog) that can show you the SAS log.

The second way that you can run SAS code is by using special Python "magics" supported by the sas_kernel. These magic commands look almost just like SAS macro calls (imagine that!). From within a Python language notebook, you can inject your SAS program code and pull in SAS results. This allows you to move easily between Python and SAS in a single environment. Here's a simple example:

proc means;
run;
ods graphics / height=500 width=800;
proc sgplot;
histogram msrp;
run;

How to get started

Currently, to run SAS with Jupyter you need:

  • SAS 9.4 or later running on Linux
  • Python 3 installed on the same machine (that's basically part of Linux)
  • Admin rights to be able to install/configure the Jupyter Notebook infrastructure and the sas_kernel.

End users of Jupyter Notebook do not need special privileges -- you need those only to install and configure the pieces that make it work. The GitHub project has all of the doc and step-by-step instructions for installation.

What's next for SAS and Jupyter?

This is just the start for SAS in the Jupyter world. Amy says that she has already received lots of interest and feedback, and SAS is working to make the Jupyter Notebook approach available in something like SAS University Edition and SAS OnDemand for Academics. Stay tuned!

tags: Jupyter, open source, Python, SAS global forum

The post How to run SAS programs in Jupyter Notebook appeared first on The SAS Dummy.

April 7, 2016

I know what you're thinking: two "Boaty McBoatface" articles within two weeks? And we're past April Fool's Day?

But since I posted my original analysis about the "Name our ship" phenomenon that's happening in the UK right now, a new contender has appeared: Poppy-Mai.

The cause of Poppy-Mai, a critically ill infant who has captured the imagination of many British citizens (and indeed, of the world), has made a very large dent in the lead that Boaty McBoatface holds.

Yes, "Boaty" still has a better-than-4:1 lead. But that's a lot closer than the 10:1 lead (over "Henry Worsley") from just over a week ago. Check out the box plot now: you can actually make out a few more dots. Voting is open for another 10 days -- and as we have seen, a lot can happen in that time.

As I take this second look at the submissions (now almost 6300) and voting data (almost 350,000 votes cast), I've found a few more entries that made me chuckle. Some of them struck me by their word play, and others cater to my nerdy sensibilities. Here they are (capitalization retained):

While I'm on this topic, I want to give a shout-out to regex101, the online regular expression tester. I was able to develop and test my regular expressions before dropping them into a PRXPARSE function call. I found that I had to adjust my regular expression to cast a wider net for valid titles from the names submissions data. Previously, I wasn't capturing all of the punctuation. While that's probably because I didn't expect punctuation to be part of a ship's name, that assumption doesn't stop people from suggesting and voting on such names. My new regex match:

  title_regex = prxparse("/'title':\s?""([a-zA-Z0-9'.\-_#\s$%&()@!]+)/");

I could probably optimize by specifying an exception pattern instead of an inclusion pattern...but this isn't the sort of project where I worry about that.
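
For the curious, an exception pattern might look something like the sketch below: instead of listing every allowed character, it captures everything up to the next double quote. The sample record is made up to resemble the fragments that my regex expects, and the variable names are hypothetical, so treat this as an untested-against-the-real-data illustration.

```sas
data _null_;
   length rec title $200;
   /* hypothetical sample record; doubled quotes embed a literal " */
   rec = "'title': ""Poppy-Mai"", 'likes': 123";
   /* [^""]+ captures any run of characters that are not a double quote */
   rx = prxparse("/'title':\s?""([^""]+)/");
   if prxmatch(rx, rec) then do;
      title = prxposn(rx, 1, rec);   /* text of capture group 1 */
      put title=;
   end;
run;
```

The trade-off is that an exclusion pattern accepts anything between the quotes, including characters that you might prefer to filter out.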

Will I write about Boaty McBoatface again? What will my next Boaty article reveal? Stay tuned!

tags: Boaty McBoatface, regular expressions, SAS programming, SGPLOT

The post Boaty McBoatface is on the run appeared first on The SAS Dummy.

March 26, 2016

In a voting contest, is it possible for a huge population to get behind a ridiculous candidate with such force that no other contestant can possibly catch up? The answer is: Yes.

Just ask the folks at NERC, the environmental research organization in the UK. They are commissioning a new vessel for polar research, and they decided to crowdsource the naming process. Anyone in the world is welcome to visit their NameOurShip web site and suggest a name or vote on an existing name submission.

As of today, the leading name is "RRS Boaty McBoatface." ("RRS" is the standard prefix for a Royal Research Ship.) This wonderfully creative name is winning the race by more than just a little bit: it has 10 times as many votes as the next highest vote getter, "RRS Henry Worsley".

I wondered whether the raw data for this poll might be available, and I was pleased to find it embedded in the web page that shows the current entries. The raw data is in JSON format, embedded in the source of the HTML page. I saved the web page source to my local machine, copied out just the JSON line with the submissions data, then used SAS to parse the results. Here's my code:

/* Point this at the file where you saved the JSON line */
filename records "c:\projects\votedata.txt";

data votes (keep=title likes);
 length likes 8;
 format likes comma20.;
 label likes="Votes";
 length len 8;
 infile records;
  /* compile the regular expressions once and retain their IDs */
  if _n_ = 1 then
    do;
      retain likes_regex title_regex;
      likes_regex = prxparse("/'likes':\s?([0-9]*)/");
      title_regex = prxparse("/'title':\s?""([a-zA-Z0-9'\s]+)/");
    end;

 /* read the next record into _INFILE_ */
 input;

 position = prxmatch(likes_regex,_infile_);
  if (position ^= 0) then
    do;
      call prxposn(likes_regex, 1, start, len);
      likes = substr(_infile_,start,len);
    end;
 start=0; len=0;

 position = prxmatch(title_regex,_infile_);
  if (position ^= 0) then
    do;
      call prxposn(title_regex, 1, start, len);
      title = substr(_infile_,start,len);
    end;
run;

With the data in SAS, I used PROC FREQ to show the current tally:

title "Vote tally for NERC's Name Our Ship campaign";
proc freq data=votes order=freq;
table title;
weight likes;
run;

The numbers are compelling: good ol' Boaty Mac has over 42% of the nearly 200,000 votes. The arguably more-respectable "Henry Worsley" entry is tracking at just 4%. I'm not an expert on polling and sample sizes, but even I can tell that Boaty McBoatface is going to be tough to beat.

To drive the point home a bit more, let's look at a box plot of the votes distribution.

title "Distribution of votes for ALL submissions";
proc sgplot data=votes;
hbox likes;
xaxis valueattrs=(size=12pt);
run;

In this output, we have a clear outlier:

If we exclude Boaty, then it shows a slightly closer race among the other runners up (which include some good serious entries, plus some whimsical entries, such as "Boatimus Prime"):

title "Distribution of votes for ALL submissions except Boaty McBoatface";
proc sgplot data=votes(where=(title^="Boaty McBoatface"));
hbox likes;
xaxis valueattrs=(size=12pt);
run;

See the difference in the automatic axis values between the two graphs? The tick marks show 80,000 vs. 8,000 as the top values.

Digging further, I wondered whether there were some recurring themes in the entries. I decided to calculate word frequencies using a technique I found on our SAS Support Communities (thanks to Cynthia Zender for sharing):

/* Tally the words across all submissions */
data wdcount(keep=word);
    set votes;
    i = 1;
    origword = scan(title,i);
    word = compress(lowcase(origword),'?');
    wordord = i;
    do until (origword = ' ');
        /* exclude the most common words */
        if word not in ('a','the','of','and') then output;
        i + 1;
        wordord = i;
        origword = scan(title,i);
        word = compress(lowcase(origword),'?');
    end;
run;

proc sql;
   create table work.wordcounts as 
   select t1.word, 
          /* count_of_word */
            (count(t1.word)) as word_count
      from work.wdcount t1
      group by t1.word
      order by word_count desc;
quit;

title "Frequently occurring words in boat name submissions";
proc print data=wordcounts(obs=25);
run;

The top words evoke the northern, cold nature of the boat's mission. Here are the top 25 words and their counts:

  1    polar         352 
  2    ice           193 
  3    explorer      110 
  4    arctic         86 
  5    red            69 
  6    sir            55 
  7    john           54 
  8    lady           46 
  9    sea            42 
 10    ocean          42 
 11    scott          41 
 12    bear           39 
 13    aurora         38 
 14    artic          37 
 15    queen          37 
 16    captain        36 
 17    james          36 
 18    endeavour      35 
 19    william        35 
 20    star           34 
 21    spirit         34 
 22    new            26 
 23    antarctic      26 
 24    boat           25 
 25    cold           25 

I don't know when voting closes, so maybe whimsy will yet be outvoted by a more serious entry. Or maybe NERC will exercise their right to "take this under advisement" and set a certain standard for the finalist names. Whatever the outcome, I'm sure we haven't heard the last of Boaty...

tags: regular expressions, SAS programming, SGPLOT

The post And it's Boaty McBoatface by an order of magnitude appeared first on The SAS Dummy.