DS2

December 29, 2016

This SAS Jedi is very excited about the SAS 9.4 M4 release, which brought many wonderful gifts just in time for Christmas. So in the interest of extending the Christmas spirit, I'm going to blog about some of my favorites! I've long loved the SAS DO statement variant which allows […]

The post SAS Jedi Christmas - SAS 9.4 M4 DS2 Do Loop Upgrade appeared first on SAS Learning Post.

November 22, 2016

Maybe programming isn’t quite as dangerous as a lightsaber battle, but if you think using SAS to turn data into action feels a little bit like magic, you should know that nobody is better at harnessing “the Force” of DS2 than SAS Jedi Mark Jordan. Mark has a resume that […]

The post SAS Jedi and DS2 Guru: Mark Jordan appeared first on SAS Learning Post.

June 7, 2016

A reader posed a question in the comments to an earlier Jedi SAS Trick, asking how to write the results of a DS2 DATA _NULL_ program to a text file. It's an interesting question, as DS2 currently has no text file handling statements or capabilities. Take, for example, this traditional […]

The post Jedi SAS Tricks: Writing to Text Files from DS2 appeared first on SAS Learning Post.

April 1, 2016

The journey continues as we hear from the instructors for each of the courses being offered on Thursday and Friday, April 21 and 22, after SAS Global Forum. Next up is Mark Jordan, who developed and will teach the Introduction to DS2 and Hadoop course. Why should people get excited […]

The post The road to SAS Global Forum: A training Q&A with the Jedi SAS appeared first on SAS Learning Post.

September 29, 2015

Thanks to the proliferation of cloud services and REST-based APIs, SAS users have been making use of PROC HTTP calls (to query these web services) and some creative DATA step or PROC GROOVY code to process the JSON results. Such methods get the job done (JSON is simply text, after all), but they aren't as robust as an official JSON parser. JSON is simple: it's a series of name-value pairs that represent an object in JavaScript. But these pairs can be nested within one another, so in order to parse the result you need to know about the object structure. A parser helps with the process, but you still need to know the semantics of any JSON response.

SAS 9.4 introduced PROC JSON, which allows you to create JSON output from a data set. But it wasn't until SAS 9.4 Maintenance 3 that we had a built-in method to parse JSON content. This method was added as a DS2 package: the JSON package.
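For the "create" direction, here is a minimal sketch of PROC JSON writing a few rows of a data set as JSON. The fileref name `jsonout` and the choice of `sashelp.class` are assumptions made for illustration; they are not part of the program discussed in this post.

```sas
/* Hypothetical temporary fileref to hold the generated JSON */
filename jsonout temp;

/* Write the first three rows of a sample data set as JSON */
proc json out=jsonout pretty;
    export sashelp.class (obs=3) / nosastags;
run;

/* Echo the generated JSON text to the log for inspection */
data _null_;
    infile jsonout;
    input;
    put _infile_;
run;
```

Going the other way, from JSON text back to variables, is the job of the DS2 JSON package.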

I created an example of the method working -- using an API that powers our SAS Support Communities! The example queries communities.sas.com for the most recent posts to the SAS Programming category. Here's a small excerpt of the JSON response.

 "post_time": "2015-09-28T16:29:05+00:00",
  "views": {
  "count": 1
  },
  "subject": "Re: How to code for the consecutive values",
  "author": {
  "href": "/users/id/13884",
  "login": "ballardw"

Notice that some items, such as post_time, are simple one-level values. But other items, such as views or author, require a deeper dive to retrieve the value of interest ("count" for views, and "login" for author). The DS2 JSON parser can help you to navigate to those values without you needing to know how many braces or colons or commas are in your way.
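As a small illustration of that navigation, the sketch below parses a hard-coded JSON string and pulls out the nested "count" value. It uses only the JSON package calls that appear in the full program later in this post (createParser, getNextToken, destroyParser); the variable name `jsonText` and the inline sample string are assumptions made for the example.

```sas
proc ds2;
  data _null_;
    dcl package json j();

    method init();
      /* jsonText is a hypothetical stand-in for an API response */
      dcl varchar(200) jsonText;
      dcl nvarchar(128) token;
      dcl int rc tokenType parseFlags;

      jsonText = '{"views":{"count":1},"author":{"login":"ballardw"}}';
      rc = j.createParser(jsonText);

      /* walk the tokens; when the count label appears, */
      /* the next token holds its value                 */
      do while (rc = 0);
        j.getNextToken(rc, token, tokenType, parseFlags);
        if (rc = 0 and token eq 'count') then
          do;
            j.getNextToken(rc, token, tokenType, parseFlags);
            put 'views count token:' token;
          end;
      end;

      rc = j.destroyParser();
    end;
  enddata;
run;
quit;
```

The loop never counts braces or commas; it simply reads tokens until it sees the label it wants, which is the same pattern the full program applies to "href", "count", and "login".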

Here is an example of the result: a series plot from PROC SGPLOT and a one-way frequency analysis from PROC FREQ. The program also produces a detailed listing of the messages, the topic content, and the datetime stamp.

[Figure: series plot of message counts over time, from PROC SGPLOT]

[Figure: one-way frequency table of messages by board, from PROC FREQ]

This is my first real DS2 program, so I'm open to feedback. I already know of a couple of improvements I should make, but I want to share it now as I think it's good enough to help others who are looking to do something similar.

The program requires SAS 9.4 Maintenance 3. It also works fine in the most recent version of SAS University Edition (using SAS Studio 3.4). All of the code runs using just Base SAS procedures.

/* DS2 program that uses a REST-based API */
/* Uses http package for API calls       */
/* and the JSON package (new in 9.4m3)   */
/* to parse the result.                  */
proc ds2; 
  data messages (overwrite=yes);
    /* Global package references */
    dcl package json j();
 
    /* Keeping these variables for output */
    dcl double post_date having format datetime20.;
    dcl int views;
    dcl nvarchar(128) subject author board;
 
    /* these are temp variables */
    dcl varchar(65534) character set utf8 response;
    dcl int rc;
    drop response rc;
 
    method parseMessages();
      dcl int tokenType parseFlags;
      dcl nvarchar(128) token;
      rc=0;
      * iterate over all message entries;
      do while (rc=0);
        j.getNextToken( rc, token, tokenType, parseFlags);
 
        * subject line;
        if (token eq 'subject') then
          do;
            j.getNextToken( rc, token, tokenType, parseFlags);
            subject=token;
          end;
 
        * board URL, nested in an href label;
        if (token eq 'board') then
          do;
            do while (token ne 'href');
               j.getNextToken( rc, token, tokenType, parseFlags );
            end;
            j.getNextToken( rc, token, tokenType, parseFlags );
            board=token;
          end;
 
        * number of views (int), nested in a count label ;
        if (token eq 'views') then
          do;
            do while (token ne 'count');
               j.getNextToken( rc, token, tokenType, parseFlags );
            end;
            j.getNextToken( rc, token, tokenType, parseFlags );
            views=inputn(token,'5.');
          end;
 
        * date-time of message (input/convert to SAS date) ;
        * format from API: 2015-09-28T10:16:01+00:00 ;
        if (token eq 'post_time') then
          do;
            j.getNextToken( rc, token, tokenType, parseFlags );
            post_date=inputn(token,'anydtdtm26.');
          end;
 
        * user name of author, nested in a login label;
        if (token eq 'author') then
          do; 
            do while (token ne 'login');
               j.getNextToken( rc, token, tokenType, parseFlags );
            end;
            * get the author login (username) value;
            j.getNextToken( rc, token, tokenType, parseFlags );
            author=token;
            output;
          end;
      end;
      return;
    end;
 
    method init();
      dcl package http webQuery();
      dcl int rc tokenType parseFlags;
      dcl nvarchar(128) token;
 
      /* create a GET call to the API                                         */
      /* 'sas_programming' covers all SAS programming topics from communities */
      webQuery.createGetMethod(
         'http://communities.sas.com/kntur85557/' || 
         'restapi/vc/categories/id/sas_programming/posts/recent' ||
         '?restapi.response_format=json' ||
         '&restapi.response_style=-types,-null&page_size=100');
      /* execute the GET */
      webQuery.executeMethod();
      /* retrieve the response body as a string */
      webQuery.getResponseBodyAsString(response, rc);
      rc = j.createParser( response );
      do while (rc = 0);
        j.getNextToken( rc, token, tokenType, parseFlags);
        if (token = 'message') then
          parseMessages();
      end;
    end;
 
    method term();
      rc = j.destroyParser();
    end;
 
  enddata;
run;
quit;
 
/* Add some basic reporting */
proc freq data=messages noprint;
    format post_date datetime11.;
    table post_date / out=message_times;
run;
 
ods graphics / width=2000 height=600;
title '100 recent message contributions in SAS Programming';
title2 'Time in GMT';
proc sgplot data=message_times;
    series x=post_date y=count;
    xaxis minor label='Time created';
    yaxis label='Messages' grid;
run;
 
title 'Board frequency for recent 100 messages';
proc freq data=messages order=freq;
    table board;
run;
 
title 'Detailed listing of messages';
proc print data=messages;
run;
 
title;

I also shared this program on the SAS Support Communities as a discussion topic. If you want to contribute to the effort, please leave me a reply with your suggestions and improvements!

tags: DS2, JSON, REST API, SAS 9.4

The post Using SAS DS2 to parse JSON appeared first on The SAS Dummy.

September 5, 2015

(DS2 would be the king!) Years ago I wrote a piece of SAS code to demonstrate the basic idea of Map-Reduce. The idea is now best implemented by the following working program in PROC DS2 (tested in SAS 9.4 TS1M2, Win7):

PROC DS2;

/*-- create some data --*/
data input_data / overwrite = yes;
dcl double d;
method init();
   dcl int i;
   do i = 1 to 10000000;
   /*-- create some money values --*/
      d = round( (ranuni(123) * 10 ), .01 );
      output;
   end;
end;
enddata;
run;

/*-- count the rows in multiple threads --*/
thread map / overwrite = yes;
dcl double c s;
keep c s;
method run();
   set input_data;
   /*-- the more computation here, the more benefit --*/
   c + 1;
   s + d;
end;
method term();
   output;
   put s= c=;
end;
endthread;
run;

/*-- blend the results into one total --*/
data reduce / overwrite = yes;
dcl thread map m;
dcl double totc tots;
keep totc tots;
method run();
   set from m threads=4;
   totc + c;
   tots + s;
end;
method term();
   output;
end;
enddata;
run;
quit;

proc print data=reduce; run;
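
A simple way to sanity-check the threaded result is to compute the same two totals in a single pass. The PROC SQL step below is an added cross-check, not part of Robert Ray's program; it assumes the DS2 step above has already created work.input_data.

```sas
/* Single-pass cross-check of the row count and the sum of d */
proc sql;
    select count(*) as totc,
           sum(d)   as tots
    from work.input_data;
quit;
```

The count should match totc in work.reduce. The sum may differ from tots in the last few decimal places, because floating-point addition depends on the order in which the threads accumulate their partial sums.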

Thanks to Robert Ray of SAS Institute for kindly allowing me to post his code.

September 5, 2015

The code:

data a;
    input i a $ b $;
    datalines;
    1 a1A b1
    1 a1A b1
    2 a2 b2
    ;
run;

data b;
    input i a $ c $;
    datalines;
    1 a1C c1
    2 .   c2
    3 .  c3
    ;
run;

data mrge;
    merge a b;
    by i;
run;

proc ds2;
    data ds2;
        method run();
            merge a b;
            by i;
        end;
    enddata;
    run;
quit;

The outputs:

[Figure: SAS_DS2_merge, the Work.Mrge and Work.Ds2 output datasets]

The comments:

1. One of the odd behaviors of the DATA step MERGE is that the value “c1” is carried over to row 2 of the merged output dataset, Work.Mrge. In the output dataset Work.Ds2 (generated by DS2), variable c is missing in row 2, which is the safer behavior we would expect.

2. In both output datasets, the value ‘a1C’ overwrote ‘a1A’ in row 1.

3. This DS2 MERGE is available in SAS 9.4 (TS1M3).
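
To see these differences without reading the two outputs side by side, one option is PROC COMPARE. This step is an added sketch, not part of the original post; it assumes both work.mrge and work.ds2 were created by the code above.

```sas
/* Compare the DATA step result (base) with the DS2 result */
proc compare base=work.mrge compare=work.ds2 listall;
run;
```

Given comment 1, the report should flag variable c in row 2 as an unequal value.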

July 21, 2015

I'm gearing up to teach the next "DS2 Programming Essentials with Hadoop" class, and thinking about Warp Speed DATA Steps with DS2 where I first demonstrated parallel processing using threads in base SAS. But how about DATA step processing at maximum warp? For that, we'll need a massively parallel processing […]

The post Jedi SAS Tricks - Maximum Warp with Hadoop appeared first on The SAS Training Post.

January 18, 2015

While perusing the SAS 9.4 DS2 documentation, I ran across the section on the HTTP package. This intrigued me because, as DS2 has no text file handling statements, I assumed all hope of leveraging Internet-based APIs was lost. But even a Jedi is wrong now and then! And what better API […]