1月 242018
 

This article and accompanying technical white paper are written to help SAS 9 users process existing SAS 9 code multi-threaded in SAS Viya 3.3.
Read the full paper, Getting Your SAS 9 Code to Run Multi-Threaded in SAS Viya 3.3.

The Future is Multi-threaded Processing Using SAS® Viya®

When I first began researching how to run SAS 9 code as multi-threaded in SAS Viya, I decided to compile all the relevant information and detail the internal processing rules that guide how the SAS Viya Cloud Analytics Services (CAS) server handles code and data. What I found was that there are a simple set of guidelines, which if followed, help ensure that most of your existing SAS 9 code will run multi-threaded. I have stashed a lot of great information into a single whitepaper that is available today called “Getting Your SAS 9 Code to Run Multi-threaded in SAS Viya”.

Starting with the basic distinctions between single and parallel processing is key to understanding why and how some of the parallel processing changes have been implemented. You see, SAS Viya technology constructs code so that everything runs in pooled memory using multiple processors. Redefining SAS for this parallel processing paradigm has led to huge gains in decreasing program run-times, as well as concomitant increases in accuracy for a variety of machine learning techniques. Using SAS Viya products helps revolutionize how we think about undertaking large-scale work because now we can complete so many more tasks in a fraction of the time it took before.

The new SAS Viya products bring a ton of value compared to other choices you might have in the analytics marketplace. Unfortunately most open source libraries and packages, especially those developed for use in Python and R, are limited to single-threading. SAS Viya offers a way forward by coding in these languages using an alternative set of SAS objects that can run as parallel, multi-threaded, distributed processes. The real difference is in the shared memory architecture, which is not the same as parallel, distributed processing that you hear claimed from most Hadoop and cloud vendors. Even though parallel, distributed processing is faster than single-threading, it proverbially hits a performance wall that is far below what pooled and persisted data provides when using multi-threaded techniques and code.

For these reasons, I believe that SAS Viya is the future of data/decision science, with shared memory running against hundreds if not thousands of processors, and returning results back almost instantaneously. And it’s not for just a handful of statistical techniques. I’m talking about running every task in the analytics lifecycle as a multi-threaded process, meaning end-to-end processing through data staging, model development and deployment, all potentially accomplished through a point-and-click interface if you choose. Give SAS Viya a try and you will see what I am talking about. Oh, and don’t forget to read my technical white paper that provides a checklist of all the things that you may need to consider when transitioning your SAS 9 code to run multi-threaded in SAS Viya!

Questions or Comments?

If you have any questions about the details found in this paper, feel free to leave them in the comments field below.

Getting your SAS 9 code to run multi-threaded in SAS Viya 3.3 was published on SAS Users.

1月 242018
 

A popular way to use lists in the SAS/IML language is to pack together several related matrices into a single data structure that can be passed to a function. Imagine that you have written an algorithm that requires a dozen different parameters. Historically, you would have to pass those parameters to the function by explicitly listing each argument. If you use lists (introduced in SAS/IML 14.2, which is SAS 9.4M4), you can pack all parameters into one list and pass that list into the function module where the individual items can be extracted as needed.

Example: An algorithm that requires multiple parameters

To illustrate this list-passing technique, consider the algorithm for Newton's method, which is an iterative method for finding the roots of a nonlinear systems of equations. An implementation of Newton's method is included the SAS/IML documentation. Newton's method requires several arguments:

  • The name of a module that evaluates the function whose roots are sought.
  • The name of a module that evaluates the Jacobian matrix (the derivative of the function).
  • An initial guess for the solution.
  • The maximum number of iterations before the algorithm decides that the method is not converging.
  • A convergence criterion that determines the accuracy of the solution.

The example in the documentation (which was written in the 1980s) uses global variables and hard-codes the names of functions and the convergence criteria. The example is meant to demonstrate the basic ideas, but is not reusable. You can make the example more flexible and robust by passing parameters to the module as shown in the next section.

Create a list of parameters

Suppose that you define the modules F and DF to evaluate a multivariate function and the matrix of partial derivatives (the Jacobian matrix), respectively. The following modules evaluate the same mathematical function and derivatives that are used in the SAS/IML documentation:

proc iml;
/* Newton's method to solve for roots of the system of equations
  F(x1,x2) = [ x1+x2-x1*x2+2 ] = [ 0 ]
             [ x1*exp(-x2)-1 ]   [ 0 ]
*/
/* 1. define a function F that evaluates a multivariate function */
start F(x);
   x1 = x[1];   x2 = x[2];                 /* extract the values */
   f = (x1+x2-x1*x2+2) // (x1*exp(-x2)-1); /* evaluate the function */
   return ( f );
finish F;
 
/* 2. define a function DF that evaluates the Jacobian of F */
start DF(x);
   x1 = x[1];   x2 = x[2];                 /* extract the values */
   J = {. ., . .};
   J[1,1] = 1-x2;       J[1,2] = 1-x1;
   J[2,1] = exp(-x2);   J[2,2] =  -x1*exp(-x2);
   return ( J );
finish DF;

As shown in a previous article, you can create a named list that contains the parameters for Newton's algorithm. In SAS/IML 14.3, you can create the list by specifying a sequence of name-value pairs:

/* 3. list of NAME  = VALUE pairs  (syntax uses SAS/IML 14.3) */
args = [#"Func"     = "F",         /* name of module that evaluates the function */
        #"Deriv"    = "DF",        /* name of module that evaluates the deivative */ 
        #"x0"       = {0.1, -2.0}, /* initial guess for solution */
        #"MaxIters" = 100,         /* maximum iterations */
        #"ConvCrit" = 1e-6 ];      /* convergence criterion */

In SAS/IML 14.3, you can use the list item operator ($) to specify an item in a list. You can also get the name of the function and use CALL EXECUTE to evaluate the function at the initial guess, as follows:

x = args$"x0";                           /* initial guess */ 
EvalFunc = "y=" + args$"Func" + "(x);";  /* string for "y=F(x);" */
call execute(EvalFunc);                  /* evaluates function at x */
print EvalFunc, y;

This is a useful trick for evaluating a function by name. The next section uses this trick inside a module that implements Newton's method.

Pass a list of parameters to and from a SAS/IML module

You can now implement Newton's method by defining a SAS/IML module that takes one argument, which is a list. Inside the module, the parameters are extracted from the list and used to evaluate the functions F and DF and to control the iteration and convergence of Newton's method, as follows:

/* 4. Newton's method for solving the roots for F(x)=0. The argument to the module is a list. */
start Newton( L );
   x = L$"x0";                              /* initial guess */
   EvalFunc = "y=" + L$"Func" + "(x);";     /* string for "y=F(x);" */
   EvalDeriv = "D=" + L$"Deriv" + "(x);";   /* string for "D=DF(x);" */
 
   call execute(EvalFunc);          /* evaluate F at starting values */
   converged = (max(abs(y)) < L$"ConvCrit");              /* 0 or 1 */
   do iter = 1 to L$"Maxiters"      /* iterate until max iterations */
             while( ^converged );   /*      or until convergence */
      call execute(EvalDeriv);      /* evaluate derivatives D=DF(x) */
      delta = -solve(D,y);          /* solve for correction vector */
      x = x+delta;                  /* update x with new approximation */
      call execute(EvalFunc);       /* evaluate the function y=F(x)*/
      converged = (max(abs(y)) < L$"ConvCrit");
   end;
   /* you can return x or  a list that summarizes the results */
   result = [#"x"          = x,           /* approximate root (if converged) */
             #"iterations" = iter,        /* total number of iterations */
             #"status"      = converged]; /* 1 if root found; 0 otherwise */
   return ( result );
finish Newton;
 
/* 5. call Newton's method and pass in a list of parameters */
soln = Newton( args );              
xSoln = soln$"x";                   /* get root */
if soln$"status" then               /* check convergence statust */
   print "Newton's method converged in " (soln$"iterations") " iterations", xSoln;
else 
   print "Newton's method did not converge in " (soln$"iterations") " iterations";

The Newton module returns a named list that has three items. The item named "status" is a binary value that indicates whether the method converged. If the method converged, the item named "x" contains the approximate root. The item named "iterations" contains the number of iterations of the algorithm.

In summary, you can use lists to create a structure that contains many parameters. You can pass the list to a SAS/IML module, which can extract and use the parameters. This enables you to simplify the calling syntax for user-defined modules that require many parameters.

For additional examples of using lists to pass heterogeneous data to SAS/IML modules, see "More Than Matrices: SAS/IML Software Supports New Data Structures" (Wicklin, 2017, pp. 11–12) and the SAS/IML documentation for lists.

The post Use lists to pass parameters to SAS/IML functions appeared first on The DO Loop.

1月 242018
 

Using SAS with REST APIs is fun and rewarding, but it's also complicated. When you're dealing with web services, credentials, data parsing and security, there are a lot of things that can go wrong. It's useful to have a simple program that verifies that the "basic plumbing" is working before you try to push a lot of complex coding through it.

I'm gratified that many of my readers are able to adapt my API examples and achieve similar results. When I receive a message from a frustrated user who can't get things to work, the cause is almost always one of the following:

  • Not using a recent enough version of SAS. PROC HTTP was revised and improved in SAS 9.4 Maint 3. The JSON library engine was added in SAS 9.4 Maint 4 (released in 2016). It's time to upgrade!
  • Cannot access the web service through a corporate firewall. Use the SSL certificates with the SSLCALISTLOC= option. If you're using SAS University Edition, you'll need the update from December 2017 or later -- that's where SSL was added. The editions prior to that did not include the SSL support.

The following SAS program is a simple plumbing test. It uses a free HTTP test service (httpbin.org) to verify your Internet connectivity from SAS and your ability to use SSL. The endpoint returns a JSON-formatted collection of timestamps in various formats, which the program parses using the JSON library engine. I have successfully run this program from my local SAS on Windows, from SAS University Edition (Dec 2017 release), and from SAS OnDemand for Academics (using SAS Studio).

If you can run this program successfully from your SAS session, then you're ready to attempt the more complex REST API calls. If you encounter any errors while running this simple test, then you will need to resolve these before moving on to the really useful APIs. (Like maybe checking on who is in space right now...)

/* PROC HTTP and JSON libname test          */
/* Requires SAS 9.4m4 or later to run       */
/* SAS University Edition Dec 2017 or later */
 
/* utility macro to echo the JSON out */
%macro echoFile(fn=);
  data _null_;
   infile &fn end=_eof;
   input;
   put _infile_;
  run;
%mend;
 
filename resp "%sysfunc(getoption(WORK))/now.json";
proc http
 url="https://now.httpbin.org/"
 method="GET"
 out=resp;
run;
 
/* Supported with SAS 9.4 Maint 5 */
/*
 %put HTTP Status code = &SYS_PROCHTTP_STATUS_CODE. : &SYS_PROCHTTP_STATUS_PHRASE.; 
*/
 
%echoFile(fn=resp);
 
/* Tell SAS to parse the JSON response */
libname time JSON fileref=resp;
 
title "JSON library structure";
proc datasets lib=time;
quit;
 
/* interpret the various datetime vals and convert to SAS */
data values (keep=from:);
 length from_epoch 8
        from_iso8601 8
        from_rfc2822 8
        from_rfc3339 8;
 
 /* Apply the DT format to a range of vars */
 format from: datetime20.;
 set time.now;
 
 /* epoch = # seconds since 01Jan1970        */
 /* SAS datetime is based on 01Jan1960, so   */
 /* add the offset here for a SAS value      */
 from_epoch = epoch + '01Jan1970:0:0:0'dt;
 
 /* straight conversion to ISO8061           */
 from_iso8601 = input(iso8601,is8601dt.);
 
 /* trim the leading "day of week"      */
 from_rfc2822 = input(substr(rfc2822,5),anydtdtm21.);
 
 /* allow SAS to figure it out */
 from_rfc3339 = input(rfc3339,anydtdtm.);
 
 run;
 
title "Raw values from the JSON response";
proc print data=time.now (drop=ord:);
run;
 
title "Formatted values as SAS datetime";
proc print data=values;
run;
 
libname time clear;
filename resp clear;

The post How to test PROC HTTP and the JSON library engine appeared first on The SAS Dummy.

1月 242018
 

Well! I did it! I conducted a workshop for internal SAS employees wishing to take SAS certification. The workshop was to help them understand the following: SAS Certification overview Exam preparation Website Content Exam pass score Sample questions Study material As the day for the exam approaches, on 5 February, [...]

The post Demystifying SAS certification – Week 1 appeared first on SAS Learning Post.

1月 242018
 

Are you in the early stages of your Internet of Things (IoT) strategy? According to new IoT research done by the Internet of Things Institute on behalf of SAS, you are not alone. There's been a lot of buzz in the market for several years around IoT and how it's [...]

Road map to IoT success, from telematics to connected vehicles was published on SAS Voices by Suzanne Clayton

1月 232018
 

Is it time to add SAS to the list of "romance" languages?*.

It's no secret that there are enthusiastic SAS programmers who love the SAS language. So it only makes sense that sometimes, these programmers will "nerd out" and express their adoration for fellow humans by using the code that they love. For evidence, check out these contributions from some nerdy would-be Cupids:

Next-level Nerd love: popping the question with SAS

Recently a SAS community member took this to the next level and crafted a marriage proposal using SAS. With help from the SAS community, this troubadour wrote a SAS program that popped the question in a completely new way. Our hero wanted to present a SAS program that was cryptic enough so that a casual glance would not reveal its purpose. But it had to be easy to open and run within SAS University Edition. His girlfriend is a biostatistics student; she's not an expert in SAS, but she does use it for her coursework and research. With over 70 messages exchanged among nearly 20 people, the community came through.

When Brandon's beloved ran his program in SAS University Edition, she was presented with an animated graphic that metaphorically got down on one knee and asked for her hand in marriage. And she said "Yes!" (Or maybe in SAS terms her answer was "RESPONSE=1".)

Simulated version of the proposal created with SAS

A marriage proposal that was helped along by the SAS Support Communities? That's definitely a remarkable event, but it's not completely surprising to us at SAS. SAS users regularly impress us by combining their SAS skills with their creativity. And, with their willingness to help one another.

The SAS Valentine Challenge for YOU

With Valentine's Day approaching, we thought that it would be fun to prepare some Valentine's greetings using SAS. We've started a topic on the SAS Support Communities and we invite you to contribute your programs and ideas. You can use any SAS code or tool that you want: ODS Graphics, SAS/GRAPH, SAS Visual Analytics, even text-based output. Our only request is that it's something that another SAS user could pick up and use. In other words, show your work!

It's not exactly a contest so there are no rules for "winning." Still, we hope that you'll try to one-up one another with your creative contributions. The best entries will get bragging rights and some social media love from SAS and others. And who knows? Maybe we'll have some other surprises along the way. We'll come back and update this post with some of our favorite replies.

Check out the community topic and add your idea. We can't wait to see what you come up with!

* Yes, I know that "romance" languages are named such because they stem from the language spoken by the Romans -- not because they are inherently romantic in the lovey-dovey sense. Still, it's difficult to resist the play on words.

A Valentine challenge: express your <3 with SAS was published on SAS Users.

1月 222018
 

Amazon recently announced the 20 finalists in their search for a location to house their new 2nd headquarters (nicknamed 'hq2'). They showed a map on their web page, but I wanted to create my own map, with a few improvements ... Here's their official map. It's not terrible, but it [...]

The post Here are the Amazon HQ2 finalists! appeared first on SAS Learning Post.

1月 222018
 

SAS/IML 14.3 (SAS 9.4M5) introduced a new syntax for creating lists and for assigning and extracting item in a list. Lists (introduced in SAS/IML 14.2) are data structures that are convenient for holding heterogeneous data. A single list can hold character matrices, numeric matrices, scalar values, and other lists, as discussed in a previous article about how to use lists in SAS/IML.

The list creation operator

You can use square brackets to create a list. The elements of the list are separated by commas. For example, the following syntax creates a list that contains three elements. The list represents a hypothetical patient in a cholesterol-lowering study.

proc iml;
/* 1. New list creation syntax (L = [val1, val2,...]) */
/*                 Cholesterol at
       Name  Age   0mo  3mo  6mo  12mo */
P1 = ["Bob",  36, {182, 170, 170, 162}]; /* Patient 1 */

The name of the list is P1. The first element of the list is a string, the second is a scalar number, and the third is a numeric vector. In this case, the elements are specified by using literal values, but you can also use previously defined variables. The following statement is equivalent but defines the list by using existing variables:

Name = "Bob"; Age = 36; Chol = {182, 170, 170, 162};
P1 = [Name, Age, Chol];   /* define list from copies of existing variables */

As mentioned earlier, lists can contain other lists. For example, if you have multiple patients in a study, you can create a list of each patient's data, then create a list that contains all the patients' data, as follows:

P2 = ["Fred", 52, {175, 165, 155}     ]; /* Patient 2 */
P3 = ["Tom",  45, {160, 145,   ., 139}]; /* Patient 3 */
Patients = [P1, P2, P3];                 /* a list of patients */

Assign and extract list items

You can use the list item operator ($) to specify an item in a list. For each patient, the age is stored as the second item in a list. If you want the age of Bob (the first patient), you can use either of the following statements:

/* 2. New list item syntax ($) */
BobAge = P1$2;         /* get 2nd item from P1 list */
BobAge = Patients$1$2; /* get 1st item of Patients, then 2nd item from that list */

The first statement is straightforward: P1$2 means "get the second item from the P1 list." The second statement is parsed left to right. The syntax Patient$1 means "get the first item from the Patients list," which is equivalent to P1. Thus Patient$1$2 gets Bob's age.

The preceding example uses literal values to specify the position of an item in a list, but you can also use a variable. This makes it possible to extract items in a loop. For example, the following statements loop over all patients, extract the name and cholesterol values of the patient, and compute each patient's average cholesterol value during the study:

N = ListLen(Patients);      /* number of patients */
Name = j(N,1,"     ");      /* allocate vector for names */
AvgChol = j(N,1,.);         /* allocate vector for results */
do i = 1 to N;
   Name[i] = Patients$i$1;  /* get name of i_th patient */
   Chol = Patients$i$3;     /* get sequence of cholesterol values */
   AvgChol[i] = mean(Chol); /* average value */
end;
print AvgChol[rowname=Name];

You can use the list item operator on the left side of an assignment operator to change an item in a list. For example, suppose you discover that Bob's age was typed wrong: Bob is really 63, not 36. You can update Bob's data as follows:

P1$2 = 63;         /* update 2nd item in P1 list */
Patients$1 = P1;   /* update entire patient list */

Alternatively, you could use the syntax Patients$1$2 = 63 to update the data in place.

Extract sublists

To subset a list, use square brackets ([]) and specify the indices of the items. For example, the following statement extracts the second and third patients into a new list:

/* 3. Extract sublist SL = L[ {i1, i2,...} ] */
SubList = Patients[ {2 3} ]; /* Fred and Tom */

The sublist operator ([]) ALWAYS returns a list, even if you specify only one item! Thus Patients[1] is a LIST that contains one item. If you want the item itself, use Patients$1.

Named lists

In the previous example, the items in the list P1 are a patient's name, age, and cholesterol readings. If you want to extract Bob's age, you can write P1$2, but someone unfamiliar with the order of the list items would have no idea what that item represents. Thus it is helpful to define a named list, which is also called an associative array. When you specify a named list, you specify the items by using name-value pairs, as follows:

P = [#"Name" = "Bob",                        /* P$1 or P$"Name" */
     #"Age"  = 63,                           /* P$2 or P$"Age" */
     #"Cholesterol" = {182, 170, 170, 162}]; /* P$3 or P$"Cholesterol" */

You can use the names to refer to the items in a list. For example, the following statements extract the patient's name and cholesterol readings by using the list item operator:

Name = P$"Name";
Chol = P$"Cholesterol";       /* get Bob's measurements */
print Chol[Label=Name];

You can also use names in the sublist operator to extract a sublist:

L = P[ {"Age" "Cholesterol"} ];  /* sublist that contains two items */

Summary

In summary, SAS/IML 14.3 contains new syntax for creating a list and for extracting items and sublists. This syntax makes it easier to use lists and to read and write SAS/IML programs that use lists.

The post Create lists by using a natural syntax in SAS/IML appeared first on The DO Loop.

1月 202018
 

SAS/GRAPH® Annotate FacilityThe

data myanno;                                                                                                                            
  length function color $8;                                                                                                                                                                                                                             
  function='move';                                                                                                                      
    x=0;  y=0;                                                                                                                          
    output;                                                                                                                             
  function='draw';                                                                                                                      
   x=100;  y=100;                                                                                                                       
   color='red';                                                                                                                         
   output;                                                                                                                              
run;                                                                                                                                    
 
proc gplot data=sashelp.cars;                                                                                                           
  plot mpg_highway*cylinders / vaxis=axis1 haxis=axis2 annotate=myanno;                                                                                         
  symbol1 interpol=none value=dot color=blue; 
  axis1 label=(angle=90);                                                                                                               
  axis2 offset=(2,2)pct;                                                                                                                                                                                                          
run;                                                                                                                                    
quit;

 

The following annotate errors are written to the SAS log when I run this code:

NOTE: ERROR DETECTED IN ANNOTATE= DATASET WORK.MYANNO.
NOTE: PROBLEM IN OBSERVATION     2 -
      A CALCULATED COORDINATE LIES OUTSIDE THE VISIBLE AREA           X
      A CALCULATED COORDINATE LIES OUTSIDE THE VISIBLE AREA           Y

 

Here is the resulting graph:

The annotated line is drawn outside the axis area. But why? I defined my X and Y coordinates for the MOVE and DRAW functions correctly, did I not?

The coordinates are defined correctly, but what I did not define is the coordinate system for the annotation. The XSYS and

data myanno;                                                                                                                            
  length function color $8;                                                                                                             
  retain xsys ysys '1';                                                                                                                 
  function='move';                                                                                                                      
    x=0;  y=0;                                                                                                                          
    output;                                                                                                                             
  function='draw';                                                                                                                      
   x=100;  y=100;                                                                                                                       
   color='red';                                                                                                                         
   output;                                                                                                                              
run;                                                                                                                                    
 
proc gplot data=sashelp.cars;                                                                                                           
  plot mpg_highway*cylinders / vaxis=axis1 haxis=axis2 annotate=myanno;                                                                                         
  symbol1 interpol=none value=dot color=blue;
  axis1 label=(angle=90);                                                                                                               
  axis2 offset=(2,2)pct;                                                                                                                                                                                                           
run;                                                                                                                                    
quit;

 

Here is the graph containing the correct line:

Creating Multiple Graphs with an Annotate Data Set, BY-and-BY

The need to generate multiple graphs from one procedure using a BY statement is very common. However, using an Annotate data set with a BY statement can be a little tricky. Here are the general rules for using an Annotate data set with a SAS/GRAPH procedure that creates multiple graphs with a BY statement:

  1. Make sure that the Annotate data set and the input data set for the procedure include the same BY variables. The BY variables must also be the same data type in both data sets.
  2. Both the Annotate data set and the input data set must be sorted by the BY variables.
  3. Include the ANNOTATE= (or ANNO=) option in the action statement of the SAS/GRAPH procedure.

The goal of the following program is to create two graphs using a BY statement in which the annotation is specific to each graph. The Annotate data set draws the maximum MPG_Highway value at the maximum point for each X value.

/* Compute the maximum MPG_Highway values */                                                                                            
proc sort data=sashelp.cars(where=(origin in('USA' 'Europe'))) out=cars;                                                                
  by origin cylinders;                                                                                                                  
run;                                                                                                                                    
 
proc means data=cars noprint;                                                                                                           
  by origin cylinders;                                                                                                                  
  var mpg_highway;                                                                                                                      
  output out=meansout max=max;                                                                                                          
run;                                                                                                                                    
 
data myanno;                                                                                                                            
  length function color text $8;                                                                                                        
  retain xsys ysys '2' color 'black' position '2' size 1.5;                                                                             
  set meansout;                                                                                                                         
 
  function='label';                                                                                                                     
    x=cylinders;  y=max;                                                                                                                
    text=strip(max);                                                                                                                    
    output;                                                                                                                             
run;                                                                                                                                    
 
proc gplot data=cars annotate=myanno;                                                                                                   
  by origin;                                                                                                                            
  plot mpg_highway*cylinders / vaxis=axis1 haxis=axis2;                                                                                 
  symbol1 interpol=none value=dot color=blue;                                                                                           
  axis1 label=(angle=90);                                                                                                               
  axis2 offset=(2,2)pct;                                                                                                                
run;                                                                                                                                    
quit;

 

Here are the resulting graphs:

There are two issues here. First, there should be only one maximum value displayed for each X value. There are duplicate values of the annotated text on each graph. Second, the following messages are written to the SAS log:

NOTE: ERROR DETECTED IN ANNOTATE= DATASET WORK.MYANNO.
NOTE: PROBLEM IN OBSERVATION     1 -
      DATA SYSTEM REQUESTED, BUT VALUE IS NOT ON GRAPH    'Y'
NOTE: PROBLEM IN OBSERVATION     5 -
      DATA SYSTEM REQUESTED, BUT VALUE IS NOT ON GRAPH    'X'
NOTE: The above message was for the following BY group:
      Origin=USA

 

These notes tell me that either the X or the Y coordinate in two of the observations in the Annotate data set do not exist on one of the graphs. This issue occurs because the Annotate coordinates for each of the BY values are different for each graph. The axis ranges are different on the two graphs. So, when all of the annotation, instead of the annotation for only each BY value, is drawn on each graph, some of the Annotate coordinates cannot be found on the graph.

Both of these issues occur because the ANNOTATE=Myanno option is in the PROC GPLOT statement instead of in the action (PLOT) statement. Moving the ANNOTATE=Myanno option to the PLOT statement generates the expected output:

proc gplot data=cars;                                                                                                                   
  by origin;                                                                                                                            
  plot mpg_highway*cylinders / vaxis=axis1 haxis=axis2 annotate=myanno;                                                                 
  symbol1 interpol=none value=dot color=blue;                                                                                           
  axis1 label=(angle=90);                                                                                                               
  axis2 offset=(2,2)pct;                                                                                                                
run;                                                                                                                                    
quit;

 

Off the Grid

Another common issue with using an Annotate data set is when a coordinate in the Annotate data set lies outside the range of an axis on the graph. For example, I will chart the mean MPG_Highway values with the GCHART procedure and draw a symbol at the maximum value for each country of origin using an Annotate data set:

proc sort data=sashelp.cars out=cars;                                                                                                   
  by origin;                                                                                                                            
run;                                                                                                                                    
 
/* Compute the mean and the max */                                                                                                      
proc means data=cars noprint;                                                                                                           
  by origin;                                                                                                                            
  var mpg_highway;                                                                                                                      
  output out=meansout mean=mean max=max;                                                                                                
run;                                                                                                                                    
 
data myanno;                                                                                                                            
  length function color $8 text $14;                                                                                                    
  retain xsys ysys '2' color 'red' position '2' size 2;                                                                                 
  set meansout;                                                                                                                         
 
  function='symbol';                                                                                                                    
    midpoint=origin;  y=max;                                                                                                            
    text='diamondfilled';                                                                                                               
    output;                                                                                                                             
run;                                                                                                                                    
 
proc gchart data=meansout;                                                                                                              
  vbar origin / sumvar=mean annotate=myanno raxis=axis1;                                                                                
  axis1 label=(angle=90);                                                                                                               
run;                                                                                                                                    
quit;

 

When I run this program, the following graph is produced, excluding the annotated symbols:

The following annotate error messages are written to the SAS log:

NOTE: ERROR DETECTED IN ANNOTATE= DATASET WORK.MYANNO.
NOTE: PROBLEM IN OBSERVATION     1 -
      DATA SYSTEM REQUESTED, BUT VALUE IS NOT ON GRAPH    'RESPONSE'
NOTE: PROBLEM IN OBSERVATION     2 -
      DATA SYSTEM REQUESTED, BUT VALUE IS NOT ON GRAPH    'RESPONSE'
NOTE: PROBLEM IN OBSERVATION     3 -
      DATA SYSTEM REQUESTED, BUT VALUE IS NOT ON GRAPH    'RESPONSE'

 

These messages tell me that multiple response values (Y coordinates) in the Annotate data set lie outside the range of the Y axis. The procedure does not automatically extend the Y-axis range to accommodate the annotation, so I need to do this by including the ORDER= option in the AXIS1 statement:

proc gchart data=meansout;                                                                                                              
  vbar origin / sumvar=mean annotate=myanno raxis=axis1;                                                                                
  axis1 label=(angle=90) order=(0 to 70 by 10);                                                                                         
run;                                                                                                                                    
quit;

 

The correct graph is now generated:

Annotation is a useful tool that enables you to draw features on a graph that the graphics procedure might not have the capability to draw. Using an Annotate data set is easier once you understand what the SAS log messages are telling you and can take steps to avoid common issues. Don’t be afraid to dive in!

Happy drawing!

Common annotate pitfalls and how to avoid them was published on SAS Users.