SAS Problem Solvers

November 18, 2016
 

With fall comes cooler weather and, of course, football. Lots of football. Oftentimes, two NFL games that my husband wants to watch are on at the same time. Instead of flipping back and forth between two television stations, he can watch both games simultaneously, thanks to the picture-in-picture feature on our television. The same concept works for SAS® ODS Graphics.

Have you ever been viewing two graphs across pages, flipping back and forth between the two and wishing you could see them together? Now you can. The Graph Template Language (GTL) and PROC SGRENDER enable you to produce a graph inside of a graph, similar to the picture-in-picture feature on your television.

The Game Plan

In this example, we are going to create a graph in the upper right corner of the axis area of a larger graph. When we define the GTL, we always start with the same GTL wrapper, as is shown below. In the wrapper below, INSET is the name of the GTL definition:

proc template;
define statgraph inset;
begingraph;
 
/* insert the code that produces the graphics output */
 
endgraph;
end;
run;

For demonstration purposes, we are going to use the SAS data set Sashelp.Heart and plot the variable CHOLESTEROL. The ENTRYTITLE statement defines the title for the graph. This statement is valid within the BEGINGRAPH block, either before the outermost LAYOUT statement or after the last ENDLAYOUT statement. The plotting statements are contained within a LAYOUT block. In our example, we have enclosed the HISTOGRAM and DENSITYPLOT plotting statements inside a LAYOUT OVERLAY block. A standard axis is displayed with the BINAXIS=FALSE option in the HISTOGRAM statement. In the PROC SGRENDER statement, we point to the template definition, INSET, using the TEMPLATE option.

proc template;
define statgraph inset;
begingraph;
entrytitle 'Framingham Heart Study';
   layout overlay;
     histogram cholesterol /  binaxis=false datatransparency=0.5;
     densityplot cholesterol /  datatransparency=0.5;
   endlayout;
endgraph;
end;
run;
 
proc sgrender data=sashelp.heart template=inset;
run;

The results are shown in Figure 1.

SAS ODS Graphics

Figure 1

Special Play

Once we have produced the graph in Figure 1, we can see that we have room to display a second graph in it, in the upper right corner of the axis area. We can insert the graph inside the axis by placing the plotting statements inside of a LAYOUT OVERLAY block within a LAYOUT GRIDDED block.  Here are the details.

In the following LAYOUT GRIDDED statement, which is located after the DENSITYPLOT statement, we define the size of the graph using the options WIDTH=300px and HEIGHT=200px. And the options HALIGN=RIGHT and VALIGN=TOP place the graph in the top right corner.

layout gridded / width=300px height=200px halign=right valign=top;

The inset graph contains two regression lines. The dashed blue line represents Diastolic by Cholesterol, and the solid green line represents Systolic by Cholesterol. The NAME option is added to each of the REGRESSIONPLOT statements in order to produce a legend. The DISCRETELEGEND statement references those NAME values so that the legend is drawn for both SYSTOLIC and DIASTOLIC. The NAME values are case sensitive. I specify the Y axis label with the LABEL option within the YAXISOPTS option in the LAYOUT OVERLAY statement. I also specify THRESHOLDMAX=1 within the LINEAROPTS option within both YAXISOPTS and XAXISOPTS to ensure that the last tick mark value includes the highest value in the data.

I added a red fill pattern of L3 to the larger graph to make it stand out more.  In addition to specifying the fill pattern with the PATTERN option within the FILLPATTERNATTRS option, you must also specify FILLPATTERN within the DISPLAY option. Valid values for the PATTERN option within the FILLPATTERNATTRS option are L1-L5, R1-R5, and X1-X5.

Note:  The FILLPATTERNATTRS option and the FILLPATTERN value of the DISPLAY option are available beginning in SAS® 9.4 TS1M1. If you are running an older version of SAS, you need to remove this syntax from the program shown below.

proc template;
define statgraph inset;
begingraph;
entrytitle 'Framingham Heart Study';
  layout overlay;
     /* larger graph: histogram with a red L3 fill pattern, plus density curve */
     histogram cholesterol / binaxis=false datatransparency=0.5
                   display=(fillpattern outline fill) fillattrs=(color=lightred)
                   fillpatternattrs=(pattern=l3 color=red);
     densityplot cholesterol / datatransparency=0.5 lineattrs=(color=darkred);
     /* inset graph: sized and anchored to the upper right corner of the axis area */
     layout gridded / width=300px height=200px halign=right valign=top;
        layout overlay / yaxisopts=(label='Blood Pressure' linearopts=(thresholdmax=1))
                         xaxisopts=(linearopts=(thresholdmax=1));
           regressionplot x=cholesterol y=diastolic / lineattrs=(color=blue pattern=2) name='Diastolic';
           regressionplot x=cholesterol y=systolic / lineattrs=(color=green) name='Systolic';
           discretelegend 'Diastolic' 'Systolic' / across=2;
        endlayout;
     endlayout;
  endlayout;
endgraph;
end;
run;
 
proc sgrender data=sashelp.heart template=inset; 
run;

The results are displayed in Figure 2.

Figure 2
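
The same nesting works for any corner of the axis area. The following is a minimal sketch, not part of the original example: the template name INSET_LEFT, the inset size, and the choice of WEIGHT for the inset plot are assumptions made for illustration. Whether the upper left corner is actually free of bars depends on your data.

```sas
proc template;
define statgraph inset_left;   /* hypothetical template name */
begingraph;
entrytitle 'Framingham Heart Study';
  layout overlay;
     histogram cholesterol / binaxis=false datatransparency=0.5;
     /* inset graph: HALIGN=LEFT moves it to the upper left corner */
     layout gridded / width=250px height=180px halign=left valign=top;
        layout overlay;
           regressionplot x=cholesterol y=weight;
        endlayout;
     endlayout;
  endlayout;
endgraph;
end;
run;

proc sgrender data=sashelp.heart template=inset_left;
run;
```

Only the HALIGN and VALIGN values on the LAYOUT GRIDDED statement control the placement, so the inner LAYOUT OVERLAY block can hold whatever plot statements you need.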

Now you have the tools you need to display a graph within a graph. For other tips on creating graphs with ODS Graphics and the SG procedures, check out Sanjay Matange’s blog series, Graphically Speaking, at http://blogs.sas.com/content/graphicallyspeaking/.

tags: Graph Template Language (GTL), SAS ODS, SAS Problem Solvers

Picture-in-Picture - It’s Not Just for Television Anymore was published on SAS Users.

June 24, 2016
 

XML has become one of the major standards for moving data across the Internet. Among XML’s strengths are that it describes data better and is more extensible than predecessors such as CSV. Because XML is increasingly popular for moving data, this article provides a few tips that will help when you need to read XML files into SAS software.

Reading XML Files

You can read XML files into SAS using either the XML engine or the XMLV2 engine. The XML engine was the first engine that SAS created to read XML files. The XMLV2 engine includes new functionality and enhancements and is aliased to the XML92 engine in SAS® 9.2.

It is easy to read XML files using the XMLV2 engine when the XML file that you read uses the GENERIC markup type and conforms to a very rectangular definition. Here is an example of the XML file layout that is required to be read natively using the XMLV2 engine:

reading XML files into SAS® software
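
The layout image is not reproduced here. As a rough illustration of a rectangular, GENERIC-style layout, the sketch below writes a small XML file from a DATA step and reads it back without an XMLMap file. The file name, the STUDENTS table, and its columns are hypothetical and are not taken from the original article. The assumed structure is a root element whose repeating child elements become rows and whose grandchild elements become columns.

```sas
/* Hypothetical file location inside the WORK directory */
filename genxml "%sysfunc(pathname(work))/generic.xml";

/* Write a small rectangular XML document */
data _null_;
  file genxml;
  put '<?xml version="1.0" encoding="UTF-8"?>';
  put '<TABLE>';
  put '  <STUDENTS>';
  put '    <ID>1</ID>';
  put '    <NAME>Alice</NAME>';
  put '    <AGE>13</AGE>';
  put '  </STUDENTS>';
  put '  <STUDENTS>';
  put '    <ID>2</ID>';
  put '    <NAME>Barbara</NAME>';
  put '    <AGE>14</AGE>';
  put '  </STUDENTS>';
  put '</TABLE>';
run;

/* The libref matches the fileref, so the engine reads that file */
libname genxml xmlv2;

proc print data=genxml.students;
run;

libname genxml clear;
```

Each STUDENTS element is expected to become one observation in a table named STUDENTS.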

If the file is not in this format, the following informative error message is generated in the SAS log:

Reading XML files into SAS Software02

The XMLMap file referenced in the error message is a SAS specific file that instructs the engine about how to create rows, columns, and other information from the XML markup. You can use either of the following methods to generate the XMLMap file:

  • the AUTOMAP= option within the SAS session beginning with SAS® 9.3 TS1M2
  • the SAS XML Mapper, which is a stand-alone Java application

Generating Dynamic Maps within the SAS Session

Using the XMLV2 engine, you can create a dynamic XMLMap file and use it to generate SAS data sets by specifying the AUTOMAP= and the XMLMAP= options in the LIBNAME statement. The AUTOMAP= option has two possible values: REPLACE and REUSE. REPLACE updates an existing XMLMap file, whereas REUSE uses an existing XMLMap file. Both values create a new XMLMap file if one does not currently exist. When the file is automapped, one or more data sets is created. This method creates a representation that is as relational as possible based on the XML markup. In addition, it generates surrogate keys that enable you to combine the data using the SQL procedure or the DATA step. The following example demonstrates this method.

Here is the XML data that is read:

Reading XML files into SAS Software03

Here is the SAS session code:

filename datafile 'c:\test\students2.xml';
filename mapfile "c:\test\students2.map";
libname datafile xmlv2 xmlmap=mapfile automap=replace;

proc copy in=datafile out=work;
run;
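
Once the XMLMap file exists, later sessions can keep it rather than regenerate it. The following is a sketch under the same assumptions as the code above (the same file paths, with the map file already created by an earlier AUTOMAP=REPLACE run). It switches to AUTOMAP=REUSE and lists the tables that the map defines.

```sas
filename datafile 'c:\test\students2.xml';
filename mapfile  'c:\test\students2.map';

/* REUSE keeps the existing XMLMap file; it is created only if missing */
libname datafile xmlv2 xmlmap=mapfile automap=reuse;

/* List the data sets that the XMLMap file exposes */
proc datasets library=datafile;
quit;

libname datafile clear;
```

REUSE is a reasonable choice when you have edited the generated map by hand and do not want those edits overwritten.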

Here is the output:

Reading XML files into SAS Software04

For more information about using the XMLV2 engine, refer to the “SAS® 9.4 XMLV2 LIBNAME Engine Tip Sheet.”

Using SAS XML Mapper

If you answer “yes” to the following questions, you should use SAS XML Mapper to generate the XMLMap file rather than generating this file dynamically within the SAS session:

  • Do you need to generate an XMLMap from an XML schema file?
  • Do you need to generate a custom XMLMap file or map only a portion of the file?
  • Do you need to view the data to be imported before reading the XML file?
  • Do you need to add specific metadata such as data types, informats/formats, column widths, or names?
  • Do you want to generate sample code for the import of the XML file?

The following image shows SAS XML Mapper once you click the icon that runs the automapping feature, which enables you to quickly generate tables from the XML markup:

Reading XML files into SAS Software05

Using SAS XML Mapper, you can also generate a custom XMLMap file.

For more information, refer to the following video tutorials about creating XMLMap files:

I hope this article helps you easily read XML files into SAS regardless of the file structure. This information should also help you determine the appropriate method to use when generating an XMLMap file.

tags: SAS Problem Solvers, SAS Programmers

Tips for reading XML files into SAS® software was published on SAS Users.

December 18, 2015
 

Will indexing my SAS data sets help? This is one of the most frequent questions I hear in SAS Technical Support.  The response is always the same: “Maybe.  Tell me about your data, and what you are doing with it.”  Here is a primer on effective indexing.

Indexing can improve performance in some situations, and in other situations, indexing harms performance.  Several factors need to be considered:

  • Is your data set large?
  • Is the amount of data that you are extracting a small percentage of the total number of observations?
  • Is the data refreshed frequently or seldom?
  • Is the data frequently subset by the same variable or set of variables?
  • Is the data sorted by those same variables?

The answers to these questions will guide you to your answer.

Size of the data set and extract

The size of the data set, the number of observations, and the percentage of the observations that will be extracted are the first questions to answer. The words large and small from the first two questions in the list above are relative terms. As a general rule, if you are extracting more than 35% of the data in a query, then an index does not provide much benefit, if any. Back to the point about large or small, if you get a 50% improvement in performance of a job that originally took 40 minutes, that’s impressive. If the job originally took 40 seconds, a 50% improvement does not seem compelling.

Data refresh, subset, and sort

Next you should consider how often the data set is refreshed or rebuilt: daily, weekly, or monthly? Creating an index is not without cost. It takes space and time to build the index. Index files can often take as much or more space than the data set itself. The amount of time required to build the index depends on the data itself. It can take from minutes to hours if the number of observations is measured in the tens or hundreds of millions. So if the data is refreshed each evening and you are going to query it once each day, it might not be worth the time and space to build the index.

Likewise, if you are going to query the data multiple times using different variables each time, indexing the data set has very limited benefit, if any.

An example problem and solution

Let’s look at an example: a 30-million observation data set that’s refreshed each weekend.  You need to extract data based on the state of residence of the individuals in the database.  You create an index on the state, and query each state separately for additional processing.  You see some performance benefit from having the index, but it does not seem to make a lot of difference, so you call SAS Technical Support to ask whether your SAS® software is broken.

As the conversation progresses, I ask whether the data set is sorted by the state variable and find out that it is not.  This is important, and here’s why.  When the index is built, SAS scans the data set and looks at the index variable(s) on each observation.  If a new value is found, that value is added to the index table in ascending order along with the Record ID (RID) where it was found.  If a duplicate value is found, only the RID is added.  So the index table has ascending values of the index variable(s), and the RIDs are in ascending order.

If the data is originally organized by a date/time stamp or by an ID Number, then the entries for each state are scattered throughout the data set, which makes it take a very long time to build the index. Even when using the index to retrieve the data, SAS has to scan through most, if not all, of the data set.

The fix is simple: sort the data set first by state, and then create the index.  Yes, it takes time to sort the data. However, the index will build in less time, and the speed of the query will amaze you.  Without first sorting the data, you can start the query, have time to fill your coffee cup, have a brief conversation, and still get back to your computer before the extract is done.  By sorting first and then creating the index, you might not have time to get out of your chair before the query is done.
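
The sort-then-index sequence can be sketched as follows. The data set WORK.PEOPLE, its variables, and the five state codes are hypothetical stand-ins for the 30-million-observation table described above. MSGLEVEL=I asks SAS to write a note to the log when it selects an index, so you can confirm what happened rather than assume it.

```sas
/* Hypothetical sample data: 100,000 rows with a two-letter state code */
data work.people;
  call streaminit(1234);
  length state $2;
  do id = 1 to 100000;
    state = scan('NC SC VA GA TN', ceil(5 * rand('uniform')));
    output;
  end;
run;

/* Step 1: sort by the variable that you plan to index */
proc sort data=work.people;
  by state;
run;

/* Step 2: create a simple index on the sorted data set */
proc datasets library=work nolist;
  modify people;
  index create state;
quit;

/* Step 3: query with WHERE and check the log for index usage */
options msglevel=i;

data work.nc_only;
  set work.people;
  where state = 'NC';
run;
```

On a small sample like this one, SAS might decide that a sequential read is cheaper, so the log note is the thing to check.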

One final point to note...

A subsetting IF statement does not use an index. Only a WHERE statement or a WHERE clause uses an index.
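
To make the difference concrete, here is a small sketch that uses a hypothetical indexed data set, WORK.PEOPLE, built only for this illustration. Both steps return the same observations. Only the WHERE step is eligible to use the index, and with MSGLEVEL=I the log reports when an index is selected.

```sas
/* Hypothetical data set with a simple index on STATE */
data work.people(index=(state));
  length state $2;
  do id = 1 to 100000;
    state = scan('NC SC VA GA TN', mod(id, 5) + 1);
    output;
  end;
run;

options msglevel=i;

/* WHERE statement: SAS can consider the index before reading rows */
data work.nc_where;
  set work.people;
  where state = 'NC';
run;

/* Subsetting IF: every observation is read, then tested */
data work.nc_if;
  set work.people;
  if state = 'NC';
run;
```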

tags: Problem Solvers, SAS Problem Solvers, SAS Programmers

Will indexing my SAS data sets help? was published on SAS Users.

December 19, 2014
 

If you haven’t tried Scalable Vector Graphics for your web applications and other graphics needs, you’ll want to read further!

Scalable Vector Graphics (SVG) output is vector graphics output that you can display with most (if not all) modern web browsers. Because SVG output is scalable, you can zoom in to view details without losing resolution. Unlike bit-mapped images such as PNG or GIF output, SVG images can be resized or transformed without compromising clarity, which eliminates the need to produce multiple versions of the same image. Other advantages include smaller output files and suitability for a range of display sizes and types.

Which SAS products offer SVG graphics?

The SVG family of device drivers has shipped as part of the SAS/GRAPH product since the SAS 9.2 release. Note that you can only use these SVG device drivers with traditional SAS/GRAPH procedures such as PROC GPLOT and PROC GCHART.

Starting with SAS 9.3 version of the Base product, you can also create SVG output with the SAS SG procedures such as SGPLOT and SGPANEL as well as with graphics output created with ODS Graphics. In SAS 9.4, you can also use Scalable Vector Graphics to produce animations.

Typically, when you create SVG graphics, you will want to create the output in one of these ways:

  • a standalone SVG file with a file extension of .svg
  • an HTML output file using the ODS HTML statement
  • an HTML5 output file in SAS 9.4

The output method you choose depends on your application. If you’re creating standalone SVG files, you can use that SVG file in some other document and make reference to it in another HTML page. For example, a common application for this would be creating logos in SVG that can be sized to any space. If you are using SAS 9.4, the HTML5 method is the best when creating an HTML document because the SVG can be embedded directly and there are no additional files to be moved.

In this blog post I’ll show you how to produce each one of these output types using the Base Product or SAS/GRAPH. I’ve also included a list of sample SAS/GRAPH animations that you can try.

Creating Scalable Vector Graphics with Base SAS

In SAS 9.3 and SAS 9.4, you can specify Scalable Vector Graphics output by specifying the OUTPUTFMT=SVG option on the ODS Graphics statement before the procedure step, such as:

ods graphics on / outputfmt=svg; 

The examples in this section use the sashelp.cars data set shipped with the SAS 9.3 and SAS 9.4 Base product to produce a bubble plot.

Stand-alone SVG file. The following sample code uses PROC SGPLOT to write a standalone SVG file with the name sastest.svg to the C:\temp directory when running on the Windows operating system:

    ods _all_ close; 
    ods listing gpath='c:\temp';

    ods graphics / reset=all outputfmt=svg imagename='sastest'; 
 
    title1 'Plot of MPG City versus Horsepower';  
    proc sgplot data=sashelp.cars; 
      bubble x=horsepower y=mpg_city size=cylinders;
    run;

HTML file. This code uses the same PROC SGPLOT code to write an SVG file along with a corresponding HTML file to C:\temp when running on the Windows operating system:

    ods _all_ close; 
    ods html path='c:\temp' (url=none) file='svg.html'; 

    ods graphics / outputfmt=svg; 

    title1 'Plot of MPG City versus Horsepower';  
    proc sgplot data=sashelp.cars; 
      bubble x=horsepower y=mpg_city size=cylinders;
    run;

    ods html close; 
    ods listing; 

HTML 5 file. With SAS 9.4 only, you can use PROC SGPLOT with the ODS HTML5 statement to embed the SVG output in an HTML file. Note that with the code below, the SVG output is embedded inside the HTML output via the use of the svg_mode='inline' option on the ODS HTML5 statement.

    ods _all_ close; 
    ods html5 path='c:\temp' (url=none) file='svg.html'
                options(svg_mode='inline');

    ods graphics / outputfmt=svg; 

    title1 'Plot of MPG City versus Horsepower';  
    proc sgplot data=sashelp.cars; 
      bubble x=horsepower y=mpg_city size=cylinders;
    run;

    ods html5 close; 
    ods listing; 

Creating Scalable Vector Graphics with SAS/GRAPH

The examples in this section use PROC GPLOT and the sashelp.class data set to produce a linear plot of weight versus height.

Stand-alone SVG file. Here is sample SAS code that uses PROC GPLOT to write a standalone SVG file with the name sastest.svg to the Temp directory on your C: drive when running on the Windows operating system:

    ods _all_ close; 
    ods listing;

    filename grafout 'c:\temp\sastest.svg'; 

    goptions reset=all device=svg gsfname=grafout;  

    symbol1 i=none v=dot c=black h=1.5;
    axis1 minor=none;  
    title1 'Plot of Weight versus Height';
    proc gplot data=sashelp.class;
      plot weight*height / haxis=axis1 vaxis=axis1;
    run;
    quit;  

HTML file. Here’s how to write the same output to an SVG file along with a corresponding HTML file:

    ods _all_ close; 
    ods html path='c:\temp' (url=none) file='svg.html';  

    goptions reset=all device=svg;  

    symbol1 i=none v=dot c=black h=1.5;
    axis1 minor=none;  
    title1 'Plot of Weight versus Height';
    proc gplot data=sashelp.class;
      plot weight*height / haxis=axis1 vaxis=axis1;
    run;
    quit;  

    ods html close; 
    ods listing;

HTML 5 file. With SAS 9.4 only, the following sample code uses PROC GPLOT together with the ODS HTML5 statement to embed the SVG output in the resulting HTML file. Note that with the code below, the SVG output is embedded inside the HTML output via the use of the svg_mode='inline' option on the ODS HTML5 statement.

    ods _all_ close; 
    ods html5 path='c:\temp' (url=none) file='svg.html'
               options(svg_mode='inline');   

    goptions reset=all device=svg;  

    symbol1 i=none v=dot c=black h=1.5;
    axis1 minor=none;  
    title1 'Plot of Weight versus Height';
    proc gplot data=sashelp.class;
      plot weight*height / haxis=axis1 vaxis=axis1;
    run;
    quit;  

    ods html5 close; 
    ods listing;  

Using Scalable Vector Graphics for animation in SAS/GRAPH

Beginning with SAS 9.4, you can create animated graphs for the web using the SVG device driver together with new options available on the OPTIONS statement. Here are links to sample programs on support.sas.com that demonstrate how to create animated graphs for the web using the SAS 9.4 SVG device driver:
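
The linked samples are not reproduced here. As a rough sketch of the general idea, the following code uses the ANIMATION=, ANIMDURATION=, and ANIMLOOP= system options and NOANIMOVERLAY, together with the SVG universal printer, to write one frame per BY group into a single SVG file. It is an assumption-laden illustration and not one of the support.sas.com samples. It uses PROC SGPLOT rather than a SAS/GRAPH procedure, and the output path, frame duration, and axis ranges are hypothetical choices.

```sas
/* BY-group processing needs sorted data; each age becomes one frame */
proc sort data=sashelp.class out=work.class;
  by age;
run;

/* Start recording frames for the SVG universal printer */
options printerpath=svg animation=start animduration=1
        animloop=yes noanimoverlay nobyline;

ods _all_ close;
ods printer file='c:\temp\animation.svg';   /* hypothetical output path */

title1 'Weight versus Height, Age #byval(age)';
proc sgplot data=work.class;
  by age;
  scatter x=height y=weight;
  /* Fixed axis ranges keep the frames aligned */
  xaxis min=50 max=75;
  yaxis min=40 max=160;
run;

/* Stop recording before closing the destination */
options animation=stop;
ods printer close;
ods listing;
```

Open the resulting file in a web browser to play the animation.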

tags: base sas, HTML5, ods, SAS Problem Solvers, sas/graph, Scalable Vector Graphics