Problem Solvers

2月 152019
 
Beginning with SAS® 9.4, you can embed graphics output within HTML output using the ODS HTML5 destination. This technique works with SAS/GRAPH® procedures (such as GPLOT and GCHART), SG procedures (such as SGPLOT and SGRENDER), and when you create graphics output with ODS Graphics enabled. Most (if not all) existing web browsers support graphics output embedded in HTML5 output.

Note: The default graphics output format for the ODS HTML5 destination is Scalable Vector Graphics (SVG). SVG documents display clearly at any size in any viewer or browser that supports SVG. So, SVG files are ideal for display on a computer monitor, PDA, or cell phone; or printed documents. Because it's a vector graphic, a single SVG document can be transformed to any screen resolution without compromising the clarity of the document. Here's an example:

The same SVG graph, scaled at 90% and then at 200%. But 100% crisp!

SAS/GRAPH procedures

When you use the ODS HTML5 destination with a SAS/GRAPH procedure, specify a value of SVG, PNG, or JPEG for the DEVICE option in the GOPTIONS statement. The following sample PROC GPLOT code embeds SVG graphics inside the resulting HTML output:

goptions device=svg;
ods _all_ close;  
ods html5 path="c:\temp" file="svg_graph.html"; 
symbol1 i=none v=squarefilled; 
proc gplot data=sashelp.cars; 
  plot mpg_city * horsepower;   
  where make="Honda"; 
run;
quit;  
ods html5 close; 
ods preferences;

Note that the ODS PREFERENCES statement above resets the ODS environment back to its default settings when you use the SAS windowing environment.

When you use the PNG or JPEG device driver with the ODS HTML5 destination, add the BITMAP_MODE="INLINE" option to the ODS HTML5 statement. Here is an example:

goptions device=png;
ods _all_ close; 
ods html5 path="c:\temp" file="png_graph.html"     options(bitmap_mode="inline");
symbol1 i=none v=squarefilled; 
proc gplot data=sashelp.cars; 
  plot mpg_city * horsepower;   
  where make="Honda"; 
run;
quit;  
ods html5 close; 
ods preferences;

ODS Graphics and SG procedures

When you use SG procedures and ODS Graphics, specify a value of SVG, PNG, or JPEG for the OUTPUTFMT option in the ODS GRAPHICS statement. The following sample code uses PROC SGPLOT to embed SVG graphics inside the HTML output with the ODS HTML5 destination:

ods _all_ close; 
ods html5 path="c:\temp" file="svg_graph.html"; 
ods graphics on / reset=all outputfmt=svg;
proc sgplot data=sashelp.cars; 
  scatter y=mpg_city x=horsepower / markerattrs=(size=9PT symbol=squarefilled);   
  where make="Honda"; 
run;
ods html5 close; 
ods preferences;  

The following sample code uses PROC SGPLOT to embed PNG graphics inside the HTML output with the ODS HTML5 destination:

ods _all_ close; 
ods html5 path="c:\temp" file="png_graph.html" options(bitmap_mode="inline");   
      ods graphics on / reset=all outputfmt=png;
proc sgplot data=sashelp.cars; 
  scatter y=mpg_city x=horsepower / markerattrs=(size=9PT symbol=squarefilled);   
  where make="Honda"; 
run;
      ods html5 close; 
      ods preferences; 

The technique above also works when you use the ODS GRAPHICS ON statement with other procedures that produce graphics output (such as the LIFETEST procedure).

Note that the ODS HTML5 destination supports the SAS Graphics Accelerator. The SAS Graphics Accelerator enables users with visual impairments or blindness to create, explore, and share data visualizations. It supports alternative presentations of data visualizations that include enhanced visual rendering, text descriptions, tabular data, and interactive sonification. Sonification uses non-speech audio to convey important information about the graph.

You can use the ODS HTML5 destination in most situations where you need to embed all of your output into a single HTML output location. For example, when you email HTML output as an attachment or when you create graphics output via a SAS stored process. If you currently use the ODS HTML destination, you might want to experiment with the ODS HTML5 destination to see whether it meets your needs even if you cannot completely switch to it yet.

Embed scalable graphics using the ODS HTML5 destination was published on SAS Users.

1月 182019
 
Interested in learning about what's new in a software release? What if you want to know whether anything has changed in a SAS product? Or whether there are steps that you need to take before you upgrade to a new software release?
 
The online SAS product documentation for SAS® 9.4 and SAS® Viya® contains new sections that provide these answers. The following sections are usually listed on the left-hand side of the table of contents for the online Help document:
 
“What’s New in Base SAS: Details”
“What’s New in SAS 9.4 and SAS Viya”
“SAS Guide to Software Updates and Product Changes”
Note: To make the product-change information easier to find, this section was retitled for SAS® 9.4M6 and SAS® Viya® 3.4. For documentation about previous releases of SAS 9.4, this section is called “SAS Guide to Software Updates.” The information about product changes is included in a subsection called “Product Details and Requirements.” Although the title is different in newer documentation, the content remains the same.

What's available in each section?

• “What’s New” contains information about new features. For example, in SAS 9.4M6, “What’s New” discusses a new ODS destination for Word and a new procedure called PROC SGPIE.
 
• The “SAS Guide to Software Updates and Product Changes” includes the following subsections:
      o A section on software updates, which is for customers who are upgrading to a new release. (FYI: A software update is any modification that SAS provides for existing software. An upgrade is a new release of SAS. A maintenance release is a collection of updates that are applied to the currently installed software.)
      o Another subsection discusses product details and requirements. In it, you will find information about values or settings that have changed from one release to the next. For example, for SAS 9.4M6, the default style for HTML5 output changed from HTMLBlue to HTMLEncore. Another example is for SAS® 9.4M0, when the YEARCUTOFF= system option changed from 1920 to 1926.

Other links to these resources

In the “What’s New” documentation, there is a link in each section to the corresponding product topic in “SAS Guide to Software Updates and Product Changes.”

For example, when you scroll through this SAS Studio page, you see references both to various “What’s New” pages for different versions of SAS Studio and to “SAS Guide to Software Updates and Product Changes.”

In “What's New in Base SAS: Details,” you can search by the software and maintenance release to find new features. Beginning with SAS 9.4, new features for maintenance releases are introduced using the SAS 9.4Mx notation. For example, in the search box on the page, you can enter 9.4M6.

Final thoughts

With these new online Help sections, you can find information quickly about new features of the current SAS release, as well as what has changed from the previous release. As always, we welcome your feedback and suggestions for improving the documentation.

Special thanks

Elizabeth Downes and Marie Dexter in SAS Documentation Development were very willing to make the requested wording changes in the documentation. They also contributed to the content of this article. Thanks to both for their time and effort!

Navigating SAS documentation to find out about new, modified, and updated features was published on SAS Users.

12月 142018
 
Several years ago, I wrote a paper about the top-ten questions about the DATA step that SAS Technical Support receives from customers. Those topics are still popular among people who contact us for help. In this blog, I’m sharing some additional questions that we’re asked on a regular basis. Those questions cover SAS dates, arrays, and how to reference local PC files from SAS® Enterprise Guide® and SAS® Studio when those applications connect to a SAS® server in UNIX operating environments.

About SAS® dates

Let’s begin with dates. We regularly hear customers say something similar to this: "I have a date, but I’m not sure how to use it or whether it’s even a SAS date yet." No worries--we can figure it out! A SAS date is a numeric variable whose value represents the number of days between January 1, 1960 and a specific date. For example, assume that you have a variable named X that has a value of 12398, but you’re not sure what that value represents. Is it a SAS date? Or does it represent January 23, 1998?
 
To determine what the value represents, you first need to run the CONTENTS procedure on the data set and determine whether the variable in question is character or numeric.
 
For this example, here is the partial output from the PROC CONTENTS step:

Alphabetic List of Variables and Attributes
#    Variable    Type    Len    Format

1    x           Num       8
2    y           Char      3
3    z           Num       8    Z5.

If X is a numeric variable, is a format shown in the FORMAT column for that variable? In this case, the answer is no. However, if the variable is numeric and there is no assigned format, this might be a SAS date that needs to be formatted to make sense of the value. If you run a simple DATA step to add any date format to that SAS date value, you will see that 12398 represents the date December 11, 1993.
 
data a;
mydate=12398;
format mydate worddate.;
run;  

If you print the results of this program with the PRINT procedure, the output for data set A is as shown below:
 
Obs         mydate

 1     December 11, 1993

Is this a valid date in the context of this data sample? If you’re unsure, look at the other date values to see whether most of them are similarly structured. Most of the time, if a variable is stored as a SAS date, the variable is already assigned a date format, which is shown in the PROC CONTENTS output. If the value 12398 is a numeric variable such that the digits represent the month, day, and year of a given date (for example, January 23, 1998), you can convert it to a SAS date by running the following DATA step:
 
data a;
x=12398;
y=input(put(x,5.),mmddyy6.);
format y date9.;
run;

The PROC PRINT output from this step shows that the variable Y has a formatted value of 23JAN1998.
 
Obs      x              y

 1     12398    23JAN1998

The format that you assign to the variable can be any SAS format or custom-date format.
 
If the original variable is a character variable, you can convert it to a SAS date by using the INPUT function and the MMDDYY6. informat.
 
data a;
x='12398';
y=input(x,mmddyy6.);
format y date9.;
run;

Using arrays in SAS

Many customers aren’t quite sure that they understand how to use arrays. Arrays are a common construct in many programming languages. Arrays can seem less complex when you remember that they are a temporary grouping of variables. When you perform the same operation on multiple variables, you have less to program if you can refer to a group of variables by a single name. You simply execute a DO loop that processes each variable in turn, and the task is complete!

We often see arrays used for "reshaping data" or transposing a data set from wide-to-long (or long-to-wide). For example, assume that you want to reshape a data set, comprised of three variables and four observations, into a data set that contains twelve variables. Using an array approach makes the programming much easier, as shown below:

In this example:

    1. The variables X, Y, and Z are loaded into an array named VARS, which means that they can be referred to as VARS(1) – VARS(3) or by the variable names X, Y, and Z.
    2. A multidimensional array named ALL is created with twelve variables. The first number in parentheses represents rows, and the second represents columns.
    3. A DO loop processes each variable in the VARS array.
    4. The ALL array is populated one observation at a time by the value of I and the value of J as the DO loop increments.

Because the ALL array is populated by each observation as it is read from data set One, the END= option in the SET statement creates the variable LAST as a flag. This variable indicates when the last observation is read, and the IF statement tests variable LAST. If the variable has a value of 1 (which evaluates to "true"), the statement prints the contents of the program data vector to the output data set. Here's the starting data set and the reshaped result:

Managing PC files in client/server environments

When I began working in Technical Support many years ago, the only interface to Base SAS® software was the Display Manager System, which has separate Program Editor, Log, and Output windows. Now, you can run SAS in various ways, and many of our customers use SAS Enterprise Guide and SAS Studio as their interfaces. One of the most frequently asked questions from customers is about how to access local PC files from these applications that access SAS through a UNIX server.

SAS Enterprise Guide offers built-in tasks to upload and download data sets and other files. You can find these tasks on the Tasks->Data menu.

Two of the tasks, Upload Data Files to Server and Download Data Files to PC, allow you to copy SAS data sets directly between your local PC and your SAS libraries. The third task, Copy Files, allow you to copy any file (or group of files) between your local PC and the file system of the SAS session. See this article to learn how to apply a common pattern with this task: export and download any file from SAS Enterprise Guide. (Note: The Copy Files task was added in SAS Enterprise Guide 7.13. For earlier releases, you can follow the steps in this article.)

If you’re using the SAS Studio interface, you can upload and download files between the server and your PC.

Upload File and Download File buttons in SAS Studio

 
To download a file from the SAS server to your computer:

    1. Select the file that you want to download from the folder tree.
    2. Click the download button and save the file according to the information in your browser dialog box.

To upload one or more files from your local computer:

    1. Select the folder to which you want to upload the files and click the upload button.
    2. In the Upload Files window, click Choose Files to browse for the files that you want to upload.
    3. Select one or more files from your computer and click Open. The selected files are displayed as well as their size. An error message is displayed when you try to upload files where the total size exceeds 10 MB.
    4. Click Upload to complete the upload process.

Always go back to the basics

The three topics that are discussed here don't represent new features or challenges. However, these topics generate many calls to Technical Support. It's a reminder that even as SAS continues to add new features and technology, we still need to know how to tackle the basic building blocks of our SAS programs.

FAQs about SAS dates, arrays and managing local PC files was published on SAS Users.

11月 282018
 
One of the great things about programming with SAS® software is that there are many ways to accomplish the same task. And, since SAS often adds new features that can make a task easier, it's important to stay informed.

This blog shows a few samples of graphs and explains how you can use new functionality to make the old graphs look new again. Over the past several releases, SAS has added more options and procedures for ODS Graphics. While your tried-and-true SAS/GRAPH programs still work, ODS Graphics can create modern-looking graphs with less code, while providing more output options. And, ODS Graphics is part of Base SAS, which means that all of these techniques work in SAS University Edition.

Note: All the graphs in this blog are created using the fifth maintenance release of SAS® 9.4M5 (TS1M5). Not all options are available in prior releases of SAS.

Adding special symbols on a graph

The following graph is created with the DATA Step Graphics Interface (DSGI), which draws the horizontal bars and airplanes as well as places the text.

However, the DSGI is not supported in releases after SAS® 9.3. In SAS 9.4 and later, you can create a similar graph using the SYMBOLCHAR statement in the SGPLOT procedure. Using this statement in PROC SGPLOT references the hexadecimal value for the airplane symbol, as shown below:

To create this graph with PROC SGPLOT, submit the following code:

data planes;
   input month $ number;
   xval2=number + 2000;
   low=0;
   format number comma8.;
   cards;
Jan 13399
Feb 13284
Mar 14725
Apr 15370
May 16252
Jun 15684
Jul 15313
Aug 16005
;
title1 height=14pt 'Number of Flights at Raleigh Durham International Airport';
title2 height=14pt 'By Month for 2018';
footnote1 height=12pt 'Source: Federal Aviation Administration TFMSC Report (Airport)';
 
 
 
proc sgplot data=planes noautolegend noborder;
hbarbasic month / response=number fillattrs=(color=graydd) nooutline
barwidth=0.5 baselineattrs=(thickness=0px);
symbolchar name=airplane char='2708'x / hoffset=0.3 voffset=0.05;
scatter x=number y=month /markerattrs=(symbol=airplane size=60px
color=black);
scatter x=xval2 y=month / markerchar=number markercharattrs=(size=14pt);
xaxis offsetmin=0 display=none;
yaxis display=(noline noticks nolabel) valueattrs=(size=14pt)
offsetmin=0.025 offsetmax=0.025;
run;

For information about PROC SGPLOT, see SGPLOT Procedure in SAS® 9.4 ODS Graphics: Procedures Guide, Sixth Edition.

For more information about the SYMBOLCHAR statement, see the section "SYMBOLCHAR Statement" in the "SGPLOT Procedure" chapter of SAS® 9.4 ODS Graphics: Procedures Guide, Sixth Edition.

Assigning colors to data values

The next example graphs show the results for a fictitious ice-cream flavor survey. Because not all the ice cream flavors are present in each survey group, macro code is used to conditionally define the PATTERN statements based on the values in the data.

You can achieve the same more easily by using attribute maps in PROC SGPLOT to associate the attributes, such as color, with data values so that the same color is always associated with the same data value. The following graph, which is similar to the one above, is created using this method:

To create this graph, submit the following code:

/* Create the input data set ICECREAM */
data icecream;
   input @1 Flavor $10. @12 Rank 1. @14 GRP $1.;
   datalines;
Strawberry 2 B
Chocolate  1 B
Vanilla    3 B
Strawberry 2 A
Vanilla    1 A
;
run;
 
proc sort;
by grp;
run;
 
data attrmap;
id='barcolors';
length value fillcolor linecolor $10;
input value $ fillcolor $;
linecolor=fillcolor;
datalines;
Strawberry pink
Chocolate CX7B3F00
Vanilla beige
;
run;
options nobyline;
title "Ice Cream Survey for Group #byval(grp)";
 
proc sgplot data=icecream dattrmap=attrmap noautolegend;
by grp;
vbar flavor / response=rank group=flavor attrid=barcolors dataskin=pressed;
run;

I changed the colors for the bars in the PROC SGPLOT code so that the bar colors look more like the ice cream that they represent. I also added the DATASKIN= option for the bars to enhance the visual appeal of the bars in the graph.

For more information about attribute maps, see the section Using Attribute Maps to Control Visual Attributes in the SAS® 9.4 ODS Graphics: Procedures Guide, Sixth Edition.

Combining BY-group graphs into a single page

The following graph shows two plots that are created by using PROC GPLOT with a BY statement. The graphs are then paneled side-by-side with the GREPLAY procedure.

You can use the SGPANEL procedure to create the same plots in side-by-side panels. The benefit to this method is that you need only one procedure both to create the plots and to panel them, as shown below:

To create these paneled plots, submit the following code:

proc sgpanel data=sashelp.class;
panelby sex / novarname rows=1 columns=2;
scatter x=age y=height;
run;

Placing symbols and labels on a map

The next graph uses the Annotate facility with the SAS/GRAPH GMAP and GPROJECT procedures to place a symbol and city name at the location of select cities in North Carolina.

Beginning with the fifth maintenance release of SAS 9.4M5 (TS1M5) in 64-bit Windows and 64-bit Linux operating environments, you can use the SGMAP procedure to create such maps. Using this method, you can create maps that show much more detail.

You can use PROC SGMAP with the OPENSTREETMAP, SCATTER, and TEXT statements to create a similar graph, as shown below:

To create this map, submit the following code:

data cities;
input y x city $20.;
cards;
35.6125 -77.36667 Greenville 
36.21667 -81.67472 Boone
35.913064 -79.056112 Chapel Hill
;
run;
 
data dummy;
input y2 x2;
datalines;
33.857977 -84.321869
36.548759 -75.460423
;
 
data cities;
set cities dummy;
run;
title1 h=10pt 'Place points on a map at city locations';
 
proc sgmap plotdata=cities;
openstreetmap;
scatter x=x y=y / markerattrs=(color=red size=10px symbol=circlefilled);
scatter x=x2 y=y2 / markerattrs=(size=0px);
text x=x y=y text=city / textattrs=(size=10pt) position=right;
run;

Because the OPENSTREETMAP statement is used in PROC SGMAP, more detail (for example, cities and roads) is included in the map.

The DUMMY data set adds coordinates to the points that are plotted to modify the display area of the map.

For more information about controlling the display area of the map, see the article How to Control Map Display Area with PROC SGMAP.

For more information about PROC SGMAP, see the SGMAP Procedure chapter in SAS/GRAPH® and Base SAS® 9.4: Mapping Reference.

See also

Many of these features have been covered in more depth within other blog articles. Visit these articles to learn more!
Examples of adding special symbols in your charts using the SYMBOLCHAR statement
Using the new SGMAP procedure to create maps in Base SAS
Adding data-driven features to your charts with ATTRS options
Controlling your graph appearance with DATASKIN and FILLTYPE options

Making great graphs even better with ODS Graphics was published on SAS Users.

11月 012018
 

This blog post was also written by SAS' Bari Lawhorn.

We have had several requests from customers who want to use SAS® software to automate the download of data from a website when there is no application programming interface (API) to do it. As an example, the Yahoo Finance website recently changed their service to decommission their API, and this generated an interesting challenge for one of our customers. This SAS programmer wanted to download historical stock price data "unattended," without having to click through a web page. While working on this case, we discovered that the Yahoo Finance website requires a cookie-crumb combination to download. To help you automate downloads from websites that do not have an API, this blog post takes you through how we used the DEBUG feature of PROC HTTP to achieve partial automation, and eventually full automation with this case.

Partial automation

To access the historical data for Apple stock (symbol: AAPL) on the Yahoo Finance website, we use this URL: https://finance.yahoo.com/quote/AAPL/history?p=AAPL

We click Historical Data --> Download Data and get a CSV file with historical stock price data for Apple. We could save this CSV file and read it into SAS. But, we want a process that does not require us to click in the browser.

Because we know the HTTP procedure, we right-click Download Data and then select Copy link address as shown from a screen shot using the Google Chrome browser below:

Note: The context menu that contains Copy link address looks different in each browser.

Using this link address, we expect to get a direct download of the data into a CSV file (note that your crumb= will differ from ours):

filename out "c:\temp\aapl.csv";
 
proc http
 url='https://query1.finance.yahoo.com/v7/finance/download/AAPL?period1=1535399845&period2=1538078245&interval=1d&events=history&crumb=hKubrf50i1P'
 method="get" out=out;
run;

However, the above code results in the following log message:

NOTE: PROCEDURE HTTP used (Total process time):
real time           0.25 seconds
cpu time            0.14 seconds
 
NOTE: 401 Unauthorized

When we see this note, we know that the investigation needs to go further.

filename out "c:\temp\aapl.csv";
proc http
 url='https://query1.finance.yahoo.com/v7/finance/download/AAPL?period1=1535399845&period2=1538078245&interval=1d&events=history&crumb=hKubrf50i1P'
 method="get" out=out;
 debug level=3;
run;

When we run the code, here's what we see in the log (snipped for convenience):

> GET
/v7/finance/download/AAPL?period1=1535399845&period2=1538078245&interval=1d&events=history&crumb=h
Kubrf50i1P HTTP/1.1
 
> User-Agent: SAS/9
> Host: query1.finance.yahoo.com
> Accept: */*
> Connection: Keep-Alive
> Cookie: B=fpd0km1dqnqg3&b=3&s=ug
> 
< HTTP/1.1 401 Unauthorized
< WWW-Authenticate: crumb
< Content-Type: application/json;charset=utf-8
 
…more output…
 
< Strict-Transport-Security: max-age=15552000
 
…more output…
 
{    "finance": {        "error": {            "code": "Unauthorized",
"description": "Invalid cookie"        }    }}
NOTE: PROCEDURE HTTP used (Total process time):
      real time           0.27 seconds
      cpu time            0.15 seconds
 
NOTE: 401 Unauthorized

The log snippet reveals that we did not provide the Yahoo Finance website with a valid cookie. It is important to note that the response header for the URL shows crumb for the authentication method (the line that shows WWW-Authenticate: crumb. A little web research helps us determine that the Yahoo site wants a cookie-crumb combination, so we need to also provide the cookie. But, why did we not need this step when we were using the browser? We used a tool called Fiddler to examine the HTTP traffic and discovered that the cookie was cached when we first clicked in the browser on the Yahoo Finance website:

Luckily, starting in SAS® 9.4M3 (TS1M3), PROC HTTP will set cookies and save them across HTTP steps if the response contains a "set-cookie:  <some cookie>" header when it successfully connects to a URL. So, we try this download in two steps. The first step does two things:

  • PROC HTTP sets the cookie for the Yahoo Finance website.
  • Adds the DEBUG statement so that we can obtain the crumb value from the log.
filename out "c:\temp\Output.txt";
 
filename hdrout "c:\temp\Response.txt";
 
proc http
 out=out
 headerout=hdrout
 url="https://finance.yahoo.com/quote/AAPL/history?p=AAPL"
 method="get";
 debug level=3;
run;

Here's our log snippet showing the set-cookie header and the crumb we copy and use in our next PROC HTTP step:

…more output…
< set-cookie: B=2ehn8rhdsf5r2&b=3&s=fe; expires=Wed, 17-Oct-2019 20:11:14 GMT; path=/;
domain=.yahoo.com
 
…more output…
 
Initialized"},"account-switch-uh-0-AccountSwitch":{"status":"initialized"}}},"CrumbStore":{"crumb":
"4fKG9lnt5jw"},"UserStore":{"guid":"","login":"","alias":"","firstName":"","comscoreC14":-1,"isSig

The second step uses the cached cookie from Yahoo Finance (indicated in the "CrumbStore" value), and in combination with the full link that includes the appropriate crumb value, downloads the CSV file into our c:\temp directory.

filename out "c:\temp\aapl.csv";
 
proc http
 url='https://query1.finance.yahoo.com/v7/finance/download/AAPL?period1=1535399845&period2=1538078245&interval=1d&events=history&crumb=4fKG9lnt5jw'
 method="get"
 out=out;
run;

With the cookie value in place, our download attempt succeeds!

Here is our log snippet:

31
32   proc http
33       out=data
34       headerout=hdrout2
35       url='https://query1.finance.yahoo.com/v7/finance/download/AAPL?period1=1534602937&peri
35 ! od2=1537281337&interval=1d&events=history&crumb=4fKG9lnt5jw'
36       method="get";
37   run;
 
NOTE: PROCEDURE HTTP used (Total process time):
      real time           0.37 seconds
      cpu time            0.17 seconds
 
NOTE: 200 OK

Full automation

This partial automation requires us to visit the website and right-click on the download link to get the URL. There’s nothing streamlined about that, and SAS programmers want full automation!

So, how can we fully automate the process? In this section, we'll share a "recipe" for how to get the crumb value -- a value that changes with each transaction. To get the current crumb, we use the first PROC HTTP statement to "screen scrape" the URL and to cache the cookie value that comes back in the response. In this example, we store the first response in the Output.txt file, which contains all the content from the page:

filename out "c:\temp\Output.txt";
filename hdrout "c:\temp\Response.txt";
 
proc http 
    out=out 
    headerout=hdrout
    url="https://finance.yahoo.com/quote/AAPL/history?p=AAPL" 
    method="get";
run;

It is a little overwhelming to examine the web page in its entirety. And the HTML page contains some very long lines, some of them over 200,000 characters long! However, we can still use the SAS DATA step to parse the file and retrieve the text or information that might change on a regular basis, such as the crumb value.

In this DATA step we read chunks of the text data and scan the buffer for the "CrumbStore" keyword. Once found, we're able to apply what we know about the text pattern to extract the crumb value.

data crumb (keep=crumb);
  infile out  recfm=n lrecl=32767;
  /* the @@ directive says DON'T advance pointer to next line */
  input txt: $32767. @@;
  pos = find(txt,"CrumbStore");
  if (pos>0) then
    do;
      crumb = dequote(scan(substr(txt,pos),3,':{}'));
      /* cookie value can have unicode characters, so must URLENCODE */
      call symputx('getCrumb',urlencode(trim(crumb)));
      output;
    end;
run;
 
%put &=getCrumb.;

Example result:

 102        %put &=getCrumb.;
 GETCRUMB=PWDb1Ve5.WD

We feel so good about finding the crumb, we're going to treat ourselves to a whole cookie. Anybody care for a glass of milk?

Complete Code for Full Automation
The following code brings it all together. We also added a PROC IMPORT step and a bonus highlow plot to visualize the results. We've adjusted the file paths so that the code works just as well on SAS for Windows or Unix/Linux systems.

/* use WORK location to store our temp files */
filename out "%sysfunc(getoption(WORK))/output.txt";
filename hdrout "%sysfunc(getoption(WORK))/response1.txt";
 
/* This PROC step caches the cookie for the website finance.yahoo.com */
/* and captures the web page for parsing later                        */
proc http 
  out=out
  headerout=hdrout
  url="https://finance.yahoo.com/quote/AAPL/history?p=AAPL" 
  method="get";
run;
 
/* Read the response and capture the cookie value from     */
/* the CrumbStore field.                                   */
/* The file has very long lines, longer than SAS can       */
/* store in a single variable.  So we read in <32k chunks. */
data crumb (keep=crumb);
  infile out  recfm=n lrecl=32767;
  /* the @@ directive says DON'T advance pointer to next line */
  input txt: $32767. @@;
  pos = find(txt,"CrumbStore");
  if (pos>0) then
    do;
      crumb = dequote(scan(substr(txt,pos),3,':{}'));
      /* cookie value can have unicode characters, so must URLENCODE */
      call symputx('getCrumb',urlencode(trim(crumb)));
      output;
    end;
run;
 
%put &=getCrumb.;
 
filename data "%sysfunc(getoption(WORK))/data.csv";
filename hdrout2 "%sysfunc(getoption(WORK))/response2.txt";
 
proc http 
    out=data 
    headerout=hdrout2
    url="https://query1.finance.yahoo.com/v7/finance/download/AAPL?period1=1535835578%str(&)period2=1538427578%str(&)interval=1d%str(&)events=history%str(&)crumb=&getCrumb."
    method="get";
run;
 
proc import
 file=data
 out=history
 dbms=csv
 replace;
run;
 
proc sgplot data=history;
  highlow x=date high=high low=low / open=open close=close;
  xaxis display=(nolabel) minor;
  yaxis display=(nolabel);
run;


Disclaimer: As we've seen, Yahoo Finance could change their website at any time, so the URLs in this blog post might not be accurate at a later date. Note that, as of the time of this writing, the above code runs error-free with Base SAS 9.4M5. And it also works in SAS University Edition and SAS OnDemand for Academics!

How to automate a data download with PROC HTTP was published on SAS Users.

9月 212018
 

SAS Technical Support occasionally receives requests from users who want to insert blank rows into their TABULATE procedure tables. PROC TABULATE syntax does not have any specific options that insert blank rows into the table results.

One way to accomplish this task is explained in SAS Sample 45972, "Add a blank row in PROC TABULATE output in ODS destinations." This sample shows you how to add a blank row between class variables in a PROC TABULATE table.  The sample code creates a new data set variable that you can use in a CLASS statement in the PROC TABULATE step.

In addition to the method that is shown this sample, there are two other methods that you can use to insert a blank row in PROC TABULATE table results. The following sections explain those methods:

Method 1: Adding a blank row between row dimension variables

This section demonstrates how to add a blank row between row dimension variables.  This method expands on the approach that is used in Sample 45972.

In the following example, additional data-set variables are added to the data set in a DATA step. The BLANKROW variable is used as a class variable, and the variables DUMMY1 and DUMMY2 are used as analysis variables in the PROC TABULATE step. This sample code also uses the AGE and SEX variables from the SASHELP.CLASS data set as class variables in PROC TABULATE.

/***  Method 1  ***/
data class;
set sashelp.class;
blankrow='  ';
 
dummy1=1;
dummy2=.;
run;
 
proc tabulate data=class missing;
class age sex blankrow;
var dummy1 dummy2;
table age*dummy1=' '
blankrow=' '*dummy2=' '
 
sex*dummy1=' '
blankrow=' '*dummy2=' '
 
all*dummy1=' ',
 
sum*F=8. pctsum / misstext=' ' row=float;
 
keylabel sum='# of Students' 
pctsum='% of Total'
all='Total';
 
title 'Add a Blank Row by Using New Data Set Variables';
run;

This method produces the result that is shown below.  You can use ODS HTML, ODS PDF, ODS RTF, or ODS EXCEL statements to display the table.

 

 

 

 

 

 

 

 

 

 

Method 2: Adding blank rows with user-defined formats and the PRELOADFMT option in the CLASS statement

The second method creates user-defined formats and uses the PRELOADFMT option in a CLASS statement in PROC TABULATE.  The VALUE statements that use the NOTSORTED option in the PROC FORMAT step establish the desired order of the results in the TABULATE results. Using the formats and specifying the ORDER=DATA option in the CLASS statement and the PRINTMISS option in the TABLE statement keeps the order requested in the PROC FORMAT VALUE statements and display the blank rows.

/***  Method 2 ***/
proc format;
value $sexf (notsorted)
'F'='F'
'M'='M'
' '=' ';
 
value agef (notsorted)
11='11'
12='12'
13='13'
14='14'
15='15'
16='16'
.='  ';
 
value $sex2f (notsorted default=8)
'F'='F'
'M'='M'
' '='Missing'
'_'=' ';
 
value age2f (notsorted)
11='11'
12='12'
13='13'
14='14'
15='15'
16='16'
.=' .'
99=' ';
 
value mymiss
0=' '
other=[8.2];
 
run;
 
proc tabulate data=sashelp.class missing;
class sex age / preloadfmt order=data;
 
table sex age all, N pctn*F=mymiss.
/ printmiss misstext=' ' style=[cellwidth=2in];
 
format sex $sexf. age agef.;
/* If there are no missing values for the class  */
/* variables, use the formats $SEXF and AGEF.*/
/* With missing values for the class variables,  */ 
/* use the formats $SEX2F and AGE2F.             */
 
keylabel N='# of Students'
PctN='% of Total'
all='Total';
 
title 'Add a Blank Row by Using the PRELOADFMT Option';
run;

This method produces the result that is shown below. You can use ODS HTML, ODS PDF, ODS RTF, or ODS EXCEL statements to display the table.

 

 

 

 

 

 

 

 

 

 

 

If you have any questions about these methods, contact SAS Technical Support. From this link, you can search the Technical Support Knowledge Base, visit SAS Communities to ask for assistance, and contact SAS Technical Support directly.

Adding blank rows in TABULATE procedure results was published on SAS Users.

8月 242018
 

I know a lot of you have been programming in SAS for a long time, which is awesome! However, when you do something for a long time, sometimes you get set in your ways and you miss out on new ways of doing things.

Although the COUNT and CAT functions have been around for a while now, I see a lot of customer code that is counting and concatenating text strings the "old-fashioned" way. In this article, I would like to introduce you to the COUNT, COUNTW, CATS and CATX functions. These functions make certain tasks much simpler, like counting words in a string and concatenating text together.

Counting words or text occurrences

First let's take a look at the COUNT and COUNTW functions.
The

Data a;
  Contributors='The Big Company INC, The Little Company, ACME Incorporated,    Big Data Co, Donut Inc.';
  Num=count(contributors,'inc','i');  /* the 'i' modifier means to ignore case*/
  Put num=;
Run;

When we examine the SAS log, we can see that NUM has a value of 3.

Num=3
NOTE: The data set WORK.A has 1 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      cpu time            0.01 seconds

The

/* DON'T USE - use COUNTW instead */
data a(drop=done i);
  x='a#b#c#d#e';
  do until(done);
    i+1;
    y=scan(x,i,'#');
    if y='' then done=1;
    else output;
  end;
  run;

I realize this code isn't terrible, but I try to avoid DO UNTIL/WHILE loops if I can. There is always the possibility of going into an infinite loop.
The COUNTW function eliminates the need for a DO UNTIL/WHILE loop.

Here is an example of logic that I use all the time. In this example, I have a macro variable that contains a list of values that I want to loop through. I can use the COUNTW function to easily loop through each file listed in the resolved value of &FILE_NAMES. The code then uses the file name on the DATA statement and the INFILE statement.

%let file_names=01JAN2018.csv 01FEB2018.csv 01MAR2018.csv 01APR2018.csv;
%macro test(files);
 
%do i=1 %to %sysfunc(countw(&file_names,%str( )));
  %let file=%scan(&file_names,&i,%str( ));
  data _%scan(&file,1,.);
    infile "c:\my files\&file";
    input region $ manager $ sales;
  run;
%end;
%mend;
%test(&file_names)

The log is too large to list here, but you can see one of the generated DATA steps in the MPRINT output of this snapshot of the log.

MPRINT(TEST):   data _01JAN2018;
MPRINT(TEST):   infile "c:\my files\01JAN2018.csv";
MPRINT(TEST):   input region $ manager $ sales;
MPRINT(TEST):   run;

This data step will be generated for each file listed.

Counting strings within another text string should be easy to do. The COUNT functions definitely make this a reality!

Concatenating strings in SAS

Now that we know how to COUNT text in SAS, let me show you how to CAT in SAS with the CATS and CATX functions.

Back in the old days, I had hair(!) and we concatenated text strings using double pipes syntax.

  X=var1||var2||var3;

This syntax is not too bad, but what if VARn has trailing blanks? Prior to SAS Version 9 you had to remove the trailing blanks from each value. Also, if the text was right justified, you had to left justify the text. This complicates the syntax:

X=trim(left(var1))||trim(left(var2))||trim(left(var3));

You can now accomplish the same thing using CATS. The
data a;
  length var1 var2 var3 $12;
  var1='abc';
  var2='123';
  var3='xyz';
  x=cats(var1,var2,var3);
  put x=;
run;

VAR1-VAR3 have a length of 12, which means each value contains trailing blanks. By using the CATS function, all trailing blanks are removed and the text is concatenated without any spaces between the text. Here is the result of the above PUT statement.

x=abc123xyz

Another common need when concatenating text together is to create a delimited string. This can now be done using the CATX function. The

data a;
  length var1 var2 var3 $12;
  var1='abc';
  var2='123';
  var3='xyz';
  x=catx(',',var1,var2,var3);
  put x=;
run;

This syntax creates a comma separated list with all leading and trailing blanks removed. Here is the result of the PUT statement.

x=abc,123,xyz

Just how the COUNT functions making counting text in SAS easier, the CAT functions make concatenating strings so much easier. For more explanation and examples of these CAT* functions, see this paper by Louise Hadden, Purrfectly Fabulous Feline Functions (because they are CAT functions, get it?).

Before I let you go, let me point out that in addition to the COUNT, COUNTW, CATS and CATX functions, there are also the COUNTC, CAT, CATQ and CATT functions that provide even more functionality. These functions are not used as often, so I haven't discussed them here. Please How to COUNT CATs in SAS was published on SAS Users.

7月 192018
 

Suppose that you want to know the value of a character variable that has the highest frequency count or even the top three highest values. To determine that value, you need to create an output data set and sort the data by the descending Count or _FREQ_ variable. Then you need to print the top n observations using the OBS= option, based on the number of values that you want to see. You can do this easily using any of a variety of procedures that calculate a frequency count (for example, the FREQ Procedure or the MEANS Procedure).

This blog provides two detailed examples: one calculates the top n values for a single variable and one calculates the top n values for all character variables in a data set.

Print the top n observations of a single variable

The following example prints the three values of the Make variable in the Sashelp.Cars data set that have the highest frequency count. By default, PROC FREQ prints a variable called Count in the output data set. The output data set is sorted by this variable in descending order, and the number of observations that you want to keep is printed by using the OBS= data set option.

proc freq data=sashelp.cars noprint;
tables make / out=counts(drop=percent);
run;
 
proc sort data=counts;
by descending count;
run;
 
proc print data=counts(obs=3);
run;

Print the top n observations of all character variables in a data set

Suppose that you want to know the top three values for all the character variables in a data set. The process shown in the previous section is not efficient when you have many variables. Suppose you also want to store this information in a data set. You can use macro logic to handle both tasks. The following code uses PROC FREQ to create an output data set for each variable. Further manipulation is done in a DATA step so that all the data sets can be combined. A detailed explanation follows the example code:

%macro top_frequency(lib=,dsn=);
 
/* count character variables in the data set */
proc sql noprint;
select name into :charlist separated by ' '
from dictionary.columns
where libname=%upcase("&lib") and memname=%upcase("&dsn")and type='char';
quit;
 
%put &charlist;
%let cnt=%sysfunc(countw(&charlist,%str( )));
%put &cnt;
 
%do i=1 %to &cnt;
 
/* Loop through each character variable in */
/* FREQ and create a separate output  */           
/* data set.                               */
proc freq data=&lib..&dsn noprint;
tables %scan(&charlist,&i) / missing out=out&i(drop=percent 
 rename=(%scan(&charlist,&i)=value));
run;
 
data out&i;
length varname value $100;
set out&i;
varname="%scan(&charlist,&i)";
run;
 
proc sort data=out&i;
by varname descending count;
run;
 
%end;
 
data combine;
set %do i=1 %to &cnt;
out&i(obs=3) /* Keeps top 3 for each variable. */
%end;;
run;
 
proc print data=combine;
run;
 
%mend top_frequency;
 
options mprint mlogic symbolgen;
%top_frequency(lib=SASHELP,dsn=CARS);

I begin my macro definition with two keyword parameters that enable me to substitute the desired library and data set name in my macro invocation:

%macro top_frequency(lib=,dsn=);

The SQL procedure step selects all the character variables in the data set and stores them in a space-delimited macro variable called &CHARLIST. Another macro variable called &CNT counts how many words (or, variable names) are in this list.

proc sql noprint;
select name into :charlist separated by ' '
from dictionary.columns
where libname=%upcase("&lib") and memname=%upcase("&dsn") and type='char';
quit;
 
%put &charlist;
%let cnt=%sysfunc(countw(&charlist,%str( )));
%put &cnt;

The %DO loop iterates through each variable in the list and generates output data from PROC FREQ by using the OUT= option. The output data set contains two variables: the variable from the TABLES request with the unique values of that variable and the Count variable with the frequency counts. The variable name is renamed to Value so that all the data sets can be combined in a later step. In a subsequent DATA step, a new variable, called Varname, is created that contains the variable name as a character string. Finally, the data set is sorted by the descending frequency count.

%do i=1 %to &cnt;
 
/* Loop through each character variable in PROC FREQ */ 
/* and create a separate output data set.            */
proc freq data=&lib..&dsn noprint;
tables %scan(&charlist,&i) / missing
out=out&i(drop=percent 
 rename=(%scan(&charlist,&i)=value));
run;
 
data out&i;
length varname value $100;
set out&i;
varname="%scan(&charlist,&i)";
run;
 
proc sort data=out&i;
by varname descending count;
run;
 
%end;

The final DATA step combines all the data sets into one using another macro %DO loop in the SET statement. The %END statement requires two semicolons: one ends the SET statement and one ends the %END statement. Three observations of each data set are printed by using the OBS= option.

data combine;
set %do i=1 %to &cnt;
 out&i(obs=3) /* Keeps top 3 for each variable. */
%end;;
run;

Knowing your data is essential in any programming application. The ability to quickly view the top values of any or all variables in a data set can be useful for identifying top sales, targeting specific demographic segments, trying to understand the prevalence of certain illnesses or diseases, and so on. As explained in this blog, a variety of Base SAS procedures along with the SAS macro facility make it easy to accomplish such tasks.

Learn more

These resources show different ways to create "top N" reports in SAS:

Keeping the top frequency count (n) for each character variable in a SAS data set was published on SAS Users.

6月 232018
 

Once upon a Time

TranscodingOnce upon a time, Oliver S. Füßling merely occupied a line in a SAS® program. But one day, he lost his last name, and a quest began to help our hero find the rest of his name.

Our Story Begins

The SAS Training Center wanted to re-create course data for the "Introduction to Programming 1" class. The updated class uses SAS® Studio, a new programming environment that incorporates a UTF-8 SAS session encoding. However, the course data sets contained national language characters, which are not available on an English keyboard. As a result, depending on how those programs were submitted in the new environment, they experienced the following transcoding problems:

  • character substitution
  • data truncation
  • invalid-data errors

Like the Training Center, you might encounter similar transcoding issues if you have programs that:

  • contain national language characters
  • are created in the WLatin-1 SAS session encoding
  • you move to a UTF-8 SAS session encoding.

This story explains how you can move such programs successfully to a UTF-8 environment and avoid substitution characters, data truncation, and invalid-data errors.

The programs in the "Introduction to Programming 1" class were originally submitted via an earlier English edition of the SAS® Foundation. However, the sample program in this story is created in SAS® 9.4 (English).

When the program is opened in the Enhanced Editor window, this is how a shortened version of the program looks:

Note: If you would like a copy of this program for your own testing, see the Epilogue heading at the end of this post.

In SAS 9.4 (English) for the Windows environment, the default session encoding is WLatin-1. You can see the encoding in the log by running either of the following sets of statements:

  • PROC OPTIONS OPTION=ENCODING;
    RUN;
  • %PUT ENCODING= %SYSFUNC(getOption(ENCODING));

When programs are saved from the Enhanced Editor window, the encoding for the program file defaults to Default - Western (Windows), as shown below.

When the program file that is shown earlier, which contains the name Oliver S. Füßling, is uploaded and included into the SAS Studio code editor, Oliver's last name displays replacement characters rather than the expected national language characters.

Note: This display shows SAS Studio open in a Google Chrome browser. In this browser, you see two characters (diamonds with white question marks) that are substituted for the national language characters.  If you use SAS Studio in Microsoft Internet Explorer, the display shows only one diamond, and it truncates the remainder of the name.

To begin resolving the display problem, you need to look at the code-editor status bar (bottom of the window).

Notice that there is a text-encoding setting that informs SAS Studio of the encoding of the external file. That setting is shown to the right on the status bar. In the display above, that encoding is UTF-8.

Be aware that this text-encoding setting differs from the UTF-8 SAS session encoding that is displayed by the SAS ENCODING system option, which is generated in the log when you run the OPTIONS procedure. In SAS Studio, the default text encoding is UTF-8, regardless of the session encoding. Because the pgm.sas program was saved originally from the Enhanced Editor in the default Western-Windows encoding, it is not in the encoding that the SAS Studio code editor expects.

To fix the display issue, you can use either of the following options:

Option 1

1.  Right-click the program file and select Open with text encoding.

2.  In the Select Text Encoding dialog box, select the windows-1252 encoding value from the Navigation Pane menu.

The Windows code page 1252 represents the character set that is used by Western European languages, including English, in Microsoft Windows operating environments. The WLatin-1 encoding is the SAS equivalent for the 1252 Windows code page.[1]

3.  Click OK to save your selection before you exit the dialog box.

Option 2

From the General tab in the Preferences dialog box, select a value for the default text encoding.

When you set the value in this way, the change is not reflected immediately in the existing code- editor window. You must close the program and re-open it for the setting to take effect. Any programs that you open later will retain the same setting unless you change the setting or override it by selecting another value in the Select Text Encoding dialog box.

Oliver's last name is displayed correctly in the code editor when you use the windows-1252 setting to open the file, as shown below:

However, Oliver's last name is truncated on the HTML Results tab when you submit the program.

The Plot Thickens

Although the problem is fixed in the code editor when you submit the program, Oliver's last name is truncated as Füßli in the output. However, you do not receive any notes or warnings in the log about that truncation. So, why does the truncation happen?  The ü (U-umlaut) and the ß (German Eszett) are stored as single-byte characters (SBCS) in WLatin-1, but those characters require two bytes in UTF-8. As a result, there is not adequate space to print the remaining characters in the name.

When you submit programs that contain national language characters from a single-byte encoding to a UTF-8 environment, you must be prepared to modify the program to use wider informats when you create your variables. Otherwise, character truncation can occur.

You can correct this problem easily by enlarging the column to accommodate the extra bytes that are used to store the characters in UTF-8.

Here is the modified INPUT statement that successfully reads the data in a UTF-8 SAS session. The character informat for the Lname variable is increased from $7. to $9.

input StudID $12. Age Fname $6. Mi :$2. Lname $9.;

After you increase the informat, Oliver's last name is correct when you view it on the HTML Results tab.

A Subplot Appears

What if the program file is included and executed by using the %INCLUDE statement rather than by submitting it from the code editor?

In this situation, the program stops processing with the following errors:

NOTE: The data set WORK.TEST has 1 observations and 5 variables.
NOTE: DATA statement used (Total process time):
       real time           0.00 seconds
       cpu time            0.00 seconds
 
ERROR: Invalid characters were present in the data.
ERROR: An error occurred while processing text data.
NOTE: The SAS System stopped processing this step because of errors.
 
NOTE: There were 1 observations read from the data set WORK.TEST.

In this case, the HTML Results tab does not display a last name at all.

To eliminate this error, you need to use the ENCODING= option in the %INCLUDE statement, as shown below.

%include "your-directory/pgm.sas" /encoding="windows-1252";

By including the ENCODING="windows-1252" option in the %INCLUDE statement, the program now executes successfully, as shown by the notes in the log:

 NOTE: The data set WORK.TEST has 1 observations and 5 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.01 seconds
       cpu time            0.01 seconds
 
 
 NOTE: There were 1 observations read from the data set WORK.TEST.

Happily Ever After (or, The End)!

The moral of this story is that there are many ways to avoid transcoding problems when you have national language characters in SAS programs that you save from a SAS®9 (English) session and move to a UTF-8 environment. Hopefully, you can use the tips that are provided to avoid such issues. However, if you still have problems, you can call on another hero, SAS Technical Support, for help!

Epilogue

The following program is the one used throughout this story. You can copy and paste it for your own use.

data test;
   input StudID $12. Age Fname $6. Mi :$2. Lname $7.;
   datalines;
120400310496 15 Oliver S. Füβling
;
 
proc print;
run;

Additional Resources

6月 232018
 

Once upon a Time

TranscodingOnce upon a time, Oliver S. Füßling merely occupied a line in a SAS® program. But one day, he lost his last name, and a quest began to help our hero find the rest of his name.

Our Story Begins

The SAS Training Center wanted to re-create course data for the "Introduction to Programming 1" class. The updated class uses SAS® Studio, a new programming environment that incorporates a UTF-8 SAS session encoding. However, the course data sets contained national language characters, which are not available on an English keyboard. As a result, depending on how those programs were submitted in the new environment, they experienced the following transcoding problems:

  • character substitution
  • data truncation
  • invalid-data errors

Like the Training Center, you might encounter similar transcoding issues if you have programs that:

  • contain national language characters
  • are created in the WLatin-1 SAS session encoding
  • you move to a UTF-8 SAS session encoding.

This story explains how you can move such programs successfully to a UTF-8 environment and avoid substitution characters, data truncation, and invalid-data errors.

The programs in the "Introduction to Programming 1" class were originally submitted via an earlier English edition of the SAS® Foundation. However, the sample program in this story is created in SAS® 9.4 (English).

When the program is opened in the Enhanced Editor window, this is how a shortened version of the program looks:

Note: If you would like a copy of this program for your own testing, see the Epilogue heading at the end of this post.

In SAS 9.4 (English) for the Windows environment, the default session encoding is WLatin-1. You can see the encoding in the log by running either of the following sets of statements:

  • PROC OPTIONS OPTION=ENCODING;
    RUN;
  • %PUT ENCODING= %SYSFUNC(getOption(ENCODING));

When programs are saved from the Enhanced Editor window, the encoding for the program file defaults to Default - Western (Windows), as shown below.

When the program file that is shown earlier, which contains the name Oliver S. Füßling, is uploaded and included into the SAS Studio code editor, Oliver's last name displays replacement characters rather than the expected national language characters.

Note: This display shows SAS Studio open in a Google Chrome browser. In this browser, you see two characters (diamonds with white question marks) that are substituted for the national language characters.  If you use SAS Studio in Microsoft Internet Explorer, the display shows only one diamond, and it truncates the remainder of the name.

To begin resolving the display problem, you need to look at the code-editor status bar (bottom of the window).

Notice that there is a text-encoding setting that informs SAS Studio of the encoding of the external file. That setting is shown to the right on the status bar. In the display above, that encoding is UTF-8.

Be aware that this text-encoding setting differs from the UTF-8 SAS session encoding that is displayed by the SAS ENCODING system option, which is generated in the log when you run the OPTIONS procedure. In SAS Studio, the default text encoding is UTF-8, regardless of the session encoding. Because the pgm.sas program was saved originally from the Enhanced Editor in the default Western-Windows encoding, it is not in the encoding that the SAS Studio code editor expects.

To fix the display issue, you can use either of the following options:

Option 1

1.  Right-click the program file and select Open with text encoding.

2.  In the Select Text Encoding dialog box, select the windows-1252 encoding value from the Navigation Pane menu.

The Windows code page 1252 represents the character set that is used by Western European languages, including English, in Microsoft Windows operating environments. The WLatin-1 encoding is the SAS equivalent for the 1252 Windows code page.[1]

3.  Click OK to save your selection before you exit the dialog box.

Option 2

From the General tab in the Preferences dialog box, select a value for the default text encoding.

When you set the value in this way, the change is not reflected immediately in the existing code- editor window. You must close the program and re-open it for the setting to take effect. Any programs that you open later will retain the same setting unless you change the setting or override it by selecting another value in the Select Text Encoding dialog box.

Oliver's last name is displayed correctly in the code editor when you use the windows-1252 setting to open the file, as shown below:

However, Oliver's last name is truncated on the HTML Results tab when you submit the program.

The Plot Thickens

Although the problem is fixed in the code editor when you submit the program, Oliver's last name is truncated as Füßli in the output. However, you do not receive any notes or warnings in the log about that truncation. So, why does the truncation happen?  The ü (U-umlaut) and the ß (German Eszett) are stored as single-byte characters (SBCS) in WLatin-1, but those characters require two bytes in UTF-8. As a result, there is not adequate space to print the remaining characters in the name.

When you submit programs that contain national language characters from a single-byte encoding to a UTF-8 environment, you must be prepared to modify the program to use wider informats when you create your variables. Otherwise, character truncation can occur.

You can correct this problem easily by enlarging the column to accommodate the extra bytes that are used to store the characters in UTF-8.

Here is the modified INPUT statement that successfully reads the data in a UTF-8 SAS session. The character informat for the Lname variable is increased from $7. to $9.

input StudID $12. Age Fname $6. Mi :$2. Lname $9.;

After you increase the informat, Oliver's last name is correct when you view it on the HTML Results tab.

A Subplot Appears

What if the program file is included and executed by using the %INCLUDE statement rather than by submitting it from the code editor?

In this situation, the program stops processing with the following errors:

NOTE: The data set WORK.TEST has 1 observations and 5 variables.
NOTE: DATA statement used (Total process time):
       real time           0.00 seconds
       cpu time            0.00 seconds
 
ERROR: Invalid characters were present in the data.
ERROR: An error occurred while processing text data.
NOTE: The SAS System stopped processing this step because of errors.
 
NOTE: There were 1 observations read from the data set WORK.TEST.

In this case, the HTML Results tab does not display a last name at all.

To eliminate this error, you need to use the ENCODING= option in the %INCLUDE statement, as shown below.

%include "your-directory/pgm.sas" /encoding="windows-1252";

By including the ENCODING="windows-1252" option in the %INCLUDE statement, the program now executes successfully, as shown by the notes in the log:

 NOTE: The data set WORK.TEST has 1 observations and 5 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.01 seconds
       cpu time            0.01 seconds
 
 
 NOTE: There were 1 observations read from the data set WORK.TEST.

Happily Ever After (or, The End)!

The moral of this story is that there are many ways to avoid transcoding problems when you have national language characters in SAS programs that you save from a SAS®9 (English) session and move to a UTF-8 environment. Hopefully, you can use the tips that are provided to avoid such issues. However, if you still have problems, you can call on another hero, SAS Technical Support, for help!

Epilogue

The following program is the one used throughout this story. You can copy and paste it for your own use.

data test;
   input StudID $12. Age Fname $6. Mi :$2. Lname $7.;
   datalines;
120400310496 15 Oliver S. Füβling
;
 
proc print;
run;

Additional Resources