SAS programmers

6月 162017
 

Using parameters within the macro facilityHave you ever written a macro and wondered if there was an easy way to pass values to the macro? You can by using macro parameters. Macro parameters enable you to pass values into the macro at macro invocation, and set default values for macro variables within the macro definition. In this blog post, I also discuss how you can pass in a varying number of parameter values.

There are two types of macro parameters: positional and keyword.

Positional Parameters

You can use positional parameters to assign values based on their position in the macro definition and at invocation. The order that you use to specify the values must match the order in which they are listed in the %MACRO statement. When specifying multiple positional parameters, use a comma to separate the parameters. If you do not pass a value to the macro when it is invoked, a null value is assigned to the macro variable specified in the %MACRO statement.

Here is an example:

%macro test(var1,var2,var3);                                                                                                            
 %put &=var1;                                                                                                                           
 %put &=var2;                                                                                                                           
 %put &=var3;                                                                                                                           
%mend test;                                                                                                                             
 
/** Each value corresponds to the position of each variable in the definition. **/ 
/** Here, I am passing numeric values.                                         **/                                                            
%test(1,2,3)                                                                                                                            
/** The first position matches with var1 and is given a null value.            **/                                                             
%test(,2,3)                                                                                                                             
/** I pass no values, so var1-var3 are created with null values.               **/                                                             
%test()                                                                                                                                 
/** The first value contains a comma, so I use %STR to mask the comma.         **/                                                             
/** Otherwise, I would receive an error similar to this: ERROR: More           **/
/** positional parameters found than defined.                                  **/                                                             
%test(%str(1,1.1),2,3)                                                                                                                  
/** Each value corresponds to the position of each variable in the definition. **/ 
/** Here, I am passing character values.                                       **/                                                            
%test(a,b,c) 
/** I gave the first (var1) and second (var2) positions a value of             **/
/** b and c, so var3 is left with a null value.                                **/                                                             
%test(b,c)

 

Here are the log results:

173  /** Each value corresponds to the position of each variable in the definition. **/
174  /** Here, I am passing numeric values.                                         **/
175  %test(1,2,3)
VAR1=1
VAR2=2
VAR3=3
176  /** The first position matches with var1 and is given a null value.            **/                                                             
177  %test(,2,3)
VAR1=
VAR2=2
VAR3=3
 
178  /** I pass no values, so var1-var3 are created with null values.               **/
179  %test()
VAR1=
VAR2=
VAR3=
180  /** The first value contains a comma, so I use %STR to mask the comma.         **/                                                             
181  /** Otherwise, I would receive an error similar to this: ERROR: More           **/
182  /** positional parameters found than defined.                                  **/                                                             
183  %test(%str(1,1.1),2,3)
VAR1=1,1.1
VAR2=2
VAR3=3
184  /** Each value corresponds to the position of each variable in the definition. **/
185  /** Here, I am passing character values.                                       **/
186  %test(a,b,c)
VAR1=a
VAR2=b
VAR3=c
187  /** I gave the first (var1) and second (var2) positions a value of             **/
188  /** b and c, so var3 is left with a null value.                               **/
189  %test(b,c)
VAR1=b
VAR2=c
VAR3=

 

Keyword Parameters

The benefit of using keyword parameters is the ability to give the macro variables a default value within the macro definition. When you assign values using keyword parameters, you must include an equal sign after the macro variable name.

Here is an example:

%macro test(color=blue,id=123);                                                                                                         
 %put &=color;                                                                                                                          
 %put &=id;                                                                                                                             
%mend test;                                                                                                                             
 
/** Values passed to the macro overwrite default values from the definition. **/                                                                 
%test(color=red,id=456)                                                                                                                 
/** Passing in no values allows the default values to take precedence.      **/                                                                 
%test()                                                                                                                                 
/** You are not required to pass in a value for each keyword parameter.    **/                                                                 
%test(color=green)                                                                                                                      
/** The order of variables does not matter.                               **/                                                                                                 
%test(id=789,color=yellow)

 

Here are the log results:

270  /** Values passed to the macro overwrite default values from the definition. **/
271  %test(color=red,id=456)
COLOR=red
ID=456
272  /** Passing in no values allows the default values to take precedence.     **/
273  %test()
COLOR=blue
ID=123
274  /** You are not required to pass in a value for each keyword parameter.   **/
275  %test(color=green)
COLOR=green
ID=123
276  /** The order of variables does not matter.                              **/
277  %test(id=789,color=yellow)
COLOR=yellow
ID=789

 

If the macro definition combines positional and keyword parameters, positional parameters must come first. If you do not follow this order, this error is generated:

ERROR: All positional parameters must precede keyword parameters.

 

Here is an example:

%macro test(val,color=blue,id=123);                                                                                                     
 %put &=color;                                                                                                                          
 %put &=id;                                                                                                                             
 %put &=val;                                                                                                                            
%mend test;                                                                                                                             
 
/** The positional parameter is listed first. **/                                                                 
%test(1,color=red,id=456)
 
Here are the log results:
 
318  /** The positional parameter is listed first. **/                                                                 319  %test(1,color=red,id=456)
COLOR=red
ID=456
VAL=1

 

PARMBUFF

The PARMBUFF option creates a macro variable called &SYSPBUFF that contains the entire list of parameter values, including the parentheses. This enables you to pass in a varying number of parameter values. In the following example, you can pass any number of parameter values to the macro. This following example illustrates how to parse each word in the parameter list:

%macro makes/parmbuff; 
  /** The COUNTW function counts the number of words within &SYSPBUFF.            **/                                                                                                                 
   %let cnt=%sysfunc(countw(&syspbuff)); 
  /** The %DO loop increments based on the number of words returned to the macro. **/
  /** variable &CNT.                                                              **/                                
   %do i= 1 %to &cnt;  
  /** The %SCAN function extracts each word from &SYSPBUFF.                      **/                                                                                                                  
     %let make=%scan(&syspbuff,&i);                                                                                                     
     %put &make;                                                                                                                        
   %end;                                                                                                                                
%mend makes;                                                                                                                            
 
%makes(toyota,ford,chevy)

 

Here are the log results:

19  %macro makes/parmbuff;
20    /** The COUNTW function counts the number of words within &SYSPBUFF.            **/
21     %let cnt=%sysfunc(countw(&syspbuff));
22    /** The %DO loop increments based on the number of words returned to the macro  **/
23    /** variable &CNT.                                                              **/
24     %do i= 1 %to &cnt;
25    /** The %SCAN function extracts each word from &SYSPBUFF.                       **/
26       %let make=%scan(&syspbuff,&i);
27       %put &make;
28     %end;
29  %mend makes;
30
31  %makes(toyota,ford,chevy)
toyota
ford
chevy

 

When you specify the PARMBUFF option and the macro definition includes both positional and keyword parameters, the parameters still receive values when you invoke the macro. In this scenario, the entire invocation list of values is assigned to &SYSPBUFF. Here is an example:

%macro test(b,a=300)/parmbuff;                                                                                                      
 %put &=syspbuff;                                                                                                                        
 %put _local_;                                                                                                                          
%mend;                                                                                                                                  
 
%test(200,a=100)

 

Here are the log results:

SYSPBUFF=(200,a=100)
TEST A 100
TEST B 200

 

Notice that &SYSPBUFF includes the entire parameter list (including the parentheses), but each individual parameter still receives its own value.

If you need to know all the parameter values that are passed to the macro, specify the PARMBUFF option in the macro definition to get access to &SYSPBUFF, which contains all the parameter values. For more information about PARMBUFF, see %MACRO Statement in SAS® 9.4 Macro Language: Reference, Fifth Edition.

I hope this blog post has helped you understand how to pass values to a macro. If you have SAS macro questions that you would like me to cover in future blog posts, please comment below.

Using parameters within the macro facility was published on SAS Users.

6月 072017
 

Combine and conquer with SASThe ancient political maxim “divide and conquer” (Lat. “dīvide et īmpera”) has been used for millennia in politics, sociology, and psychology, mainly to manipulate people. In the last two centuries, however, this principle has also been adopted by computer science. It is widely used in computer programming, primarily in computer algorithms such as binary search, quick sort, Fast Fourier Transform (FFT), parallel processing algorithms – to name a few.

A typical Divide and Conquer algorithm consists of the following steps:

  1. Divide – break the given problem into smaller pieces (sub-problems) of the same type;
  2. Conquer – solve these smaller pieces (sub-problems);
  3. Combine – aggregate the results of the solved sub-problems to deduce the result of the initial problem.

However, in this blog post we’ll reverse the “divide and conquer” principle and go backwards from step three to step two, to combine and then conquer.

In our day-to-day programming activities, we may not need to divide a problem in order to solve it. In many cases, we deal with small pieces of a puzzle (language elements) provided by the SAS programming language, and we solve some larger problems by combining those elements.

Let’s consider some examples.

Combining SAS Formats

Suppose you have a variable x in a SAS table on which you need to report with the following requirement. Depending on the variable value, you will need to apply different SAS formats:

Different SAS formats based on variable values

In other words, you need to display percentages that are greater than 10% as whole numbers, percentages between 1% and 10% as numbers with one decimal point, and percentages that are less than 1% as numbers with two decimal points.

You may get away with proc format’s picture statement with a proper multiplier – but good luck with that. Fortunately, there is a much better way of solving this problem. Did you know that you can use already existing formats as labels in proc format’s value statement? In other words, we can combine several SAS formats while defining another SAS format.

The above problem is easily solved by defining a new format as a combination of different PERCENTw.d formats as follows:

proc format;
   value pctmod
      -0.01 <-< 0.01 = [percent10.2] 
      -0.1 - -0.01, 0.01 - 0.1 = [percent10.1]
      low -< -0.1, 0.1 <- high = [percent10.0]
      ;
run;

Here is what this format does:

Unformatted and formatted values

You can combine any other existing formats as needed, including mixing and matching SAS-supplied formats and user-defined formats. You can even use existing formats within the PICTURE statement of the PROC FORMAT. For example, the following format will format values between 0 and 1 as cents (e.g. ¢75) while values greater than 1 as dollars (e.g. $9,340):

proc format;
   picture dolcent
	  0 <-< 1 = '99' (prefix='¢' mult=100)
	  1 - high = [dollar32.0]
	  ;
run;

Here is what this format does:

Formatted values vs. unformatted values

When using an existing format as a label, make sure to enclose the format name in square brackets, e.g. [date9.], and do not enclose the name of the existing format in single or double quotes.

Using an existing format in the format definition can be construed as format nesting. A nested level of one means that you are creating the format A with the format B as a formatted value. However, if you also create format B with the format C as a formatted value, then you have nested level of two. Avoid nesting formats with a level higher than one, as the resource requirements may increase dramatically with each additional level.

Combining SAS Libraries

Suppose you have defined several SAS data libraries, such as

libname apples 'c:\projects\garden\apples';
libname oranges 'c:\projects\garden\oranges';
libname tomatoes 'c:\projects\garden\tomatoes';

You can then logically combine (concatenate) them into a single library:

libname combolib (apples oranges tomatoes);

or

libname combolib ('c:\projects\garden\apples' 'c:\projects\garden\oranges' 'c:\projects\garden\tomatoes');

Yes, you can combine apples with oranges, and even with tomatoes, and that will enable you to access the SAS data sets in several libraries with one libref.

As you can see, you can combine two or more libraries by their librefs, physical names, or even a combination of librefs and physical names, such as:

libname combolib (apples oranges 'c:\projects\garden\tomatoes');

SAS will even allow using commas between the concatenated libraries:

libname combolib (apples, oranges, tomatoes);
libname combolib (apples, oranges, 'c:\projects\garden\tomatoes');

It is important to know that while SAS has no problem combining multiple libraries that contain same-named files, when you reference such a file, the concatenated libraries are searched in order and the first occurrence of the specified file is used.
This proves to be extremely useful when you have multiple versions of the same tables that you promote between different environments during a system development cycle. For example, you might have a SAS data table named SALES in either some or all of the development, testing, and production environments. You can define librefs for each environment as well as a libref for the combined environment:

libname devllib 'c:\projects\proj_x\devl';
libname testlib 'c:\projects\proj_x\test';
libname prodlib 'c:\projects\proj_x\prod';
 libname lastlib (prodlib testlib devllib);

Then when you reference your table as lastlib.SALES SAS will return the SALES table from production environment if it exists there; if not then from testing environment if it exists there; if not then from development environment it exists there. That logical hierarchy means that SAS will return the latest and greatest version of that SALES table. If you want that table from a specific environment, then you reference it using a specific libref: prodlib.SALES, testlib.SALES, or devllib.SALES.

Combining SAS Format Libraries

You can combine SAS format libraries to achieve similar version prioritization as in the above case of SAS data libraries using FMTSEARCH= System Option:

options fmtsearch=(prodlib testlib devllib);

This implies that your SAS formats are stored in the catalogs named FORMATS in the above specified libraries: prodlib.FORMATS, testlib.FORMATS, devllib.FORMATS.

One must remember that if not specified, WORK and LIBRARY libraries by default take precedence in the order in which SAS formats are searched, making the above options statement effectively look like:

options fmtsearch=(work library prodlib testlib devllib);

If WORK and LIBRARY libraries are specified in the fmtsearch= then the default is not applied and the search order is as specified.

Your turn

An unlimited space below is reserved for your contribution to the fascinating topic of combine and conquer with SAS. This is your chance to contribute to the list of coding techniques that you use to combine and conquer solving your SAS coding problems.

 

Combine and conquer with SAS was published on SAS Users.

6月 052017
 

Refining your data for effective reports is even easier in the 8.1 release of SAS Visual Analytics. In this blog post, I’ll take a look at the data pane, how it displays data from your active data source, and a few tasks that you might want to perform, such as viewing measure details, changing data item properties, and creating geographic data items, hierarchies, and custom categories.  In a future blog, I’ll look at creating filters, new calculated data items, and calculated aggregations.

In the SAS Visual Analytics interface, you can display the Data pane by clicking the Data icon on the left menu.

Measure details

You can view data item properties and make these changes:

  • The name of a data item.
  • The Classification of Measure or Category data item.
  • The format of a Measure or Date/Time/Datetime data item.
  • The aggregation of a Measure data item.
  • Note:  User-defined formats for Category data items are honored, but cannot be changed.  The user-defined format must be available to the CAS server where the data is loaded.

A Measure classification can be changed to Category or Geography. A Category classification can be changed to Geography.  Date and Tim classifications cannot be changed.

You can modify the aggregation for a Measure data item:

You can modify measure formats:

  • You can expand to display available Duration and Currency formats.
  • Over thirty different Currency formats are available for local or international currency.  (Example: $56,790 or USD56,789)

Geography data items identity a geographic region or location, and are typically used to visualize data on a geographic map.

  • You can create geography data items by using predefined geography classifications such as countries or states.
  • You can create a custom geographic classification by providing latitude and longitude coordinates in your data source.
  • For predefined classifications, the values of your category data items must match the lookup values documented here.

Designating a Geography data item:

Designating a custom Geography data item:

A hierarchy enables you to add drill-down functionality to a report.

  • A hierarchy can display information from general to specific.
  • You can create category, date/datetime, or geography hierarchies.
  • Not all report objects support all types of hierarchies.

  • A custom category is a category data item which associates a set of alphanumeric values with intervals, ranges, or specific values of a data item.
  • A custom category can be based on a category or a measure data item.
  • A custom category can provide functionality in a report similar to that of a user-defined format—without having to have a format previously associated with the data item.

Example:
1-20 = ‘First group’
21-30 = ‘Second group’
31-40 = ‘Third group’

Depending on the report object, some roles require only one value; other roles may require multiple values.

You can right-click on a data item in the Roles pane to remove the data item from a role.

Duplicating a data item can enable you to display data in two different ways.

Duplicating the Date data item with 939 different values can enable you to create a report that analyzes production on each day of the week: Monday, Tuesday, Wednesday,…

Duplicating a measure data item can enable you to use the column as a category data item in reports as well.

Note: You can only delete data items that you have created, such duplicated data items or calculated data items.  Deleting a data item means that it no longer appears in the data pane.

This covers many of the basic tasks that you can complete in the new data pane. In my next blog, I’ll take a look at the visual and text editors for creating filter expressions and calculated items and aggregations in the data pane.

The new Data Pane in SAS Visual Analytics - It's painless! was published on SAS Users.

6月 022017
 

There are several exciting new features in SAS Visual Analytics 8.1 that I know will excite you, including that you can now configure cascading prompts for the Report Prompt and Page Prompt areas! Prior to SAS Visual Analytics 8.1 there was a way to use parameters to configure cascading prompts for report and section prompts, but now all we have to do is define a Filter from the Actions Pane. It’s a lot less work and there are even a couple of different ways it can be done.

Method 1

Steps:
1.      Add Control Objects to Report or Page Prompt area.
2.      Assign Roles to Control Objects.
3.      Define a Filter from the Actions Pane.
4.      Test your cascading prompts and enhance controls.

Method 1, Step 1
Add the desired Control Objects to either the Report or Page Prompt areas.

Method 1, Step 2
Assign Roles to your Control Objects.

Method 1, Step 3
Create the cascading prompt by defining a Filter from the Action Pane. This means that for the selected data value in the source object, it will use that value to filter the target, i.e. return the rows where source prompt = selected value.

In this example, we are using our SASHELP Cars data set. Our source is the car’s Origin and the target is the car’s Make. If you pick Europe for Origin it will return the corresponding car Makes such as Audi, BMW, Saab, Volkswagen, etc.

Click on the source Control object, Origin, and from the Actions Pane, select Add filter.

Next, select the target object to filter and click OK.

You can check to be sure your action is defined by clicking on the Objects and looking at the Actions Pane.

Method 1, Step 4
Enhance your controls using the Options Pane and then test out your configured cascading prompts!

In my example, I made the Origin button bar required and changed the Selection background color and text color. I also added a bar chart to my report.

Method 2

Steps:
1.      Define a prompt hierarchy.
2.      Drag the prompt hierarchy to the Report or Page Prompt area.
3.      Test your cascading prompts.
4.      (Optional) Enhance controls.

Method 2, Step 1
Create a hierarchy for the data items for which you wish to create cascading prompts.

Method 2, Step 2
Drag your prompt hierarchy to either the Report or Page Prompt area.

Method 2, Step 3
Test your automatically configured cascading prompts! That’s right – this is another new feature. Well, it’s two combined, first is the Auto Controls. Second, if you drag a hierarchy data item to either the Report or Page Prompt area, SAS Visual Analytics automatically defines the Filter Actions for you.

Method 2, Step 4 (Optional)
Go back and enhance your Control Objects using the Options Pane as you like.

There you have it – cascading prompts anywhere in SAS Visual Analytics 8.1 reports: in Report Prompts, Page Prompts and of course still anywhere in the report canvas; and it’s as easy as defining Filter Actions. You may be wondering, is it possible to configure cascading prompts for different data sources? Not to worry – it is and I’ll show you in in my next blog.

Cascading Prompts as Report and Page Prompts in SAS Visual Analytics was published on SAS Users.

5月 312017
 

You probably know about the w.d and BESTw. formats. They have been around throughout my entire 38-year history of using SAS. Do you also know about the Dw.p and BESTDw.p formats? You might find that they are the best D formats around and are even better than BEST! The Dw.p. and BESTDw.p formats work well when the range of values is sufficiently large that different values need to be formatted by using different w.d formats. Specifying BESTDw.p (where p = w - 1) is often better than specifying BESTw. for columns of numbers, since the decimal does not shift when the last digit is 0.

     d10.1      d10.3      d10.6      d10.9    best10.  bestd10.9       10.5
 
2.06115E-9 2.06115E-9 2.06115E-9 2.06115E-9 2.06115E-9 2.06115E-9    0.00000
 5.6028E-9  5.6028E-9  5.6028E-9  5.6028E-9  5.6028E-9  5.6028E-9    0.00000
  1.523E-8   1.523E-8   1.523E-8   1.523E-8   1.523E-8   1.523E-8    0.00000
 0.0000008 8.31529E-7 8.31529E-7 8.31529E-7 8.31529E-7 8.31529E-7    0.00000
 0.0000167  0.0000167 0.00001670 0.00001670  0.0000167 0.00001670    0.00002
 0.0000454  0.0000454 0.00004540 0.00004540  0.0000454 0.00004540    0.00005
    0.0001  0.0001234 0.00012341 0.00012341 0.00012341 0.00012341    0.00012
    0.0498    0.04979  0.0497871 0.04978707 0.04978707 0.04978707    0.04979
    1.0000    1.00000   1.000000 1.00000000          1          1    1.00000
    2.7183    2.71828   2.718282 2.71828183 2.71828183 2.71828183    2.71828
    7.3891    7.38906   7.389056 7.38905610  7.3890561 7.38905610    7.38906
   54.5982   54.59815  54.598150 54.5981500   54.59815 54.5981500   54.59815
  403.4288  403.42879   403.4288 403.428793 403.428793 403.428793  403.42879
 8103.0839       8103  8103.0839 8103.08393 8103.08393 8103.08393 8103.08393
     22026      22026   22026.47 22026.4658 22026.4658 22026.4658 22026.4658
  24154953   24154953   24154953 24154952.8 24154952.8 24154952.8 24154952.8
  65659969   65659969   65659969 65659969.1 65659969.1 65659969.1 65659969.1

Here is a review of some of the basic formats for displaying numbers.

w.d displays a numeric value in a field w positions wide, with d positions to the right of the decimal place, and with w - d - 1 positions to the left of the decimal place. Use it when you know the range of value (such as correlation coefficients) or when you know the typical range (such as t statistics).

Example:

x = 100 / 3;
y = -9.9999;
z = -12.3333333;
put x 7.3 / y 7.4 / z 7.4;
 33.333
-9.9999
-12.333

Notice that the decimal automatically shifts for extreme values.

BESTw. displays numeric values, in a field w positions wide, using the best format for each particular value. Decimals might not align for many of the values. This can make results difficult to read.

Example:

x1 = 100 / 3;
x2 = x1 ** 2;
x3 = 1 / x1;
x4 = x1 / 10;
x5 = 12;
format x1-x5 BEST8.;
put x1 / x2 / x3 / x4 / x5;
33.33333
1111.111
0.03
3.333333
12

Dw.p displays numeric values, in a field w positions wide, possibly with a great range of values, lining up decimals for values of similar magnitude. The precision parameter is p. The Dw.p format selects a w.d (or possibly Ew.) format for each value, with the number of decimal points d chosen so that values of similar magnitude are displayed with decimal points that line up within a column of values. For a given field width w, this internal selection of w.d formats is made subject to the precision constraint specified by p. That is, every value will be displayed with at least p significant digits shown (counting digits both to the left and the right of the decimal point).

Example:

x1 = 100 / 3;
x2 = x1 ** 2;
x3 = 1 / x1;
x4 = x1 / 10;
x5 = 12;
format x1-x5 D8.;
put x1 / x2 / x3 / x4 / x5;
33.3333
1111.1
0.0300
3.3333
12.0000

BESTDw.p displays numeric values, in a field w positions wide, lining up decimals for values of similar magnitude, and displays integers without decimals. As the name implies, this format combines the BESTw. format for integers and the Dw.p format for nonintegers.

Example:

x1 = 100 / 3;
x2 = x1 ** 2;
x3 = 1 / x1;
x4 = x1 / 10;
x5 = 12;
format x1-x5 BestD8.;
put x1 / x2 / x3 / x4 / x5;
33.3333
1111.1
0.0300
3.3333
12

Example:

x1 = 100 / 3;
x2 = x1 ** 2;
x3 = 1 / x1;
x4 = x1 / 10;
x5 = 12;
format x1-x5 BestD8.7;
put x1 / x2 / x3 / x4 / x5;
33.33333
1111.111
0.030000
3.333333
12

Specifying BESTDw.p (where p = w - 1) is better than specifying BESTw. for columns, since decimals do not shift when the last digit is 0.

Example:

proc iml;
   x     = {-2 -1.2 -.52 0 .5 1.3 1.5 2.4 3 3.5 4 4.5 5 6 7 8};
   knots = {-7 -4 -2 0 5 8 12 17};
   b     = bspline(x,4,knots);
   print b[format=bestd7.6 label='BestD7.6'] '  '
         b[format=best7. label='Best7.'];
quit;
 
BestD7.6                                   Best7.
 
0.16534 0.59550 0.23099 0.00816       0    0.16534  0.5955 0.23099 0.00816       0
0.16534 0.59550 0.23099 0.00816       0    0.16534  0.5955 0.23099 0.00816       0
0.16534 0.59550 0.23099 0.00816       0    0.16534  0.5955 0.23099 0.00816       0
0.16534 0.59550 0.23099 0.00816       0    0.16534  0.5955 0.23099 0.00816       0
0.10848 0.55622 0.31543 0.01986 7.66E-6    0.10848 0.55622 0.31543 0.01986 7.66E-6
0.04958 0.45179 0.44110 0.05718 0.00035    0.04958 0.45179  0.4411 0.05718 0.00035
0.03970 0.42132 0.46769 0.07067 0.00062     0.0397 0.42132 0.46769 0.07067 0.00062
0.01209 0.28103 0.55021 0.15260 0.00407    0.01209 0.28103 0.55021  0.1526 0.00407
0.00423 0.19585 0.56541 0.22458 0.00993    0.00423 0.19585 0.56541 0.22458 0.00993
0.00134 0.13569 0.55242 0.29216 0.01839    0.00134 0.13569 0.55242 0.29216 0.01839
0.00026 0.08757 0.51761 0.36319 0.03137    0.00026 0.08757 0.51761 0.36319 0.03137
0.00002 0.05202 0.46436 0.43334 0.05025    0.00002 0.05202 0.46436 0.43334 0.05025
      0 0.02813 0.39792 0.49737 0.07659          0 0.02813 0.39792 0.49737 0.07659
      0 0.02813 0.39792 0.49737 0.07659          0 0.02813 0.39792 0.49737 0.07659
      0 0.02813 0.39792 0.49737 0.07659          0 0.02813 0.39792 0.49737 0.07659
      0 0.02813 0.39792 0.49737 0.07659          0 0.02813 0.39792 0.49737 0.07659

Dw.p Format Details. Dw.p does not imply w.d. The p specification is a precision specification. It is not a number-of-decimal-places (d) specification. First, note that the effective values for p range from 1 to 9. Dw. (no p), Dw.0, and Dw.3 all mean the same thing: p=3. Here is a column of numbers formatted 6 different ways, illustrating various choices of p:

     d10.1      d10.2      d10.3      d10.5      d10.7      d10.9
 
2.06115E-9 2.06115E-9 2.06115E-9 2.06115E-9 2.06115E-9 2.06115E-9
 0.0000061  0.0000061 6.14421E-6 6.14421E-6 6.14421E-6 6.14421E-6
 0.0000167  0.0000167  0.0000167 0.00001670 0.00001670 0.00001670
    7.3891     7.3891    7.38906   7.389056  7.3890561 7.38905610
   20.0855    20.0855   20.08554  20.085537  20.085537 20.0855369
   54.5982    54.5982   54.59815  54.598150  54.598150 54.5981500
  148.4132   148.4132  148.41316    148.413  148.41316 148.413159
  403.4288   403.4288  403.42879    403.429  403.42879 403.428793
 1096.6332  1096.6332       1097   1096.633  1096.6332 1096.63316
 8103.0839  8103.0839       8103   8103.084  8103.0839 8103.08393
     22026      22026      22026  22026.466  22026.466 22026.4658
     59874      59874      59874  59874.142  59874.142 59874.1417
    162755     162755     162755     162755  162754.79 162754.791
    442413     442413     442413     442413  442413.39 442413.392
   1202604    1202604    1202604    1202604  1202604.3 1202604.28
   8886111    8886111    8886111    8886111  8886110.5 8886110.52
  24154953   24154953   24154953   24154953   24154953 24154952.8
  65659969   65659969   65659969   65659969   65659969 65659969.1
 178482301  178482301  178482301  178482301  178482301  178482301
 485165195  485165195  485165195  485165195  485165195  485165195

Some of the results are as follows:

  • Display at least 1 significant digit. When p = 1 (the minimum) , there are relatively few groups of values and decimals that align within those groups. Hence decimals appear in fewer of the w positions, and as a result, some values might be displayed with relatively less precision. In this case, there is one group of values that is displayed with the Ew. format and three additional groups (decimals in positions 3, 6, and not displayed).
  • Display at least 2 significant digits. When p = 2, there tend to be more ranges of values where decimals align. Hence, decimals might appear in more of the w positions, and values tend to be displayed with more precision. In this case, there is one group of values that is displayed with the Ew. format and three additional groups (decimals in positions 3, 6, and not displayed).
  • Display at least 3 significant digits. When p = 3 (the default), there will tend to be still more ranges of values where decimals align. Hence, decimals might appear in still more of the w positions, and values tend to be displayed with still more precision. In this case, there is one group of values that is displayed with the Ew. format and three additional groups (decimals in positions 3, 5, and not displayed). Notice that some values are displayed with more precision than with smaller values of p.
  • Display at least 9 significant digits. When p = 9 (the maximum), there will tend to be still more ranges of values where decimals align. Hence, decimals can appear in all but the first and last of the w positions, and values will be displayed with maximum precision. Decimals will not align for many values. In this case, there is one group of values that is displayed with the Ew. format and eight additional groups (decimals in positions 2, 8 and not displayed). Notice that most values are displayed with more precision than with smaller values of p.

As p increases, more different w.d formats are likely to get chosen, values tend to be displayed with more precision, and decimals are less likely to align. The choice of p and w are related. If you want to ensure more worst-case precision, then a wider field width is appropriate. Also, one advantage of Dw.d is that it allows more moderate width columns. If all you have is w.d, and you want to ensure reasonable precision for values of different scales, they you're pretty much forced to use very wide columns with a large number of decimal places and hope for the best.

Submit the following steps to see more values that are formatted using a variety of D and other formats. The table of formatted numbers displayed above is a subset of the results displayed by these steps.

data x(drop=i j l f);
   array d[13] d0-d12;
   length l f $ 200;
   do j = 0 to 9;
      l = catx(' ', l, cats('d', j), '=', cats('"d10.',j,'"'));
      f = catx(' ', f, cats('d', j), cats('d10.',j));
   end;
   l = catx(' ', 'label',  l, 'd10 = "best10." d11 = "bestd10.9" d12 = "10.5";');
   f = catx(' ', 'format', f, 'd10 best10. d11 bestd10.9 d12 10.5;');
   call symputx('lf', catx(' ', l, f));
   do i = -20 to 20;
      d0 = exp(i);
      do j = 2 to 13; d[j] = d0; end;
      output;
   end;
run;
 
options ls=142;
proc print noobs label; &amp;lf run;

Letting ODS pick the format. In an ODS table template, you can use the CHOOSE_FORMAT= option to choose a format. Use it with the FORMAT_WIDTH=w option to choose a good format for the specified width (w).

CHOOSE_FORMAT=MAX_ABS
picks a good format for a single number based on the maximum absolute value. This is the simplest case.

CHOOSE_FORMAT=MAX
picks a good format for a single number based on the maximum. Values are all expected to be nonnegative so no space is reserved for a minus sign.

CHOOSE_FORMAT=MIN_MAX
picks a good format for a single number based on the minimum and the maximum. Values can be mixed, positive and negative. This option provides room for a minus sign only where it is actually needed. So if all values happen to be positive, no room is reserved for a minus sign. If values range from say -5 to 10, no extra column is needed for the minus sign. If you also specify FORMAT_NDEC=ndec in PROC TEMPLATE, ndec provides a ceiling on the maximum number of decimal places.

CHOOSE_FORMAT=COMPROMISE
picks a good w.d format for a column of numbers if they are not too heterogeneous.

History. The w.d and BESTw. formats have been around for decades. Many SAS formats were written by Rick Langston, although the original mainframe w.d and BESTw. formats precede Rick. The functionality that underlies the Dw.p format has also been around for decades, but it has not always been available as a format. Mark Little wrote it, and for many years it was only available for procedures to use internally. When SAS started making output using ODS, then that functionality was surfaced as the Dw.p format. The BESTDw.p format came much later.

Conclusions. When you explicitly specify a format, you typically specify w.d when the values have a known range (such as correlation coefficients) or Dw.p when values are more diverse. You might instead use BESTDw.p when you expect a mix of integers and nonintegers. BESTw. is not the best format for columns; it is best when used for scalars. Ew. is useful for large or small numbers or when extra precision is desired. Dw.p. and BESTDw.p work well when the range of values is large. Experiment with the precision parameter p to get the best display of your results. In particular, setting p to w - 1 and specifying BESTDw.p provides a good alternative to the BESTw. format.

The BEST D formats that you have ever seen! was published on SAS Users.

5月 262017
 

Elizabeth is courageous. Scoliosis since birth, corrective spinal surgery replaced her spine with steel, tripping on stairs permanently broke her right ankle. Then she decided to come take yoga with me. To help ease back pain & reduce hip stress, I offered options like bent legs not cross. In class [...]

The post Take that SAS option appeared first on SAS Learning Post.

5月 192017
 

Technical Support regularly receives incoming calls from customers who have encountered the following transcoding warning:

WARNING: Some character data was lost during transcoding in the data set xxx.xxx. Either the data contains characters that are not representable in the new encoding or truncation occurred during transcoding

People are not always exactly sure what this warning means nor what to do about it. No worries! This blog provides background information about why this warning occurs and offers resources that can help you understand and resolve any encoding or truncation issues that are the source of the warning above.

In most Western character sets, each character occupies only one byte of computer memory or storage. These are known as single-byte character sets (SBCS). Character sets for languages such as Japanese, Korean, and Simplified and Traditional Chinese require mixed-width encoding. The character sets for these languages are double-byte character sets (DBCS). (The term DBCS can be misleading because not all of the characters in a double-byte character set are two bytes. Thus, the term multibyte character set [MBCS] is sometimes used instead of DBCS.)

Unicode is a multibyte character set that was created to support all languages. It includes all characters from most modern written languages and historic scripts, even hieroglyphs and cuneiform. UTF-8 is one of the more common encoding forms of the Unicode standard, and this encoding is recommended as the SAS session encoding for multilingual environments if your data sources contain characters from many regions. Each character can be up to four bytes in UTF-8. UTF-8 is the SAS session encoding for SAS® Viya™. Clients such as SAS University Edition, SAS Studio, and SAS Visual Analytics typically execute statements in a server environment that run the UTF-8 encoding.

Now, let's look at how SAS handles data from different encodings along with two causes of the warning shown earlier.

Starting in SAS®9, SAS data sets store an encoding indicator in the descriptor portion. If the encoding value of the file differs from the encoding of the currently executing SAS session, the Cross Environment Data Access (CEDA) engine is invoked when SAS reads the data set. CEDA transcodes the data, by default. Transcoding is the process of converting the contents of SAS files from one encoding to another, and this process is necessary in order to read data from around the world.

When transcoding occurs, you receive an informational note about the use of CEDA. In addition, you might see the warning (shown above) in your SAS log.

The transcoding warning is issued when you have values for character variables that have either of the following characteristics:

  • The number of bytes for a character in one encoding do not match the number of bytes that are used for the same character in another encoding (as shown in the image below). If the length of the column is not wide enough to accommodate the additional bytes, truncation of the character data occurs.

  • The characters exist in one encoding, but they do not exist in another encoding. As a result, the transcoding is not successful.

If you use a procedure (for example, the PRINT or REPORT procedures), the procedure runs to completion. The variable is truncated if the additional bytes cause the value to exceed the length of the column. Alternatively, if a character from the data set does not also exist in the target encoding or if it occupies a different code point (after transcoding), the column might be blank or another character might be substituted for the original character. The procedure generates the transcoding warning.

If you have an output data set open, SAS transcodes the character data and stops writing observations to the file at the point it encounters the problematic characters. This behavior can result in either fewer observations than expected or a data set with zero observations. When this happens, SAS generates an error message instead of a warning. This error message contains the same content as the warning message.

ERROR: Some character data was lost during transcoding in the data set xxx.xxx. Either the data contains characters that are not representable in the new encoding or truncation occurred during transcoding

New Video Available to Help You Resolve the Transcoding Error or Warning

Before you attempt to resolve the transcoding error or warning, it is helpful to understand a bit about encodings and how to handle multilingual SAS data. To help you in this endeavor, SAS Technical Support worked with Education and Training to create a new, short (12 minutes) video to help you quickly troubleshoot and resolve the most common reasons for the transcoding error or warning.

Click the image below to access the video. (Note: You will be prompted to create a SAS profile if you do not have one already.)

Click image to play video

 

If you want to practice the techniques in this video, you can download the data sets that are used in the video here.

This video is based on solutions that have helped customers who called SAS Technical Support for assistance. The video offers these solutions in a visual format that is easy to follow. In addition to showing you how to launch SAS in different encodings to resolve the error, the video also introduces the Character Variable Padding (CVP) engine (Read-only) for processing SAS data files that contain multilingual data.

This video is one of a number of resources available to help you. The next section lists several tips that might help prevent or resolve transcoding warnings or errors.

Helpful Tips from the Field

The following tips have been tested and used in the field, and you might find them helpful when you work with multilingual data. The first two tips are usage tips; the others are programming tips.

  • Invoke SAS in more than one encoding. For this first tip, it is important to know that if you try to change the value of the ENCODING system option during a SAS session, SAS ignores the option. When this situation occurs, the following warning is generated:

WARNING 30-12: SAS option ENCODING is valid only at startup of the SAS System. The SAS option is ignored.

If you work with data that is stored in more than one encoding, you can invoke SAS in more than one session encoding and manage the sessions easily by setting the title bar to display the session encoding in the main SAS window.

Note: This solution only works in the Microsoft Windows operating environment. Display of UTF-8 encoded data in the SAS windowing environment is not fully supported.  SAS Enterprise Guide and SAS Studio have full support for display of UTF-8 characters.

To customize the title bar of the main SAS window.

1.  In a Windows environment, select Start All Programs SAS Additional Languages.

2.  Under Additional Languages, select the shortcut for the session encoding that you want. For example, the following image shows that SAS 9.4 (Unicode Support) is selected

3.  Right-click the shortcut and select Properties. This opens the properties dialog box for that shortcut.

4. On the Shortcut tab, place your cursor at the end of the current string in the Target text box. If it is not already listed in the box, enter the location of the configuration file that you want to use by using the –CONFIG option followed by the –AWSTITLE option.

Then include in quotation marks the text that you want to see in the main SAS window, as shown in the following example:

The entire line should look similar to the following, only it will appear as one long string:

"C:\Program Files\SASHome\SASFoundation\9.4\sas.exe" -CONFIG "C:\Program Files\SASHome\SASFoundation\9.4\nls\
u8\sasv9.cfg" -awstitle "SAS UTF8"

The following display shows the –AWSTITLE option added to the Target field of the SAS 9.4 (Unicode Support) Properties dialog box:


5
. Click OK.

Now, when you open SAS, you can immediately recognize the session encoding based on the value in the title bar.

You can pin this shortcut to your task bar by dragging it from the Start menu to the task bar. Then you should see your custom title when you hover your cursor over the shortcut.

  • Prevent an unexpected transcoding warning or error: If you read or convert a SAS data set from LATIN1 or WLATIN1 encoding to UTF-8 that contains Microsoft Word smart quotes or dashes in character variables, it can cause truncation errors. Those are not ASCII characters, and they require more than one byte when converted to UTF-8. For example, performing the following steps causes an unexpected transcoding error or warning.
    1. You use SAS in the WLATIN1 encoding.
    2. You copy a quoted string from Microsoft Word and paste that string into the DATALINES section of a SAS program in the SAS windowing environment.
    3. You create a new variable that uses the quoted string in its value.
    4. You execute the SAS program to create a permanent SAS data set.
    5. You invoke a UTF-8 SAS session to read the WLATIN1 data set in a procedure or DATA step.

You can prevent this issue by clearing the "Straight quotes" with "smart quotes" option in Word, as shown below. This option is available by selecting File Options Proofing AutoCorrect Option. On the AutoFormat tab, the option is listed under the Replace category.

  • Set the CVP engine explicitly (for UTF-8 sessions). The transcoding warning is common if you use a UTF-8 session encoding and the input data source is in the WLATIN1 encoding. The first 128 code points of the UTF-8 code page are identical to the 7-bit ASCII encoding. If your character data only contains characters within those 128 code points, SAS does not generate a warning or error when it transcodes the data. For code points above 128, the character variables must be expanded to two or more bytes. Many characters that are used in Western European languages fall into this category. You can address this issue and avoid the warning or error by adding the CVP engine to the LIBNAME statement for your input data source.

The following example LIBNAME statement illustrates how to specify the CVP engine explicitly:

libname mySAS cvp 'SAS-library-path';
data utf8dat;
set mySAS.wlatindat;
run;

In this example, the code is submitted in a UTF-8 SAS session. The CVP engine is Read-only, and it is used to read the input data set from the WLATIN1 encoding.

  • Set the CVP engine implicitly. The CVP engine provides an easy way to convert your files and avoid truncation problems by using a default value of 1.5 times the original length of the character variable. However, that value might not be sufficient for the length of the characters in your data. The expansion length should be based on the characters that are contained in your data.

For example, if your data consists of double-byte character (DBCS) data (Chinese, Japanese, and so on), most characters are three bytes and emoji characters are four bytes. Therefore, the default 1.5 value for the CVP engine might not be enough for the number of additional bytes. A value of 2.5 is a reasonable expansion for DBCS characters, but you might need more.

In such cases, you might want to try setting the CVP engine implicitly. When you set the CVP engine implicitly, you can use either the CVPBYTES= or CVPMULTIPLIER= options. The following example uses the CVPBYTES= option to accommodate the four bytes that are required for the winking emoticon in UTF-8:

libname enc 'c:\public\encdata' cvpbytes=4;

 

  • Convert text in a character variable to another encoding. Sometimes, the CVP engine does not correct the transcoding error or warning. In these cases, Technical Support usually finds that the destination encoding does not support all of the characters found in the data. For example, if you have a UTF-8 data set that contains French, Greek, Hindi, and other characters and you are running a WLATIN1 session, SAS will successfully transcode the French and other Western European characters to WLATIN1. However, the Greek and Hindi characters cannot be transcoded because there is no representation for them in the WLATIN1 encoding. As a result, an error is generated, and the data is lost.

If you have this kind of multilingual data, you can use the KPROPDATA() function to transcode a character string from and to any encoding and convert characters that are not represented in the target encoding to a character of your choice (a question mark or a space). Note that you must specify BINARY or ANY as the input encoding to prevent CEDA from transcoding your data. (See the next tip for information about suppressing transcoding.) A macro is available in the technical paper Multilingual Computing with SAS 9.4 (on page 7) that you can run to accomplish these tasks.

  • Suppress transcoding. There are certain cases in which you might want to bypass transcoding errors. For example, suppose that you know that your data contains only ASCII characters. You are providing data sets to users and you do not know whether the users will run SAS in WLATIN1 or UTF-8 session encodings. Because the characters are ASCII, SAS does not need to transcode them. You do not want CEDA notes nor the extra overhead that CEDA requires. In such a case, you can suppress transcoding without risking data loss. In the LIBNAME statement for the output data set, specify either the OUTENCODING=ASCIIANY option or the ENCODING=ASCIIANY data set option.

Example 1:

libname final 'SAS-library-path' outencoding=asciiany; 
proc copy in=sashelp out=final noclone;
   select dsname;
run;

Example 2:

libname final 'SAS-library-path'
data final.class(encoding=asciiany);
   set dsname;
run;

The value of ASCIIANY means no transcoding occurs on ASCII machines. When the data source is open on an EBCDIC machine, SAS assumes that the data is ASCII and transcodes it into the EBCDIC session encoding. Note: The CONTENTS procedure shows the encoding value: us-ascii ASCII (ANSI).

The encoding options shown above have two other values:

  • ANY: With this value, SAS does not transcode at all, which is, effectively, binary mode.
  • EBCDICANY: With this value, transcoding only occurs on ASCII machines. SAS assumes the data is EBCDIC and transcodes the character data into the ASCII session encoding.

Additional Self-Help Resources

This section lists several documents that provide more detail about encoding concepts and suggests ways to handle transcoding problems.

  • Encoding: Helping SAS speak your language. This blog by Bari Lawhorn briefly explains what an encoding is, how to determine the default encoding, what to do if your encoding differs from that of other people with whom you share SAS data, and where to find more in-depth information about encodings.
  • Multilingual Computing with SAS® 9.4: This paper, which describes new and enhanced features for handling multiple languages in SAS 9.4, covers topics such as installing and configuring the SAS System, changing the locale of a deployment, working with multibyte data, multilingual support in SAS® Visual Analytics, and more
  • SAS® and UTF-8: Ultimately the Finest. Your Data and Applications Will Thank You!: This SAS Global Forum 2017 paper by Elizabeth Bales and Wei Zheng is a perfect go-to resource when you are migrating data files and formats from another encoding to UTF-8.
  • SAS® Encoding: Understanding the Details: This SAS Press book, by Manfred Kiefer, explains the basics about character encoding that are required for creating, manipulating, and rendering any type of character. This book also provide examples for troubleshooting a variety of encoding problems.
  • SAS® 9.4 National Language Support (NLS): Reference Guide, Fifth Edition. This user guide provides detailed information about encodings, transcoding, and other NLS topics (including dictionaries of NLS autocall macros, data set options, system options, formats, informats, functions, and procedures)
  • SAS Note 15597, "How to convert SAS data set encoding." This note explains how you can convert the encoding for a SAS data set.
  • SAS Note 52716, "The error "Some character data was lost during transcoding in the data set" occurs when the data set encoding does not match the SAS® session encoding."
  • SAS Sample 55054, "PROC SQL can be used to identify data representation and encoding for all data sets in a library:" This sample enables you to query a library of data sets that are in mixed encodings so you can plan to use them in compatible session encodings or you can convert the data sets to another encoding. Using the SQL procedure in this sample, you can create output similar to the following that shows you the data representation and encoding for all data sets in your library.

 

 

Demystifying and resolving common transcoding problems was published on SAS Users.

4月 282017
 

ETL automationWhen developing SAS applications, you can feed database tables into your application by using the libname access engine either by directly referring a database table, or via SAS or database views that themselves refer to one or more of the database tables.

However, such on-the-fly data access may not be efficient enough, especially for interactive SAS applications when system response time is critical. In case of distributed systems where a database might reside on one server, and the SAS Application server – on a different physical server, user wait time caused by delays in data access across networks and databases might become intolerable. In such cases, it makes perfect sense to extract database tables and load them into SAS tables in advance, preferably placing them on (or close to) the SAS Application server. That will ensure higher operational efficiency (responsiveness) of the interactive SAS application.

SAS Data Integration Studio provides a powerful visual design tool for building, implementing, and managing data integration processes across different data sources, applications, and platforms.

However, in case of multiple tables, loading them one by one using even such a powerful tool might become a bit tedious. In an automation paradigm, being “visual” is not necessarily a good thing; when we automate it implies that we want to get things done without visualizing or even seeing them.

Here is a SAS coding approach, which I recently used for a customer project, to automate the repetitive process.

SAS code example of loading multiple Oracle tables into SAS tables

There were several Oracle tables that needed to be extracted and loaded into SAS data tables on a different physical server. The Oracle tables contained multi-year data, but we were only interested in the latest year’s data. That is those tables that have datetime stamp EVENT_DT column needed to be subset to the latest rolling 365 days.

Step 1. Creating a driver table

In order to make this process data-driven, let’s first create a driver table containing a list of the table names that needed to be extracted and loaded:

/* --------------------------------------------------- */
/* create table list to extract &amp; load,                */
/* dt_flag indicates whether EVENT_DT variable present */
/* --------------------------------------------------- */
 
libname parmdl '/sas/data/parmdata';
 
data parmdl.tablelist;
        length tname $8 dt_flag $1;
        input tname dt_flag;
        datalines;
ADDRESS N
ACCOUNT Y
BENEFIT Y
EXCLUSN N
FINANCE Y
HOUSING Y
;

This program runs just once to create the driver table parmdl.tablelist.

Step 2. Loading multiple tables

The following data-driven program runs every time when you need to reload Oracle tables into SAS:

/* source ORACLE library */
libname oralib oracle path="xxx" schema="yyy" user="uuu"
 	PASSWORD="{SAS002}D2DF612A161F7F874C4EF97F" access=readonly;
 
/* target SAS library */
libname sasdl '/sas/data/appdata';
 
/* driver table SAS library */
libname parmdl '/sas/data/parmdata';
 
options symbolgen mprint mlogic;
%let cutoff_date = %eval(%sysfunc(today()) - 365);
 
/* --------------------------------------------------- */
/* loop through table list to extract & load           */
/* --------------------------------------------------- */
%macro ETL;
   %let dsid = %sysfunc(open(parmdl.tablelist));
   %syscall set(dsid);
   %do %while(%sysfunc(fetch(&dsid)) eq 0);
      data sasdl.&tname;
         set oralib.&tname;
         %if &dt_flag eq Y %then %str(where datepart(EVENT_DT) ge &cutoff_date;);
      run;
   %end;
   %let rc = %sysfunc(close(&dsid));
%mend ETL;
%ETL

In this code, we use a similar coding technique to that described in my earlier blog post Modifying variable attributes in all datasets of a SAS library. We loop through the initially created table parmdl.tablelist. in a macro %do-loop, and for each observation containing a single table name, implement a data step extracting one table at a time.

If macro variable &dt_flag=Y, then we generate and insert a where statement sub-setting the extracted table to the latest year’s data.

Note a very useful feature here %syscall set(dsid) that links SAS data set variables to macro variables that have the same name and data type. Since we have not defined those macro variables, in essence, it creates a namesake macro variable for each SAS data set variable.

Implementing data-driven load in SAS Data Integration Studio

The above SAS code is run on the SAS Application server – the same server where the SAS target library resides. To fully automate the ETL process, this code may be embedded into a SAS job in SAS Data Integration Studio and then scheduled via either Platform Suite for SAS or your operating system scheduler. Alternately, it can be run in batch mode or via script that that in turn can be scheduled by the operating system scheduler.

In SAS DI Studio:

1.  Create a New job and give it a meaningful name (e.g. Loading Oracle Tables into SAS Tables.)

2.  Drag & drop User Written Code transformation onto the SAS Job canvas and give it a meaningful name (e.g. ETL macro):

SAS Data Integration Studio Job

3.  Go to the Properties of this transformation and click on the Code tab, then paste your SAS code in the text entry area:

SAS DI job properties

4.  Click OK and Save your SAS Job.

Note, that we do not add Source and Target tables to the Job canvas, only User Written Code transformation. Also note, that the %ETL macro has no direct references to the database table names being extracted and loaded into SAS datasets. Instead, it receives the names of the tables to be loaded from the PARMDL.TABLELIST dataset. This makes it extremely flexible to the task of processing a completely different set of tables; all you need to do is update the table list in the driver table. No change to your SAS DI job is needed.

Now it's your turn

What is your experience with automating ETL processes? Please feel free to share it here in the Comments section.

 

Automating the loading of multiple database tables into SAS tables was published on SAS Users.

4月 212017
 

send an email that embeds a graphWhen using the SAS® system to email graphics output, a common request is to use SAS to send an email in which the graphics output is embedded in the body of the email. This functionality is not available until the second maintenance release for SAS® 9.4 (TS1M2). If you are using a version of SAS earlier than SAS 9.4 TS1M2, your best option is to create graphics output in a format such as RTF or PDF, and then attach the RTF or PDF file to your email.

Using the INLINED Option to Embed Graphics

If you are running SAS 9.4 TS1M2 or later, you can embed graphics output in an email. To do this, use the INLINED suboption with the ATTACH option in a SAS FILENAME statement that uses the EMAIL engine. Here is an example:

filename sendmail email to=("first.last@company.com")          from=("first.last@company.com")
    attach=("c:\temp\email.png" inlined='sgplot')
    type='text/html' subject="Emailing graphics output";

 

Then, later in your code, reference the value specified for the INLINED option in DATA step code that creates custom HTML output, as shown below:

                data _null_;  
        file sendmail;  
  put '<html>';
  put '<body>';
  put '<img src=cid:sgplot>';
  put '</body>';
  put '</html>';
run;

With this technique, although the graph is sent as an attachment, the attachment is hidden. When the email recipient opens the email, the attached graph is automatically displayed in the email (so that it looks like the graph is embedded in the body of the email).

Note: When using SAS to email graphics output, you must first set the EMAILSYS system option to SMTP and the EMAILHOST system option to the name of the SMTP email server at your site.

Embedding Multiple Graphs

You can also send multiple graphs in a single email using a SAS FILENAME statement as shown here:

            filename sendmail email to=("first.last@company.com") from=("first.last@company.com")
    attach=("c:\temp\email1.png" inlined='sgplot1'  "c:\temp\email2.png" inlined='sgplot2')
    type='text/html' subject="Emailing graphics output";

Then, create custom HTML output using DATA step code similar to the following:

     Data _null_;  
       file sendmail;  
 put '<html>';
 put '<body>';
 put '<img src=cid:sgplot1>';
 put '<img src=cid:sgplot2>';
 put '</body>';
 put '</html>';
    run;

Embedding a Graph and a PROC PRINT Table

This example shows how to embed a graph and PRINT procedure table in one email. Let us assume that you have a graph named sgplot.png stored in C:\Temp. You want to send an email using SAS that displays the SGPLOT graph in the body of the email directly before a table created with PROC PRINT. The following sample code demonstrates how to do this using a TITLE statement with PROC PRINT:

filename sendmail email  to=("first.last@company.com") from=("first.last@company.com")
     attach=("c:\temp\sgplot.png" inlined='sgplot') 
     type='text/html' subject="Email test of GRAPH output";
      ods _all_ close; 
ods html file=sendmail; 
title1 '<img src=cid:sgplot>';
proc print data=sashelp.class; 
run;
ods html close; 
ods listing; 
filename sendmail clear;

Embedding a Graph (Complete Program)

Here is a complete sample program that demonstrates embedding graphics in an email using graphics output created with the SGPLOT procedure:

%let workdir=%trim(%sysfunc(pathname(work)));
ods _ALL_ close; 
ods listing gpath="&workdir";
ods graphics / reset=index outputfmt=PNG imagename='email';  
title1 'Graph output emailed using SAS';
proc sgplot data=sashelp.cars; 
  bubble x=horsepower y=mpg_city size=cylinders;
run;
filename sendmail email to=("first.last@company.com") from=("first.last@company.com")
     attach=("&workdir./email.png" inlined='sgplot')
     type='text/html' subject="Emailing graphics output";
      data _null_;
 file sendmail;  
 put '<html>';
 put '<body>';
 put '<img src=cid:sgplot>';
 put '</body>';
 put '</html>';
run; 
      filename sendmail clear;

In conclusion, if you are running SAS 9.4 TS1M2 or later, using the INLINED option in a FILENAME statement is an excellent option when emailing graphics output.  Note that you can use this technique to email any graphics file in PNG, GIF, or JPEG format created with the SAS SG procedures, ODS Graphics, or SAS/GRAPH® procedures (such as GPLOT and GCHART).  You can also use this technique to email graphics files created with software other than SAS.

Use SAS to send an email that embeds a graph in the body of the email was published on SAS Users.