11月 142017
 

SAS Visual Analytics 7.4 has added the support for date parameters. Recall from my first post,  Using parameters in SAS Visual Analytics, a parameter is a variable whose value can be changed at any time by the report viewer and referenced by other report objects. These objects can be a calculated item, aggregated measure, filter, rank or display rule. And remember, every time the parameter is changed, the corresponding objects are updated to reflect that change.

Here is my updated table that lists the supported control objects and parameter types for SAS Visual Analytics 7.4. The type of parameter is required to match the type of data that is assigned to the control.
Notice that SAS Visual Analytics 7.4 has also introduced the support for multiple value selection control objects. I’ll address these in another blog.

Using Date Parameters in your SAS Visual Analytics Reports

Let’s look at an example of a SAS Visual Analytics Report using date parameters. In this fictitious report, we have been given the requirements that the user wants to pick two independent date periods for comparison. This is not the same requirement as filtering the report between a start and end date. This report requirement is such that a report user can pick two independent months in the source data to be able to analyze the change in Expense magnitude for different aggregation levels, such as Region, Product Line and Product.

In this example, we will compare two different Month,Year periods. This could easily be two different Quarter,Year or Week,Year periods; depending on the report requirements, these same steps can be applied.

In this high level breakdown, you can see in red I will create two date parameters from data driven drop-down lists. From these parameter values, I will create two calculated data items, shown in purple, and one aggregated measure that will be used in three different report objects, shown in green.

Here are the steps:

1.     Create the date parameters.

2.     Add the control objects to the report and assign roles.

3.     Create the dependent data items, i.e. the calculated data items and aggregated measure.

4.     Add the remaining report objects to the canvas and assign roles.

Step 1: Create the date parameters

First we will need to create the date parameters that will hold the values made by the report viewers. From the Data Pane, use the drop-down menu and select New Parameter….

Then create your first parameter as shown below. Give it a name.

Next, select minimum and maximum values allowed for this parameter. I used the min and max available in my data source, but you could select a more narrow range if you wanted to restrict the users to only have access to portions of the data, just so long as the values are in your data source since, in this example, we will use the data source to populate the available values in the drop-down list.

Then select a current value, this will serve as the default value that will populate when a user first opens the report.

Finally, select the format in which you want your data item to be formatted. I selected the same format as my underlying data item I will be using to populate the drop-down list.

Notice how your new parameters will now be available from your Data Pane.

Step 2: Add the control objects to the report and assign roles

Next, drag and drop the drop-down list control objects onto the report canvas. In this example, we are not using the Report or Section Prompt areas since I do not want to filter the objects in the report or section automatically. Instead, I am using these prompt values and storing them in a parameter. I will then use those values to create new calculated data items and an aggregated measure.

Once your control objects are in the report canvas, then use the Roles Pane to assign the data items to the roles. As you can see from the screenshot, we are using the Date by Month data item to seed the values of the drop-down list by assigning it to the Category role, this data item is in our data source.

Then we are going to assign our newly created parameters, Period1Parameter and Period2Parameter to the Parameter role. This will allow us to use the value selected in our calculations.

Step 3: Create the dependent data items, i.e. the calculated data items and aggregated measure

Now we are free to use our parameters as we like. In this example, I am prompting the report viewer for two values: Period 1 and Period 2 which are the two periods the user would like compared in this report. So, we will need to create two calculated data times from a single column in our source data. Since we want to display these as columns next to each other in a crosstab object and use them for an aggregated measure, this technique can be used.

Calculated Data Item: Period 1 Expenses

From the Data Pane, use the drop-down menu and select New Calculated Item…. Then use the editor to create this expression: If the Date by Month for this data row equals the parameter value selected for Period 1, then return the Expenses; else return 0.

Calculated Data Item: Period 2 Expenses

Repeat this using the Period2Parameter in the expression.

Aggregated Measure: Period Difference

Next, we want to calculate the difference between the two user selected Period Expenses. To do this, we will need to create an aggregated measure which will evaluate based on the report object’s role assignments. In other words, it will be calculated “on-the-fly” based on the visualization.

Similar to the calculated data items, use the Data Pane and from the drop-down menu select New Aggregated Measure…. Use the editor to create this expression. Notice that we are using our newly created calculated data items we defined using the parameter values. This expression does not use the parameter value directly, but indirectly through the calculated data item.

Step 4: Add the remaining report objects to the canvas and assign roles

No that we have:

  • our Control Objects to capture the user input.,
  • the Parameters to store the values.,
  • and the Calculated Data Items and Aggregated Measure created…

we can add our report objects to the canvas and assign our roles.

You can see I used all three new measures in the crosstab object. I used the aggregated measure in the bar chart and treemap but notice the different aggregation levels. There is even a hierarchy assigned to the treemap category role. This Period Difference aggregated measure calculation is done dynamically and will evaluate for each visualization with its unique role assignments, even while navigating up and down the hierarchy.

Here are some additional screenshots of different period selections.

In this first screenshot you can see the parallel period comparison between December 2010 and 2011.

In these next two screenshots, we are looking at the Thanksgiving Black Friday month of November. We are comparing the two years 2010 and 2011 again. Here we see that the Board Product from the Game Product Line is bright blue indicating an increase in magnitude of Expenses in the most recent period, Nov2011.

By double clicking on Board in the treemap, we are taken to the next level of the hierarchy, Product Description, where we see a the largest magnitude of Expenses is coming from Backgammon and Bob Board Games.

In these final two screenshots we are comparing consecutive periods, November 2011 with December 2011. We can see from the bar chart easily the Region and Product Line where there is the greatest increase in Expenses.

I’ve configured a brush interaction between all three visualizations so that when I select the tallest bar it will highlight the corresponding data values in the crosstab and treemap.

Conclusion

Now you can use date parameters in your Visual Analytics Reports. There are several applications of this feature and this is only one way you can use parameters to drive business intelligence. Using this technique to create columns based on a user selected value is great when you need to compare values when your source data isn’t structured in this manner.

Using Date Parameters in your SAS Visual Analytics Reports was published on SAS Users.

11月 132017
 

Debugging is the bane of every programmer. SAS supports a DATA step debugger, but that debugger can't be used for debugging SAS/IML programs. In lieu of a formal debugger, many SAS/IML programmers resort to inserting multiple PRINT statements into a function definition. However, there is an easier way to query the values of variables inside a SAS/IML function: Use the PAUSE statement to halt the execution of program inside the misbehaving function. You can then interactively submit PRINT statements or any other valid statements. When you are finished debugging, you can submit the RESUME statement and the function will resume execution. (Or you can submit the STOP statement to halt execution and return to the main scope of the program.)

The PAUSE statement

The SAS/IML language is interactive. The PAUSE statement pauses the execution of a SAS/IML module. While the program is paused you can print or assign local variables inside the module. When you are ready for the module to resume execution, you submit the RESUME statement. (The SAS/IML Studio application works slightly differently. The PAUSE statement brings up a dialog box that you can use to submit statements. You press the Resume button to resume execution.)

For example, suppose you write a function called 'Func' whose purpose is to compute the sum of squares of the elements in a numeric vector. While testing the function, you discover that the answer is not correct when you pass in a row vector. You decide to insert a PAUSE statement ("set a breakpoint") near the end of the module, just before the RETURN statement, as follows:

proc iml;
/* in SAS/IML, the CALL EXECUTE statement runs immediately */
start Func(x);
   y = x`*x;
 
  /* Debugging tip: Use PAUSE to enter INTERACTIVE mode INSIDE module! */
  pause "Inside 'Func'; submit RESUME to resume computations.";
  /* Execution pauses until you submit RESUME, then continues from the next statement. */
 
   return (y);
finish;
 
w = Func( {1 2 3} );

When you run the program, it prints Inside 'Func'; submit RESUME to resume computations. The program then waits for you to enter commands. The program will remain paused until you submit the RESUME statement (include a semicolon!).

Because the program has paused inside the function, you can query or set the values of local variables. For this function, there are only two local variables, x and y. Highlight the following statement, and press F3 to submit it.

print y;     /* inside module; print local variable */

The output shows that the value of y is a matrix, not a scalar, which indicates that the expression x`*x does not compute the sum of squares for this input vector. You might recall that SAS/IML has a built-in SSQ function, which computes the sum of squares. Highlight the following statement and press F3 to submit it:

y = ssq(x);  /* assign local variable inside module */

This assignment has overwritten the value of the local variable, y. When you submit a RESUME statement, the function will resume execution and return the value of y. The program is now at the main scope, which you can demonstrate by printing the value of w:

resume;      /* resume execution. Return to main scope */
print w;     /* print result at main scope */

To make the change permanent, you must edit the function definition and redefine the module. Be sure to remove the PAUSE statement when you are finished debugging: If you run a program in batch mode that contains a PAUSE statement, the program will forever wait for input!

In conclusion, the PAUSE statement enables you to pause the execution of a program inside a module and interactively query and change the local variables in the module. This can help you to debug a function. After you finish investigating the state of the local variables, you can submit the RESUME statement, which tells the function to continue executing from the line after the PAUSE statement. (Or submit STOP to exit the function.) The PAUSE statement can be a useful alternative to inserting many PRINT statements inside a function during the debugging phase of program development.

The post A tip for debugging SAS/IML modules: The PAUSE statement appeared first on The DO Loop.

11月 112017
 

Here's a Proc Print trick for grouped data. Suppose your data is divided into groups, such as males and females. You could sort by the grouping variable before printing, like this: Suppose you want to better emphasize the groups. You could add a BY statement, like this: OK, but, personally, [...]

The post Simple Proc Print trick for grouped data appeared first on SAS Learning Post.

11月 092017
 

I recently saw an alarming article on social media about an outbreak of airborne plague spreading from Madagascar to Africa (and potentially to the rest of the world). The plague?!? - I thought that only happened hundreds of years ago?!? I don't really trust news on Facebook, so I went [...]

The post Graphing mistakes to avoid ... like the plague! appeared first on SAS Learning Post.

11月 082017
 

A SAS programmer wanted to display a table in which the rows have different formats. An example is shown below. The programmer wanted columns that represent statistics and rows that represent variables. She wanted to display formats (such as DOLLAR) for some variables—but only for certain statistics. For example, the number of nonmissing observations (N) should never use the format of the variable, whereas the minimum, maximum, and mean values should.

Format rows of a table

In SAS you can easily apply a format to a column, but not to a row. However, SAS provides two techniques that enable you to control the appearance of cells in tables:

Both PROC TEMPLATE and PROC REPORT have a learning curve. If you are in a hurry, an alternative solution is to use the DATA step to create a factoid. Recall that a factoid is an ODS table that contains character values and (formatted) numbers in the same column. A factoid can display mixed-type data by converting every value to a string. This conversion process gives you complete control over the format of each cell in the table. Consequently, you can use context to format some numbers differently from other numbers.

Creating a custom factoid

Suppose that you want to generate the table shown at the top of this article. You can obtain the statistics from PROC MEANS and then transpose the data. The following statements produce an output data set that contains descriptive statistics for three variables in the Sashelp.Cars data:

proc means data=sashelp.cars noprint;
var Cylinders EngineSize MSRP;
output out=stats(drop=_TYPE_ _FREQ_
                 rename=(_STAT_=Label1)
                 where=(Label1 ^= "STD") );
run;
 
proc print data=stats noobs;
run;
Descriptive statistics with formats

I have highlighted the first row of the output data to indicate why this output is not suitable for a report. The output variables retain the format of the original variables. However, the 'N' statistic is the sample size, and a sample of size '$428' does not make sense! Even '428.000' is a strange value to see for a sample size. It would be great to format the first row with an intrger format such as 8.0.

The next DATA step creates three character variables (CVALUE1-CVALUE3) that contain formatted values of the three numerical variables in the data. The SELECT-WHEN statement does the following:

  • For each observation and for each of the three numerical variables:
    • If the LABEL1 variable is 'N', then apply the 8.0 format to the value of the numerical variable.
    • Otherwise use the VVALUE function to get the formatted value for the numerical variable.
data Factoid;
set stats;
array nValue[3] Cylinders EngineSize MSRP;      /* numerical variables */
array cValue[3] $12.;                           /* cValue[i] is formatted version of nValue[i] */
/* with a little effort you could assign labels dynamically */
label cValue1="Cylinders" cValue2="EngineSize" cValue3="MSRP";
 
do i = 1 to dim(nValue);
   select (Label1);
      when ('N')    cValue[i] = put(nvalue[i], 8.0);
      otherwise     cValue[i] = vvalue(nvalue[i]);
   end;
end;
run;
 
proc print data=Factoid label noobs;
   var Label1 cValue:;
run;

The output displays the character columns, which contain formatted numbers. With additional logic, you could display fewer decimal values for the MEAN row.

Transpose the factoid

The previous table contains the information that the programmer wants. However, the programmer asked to display the transpose of the previous table. For completeness, here are SAS statements that transpose the table:

proc transpose data=Factoid out=TFactoid;
   var cValue:;
   ID Label1;
   IDLABEL Label1;
run;
 
proc print data=TFactoid(drop=_NAME_) noobs label;
   label _LABEL_ = "Variable";
run;

The final table is displayed at the top of the article.

Summary

In summary, this article shows one way to display a column of data in which each cell potentially has a different format. The trick is to create a factoid, which means that you create a character copy of numeric data. As you create the copy, you can use the PUT function to apply a custom format, or you can use the VVALUE function to get the existing formatted value.

In general, you should try to avoid creating copies of data, so use PROC TEMPLATE and PROC REPORT if you know how. However, if you don't know how to use those tools, the factoid provides an alternative way to control the formatted values that appear in a table. Although not shown in this article, a factoid also enables you to display character and numeric values in the same column.

The post How to format rows of a table in SAS appeared first on The DO Loop.

11月 082017
 

A SAS programmer wanted to display a table in which the rows have different formats. An example is shown below. The programmer wanted columns that represent statistics and rows that represent variables. She wanted to display formats (such as DOLLAR) for some variables—but only for certain statistics. For example, the number of nonmissing observations (N) should never use the format of the variable, whereas the minimum, maximum, and mean values should.

Format rows of a table

In SAS you can easily apply a format to a column, but not to a row. However, SAS provides two techniques that enable you to control the appearance of cells in tables:

Both PROC TEMPLATE and PROC REPORT have a learning curve. If you are in a hurry, an alternative solution is to use the DATA step to create a factoid. Recall that a factoid is an ODS table that contains character values and (formatted) numbers in the same column. A factoid can display mixed-type data by converting every value to a string. This conversion process gives you complete control over the format of each cell in the table. Consequently, you can use context to format some numbers differently from other numbers.

Creating a custom factoid

Suppose that you want to generate the table shown at the top of this article. You can obtain the statistics from PROC MEANS and then transpose the data. The following statements produce an output data set that contains descriptive statistics for three variables in the Sashelp.Cars data:

proc means data=sashelp.cars noprint;
var Cylinders EngineSize MSRP;
output out=stats(drop=_TYPE_ _FREQ_
                 rename=(_STAT_=Label1)
                 where=(Label1 ^= "STD") );
run;
 
proc print data=stats noobs;
run;
Descriptive statistics with formats

I have highlighted the first row of the output data to indicate why this output is not suitable for a report. The output variables retain the format of the original variables. However, the 'N' statistic is the sample size, and a sample of size '$428' does not make sense! Even '428.000' is a strange value to see for a sample size. It would be great to format the first row with an intrger format such as 8.0.

The next DATA step creates three character variables (CVALUE1-CVALUE3) that contain formatted values of the three numerical variables in the data. The SELECT-WHEN statement does the following:

  • For each observation and for each of the three numerical variables:
    • If the LABEL1 variable is 'N', then apply the 8.0 format to the value of the numerical variable.
    • Otherwise use the VVALUE function to get the formatted value for the numerical variable.
data Factoid;
set stats;
array nValue[3] Cylinders EngineSize MSRP;      /* numerical variables */
array cValue[3] $12.;                           /* cValue[i] is formatted version of nValue[i] */
/* with a little effort you could assign labels dynamically */
label cValue1="Cylinders" cValue2="EngineSize" cValue3="MSRP";
 
do i = 1 to dim(nValue);
   select (Label1);
      when ('N')    cValue[i] = put(nvalue[i], 8.0);
      otherwise     cValue[i] = vvalue(nvalue[i]);
   end;
end;
run;
 
proc print data=Factoid label noobs;
   var Label1 cValue:;
run;

The output displays the character columns, which contain formatted numbers. With additional logic, you could display fewer decimal values for the MEAN row.

Transpose the factoid

The previous table contains the information that the programmer wants. However, the programmer asked to display the transpose of the previous table. For completeness, here are SAS statements that transpose the table:

proc transpose data=Factoid out=TFactoid;
   var cValue:;
   ID Label1;
   IDLABEL Label1;
run;
 
proc print data=TFactoid(drop=_NAME_) noobs label;
   label _LABEL_ = "Variable";
run;

The final table is displayed at the top of the article.

Summary

In summary, this article shows one way to display a column of data in which each cell potentially has a different format. The trick is to create a factoid, which means that you create a character copy of numeric data. As you create the copy, you can use the PUT function to apply a custom format, or you can use the VVALUE function to get the existing formatted value.

In general, you should try to avoid creating copies of data, so use PROC TEMPLATE and PROC REPORT if you know how. However, if you don't know how to use those tools, the factoid provides an alternative way to control the formatted values that appear in a table. Although not shown in this article, a factoid also enables you to display character and numeric values in the same column.

The post How to format rows of a table in SAS appeared first on The DO Loop.

11月 072017
 

SAS Tips and TricksThere is certainly no shortage of terrific tips and tricks in various SAS blogs from some of our most distinguished SAS in-house experts. But, there's another group of equally qualified experts who don't often get to share their expertise on this channel: our customers. So, I went on a quest to get the inside scoop from various SAS users, polling Friends of SAS members to get their feedback on their favorite SAS tips.

We asked a few of these Friends of SAS members who are regular SAS users to share with us their top SAS tips and tricks for improving performance or something they wished they had known earlier in their SAS career. Based on that, we got a wide range of tips and tricks from a number of different SAS users – ranging from novice to expert and across various industries and product users. Check out some of them below:

FUNCTIONS

Functions are either built into SAS itself or you can write your own customized code that act in the same manner, all of which help in analyzing and processing data. There are a variety of function categories that include mathematical, date and time, character, truncation, and miscellaneous. Using functions makes us more efficient, and we don’t have to re-invent the wheel every time we want to figure something out. With this being said, some of our regular SAS users have a thing or two to say about dealing with functions that may help you out:

“Before you program any complex code, look for a SAS function that will do the task for you.”
     - John Ladds, Past President, OASUS

“Insert a line break in a concatenated string, such as: manylines = catx('0a'x,a,b,c);”
     - Aroop Ghosh, Principal Consultant, Webtalk Communications

“Use the lag function to create time related variables, for example, in time punch data”
     - Yolanda, Analyst, TD

“A good trick that I have recently learnt [sic]which can make the code less wordier is using the functions IFN and IFC as an alternative to IF THEN ELSE statements in conditional processing.”
     - Sunny Giroti, Master of Business Analytics Candidate, Schulich School of Business

“IFN can be used in place of IF THEN ELSE to shorten code”
     - Neil Menezes, Senior Business Anlyst, CTFS

“Ron Cody’s link from SAS.COM. It has many SAS function examples.”
     - John Lam, CIBC

SYNTAX/SHORTCUTS/EFFICIENCIES

You know what they say: time is money. So for a SAS programmer, finding shortcuts and ways to work more efficiently and faster are important to get a job done quicker. Here are a few ways SAS users think can make your life easy while working with SAS:

“Use missover to ensure no records are skipped when reading in a file”
     - Scott Bellefeuille, IT Solutions Developer (Merchant Services), TD Bank

“Pressing keys 'Ctrl'+'/' to comment out a line of code.”
     - Bunce Leung, Execution Manager, RBC

“Variable Lists - being able to refer to variables using double dashes to indicate all variables between first and last in a dataset is super useful for many procs. The later versions of being able to use the prefix and colon to indicate all datasets with a prefix is a great shortcut as well.”
     - Fareeza Khurshed, Manager (Statistical Services), Alberta Treasury Board and Finance

“I like to use PERL in SAS for finding stuff in character variables.”
     - Peter Timusk, Statistics Officer, Statistics Canada

“Title "SAS can give you an Inheritance". Have an ODBC driver on your local PC but not on a remote server? No problem. Use rsubmit with the inheritlib option. Your remote server will now inherit the ODBC driver and be able to access a database you thought you could only reach with your PC.”
     - Horst Wolter, Manager, TD Bank

“If you want to speed the processing of your program. Run your join statements on the "work" library. It is must faster.”
     - Estela Tavares, Economist, Statistics Canada

“When dealing with probability, can logistic be used in all cases? Trick Q - as A is N0. What about the times, probability is 0 and 1. What if the data is heavily distributed on 1s and 0s.”
     - Mukul Pandey, Student Business Analytics, Schulich School of Business

“Proc tabulate can perform descriptive statistics better than proc freq and proc means.”
     - Taha Azizi, Senior Business Insight Analyst, TD

Your turn

Were any of these tips and tricks useful? Do you use them already? What are some of your top SAS tips and tricks? Please be sure to share in the comments below!

Looking for more tips and tricks? Check out this video featuring six Canadian SAS programmers, including a few Friends of SAS members, who share some of their favourite SAS programming tips.

About Friends of SAS

If you’re not familiar with Friends of SAS, it is an exclusive online community available only to our Canadian SAS customers and partners to recognize and show our appreciation for their affinity to SAS. Members complete activities called 'challenges' and earn points that can be redeemed for rewards. There are opportunities to build powerful connections, gain privileged access to SAS resources and events, and boost your learning and development of SAS all in a fun environment.

Interested in learning more about Friends of SAS? Feel free to email myself at Natasha.Ulanowski@sas.com or Martha.Casanova@sas.com with any questions or more details.

SAS tips and tricks: Users-tell-all edition was published on SAS Users.

11月 062017
 

When she was a little girl, SAS Senior User Experience Designer Khaliah Cothran’s parents gave her a creativity building set. She fell in love with building models of all kinds, from cars and trains to merry-go-rounds. And just like that, an engineer was born. A few years later, Cothran happened [...]

The post Building the STEM pipeline one story at a time appeared first on SAS Analytics U Blog.