
January 14, 2019
 

In the second of three posts on using automated analysis with SAS Visual Analytics, we used the automated analysis object to get a better understanding of our variable of interest, X-Sell and Up-sell Flag, and how it is influenced by other variables in our dataset.

In this third and final post, you'll see how to filter the data even more to set up your customer care workers for success.

Remember how on the left-hand side of the analysis we had a list of subgroups with their probabilities? We can use those to filter our data or to create additional subsets of data. Let's create a calculated category from one of the subgroups and then use it to filter a list table of customers. If I right-click on the 87% subgroup and select Derive subgroup item, a new calculated category appears in my Data pane.

Here is the new data item located in our data pane:

To see the filter behind this data item, we can right-click on it and select Edit.

We can now use this category as a filter. Here we have a basic customer table that does not have a filter applied:

If we apply the filter for customers who fall in the 87% subgroup and a filter for those customers who have not yet upgraded, we have a list of customers that are highly likely to upgrade.

We could give this list to our customer care centers and have them call these customers to see if they want to upgrade. Alternatively, the customer care center could use this filter to target customers for upgrades when they call in. So, if a customer calls into the center, the employee could see if that customer meets the criteria set out in the filter. If they do, they are highly likely to upgrade, and the employee should provide an offer to them.

How to match callers with sales channels

Let’s go back to our automated analysis and perform one more action. We’ll create a new object from the subgroup and assess the group by acquisition channel. This will help us determine which acquisition channel(s) the customers who are in our 87% subgroup purchased their plans from. Then we’ll know which sales teams we need to communicate to about our sales strategy.

To do this we’ll select our 87% group, right click and select New object from subgroup on new page, then Acquisition Channel.

Here we see the customers who are in or out of our subgroup by acquisition channel.

Because it is difficult to see the "in" group, we'll remove the customers who are out of our subgroup by selecting Out in the legend, then right-clicking and selecting New filter from selection, then Exclude selection.

Now we can see which acquisition channel the 87% subgroup purchased their current plan from and how many have already upgraded.

In less than a minute, SAS Visual Analytics' automated analysis has given us machine learning-based business insights that would have taken hours to produce manually. Not only that, the results are easy to understand because they're built with natural language processing. We can now analyze all variables and remove bias, ensuring we don't miss key findings. Business users gain access to analytics without needing the expert skills required to build models and interpret results. Automated analysis is just a start, and SAS is committed to investing time and resources into this new wave of BI. Look for more enhancements in future releases!

Miss the previous posts?

This is the third of a three-part series demonstrating automated analysis using SAS Visual Analytics on Viya. Part 1 describes a common visualization approach to handling customer data that leaves room for error and missed opportunities. Part 2 shows improvements through automated analysis.

Want to see automated analysis in action? Watch this video!

How SAS Visual Analytics' automated analysis takes customer care to the next level - Part 3 was published on SAS Users.

January 10, 2019
 

Everyone's excited about artificial intelligence. But most people, in most jobs, struggle to see how AI can be used in their day-to-day work. This post, and others to come, are all about practical AI. We'll dial the coolness factor down a notch, but we'll explore some real gains to be made with AI technology in solving business problems in different industries.

This post demonstrates a practical use of AI in banking. We’ll use machine learning, specifically neural networks, to enable on-demand portfolio valuation, stress testing, and risk metrics.

Background

I spend a lot of time talking with bankers about AI. It's fun, but the conversation inevitably turns to concerns around leveraging AI models, which can have some transparency issues, in a highly regulated and highly scrutinized industry. It's a valid concern. However, there are a lot of ways the technology can be used to help banks, even in regulated areas like risk, without disrupting production models and processes.

Banks often need to compute the value of their portfolios. This could be a trading portfolio or a loan portfolio. They compute the value of the portfolio based on the current market conditions, but also under stressed conditions or under a range of simulated market conditions. These valuations give an indication of the portfolio’s risk and can inform investment decisions. Bankers need to do these valuations quickly on-demand or in real-time so that they have this information at the time they need to make decisions.

However, this isn't always a fast process. Banks have a lot of instruments (trades, loans) in their portfolios, and the functions used to revalue the instruments under the various market conditions can be complex. To address this, many banks approximate the true value with a simpler function that runs very quickly. This is often done with a first- or second-order Taylor series approximation (also called a quadratic or delta-gamma approximation) or via interpolation in a matrix of pre-computed values. Approximation is a great idea, but first- and second-order approximations can be terrible substitutes for the true function, especially under stress conditions. Interpolation can suffer the same drawback under stress.
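For reference, the second-order (delta-gamma) approximation is just a Taylor expansion of the instrument value V around the base underlying price S_0:

$$ V(S) \approx V(S_0) + \Delta\,(S - S_0) + \tfrac{1}{2}\,\Gamma\,(S - S_0)^2, \qquad \Delta = \frac{\partial V}{\partial S}\bigg|_{S_0}, \quad \Gamma = \frac{\partial^2 V}{\partial S^2}\bigg|_{S_0} $$

The fit is excellent near S_0, but for a payoff as non-linear as an option's it can deviate badly under large, stressed moves, which is exactly what the figure below shows.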

An American put option is shown for simplicity. The put option value is non-linear with respect to the underlying asset price. Traditional approximation methods, including this common second-order approximation, can fail to fit well, particularly when we stress asset prices.

Improving approximation with machine learning

Machine learning is a technology commonly used in AI. Machine learning is what enables computers to find relationships and patterns among data. Technically, traditional first-order and second-order approximation is a form of classical machine learning, much like linear regression. But in this post we'll leverage more modern machine learning, like neural networks, to get a better fit with ease.

Neural networks can fit functions with remarkable accuracy. You can read about the universal approximation theorem for more about this. We won’t get into why this is true or how neural networks work, but the motivation for this exercise is to use this extra good-fitting neural network to improve our approximation.

Each instrument type in the portfolio will get its own neural network. For example, in a trading portfolio, our American options will have their own network and interest rate swaps, their own network.

The fitted neural networks have a small computational footprint so they’ll run very quickly, much faster than computing the true value of the instruments. Also, we should see accuracy comparable to having run the actual valuation methods.

The data, and lots of it

Neural networks require a lot of data to train the models well. The good thing is we have a lot of data in this case, and we can generate any data we need. We’ll train the network with values of the instruments for many different combinations of the market factors. For example, if we just look at the American put option, we’ll need values of that put option for various levels of moneyness, volatility, interest rate, and time to maturity.

Most banks already have their own pricing libraries to generate this data and they may already have much of it generated from risk simulations. If you don’t have a pricing library, you may work through this example using the Quantlib open source pricing library. That’s what I’ve done here.

Now, start small so you don't waste time generating tons of data up front. Use relatively sparse data points on each of the market factors, but be sure to cover the full range of values so that the model holds up under stress testing. If the model was only trained with interest rates of 3-5 percent, it's not going to do well if you stress interest rates to 10 percent. Value the instruments under each combination of values.
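As a rough illustration of "sparse but full-range" coverage, a training grid can be generated with nested DO loops. This is a minimal sketch only; the factor names and ranges are assumptions for illustration, and each row would still need to be priced with the pricing library to produce the training target:

/* Generate a sparse, full-range grid of market-factor combinations. */
/* Ranges and step sizes are illustrative assumptions only.          */
data factor_grid;
   do moneyness = 0.5 to 1.5 by 0.05;
      do volatility = 0.05 to 0.60 by 0.05;
         do rate = 0.00 to 0.10 by 0.01;
            do maturity = 0.1 to 2.0 by 0.1;
               output;   /* one row per combination of factor values */
            end;
         end;
      end;
   end;
run;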

Here is my input table for an American put option. It's about 800K rows. I've normalized the strike price so I can use the same model on options of varying strike prices, and I've added moneyness in addition to the underlying price.

This is the input table to the model. It contains the true option prices as well as the pricing inputs. I used around 800K observations to get coverage across a wide range of values for the various pricing inputs. I did this so that my model will hold up well to stress testing.
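A quick note on moneyness: it is commonly defined as the ratio of the underlying asset price to the strike,

$$ m = \frac{S}{K} $$

which is what makes a strike-normalized model reusable across options with different strike prices. (That standard definition is an assumption on my part; the exact transformation used in this table isn't shown.)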

The model

I use SAS Visual Data Mining and Machine Learning to fit the neural network to my pricing data. I can use either the visual interface or a programmatic interface. I’ll use SAS Studio and its programmatic interface to fit the model. The pre-defined neural network task in SAS Studio is a great place to start.

Before running the model, I standardize my inputs further; neural networks do best if you've adjusted the inputs to a similar range. I enable hyperparameter autotuning so that SAS selects the best model parameters for me. I also ask SAS to output the SAS code to run the fitted model, so that I can later test and use the model.

The SAS Studio Neural Network task provides a wizard to specify the data and model hyper parameters. The task wizard generates the SAS code on the right. I’ve allowed auto-tuning so that SAS will find the best model configuration for me.
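For readers who prefer code to the wizard, a comparable PROC NNET call might look roughly like the following. This is a sketch under assumptions: the table and variable names are hypothetical, and the code the task generates for you will differ.

/* Sketch only: fit a neural network to the option-pricing data with */
/* autotuning. Table, variable, and file names are hypothetical.     */
proc nnet data=mycas.put_prices;
   input moneyness volatility rate maturity / level=interval;
   target option_value / level=interval;
   autotune;                             /* search for a good configuration */
   train outmodel=mycas.put_nnet;        /* persist the fitted model        */
   code file='/tmp/put_nnet_score.sas';  /* DATA step score code for reuse  */
run;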

I train the model. It only takes a few seconds. I try the model on some new test data and it looks really good. The picture below compares the neural network approximation with the true value.

The neural network (the solid red line) fits very well to the actual option prices (solid blue line). This holds up even when asset prices are far from their base values. The base value for the underlying asset price is 1.

If your model's doing well at this point, you can stop. If it's not, you may need to try a deeper model, a different model, or more data. SAS offers model interpretability tools, like partial dependency, to help you gauge how the model fits for different variables.

Deploying the model

If you like the way this model is approximating your trade or other financial instrument values, you can deploy the model so that it can be used to run on-demand stress tests or to speed up intra-day risk estimations. There are many ways to do this in SAS. The neural network can be published to run in SAS, in-database, in Hadoop, or in-stream with a single click. I can also access my model via a REST API, which gives me lots of deployment options. What I'll do, though, is use these models in SAS High-Performance Risk (HPRisk) so that I can leverage the risk environment for stress testing and simulation and use its nice GUI.

HPRisk lets you specify any function, or method, to value an instrument. Given the mapping of the functions to the instruments, it coordinates a massively parallel run of the portfolio valuation for stress testing or simulation.

Remember the SAS score code file we generated when we trained the neural network? I can drop that code into an HPRisk method, and HPRisk will run the neural network I just trained.
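Outside of HPRisk, that same generated score code can be applied in an ordinary DATA step. A minimal sketch, reusing the hypothetical file and table names from the earlier sketch:

/* Score a table of market scenarios with the fitted network.    */
/* Table and file names are the hypothetical ones used earlier.  */
data mycas.scenario_values;
   set mycas.stress_scenarios;           /* one row per market scenario */
   %include '/tmp/put_nnet_score.sas';   /* apply the fitted network    */
run;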

I can specify a scenario through the HPRisk UI and instantly get the results of my approximation.

Considerations

I introduced this as a practical example of AI, specifically machine learning in banking, so let's make sure we keep it practical by considering the following:
 
    • Only approximate instruments that need it. For example, if it's a European option, don’t approximate. The function to calculate its true price, the Black-Scholes equation, already runs really fast. The whole point is that you’re trying to speed up the estimation.
 
    • Keep in mind that this is still an approximation, so only use this when you’re willing to accept some inaccuracy.
 
    • In practice, you could be training hundreds of networks depending on the types of instruments you have. You’ll want to optimize the training time of the networks by training multiple networks at once. You can do this with SAS.
 
    • The good news is that if you train the networks on a wide range of data, you probably won't have to retrain often. They should be pretty resilient. This is a nice perk of the neural networks over the second-order approximation, whose parameters need to be recomputed often.
 
    • I've chosen neural networks for this example, but be open to other algorithms. Note that different instruments may benefit from different algorithms. Gradient boosting and others may offer simpler, more intuitive models that achieve similar accuracy.

When it comes to AI in business, you're most likely to succeed when you have a well-defined problem, like our stress testing that takes too long or isn't accurate. You also need good data to work with. This example had both, which made it a good candidate for demonstrating practical AI.

More resources

Interested in other machine learning algorithms or AI technologies in general? Here are a few resources to keep learning.

Article: A guide to machine learning algorithms and their applications
Blog post: Which machine learning algorithm should I use?
Video: Supervised vs. Unsupervised Learning
Article: Five AI technologies that you need to know

Practical AI in banking was published on SAS Users.

January 10, 2019
 


My New Year's resolution: “Unclutter your life.” I hope this post will help you do the same.

Here I share with you a data preparation approach and SAS coding technique that will significantly simplify, unclutter and streamline your SAS programming life by using data templates.

Dictionary.com defines template as “anything that determines or serves as a pattern; a model.” However, I was flabbergasted when my “prior art research” for the topic of this blog post ended rather abruptly: “No results found for data template.”

What do you mean “no results?!” (Yes, sometimes I talk to the Internet. Do you?) We have templates for everything in the world: MS Word templates, C++ templates, Photoshop templates, website templates, holiday templates, we even have our own PROC TEMPLATE. But no templates for data?

At that point I paused, struggling to accept reality, but then felt compelled to come up with my own definition:

A data template is a well-defined data structure containing a data descriptor but no data.

Therefore, a SAS data template is a SAS dataset (data table) containing a descriptor portion with all necessary attributes defined (variable types, labels, lengths, formats, and informats) and an empty (zero observations) data portion.

Less clutter = greater efficiency

When you construct SAS data tables with SAS code or data management tools, data templates let you define a table's structure once and reuse it consistently everywhere it is needed, for example by using data design documentation as a feed to a code-generating SAS program.

Unfortunately, despite all these benefits the data template concept is not explicitly and consistently employed and is noticeably absent from data development methodologies and practices.

Let’s try to change that!

How to create SAS data templates from scratch

It is very easy to create SAS data template. Here is an example:

 
libname PARMSDL 'c:\projects\datatemplates';
data PARMSDL.MYTEMPLATE;
   label
      newvar1 = 'Label for new variable 1'
      newvar2 = 'Label for new variable 2'
      /* ... */
      newvarN = 'Label for new variable N'
      ;
  length
      newvar1 newvar2 $40
      newvarN 8
      ;
   format newvarN mmddyy10.;
   informat newvarN date9.;
   stop;
run;

First, you need to assign a permanent library (e.g. PARMSDL) where you are going to store your SAS dataset template. I usually do not store data templates in the same library as the data, nor in the same directory/folder as my SAS code. Ordinarily, I store data templates in a so-called parameter data library (that is why I use PARMSDL as the libref), along with other data that define the structure of the SAS code.

In the data step, the very first statement, LABEL, defines all the variables' labels, as well as the variable order, which is determined by the order in which the variables are listed.

The LENGTH statement defines the variables' types (numeric or character) and their lengths in bytes. Here you may group variables of the same length to shorten your code, or define them individually to be more explicit.
The FORMAT statement defines variables' formats as needed. You don't have to define formats for all the variables; define them only where necessary.

The INFORMAT statement (also optional) defines informats, which come in handy if you use this data template to create SAS datasets by reading external raw files. With informats defined on the data template, you won't have to specify informats in your INPUT statement while reading an external file, because the informats are inherently associated with the variable names. That is why SAS data sets have an informat attribute for their variables in the first place (if you ever wondered why).

Finally, don’t forget the STOP statement at the end of your data step, just before the RUN statement. Otherwise, instead of zero observations, you will end up with a data table that has a single observation with all missing variable values. Not what we want.

It is worth noting that the OBS=0 system option will not work in place of the STOP statement, as it applies only to data being read, and we read no data here. For the same reason, the (obs=0) data set option will not work either. Try it, and the SAS log will dispel your doubts:

 
data PARMSDL.MYTEMPLATE (obs=0);
                        ---
                        70
WARNING 70-63: The option OBS is not valid in this context.  Option ignored.

How to create SAS data templates by inheritance

If you already have some data table with well-defined variable attributes, you may easily create a data template out of that data table by inheriting its descriptor portion:

 
data PARMSDL.MYTEMPLATE;
   set SASDL.MYDATA (obs=0);
run;

Option (obs=0) does work here as it is applied to the dataset being read, and therefore STOP statement is not necessary.

You can also combine inheritance with defining new variables, as in the following example:

 
data MYTEMPLATE;
   set SASDL.MYDATA (obs=0); *<-- inherited template;
   * variables definition: ;
   label
      newvar1 = 'Label for new variable 1'
      newvarN = 'Label for new variable N'
      oldvar =  'New Label for OLD variable'
      ;
   length
      newvar1 $40
      newvarN 8
      oldvar  $100 /* careful here, see notes below */
      ;
   format newvarN mmddyy10.;
   informat newvarN date9.;
run;

A word of warning

Be careful when your new variable definition's type and length contradict the inherited definition.
You can overwrite/re-define inherited variable attributes such as labels, formats and informats with no problem, but you cannot overwrite the type and, in some cases, the length. If you do need a different variable type for a specific variable name on your data template, you should first drop that variable on the SET statement and then re-define it in the data step.

With the length attribute the picture is a bit different. If you try defining a different length for some variable, SAS will produce the following WARNING in the LOG:

WARNING: Length of character variable  has already been set.
Use the LENGTH statement as the very first statement in the DATA STEP to declare the length of a character variable.

You can either follow the advice in the WARNING and place the LENGTH statement as the very first statement, or at least place it before the SET statement. In that case, you will find that you can increase the length without a problem, but if you try to reduce the length relative to the one on the parent dataset, SAS will produce the following WARNING in the LOG:

WARNING: Multiple lengths were specified for the variable  by input data set(s). This can cause truncation of data.

In this case, a cleaner way will also be to drop that variable on the SET statement and redefine it with the LENGTH statement in the data step.

Keep in mind that when you drop these variables from the parent data set, besides losing their type and length attributes, you will obviously lose the rest of the attributes too. Therefore, you will need to re-define all the attributes (type, length, label, format, and informat) for the variables you drop. At least, this technique will allow you to selectively inherit some variables from a parent data set and explicitly define others.
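Here is a minimal sketch of that selective-inheritance technique, assuming the parent data set SASDL.MYDATA has a character variable OLDVAR whose attributes we want to re-define (the new attributes shown are illustrative):

/* Drop OLDVAR from the parent, then re-define all of its attributes. */
/* LENGTH comes first, per the WARNING's advice.                      */
data PARMSDL.MYTEMPLATE;
   length oldvar $10;                      /* new length                */
   label  oldvar = 'Re-defined variable';  /* re-define every attribute */
   set SASDL.MYDATA (obs=0 drop=oldvar);   /* inherit everything else   */
run;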

How to use SAS data templates

One way to apply your data template to a newly created dataset is to: 1) copy your data template into the new dataset; 2) append your data table to that new dataset. Here is an example:

 
/* copying data template into dataset */
data SASDL.MYNEWDATA;
   set PARMSDL.MYTEMPLATE;
run;
 
/* append data to your dataset with descriptor */
proc append base=SASDL.MYNEWDATA data=WORK.MYDATA;
run;

Your variable types and lengths must be the same in the BASE= and DATA= tables; labels, formats and informats will be carried over from the BASE= dataset/template.

It is simple, but it can be simplified even more, reducing your code to just a single data step:

 data SASDL.MYNEWDATA;
   if 0 then set PARMSDL.MYTEMPLATE;
   set WORK.MYDATA;
   /* other statements */
run;

Even though the set PARMSDL.MYTEMPLATE; statement never executes because of the explicitly FALSE condition (0 means FALSE) in the IF statement, the resulting dataset SASDL.MYNEWDATA gets all its variable attributes carried over from the PARMSDL.MYTEMPLATE data template during data step compilation.

This same coding technique can be used to implicitly apply variable attributes from a well-defined data set by inheritance, even though that data set is not technically a data template (it has more than 0 observations). Run the following code to confirm that the MYDATA table has all the variables and attributes of the SASHELP.CARS table while the data values come from the ABC data set:

 
data ABC;
  make='Toyota';
run;
 
data MYDATA;
   if 0 then set SASHELP.CARS;
   set ABC;
run;

Perhaps the benefits of SAS data templates are best demonstrated when you read external data into a SAS data table. Here is an example for you to run (of course, in real life MYTEMPLATE should be a permanent data set, and instead of datalines you would read an external file):

 
data MYTEMPLATE;
   label
      fdate = 'Flight Date'
      count = 'Flight Count'
      fdesc = 'Flight Description'
      reven = 'Revenue, $';
   length fdate count reven 8 fdesc $22;
   format fdate date9. count comma12. reven dollar12.2;
   informat fdate mmddyy10. count comma8. fdesc $22. reven comma10.;
   stop;
run;
 
data FLIGHTS;
   if 0 then set MYTEMPLATE;
   input fdate count fdesc & reven;
   datalines;
12/05/2018 500   Flight from DCA to BOS  120,034
10/01/2018 1,200 Flight from BOS to DCA  90,534
09/15/2018 2,234 Flight from DCA to MCO  1,350
;

Here is how the output data set looks:

Notice how simple the last data step is. No labels, no lengths, no formats, no informats – no clutter. Yet, the raw data is read in nicely, with proper informats applied, and the resulting data set has all the proper labels and variable formatting. And when you repeat this process for another sample of similar data you can still use the same data template, and your read-in data step stays the same – simple and concise.

Your thoughts

Do you find SAS data templates useful? Do you use them in any shape or form in your SAS data development projects? Please share your thoughts.

Simplify data preparation using SAS data templates was published on SAS Users.

January 7, 2019
 

In the first of three posts on using automated analysis with SAS Visual Analytics, we explored a typical visualization designed to give telco customer care workers guidance on which customers are most receptive to upgrading their plans. While the analysis provided some insight, it lacked analytical depth -- and that increases the risk of wasting time, energy and money on a strategy that may not succeed.

Let’s now look at the same data, but this time deepen the analytical view by putting SAS Visual Analytics' automated analysis into play. We’ll use automated analysis to determine significant variables that impact our key business measure, X-sell and Up-sell Flag.

Less time spent on data discovery, quicker response time

The automated analysis object determines the most important underlying factors for a specific response variable, in our case the X-Sell and Up-Sell flag. After you specify a response variable, most of the remaining data items are added as underlying factors. Variables that are identical to the response variable, variables that have excessive missing values, or variables that have high cardinality are not added as underlying factors. For category responses, you can select the event level (category value) that interests you.

To run automated analysis, I go to the Data pane, right-click on Xsell and Upsell Flag Category, and select Analyze, then Analyze on new page.

 

Here we see the results for Not Yet Upgraded.

Since we really want to understand what made our customers upgrade so we can learn from it, let's change the results to show upgraded accounts. To do this, I use the drop-down menu to change the category value to Upgraded.

 

Now we see the details for Upgraded. Let’s look at each piece of information within this chart.

 

The top section tells us that the probability of a customer upgrading is 12.13%. It also tells us which other variables in our dataset influence that probability. The strongest influencers are Total Days Over Plan, Days Suspended Last 6M (months), Total Times Over Plan and Delinquent Indicator. Remember from our previous analysis that the correlation matrix determined that Total Days Over Plan, Delinquent Indicator and Days Suspended Last 6M were correlated with our X-sell and Up-sell Flag. So this part of the analysis is quite similar. However, the rest of the automated analysis provides far more information than our previous analysis did, and it was produced in under a minute.

The next section gives us a visual indication of how strongly each influencer affects our variable of interest, Xsell and Upsell Flag Category. Total Days Over Plan is the strongest, followed by Days Suspended Last 6M, then Total Times Over Plan, and so on. If we mouse over each of the boxes, we'll see their relative importance.

After SAS Visual Analytics adds the underlying factors, it creates a relative importance score for each underlying factor. The most important underlying factor is assigned a score of 1, and all other scores are proportional to that value.
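In other words, if I_j denotes the raw importance of underlying factor j, the displayed score is simply that importance rescaled by the maximum:

$$ \text{score}_j = \frac{I_j}{\max_k I_k} $$

so the most important factor always scores exactly 1.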

If I mouse over Total Days Over Plan I’ll see the relative importance score for that variable.

Here we see that Total Days Over Plan relative importance in influencing a customer to change plans is 1. That means it was the most important factor in predicting our variable of interest, cross-sell and up-sell flag. If I mouse over the Days Suspended Last 6M, I can see that the relative importance for that variable is 0.6664.

The percentages along the left-hand side give us the probability (or chance) that a subgroup of customers is likely to upgrade. SAS Visual Analytics shows the top groups and the bottom groups based on probability. The first group of customers is 100% likely to upgrade. These customers have Total Days Over Plan greater than or equal to 33, Days Suspended Last 6M greater than or equal to 6, 6M Avg Minutes on Network Normally Distributed less than -6.9, and Delinquent Indicator of 1, 2, 3 or 4. This means that, going forward, if we have customers who meet these criteria, we should target them for an upgrade because they are 100% likely to upgrade. We can also target the next three customer groups as well.

For measure responses, the results display the four groups that result in the greatest values of the response. The results also display the two groups that result in the smallest values of the response. For category responses, the results display the four groups that contain the greatest percentages of the response. The results also display the two groups that contain the least percentages of the response.

The bottom right chart shows how a variable relates to our variable of interest. Below the chart is a description outlining key findings.

An explanatory plot is included for each underlying factor. The contents of this plot depend on the variable type of both the response variable and the underlying factor.

If I click on Days Suspended Last 6M from the colored button bar, the informative text will be highlighted, and the plot chart will be updated to reflect my selection.

But what if you want to see all the variables analyzed and discover what actions were taken on them? If we maximize the automated analysis object, we see a table at the bottom that outlines the actions taken on the predictors.

Here we see that Census Area Total Males was rejected because it is too strongly correlated with another measure. This reason would be easy for someone to miss and would affect the results of an analysis or model if that predictor was not removed. Automated analysis really does do the thinking for us and makes models more accurate!

In the third and final post of this series, we'll see how to turn the results from this automated analysis into actionable items.

SAS® Visual Analytics on SAS® Viya® Try it for free!

How SAS Visual Analytics' automated analysis takes customer care to the next level - Part 2 was published on SAS Users.

January 3, 2019
 

You're the operations director for a major telco's contact center. Your customer-care workers enjoy solving problems. Turning irate callers into fans makes their day.

They also hate flying blind. They've been begging you for deeper insight into customer data to better serve their callers. They want to know which customers will likely accept offers and upgrades they're authorized to give. Their success = customer satisfaction = your company's success, right?

Automated analytics facilitates that level of insight, and this post introduces you to it. It will help you begin to think through what it looks like to equip your contact center workers to be heroes. Two subsequent posts will further demonstrate how SAS Visual Analytics leverages automated analytics.

What is automated analytics?

If you're already familiar with business intelligence tools, it's not a stretch to call automated analytics disruptive, significantly changing the way you see BI. In essence, automated analytics uses machine learning to find meaningful relationships between variables. It provides valuable insights in easy-to-understand text generated using natural language.

Automated analytics, which is expanding to include Artificial Intelligence, overcomes barriers to insight-driven business decisions by reducing:

  • Time to insights.
  • Bias in the analysis.
  • The need for more employee training.

An analyst-intensive approach to better insights

Now put your analyst hat on and imagine a day in the life of interpreting data visualizations. Pictured below is a report created to explore and visualize customers' interactions with a telecommunications company. It contains usage information from a subset of customers who have contacted customer care centers. Enhanced by adding cleansed demographics data, this report is being used to target customers for cross-sell or up-sell opportunities.

Note that the Private Label GM channel has the highest upgrade rate, at 50%. This could mean that customers who purchased their plans through the Private Label GM channel were not well informed of their options and might have purchased a plan that did not fit their needs. We could investigate this further and see how we can better assist customers purchasing plans through the Private Label GM channel.

This report also shows us that the unknown handset type had the highest upgrade rate of all phone types. An unknown handset indicates that the customer brought their phone over from another company. So this high upgrade rate is not surprising, as a recent promotion targeted those users to switch their phones and upgrade.

The analysis showing our upgrade rate and total upgrades by plan type shows that the Lotta Minutes Classic plan had the highest number of upgrades. This is not surprising, as it also has the highest number of accounts. However, the Data Bytes Value plan had the highest upgrade rate but very few accounts. We could focus on the Unlimited SL plan customers and offer them upgrades, as they seem more likely to upgrade than customers on other plans, and there are quite a few customers still on that plan.

From the analysis at the bottom, we can see the correlation of other variables with our variable of interest, the cross-sell and up-sell flag. We can see that Total Days Over Plan, Delinquent Indicator and Days Suspended Last 6M are weakly correlated with upgrades.

What's interesting here is that data plan is not correlated with our variable of interest, the Xsell and Upsell flag. This tells me that if we had started a campaign targeting Unlimited SL customers, we probably wouldn't have had much success.

We might want to target customers based on total days over plan, delinquent indicator, or days suspended in the last 6 months, but those variables were only weakly correlated.

While this correlation provided great insight and may have prevented us from going down the wrong path, I had to manually choose the variables to include. I used my own logic and chose variables I thought might influence our variable of interest, the Xsell and Upsell Flag.

Risk of mistakes, missed opportunity

But there are many other variables in this dataset. What if one of the other variables that I hadn’t thought of was correlated? I would miss some key findings. Or what if multiple variables in combination better predict our variable of interest, Xsell and Upsell Flag?

To dive deeper, we could add a decision tree or other charts to try to determine where we should focus our future efforts. That would take time to build, and we'd need to interpret the results on our own. If we use automated analysis instead, the application would:

  • Choose the most relevant categories and measures.
  • Perform the most appropriate analytics for our data.
  • Provide us with results that are easy-to-understand.

Upcoming: A closer look at automated analytics

In next week's post, you'll see what happens when we turn loose the power of automated analytics with the SAS Viya Platform and let SAS Visual Analytics analyze all the measures and categories.

What's your experience with automated analytics? Share in the comments.

How SAS Visual Analytics' automated analysis takes customer care to the next level - Part 1 was published on SAS Users.

December 22, 2018
 

This post rounds out the year and my series of articles on SAS REST APIs. The first two articles in the series, Using SAS Viya REST APIs to access images from SAS Visual Analytics and Using SAS Cloud Analytics Service REST APIs to run CAS Actions, examined how to use SAS Viya REST and SAS CAS REST APIs to access SAS data from external resources. Follow the links for a quick detour to get some background. This article takes things a step further and outlines how to use a simple application to interact with SAS Viya using REST APIs.

What do chocolate and toffee have to do with optimization? Read on and find out.

The application

When deciding on an example to use in this article, I wanted to focus on the interaction between the application and SAS, not app complexity. I decided to use an application created by my colleague, Deva Kumar. His OptModel1 is an application built on the restAF framework and demonstrates how SAS REST APIs can be used to build applications that exploit various SAS Viya functionalities. This application optimizes the quantities of chocolate and toffee to purchase based on a budget entered by the user.

Think of the application as comparable to the guns and butter economic model. The idea in the model is the more you spend on the military (guns), the less you spend on domestic programs and the civilian goods (butter). As President Johnson stated in 1968, "That bitch of a war, killed the lady I really loved -- the Great Society." In this article, I'll stick to chocolate and toffee, a much less debatable (and tastier) subject matter.

The OptModel1 application uses the runOptmodel CAS action to solve the optimization problem. After the application launches and authenticates the user, the app requests a budget. Based on the amount entered, it returns a purchase recommendation for chocolate and toffee. The user may also request a report based on the returned values. In the application, OptModel1 and SAS interact through REST API calls. Refer to the diagram below for the application code workflow.

Create the application

To create the application yourself, access the source code and install instructions on SAS' github page. I recommend cloning, or in the least, accessing the repository. I refer to code snippets from multiple files throughout the article.

Application Workflow

Represented below is the OptModel1 work flow. Highlighted in yellow is each API call.

OptModel1 Work Flow

Outlined in the following sections is each step in the work flow, with corresponding numbers from the diagram.

Launch the application

Enter the URL http://localhost:5006/optmodel in a browser to access the login screen.

OptModel1 app login page

1. Login

Enter proper credentials and click the 'Sign In' button. The OptModel1 application initiates authentication in the logon.html file with this code:

        <script>
            function logonButton() {
                let store = restaf.initStore();
                store.logon(LOGONPAYLOAD)
                    .then(msg => console.log(msg))
                    .catch(err => alert(err));
            }
        </script>

Application landing page

After successfully logging in, the application's main page appears.

Application landing page

Notice how the host and access token are part of the resulting url. For now, this is as far as I'll go on authentication. I will cover this topic in depth in a future article.

As I stated earlier, this is the simplest of applications. I want to keep the focus on what is going on under the covers and not on a flashy application.

2a. Application initialization

Once the app confirms authentication, the application initialization steps ensue. The app needs to be available to multiple users at once, so each session gets its own copy of the template Visual Analytics (VA) report. This avoids users stepping on each other's changes and is accomplished through a series of API calls, as explained below. The code for these calls is in vaSetup.js and reportViewer.js.

2b. Copy data

The app copies data from the Public caslib to a temporary worklib (a worklib is a standard caslib, like casuser). The CASL code below is submitted to the CAS server for execution. The code that makes the API call to CAS is in vaSetup.js. The relevant snippet of JavaScript code is:

  // create casl statements
    let casl = `
        /* Drop the table in memory */
        action table.dropTable/
        caslib='${appEnv.work.caslib}' name='${appEnv.work.table}' quiet=TRUE;
 
        /* Delete the table from the source */
        action table.deletesource / 
        caslib='${appEnv.work.caslib}' source='${appEnv.work.table}.sashdat' quiet=TRUE;
 
        /* Run data step to copy the template table to worklib */
        action datastep.runCode /
            code='
            data ${appEnv.work.caslib}.${appEnv.work.table}; 
            set ${appEnv.template.caslib}.${appEnv.template.table};
            run;';
 
        /* Save the new work table */
        action table.save /
            caslib  = '${appEnv.work.caslib}'
            name    = '${appEnv.work.table}'
            replace = TRUE
            table= {
                caslib = '${appEnv.work.caslib}'
                name   = '${appEnv.work.table}'
            };
 
        /* Drop the table to force report to reload the new table */
        action table.dropTable/
            caslib='${appEnv.work.caslib}' name='${appEnv.work.table}' quiet=TRUE;
 
 
    `;
 
    // run casl statements on the server via REST API
    let payload = {
        action: 'sccasl.runCasl',
        data: {code: casl}
    }
    await store.runAction(session, payload);

2c. Does report exist?

This step checks to see if the personal copy of the VA report already exists.

2d. Delete temporary report

If the personal report exists it is deleted so that a new one can be created using the latest VA report template.

// If temporary report exists delete it - allows for potential new template report
    let reportsList = await getReport( store, reports, `${APPENV.work.report}`);
    if ( reportsList !== null ) {
        await store.apiCall(reportsList.itemsCmd(reportsList.itemsList(0), 'delete'));
      };

2e. Create new report

A new personal report is created. This new report is associated with the table that was created in step 2b.

// make the service call to create the temporary report
    let changeData = reportTransforms.links('createDataMappedReport');
    let newReport = await store.apiCall(changeData, p);

2f. Save report info

The app then builds the URL used to display the new report in an iframe and saves it in the application environment for later use.

// create src parameter for the iframe
    let options = "&appSwitcherDisabled=true&reportViewOnly=true&printEnabled=true&sharedEnabled=true&informationEnabled=true&commentEnabled=true&reportViewOnly=true";
    let href = `${appEnv.host}/SASReportViewer/?reportUri=${reportUri}${options}`;
 
    // save href in appEnv to use for displaying VA report in an iframe
    appEnv.href = href;

3. Enter budget

Enter a budget in the space provided (I use $10,000 in this example) and click the Optimize button. This action instructs the application to calculate the amount of chocolate and toffee to purchase based on the model.

Enter budget and optimize

4. & 5. Generate and execute CASL code

The code to load the CAS action set, run the CAS action, and store the results in a table, is in the genCode.js file:

  /* Assumption: All necessary input tables are in memory */
	pgm = "${pgm}";
	/*Load action set and run optimization*/
	loadactionset 'optimization';
		action optimization.runOptmodel / 
		code=pgm printlevel=0; 
		run; 
 
	/* save result of optimization for VA to use */
	action table.save /
		caslib  = '${appEnv.work.caslib}'
		name    = '${appEnv.work.table}'
		replace = TRUE
		table= {
			caslib = '${appEnv.work.caslib}'
			name   = '${appEnv.work.table}'
		};
 
	/* fetch results to return for the UI to display */
	action table.fetch r=result /
		table= {caslib = '${appEnv.work.caslib}' name = '${appEnv.work.table}'};
	run;
 
	/* drop the table to force report to reload the new table */
	action table.dropTable/
		caslib='${appEnv.work.caslib}' name='${appEnv.work.table}' quiet=TRUE;

Note: The drop table step at the end of the preceding code is important to force VA to reload the data for the report.

6. Get the results - table form

The results return to the application in table form. We now know to buy 370 units of chocolate and 111 units of toffee with our $10,000 budget. Please refer to casTableViewer for the code details of this step.

Data view in table format

6. Get the results - report form

Select the View Graph button. This action instructs OptModel1 to display the interactive report with the new data (the report we created in step 2f). Please refer to the onReport function in index.html for code details of this step.

Data view in report format

Now that we know how much chocolate and toffee to buy, we can make enough treats for all of the holiday parties just around the corner. More importantly, we've seen how to integrate SAS REST APIs into an application. This completes the series on using SAS REST APIs. The conversation is not over, however. I will continue to seek out and report on other topics related to SAS, open source languages, and agile technologies. Happy Holidays!

SAS REST APIs: a sample application was published on SAS Users.

December 18, 2018
 

The REST architecture that SAS Viya is built on is, by its nature, open. This is a very powerful thing! In addition, the supplied command-line interfaces (CLIs) add a user-friendly layer that makes it easier to issue REST calls. Occasionally, however, it is necessary to call REST directly. This can occur when there is (currently) no CLI interface to a piece of functionality, or when you wish to run a more complex task from a single command. In the SAS Global Enablement and Learning (GEL) group, as we staged our software images and developed our materials for our SAS Viya training, we found ourselves with some of these needs. As a result, we developed the GEL pyviyatools.

The GEL pyviyatools are a set of Python-based command-line tools that call the SAS Viya REST APIs. The tools can be used to make direct calls to any REST-endpoint (like a cURL command), and as a framework to build additional tools that make multiple rest calls to provide more complex functionality. The tools are designed to be used in conjunction with the sas-admin command line interfaces (CLI).

One of the challenges of making REST calls to SAS Viya is getting your authentication token. The tools simplify this issue by using the authentication mechanism provided by the SAS Viya command-line interfaces.

callrestapi (call_rest_api) is a general tool and the building block for all the other tools. It wraps a function, callrestapi(), that can also be used from any Python program to build more complex tools.

The tools are self-documenting, just like the Viya CLIs (use the -h or --help option).

With callrestapi, you must pass a method and endpoint. You can optionally pass JSON data for a post request, content type headers, and the -o option to change the style of output.
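As a rough illustration, a simple GET call might look like the line below. Treat this as a sketch rather than authoritative syntax: the -o option is documented above, but the names of the method and endpoint flags are assumptions on my part, so run callrestapi.py -h for the real usage.

# Sketch: list SAS Viya folders as JSON (flag names for method and
# endpoint are assumptions; verify with callrestapi.py -h)
./callrestapi.py -m get -e /folders/folders -o json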

In addition to this basic cURL-like functionality, there are some tools built on top of callrestapi that perform more complex functions. Here are few examples -- check out the GitHub project for a full list.

    createdomain.py creates a SAS Viya authentication domain

    updatedomain.py loads a set of userids and passwords to a Viya domain from a csv file

    listrules.py lists authorization rules subset on a principal and/or a uri

    loginviauthinfo.py uses an authinfo file to authenticate to the CLI

    updatepreferences.py updates preferences for a user or group of users

    createfolders.py creates a set of SAS Viya folders from a csv file

    explainaccess.py explains access for a folder, object or service endpoint

You can get the tools from GitHub, where the installation and usage instructions are documented.

Please try these tools if you need more command-line functions in your SAS Viya environment. In addition, if you want to contribute additional tools built on the framework, please see the CONTRIBUTING.md file in the GitHub repository. You can also report any issues or suggestions via GitHub issues.

Introducing Python-based command-line tools for SAS Viya was published on SAS Users.

December 14, 2018
 
Several years ago, I wrote a paper about the top-ten questions about the DATA step that SAS Technical Support receives from customers. Those topics are still popular among people who contact us for help. In this blog, I’m sharing some additional questions that we’re asked on a regular basis. Those questions cover SAS dates, arrays, and how to reference local PC files from SAS® Enterprise Guide® and SAS® Studio when those applications connect to a SAS® server in UNIX operating environments.

About SAS® dates

Let’s begin with dates. We regularly hear customers say something similar to this: "I have a date, but I’m not sure how to use it or whether it’s even a SAS date yet." No worries--we can figure it out! A SAS date is a numeric variable whose value represents the number of days between January 1, 1960 and a specific date. For example, assume that you have a variable named X that has a value of 12398, but you’re not sure what that value represents. Is it a SAS date? Or does it represent January 23, 1998?
 
To determine what the value represents, you first need to run the CONTENTS procedure on the data set and determine whether the variable in question is character or numeric.
 
For this example, here is the partial output from the PROC CONTENTS step:

Alphabetic List of Variables and Attributes
#    Variable    Type    Len    Format

1    x           Num       8
2    y           Char      3
3    z           Num       8    Z5.

If X is a numeric variable, is a format shown in the FORMAT column for that variable? In this case, the answer is no. However, if the variable is numeric and there is no assigned format, it might be a SAS date that needs a format to make sense of the value. If you run a simple DATA step that assigns any date format to that SAS date value, you will see that 12398 represents the date December 11, 1993.
 
data a;
mydate=12398;
format mydate worddate.;
run;  

If you print the results of this program with the PRINT procedure, the output for data set A is as shown below:
 
Obs         mydate

 1     December 11, 1993

Is this a valid date in the context of this data sample? If you’re unsure, look at the other date values to see whether most of them are similarly structured. Most of the time, if a variable is stored as a SAS date, the variable is already assigned a date format, which is shown in the PROC CONTENTS output. If the value 12398 is a numeric variable such that the digits represent the month, day, and year of a given date (for example, January 23, 1998), you can convert it to a SAS date by running the following DATA step:
 
data a;
x=12398;
y=input(put(x,5.),mmddyy6.);
format y date9.;
run;

The PROC PRINT output from this step shows that the variable Y has a formatted value of 23JAN1998.
 
Obs      x              y

 1     12398    23JAN1998

The format that you assign to the variable can be any SAS format or custom-date format.
 
If the original variable is a character variable, you can convert it to a SAS date by using the INPUT function and the MMDDYY6. informat.
 
data a;
x='12398';
y=input(x,mmddyy6.);
format y date9.;
run;

Using arrays in SAS

Many customers aren’t quite sure that they understand how to use arrays. Arrays are a common construct in many programming languages. Arrays can seem less complex when you remember that they are a temporary grouping of variables. When you perform the same operation on multiple variables, you have less to program if you can refer to a group of variables by a single name. You simply execute a DO loop that processes each variable in turn, and the task is complete!

We often see arrays used for "reshaping data," or transposing a data set from wide to long (or long to wide). For example, assume that you want to reshape a data set composed of three variables and four observations into a data set that contains twelve variables. Using an array approach makes the programming much easier, as shown below:

In this example:

    1. The variables X, Y, and Z are loaded into an array named VARS, which means that they can be referred to as VARS(1) – VARS(3) or by the variable names X, Y, and Z.
    2. A multidimensional array named ALL is created with twelve variables. The first number in parentheses represents rows, and the second represents columns.
    3. A DO loop processes each variable in the VARS array.
    4. The ALL array is populated one observation at a time by the value of I and the value of J as the DO loop increments.

Because the ALL array is populated by each observation as it is read from data set ONE, the END= option in the SET statement creates the variable LAST as a flag. This variable indicates when the last observation is read, and the IF statement tests the variable LAST. If it has a value of 1 (which evaluates to "true"), the statement writes the contents of the program data vector to the output data set. Here's the starting data set and the reshaped result:

Managing PC files in client/server environments

When I began working in Technical Support many years ago, the only interface to Base SAS® software was the Display Manager System, which has separate Program Editor, Log, and Output windows. Now, you can run SAS in various ways, and many of our customers use SAS Enterprise Guide and SAS Studio as their interfaces. One of the most frequently asked questions from customers is about how to access local PC files from these applications that access SAS through a UNIX server.

SAS Enterprise Guide offers built-in tasks to upload and download data sets and other files. You can find these tasks on the Tasks->Data menu.

Two of the tasks, Upload Data Files to Server and Download Data Files to PC, allow you to copy SAS data sets directly between your local PC and your SAS libraries. The third task, Copy Files, allows you to copy any file (or group of files) between your local PC and the file system of the SAS session. See this article to learn how to apply a common pattern with this task: export and download any file from SAS Enterprise Guide. (Note: The Copy Files task was added in SAS Enterprise Guide 7.13. For earlier releases, you can follow the steps in this article.)

If you’re using the SAS Studio interface, you can upload and download files between the server and your PC.

Upload File and Download File buttons in SAS Studio

 
To download a file from the SAS server to your computer:

    1. Select the file that you want to download from the folder tree.
    2. Click the download button and save the file according to the information in your browser dialog box.

To upload one or more files from your local computer:

    1. Select the folder to which you want to upload the files and click the upload button.
    2. In the Upload Files window, click Choose Files to browse for the files that you want to upload.
    3. Select one or more files from your computer and click Open. The selected files are displayed as well as their size. An error message is displayed when you try to upload files where the total size exceeds 10 MB.
    4. Click Upload to complete the upload process.

Always go back to the basics

The three topics that are discussed here don't represent new features or challenges. However, these topics generate many calls to Technical Support. It's a reminder that even as SAS continues to add new features and technology, we still need to know how to tackle the basic building blocks of our SAS programs.

FAQs about SAS dates, arrays and managing local PC files was published on SAS Users.

December 10, 2018
 

When I was growing up, there were two kinds of Sundays: regular Sundays and George Sundays. George was the proprietor of a local Italian restaurant in my hometown and hosted the extended LaRusso clan for Sunday lunch every few weeks. His restaurant, appropriately named George’s, owns some of my favorite childhood memories – and some of my worst.

Every couple of months, my aunts, uncles, a baker’s dozen of cousins, and my immediate family members would take over George’s backroom and see if we could challenge the city’s noise ordinance. George would do nothing to discourage us, appearing every so often to fire balls of uncooked dough at us or ply us with more caffeine-laced sugary drinks, despite instructions to the contrary from our parents.

Invariably, though, an otherwise pleasant afternoon took a turn for the worse as we were leaving the restaurant. That was when my parents, thinking they were doing us a favor, would let us choose one item off George’s famous “candy wall.” You see, George didn’t stock just one or two different kinds of candy, he had dozens. Every different kind of chocolate bar, brand of gum, and flavor of jelly beans beckoned from George’s Candy Wall. For a 6 or 7-year-old kid, it was just too much. All these choices literally paralyzed me. Ten minutes of indecisiveness and several ultimatums later my parents would usher me out of the restaurant, usually empty-handed and crying. Even on the rare occasions when I did settle on something, I spent the rest of the afternoon lamenting my decision, thinking I left behind something that I would have enjoyed more.

When it comes to the multitude of great support and learning resources we offer new users of SAS, I often wonder if it can feel like you’re staring at George’s Candy Wall as well. While support.sas.com remains the holy grail of SAS customer support, there are so many good choices, it can sometimes be hard to know where to start. That’s why we’ve put together a new resource to make things easier for new SAS users: the SAS Starter Kit.

Need help navigating SAS Support Resources? Here’s your guide

SAS Support ResourcesThe SAS Starter Kit is the perfect place for SAS newbies to start, outlining the five essential steps to help you learn the basics, grow your skills and connect with other users from around the world.

Step 1 invites you to create a SAS profile. A profile provides you access to things like free, on-demand training, software downloads and access to our SAS Communities, where you can ask questions, get answers and connect with SAS experts from nearly every industry and around the world.

Step 2 is your SAS resource cheat sheet. SAS Cares is your one-stop listing of all the SAS resources you'll ever need. Add it to your web favorites, or print it out and add a little color to your cube. Keep this one close; it provides quick, one-click access to some of SAS' most helpful resources.

Step 3 is designed to expand your SAS knowledge. This step introduces you to a full menu of free tutorials to binge watch, free e-courses for a deeper dive, and other learning resources, from e-books to webinars and more.

Step 4 is the perfect resource if you’re completely new to SAS or just trying something new. Our New SAS User Community is a great place to get coding help, share ideas and best practices, or just lurk! Our SAS Communities have more than 200,000 members ready to help get you unstuck or share what they know.

Finally, Step 5 introduces you to product-specific resources to help develop your skills with your specific tools. Here you’ll find the latest product news, code samples, and step-by-step instructional resources to guide you through common tasks using your product of choice.

I hope you find the SAS Starter Kit a sweet addition to your SAS toolkit.

Five essential steps to getting started with SAS

Navigating the Candy Wall of SAS Support Resources was published on SAS Users.

December 4, 2018
 

When a Visual Analytics 8.3 report moves on a screen from one page to the next – all by itself, without a human hovering over a keyboard – you're seeing the Report Playback feature of SAS Visual Analytics Viewer 8.3 in action.

Reasons for using visual movement

Playable dashboards are easy to create and use. But let's ponder for a moment: Why would you want to set your report in motion? You might want it to scroll automatically:

  • At a kiosk or booth where folks linger for short periods of time.
  • During a presentation to an audience so you're hands-free. You decide how long each page displays and are free to focus on explaining key facts and figures in the moving report without the distraction of manually flipping through each page. Sort of like your car's cruise control: you take your foot off the pedal and the vehicle keeps going.

Design considerations for playable dashboards

If the intent is to let the report run on its own in a kiosk or a booth, be mindful that such environments require information to move fast. Those watching the playable dashboard expect to grasp key facts and figures quickly. Time is of the essence.

A short attention span benefits from a report design in which each report page contains one report object that quickly conveys the essence of the message in a few seconds. If you use a complex report design with multiple report objects and a small font, chances aren't good that your audience will absorb meaning from your report.

Any report object (for example, scatter plot) that requires your user to first look at the legend and then comprehend the data in the graph would be unsuitable for playable dashboards that are set to move at a fast rate, such as three or four seconds per page.

Example of a playable dashboard

I designed a report to illustrate carbon dioxide (CO2) emissions for 20 countries. I added five report objects that are easy to comprehend in about five seconds (a subjective estimate, of course). I also added a scatter plot and a geomap with legends that are challenging to comprehend, to illustrate why report objects with legends can be unsuitable for a playable dashboard!

For the scatter plot, the presenter would have to expand the legend tooltip to show the legends for the country data in that report object – not realistic in a fast-paced dashboard. In the geomap, the audience needs to look at the legends at the bottom (icons, colors, etc.) and associate that legend with the display in the graph. That’s a lot of brain activity for five seconds – unrealistic. It makes sense, then, to use report objects here that don’t depend on user comprehension of legends to understand the data.

Let the show begin!

When the scatter plot or geomap is displayed, notice how it’s hard to comprehend such report objects in five seconds. In such a short timeframe, it's impossible to process legends and the data, all at once.

How to create a playable dashboard in the web-based viewer

  1. In SAS Visual Analytics Viewer, I opened the report and chose Edit playback from the main menu.
  2. In the Edit Playback dialog, I chose the following options:

a. Transition unit – I can choose to display one page at a time or one object at a time. I chose to display one page at a time.

b. Seconds per unit – I chose to display each page for five seconds.

c. Show canvas only – I chose this option because it hides the report control area, page tabs, and page controls for a nicer look.

d. Show timer – This option would display a countdown for each page or object transition. I did not choose this option.

e. Show navigation controls for the report playback – I chose this option because it displays navigation controls in the bottom right corner of the viewer when I hover over the report with my mouse. Personally, I really like this feature because it gives me the flexibility to intervene and move the report pages forward or backward, pause the playback, or exit the playback.

Finally, I save and exit, and the playable dashboard begins to play on my monitor screen.

SAS® Visual Analytics on SAS® Viya® Try it for free!

How to create a playable dashboard with SAS Visual Analytics was published on SAS Users.