SAS Viya

6月 082018
 

Who doesn’t love a good makeover story? We know we do at BNL Consulting—and so do our customers. That’s why we’re thrilled with SAS Visual Analytics sophisticated new look and feel as well as its expanded functionality. SAS Visual Analytics takes advantage of the performance and scalability of the SAS Viya platform, providing a Business Intelligence framework that can work with massive amounts of data, bringing forward the powerful analytics that has made SAS the market leader in this space.

With SAS Viya, we get an even more unified platform with a consistent look and feel across applications which have been rewritten as zero client HTML5 web applications. The visualizations that are available in SAS Visual Analytics are embedded within the other tools as well. Now customers—from analysts to business executives—can point and click all from their web browser to get the information they need to make better business decisions right out of the box.

The newest version of SAS Visual Analytics provides direct access to an in-memory set of data as well. The process of loading data that can be made available to SAS Visual Analytics has been streamlined to give users fast access to information to create dashboards, reports and perform ad-hoc analysis. And because the application is so intuitive, users can be up and running in a matter of hours, not days.

In what has become a highly competitive market for business analytics—with BI tools being ubiquitous—SAS Visual Analytics running on SAS Viya provides a formidable means to analyze data, helping organizations tell a story—simply and easily. With SAS Visual Analytics on their side, customers will no longer need to reach outside the platform and make additional purchases. The unified platform is extremely compelling.

New features and benefits

SAS Visual Analytics customers will also benefit from a number of new features, including:

  • A new look and feel that provides drag and drop layout controls to size dashboard objects. New graph, chart, and map objects offer greater control of the properties to make them presentation-ready.
  • Objects on the screen can “talk” to each other, listening in on object events to filter accordingly in unison.
  • With no more LASR server, the process of loading and accessing data is much faster, with even better performance than the previous in-memory LASR server architecture.
  • There is now the ability to create custom HTML5/JavaScript objects, which can be hooked up to the data loaded in memory (data-driven content). HTML/JavaScript/D3 has become the standard in web development, opening up almost countless options for SAS to visualize data and turn it into useful information. Before, custom development was sometimes needed, but SAS Visual Analytics can handle a customer’s needs, all with point and click, drag and drop.

Next steps

When organizations use BI tools, they expect appealing visuals on top of rich data that can tell a story. SAS Visual Analytics on SAS Viya delivers in a big way, ushering in a new era for SAS BI users. We encourage customers to try out SAS Visual Analytics and see for themselves. We have found that a successful path is to start with small-scale proof of concepts, designed to satisfy a few critical uses. As you become more comfortable, expand to include more users and different types of projects.

With all the software options out there—both COTS and Open Source—we want customers to feel confident that SAS Visual Analytics for SAS Viya is the right choice. For more details, you can contact BNL at BNLConsulting.com.

SAS Visual Analytics on the new SAS platform was published on SAS Users.

6月 062018
 

Log Management in SAS ViyaLogs. They can be an administrator’s best friend or a thorn in their side. Who among us hasn’t seen a system choked to death by logs overtaking a filesystem? Thankfully, the chances of that happening with your SAS Viya 3.3 deployment is greatly reduced due to the automatic management of log files by the SAS Operations Infrastructure, which archives log files every day.

With a default installation of SAS Viya 3.3, log files older than three days will be automatically zipped up, deleted, and stored in /opt/sas/viya/config/var/log. This process is managed by the sas-ops-agent process on each machine in your deployment. According to SAS R&D, this results in up to a 95% compression rate on the overall log file requirements.

The task that archives log files is managed by the sas-ops-agent on each machine.  Running ./sas-ops tasks shows that the LogFileArchive process runs daily at 0400:

 {
 "version": 1,
 "taskName": "LogfileArchive",
 "description": "Archive daily",
 "hostType": "linux",
 "runType": "time",
 "frequency": "0000-01-01T04:00:00-05:00",
 "maxRunTime": "2h0m0s",
 "timeOutAction": "restart",
 "errorAction": "cancel",
 "command": "sas-archive",
 "commandType": "sas",
 "publisherType": "none"
 },

While the logs are zipped to reduce size, the zip files are stored locally so additional maintenance may be required to move the zipped files to another location. For example, my test system illustrates that the zip files are retained once created.

[sas@intviya01 log]$ ll
total 928
drwxr-xr-x. 3 sas sas 20 Jan 24 14:19 alert-track
drwxrwxr-x. 3 sas sas 20 Jan 24 16:13 all-services
drwxr-xr-x. 3 sas sas 20 Jan 24 14:23 appregistry
...
drwxr-xr-x. 4 sas sas 31 Jan 24 14:20 evmsvrops
drwxr-xr-x. 3 sas sas 20 Jan 24 14:27 home
drwxr-xr-x. 3 root sas 20 Jan 24 14:18 httpproxy
drwxr-xr-x. 3 sas sas 20 Jan 24 14:35 importvaspk
-rw-r--r--. 1 sas sas 22 Jan 26 04:00 log-20180123090001Z.zip
-rw-r--r--. 1 sas sas 22 Jan 27 04:00 log-20180124090000Z.zip
-rw-r--r--. 1 sas sas 22 Jan 28 04:00 log-20180125090000Z.zip
-rw-r--r--. 1 sas sas 10036 Jan 29 04:00 log-20180126090000Z.zip
-rw-r--r--. 1 sas sas 366303 Mar 6 04:00 log-20180303090000Z.zip
-rw-r--r--. 1 sas sas 432464 Apr 3 04:00 log-20180331080000Z.zip
-rw-r--r--. 1 sas sas 22 Apr 4 04:00 log-20180401080000Z.zip
-rw-r--r--. 1 sas sas 22 Apr 5 04:00 log-20180402080000Z.zip
-rw-r--r--. 1 sas sas 15333 Apr 6 04:00 log-20180403080000Z.zip
-rw-r--r--. 1 sas sas 21173 Apr 7 04:00 log-20180404080000Z.zip
-rw-r--r--. 1 sas sas 22191 Apr 8 04:00 log-20180405080000Z.zip
-rw-r--r--. 1 sas sas 21185 Apr 9 04:00 log-20180406080000Z.zip
-rw-r--r--. 1 sas sas 21405 Apr 10 04:00 log-20180407080000Z.zip
drwxr-xr-x. 3 sas sas 20 Jan 24 14:33 monitoring

If three days is too short of a time to retain logs, you can adjust the default timeframe for archiving logs by modifying the default task list for the sas-ops-agent on each machine.

Edit the tasks.json file to suit your needs and then issue a command to modify the task template for sas-ops-agent processes. For example, this will modify the log archive process to retain seven days of information:

...  
{
 "version": 1,
 "taskName": "LogfileArchive",
 "description": "Archive daily",
 "hostType": "linux",
 "runType": "time",
 "frequency": "0000-01-01T04:00:00-05:00",
 "maxRunTime": "2h0m0s",
 "timeOutAction": "restart",
 "errorAction": "cancel",
 "command": "sas-archive -age 7",
 "commandType": "sas",
 "publisherType": "none"
 },
...
 
$ ./sas-ops-agent import -tasks tasks.jason

Restart the sas-ops-agent processes on each of your machines and you will be good to go.

I hope you found this post helpful.

Additional Resources for Administrators

SAS Administrators Home Page
How-to Videos for Administrators
SAS Administration Community
SAS Administrators Blogs
SAS Administrator Training
SAS Administrator Certification

Log Management in SAS Viya 3.3 was published on SAS Users.

6月 022018
 

The SAS PlatformFor software users and SAS administrators, the question often becomes how to streamline their approach into the easiest to use system that most effectively completes the task at hand. At SAS Global Forum 2018, the topic of a “Big Red Button” was an idea that got audience members asking – is there a way to have just a few clicks complete all the stages of the software administration lifecycle? In this article, we review Sergey Iglov’s SAS Global Forum paper A ‘Big Red Button’ for SAS Administrators: Myth or Reality?” to get a better understanding of what this could look like, and how it could change administrators’ jobs for the better. Iglov is a director at SASIT Limited.

What is a “Big Red Button?”

With the many different ways the SAS Platform can be utilized, there is a question as to whether there is a single process that can control “infrastructure provisioning, software installation and configuration, maintenance, and decommissioning.” It has been believed that each of these steps has a different process; however, as Iglov concluded, there may be a way to integrate these steps together with the “Big Red Button.”

This mystery “button” that Iglov talked about would allow administrators to easily add or delete parts of the system and automate changes throughout; thus, the entire program could adapt to the administrator’s needs with a simple click.

Software as a System –SAS Viya and cloud based technologies

Right now, SAS Viya is compatible with the automation of software deployment processes through a centralized management. Right now, SAS Viya is compatible with a centralized automated deployment process. Through insights easily created and shared on the cloud, SAS Viya stands out, as users can access a centrally hosted control panel instead of needing individual installations.

Using CloudFormation by Amazon Web Services

At this point, the “Big Red Button” points toward systems such as CloudFormation. CloudFormation allows users of Amazon Web Services to lay out the infrastructure needed for their product visually, and easily make changes that will affect the software. As Iglov said, “Once a template is deployed using CloudFormation it can be used as a stack to simplify resources management. For example, when a stack is deleted all related resources are deleted automatically as well.”

Conclusion

Connecting to SAS Viya, CloudFormation can install and configure the system, and make changes. This would help SAS administrators adapt the product to their needs, in order to derive intelligence from data. While the future potential to use a one-click button is out there for many different platforms, using cloud based software and programs such as CloudFormation enable users to go through each step of SAS Platform’s administration lifecycle efficiently and effectively.

Additional Resources

SAS Viya Brochure
Sergey Iglov: "A 'Big Red Button' for SAS administrators: Myth or Reality?"

Additional SAS Global Forum 2018 talks of interest for SAS Administrators

A Programming Approach to Implementing SAS® Metadata-Bound Libraries for SAS® Data Set Encryption Deepali Rai, SAS Institute Inc.

Command-Line Administration in SAS® Viya®
Danny Hamrick, SAS

External Databases: Tools for the SAS® Administrator
Mathieu Gaouette, Prospective MG inc.

SAS® Environment Manager – A SAS® Viya® Administrator’s Swiss Army Knife
Michelle Ryals, Trevor Nightingale, SAS Institute Inc.

Troubleshooting your SAS® Grid Environment
Jason Hawkins, Amadeus Software Limited

Multi-Factor Authentication with SAS® and Symantec VIP
Jody Steadman, Mike Roda, SAS Institute Inc.

OpenID Connect Opens the Door to SAS® Viya® APIs
Mike Roda, SAS Institute Inc.

Understanding Security for SAS® Visual Analytics 8.2 on SAS® Viya®
Antonio Gianni, Faisal Qamar, SAS Institute Inc.

Latest and Greatest: Best Practices for Migrating to SAS® 9.4
Alec Fernandez, Leigh Fernandez, SAS Institute Inc.

Planning for Migration from SAS® 9.4 to SAS® Viya®
Don B. Hayes, DLL Consulting Inc.; Spencer Hayes, Cached Consulting LLC; Michael Shealy, Cached Consulting LLC; Rebecca Hayes, Green Peach Consulting Inc.

SAS® Viya®: Architect for High Availability Now and Users Will Thank You Later
Jerry Read, SAS Institute Inc.

Taming Change: Bulk Upgrading SAS® 9.4 Environments to a New Maintenance Release
Javor Evstatiev, Andrey Turlov

Is there a “Big Red Button” to use The SAS Platform? was published on SAS Users.

5月 222018
 

SAS ViyaSAS Viya Presentations is our latest extension of the SAS Platform and interoperable with SAS® 9.4. Designed to enable analytics to the enterprise, it seamlessly scales for data of any size, type, speed and complexity. It was also a star at this year’s SAS Global Forum 2018. In this series of articles, we will review several of the most interesting SAS Viya talks from the event. Our first installment reviews Hadley Christoffels’ talk, A Need For Speed: Loading Data via the Cloud.

You can read all the articles in this series or check out the individual interviews by clicking on the titles below:
Part 1: Technology that gets the most from the Cloud.


Technology that gets the most from the Cloud

Few would argue about the value the effective use of data can bring an organization. Advancements in analytics, particularly in areas like artificial intelligence and machine learning, allow organizations to analyze more complex data and deliver faster, more accurate results.

However, in his SAS Global Forum 2018 paper, A Need For Speed: Loading Data via the Cloud, Hadley Christoffels, CEO of Boemska, reminded the audience that 80% of an analyst’s time is still spent on the data. Getting insight from your data is where the magic happens, but the real value of powerful analytical methods like artificial intelligence and machine learning can only be realized when “you shorten the load cycle the quicker you get to value.”

Data Management is critical and still the most common area of investment in analytical software, making data management a primary responsibility of today’s data scientist. “Before you can get to any value the data has to be collected, has to be transformed, has to be enriched, has to be cleansed and has to be loaded before it can be consumed.”

Benefits of cloud adoption

The cloud can help, to a degree. According to Christoffels, “cloud adoption has become a strategic imperative for enterprises.” The advantages of moving to a cloud architecture are many, but the two greatest are elasticity and scalability.

Elasticity, defined by Christoffels, allows you to dynamically provision or remove virtual machines (VM), while scalability refers to increasing or decreasing capacity within existing infrastructure by scaling vertically, moving the workload to a bigger or smaller VM, or horizontally, by provisioning additional VM’s and distributing the application load between them.

“I can stand up VMs in a matter of seconds, I can add more servers when I need it, I can get a bigger one when I need it and a smaller one when I don’t, but, especially when it comes to horizontal scaling, you need technology that can make the most of it.” Cloud-readiness and multi-threaded processing make SAS® Viya® the perfect tool to take advantage of the benefits of “clouding up.”

SAS® Viya® can addresses complex analytical challenges and speed up data management processes. “If you have software that can only run on a single instance, then scaling horizontally means nothing to you because you can’t make use of that multi-threaded, parallel environment. SAS Viya is one of those technologies,” Christoffels said.

Challenges you need to consider

According to Christoffels, it’s important, when moving your processing to the cloud, that you understand and address existing performance challenges and whether it will meet your business needs in an agile manner. Inefficiencies on-premise are annoying; inefficiencies in the cloud are annoying and costly, since you pay for that resource.

It’s not the best use of the architecture to take what you have on premise and just shift it. “Finding and improving and eliminating inefficiencies is a massive part in cutting down the time data takes to load.”

Boemska, Christoffels’ company, has tools to help businesses find inefficiencies and understand the impact users have on the environment, including:

  1. Real-time diagnostics looking at CPU Usage, Memory Usage, SAS Workload, etc.
  2. Insight and comparison provides a historic view in a certain timeframe, essential when trying to optimize and shave off costly time when working in cloud.
  3. Utilization reports to better understand how the platform is used.

Optimizing inefficiencies with SAS Viya

But scaling vertically and horizontally from cloud-based infrastructure to speed the loading and data management process solves only part of the problem. Christoffels said SAS Viya capabilities completes the picture. SAS Viya offers a number of benefits in a Cloud infrastructure, Christoffels said. Code amendments that make use of the new techniques and benefits now available in SAS Viya, such as the multi-threaded DATA step or CAS Action Sets, can be extremely powerful.

One simple example of the benefits of SAS Viya, Christoffels said, is that with in-memory processing, PROC SORT is a procedure that’s no longer needed; SAS Viya does “grouping on the fly,” meaning you can remove sort routines from existing programs, which of itself, can cut down processing time significantly.

As a SAS Programmer, just the fact that SAS Viya can run multithreaded, the fact that you don’t have to do these sorts, the way it handles grouping on the fly, the fact that multithreaded nature and capability is built into how you deal with tables are all “significant,” according to Christoffels.

Conclusion

Data preparation and load processes have a direct impact on how applications can begin and subsequently complete. Many organizations are using the Cloud platform to speed up the process, but to take full advantage of the infrastructure you have to apply the right software technology. SAS Viya enables the full realization of Cloud benefits through performance improvements, such as the transposing of data and the transformation of data using the DATA step or CAS Action Sets.

Additional Resources

SAS Global Forum Video: A Need For Speed: Loading Data via the Cloud
SAS Global Forum 2018 Paper: A Need For Speed: Loading Data via the Cloud
SAS Viya
SAS Viya Products


Read all the posts in this series.

Part 1: Technology that gets the most from the Cloud

Technology that gets the most from the Cloud was published on SAS Users.

5月 082018
 

SAS reporting tools for GDPR and other privacy protection lawThe European Union’s General Data Protection Regulation (GDPR) taking effect on 25 May 2018 pertains not only to organizations located within the EU; it applies to all companies processing and holding the personal data of data subjects residing in the European Union, regardless of the company’s location.

If the GDPR acronym does not mean much to you, think of the one that does – HIPAA, FERPA, COPPA, CIPSEA, or any other that is relevant to your jurisdiction – this blog post is equally applicable to all of them.

The GDPR prohibits personal data processing revealing such individual characteristics as race or ethnic origin, political opinions, religious or philosophical beliefs, trade union membership, as well as the processing of genetic data, biometric data for the purpose of uniquely identifying a natural person, data concerning health, and data concerning a natural person’s sex life or sexual orientation. It also has special rules for data relating to criminal convictions or offenses and the processing of children’s personal data.

Whenever SAS users produce reports on demographic data, there is always a risk of inadvertently revealing personal data protected by law, especially when reports are generated automatically or interactively via dynamic data queries. Even for aggregate reports there is a high potential for such exposure.

Suppose you produce an aggregate cross-tabulation report on a small demographic group, representing a count distribution by students’ grade and race. It is highly probable that you can get the count of 1 for some cells in the report, which will unequivocally identify persons and thus disclose their education record (grade) by race. Even if the count is not equal to 1, but is equal to some other small number, there is still a risk of possible deducing or disaggregating of Personally Identifiable Information (PII) from surrounding data (other cells, row and column totals) or related reports on that small demographic group.

The following are the four selected SAS tools that allow you to take care of protecting personal data in SAS reports by suppressing counts in small demographic group reports.

1. Automatic data suppression in SAS reports

This blog post explains the fundamental concepts of data suppression algorithms. It takes you behind the scenes of the iterative process of complementary data suppression and walks you through SAS code implementing a primary and secondary complementary suppression algorithm. The suppression code uses BASE SAS – DATA STEPs, SAS macros, PROC FORMAT, PROC MEANS, and PROC REPORT.

2. Implementing Privacy Protection-Compliant SAS® Aggregate Reports

This SAS Global Forum 2018 paper solidifies and expands on the above blog post. It walks you through the intricate logic of an enhanced complementary suppression process, and demonstrates SAS coding techniques to implement and automatically generate aggregate tabular reports compliant with privacy protection law. The result is a set of SAS macros ready for use in any reporting organization responsible for compliance with privacy protection.

3. In SAS Visual Analytics you can create derived data items that are aggregated measures.  SAS Visual Analytics 8.2 on SAS Viya introduces a new Type for the aggregated measures derived data items called Data Suppression. Here is an excerpt from the documentation on the Data Suppression type:

“Obscures aggregated data if individual values could easily be inferred. Data suppression replaces all values for the measure on which it is based with asterisk characters (*) unless a value represents the aggregation of a specified minimum number of values. You specify the minimum in the Suppress data if count less than parameter. The values are hidden from view, but they are still present in the data query. The calculation of totals and subtotals is not affected.

Some additional values might be suppressed when a single value would be suppressed from a subgroup. In this case, an additional value is suppressed so that the suppressed value cannot be inferred from totals or subtotals.

A common use of suppressed data is to protect the identity of individuals in aggregated data when some crossings are sparse. For example, if your data contains testing scores for a school district by demographics, but one of the demographic categories is represented only by a single student, then data suppression hides the test score for that demographic category.

When you use suppressed data, be sure to follow these best practices:

  • Never use the unsuppressed version of the data item in your report, even in filters and ranks. Consider hiding the unsuppressed version in the Data pane.
  • Avoid using suppressed data in any object that is the source or target of a filter action. Filter actions can sometimes make it possible to infer the values of suppressed data.
  • Avoid assigning hierarchies to objects that contain suppressed data. Expanding or drilling down on a hierarchy can make it possible to infer the values of suppressed data.”

This Data Suppression type functionality is significant as it represents the first such functionality embedded directly into a SAS product.

4. Is it sensitive? Mask it with data suppression

This blog post provides an example of using the above Data Suppression type aggregated measures derived data items in SAS Visual Analytics.

We need your feedback!

We want to hear from you.  Is this blog post useful? How do you comply with GDPR (or other Privacy Law of your jurisdiction) in your organization? What SAS privacy protection features would you like to see in future SAS releases?

SAS tools for GDPR privacy compliant reporting was published on SAS Users.

5月 082018
 

SAS reporting tools for GDPR and other privacy protection lawThe European Union’s General Data Protection Regulation (GDPR) taking effect on 25 May 2018 pertains not only to organizations located within the EU; it applies to all companies processing and holding the personal data of data subjects residing in the European Union, regardless of the company’s location.

If the GDPR acronym does not mean much to you, think of the one that does – HIPAA, FERPA, COPPA, CIPSEA, or any other that is relevant to your jurisdiction – this blog post is equally applicable to all of them.

The GDPR prohibits personal data processing revealing such individual characteristics as race or ethnic origin, political opinions, religious or philosophical beliefs, trade union membership, as well as the processing of genetic data, biometric data for the purpose of uniquely identifying a natural person, data concerning health, and data concerning a natural person’s sex life or sexual orientation. It also has special rules for data relating to criminal convictions or offenses and the processing of children’s personal data.

Whenever SAS users produce reports on demographic data, there is always a risk of inadvertently revealing personal data protected by law, especially when reports are generated automatically or interactively via dynamic data queries. Even for aggregate reports there is a high potential for such exposure.

Suppose you produce an aggregate cross-tabulation report on a small demographic group, representing a count distribution by students’ grade and race. It is highly probable that you can get the count of 1 for some cells in the report, which will unequivocally identify persons and thus disclose their education record (grade) by race. Even if the count is not equal to 1, but is equal to some other small number, there is still a risk of possible deducing or disaggregating of Personally Identifiable Information (PII) from surrounding data (other cells, row and column totals) or related reports on that small demographic group.

The following are the four selected SAS tools that allow you to take care of protecting personal data in SAS reports by suppressing counts in small demographic group reports.

1. Automatic data suppression in SAS reports

This blog post explains the fundamental concepts of data suppression algorithms. It takes you behind the scenes of the iterative process of complementary data suppression and walks you through SAS code implementing a primary and secondary complementary suppression algorithm. The suppression code uses BASE SAS – DATA STEPs, SAS macros, PROC FORMAT, PROC MEANS, and PROC REPORT.

2. Implementing Privacy Protection-Compliant SAS® Aggregate Reports

This SAS Global Forum 2018 paper solidifies and expands on the above blog post. It walks you through the intricate logic of an enhanced complementary suppression process, and demonstrates SAS coding techniques to implement and automatically generate aggregate tabular reports compliant with privacy protection law. The result is a set of SAS macros ready for use in any reporting organization responsible for compliance with privacy protection.

3. In SAS Visual Analytics you can create derived data items that are aggregated measures.  SAS Visual Analytics 8.2 on SAS Viya introduces a new Type for the aggregated measures derived data items called Data Suppression. Here is an excerpt from the documentation on the Data Suppression type:

“Obscures aggregated data if individual values could easily be inferred. Data suppression replaces all values for the measure on which it is based with asterisk characters (*) unless a value represents the aggregation of a specified minimum number of values. You specify the minimum in the Suppress data if count less than parameter. The values are hidden from view, but they are still present in the data query. The calculation of totals and subtotals is not affected.

Some additional values might be suppressed when a single value would be suppressed from a subgroup. In this case, an additional value is suppressed so that the suppressed value cannot be inferred from totals or subtotals.

A common use of suppressed data is to protect the identity of individuals in aggregated data when some crossings are sparse. For example, if your data contains testing scores for a school district by demographics, but one of the demographic categories is represented only by a single student, then data suppression hides the test score for that demographic category.

When you use suppressed data, be sure to follow these best practices:

  • Never use the unsuppressed version of the data item in your report, even in filters and ranks. Consider hiding the unsuppressed version in the Data pane.
  • Avoid using suppressed data in any object that is the source or target of a filter action. Filter actions can sometimes make it possible to infer the values of suppressed data.
  • Avoid assigning hierarchies to objects that contain suppressed data. Expanding or drilling down on a hierarchy can make it possible to infer the values of suppressed data.”

This Data Suppression type functionality is significant as it represents the first such functionality embedded directly into a SAS product.

4. Is it sensitive? Mask it with data suppression

This blog post provides an example of using the above Data Suppression type aggregated measures derived data items in SAS Visual Analytics.

We need your feedback!

We want to hear from you.  Is this blog post useful? How do you comply with GDPR (or other Privacy Law of your jurisdiction) in your organization? What SAS privacy protection features would you like to see in future SAS releases?

SAS tools for GDPR privacy compliant reporting was published on SAS Users.

4月 272018
 

Analyzing ticket sales and customer data for large sports and entertainment events is a complex endeavor. But SAS Visual Analytics makes it easy, with location analytics, customer segmentation, predictive artificial intelligence (AI) capabilities – and more. This blog post covers a brief overview of these features by using a fictitious event company [...]

Analyze ticket sales using location analytics and customer segmentation in SAS Visual Analytics was published on SAS Voices by Falko Schulz

4月 262018
 

A future of flying cars and Minority Report-styled predictive dashboards may still be some time away, but the possibilities of robotics and Artificial Intelligence (AI)-powered automation are a reality today. From connected cars to smart homes and offices, we see daily how big data and the Internet of Things (IoT) [...]

IoT use cases on display at new innovation centre was published on SAS Voices by Randy Goh

4月 192018
 

A very common coding technique SAS programmers use is identifying the largest value for a given column using DATA Step BY statement with the DESCENDING option. In this example I wanted to find the largest value of the number of runs (nRuns) by each team in the SASHELP.BASEBALL dataset. Using a SAS workspace server, one would write:

DESCENDING BY Variables in SAS Viya

Figure 1. Single Threaded DATA Step in SAS Workspace Server

Figure 2 shows the results of the code we ran in Figure 1:

Figure 2. Result from SAS Code Displayed in Figure 1

To run this DATA Step distributed we will leveraging the SAS® Cloud Analytic Services in SAS® Viya™. Notice in Figure 3, there is no need for the PROC SORT step which is required when running DATA Step single threaded in a SAS workspace server. This is because SAS® Cloud Analytic Services in SAS® Viya™ . Instead we will use

Figure 3. Distributed DATA Step in SAS® Cloud Analytic Services in SAS® Viya™

Figure 4 shows the results when running distributed DATA Step in SAS® Cloud Analytic Services in SAS® Viya™.

Figure 4. Results of Distributed DATA Step in SAS® Cloud Analytic Services in SAS® Viya™

Conclusion

Until the BY statement running in the SAS® Cloud Analytic Services in SAS® Viya™ supports DESCENDING use this technique to ensure your DATA Step runs distributed.

Read more SAS Viya posts.

Read our SAS 9 to SAS Viya whitepaper.

How to Simulate DESCENDING BY Variables in DATA Step Code that Runs Distributed in SAS® Viya™ was published on SAS Users.

4月 142018
 

SAS Visual Forecasting 8.2 effectively models and forecasts time series in large scale. It is built on SAS Viya and powered by SAS Cloud Analytic Services (CAS).

In this blog post, I will build a Visual Forecasting (VF) Pipeline, which is a process flow diagram whose nodes represent tasks in the VF Process.  The objective is to show how to perform the full analytics life cycle with large volumes of data: from accessing data and assigning variable roles accurately, to building forecasting models, to select a champion model and overriding the system generated forecast. In this blog post I will use 1,337 time series related to a chemical company and will illustrate the main steps you would use for your own applications and datasets.

In future posts, I will work in the Programming application, a collection of SAS procedures and CAS actions for direct coding or access through tasks in SAS Studio, and will develop and assess VF models via Python code.

In a VF pipeline, teams can easily save forecast components to the Toolbox to later share in this collaborative environment.

Forecasting Node in Visual Analytics

This section briefly describes what is available in SAS Visual Analytics, the rest of the blog discusses SAS Visual Forecasting 8.2.

In SAS Visual Analytics the Forecast object will select the model that best fits the data out of these models: ARIMA, Damped trend exponential smoothing, Linear exponential smoothing, Seasonal exponential smoothing, Simple exponential smoothing, Winters method (additive), or Winters method (multiplicative). Currently there are no diagnostic statistics (MAPE, RMSE) for the model selected.

You can do “what-if analysis” using the Scenario Analysis and Goal Seeking functionalities. Scenario Analysis enables you to forecast hypothetical scenarios by specifying the future values for one or more underlying factors that contribute to the forecast. For example, if you forecast the profit of a company, and material cost is an underlying factor, then you might use scenario analysis to determine how the forecasted profit would change if the material cost increased by 10%. Goal Seeking enables you to specify a target value for your forecast measure, and then determine the values of underlying factors that would be required to achieve the target value. For example, if you forecast the profit of a company, and material cost is an underlying factor, then you might use Goal Seeking to determine what value for material cost would be required to achieve a 10% increase in profit.

Another neat feature in SAS Visual Analytics is that one can apply different filters to the final forecast. Filters are underlying factors or different levels of the hierarchy, and the resulting plot incorporates those filters.

Data Requirements for Forecasting

There are specific data requirements when working in a forecasting project, a time series dataset that contains at least two variables: 1) the variable that you want to forecast which is known as the target of your analysis, for example, Revenue and 2) a time ID variable that contains the time stamps of the target variable. The intervals of this variable are regularly spaced. Your time series table can contain other time-varying variables. When your time series table contains more than one individual series, you might have classification variables, as shown in the photo below. Distribution Center and Product Key are classification variables. Optionally, you can designate numerical variables (ex: Discount) as independent variables in your models.

You also have the option of adding a table of attributes to the time series table. Attributes are categorical variables that define qualities of the time series. Attribute variables are similar to BY variables, but are not used to identify the series that you want to forecast. In this post, the data I am using includes Distribution Center, Supplier Name, Product Type, Venue and Product Category. Notice that the attributes are time invariant, and that the attribute table is much smaller than the time series table.

The two data sets (SkinProduct and SkinProductAttributes) used in this blog contain 1,337 time series related to a chemical company. This picture shows a few rows of the two data sets used in this post, note that DATE intervals are regularly spaced weeks. The SkinProduct dataset is referred as the Time Series table in the SkinProductAttributes dataset as the Attribute Data.

Introduction SAS Visual Forecasting 8.2

Developing Models in SAS Visual Forecasting 8.2

Step One: Create a Forecasting Project and Assign Variables

From the SAS Home menu select the action Build Models that will take you to SAS Model Studio, where you select New Project, and enter 1) the Name of your project, 2) the Type of project, make sure you enter “Forecasting” and 3) the data source.

Introduction SAS Visual Forecasting 8.2

In the Data tab,  assign the variables roles by using the icon in the upper right corner.

The roles assigned in this post are typical of role assignments in forecasting projects. Notice these variables are in the Time Series table. Also, notice that the classification variables are ordered from highest to lowest hierarchy:

Time Variable: Date
Dependent Variable: Revenue
Classification Variables: Distribution Center and Product Key

In the Time Series table, you might have additional variables you’d like to assign the role “independent” that should be considered for model generation. Independent variables are the explanatory, input, predictor, or causal variables that can be used to model and forecast the dependent variable. In this post, the variable “Discount” is assigned the role “independent”. To do this assignment: right click on the variable, and select Edit Variables.

To bring in the second dataset with the Attribute variables, follow the steps in this photo:

 

Step two: Automated Modeling with Pipelines

The objective in this step is to select a champion model.  Working in the Pipelines tab, one explores the time series plots, uses the code editor to modify the default model, adds a 2nd model to the pipeline, compares the models and selects a champion model.

After step one is completed, select the Pipelines Tab

The first node in the VF pipelines is the Data node. After right-clicking and running this node, one can see the time series by selecting Explore Time Series. Notice that one can filter by the attribute variables, and that the table shows the exact historical data values.

Auto-forecasting is the next node in the default pipeline. Remember that we are modeling 1,332 time series. For each time series, the Auto-forecasting node automatically diagnoses the statistical characteristics of the time series, generates a list of appropriate time series models, automatically selects the model, and generates forecasts. This node evaluates and selects for each time series the ARIMAX and exponential smoothing models.

One can customize and modify the forecasting models (except the Hierarchical Forecasting model) by editing the model’s code. For example, to add the class of models for intermittent demand to the auto- forecasting node, one could open the code editor for that node and replace these lines
rc = diagspec.setESM();
rc = diagspec.setARIMAX();
with:
rc = diagspec.setIDM();

To open the code editor see photo below. After changes, save code and close the editor.

At this point, you can run the Auto Forecasting node, and after looking at its results, save it to the toolbox, so the editing changes are saved and later reused or shared with team members.

By expanding the Nodes pane and the Forecasting Modeling pane on the left, you can select from several models and add a 2nd modeling node to the pipeline

The next photo shows a pipeline with the Naïve Forecast as the second model. It was added to the pipeline by dropping its node into the parent node (data node). This is the resulting pipeline:

After running the Model Comparison node, compare the WMAE (Weighted Mean Absolute Error) and WMAPE (Weighted Mean Absolute Percent Error) and select a champion model.

You can build several pipelines using different model strategies. In order to select a champion model from all the models developed in the pipelines one uses the Pipeline Comparison tab.

Before you work on any overrides for your forecasting project, you need to make sure that you are working with the best pipeline and modeling node for your data. SAS Visual Forecasting selects the best fit model in each pipeline. After each pipeline is run, the champion pipeline is selected based on the statistics of fit that you chose for the selection criteria. If necessary, you can change the selected champion pipeline.

Step Three: Overrides

The Overrides tab is used to manually adjust the forecasts in the future. For example, if you want to account for some promotions that your company and its competitors are running and that are not captured by the models.

The Overrides module allows users to select subsets of time series at the aggregate level by selecting attribute values in the attribute table that you defined in the data tab. The filters based on the attributes are highly customizable and do not restrict you to use the hierarchy that was used for the modeling. The section of a filter using attribute is often referred to as faceted search. Whenever you create a new filter based on a selection of values of the attributes (also known as facets), the aggregate for all series that match the facets will be displayed on your main panel.

There is a wealth of information in the Overrides overview: 1) a list of the BY variables, as well as attribute variables, available to use as filters.

2) a Plot of the Time Series Aggregation and Overrides displaying historical and forecast data, and

3) a Forecast and Overrides table which can be used to create, edit and submit, override values for a time series based on external factors that are not included in the forecast models

Conclusion

Using SAS Visual Forecasting 8.2 you can effectively model and forecast time series in large scale. The Visual Forecasting Pipeline greatly facilitates the automatic forecasting of large volumes of data, and provides a structured and robust method with efficient and flexible processes.

References

An introduction to SAS Visual Forecasting 8.2 was published on SAS Users.