Tech

October 22, 2019
 

PROC SQL

PROC SQL is a very powerful ANSI 92 compliant version of SQL that also allows us to leverage many unique SAS capabilities. Recently I was asked if the PROC SQL in Figure 1 could be refactored into PROC FedSQL so it could run faster by leveraging SAS Viya’s in-memory engine CAS (SAS® Cloud Analytic Services). I was struggling to find a way to refactor this into PROC FedSQL, so I reached out to the SAS Jedi (aka Mark Jordan) for help.

/* Original SQL Statements */
proc sql;
   create table BenchMark as
     select count(*) as ItemCount
     /* the comparison evaluates to 1 (true) or 0 (false), so SUM counts the true rows */
     , sum( abs( nhits - nruns ) < 0.1*natbat )   as DIFF_10
     from sashelp.baseball;
quit;

Figure 1. Original PROC SQL

In Figure 2, we can review the SAS Log of our PROC SQL code.

  • It is line 77 that we want to refactor into PROC FedSQL so we can benefit from the performance improvements of running that code distributed in CAS.
  • On line 77, we use the alias DIFF_10 to name the new column, which is calculated by the two SAS functions SUM and ABS.
  • The expression on line 77 will cause SQL to return a value of 1 if the condition is true and a value of 0 if the condition is false.
  • The alias DIFF_10 will contain the summation of the value returned by the condition (i.e. 0 or 1) for all rows in our data set SASHELP.BASEBALL.
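The counting idiom described above can be tried outside SAS. The following is a minimal sketch in Python, assuming SQLite as a stand-in engine (like PROC SQL, it evaluates a comparison to 1 or 0). The four rows are made-up values, not the contents of SASHELP.BASEBALL.

```python
import sqlite3

# Hypothetical sample rows (nhits, nruns, natbat); NOT the real SASHELP.BASEBALL data.
rows = [(100, 95, 400), (150, 60, 500), (80, 79, 300), (120, 40, 450)]

conn = sqlite3.connect(":memory:")
conn.execute("create table baseball (nhits real, nruns real, natbat real)")
conn.executemany("insert into baseball values (?, ?, ?)", rows)

# The comparison yields 1 or 0 per row, so SUM counts the rows where it holds.
item_count, diff_10 = conn.execute(
    "select count(*), sum(abs(nhits - nruns) < 0.1 * natbat) from baseball"
).fetchone()

print(item_count, diff_10)
```

Only the first and third sample rows satisfy the condition, so DIFF_10 is the count of those rows.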

In Figure 5, we can review the results of our PROC SQL statement.

Figure 2. SAS Log of PROC SQL

PROC FedSQL

PROC FedSQL is ANSI 99 compliant without any of the unique SAS capabilities that PROC SQL contains, but PROC FedSQL is CAS enabled, which allows us to leverage SAS Viya’s distributed computing capabilities to improve run-times. Figure 3 is the refactored PROC FedSQL code that the SAS Jedi came up with.

/* PROC FedSQL code */
proc fedsql sessref=casauto; 
   create table BenchMark as
     select count(*) as ItemCount
     , sum(case 
           when (abs (nhits - nruns ) < (0.1*natbat)
                ) is true then 1 end 
          ) as DIFF_10
     from baseball;
quit;

Figure 3. CAS-enabled PROC FedSQL

Figure 4 contains the SAS Log of our CAS-enabled PROC FedSQL.

  • Notice that on line 77 we added a CASE expression inside the SUM function for our alias DIFF_10.
  • On lines 78-79, the WHEN clause returns a value of 1 when the condition is true. Because the CASE expression has no ELSE clause, it returns a null value when the condition is false, and SUM ignores null values.
  • The alias DIFF_10 will contain the summation of the values returned by the CASE expression, that is, the number of rows in our CAS table BASEBALL for which the condition is true.
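To see how the CASE form behaves with and without an ELSE branch, here is a small sketch in Python using SQLite rather than FedSQL. The rows and the natbat filter are invented for illustration, and the FedSQL-specific IS TRUE test is left out because it is not needed in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table baseball (nhits real, nruns real, natbat real)")
# Hypothetical sample rows; NOT the real baseball table.
conn.executemany(
    "insert into baseball values (?, ?, ?)",
    [(100, 95, 400), (150, 60, 500), (80, 79, 300)],
)

# First column: CASE without ELSE (as in Figure 3); second column: CASE with ELSE 0.
QUERY = """
select sum(case when abs(nhits - nruns) < 0.1 * natbat then 1 end),
       sum(case when abs(nhits - nruns) < 0.1 * natbat then 1 else 0 end)
from baseball
where natbat >= ?
"""

some_match = conn.execute(QUERY, (0,)).fetchone()    # all three rows are scanned
no_match = conn.execute(QUERY, (450,)).fetchone()    # only the non-matching row is scanned

print(some_match, no_match)
```

Both forms agree whenever at least one row satisfies the condition. They differ only when no row does: the form without ELSE returns a null value, while the form with ELSE 0 returns 0.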

In Figure 5 we can review the results of our PROC FedSQL statement.

Figure 4. SAS log of PROC FedSQL code

Figure 5. Validation that the values from the refactoring of PROC SQL into PROC FedSQL match

Conclusion

As we adopt SAS Viya, a goal is to leverage CAS to speed up the processing of routines written in PROC SQL. To accomplish this, refactor PROC SQL code into PROC FedSQL code. For PROC SQL that cannot be refactored, simply run that PROC SQL code as-is in SAS Viya’s SAS Programming Run-time Environment (SPRE).

SAS® Viya®: How to Emulate PROC SQL Using CAS-Enabled PROC FedSQL was published on SAS Users.

October 21, 2019
 

Image by rawpixel from Pixabay

When my younger son grabs a book or a toy from his older siblings without permission, his line of defense is always the same: “Sharing is caring!” Our kids' schools teach and reinforce this philosophy. Likewise, our family has rules to ensure peaceful, orderly sharing.

Similarly, many organizations value collaboration. They encourage researchers, data owners, data scientists and business analysts to share work product and ideas and facilitate it among their teams. As with families, they often find it easier said than done.

A big part of my job is to meet customers and advise them on how SAS technology can help solve their business challenges. A recurring topic has been around SAS Viya, the analytics capabilities collectively known as the SAS® Platform. I emphasize how SAS Viya seamlessly enables collaboration across diverse users and teams.

SAS Viya collaboration use case with Commitments of Traders data

How does it work in real life? Here is an example to demonstrate how SAS coders and business analysts can easily collaborate on SAS Viya. I am using a publicly available data set known as Commitments of Traders (COT) that the U.S. Commodity Futures Trading Commission (CFTC) publishes on its website (https://www.cftc.gov/MarketReports/CommitmentsofTraders/index.htm). Traders and researchers closely watch and analyze this data set for trends and price movements in the commodities market.

A SAS programmer readies the data

Figure 1: Drop-down menu

First, I need to bring the original COT file, saved in ‘.txt’ format, to my enterprise SAS environment. On SAS Viya, I have a choice of using a programmatic or graphical user interface (GUI) approach to import data and perform data wrangling/preparation. Both interfaces are easily accessible from a drop-down menu on SAS Drive, a web-based central hub for SAS Viya applications (see Figure 1). I choose the programmatic approach by clicking on Develop SAS Code from the drop-down menu.

SAS® Studio, the programming interface for SAS Viya, is a web-based development environment that includes code autocomplete, a library of frequently used code snippets, pre-built GUI wizards for numerous analytical routines, etc.

In SAS Studio, I prepare the data with five steps: 1) Import the raw text file; 2) Reduce the number of variables; 3) Compute the traders’ net position; 4) Import the mapping table; and 5) Add the commodity category variable from the mapping table. For all of these steps, I use SAS DATA Step and PROC statements that have been around for 40+ years – old school SAS.
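The five preparation steps can be pictured with a short sketch. The author's program uses SAS DATA step and PROC statements. The version below is only an illustration in Python, and every column name and value in it is hypothetical rather than taken from the real COT file.

```python
import csv
import io

# Hypothetical stand-ins for the raw COT text file and the mapping table.
RAW = """market,long_positions,short_positions,open_interest
WHEAT,1200,900,5000
GOLD,800,1100,4000
"""
MAPPING = """market,category
WHEAT,Agriculture
GOLD,Metals
"""

def prepare(raw_text, mapping_text):
    # Step 4: import the mapping table as a lookup from market to category.
    category = {
        row["market"]: row["category"]
        for row in csv.DictReader(io.StringIO(mapping_text))
    }
    curated = []
    # Step 1: import the raw text file.
    for row in csv.DictReader(io.StringIO(raw_text)):
        curated.append({
            # Step 2: keep only the variables we need.
            "market": row["market"],
            # Step 3: compute the traders' net position (long minus short).
            "net_position": int(row["long_positions"]) - int(row["short_positions"]),
            # Step 5: add the commodity category from the mapping table.
            "category": category.get(row["market"], "Unknown"),
        })
    return curated

curated = prepare(RAW, MAPPING)
print(curated)
```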

The last two steps in my program, Step 6 and Step 7, are unique to SAS Viya (see Figure 2 below). Those lines of code start a SAS Cloud Analytic Services (CAS) session and load the final curated table into CAS, an in-memory distributed analytical engine that I consider the “heart and soul” of SAS Viya. Once the data is in CAS, an authorized user of my enterprise SAS Viya environment can easily locate it in the CAS library/catalog for data exploration, modeling, or developing business intelligence content.

Figure 2: CAS

A business analyst creates and shares an interactive dashboard

Now it's the business analyst's turn. In this example, my teammate wants to build an interactive dashboard using the curated COT data that I (the programmer) loaded into CAS. The analyst will access SAS Drive, select the Explore and Visualize Data option from the menu, and be directed to SAS Visual Analytics, a web-based application that allows you to explore data and build point-and-click, interactive visualizations, no coding skills required.

Figure 3 and Figure 4 below show examples of the types of drillable dashboards and reports you can easily develop in SAS Visual Analytics with a few clicks. You can share the report internally via a web link, view it in MS Office products, or publish it on the intranet or an external-facing website.

Figure 3: COT Dashboard

 

Figure 4: Monthly Trend

Collaboration bridges diverse skill sets, fosters successful projects

Studies show that the success of any analytical project requires multi-disciplinary teams that include database administrators, data scientists, analysts, subject matter experts, management and IT support. SAS Viya helps them capitalize on their strengths to promote frictionless collaboration in a secure and controlled environment.

This post, focused on collaboration between SAS coders and business analysts, only scratches the surface of SAS Viya's collaboration and knowledge sharing capabilities. Likewise, open source coders (R and Python) and business analysts can collaborate on SAS Viya too.

Free trials below give programmers and business analysts a taste of what's possible with SAS Viya. Try one and tell us about it in the comments.

SAS® Visual Analytics on SAS® Viya® | Try it as a business analyst! SAS® Visual Analytics on SAS® Viya® | Try it as a programmer!

How SAS® Viya fosters collaboration was published on SAS Users.

October 19, 2019
 

Newcomers to SAS Viya Administration may appreciate these tried-and-tested patterns for securing folders, and the content within them (reports, data plans, models etc.). Nothing too fancy today; if you are new to security model design in SAS Viya, this post is for you.

It presents five patterns for permissions on folders, reports, data plans and the like, and shows how you can combine them and apply them to design a simple but effective security model (or authorization model) for a folder structure.

SAS Viya has two authorization systems

If you are just starting work with permissions in SAS Viya, then you should be aware that there are TWO authorization systems in SAS Viya:

  • The general authorization system is used to secure (manage access to) folders, reports, data plans, models and other content stored in SAS Viya’s database (the SAS Infrastructure Data Server, which uses PostgreSQL behind the scenes). It is also used to manage access to SAS Viya applications and some of their features.
  • The CAS authorization system is used to secure (manage access to) data held in or accessed through SAS Cloud Analytic Services (CAS).

There is no overlap between the two systems. If you are interested in the permissions on a CAS library or table, you use the CAS authorization system. For everything else stored inside SAS Viya, it’s the general authorization system. Click the links above for the documentation. It’s worth mentioning that SAS Viya does make use of the host operating system’s filesystem, and can also access data in other databases or in SAS 9, each of which has its own authorization system too. These are therefore important too, but they usually take up less of a SAS Administrator’s attention.

SAS Viya has a built-in custom group called SAS Administrators, which is granted broad administrator access to much of the SAS Viya system out of the box. It is not a good idea to make users members of SAS Administrators unless they will bear at least joint responsibility for the health and stability of your SAS Viya deployment. Departmental ‘power users’ who just need read and write access to all your users’ content should not be made members of your SAS Administrators group! It’s better to make a content administrators group (or equivalent) and grant that group more access than you grant to most users.

Editing folder permissions in SAS Environment Manager

In the Content page in SAS Environment Manager (and some other places too), you can edit the authorization settings for folders, reports and other objects stored by SAS Viya. For most objects, e.g. reports, data plans, models etc, the Edit Authorization dialog looks like this. I’ve overlaid a blue shape, intended to emphasize that all four permissions apply to the object itself:

The Edit Authorization dialog for a folder has two more permissions (add and remove), and also has a second set of permissions. Those on the left apply to the folder itself, while those on the right, with ‘(convey)’ in their names, apply to the folder’s contents (technically, they apply to the folder’s ‘container’ and are conveyed to its contents). Again I’ve overlaid blue shapes to emphasize this:

This allows you to easily set different permissions on the folder itself from those you set on the folder’s contents: you can prevent users from renaming, deleting or moving a folder, but allow them to edit everything inside it. That’s nice!

When securing folders and the content inside them in SAS Viya, three basic guidelines to follow are:

  • Secure folders, not individual objects (reports etc.) wherever possible. It’s simpler to manage, and easier to understand later. If two reports in the same folder should have different permissions, they should probably be in different folders.
  • Grant permissions to groups, not individual users. It’s simpler to manage, and easier to understand later. If there isn’t a group that contains the set of users you need, either make a Custom Group, ask your Active Directory or LDAP administrator to make a new group for you, or consider using a larger existing group if that will work without granting too much access to someone who should not have it.
  • Avoid prohibits, especially ‘(convey)’-ed prohibits. Try to design your authorization scheme or security model so that you grant permissions wherever you need them, to give just enough access to content, and not too much. If a group of users should have access to a folder, and some of its contents, don’t grant Read (convey) permission on the parent folder, and try to prohibit Read permission on the subfolder(s) that the group of users should not be able to see. That would initially work, but it’s poor design and will likely become an inconvenience later. Instead, grant the group Read access, not Read (convey), on the folder, and then selectively grant that group Read and Read (convey) on the contents that they should be able to see.

These guidelines give rise to some recurring patterns of access. Note that you won’t find these patterns anywhere in the software: there is no screen, or button, or command for defining and applying them. They are design concepts only. This post’s five simple general authorization patterns are as follows:

No Access

The default. SAS Viya does not give a user access to anything unless it is directly or indirectly granted.

This pattern is of course only effective if the user or group does not have access conveyed from a parent folder. So long as it does not, your masterly inactivity on this folder’s permissions ensures it too is not accessible to this user or group. If the folder does have access conveyed for the user or group in question from higher up the folder hierarchy, remove those conveyed permissions, and grant the permissions you require (whether conveyed or not) more selectively to subfolders or to a smaller group, so that the group in question does not get access they should not get.

<Group> Read

Grants a group permission to see this object only.

When a group should have read access to an object (ideally a folder), but not the object’s contents, apply this pattern for that group. To apply it, grant a specific group Read permission on this folder, on the left-hand side of the authorization grid (or if you are using a programmatic method to set permissions, on the object URI), and grant the group nothing on the right-hand side of the grid (or on the container URI).

The 'Group' in angled brackets is written that way because you might use variants of this pattern for several different groups: you may use an HR Read pattern, a Sales Read pattern, or a North American Finance Modelers Read pattern, as demanded by your requirements. Substitute the name of an actual group to turn the generalised, abstract pattern into a specific concrete pattern.

<Group> Read Convey

Grants a group permission to see this object and everything inside it.

To apply this pattern, grant a specific group Read permission on this folder, on the left-hand side of the authorization grid (or if you are using a programmatic method to set permissions, on the object URI), and also grant it the Read permission on the right-hand side of the grid (or on the container URI).

<Group> Edit Contents

Grants a group permission to make changes to all of the content of this folder. When you grant permissions in the Group Edit Contents pattern, you will usually also grant the permissions in the Group Read Convey pattern: users who edit things need to see them.

For this pattern, grant the group Add and Remove on the object itself, and grant it Update, Delete, Add and Remove on the object’s container. These permissions allow the group to add and remove objects from the folder, to modify the objects inside the folder, and to add and remove objects from any subfolders.

Usually, you wouldn’t apply this pattern on its own. If a group is allowed to edit the contents of a folder, they usually need to be able to see the folder too. So, apply this Group Edit Contents pattern plus the Group Read Convey pattern together, for the same group on the same object:

<Group> Secure

Grants a group permission to add, remove, or change permissions on this object.

To apply this pattern, grant a specific group Secure on this folder, and Secure (convey) on the container.

When you grant permissions in the Group Secure pattern, you will usually also grant the permissions in the Group Read Convey and Group Edit Contents patterns too: users who manage objects’ permissions are generally super-users, content administrators or similar, and need to be able to see and edit the things they are allowed to secure:
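Because these patterns are design concepts rather than software features, it can help to write them down as plain data. The sketch below is one hypothetical way to do that in Python. The dictionary layout and the combine helper are illustrative assumptions, not a SAS Viya API. The permission names come from the pattern descriptions above.

```python
# Each pattern lists the permissions granted on the object itself (left-hand
# side of the authorization grid) and on its container (the "(convey)" side).
PATTERNS = {
    "No Access": {"object": set(), "container": set()},
    "Read": {"object": {"Read"}, "container": set()},
    "Read Convey": {"object": {"Read"}, "container": {"Read"}},
    "Edit Contents": {
        "object": {"Add", "Remove"},
        "container": {"Update", "Delete", "Add", "Remove"},
    },
    "Secure": {"object": {"Secure"}, "container": {"Secure"}},
}

def combine(*pattern_names):
    """Union the grants of several patterns for one group on one folder."""
    grants = {"object": set(), "container": set()}
    for name in pattern_names:
        for side in grants:
            grants[side] |= PATTERNS[name][side]
    return grants

# A content administrators group typically gets all three patterns together.
content_admin = combine("Read Convey", "Edit Contents", "Secure")
print(sorted(content_admin["object"]), sorted(content_admin["container"]))
```

Writing the patterns this way makes it easy to review, for each group and folder, exactly which grants a combination of patterns implies before applying them in SAS Environment Manager.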

Using the patterns

We have seen five simple regularly-occurring general authorization system permissions patterns, and a couple of ways in which some of them are usually combined. You may have instances of each of these patterns for each of several different groups in a security model design. Let’s see how they can be applied to a fictitious folder structure for the made-up organization we use in some of our GEL Administration workshops, GELCorp:

You may notice that the symbols in this table’s cells correspond to the symbols for the patterns presented above. The white symbols show patterns of permissions that should be granted to the group in the column header, on the folder in the row header. The grey symbols show inherited permissions, to remind us that they are in effect on a folder, but do not need to be directly granted on it again.

I’d encourage you to attend one of our GEL SAS Viya Administration workshops, or the Securing SAS Viya Deployments workshop, both available in the VLE or as a face-to-face workshop, for a fuller explanation of why we’ve applied the permissions patterns in this way. This security model design is certainly not the only way you could choose to secure GELCorp’s folders. But I expect you can see, at a glance, how the folders are intended to be secured, even if it is not always obvious from this diagram why. Plus, if you were to follow this design for each group and each folder where a symbol appears, you would get exactly the same set of permissions we apply in our workshop.

Hopefully you would agree that this is quite an easy way to represent the basic permissions design for a set of folders, for multiple groups!

Simple general authorization patterns was published on SAS Users.

October 17, 2019
 

CAS Table

SAS Viya’s in-memory tables are referred to as CAS tables and are accessed using a CAS engine. In this post, we will explore how one can parallel load and compress a CAS table in one pass of the CAS table.

Note: When you do not use this technique (i.e. you use PROC CASUTIL with a COMPRESS option), the loading of a CAS table is a single-threaded process, which is slower. Figure 1 contains the code, which can act as a template for you. To understand it, we will review the SAS Log of this code in Figure 2.

proc cas;
  file log;
  table.dropCaslib /
   caslib='sas7bdat' quiet = true;
  addcaslib /
    datasource={srctype="path"}
    name="sas7bdat"
    path="/viyafiles/sasss1/data"
  ; run;
 
  index /
    table={caslib="sas7bdat" name="cars.sas7bdat" singlepass=true}
    casout={caslib="sas7bdat" name="cars" compress=true replication=0}
  ; run;
  print _status; run;
 
  tabledetails /
    caslib="sas7bdat"
    name="cars"
  ; run;
quit;

Figure 1. Template of SAS Code to Parallel Load and Compress a CAS Table in One Pass of the CAS Table

To accomplish the parallel load and compression of a CAS table, we will leverage PROC CAS. Let’s review the SAS log in Figure 2:

  • Line 85 utilizes the FILE LOG statement to redirect the contents we would normally see in the results window directly to the SAS Log i.e. the information between lines 92 and 93 as well as lines 103 and 104. Note: using the FILE LOG statement is optional.
  • On lines 86-92 we are dropping and creating our CASLIB to point to the file system path that contains the SAS7BDAT data set that we want to load and compress, i.e. CARS.
  • On line 87 we added the option QUIET = TRUE to our statement. This is a very handy trick to avoid the ERROR message we get in the SAS Log in Figure 3. If you omit this option, an ERROR message will be produced if you have a brand-new CAS session and our SAS7BDAT CASLIB has not been defined to that session.
  • Lines 88-91 created our CASLIB to our SAS7BDAT data sets.
  • Lines 94-96 are the statements that accomplish the parallel load and compression of our CAS table.
  • On line 94 we use the INDEX statement which is always executed in parallel. Notice we are not creating any indexes in our example but simply using the INDEX statement to activate the parallel load.
  • On line 95 we identify the CASLIB pointing to our source data set CARS.SAS7BDAT. We are also using the option SINGLEPASS = TRUE which means our CAS table will be loaded and compressed as each thread adds a row to our CAS table.
  • On line 96 we are saving our CAS table to our CASLIB SAS7BDAT and naming it CARS. The COMPRESS = TRUE option ensures our CAS table will be compressed, and REPLICATION = 1 ensures our CAS table is replicated, i.e. 2 copies of the CAS table, to ensure high availability of the CAS table. Note that the template in Figure 1 specifies replication=0, which keeps no additional copy; set it to 1 if you want the replicated behavior described here.
  • Line 98 prints information to the SAS Log telling us that the table was loaded and compressed successfully, i.e. {severity=0,reason=0,,statusCode=0}.
  • Lines 101-103 provide information on our compressed CAS table. Reviewing this information, we can see our CAS table has a compression ratio of 5.

Figure 2. SAS Log to Parallel Load and Compress a CAS Table in One Pass of the CAS Table

 

Figure 3. To Avoid this ERROR Message, We Will Add to Line 76 the option QUIET = TRUE, See Line 87 in Figure 2.

Conclusion

When loading multiple CAS tables, it is common practice to compress them to help avoid paging the CAS tables to CAS_DISK_CACHE, because paging to CAS_DISK_CACHE impacts performance. In addition, one can parallel load and compress source tables from various formats, e.g. CSV files, Hadoop tables, and many relational databases such as Oracle and Teradata.

How to Parallel Load and Compress a SAS® Cloud Analytic Services (CAS) Table was published on SAS Users.

October 16, 2019
 

Introduction

Generating a word cloud (also known as a tag cloud) is a good way to mine internet text. Word (or tag) clouds visually represent the occurrence of keywords found in internet data such as Twitter feeds. In the visual representation, the importance of each keyword is denoted by the font size or font color.

You can easily generate word clouds by using the Python language. Now that Python has been integrated into the SAS® System (via the SASPy package), you can take advantage of the capabilities of both languages. That is, you create the word cloud with Python. Then you can use SAS to analyze the data and create reports. You must have SAS® 9.4 and Python 3 or later in order to connect to SAS from Python with SASPy. Developed by SAS, SASPy is a Python package that contains methods that enable you to connect to SAS from Python and to generate analysis in SAS.

Configuring SASPy

The first step is to configure SASPy. To do so, see the instructions in the SASPy Installation and configuration document. For additional details, see also the SASPy Getting started document and the API Reference document.

Generating a word cloud with Python

The example discussed in this blog uses Python to generate a word cloud by reading an open table from the data.world website that is stored as a CSV file. This file is from a simple Twitter analysis job where contributors commented via tweets as to how they feel about self-driving cars. (For this example, we're using data that are already scored for sentiment. SAS does offer text analytics tools that can score text for sentiment too -- see this example about rating conference presentations.) The sentiments were classified as very positive, slightly positive, neutral, slightly negative, very negative, and not relevant. (In the frequency results that are shown later, these sentiments are specified, respectively, as 1, 2, 3, 4, 5, and not_relevant.) This information is important to automakers as they begin to design more self-driving vehicles and as transportation companies such as Uber and Lyft are already adding self-driving cars to the road. Along with understanding the sentiments that people expressed, we are also interested in exactly what the contributors said. The word cloud gives you a quick visual representation of both. If you do not have the wordcloud package installed, you need to do that by submitting the following command:

pip install wordcloud

After you install the wordcloud package, you can obtain a list of required and optional parameters by submitting this command:

?wordcloud

Then, follow these steps:

  1. First, you import the packages that you need in Python that enable you to import the CSV file and to create and save the word-cloud image, as shown below.

  2. Create a Python Pandas dataframe from the twitter sentiment data that is stored as CSV data in the data file. (The data in this example is a cleaned-up subset of the original CSV file on the data.world website.)

  3. Use the following code, containing the HEAD() method, to display the first five records of the Sentiment and Text columns. This step enables you to verify that the data was imported correctly.

  4. Create a variable that holds all of the text in a single row of data that can be used in the generation of the word cloud.

  5. Generate the word cloud from the TEXTVAR variable that you create in step 4. Include any parameters that you want. For example, you might want to change the background color from black to white (as shown below) to enable you to see the values better. This step includes the STOPWORDS= parameter, which enables you to supply a list of words that you want to eliminate. If you do not specify a list of words, the parameter uses the built-in default list.

  6. Create the word-cloud image and modify it, as necessary.
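The six steps above can be sketched as follows. This is a minimal illustration, not the original post's code. The sample CSV text and its column names are made up, and the standard csv module stands in for the pandas dataframe so that the sketch runs without extra packages (with pandas, steps 2 and 3 would use read_csv and head). The rendering function assumes the wordcloud package is installed and is only defined here, not called.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical stand-in for the tweet CSV file (steps 1 and 2).
SAMPLE_CSV = """sentiment,text
3,Google driverless cars are on the road
2,I would love a self-driving Audi
3,Not sure driverless cars are safe
"""

def load_rows(csv_text):
    # Step 2: read the CSV data into a list of records.
    return list(csv.DictReader(io.StringIO(csv_text)))

def build_text(rows):
    # Step 4: hold all of the text in a single string.
    return " ".join(row["text"] for row in rows)

def word_frequencies(text, stopwords):
    # What a word cloud encodes: how often each non-stop word occurs.
    words = re.findall(r"[a-z'-]+", text.lower())
    return Counter(word for word in words if word not in stopwords)

def render_wordcloud(text, path="wordcloud.png"):
    # Steps 5 and 6: requires the third-party wordcloud package.
    from wordcloud import WordCloud, STOPWORDS
    cloud = WordCloud(background_color="white", stopwords=STOPWORDS).generate(text)
    cloud.to_file(path)
    return path

rows = load_rows(SAMPLE_CSV)
print(rows[:5])  # Step 3: inspect the first records.
textvar = build_text(rows)
freqs = word_frequencies(textvar, {"are", "on", "the", "i", "a", "would", "not"})
print(freqs.most_common(3))
```

Calling render_wordcloud(textvar) would write the image file. The frequency counter is included only to show what the font sizes in the image represent.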

Analyzing the data with SAS®

After you create the word cloud, you can further analyze the data in Python. However, you can actually connect to SAS from Python (using the SASPy API package), which enables you to take advantage of SAS software's powerful analytics and reporting capabilities. To see a list of all available APIs, see the API Reference.

The following steps explain how to use SASPy to connect to SAS.

  1. Import the SASPy package (API). Then create a SAS session object, as shown below.

  2. Create a SAS data set from the Python dataframe by using the DATAFRAME2SASDATA method. In the code below, that method is shown as the alias DF2SD.

  3. Use the SUBMIT() method to include SAS code that analyzes the data with the FREQ procedure. The code also uses the GSLIDE procedure to add the word cloud to an Adobe PDF file.

    When you submit the code, SAS generates the PDF file that contains the word-cloud image and a frequency analysis, as shown in the following output:
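As a rough sketch of step 3, the SAS program can be assembled as a Python string before it is handed to SASPy. The table name `work.tweets`, the column name `sentiment`, and the PDF path are hypothetical, and the GSLIDE step that adds the word-cloud image is left out. The SASPy calls appear only as comments because they need a configured SAS connection.

```python
def build_freq_program(table, column, pdf_path):
    """Return SAS code that writes a one-way frequency table to a PDF file."""
    return (
        f'ods pdf file="{pdf_path}";\n'
        f"proc freq data={table};\n"
        f"   tables {column};\n"
        "run;\n"
        "ods pdf close;\n"
    )


# Hypothetical table, column, and output path, used for illustration only.
sas_code = build_freq_program("work.tweets", "sentiment", "/tmp/sentiment.pdf")
print(sas_code)

# With a configured SASPy installation, the program could then be run as:
#   import saspy
#   sas = saspy.SASsession()
#   sas.df2sd(df, table="tweets")   # alias of dataframe2sasdata
#   result = sas.submit(sas_code)   # dictionary with "LOG" and "LST" entries
```

Building the program as a string first makes it easy to print and review the SAS code before it is submitted.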

Summary

As you can see from the responses in the word cloud, it seems that the contributors are quite familiar with Google driverless cars. Some contributors are also familiar with the work that Audi has done in this area. However, you can see that after further analysis (based on a subset of the data), most users are still unsure about this technology. That is, 74 percent of the users responded with a sentiment frequency of 3, which indicates a neutral view about driverless cars. This information should alert automakers that more education and marketing are required before they can bring self-driving cars to market. This analysis should also signal to companies such as Uber Technologies Inc. and Lyft, Inc. that consumers may need more information in order to feel secure with such technology.

Creating a word cloud using Python and SAS® software was published on SAS Users.

October 15, 2019
 

In a previous post, I discussed using logs to troubleshoot problems in your Viya environment. In this post, I will look at some additional ways to troubleshoot using some of the tools provided by the Viya Operations Infrastructure. With applications, servers and numerous micro-services all working together and generating their own logs in Viya, it can be difficult to find relevant logs. In order to manage the large number of logs and to enable you to locate messages of interest, the operations infrastructure provides components to collect and store log messages.

The collection process is illustrated in the diagram below.

Co-ordinated by the operations infrastructure:

  • sas-watch log continuously collects and sends log messages to the RabbitMQ exchange
  • sas-stream pulls the messages from RabbitMQ and writes them to disk as a tab-separated value (TSV) file
  • Every five minutes, the sas-ops-agentsrv runs the DatamartEtl task to extract log messages from the TSV file and load them into the VIYALOGS CAS-indexed search table

SAS Environment Manager uses the information in the VIYALOGS and VIYALOGS_SOURCES tables to display log messages and graphs that show the frequency and trends of messages. The Logs interface in SAS Environment Manager makes it easy to search and analyze log messages. Using the interface, you can view, subset and search logs. The filtering capabilities are on the left-hand side, and the messages are displayed on the right. By default, the filter is set to display all messages from all applications and services from the last 30 minutes.

You can modify the filter to extend or shorten the timeframe, subset the level of messages displayed or the source (service/application) that the messages are coming from. You can also search for any text within a message.

Many administrators would prefer a command-line interface, and the good news is there is one.

sas-ops is a command-line interface that lets you monitor the operations infrastructure in a SAS Viya deployment.

I have found the sas-ops logs command very useful for troubleshooting problems. The command streams log messages that are generated by SAS Viya applications and services. The messages can be streamed to a terminal window or redirected to a file. The sas-ops command is located in /opt/sas/viya/home/bin and can be run from any machine in a Viya environment that is included in the CommandLine host group.

When would you use sas-ops logs to stream log messages? Some potential scenarios are to:

  • troubleshoot a poorly performing report or analysis
  • debug problems in the environment such as logon issues
  • monitor access to resources

In these cases, using sas-ops logs you can stream the log messages from all services to a single file or terminal.

In its simplest form, the command live streams all log messages from a Viya environment to the terminal. Pressing Ctrl+C stops the streaming.

./sas-ops logs

Partial output from the stream is shown below.

If you want to save the output, you can redirect the stream to a file.

./sas-ops logs > /tmp/mylog.log

You can get more creative and achieve more complex tasks. You can change the format of the message output using --format. For example, to create a JSON file that could be read by another process, use:

./sas-ops logs --format pretty > mylogs.json

You can also:

  • stream messages for just a specific Viya service
  • filter log messages by text in a regular expression
  • stream for a specific duration

The duration is specified using the format 0h0m0s0ms, but you can also use individual parts of the specification, for example 30s for 30 seconds or 5m for 5 minutes.
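To make the duration format concrete, here is a small parser for strings such as 5m, 30s or 1h30m15s500ms. It is only a sketch of my reading of the 0h0m0s0ms pattern described above, not the parser that sas-ops itself uses, and the function name `parse_duration` is made up for this example.

```python
import re
from datetime import timedelta

# Each part is optional, but the parts must appear in the order h, m, s, ms.
# The (?!s) lookahead keeps the minutes part from swallowing the "m" of "ms".
_DURATION = re.compile(
    r"^(?:(?P<h>\d+)h)?(?:(?P<m>\d+)m(?!s))?(?:(?P<s>\d+)s)?(?:(?P<ms>\d+)ms)?$"
)


def parse_duration(text):
    """Convert a string such as '5m' or '1h30m15s500ms' to a timedelta."""
    match = _DURATION.match(text)
    if not text or match is None:
        raise ValueError(f"not a valid duration: {text!r}")
    parts = {name: int(value or 0) for name, value in match.groupdict().items()}
    return timedelta(
        hours=parts["h"],
        minutes=parts["m"],
        seconds=parts["s"],
        milliseconds=parts["ms"],
    )


print(parse_duration("5m"))
print(parse_duration("1h30m15s500ms"))
```

Strings that do not follow the pattern, such as an empty string or 5x, raise a ValueError instead of being silently treated as zero.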

Consider the situation where we want to monitor access to a particular CAS table over a specific period of time. The command below will output to a file all messages that contain the table name HR_SUMMARY for a period of 5 minutes.

./sas-ops logs --match HR_SUMMARY --timeout 5m > /tmp/hr_summary_access.log

The output shows all the CAS actions that were performed on the table during the time period.

You can subset the stream to one service.

Consider a case where a user is having trouble logging in and you suspect a problem with the LDAP setup. To investigate, first enable DEBUG logging on com.sas.identities. Then stream the log messages from the identities service.

./sas-ops logs --format pretty --source identities > logonerrors.json

Viewing the output shows that there is something wrong with the LDAP query.
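If you run these commands from a script, it can help to build the argument list in one place. The sketch below assembles a sas-ops logs command from the options used in this post (format, match, timeout and source). It assumes that the options are written with two hyphens and that the executable is in the directory mentioned earlier; the helper name `logs_command` is made up for this example.

```python
import shlex

# Location of the executable as given in the post.
SAS_OPS = "/opt/sas/viya/home/bin/sas-ops"


def logs_command(fmt=None, match=None, timeout=None, source=None):
    """Build the argument list for a 'sas-ops logs' call."""
    args = [SAS_OPS, "logs"]
    if fmt:
        args += ["--format", fmt]
    if match:
        args += ["--match", match]
    if timeout:
        args += ["--timeout", timeout]
    if source:
        args += ["--source", source]
    return args


cmd = logs_command(match="HR_SUMMARY", timeout="5m")
print(" ".join(shlex.quote(arg) for arg in cmd))

# On a Viya machine, the list could be passed to subprocess.run with stdout
# pointed at an open file, which mirrors the "> file" redirection above:
#   with open("/tmp/hr_summary_access.log", "w") as out:
#       subprocess.run(cmd, stdout=out, check=True)
```

Passing the arguments as a list avoids shell quoting problems when the match text contains spaces or regular-expression characters.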

I think you will agree that sas-ops logs is a very useful tool for monitoring and troubleshooting issues in a Viya environment.

I would like to thank Bryan Ellington for his helpful input with this post.

Capturing log messages from Viya deployments was published on SAS Users.

October 10, 2019
 

DATA Step BY Statements

DATA Step is a very powerful language that SAS and Open Source programmers leverage to stage data for the analytical life cycle. A popular technique is to use the DESCENDING option on BY variables to identify the largest value. Let’s review the example in Figure 1:

  • On line 74 we are using the descending option on the BY statement for the numeric variable MSRP. The reason we are doing this is so we can identify the most expensive car for each make of car in our data set.
  • On line 79 we group our data by MAKE of car.
  • On line 80 we use the FIRST. variable in the subsetting IF statement to output the first record for each MAKE. In Figure 2 we can review the results.


Figure 1. Descending BY Statement


Figure 2. Listing of Most Expensive Cars by MAKE

What is CAS?

CAS is SAS Viya’s in-memory engine that processes data and logic in a distributed computing paradigm. When working with CAS tables we can simulate the DESCENDING BY statement by creating a CAS View which will then become the source table to our DATA Step. Let’s review Figure 3:

  • On line 79 we will leverage the CASL (SAS® Cloud Analytic Services Language) action set TABLE with the action VIEW to create the CAS View that will be used as our source table in the DATA Step.
  • On lines 80 and 81 we will store our CAS View in the CASUSER CASLIB with the name of DESCENDING.
  • On lines 82 and 83 we use the TABLES parameter to specify the input CAS table to our CAS View.
  • On line 84 we use the VARLIST parameter to identify the columns from the input table we want in our CAS View.
  • On line 85 we create a new variable for our CAS View using the computedVars parameter.
  • On line 86 we provide the math for our new variable N_MSRP. N_MSRP is the negated value of the input CAS table variable MSRP. Note: This simulation only works for numeric variables. For character data I suggest using LAST. processing, which you can review in this blog post.


Figure 3. Simulating DESCENDING BY Statement for Numeric Variables

Now that we have our CAS View with its new variable N_MSRP, we can move on to the DATA Step code in Figure 3.

  • On line 92 the SET statement specifies the source to our DATA Step: the CAS View CASUSER.DESCENDING.
  • On line 93 we leverage the BY statement to group our data in ascending order for the CAS View variables MAKE and N_MSRP. Because N_MSRP is in ascending order, our original variable MSRP is in DESCENDING order.
  • On line 94 we use a subsetting IF statement to output the first occurrence of each MAKE.

Figure 4 is a listing of our new CAS table CASUSER.DESCENDING2 and displays the most expensive car for each make of car.


Figure 4. Listing of Most Expensive Cars by MAKE

Template for Creating a CAS View

/* Create a CAS view */
/* For each DESCENDING numeric create a new variable(s) */
/* The value of the new variable(s) is the negated value */
/* of the original DESCENDING BY numeric variable(s) */
proc cas;
   table.view / replace = true
   caslib='casuser'
   name='descending'
   tables={{
      name='cars'
      varlist={'msrp' 'make'},
      computedVars={{name='n_msrp'}},
      computedVarsProgram='n_msrp = -(msrp)'
   }};
run;
quit;
 
data casuser.descending2;
   set casuser.descending;
   by make n_msrp ;
   if first.make ;
run;
 
proc print data=casuser.descending2;
title "Most Expensive Cars";
run;
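
The negation trick is not specific to SAS. As an illustration only, the sketch below reproduces the same logic in Python: it adds a negated copy of MSRP, sorts in ascending order by make and the negated value, and keeps the first row of each make. The sample rows and model names are invented for this example.

```python
from itertools import groupby

# Hypothetical sample rows; the post itself reads the CARS table.
CARS = [
    {"make": "Acura", "model": "Model A", "msrp": 36945},
    {"make": "Audi", "model": "Model C", "msrp": 25940},
    {"make": "Acura", "model": "Model B", "msrp": 89765},
    {"make": "Audi", "model": "Model D", "msrp": 84600},
    {"make": "Audi", "model": "Model E", "msrp": 35940},
]


def most_expensive_per_make(rows):
    """Keep the first row per make after an ascending sort on (make, -msrp)."""
    # Equivalent of the computed variable: n_msrp = -(msrp).
    view = [dict(row, n_msrp=-row["msrp"]) for row in rows]
    # Equivalent of BY MAKE N_MSRP: ascending on both keys.
    view.sort(key=lambda row: (row["make"], row["n_msrp"]))
    # Equivalent of IF FIRST.MAKE: take the first row of each group.
    return [next(group) for _, group in groupby(view, key=lambda row: row["make"])]


top = most_expensive_per_make(CARS)
for row in top:
    print(row["make"], row["model"], row["msrp"])
```

Because the smallest negated value corresponds to the largest original value, the first row in each group is the most expensive car for that make.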

Conclusion

It is a very common coding technique to process data with a DESCENDING BY statement in DATA Step. With Viya 3.5, the DESCENDING BY statement is supported for numeric and character data in DATA Step, with this caveat: DESCENDING works on all but the first BY variable on the BY statement. For earlier versions of SAS Viya, this simulation is the best practice for numeric data that you want in DESCENDING order.

How to Simulate DATA Step DESCENDING BY Statements in SAS® Cloud Analytic Services (CAS) was published on SAS Users.

October 9, 2019
 

As a fellow student, I know that making sure you get the right books for learning a new skill can be tough. To get you started off right, I would like to share the top SAS books that professors are requesting for students learning SAS. With this inside sneak peek, you can see what books instructors and professors are using to give new SAS users a jump-start with their SAS programming skills.

1. Learning SAS by Example: A Programmer's Guide, Second Edition

At the top of the list is Ron Cody’s Learning SAS by Example: A Programmer’s Guide, Second Edition. This book teaches SAS programming to new SAS users by building from very basic concepts to more advanced topics. Many programmers prefer examples rather than reference-type syntax, and so this book uses short examples to explain each topic. The new edition of this classic has been updated to SAS 9.4 and includes new chapters on PROC SGPLOT and Perl regular expressions. Check out this free excerpt for a glimpse into the way the book can help you summarize your data.

2. An Introduction to SAS University Edition

I cannot recommend this book highly enough for anyone starting out in data analysis. This book earns a place on my desk, within easy reach. - Christopher Battiston, Wait Times Coordinator, Women's College Hospital

The second most requested book will help you get up-and-running with the free SAS University Edition using Ron Cody’s easy-to-follow, step-by-step guide. This book is aimed at beginners who want to either use the point-and-click interactive environment of SAS Studio, or who want to write their own SAS programs, or both.

The first part of the book shows you how to perform basic tasks, such as producing a report, summarizing data, producing charts and graphs, and using the SAS Studio built-in tasks. The second part of the book shows you how to write your own SAS programs, and how to use SAS procedures to perform a variety of tasks. In order to get familiar with the SAS Studio environment, this book also shows you how to access dozens of interesting data sets that are included with the product.

For more insights into this great book, check out Ron Cody’s useful tips for SAS University Edition in this recent SAS blog.

3. The Little SAS Book: A Primer, Fifth Edition

Our third book is a classic that just keeps getting better. The Little SAS Book is essential for anyone learning SAS programming. Lora Delwiche and Susan Slaughter offer a user-friendly approach so readers can quickly and easily learn the most commonly used features of the SAS language. Each topic is presented in a self-contained two-page layout complete with examples and graphics. Also, make sure to check out some more tips on learning SAS from the authors in their blog post.

We are also excited to announce that the newest edition of The Little SAS Book is coming out this Fall! The sixth edition will be interface independent, so it won’t matter if you are using SAS Studio, SAS Enterprise Guide, or the SAS windowing environment as your programming interface. In this new edition, the authors have included more examples of creating and using permanent SAS data sets, as well as using PROC IMPORT to read data. The new edition also deemphasizes reading raw data files using the INPUT statement—a topic that is no longer covered in the new base SAS programmer certification exam. Check out the upcoming titles page for more information!

4. SAS Certification Prep Guide: Statistical Business Analysis Using SAS 9

Number four is a must-have study guide for the SAS Certified Statistical Business Analyst Using SAS 9 exam. Written for both new and experienced SAS programmers, the SAS Certification Prep Guide: Statistical Business Analysis Using SAS 9 is an in-depth prep guide for the SAS Certified Statistical Business Analyst Using SAS 9: Regression and Modeling exam. The authors step through identifying the business question, generating results with SAS, and interpreting the output in a business context. The case study approach uses both real and simulated data to help readers master the content of the certification exam. Each chapter also includes a quiz aimed at testing the reader’s comprehension of the material presented. To learn more about this great guide, watch an interview with co-author Joni Shreve.

5. SAS for Mixed Models: Introduction and Basic Applications

Models are a vital part of analyzing research data. It seems only fitting, then, that this popular SAS title would be our fifth most popular book requested by SAS instructors. Mixed models are now becoming a core part of undergraduate and graduate programs in statistics and data science. This book is great for those with intermediate-level knowledge of SAS and covers the latest capabilities for a variety of SAS applications. Be sure to read the review of this book by Austin Lincoln, a technical writer at SAS, for great insights into a book he calls a “survival guide” for creating mixed models.

Want more?

I hope this list will help in your search for a SAS book that will get you to the next step in your SAS education goals. To learn more about SAS Press, check out our up-and-coming titles, and to receive exclusive discounts make sure to subscribe to our newsletter.

Top 5 SAS Books for Students was published on SAS Users.

October 1, 2019
 

On behalf of the entire global Customer Contact Center, “Happy CX Day!” to all our SAS users!

Customer Experience Day (aka #CXDay2019) is one of our favorite days of the year—when we can reflect on customer interactions, questions and feedback from the past year—and look to the year ahead for ways to enhance our customers’ experience, like expanding our support options and helping drive improvements to our website and self-service offerings.

Plus, it gives us another excuse to celebrate all of our wonderful SAS users and partners! Whoohoo!

The opportunity to join our users on their journeys with SAS (from acquiring SAS, to learning, updating and renewing it, to attending SAS events, and beyond) is one of our favorite aspects of our jobs!

To show our appreciation, we want to share a little love.

 

 

…they are curious and passionate, and ask great questions that help us learn something new each day! - Mary

…they are passionate and dedicated to learning SAS! I love how users are always willing to help with new users’ questions in our SAS Communities groups! - Antionen

…they’re using SAS in such innovative and amazing ways to help make the world a better place. - Tricia

… we get satisfaction by providing the answers to tough questions. It makes our jobs worthwhile knowing that we have added personal value to anyone who has reached out to SAS. - Lida

...by caring for people, we’re a part of making the unimaginable possible. - Keila

…of the exciting ways they’re applying SAS to change lives – from new advancements in cancer research, clinical trials and drug testing, learning about species and ecosystems in efforts to protect endangered species and biodiversity, to impacting young lives by using advanced analytics to measure, as well as impact, student progress in K-12. - Lisa

Not familiar with the Customer Contact Center? We’re the folks who answer your SAS inquiries and point you in the right direction to get the help you need. Well, that’s part of what we do! We don’t just answer questions, we’re also listening to you and looking at ways to make things easier to navigate, simpler to find, and faster to share.

Get to know us better! Fun facts about our team:

  • We’re located in four SAS offices:
  • Collectively, our team speaks and supports 17 languages
  • …and supports nearly 100 countries
  • Some engagement professionals on our team speak more than three languages
  • We collaborate with just about every team at SAS
  • In 2018, we received over 113,000 inquiries worldwide
  • ~55% of SAS customers choose live chat as their communication channel
  • So far this year, we’ve received over 80,000 inquiries from around the world
  • The two most common SAS topics we’re asked about are SAS Training and Analytics U options

If you need any help, want to share feedback, or simply want to talk SAS, please reach out to us!

You can chat with us live on the SAS website, tweet us @SAS_Cares, or contact us via phone, email, or web form.

Want to collaborate with other SAS users? Search for answers or post questions in the SAS Support Communities. This is a great resource for your usage questions! If you're new to SAS, consider frequenting the New SAS User Community, where friendly, knowledgeable volunteers like KurtBremser are eager to help.

Wishing you a fabulous CX Day!

Happy Customer Experience Day! was published on SAS Users.