SAS Viya

November 17, 2018
 

Disclaimer: this article does not cover or promote any political views. It’s all about data and REST APIs.

I am relieved, thankful, elated, glad, thrilled, joyful (I could go on with more synonyms from my thesaurus.com search for 'happy') that November 6, 2018 has come and gone. Election day is over. This means no more political ads on TV, and those signs lining the streets will be coming down! It is a joy to now watch commercials about things that matter: things like injury lawyers who are on your side, or discovering that a copper-colored pan is going to cook my food better than a black one.

The data in this article pertains to advertising expenditures in the 2018 elections. This is the second of three articles in a series outlining the use of REST APIs and SAS. In the first article, Using SAS Viya REST APIs to access images from SAS Visual Analytics, I used SAS Viya REST APIs to download an image from a flight data SAS report. In this article I use SAS Cloud Analytic Services (CAS) REST APIs to run statistical methods on political ad spending data. The third article will bring both APIs together in an application.

The data

In the closing days of the election season, while being inundated with political advertising, I thought about how much money is spent during each cycle. The exact numbers vary depending on the source, but the range for this year’s mid-term elections is between four and five billion dollars.

A little research reveals that outside the candidates themselves, the biggest spenders on political ads are political action committees, aka PACs. The Center for Responsive Politics compiled the data set used in this article, which derives from a larger data set released by the Federal Election Commission. The data set lists a breakdown of PAC contributions to campaign finances.

CAS REST APIs

As I explained in the previous article, SAS publishes two sets of APIs. Which APIs to use depends on the service, the data organization, or the intended use of the data. Please refer to the SAS Viya REST API article for more information on each set of APIs.

CAS REST APIs use CAS actions to perform statistical methods across a variety of SAS products. You can also use the CAS REST APIs to configure and maintain the SAS Viya environment. Here, I focus on the CAS actions. Calling the CAS actions via the REST API allows users to access SAS data and procedures and integrate them into their applications.

The process

How to construct the API call

I start with the API documentation for information on how to construct and use the CAS REST APIs. The REST API can submit actions and return the results. Parameters and result data are in JSON format. To specify your parameters, encapsulate the attributes in a JSON object, then submit a POST request to the action. The URL for your action will include the UUID of your session in the format: /cas/sessions/{uuid}/actions/{action}. Replace {uuid} and {action} with the appropriate values.
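The URL pattern above can be sketched in Python as a small helper. The function name `cas_action_url` is hypothetical, and the host and session values are the example values used elsewhere in this article, not fixed parts of the API.

```python
from urllib.parse import quote


def cas_action_url(base_url: str, session_id: str, action: str) -> str:
    """Fill in /cas/sessions/{uuid}/actions/{action} for a given server."""
    # quote() percent-encodes any characters that are not safe in a URL path.
    return f"{base_url.rstrip('/')}/cas/sessions/{quote(session_id)}/actions/{quote(action)}"


url = cas_action_url(
    "http://sasserver.demo.sas.com:8777",    # example host from this article
    "16dd9ee7-3189-1e40-8ba7-934a4a257fd7",  # example session UUID
    "simple.summary",                        # action set and action name
)
print(url)
```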

Create a session

The first requirement is to create a session. I use the following cURL command to create the session.

curl -X POST http://sasserver.demo.sas.com:8777/cas/sessions \
    -H 'Authorization: Bearer <access-token-goes-here>'

The response is a JSON object with a session ID:

{
    "session": "16dd9ee7-3189-1e40-8ba7-934a4a257fd7"
}

I’ll use the UUID for the session to build the URLs for the remainder of the REST calls.
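As a minimal sketch, the session ID can be pulled out of that response with Python's standard `json` module. The response text below is the example shown above; the variable names are illustrative.

```python
import json

# Example response body copied from the session-creation call above.
response_text = '{"session": "16dd9ee7-3189-1e40-8ba7-934a4a257fd7"}'

# The "session" key holds the UUID that goes into every later URL.
session_id = json.loads(response_text)["session"]
actions_base = f"/cas/sessions/{session_id}/actions"
print(actions_base)
```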

Build the CAS REST API call body

Now we know the general structure of the CAS REST API call. We can browse the CAS actions by name to determine how to build the body text.

Using the simple.summary action definition, I build a JSON body to access the PAC spending from a CASTable, create a new table grouped by political views, and calculate total spending. The resulting code is below:

{
	"table":{"caslib":"CASUSER(sasdemo)","name":"politicalspending2018","groupBy":{"name":"view"}},
	"casout":{"caslib":"CASUSER(sasdemo)","name":"spendingbyaffiliation","promote":true},
	"inputs":"total",
	"subset":["SUM","N"]
}

Each line of code above contributes to running the CAS action:

  1. Define the table to use and how to group the data.
  2. Create a new CASTable from the output of the API call.
  3. Dictate the column to summarize.
  4. Specify the statistical method(s) to include in the result table; in this case I want to sum the Total column and count the number of PACs by group.
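One way to avoid hand-editing JSON is to build the body as a Python dictionary and serialize it, which guarantees valid JSON. This is only a sketch: the function name `summary_body` and its `keep_output_table` flag are hypothetical, while the parameter values are the ones from the body above.

```python
import json


def summary_body(keep_output_table: bool = True) -> str:
    """Return the simple.summary parameters as a JSON string."""
    body = {
        # 1. Source table and grouping column
        "table": {
            "caslib": "CASUSER(sasdemo)",
            "name": "politicalspending2018",
            "groupBy": {"name": "view"},
        },
        # 2. Output table created by the action
        "casout": {
            "caslib": "CASUSER(sasdemo)",
            "name": "spendingbyaffiliation",
            "promote": True,
        },
        # 3. Column to summarize
        "inputs": "total",
        # 4. Statistics to compute
        "subset": ["SUM", "N"],
    }
    if not keep_output_table:
        # Without casout, the results come back in the response instead.
        del body["casout"]
    return json.dumps(body)


print(summary_body())
```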

Send the CAS REST API

Next, I send the body of the text with the curl call below. Notice the session ID obtained earlier is now part of the URL:

curl -X POST http://sasserver.demo.sas.com:8777/cas/sessions/16dd9ee7-3189-1e40-8ba7-934a4a257fd7/actions/simple.summary \
  -H 'Authorization: Bearer <access-token-goes-here>' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '<json-body-goes-here>'
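For readers who prefer a script over cURL, the same request can be sketched with Python's standard library. The helper name `build_action_request` is hypothetical, and the token is a placeholder. The request is only constructed here, not sent, because sending it requires a live CAS server and a valid token.

```python
import json
import urllib.request


def build_action_request(url: str, token: str, body: dict) -> urllib.request.Request:
    """Prepare a POST request that carries the action parameters as JSON."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
            "Content-Type": "application/json",
        },
    )


request = build_action_request(
    "http://sasserver.demo.sas.com:8777/cas/sessions/"
    "16dd9ee7-3189-1e40-8ba7-934a4a257fd7/actions/simple.summary",
    "<access-token-goes-here>",  # placeholder, not a real token
    {"inputs": "total", "subset": ["SUM", "N"]},  # shortened body for illustration
)

# To actually send it (needs a reachable server):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
print(request.get_method(), request.full_url)
```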

The REST call creates a new CASTable, SPENDINGBYAFFILIATION. Refer to the screen shot below.

New table

SAS CASTable created by the simple.summary action

I also have the option of returning the data to create the SPENDINGBYAFFILIATION table in JSON format. To accomplish this, remove the casout{} line from the preceding call. Below is a snippet of the JSON response.

JSON response

JSON response to the simple.summary REST call

After the JSON response is parsed, the data is ready for use by a web application, software program, or script.

Moving on

The Thanksgiving Day holiday is fast approaching here in the United States. I plan to eat a lot of turkey and sweet potato pie, welcome the out-of-town family, and watch football. It will be refreshing to not hear the back-and-forth banter and bickering between candidates during commercial breaks. Oh, but wait, Thanksgiving is the start of the holiday season. This means one thing: promotions on Black Friday deals for items I may not need will start airing and last through year's-end. I guess if it is not one thing filling the advertising air waves, it is another. I'll just keep the remote handy and hope I can find another ball game on.

What’s next?

I understand and appreciate political candidates’ needs to communicate their stance on issues and promote their agendas. This takes money. I don't see the spending trend changing direction in the coming years. I can only hope the use of the funds will promote candidates' qualifications, beliefs, and ideas, and not to bash or belittle their opponents.

My next article will demonstrate how to use both the SAS Viya and the CAS REST APIs under the umbrella of one web application. And I promise, no politics.

Using SAS Cloud Analytics Service REST APIs to run CAS Actions was published on SAS Users.

November 14, 2018
 

Prior to SAS Viya

With the creation of SAS Viya, the ability to run DATA Step code in a distributed manner became a reality. Prior to distributed DATA Step, DATA Step programmers never had to think about achieving repeatable results when SAS7BDAT datasets were the sources for DATA Step code containing a BY statement. This is because prior to SAS Cloud Analytic Services (CAS), DATA Step ran single-threaded and the source SAS7BDAT dataset was stored on disk. Every time we ran the code, we obtained repeatable results because the sequence of rows within each BY group was preserved between runs. To illustrate this, review figures 1, 2, and 3.

Figure 1 is the source SAS7BDAT dataset WORK.TEST1. Notice the sequence of VAR2, especially on rows 1 and 4 (i.e., _N_ = 1 and 4).

_n_ VAR1 VAR2
1 1 N
2 1 Y
3 1 Y
4 2 Y
5 2 Y
6 2 N


Figure 1. WORK.TEST1 the original SAS7BDAT dataset

In figure 2, we see a BY statement with variable VAR1. This will ensure VAR1 is in ascending order. We are also using FIRST. processing to identify the first occurrence of the BY group. Because this data is stored on disk and because the DATA Step is executed using a single thread, the result table will be repeatable no matter how many times we run the DATA Step code.

Figure 2. Focus on the IF statement, especially VAR2

In figure 3, we see the output SAS7BDAT dataset WORK.TEST2.

_n_ VAR1 VAR2
1 1 N

Figure 3. WORK.TEST2 result dataset from running the code in Figure 2

In figure 4, we are running the same DATA Step but this time our source and target tables are CAS tables. The source table CASLIB.TEST1 was created by lifting the original SAS7BDAT dataset WORK.TEST1 (review figure 1) into CAS.

Figure 4. DATA Step executing in CAS

In figure 5, we see that the DATA Step logic is being respected in runs 1, 2 and 3; but we are not achieving repeatable results. This is due to CAS running on multiple threads. Note that the BY statement – which will group the data correctly for each BY group – is done on the fly. Also, the BY statement will not preserve the sequence of rows within the BY group between runs.

For some processes, this is not a concern but for others it could be. If you need to obtain repeatable results in DATA Step code that runs distributed in CAS as well as match your SAS 9 single-threaded DATA Step results, I suggest the following workaround be used.

Figure 5. DATA Step logic is respected but yields different results with each run
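The effect can be imitated in plain Python. This is a simplified analogy, not the DATA Step code from the figures: it keeps the first row seen for each VAR1 value, and the two calls stand in for two runs in which rows arrive in different orders.

```python
# (VAR1, VAR2) rows from Figure 1
rows = [(1, "N"), (1, "Y"), (1, "Y"), (2, "Y"), (2, "Y"), (2, "N")]


def first_per_group(rows):
    """Group by VAR1 only and keep the first VAR2 seen in each group."""
    # sorted() is stable, so rows inside a group keep their arrival order,
    # just as BY VAR1 does not fix the order within a BY group.
    grouped = sorted(rows, key=lambda row: row[0])
    first = {}
    for var1, var2 in grouped:
        first.setdefault(var1, var2)
    return first


run_1 = first_per_group(rows)                  # rows arrive in the original order
run_2 = first_per_group(list(reversed(rows)))  # rows arrive in a different order
print(run_1)  # {1: 'N', 2: 'Y'}
print(run_2)  # {1: 'Y', 2: 'N'}
```

Both runs apply the same logic correctly, yet the "first" row of each group differs because nothing pins down the order inside the group.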

With SAS Viya

The workaround is simple to understand and implement. For each SAS7BDAT dataset being lifted into a CAS table (see figure 6), we need to add a new variable ROW_ID.

_n_ VAR1 VAR2
1 1 N
2 1 Y
3 1 Y
4 2 Y
5 2 Y
6 2 N

Figure 6. Original SAS7BDAT dataset source WORK.TEST1

To accomplish this, we will leverage the automatic variable _N_ that is available to all DATA Step programmers. _N_ is initially set to 1. Each time the DATA step loops past the DATA statement, the variable _N_ increments by 1. The value of _N_ represents the number of times the DATA step has iterated. In our case, the value for each row is the row sequence in the original SAS7BDAT dataset. Figure 7 contains the SAS code we ran on the SAS 9.4M5 workspace server or the SAS Viya compute server to add the new variable ROW_ID.

 

Figure 7. Creating the new variable ROW_ID

By reviewing figure 8 we can see the new variable ROW_ID in the SAS7BDAT dataset WORK.TEST1. Now that we have the new variable, we are ready to lift this dataset into CAS.

_N_ VAR1 VAR2 ROW_ID
1 1 N 1
2 1 Y 2
3 1 Y 3
4 2 Y 4
5 2 Y 5
6 2 N 6

Figure 8. WORK.TEST1 with the new variable ROW_ID

There are many ways to lift a SAS7BDAT dataset into CAS. One way is to use a DATA Step like we did in figure 9.

Figure 9. DATA Step code to create distributed CAS table CASLIB.TEST1 

To obtain the repeatable results, we need to control the sequence of rows within each BY group. We accomplish this by adding the new variable ROW_ID as the last variable to the BY statement in our DATA Step code, see figure 10.

Figure 10. Add ROW_ID as last variable of the BY group

Figure 11 shows us the output CAS table created by the code in figure 10. By adding the new variable ROW_ID and using that variable as the last variable of the BY statement, we are controlling the sequencing of rows within the BY groups for all 3 runs.

VAR1 VAR2 ROW_ID
1 N 1

Figure 11. Distributed CAS table CASLIB.TEST2
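The same idea can be sketched in plain Python as an analogy (it is not the DATA Step code from figure 10). A ROW_ID is attached once, the rows are shuffled to mimic different arrival orders, and sorting on (VAR1, ROW_ID) makes the first row of each group the same every time. The function name and the use of `random` seeds to simulate three runs are assumptions for illustration.

```python
import random

# (VAR1, VAR2) rows from the original dataset
source = [(1, "N"), (1, "Y"), (1, "Y"), (2, "Y"), (2, "Y"), (2, "N")]

# Attach ROW_ID = original row sequence, like ROW_ID = _N_
with_row_id = [(var1, var2, row_id)
               for row_id, (var1, var2) in enumerate(source, start=1)]


def first_per_group_by_row_id(rows):
    """Order by VAR1, then ROW_ID, and keep the first row of each group."""
    ordered = sorted(rows, key=lambda row: (row[0], row[2]))
    first = {}
    for var1, var2, row_id in ordered:
        first.setdefault(var1, (var2, row_id))
    return first


results = []
for seed in range(3):  # three simulated runs with different arrival orders
    shuffled = with_row_id[:]
    random.Random(seed).shuffle(shuffled)
    results.append(first_per_group_by_row_id(shuffled))

print(results[0])  # {1: ('N', 1), 2: ('Y', 4)}
```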

Conclusion

With distributed DATA Step comes great opportunities to improve runtimes. It also means we need to understand the differences between single-threaded processing of SAS7BDAT datasets that are stored on disk and distributed processing of CAS tables stored in memory. To help you with that journey I suggest you read the SAS Global Forum paper, Parallel Programming with the DATA Step: Next Steps.

How to achieve repeatable results with distributed DATA Step BY Groups was published on SAS Users.

November 13, 2018
 

In my previous blog post I demonstrated how to create your own CAS actions and action sets.  In this post, we will explore how to create your own CAS functions using the CAS Language (CASL).  A function is a component of the CASL programming language that can accept arguments, perform a computation or other operation, and return a value.  The value that is returned can be used in an assignment statement or elsewhere in expressions.

About SAS functions

SAS provides two types of supplied functions: built-in functions and common functions.  Built-in functions contain functionality that is unique to CASL.  These allow you to perform operations on your result tables, arrays, and dictionaries, and provide run-time support for your CASL programs.  Built-in functions cannot be replaced with user-defined functions.

Conversely, common functions provide functionality that is common to other SAS functions.  When used in a CASL program, SAS functions take a CASL value and a CASL value is returned.  Unlike built-in functions, you can replace these functions with user-defined functions.

Since the capabilities of built-in functions are unique to CASL, let’s look at these in-depth and demonstrate with an example.  Save the following FedSQL code in an external file called hmeqsql.sas.  This code will be read into CAS and stored as a variable.

The execDirect action executes FedSQL code in CAS.  The READPATH built-in function reads the FedSQL code saved in hmeqsql.sas and stores it in the CASL variable sqlcode which is used as input to the query parameter.

The fetch action displays the first 20 rows from the output table hmeq.out.

If you don’t feel like looking through the documentation for a built-in or common function, a list of each can be generated programmatically.  Run the following code to see a list of built-in functions.

Partial list of CASL built-in functions

Run the following code to see a list of common functions.

Partial list of common functions

User-defined CASL functions

In addition to the customizable capabilities of built-in functions supplied by SAS, you can also create your own functions using the FUNCTION statement.  User-defined functions can be called in expressions using CASL and they provide a large amount of flexibility.  The following example creates four different functions for temperature conversion.
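As a rough Python analogue of such a set of user-defined functions (this is not CASL, and the choice of these four conversions is an assumption), the idea looks like this:

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9


def celsius_to_fahrenheit(temp_c: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9 / 5 + 32


def celsius_to_kelvin(temp_c: float) -> float:
    """Convert degrees Celsius to kelvins."""
    return temp_c + 273.15


def kelvin_to_celsius(temp_k: float) -> float:
    """Convert kelvins to degrees Celsius."""
    return temp_k - 273.15


# Each function takes an argument, computes, and returns a value that can be
# used in an assignment or a larger expression.
boiling_c = fahrenheit_to_celsius(212)
print(boiling_c)  # 100.0
```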

After creating these functions, they can be called immediately, or you can store them in an external file and call them via a %include statement.  In this example, the user-defined functions have been stored in an external file called FunctionStore.sas.  You can call one, all, or any number of your user-defined functions.

The output from each function call is displayed in the log.

Lastly, if you want to see all user-defined functions, run the FUNCTIONLIST statement.  A list will be printed to the log.

More about CASL programming and using functions in CASL

Check out these resources for further information on programming in the CASL language and using functions in CASL.

Customize your CASL code with built-in and user-defined functions was published on SAS Users.

November 7, 2018
 

Migration, version road maps and configurations were the themes of several questions that came up in a recent webinar about combining SAS Grid Manager and SAS Viya. You’ll see in this blog post that we were ready to get into the nitty-gritty details in our answers below – just as we did in the previous FAQs post. We hope you find them useful in your work using SAS Grid Manager and SAS Viya together.

1. Can we migrate SAS programs that are currently on SAS PC environments into the SAS Grid environment – or do we need to rewrite the programs for SAS Grid Manager?

No, you don’t need to rewrite your SAS programs to run on a SAS Grid environment. Many customers migrate their code from other environments (like PCs or servers) and submit them to SAS Grid Manager from SAS Display Manager, SAS Studio or any other application of their choice.

If you already use SAS Enterprise Guide to run jobs on a remote server, the process may be as simple as changing your server configuration to use a grid-launched workspace server (information that your SAS Administrator would provide) and continuing to work in much the same way as always, requiring no changes to your code.

Depending on other changes that take place at the same time SAS Grid Manager is implemented, there may need to be some small adjustments to your programs.  For example, if your organization consolidates source data onto new storage, you may need to change paths associated with your LIBNAME statements.  These should be housekeeping items rather than significant rewrites of the logic in your SAS code.

If you plan to continue to use the programming environment provided by Base SAS itself (DMS) and have been using SAS/CONNECT, you will need to add:

  • SIGNON statement to start a session on the SAS Grid Manager
  • ENDRSUBMIT statement to end the block of code to be run on the grid

See also: Divide and Conquer – Writing Parallel SAS Code to Speed Up Your SAS Program.

2. Is there a version of SAS Grid Manager that runs on the SAS Viya architecture?

The SAS Grid Manager roadmap includes a release of SAS Grid Manager on the SAS Viya Architecture late in 2019.

3. Will I be able to migrate my SAS Grid Manager configuration and jobs from SAS 9.4 to the SAS Viya-based release of SAS Grid Manager?

The plan to deliver SAS Grid Manager on the SAS Viya architecture includes automation to migrate jobs, flows, and schedule information from your SAS 9.4 environment to your SAS Viya environment.  It is our goal to make this transition as straightforward and easy as possible – especially where there is feature parity between SAS 9 based and SAS Viya-based solutions.  Since each product delivers solution-specific PROCs and other functionality that can be used within a job executed by SAS Grid Manager, each customer should work with their SAS team to understand which jobs can be migrated and which jobs may need to continue to run against your SAS 9.4 environment.

* * *

These were all great questions that we thought deserved more detail than we could offer in a webinar.  If you have more questions that weren’t covered here, or in our previous post on this topic, just post them in the comments section.  We’ll answer them quickly.  Thanks for your interest!

3 questions about implementing SAS Grid Manager and SAS Viya was published on SAS Users.

November 6, 2018
     

    This post was also written by SAS' Xiangxiang Meng.

    If you work with SAS Viya from one of its various clients (SAS, Python, Lua, Java, and REST) and already know the Pandas data analysis library, using CAS actions from Python should come naturally. CAS enables you to subset tables using Python expressions. Using Python, you can create conditions that are based on the data in the table, instead of hard-coding which rows to select. SAS® will use those conditions to determine which rows to select.

    For example, rather than using fixed values of rows and columns to select data, SAS can create conditions based on the data in the table to determine which rows to select. This is done using the same syntax as DataFrames. CASColumn objects support Python’s various comparison operators and build a filter that subsets the rows in the table. You can then use the result of that comparison to index into a CASTable. It sounds much more complicated than it is, so let’s look at an example.

    The examples below are from the Iris flower data set, which is available in the SASHELP library, in all distributions of SAS. The listed code and output are produced using the IPython interface but can be employed with Jupyter Notebook just as easily.

    If we want to get a CASTable that only contains values where petal_length is greater than 7, we can use the following expression to create our filter.


    Behind the scenes, this expression creates a computed column that is used in a WHERE expression on the CASTable. This expression can then be used as an index value for a CASTable. Indexing this way essentially creates a boolean mask. Wherever the expression values are true, the rows of the table are returned. Wherever the expression is false, the rows are filtered out.
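As a plain-Python analogy of the mask-then-select idea (this is not SWAT code, and the row values and the 6.5 threshold are illustrative assumptions), the two steps look like this:

```python
# A tiny stand-in for a table: one dict per row.
rows = [
    {"species": "setosa", "petal_length": 1.4},
    {"species": "versicolor", "petal_length": 4.7},
    {"species": "virginica", "petal_length": 6.9},
    {"species": "virginica", "petal_length": 6.7},
]

# Step 1: the comparison produces one True/False value per row (the mask).
mask = [row["petal_length"] > 6.5 for row in rows]

# Step 2: indexing with the mask keeps only the rows where it is True.
selected = [row for row, keep in zip(rows, mask) if keep]

print(mask)           # [False, False, True, True]
print(len(selected))  # 2
```

With a CASTable the comparison is not evaluated on the client like this; as described above, it becomes a computed column and WHERE expression that CAS applies on the server.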

    These two steps are more commonly done in one line.


    We can further filter rows out by indexing another comparison.

    Comparisons can be joined using the bitwise comparison operators & (and) and | (or). You do have to be careful with these though due to the operator precedence. Bitwise comparison has a higher precedence than comparisons such as greater-than and less-than, so you need to wrap your comparisons in parentheses.
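The precedence issue can be seen with ordinary Python integers, without any CAS connection. The values below are chosen only to make the two forms disagree.

```python
petal_length = 6
petal_width = 0

# Intended meaning: both comparisons are evaluated first, then combined.
with_parens = (petal_length > 7) & (petal_width < 2)  # False & True

# Without parentheses, & binds first, so Python reads this as the chained
# comparison  petal_length > (7 & petal_width) < 2,  i.e.  6 > 0 < 2.
without_parens = petal_length > 7 & petal_width < 2

print(with_parens)     # False
print(without_parens)  # True
```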


    In all cases, we are not changing anything about the underlying data in CAS. We are simply constructing a query that is executed with the CASTable when it is used as the parameter in a CAS action. You can see what is happening behind the scenes by displaying the resulting CASTable objects.


    You can also do mathematical operations on columns with constants or other columns within your comparisons.

    The list of supported operations is shown in the table below.

    The supported comparison operators are shown in the following table.

    As you can see in the tables above, it is possible to do comparisons on character columns as well. This includes using many of Python’s string methods on the column values. These are accessed using the str attribute of the column, just like in DataFrames.

    This easy syntax allows the Python client to manipulate data much more easily when working in SAS Viya.

    Another great tip? The Python client allows you to manipulate data on the fly, without moving or copying the data to another location. Creating computed columns allows you to speed up the wrangling of data, while giving you options for how you want to get there.

    Want to learn more great tips about integrating Python with SAS Viya? Check out Kevin Smith and Xiangxiang Meng’s SAS Viya: The Python Perspective to learn how Python can be integrated into SAS® Viya® and help you manipulate data with ease.

    Great tip for dynamic data selection using SAS Viya and Python was published on SAS Users.

    November 3, 2018
     

    When you begin to work within the SAS Viya ecosystem, you learn that the central piece is SAS Cloud Analytic Services (CAS). CAS allows all clients in the SAS Viya ecosystem to communicate and run analytic methods. The great part about SAS Viya is that the R client can drive CAS directly using familiar objects and constructs for R programmers.

    The SAS Scripting Wrapper for Analytics Transfer (SWAT) package is an R interface to CAS. With this package, you can load data into memory and apply CAS actions to transform, summarize, model and score the data. You can still retain the ease-of-use of R on the client side to further post process CAS result tables.

    But before you can do any analysis in CAS, you need some data to work with and a way to get to it. There are two data access components in CAS:

    1. Caslibs, definitions that give access to a resource that contains data.
    2. CASTables, for analyzing data from a caslib resource. You load the data into a CASTable, which contains information about the data in the columns.

    Other references you may find of interest include this GitHub repository where you can find more information on installing and configuring CAS and SWAT. Also available is this article on using RStudio with SAS Viya.

    The following excerpt from SAS® Viya®: The R Perspective, the book I co-authored with my SAS colleague Xiangxiang Meng, demonstrates the way the R client in SAS Viya allows you to select data with precision. The examples come from the iris flower data set, which is available in the SASHELP library, in all distributions of SAS. The CASTable object sorttbl is sorted by the Sepal.Width column.

    Rather than using fixed values of rows and columns to select data, we can create conditions that are based on the data in the table to determine which rows to select. The specification of conditions is done using the same syntax as that used by data.frame objects. CASTable objects support R’s various comparison operators and build a filter that subsets the rows in the table. You can then use the result of that comparison to index into a CASTable. It sounds much more complicated than it is, so let’s look at an example.

    > expr <- sorttbl$Petal.Length > 6.5

    This expression creates a computed column that is used in a where expression on the CASTable. This expression can then be used as an index value for a CASTable. Indexing this way essentially creates a Boolean mask. Wherever the expression values are true, the rows of the table are returned. Wherever the expression is false, the rows are filtered out.

    > newtbl <- sorttbl[expr,]
    > head(newtbl)
     
      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species 
    1          7.7         2.6          6.9         2.3 virginica 
    2          7.7         2.8          6.7         2.0 virginica 
    3          7.6         3.0          6.6         2.1 virginica 
    4          7.7         3.8          6.7         2.2 virginica

    These two steps are commonly entered on one line.

    > newtbl <- sorttbl[sorttbl$Petal.Length > 6.5,]
    > head(newtbl) 
     
      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species 
    1          7.7         2.6          6.9         2.3 virginica 
    2          7.7         2.8          6.7         2.0 virginica 
    3          7.6         3.0          6.6         2.1 virginica 
    4          7.7         3.8          6.7         2.2 virginica

    We can further filter rows out by indexing another comparison expression.

    > newtbl2 <- newtbl[newtbl$Petal.Width < 2.2,]
    > head(newtbl2)
     
      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species 
    1          7.7         2.8          6.7         2.0 virginica 
    2          7.6         3.0          6.6         2.1 virginica

    Comparisons can be joined using the element-wise logical operators & (and) and | (or). Be careful with these operators because of operator precedence. In R, they have a lower precedence than comparisons such as greater-than and less-than, but it is still safer to enclose your comparisons in parentheses.

    > newtbl3 <- sorttbl[(sorttbl$Petal.Length > 6.5) & (sorttbl$Petal.Width < 2.2),]
    > head(newtbl3)
     
      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species 
    1          7.7         2.8          6.7         2.0 virginica 
    2          7.6         3.0          6.6         2.1 virginica

    In all cases, we are not changing anything about the underlying data in CAS. We are simply constructing a query that is executed with the CASTable when it is used as the parameter in a CAS action. You can see what is happening behind the scenes by displaying the attributes of the resulting CASTable objects.

    > attributes(newtbl3) 
     
    $conn 
    CAS(hostname=server-name.mycompany.com, port=8777, username=username, session=11ed56e2-f9dd-9346-8d01-44a496e68880, protocol=http) 
     
    $tname
    [1] "iris" 
     
    $caslib 
    [1] "" 
     
    $where 
    [1] "((\"Petal.Length\"n > 6.5) AND (\"Petal.Width\"n < 2.2))" 
     
    $orderby 
    [1] "Sepal.Width" 
     
    $groupby 
    [1] "" 
     
    $gbmode 
    [1] "" 
     
    $computedOnDemand 
    [1] FALSE 
     
    $computedVars 
    [1] "" 
     
    $computedVarsProgram 
    [1] "" 
     
    $XcomputedVarsProgram 
    [1] "" 
     
    $XcomputedVars 
    [1] "" 
     
    $names 
    [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  
    [5] "Species"      
     
    $class 
    [1] "CASTable" 
    attr(,"package") 
    [1] "swat"

    You can also do mathematical operations on columns with constants or on other columns within your comparisons.

    > iris[(iris$Petal.Length + iris$Petal.Width) * 2 > 17.5,] 
     
        Sepal.Length Sepal.Width Petal.Length Petal.Width   Species 
    118          7.7         3.8          6.7         2.2 virginica 
    119          7.7         2.6          6.9         2.3 virginica

    The list of supported operators is shown in the following table:

    Operator                  Numeric Data    Character Data
    + (add)                   ✔
    - (subtract)              ✔
    * (multiply)              ✔
    / (divide)                ✔
    %% (modulo)               ✔
    %/% (integer division)    ✔
    ^ (power)                 ✔

    The supported comparison operators are shown in the following table.

    Operator                         Numeric Data    Character Data
    == (equality)                    ✔               ✔
    != (inequality)                  ✔               ✔
    < (less than)                    ✔               ✔
    > (greater than)                 ✔               ✔
    <= (less than or equal to)       ✔               ✔
    >= (greater than or equal to)    ✔               ✔

     

    As you can see in the preceding tables, you can do comparisons on character columns as well. In the following example, all of the rows in which Species is equal to "virginica" are selected and saved to a new CASTable object virginica. Note that in this case, data is still not duplicated.

    > tbl <- defCasTable(conn, 'iris')
    > virginica <- tbl[tbl$Species == 'virginica',]
    > dim(virginica)
     
    [1] 50  5 
     
    > head(virginica) 
     
      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species 
    1          7.7         3.0          6.1         2.3 virginica 
    2          6.3         3.4          5.6         2.4 virginica 
    3          6.4         3.1          5.5         1.8 virginica 
    4          6.0         3.0          4.8         1.8 virginica 
    5          6.9         3.1          5.4         2.1 virginica 
    6          6.7         3.1          5.6         2.4 virginica

    It’s easy to create powerful filters that are executed in CAS while still using R syntax. However, the similarities to data.frame objects don’t end there. CASTable objects can also create computed columns and by groups using similar techniques.

    Want to learn more? Get your copy of SAS Viya: The R Perspective

    How to use SAS® Viya® and R for dynamic data selection was published on SAS Users.

    October 31, 2018
     

    This article is the first in a series of three posts to address REST APIs and their use in, and with, SAS. Today, I'll present a basic example using SAS Viya REST APIs to download an image from a report in SAS Visual Analytics.

    The second article will show an example of the Cloud Analytic Services (CAS) REST APIs. My third planned article will show a simple application that accesses SAS Viya using both sets of REST APIs.

    The inspiration for this example: a visualization of air traffic data

    I ran across a great post from Mike Drutar: How to create animated line charts that "grow" in SAS Visual Analytics. I followed the steps in Mike's example, which creates a visualization of airline traffic. The result was an animated line chart. For this article, I removed the animation, as it will serve me better in my use case.

    SAS Viya APIs and CAS APIs: Two entry points into SAS Viya

    The first thing I'd like to cover is why SAS Viya offers two sets of REST APIs. Let's consider who is using the APIs and what they are trying to accomplish. SAS Viya APIs target enterprise application developers (who may or may not be versed in analytics) who intend to build on the work of model builders and data scientists. These developers want to deliver apps based on SAS Viya technology -- for example, to call an analytical model to score data. On the other hand, the CAS REST API is used by data scientists and programmers (who are decidedly adept at analytics) and by administrators, who need to interact with CAS directly and are knowledgeable about CAS actions. CAS actions are the building blocks of analytical work in SAS Viya.

    How to get started with SAS Viya REST APIs

    The best place to start working with SAS Viya REST APIs is on the SAS Developer's web site. There, you will find links to the API documentation.

    The REST APIs are written to make it easy to integrate the capabilities of SAS Viya into applications and scripts. The APIs are based on URLs, HTTP authentication, and HTTP verbs. The API documentation page is split into multiple categories. The following table outlines the breakdown:

    API Category           Description
    Visualization          Provide access to reports and report images
    Compute                Act on SAS compute and analytic servers, including Cloud Analytic Services (CAS)
    Text Analytics         Provide analysis and categorization of text documents
    Data Management        Enable data manipulation and data quality operations on data sources
    Decision Management    Provide access to machine scoring and business rules
    Core Services          Provide operations for shared resources such as files and folders

     

    The REST API documentation page is divided into multiple sections.

    SAS Viya REST API doc

    1. The categories are listed in the upper-left pane.
    2. Once you select a category, related services and functions are listed in the lower-left pane.
    3. The service appears in the center pane with a description, parameters, responses, and error codes.
    4. The right pane displays how to form a sample request, any optional or required body text, and sample response code.

    The REST API call process

    The example outlined in this article shows how to access a report image from SAS Visual Analytics. To try this out yourself, you will need: a SAS Viya environment (with SAS Visual Analytics configured), an access token, and a REST client. The REST client can be cURL (command line), Postman (a popular REST API environment), or Atom with the rest-client plugin -- or any other scripting language of your choice. Even if you do not have access to an environment right now, read on! As a SAS developer, you're going to want to be aware of these capabilities.

    Get a list of reports from SAS Visual Analytics

    Run the following curl command to get a list of reports on the SAS server:

    curl -X GET http://sasserver.demo.sas.com/reports/reports \
      -H 'Authorization: Bearer <access-token-goes-here>' \
      -H 'Accept: application/vnd.sas.table.column+json'

    Alternatively, use Postman to enter the command and parameters:

    GET Report List API call from Postman

    From the JSON response, find the report object and grab the id of the desired report:

    GET Report List Response
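
    As a rough sketch of the "grab the id" step from a script rather than from Postman, the snippet below pulls the first id out of a JSON string. The sample response is hypothetical and heavily trimmed (the real payload has many more fields), and the helper name first_id is my own. It uses pattern matching rather than a real JSON parser, so treat it as an illustration only.

```shell
#!/bin/sh
# Hypothetical, trimmed-down response; field layout is an assumption for illustration.
response='{"items":[{"id":"b555ea27-f204-4d67-9c74-885311220d45","name":"Air Traffic"}]}'

# Print the first "id" value found in a JSON string.
# This is plain pattern matching, not JSON parsing; prefer a JSON-aware tool when one is available.
first_id() {
  printf '%s\n' "$1" |
    grep -o '"id"[[:space:]]*:[[:space:]]*"[^"]*"' |
    head -n 1 |
    sed 's/.*"\([^"]*\)"$/\1/'
}

report_id=$(first_id "$response")
echo "$report_id"
```

    In practice the response variable would hold the output of the curl command above instead of a hard-coded string.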

    Create a job

    The next step is to create an asynchronous job to generate the SVG image from the report. I use the following HTTP POST to the /jobs endpoint:

    curl -X POST http://sasserver.demo.sas.com/reportImages/jobs \
      -H 'Authorization: Bearer <access-token-goes-here>' \
      -H 'Accept: application/vnd.sas.report.images.job+json' \
      -H 'Content-Type: application/vnd.sas.report.images.job.request+json'

    The request uses the following sample body text:

    {
      "reportUri" : "/reports/reports/b555ea27-f204-4d67-9c74-885311220d45",
      "layoutType" : "entireSection",
      "selectionType" : "report",
      "size" : "400x300",
      "version" : 1
    }

    Here is the sample response:

    POST Job Creation Response

    The job creation kicks off an asynchronous action. The response indicates whether the job is completed at response time, or whether it's still pending. As you can see from the above response, our job is still in a 'running' state. The next step is to poll the server for job completion.
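
    To submit the job from a script, the body text has to travel with the POST. Here is a minimal sketch, assuming the host and token live in environment variables named SAS_HOST and ACCESS_TOKEN (both names are mine, not part of the API). The function names are also hypothetical; the body fields mirror the sample body shown earlier.

```shell
#!/bin/sh
# Build the JSON request body for a report image job.
# $1: report id, $2: image size such as 400x300 (both supplied by the caller).
build_job_body() {
  printf '{"reportUri":"/reports/reports/%s","layoutType":"entireSection","selectionType":"report","size":"%s","version":1}\n' "$1" "$2"
}

# Submit the job; curl reads the body from standard input via --data @-.
# SAS_HOST and ACCESS_TOKEN are assumed environment variables.
create_image_job() {
  build_job_body "$1" "$2" |
    curl -s -X POST "$SAS_HOST/reportImages/jobs" \
      -H "Authorization: Bearer $ACCESS_TOKEN" \
      -H 'Accept: application/vnd.sas.report.images.job+json' \
      -H 'Content-Type: application/vnd.sas.report.images.job.request+json' \
      --data @-
}

# Show the body that would be sent (no network call is made here).
build_job_body "b555ea27-f204-4d67-9c74-885311220d45" "400x300"
```

    Calling create_image_job with the same two arguments would send the request; its output is the job response described above.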

    Poll for job completion

    Using the 'id' value from the job creation POST, the command to poll is:

    curl -X GET http://sasserver.demo.sas.com/reportImages/jobs/f7a12533-ac40-4acd-acda-e0c902c6c2c1 \
      -H 'Authorization: Bearer <access-token-goes-here>' \
      -H 'Accept: application/vnd.sas.report.images.job+json'

    And the response:

    GET Poll Job Creation Response

    Once the job comes back with a 'completed' state, the response will contain the information we need to fetch the report image.
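
    Polling by hand gets tedious, so here is a hedged sketch of a polling loop. It assumes the job response carries a "state" field with values such as 'running' and 'completed', and that SAS_HOST and ACCESS_TOKEN are environment variables you have set; the function names and the retry defaults are my own choices.

```shell
#!/bin/sh
# Extract the "state" value from a job response (pattern matching, not full JSON parsing).
job_state() {
  printf '%s\n' "$1" |
    grep -o '"state"[[:space:]]*:[[:space:]]*"[^"]*"' |
    head -n 1 |
    sed 's/.*"\([^"]*\)"$/\1/'
}

# Fetch the job document. SAS_HOST and ACCESS_TOKEN are assumed environment variables.
fetch_job() {
  curl -s -X GET "$SAS_HOST/reportImages/jobs/$1" \
    -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H 'Accept: application/vnd.sas.report.images.job+json'
}

# Poll until the job reports "completed" or the attempts run out.
# $1: job id, $2: maximum attempts (default 10), $3: seconds between attempts (default 2).
# Returns 0 on completion and 1 if the job never reached "completed".
wait_for_job() {
  job_id=$1
  max_tries=${2:-10}
  delay=${3:-2}
  try=1
  while [ "$try" -le "$max_tries" ]; do
    state=$(job_state "$(fetch_job "$job_id")")
    if [ "$state" = "completed" ]; then
      return 0
    fi
    sleep "$delay"
    try=$((try + 1))
  done
  return 1
}
```

    A call such as wait_for_job f7a12533-ac40-4acd-acda-e0c902c6c2c1 20 3 would check up to 20 times, three seconds apart. Capping the attempts keeps a script from hanging if the job never finishes.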

    Get the image

    I am now ready to get the image. Using the image file name (href field) from the response above, I run the following command:

    curl -X GET http://sasserver.demo.sas.com/reportImages/images/K1870020424B498241567.svg \
      -H 'Authorization: Bearer <access-token-goes-here>' \
      -H 'Accept: image/svg+xml'

    Postman automatically interprets the response as an image. If you use the curl command, you'll need to redirect the output to a file.
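
    One way to save the image from a script is curl's -o option, which writes the response to a file instead of the terminal. The sketch below assumes ACCESS_TOKEN is an environment variable holding a valid token; the function names are hypothetical.

```shell
#!/bin/sh
# Build the image URL from the host and the image file name returned by the job.
# A single trailing slash on the host is tolerated.
image_url() {
  printf '%s/reportImages/images/%s\n' "${1%/}" "$2"
}

# Download the SVG to a local file. $1: host, $2: image file name, $3: output path.
# ACCESS_TOKEN is an assumed environment variable.
# --fail makes curl return an error on HTTP failures instead of saving the error page as the image.
download_image() {
  curl -s --fail -X GET "$(image_url "$1" "$2")" \
    -H "Authorization: Bearer $ACCESS_TOKEN" \
    -H 'Accept: image/svg+xml' \
    -o "$3"
}

# Show the URL that would be requested (no network call is made here).
image_url "http://sasserver.demo.sas.com/" "K1870020424B498241567.svg"
```

    For example, download_image http://sasserver.demo.sas.com K1870020424B498241567.svg air-traffic.svg would write the chart to air-traffic.svg in the current directory.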

    SAS Visual Analytics Graph for Air Traffic

    What's Next?

    SAS Visual Analytics is usually considered an interactive, point-and-click application. With these REST APIs we can automate parts of SAS Visual Analytics from a web application, a service, or a script. This opens tremendous opportunities for us to extend SAS Visual Analytics report content outside the bounds of the SAS Visual Analytics app.

    I'll cover more in my next articles. In the meantime, check out the Visualization APIs documentation to see what's possible. Have questions? Post them in the comments and I'll try to address them in future posts.

    Using SAS Viya REST APIs to access images from SAS Visual Analytics was published on SAS Users.

    October 31, 2018
     

    An important step of every analytics project is exploring and preprocessing the data. This transforms the raw data to make it useful and of high quality. It might be necessary, for example, to reduce the size of the data or to eliminate some columns. All these actions accelerate the analytical project that comes right after. But equally important is how you "productionize" your data science project. In other words, how you deploy your model so that the business processes can make use of it.

    SAS Viya can help with that.  Several SAS Viya applications have been engineered to directly add models to a model repository including SAS® Visual Data Mining and Machine Learning, SAS® Visual Text Analytics, and SAS® Studio. While the recent post on publishing and running models in Hadoop on SAS Viya outlined how to build models, this post will focus on the process to deploy your models with SAS Model Manager to Hadoop.

    SAS Visual Data Mining and Machine Learning on SAS Viya contains a pipeline interface to assist data scientists in finding the most accurate model. In that pipeline interface, you can perform several tasks, such as importing score code, scoring your data, or downloading score API code or Base SAS scoring code. Or, once you have a version ready, you may decide to store the model outside the development environment by registering your analytical model in a model repository.

    Registered models show up in SAS Model Manager and are copied to the model repository. That repository provides long-term storage and includes version control. It's a powerful tool for managing and governing your analytical models. A registered version of your model will never get lost, even if it's deleted from your development environment. SAS models are not the only kind of models that SAS Model Manager can handle: Python, R, and MATLAB models can also be imported.

    SAS Model Manager can read, write, and manage the model repository and provide actions for model editing, comparing, testing, publishing, validating, monitoring, lineage, and history of the models.  It also allows you to easily demonstrate your compliance with regulations and policies. You can organize models into different projects.   Within a project it's feasible to test, deploy and monitor the performance of the registered models.

    Deploying your models

    Deploying, a key step for any data scientist and model manager, brings the models into production processes. Kick off deployment by publishing your models. SAS Model Manager can publish models to systems used for batch processing or to applications where real-time execution of the models is required. Let's have a look at how to publish the analytical model to a Hadoop cluster and run the model in the Hadoop cluster. In doing so, you can score the data where it resides and avoid any data movement.

    1. Create the Hadoop publish destination.

    The easiest way to do this is via the Visual Interface.  Go to SAS Environment Manager and click on the Publish destinations icon:

    Click on the new destination icon:

    Important: