Cloud Computing

September 4, 2019
 

Editor’s note: This article is a continuation of the series by Conor Hogan, a Solutions Architect at SAS, on SAS and database and storage options on cloud technologies. Access all the articles in the series here.

In a previous article in this series, Accessing Databases in the Cloud – SAS Data Connectors and Amazon Web Services, I covered SAS and database as a service (DBaaS) and storage offerings from Amazon Web Services (AWS). Today, I cover the various storage options available on AWS and how to connect to and interact with them from SAS.

Object Storage

Amazon Simple Storage Service (S3) is low-cost, scalable cloud object storage for any type of data in its native format. Individual Amazon S3 objects can range in size from 1 byte all the way to 5 terabytes (TB). Amazon S3 organizes these objects into buckets. A bucket name is globally unique, and you access the bucket directly through an API from anywhere in the world, if granted permissions. By default, a bucket is granted least access. Amazon advertises 11 9's, or 99.999999999%, durability, meaning you are extremely unlikely to lose your data. To meet this durability, data replicates automatically across availability zones. You can reduce the number of replicas or use one of the various tiers of archive services to lower your object storage cost. Costs are calculated based on terabytes of storage per month, with added costs for requests and transfers of data.

SAS and S3

Support for Amazon Web Services S3 as a Caslib data source for SAS Cloud Analytic Services (CAS) was added in SAS Viya 3.4. This data source enables you to access SASHDAT files and CSV files in S3. You can use the CASLIB statement or the table.addCaslib action to add a Caslib for S3. SAS is currently exploring native object storage integration with AWS S3 for more file types. For other file types you can copy the data from S3 and then use a SAS Data Connector to load the data into memory. For example, if I had Excel data in S3, I could use PROC S3 to copy the data locally and then load the data into CAS using the SAS Data Connector to PC Files.
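
For illustration only, the sketch below registers an S3 caslib, loads a CSV object into CAS, and then uses PROC S3 to copy an Excel file locally as described above. The bucket, object paths, credentials, and region are placeholders, and exact option names can vary by SAS Viya release, so treat this as a starting point and verify against the documentation for your version.

/* Hypothetical bucket, paths, credentials, and region; replace with your own values */
caslib s3data datasource=(
   srctype="s3",
   accesskeyid="YOUR_ACCESS_KEY_ID",
   secretaccesskey="YOUR_SECRET_ACCESS_KEY",
   region="US_East",
   bucket="my-company-bucket",
   objectpath="/sales/"
);

/* Load a CSV object from the S3 caslib into CAS memory */
proc casutil;
   load casdata="transactions.csv" incaslib="s3data"
        casout="transactions" outcaslib="casuser";
run;

/* For file types without native S3 support (for example, Excel), copy the
   object locally with PROC S3, then load it with the appropriate data connector */
proc s3 keyid="YOUR_ACCESS_KEY_ID" secret="YOUR_SECRET_ACCESS_KEY";
   get "/my-company-bucket/sales/sales.xlsx" "/tmp/sales.xlsx";
run;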

Block Storage

Amazon Elastic Block Store (EBS) is the block storage service designed for use with Amazon Elastic Compute Cloud (EC2). The storage is accessible only when it is attached to a running EC2 instance's operating system. A storage volume behaves like an independent disk drive controlled by the server's operating system: you mount an EBS volume to the operating system as if it were a physical disk. EBS volumes are valuable because they are the storage that persists when you terminate your compute instance. You can choose from four different volume types that supply different performance levels at corresponding costs.

SAS and EBS

EBS is used as the permanent SAS data storage and persists through a restart of your SAS environment. The choice you make among the different EBS volume types has a direct impact on the performance you get from SAS. One thing to consider is using compute instances that have enhanced EBS performance or dedicated solid-state drive (SSD) instance storage. For example, the SAS Viya on AWS QuickStart uses Storage Optimized and Memory Optimized compute instances with local NVMe-based SSDs that are physically connected to the host server; this storage is coupled to the lifetime of the instance, but it is beneficial for performance.

SAS Cloud Analytic Services (CAS) is an in-memory server that relies on the CAS Disk Cache as its virtual memory storage backend. This is especially true if you are reading data from a database. In this case, make sure you have enough block storage, in the form of EBS volumes, for use as the CAS Disk Cache.

File Storage

Amazon Elastic File System (EFS) provides access to data through a shared file system. EFS is an elastic network file system that grows and shrinks as you add or remove files, so you only pay for the storage you consume. Users create, delete, modify, read, and write files organized logically in a directory structure for intuitive access. This allows simultaneous access for multiple users to a common set of file data managed with user and group permissions. Amazon FSx for Lustre is the high-performance file system service.

SAS and EFS

EFS shared file system storage can be a powerful tool if you are utilizing a SAS Grid architecture. If you have a requirement in your SAS architecture for a shared location that any node in a group can access and write to, then EFS could meet your requirement. To access the data stored in your network file system, you must mount the EFS file system. You can mount your Amazon EFS file systems on any EC2 instance, or on any on-premises server connected to your Amazon VPC.
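
Once the EFS file system is mounted at the same path on every node, SAS treats it like any other directory. The sketch below assumes a hypothetical mount point of /mnt/efs/sasdata shared by all SAS Grid nodes; the libref and data set names are likewise made up.

/* /mnt/efs/sasdata is a hypothetical EFS mount point, mounted identically on every node */
libname gridperm "/mnt/efs/sasdata";

data gridperm.class_copy;   /* any node that mounts the same EFS path sees this table */
   set sashelp.class;
run;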

BONUS: Serverless

Amazon Athena is a query service for Amazon S3. This service makes it easy to submit queries against the objects stored in S3, and you can analyze this data using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries you run. Amazon Athena uses Presto with ANSI SQL support and works with a variety of standard data formats, including CSV, JSON, ORC, Avro, and Parquet.

SAS and Athena

Amazon Athena is ODBC/JDBC compliant, which means I can use SAS/ACCESS Interface to ODBC or SAS/ACCESS Interface to JDBC to connect using SAS. Download an Amazon Athena ODBC driver and submit code from SAS just as you would for any other ODBC data source. Athena is a great tool if you want to use the serverless computing power of Amazon to query data in S3.
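
As an illustration only: assuming the Amazon Athena ODBC driver is installed and an ODBC data source named AthenaDSN is configured (with authentication and the S3 staging location defined in the DSN), the SAS side follows the usual SAS/ACCESS Interface to ODBC pattern. The libref and table name below are hypothetical.

/* AthenaDSN is a hypothetical data source name configured for the Athena ODBC driver */
libname athena odbc datasrc="AthenaDSN";

proc sql;
   select count(*) as row_count
   from athena.web_logs;    /* hypothetical Athena table over objects in S3 */
quit;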

Finally

Many times, we do not have a choice about the technologies we use or the infrastructure on which they sit. Luckily, if you use AWS, integration with SAS is not a concern. I've now covered databases and storage for AWS. In future articles, I'll cover the same topics for Microsoft Azure and Google Cloud Platform.

Additional Resources

Storage in the Cloud – SAS and Amazon Web Services was published on SAS Users.

August 22, 2019
 

Editor’s note: This is the first article in a series by Conor Hogan, a Solutions Architect at SAS, on SAS and database and storage options on cloud technologies. This article covers the SAS offerings available to connect to and interact with the various database options available in Amazon Web Services.

As companies move their computing to the cloud, they are also moving their storage to the cloud. Just like compute in the cloud, data storage in the cloud is elastic and responds to demand, and you pay only for what you use. As more technologies move to a cloud-based architecture, companies must consider questions like: Where is my data going to be stored? Do I want a hybrid solution? What cloud storage options do I have? What storage solution best meets my business needs? Another question requiring an answer is: Is the software I use cloud-ready? The answer in the case of SAS is, YES! SAS offers various cloud deployment patterns on various cloud providers and supports integration with cloud storage services.

This is part one in a series covering database as a service (DBaaS) and storage offerings from Amazon Web Services (AWS). Microsoft Azure and Google Cloud Platform will be covered in future articles. The goal is to supply a breakdown of these services to better understand the business requirements of these offerings and how they relate to SAS. I will focus primarily on SAS Data Connectors as part of SAS Viya, but all the same functionality is available using a SAS/ACCESS Interface in SAS 9.4. SAS In-Database technologies in SAS Viya are called SAS Data Connect Accelerators and are synonymous with the SAS Embedded Process.

SAS integration with AWS

SAS has extended SAS Data Connectors and SAS In-Database Technologies support to Amazon Web Services database variants. A database running in AWS is much like your on-premises database, except that Amazon manages the software and hardware. Amazon's DBaaS offerings take care of the scalability and high availability of the database with minimal user input. SAS integrates with your cloud database even if SAS is running on-premises or with a different cloud provider.

AWS databases

Amazon offers database service technologies familiar to users. It is important to understand the new terminology and how the different database services best meet the demands of your specific application. Many common databases already in use are being refactored and provided as service offerings to customers in AWS. The advantages for customers are clear: no hardware to manage and no software to install. Databases that scale automatically to meet demand, and software that updates and creates backups automatically, mean customers can spend more time creating value from their data and less time managing their infrastructure.

For the rest of this article I cover various database management systems, the AWS offering for each database type, and SAS integration. First let's consider the diagram below depicting a decision flow chart to determine integration points between AWS database services and SAS.

Integration points between AWS database services and SAS

Trace your path in the diagram and read on to learn more about connection details.

Relational Database Management System (RDBMS)

In the simplest possible terms, an RDBMS is a collection of managed tables with rows and columns. You can divide relational databases into two functional groups: online transaction processing (OLTP) and online analytical processing (OLAP). These two methods serve two distinct purposes and are optimized depending on how you plan to use the data in the database.

Transactional Databases (OLTP)

Transactional databases are good at processing reads, inserts, updates, and deletes. These queries usually have minimal complexity and arrive in large volumes. Transactional databases are not optimized for business intelligence or reporting. Data processing typically involves gathering input information, processing the data, and updating existing data to reflect the collected and processed information. Transactional databases prevent conflicts when two users access the same data concurrently. Examples include order entry, retail sales, and financial transaction systems. Amazon offers several types of transactional database services. You can organize Amazon Relational Database Service (RDS) into three categories: enterprise licenses, open source, and cloud native.

Enterprise License

Many customers already have workloads built around enterprise databases. Amazon provides a turn-key enterprise solution for customers not looking to break their relationship with enterprise vendors or refactor their existing workflows. AWS offers Oracle and Microsoft SQL Server as turn-key enterprise solutions in RDS. Both offerings include the required software license; however, Oracle also allows you to "Bring Your Own License" (BYOL). SAS has extended SAS Data Connector support for both cloud variants. You can use your existing license for SAS Data Connector to Oracle or SAS Data Connector to Microsoft SQL Server to interact with these RDS databases.
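
As a hedged sketch (the endpoint, service name, credentials, and schema below are placeholders, and option names can vary slightly by release), a caslib for an RDS Oracle instance might look like this:

/* Hypothetical RDS for Oracle endpoint; path= takes an Oracle connect string */
caslib orards datasource=(
   srctype="oracle",
   path="//my-oracle.abc123xyz.us-east-1.rds.amazonaws.com:1521/ORCL",
   username="sasuser",
   password="Secret123",
   schema="SALES"
);

proc casutil incaslib="orards";
   list files;                           /* browse the tables exposed by the caslib */
   load casdata="ORDERS" casout="orders"
        outcaslib="casuser";             /* load one table into CAS memory */
run;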

Remember you can install and manage your own database on a virtual machine if there is not an available database as a service offering. The traditional backup and update responsibilities are left to the customer in this case. For example, both SAS Data Connector to Teradata and SAS Data Connect Accelerator for Teradata are supported for Teradata installed on AWS.

Open Source

Amazon provides service offerings for common open-source databases like MySQL, MariaDB, and PostgreSQL. SAS has extended SAS Data Connector support for all these cloud variants. You can use your existing license for SAS Data Connector to MySQL to connect to either RDS MySQL or RDS MariaDB, and SAS Data Connector to PostgreSQL to interface with RDS PostgreSQL.
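
For example, a caslib for RDS PostgreSQL might look like the sketch below; the endpoint, database, credentials, and schema are placeholders.

/* Hypothetical RDS PostgreSQL endpoint and credentials */
caslib pgrds datasource=(
   srctype="postgres",
   server="my-postgres.abc123xyz.us-east-1.rds.amazonaws.com",
   database="salesdb",
   username="sasuser",
   password="Secret123",
   schema="public"
);

proc casutil;
   load casdata="orders" incaslib="pgrds"
        casout="orders" outcaslib="casuser";
run;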

Cloud Native

Amazon Aurora is a MySQL- and PostgreSQL-compatible relational database built for the cloud, combining the performance and availability of traditional enterprise databases with the simplicity and cost-effectiveness of open source databases. SAS has extended SAS Data Connector support for Amazon Aurora. You can use your existing license for SAS Data Connector to MySQL to connect to Aurora MySQL, or SAS Data Connector to PostgreSQL to interface with Aurora PostgreSQL.

Analytical Databases (OLAP)

Analytical databases are optimized for read performance. These databases work best with complex queries in smaller volumes. When working with an analytical database, you are typically doing analysis on multidimensional data interactively from multiple perspectives. Redshift is the analytical database service offered by Amazon. SAS has a dedicated product called SAS Data Connector to Amazon Redshift that was purpose-built for analytics workloads running in the Amazon cloud.
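
A minimal sketch of a Redshift caslib is shown below; the cluster endpoint, database, credentials, and schema are placeholders.

/* Hypothetical Redshift cluster endpoint and credentials */
caslib rsdata datasource=(
   srctype="redshift",
   server="my-cluster.abc123xyz.us-east-1.redshift.amazonaws.com",
   database="analytics",
   username="sasuser",
   password="Secret123",
   schema="public"
);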

NoSQL Databases

A non-relational or NoSQL database is any database not conforming to the relational database model. These databases scale more easily across a cluster of machines. NoSQL databases are a natural fit for the cloud because their loose dependencies make the data easier to distribute and scale. Each NoSQL database type is designed to solve a specific business problem. Here is a brief overview of the most common data structures: key-value, column, document, and graph databases.

Key-Value Database

A key-value database stores data as a collection of key-value pairs. The key acts as a unique identifier for each record. Amazon’s key-value database as a service is DynamoDB. SAS interacts with DynamoDB using industry standard ODBC or JDBC drivers.
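
As a sketch only: assuming a third-party DynamoDB ODBC driver is installed and an ODBC data source named DynamoDBDSN is configured on the SAS server, the SAS side is the standard SAS/ACCESS Interface to ODBC pattern. The libref and DSN below are hypothetical.

/* DynamoDBDSN is a hypothetical data source name backed by a DynamoDB ODBC driver */
libname dyndb odbc datasrc="DynamoDBDSN";

proc datasets lib=dyndb;   /* list the tables the driver exposes */
quit;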

Columnar Database

Data in a traditional relational database is organized by rows. Columnar databases instead store and retrieve data by column, so a query reads only the columns it needs, saving valuable time and network I/O. Redshift is the columnar database service offered by Amazon. SAS has a dedicated product called SAS Data Connector to Amazon Redshift that was purpose-built for this database.

Document Database

A document database queries data in documents, typically stored in JSON format. DocumentDB is the document database service offering from Amazon. SAS interacts with DocumentDB using industry-standard ODBC or JDBC drivers. DocumentDB is MongoDB-compatible, which means existing MongoDB drivers and tools work with DocumentDB. SAS is currently building out functionality for SAS Data Connector to MongoDB, and you should expect that support to extend to DocumentDB as well.

Graph Database

Amazon Neptune is the graph database service designed to work with complex hierarchies of interconnected data. These databases are designed to query relationships in data and reduce the number of table joins. SAS interacts with Amazon Neptune using industry-standard ODBC or JDBC drivers.

Hadoop

The traditional deployment of Hadoop is changing dramatically with the cloud. Traditional Hadoop vendors may have a tough time keeping up with the service offerings available in the cloud. Hadoop still offers reliable replicated storage across nodes and powerful parallel processing of large jobs without much data movement. Amazon offers Elastic MapReduce (EMR) as its Hadoop-as-a-service offering. Amazon EMR supports both SAS Data Connector to Hadoop and SAS Data Connect Accelerator for Hadoop.
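
A rough sketch of a caslib for an EMR cluster is shown below. The host name and the client-side Hadoop JAR and configuration directories are hypothetical and must point at client files gathered from your own cluster; additional options apply if SAS Data Connect Accelerator (the SAS Embedded Process) is deployed on the cluster.

/* Hypothetical EMR master node and client-side Hadoop JAR/config locations */
caslib emrdata datasource=(
   srctype="hadoop",
   server="emr-master.internal.example.com",
   username="hadoop",
   hadoopjarpath="/opt/sas/hadoop/jars",
   hadoopconfigdir="/opt/sas/hadoop/conf"
);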

Finally

It is important to think about the use case for your database and the type of data that you plan to store before you select an AWS database service. Understanding your workloads is critical to getting the right performance and cost. When dealing with cloud databases, remember that you are charged not only for the storage you use but also for the data you move out of the database. Analysis and reporting on your data may require data transfer. Be aware of these costs and think about how you can lower them by caching frequently accessed data or keeping it on-premises.

Additional Resources

  1. Support for Databases in SAS® Viya® 3.4
  2. Support for Cloud and Database Variants in SAS® 9.4

Accessing Databases in the Cloud – SAS Data Connectors and Amazon Web Services was published on SAS Users.

June 19, 2019
 

As a data scientist, did you ever come to the point where you felt the need for an evolved analytics platform bringing together the disparate skills of open source and commercial software? A system that enables advanced analytic capabilities? This is now possible and easy to implement. With many deployment possibilities, SAS Viya allows you to choose where your data is stored, where compute happens, and how your models are deployed.

Let's say you want to expand your model development process with SAS Viya analytical capabilities, and you don't want to wait to get such an environment up and running. Unfortunately, you have neither the infrastructure nor the experience to install SAS Viya. Moving the traditional way, you could go for:

  • Protracted hardware procurement and provisioning
  • Deployment planning and coordination with IT
  • Effort and time required for software installation/configuration

This solution may be the right path for many organizations, but I think we all recognize this: the traditional approach can take days, weeks, and yes, sometimes months.

What if you could get up and running with a full SAS Viya platform in two hours? If you have some affinity for cloud-based solutions, SAS offers you the AWS SAS Viya Cloud Rapid Deployment tool. SAS released this AWS Quick Start as a rapid deployment architecture for SAS Viya on AWS. Deployable products include SAS Visual Data Mining and Machine Learning, SAS Visual Statistics and SAS Visual Analytics.

The goal of this article is to walk you through how I launched such an AWS SAS Viya Quick Start. I strongly advise you to watch the related video by my colleague Erwan Granger. Much of what is covered here appears in Erwan's video. The recording predates the SAS Viya 3.4 release, but the main concepts are still the same.

What you will need

The following is a list of items you need to complete this task.

  • AWS Account with appropriate creation privileges
  • A valid SAS Viya License; this means you will need a SAS Software Order Confirmation e-mail
  • Optional: deploy with your own DNS name and SSL certificate. In that case, you need to register a domain managed by Amazon Route 53. For instructions on registering the domain, see the Route 53 documentation. You can request and register a certificate with AWS Certificate Manager.

Furthermore, it’s good to know this Quick Start provides two deployment options. You can deploy SAS Viya into a new Virtual Private Cloud (VPC) or into an existing VPC. The first option builds a new AWS environment consisting of the VPC, private and public subnets, NAT gateways, security groups, Ansible controllers, and other infrastructure components, and then deploys SAS Viya into this new VPC. The second option provisions SAS Viya in your existing AWS infrastructure. I decided to go for the first option.

What you will build

Here's an architectural overview of what we will build:

SAS Viya architecture on the AWS Cloud

You can find exactly the same architecture on the SAS Viya AWS Quick Start landing page.

Configure the build

We'll be following the build process outlined in the Quick Start guide. On the landing page, next to the "What you'll build" tab, you can click on "How to deploy". From there, launch the "Deploy into a new VPC" wizard.

Deploy into a new VPC wizard

Prerequisite prep

Make sure you sign in with your AWS account and have chosen the region where you want to deploy. On that first screen you can leave the Amazon S3 template URL default. That template is the basis for the AWS CloudFormation stack we are launching. CloudFormation is a tool from AWS that allows you to spin up resources in the right order. The template is the blueprint document for your CloudFormation stack. By keeping the default template, we build exactly the architecture displayed above.

Pre-req prep template

Now click "Next" and move to the page where we specify more details and the required CloudFormation parameters.

CloudFormation parameters

The first parameter is the SAS Viya Software Order file, which is the Amazon S3 location of the Software Order e-mail attachment.

SAS Viya install package location

In the Administration section, you provide parameters to configure your AWS architecture. That way, you control access, instance type, and if you will use a SAS Viya Mirror repository.

CloudFormation administration parameters

Administration parameter definitions:

  • The name of an Amazon EC2 key pair, so you can access the Ansible controller
  • The Amazon Availability zone for the public and private subnet
  • Allowable IP range for HTTP traffic; must be a valid IP CIDR range
  • Allowable IP Range for SSH traffic to the Ansible controller; must be a valid IP CIDR range
  • SAS Administrator password
  • Password for Default (sasuser) user
  • Amazon EC2 Instance type for CAS Compute VM
  • Amazon EC2 Instance type for SAS Viya Services VM
  • (Optional) Location of SAS Viya Deployment Repository data
  • (Optional) Operator Email

If you want to work with custom DNS names and SSL, you will need to provide the next three parameters as well.

DNS and SSL configuration (optional)

DNS and SSL parameters:

You may accept the defaults on the remaining parameters.

Optional parameters

After clicking "Next," another set of optional parameters is available. I mostly accept the default parameters provided. The lone exception is Rollback on failure.

Optional administration parameters

Based on what I've learned from Erwan's video, the safer choice is "No" on the Rollback option. This way, if the deployment process encounters issues, the log identifies in which step the error occurred. Of course, this means you are responsible for manually deleting AWS-created resources that are no longer necessary. The easiest way to do this is by deleting the CloudFormation stacks afterward.

Kick off the build

To conclude the deployment wizard, click "Next" once more and acknowledge the AWS resources to be created. Clicking "Create stack" starts the deployment process.

Start the build process

You can monitor the deployment log using AWS CloudWatch. In his video, Erwan demonstrates this at around minute 23.

After a successful formation, you will find two AWS CloudFormation stacks created. The Outputs section gives you the direct links to SAStudioV and SASDrive.

SAS Studio and SAS Drive stacks

That’s it. You are deployed and ready to begin using your SAS Viya environment!

Additional Reference

Alexander Koller writes about SAS on AWS and takeaways for preparing for the AWS associate solution architect exam.

Your experiences and opinion matter

New forces are shaping the analytics ecosystem. Increased competition, rising customer expectations, and new, emerging technologies such as AI and machine learning are challenging IT departments to evolve their analytics ecosystems to meet the demands of their business partners.

How is your organization doing this? How does your Analytics Cloud strategy compare to the market? And what do your peers think about migrating Analytics to the cloud? We can give you some insights and an industry benchmark on the topic.

Tell us about your experience in this 5-minute survey, and we will be happy to share a detailed industry insight report with you to answer these questions.

Deploy SAS Viya on AWS - Quick Start was published on SAS Users.

May 22, 2019
 

Analytics can bring tremendous value to a business. However, investing in an analytics solution is often easier said than done. The key challenge is demonstrating the value of analytics without first investing in technology and resources. The solution? Take a results-based approach to establish the value of analytics by using [...]

Establish the benefits of analytics before investing in a solution was published on SAS Voices by David Annis

May 17, 2019
 

In the article Serverless functions and SAS Viya - a good match, I discussed using serverless functions to deliver SAS Viya applications. Ignoring all the buzzwords, a serverless function boils down to a set of REST APIs. So, if you tried the example, you are now a REST API developer 🙂.

The serverless function allowed the application developer to do the following:

  1. Define what the end user must supply to the function. A good application developer will try to make the request simple and easy to understand.
  2. Return to the end user a response easily consumed by the client's program. Again, a good application developer would make sure the response satisfies most common usage scenarios.
  3. Hide all the details of what it took to satisfy the user's request.

This blog discusses using GraphQL to achieve the same goals. First, I will briefly discuss GraphQL, where it fits in with SAS Viya application integration, and how to create GraphQL-based applications. I also provide a series of examples based on real-world scenarios.

The images below display a high level comparison of the approaches between serverless and GraphQL.

serverless and GraphQL process flow

serverless and GraphQL process flow

Steps in the GraphQL flow

  1. A GraphQL server replaces the AWS API Gateway.
  2. The code that runs in the GraphQL server is referred to as "resolvers" - as the name implies, resolvers are used by the GraphQL server to execute user requests.
  3. The resolvers make the necessary REST API calls to the SAS Viya Server.

All of the code in this article resides in the restaf-graphql-demo GitHub repository. If you are not familiar with GraphQL please review the links at the end of this article before proceeding.

Why GraphQL?

Some smart folks at Facebook created GraphQL to solve problems they encountered using standard REST APIs. Companies like GitHub, Netflix, PayPal, The New York Times and many others are adopting GraphQL.

Some of the key motivators are:

  1. Users define and request what they need, following exact specifications
  2. A convenient way to front existing systems (REST-based or not) and databases with a Developer Experience friendly API
  3. Returning only the requested information reduces the data transferred - important for reducing network traffic
  4. GraphQL is less "chatty" - where a REST API requires multiple trips to the server, GraphQL can accomplish the same task in one round trip

Why GraphQL for SAS Viya application developers?

While the general GraphQL characteristics listed above are important, GraphQL is also a useful technology for developers creating applications integrated with SAS Viya.

  1. GraphQL is a ready-made vehicle for SAS users to deliver their applications as the next generation "stored process" developed with the DATA step and procedures, CAS Language (CASL) statements, custom CASL actions and SAS REST APIs.
  2. GraphQL is a great way for front-end and back-end developers to communicate.
  3. Developers can code to an agreed contract as specified by the GraphQL schema.
  4. Front-end developers can be confident what they get is exactly what they asked for.

Writing the GraphQL-based applications

The GraphQL queries used in this article are examples for demonstration purposes only and not "standards or strict guidelines" to follow. The code in the GitHub repository and the examples outlined below will help you jump-start your excellent adventure in GraphQL and SAS Viya applications.

The high-level steps for writing an application using GraphQL query are:

SAS Viya Side

  1. SAS programmers, data analysts and data scientists develop their intellectual property with SAS programs written with SAS procedures, CAS actions, the DATA step and the CASL language.

Server Side

  1. Build the GraphQL schema and define the queries (see this for examples). In relation to SAS Viya, the schema describes the input and output of the SAS programs.
    • Make sure you have discussed this with the UI developers and the SAS programmers
  2. Write the resolvers - GraphQL server will call this code to resolve the requests by the user (see this for examples).
  3. Register both of these with the GraphQL server.

Client Side

  1. You can build the web apps in the normal way with these characteristics:
    • These apps will call a single end point (/graphql) with a POST method.
    • The payload is the GraphQL query
    • The response will match the query and is easily accessible

The image below shows the flow of a GraphQL-based application. Users' queries are sent to the GraphQL server. The server parses the queries and calls the appropriate resolver (your code) to obtain the values for the requested fields. In this project the resolvers use restaf to make REST API calls to SAS Viya.

GraphQL-based application process flow

The rest of the blog discusses a few examples. All these examples are available in the repository. I chose to write the examples using JavaScript since it is one of the languages I am familiar with and can write reasonably decent code in. You can develop GraphQL-based SAS Viya applications in all the popular languages of today.

Example 1: Scoring a loan from client app

In this example, a data scientist working for a bank has created a model to score a loan applicant's eligibility. The scientist outlines the following requirements:

  1. The user can only enter the desired loan amount and their current assets. All the other parameters needed for scoring have set values. All the values must be passed to the SAS code as a dictionary named _args_.
  2. Since the scientist wants to run A/B experiments, the location and name of the scoring model's astore must be passed in as a dictionary named _appEnv_.
  3. The code developed by the data scientist is below. The score returns as a dictionary.

    {score= <value>}

SAS Code

I wrote the SAS program in this example in CASL.

loadactionset "astore";

  /* convert arguments to a cas table */
/* _args_  and _appEnv_ are  generated by caslBase - see caslBase for details */

/* CASL function to convert a dictionary to a cas table  see lib/argsToTable.js for details*/
argsToTable(_args_, 'casuser', 'INPUTDATA' );

/* score */
action astore.score /
    table  = { caslib= 'casuser', name = 'INPUTDATA' } 
    rstore = { caslib= _appEnv_.astore.caslib,  name=_appEnv_.astore.name }
    casout  = { caslib = 'casuser', name = 'OUTPUTDATA', replace= TRUE};

/* fetch results */
action table.fetch r = result /
    table = { caslib = 'casuser', name = 'OUTPUTDATA' } ;

/* extract the score and send it as a dictionary */
score = result.Fetch[1].P_BAD;
scoreo= {score= score};
send_response(scoreo);

Key points to note:

  1. The resolver creates and prepends two CASL dictionaries _args_ and _appEnv_.
  2. The CASL program returns the result using the send_response function.
    • One of the cool things is that CASL allows the programmer to customize the returned value. In this example the score is extracted into a dictionary.

Schema

Based on the requirement the schema is as shown below:

type Query {
   scoreLoan(amount: Int assets: Int) : Float
}

Key Point:

  1. The two values the user specifies are defined as the filter parameters to the query.

Application

scoreLoan

Key point:

  1. The user enters the two values the data scientist requires.

Client code

async function runScore(amount, assets){
    let payload = {
        query: `query {
            scoreLoan(amount: ${amount} assets: ${assets} )
        }`
    }

    let config = {
        url            : host + '/graphql',
        withCredentials: true,
        method         : 'POST',
        data           : payload
    }

    let r = await axios(config);
    return r.data.data.scoreLoan;
}

Key points:

  1. The payload is the GraphQL query.
  2. I use the POST method.
  3. The end point is /graphql - this is the only endpoint the application will use.
  4. The response is available as r.data.data.scoreLoan
  5. Note the simplicity of the client code to access the GraphQL server and obtain the results.

Resolver

let caslBase = require('../lib/caslBase');

module.exports = async function scoreLoan (_, args, context) {
    let { store } = context;
    let input = {
        JOB    : 'J1',
        CLAGE  : 100, 
        CLNO   : 20, 
        DEBTINC: 20, 
        DELINQ : 2, 
        DEROG  : 0, 
        MORTDUE: 4000, 
        NINQ   : 1,
        YOJ    : 10
    };

    input.LOAN  = args.amount;
    input.VALUE = args.assets;

    let env = {
        astore: {
            caslib: 'Public',
            name  : 'GRADIENT_BOOSTING___BAD_2'
        }
    }
    let result = await caslBase(store,['argsToTable.casl', 'score.casl'], input, env);
    let score = result.items('results', 'score');
    
    return score;

}

Key points:

  1. As required, the default values for the other parameters are added to the user input.
  2. The resolver contains the location and name of the model.
  3. The names of the SAS code are passed to caslBase - this allows the code to read the SAS code from a repository.
  4. The caslBase function calls jsonToDict to convert the JSON parameters to a CASL dictionary and passes it on to CAS along with the code.
  5. The user receives the resulting score.

Example 2: Reporting wine production to management

The TwoBit winery management wants a simple report to view the production of different wines by year. They want to be able to pick the year range and the wines in which they are interested. The data shown below is for the TwoBit Winery. The goal is to query for selected wines and filter on years.

The data for the winery is listed below.

 
Obs year cabernet merlot pinot chardonnay twobit
1 2000 10 20 30 40 50
2 2001 5 10 15 5 0
3 2002 6 7 11 12 13
4 2003 5 8 0 0 50
5 2004 11 5 7 8 100
6 2005 1 1 0 0 1000
7 2006 0 0 0 0 3000

 

SAS Code

The SAS experts at the company created the following SAS code to meet management's request. Note that for demo purposes the wine data is created inline.

data wineList;  
 input year cabernet merlot pinot chardonnay twobit ;  
 cards;  
 2000 10 20 30 40 50   
 2001 5 10 15 5 0  
 2002 6 7 11 12 13  
 2003 5 8 0 0 50 
 2004 11 5 7 8 100  
 2005 1  1 0 0 1000  
 2006 0 0 0 0 3000  
;;;; 
run;  
/* _selections_ macro variable was generated in the src/lib/getSelections function */
data wine ;  
    set winelist( where= (year GE &from and year LE &to)); 
    keep &_selections_; 
    run;  
ods html style=barrettsblue;  
    proc print data=wine;run;  
ods html close;run ;

Key points to note:

  1. The code requires macro variables &from, &to and &_selections_ be set before this code executes.
  2. The name of the returned table is wine.

Schema

type Query{
wineProduction(from: Int, to: Int): WineProduction
}

type WineProduction {
"""
An array containing wine production
"""
wines : [WineList]

"""
ODS output and Log output
"""
report: SASResults
}

type WineList {
year : Int
cabernet : Int
merlot : Int
pinot : Int
chardonnay: Int
twobit : Int
}

type WineProductionCas {
wines : [WineList]
}

type SASResults {
        """
        ODS output from the server
        """
        ods: String
        """
        Log output from the server
        """
        log: String
    
    }

Key points:

  1. As required, the year range is specified as filters for the query.
  2. As required, the user can pick the wines in which they are interested.

Application

The application is shown below.

Client code

The relevant client code is shown below (see this in the repository for the full program).

 let gqString = `query userQuery($from: Int, $to: Int) {
                           results: wineProduction(from: $from to: $to) {
                              wines { 
                                  ${wineList} 
                                } 
                                ${reportList}
                             } 
                            }`;
        let payload = {
            url   : host + '/graphql',
            method: 'POST',
            data: { 
                query: gqString,
                variables: {
                    from: fromYear.value,
                    to  : toYear.value
                }
            }
        }
        setReportValues(null);
        setResultValues(null);
        axios(payload)
         .then ( r => {
            let res = r.data.data.results;
           // Simple to extract the results
            setResultValues(res.wines);
            if (res.report != null ) {
                setReportValues(res.report);
            }
        
         })
         .catch( e => alert(e))
    }
})

Key points:

  1. The GraphQL query string is sent as the payload (wineList and reportList are strings computed earlier in the program based on user selection).
  2. The endpoint is again /graphql with a POST method.
  3. This snippet also shows the preferred way to send the filter values.

Resolver

The root resolver is shown below.

let getProgram    = require('../lib/getProgram');
let getSelections = require('../lib/getSelections');
let spBase        = require('../lib/spBase');

module.exports = async function wineProduction (_, args, context, info){
    let {store} = context;

    // read source - reads in the sas program
    let src = await getProgram(store, ['wines.sas']); 

    // update args with the wine list specified by the user
    let selections = getSelections(info, 'wines', args);

   // execute the sas code with compute server and get results
    let resultSummary = await spBase(store, selections.args, src);
    
    // resultSummary is now passed to the resolvers for wines and results fields.
    return resultSummary;
}

Key points:

  1. Code from the GitHub repo uses winelist.js to resolve the list of wines.
  2. Code from sasresults.js, sasOds.js and sasLog.js returns ODS output and the SAS log.
  3. The SAS code is read in from a repository using the getPrograms function.

Example 3: List SAS Visual Analytics reports

Another common use case is retrieving information about reports developed with SAS Visual Analytics. The GraphQL query to get the list of reports, who edited it last and when is shown below. This example uses the reports REST API.

Schema

{
    reports {
        name
        modifiedBy
        modifiedOn
   }
}

Creating a UI for this is left as an exercise for the reader (meaning I did not get around to writing it 🙂). The returned results look something like this:

{
    "data": {
    "reports": [
        {
            "name": "Application Activity",
            "modifiedBy": "SAS Supplied",
            "modifiedOn": "2018-04-20T14:24:05.258Z"
       },
      {
           "name": "CAS Activity",
           "modifiedBy": "SAS Supplied",
          "modifiedOn": "2018-06-08T20:21:14.727Z"
        }
...

Resolver

module.exports = async function reports (_, args, context) {
    let {store} = context;
    let reports = store.getService ('reports');
    let list =await getList(store, reports);
    return list;
}

async function getList(store, reports) {
    let reportsList =await store.apiCall (reports.links ('reports'));
    if (reportsList.itemsList().size ===0) {
       return [];
     }
    let r = reportsList.itemsList().map (name => {
         let t = {
             name : name,
             modifiedBy: reportsList.items(name, 'data', 'modifiedBy'),
             modifiedOn: reportsList.items(name, 'data', 'modifiedTimeStamp')
         };
        return t;
     });
   return r;
}

Example 4: Getting the URL and image of a specific report

The query below can be used to obtain the URL to display the interactive report and svg image of a specific report.

Schema

{
      report(name:"Application Activity"){
           url
          image
      }
}

The returned value will be along these lines:

{
  "data": {
    "report": {
      "url": "http://superuser.com/?reportUri=/reports/reports/ecec39ad-994f-4055-8e40-4360f410bc6e...",
      "image": "{the svg of the image}"
    }
  }
}

Resolver

There are three resolvers associated with this query: the root resolver and resolvers for image and url. For the sake of brevity, I will not review those here; please visit the code in the repository.

In conclusion

The examples above cover some basic scenarios for SAS Viya applications.

  1. Using CAS actions
  2. Using traditional data step and procs
  3. Obtaining ODS output
  4. Working with SAS Visual Analytics

The simplicity of the client code and the resolvers is what makes GraphQL attractive for writing SAS Viya applications. You can also exploit other features in SAS Viya using the same pattern. Further, you can use the examples in this repository to easily customize your own use cases. The resolvers and helper functions are written to be reusable with minimal effort. The instructions are in the README file in the repository. If you create interesting schemas and resolvers for SAS Viya, please share them with the SAS user community.

Opinion

Like all new technologies, GraphQL has its proponents and detractors. Also, many people get caught up in low-value arguments about whether GraphQL is better or worse than REST. I personally do not follow these discussions, since you should use the best tool for the job.

I find GraphQL most attractive when developing a back end for SAS Viya applications. Both front-end and back-end developers benefit from the clear definition of the schema. Having well-supported GraphQL servers from Apollo and Facebook makes it easier to adopt GraphQL.

Useful links

There are a growing number of resources from which to learn and model. Below is a small starter list.

  1. graphql.org
  2. Apollo
  3. Relay
  4. GraphQL Concepts Visualized by Dhaivat Pandya
  5. GraphQL tutorial from TutorialsPoint
  6. How to GraphQL

GraphQL and SAS Viya applications - a good match was published on SAS Users.

April 12, 2019
 

At the end of my SAS Users blog post explaining how to install SAS Viya on the Azure Cloud for a SAS Hackathon in the Nordics, I promised to provide some technical background. I ended up with only one manual step: launching a shell script from a Linux machine, which kicked off the whole process. In this post, I explain how we managed to automate this process as much as possible. Read on to discover the details of the script.

Pre-requisite

The script uses the Azure command-line interface (CLI) heavily. The CLI is Microsoft's cross-platform command-line experience for managing Azure resources. Make sure the CLI is installed, otherwise you cannot use the script.

The deployment process

The process contains three different steps:

  1. Test the availability of the SAS Viya installation repository.
  2. Launch a new Azure Virtual Machine. This action uses a previously created custom Azure image.
  3. Perform the actual installation.

Let’s examine the details of each step.

Test the availability of the SAS Viya installation repository

When deploying software in the cloud, Red Hat Enterprise Linux recommends using a mirror repository. Since the SAS Viya package allows for this installation method, we decided to use the mirror for the hackathon images. This is optional, but optimal, say if your deployment does not have access to the Internet or if you must always deploy the same version of software (such as for regulatory reasons or for testing/production purposes).

In our Azure Subscription we created an Azure Resource group with the name ‘Nordics Hackathon.’ Within that resource group, there is an Azure VM running a web server hosting the downloaded SAS Viya repository.

Azure VM running HTTPD Server and hosting a SAS Viya Mirror Repository

Of course, we cannot start the SAS Viya installation before being sure this VM – hosting all rpms to install SAS Viya – is running.
To validate that the VM is running, we issue the start command from the CLI:

az vm start -g [Azure Resource Group] -n [AZ VM name]

Something like:

az vm start -g my_resourcegroup -n my_viyarepo34

If the server is already running, nothing happens. If not, the command starts the VM. We can also check the Azure console:

Azure Console with 'Running' VMs

Launching the VM

The second part of the script launches a new Azure VM. We use the custom Azure image we created earlier. The SAS Viya image creation is explained in the first blog post.

The Azure image used for the Nordics hackathon was the template for all other SAS Viya installations. On this Azure image we completed several valuable tasks:

  • We performed a SAS Viya pre-deployment assessment using the SAS Viya Administration Resource Kit (Viya ARK) utility tool. The Viya ARK - Pre-installation Playbook is a great tool that checks all prerequisites and performs many pre-deployment tasks before deploying SAS Viya software.
  • Installed R-Server and R-Studio
  • Installed Ansible
  • Created a SAS Viya Playbook using the SAS Orchestration CLI.
  • Customized Ansible playbooks created by SAS colleagues, used to kick off the OpenLDAP and JupyterHub installations.

Every time we launch our script, a new Azure virtual machine launches as an exact copy of that image, fully customized according to our needs for the hackathon.
Below is the Azure CLI command used in the script to create a new Azure VM.

az vm create --resource-group [Azure Resource Group] --name $NAME --image viya_Base \
--admin-username azureuser --admin-password [your_pw] --subnet [subnet_id] \
--nsg [optional existing network security group] --public-ip-address-allocation static \
--size [any Azure size] --tags name=$NAME

After the creation of the VM, we install SAS Viya in the third step of the process.

Installation

After running the script three times (using a different value for $NAME), we end up with the following high-level infrastructure:

SAS Viya on Azure Cloud deployment

After the launch of the Azure VM, the script starts the viya-install.sh install script, staged on the original image in the /opt/sas/install/ location.
In the last step of the deployment process, the script installs OpenLDAP, SAS Viya and JupyterHub. The following command runs the script:

az vm run-command invoke -g [Azure Resource Group] -n $NAME --command-id RunShellScript --scripts "sudo /opt/sas/install/viya-install.sh &"

The steps in the script should be familiar to those with experience installing SAS Viya and/or Ansible playbooks. Below is the script in its entirety.

#!/bin/bash
touch /start
####################################################################
echo "Starting with the installation of OPENLDAP. Check the openldap.log in the playbook directory for more information" > /var/log/myScriptLog.txt
####################################################################
# install openldap
cd /opt/sas/install/OpenLDAP
ansible-playbook openldapsetup.yml
if [ $? -ne 0 ]; then { echo "Failed the openldap setup, aborting." ; exit 1; } fi
cp ./sitedefault.yml /opt/sas/install/sas_viya_playbook/roles/consul/files/sitedefault.yml
if [ $? -ne 0 ]; then { echo "Failed to copy file, aborting." ; exit 1; } fi
####################################################################
echo "Starting Viya installation" >> /var/log/myScriptLog.txt
####################################################################
# install viya
cd /opt/sas/install/sas_viya_playbook
ansible-playbook site.yml
if [ $? -ne 0 ]; then { echo "Failed to install sas viya, aborting." ; exit 1; } fi
####################################################################
echo "Starting jupyterhub installation" >> /var/log/myScriptLog.txt
####################################################################
# install jupyterhub
cd /opt/sas/install/jupy-azure
ansible-playbook deploy_jupyter.yml
if [ $? -ne 0 ]; then { echo "Failed to install jupyterhub, aborting." ; exit 1; } fi
####################################################################
touch /finish 
####################################################################

Up next

In a future blog, I hope to show you how to get up and running with the SAS Viya Azure Quick Start. For now, the details I provided in this and the previous blog post are enough to get you started deploying your own SAS Viya environments in the cloud.

Script for a SAS Viya installation on Azure in just one click was published on SAS Users.

April 5, 2019
 

You are a data scientist, in your office, doing data scientist-y things when your manager's manager's manager makes an impossible request. She wants you to take a raw data set from the stem cell research team, scrub the data, create and score models, and be ready to rescore when new data becomes available. And she wants it in a week. WHAT?! Your company doesn't own an analytics software license, and a spreadsheet is not going to work on this data with millions of records. Even if you received funding, how could you ever create and maintain an environment under your tight deadline? Take a deep breath, conjure your inner data scientist acumen, and realize SAS has the answer.

SAS Machine Learning on SAS Analytics Cloud provides on-demand programming access to machine learning algorithms in the cloud. No downloads, no installs, no infrastructure, no maintenance. This solution provides a multithreaded, multiuser environment for concurrent access to data in memory. The solution is designed for data scientists (and others) coding in SAS or Python and allows them on-demand programmatic access to SAS Viya. You can find more details on SAS Analytics Cloud in the fact sheet. You can even try it for free! The rest of this article walks you through the features of this new SAS offering and outlines how it can help you complete the task bestowed upon you.

Register and get started

Literally, to sign up for the trial, all you need are a SAS Profile, an email address, and a PC. You will be coding in SAS in less than a minute. From the SAS Cloud Analytics page, select the Get Free Trial button. This takes you to the SAS Profile login page (note you can create your SAS Profile here if you do not have one).

SAS Profile log in or creation

Agree to the Terms and Conditions on the License Agreement page and select the Continue button:

Trial License Agreement

You will receive an email containing a URL much like the following:

email confirmation with trial URL

Logging in

Select the link or paste it into your browser (Google Chrome 64-bit recommended) and you will see the login screen. Enter your SAS Profile credentials and click the Sign In button.

Sign In screen

The Home screen (Applications) appears.

Home Page

We'll discuss the Data and Team pages in further detail later on in this article. You have two options for applications: SAS Studio (for SAS programming) and JupyterLab (for Python programming). This article focuses on SAS Studio. A follow up article will cover the JupyterLab use case. Select the SAS Studio button, a new tab opens to SAS Studio, and we're ready to start coding.

SAS Studio

You are familiar with the SAS language, but you need to brush up a little. Have no fear, support documentation is easily accessible. Also, the SAS Data Mining and Machine Learning Community is a great place to discover additional resources and ask questions. Finally, embedded in SAS Studio are code snippets. You decide to explore the latter.

Code snippets

In SAS Studio select the Snippets twisty in the left pane. Navigate to the SAS Viya Machine Learning section. Here you find code samples you will use to prep and analyze your data. When opening a snippet, you see code and detailed comments on what the code will accomplish. You will use these snippets as a guide when you load and prep your data and perform your analysis. Below is an image of the Prepare and Explore Data snippet. Notice each code step has accompanying comments.

You read through each snippet in the Machine Learning section. The syntax and structure of the code come back to you pretty quickly, and you're now ready to try it all out on your own data.

Uploading data

Now that you have an idea of what code you need to write, you need to load the data from the research department. You accomplish this by selecting the Server Files and Folders twisty and navigating to the Folder Shortcuts section. In this instance you want to upload your file into the shared/data directory (I'll explain why I chose this location in a moment). Use the Upload button to upload the research data file.

Upload file to the data directory
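
From there, a few lines of SAS code load the uploaded file into CAS. The sketch below assumes a hypothetical CSV file named research.csv uploaded to the shared/data folder; the path, caslib, and table names are placeholders you would adjust for your environment.

/* research.csv and its path are hypothetical; use the actual location of your shared/data folder */
cas mysess;                    /* start a CAS session                      */
caslib _all_ assign;           /* assign librefs for the available caslibs */

proc casutil;
   load file="/shared/data/research.csv"
        outcaslib="casuser" casout="research" replace;
run;

proc print data=casuser.research(obs=5);   /* quick check of the loaded table */
run;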

You're not alone

Files uploaded to shared/data are now visible to others logged into the environment. Wait, did I forget to mention this is a multi-user environment?! Well, yes, it is. You can invite others to collaborate on the project. To add and manage users, return to the Home screen (leaving SAS Studio open). Select the Team section in the left pane. The Team page lists users and displays an Invite button, used to send an invitation for system access to others.

Teams page

To invite others, click the button and enter the email address of the new user. This generates and sends an invitation email. The new user accepts the invite and now has access to the system. Using the URL provided in the email, the new user logs in with their own SAS Profile credentials. The default role for new users is ‘User.’ A user with admin privileges can change the role to ‘Admin.’ In the free trial, you are permitted to have a total of five users.

Shared data

You may have guessed by now the Data section lists directories and files located in the shared directory in SAS Studio.

Data page

You also notice here you have 5 GB of storage space. This includes shared and non-shared files.

I love this. How do I get more?

Now you know your way around the system and are ready to start coding. Return to SAS Studio, open a new program, and commence your analysis of the stem cell data. When you successfully deliver the project and impress your management chain, you can mention how the SAS Analytics Cloud solution made it all possible (and simple). You now have a case for departmental procurement of the solution, opening your organization up to more users, more storage, and more power to run advanced machine learning algorithms on your data.

Your turn

In this article I've outlined how to easily register for the SAS Machine Learning trial and start coding in a matter of minutes. Try it out yourself. Register, load your data, get coding, and solve your problem.

Related Resources

For more details on the development of SAS Analytics Cloud, check out Missy Hannah's interview with two UI developers on the project.

Zero to SAS in 60 Seconds- SAS Machine Learning on SAS Analytics Cloud was published on SAS Users.

March 27, 2019
 

PAYG financial services: coming to a bank near you

You walk into your neighborhood bank to see about a mortgage. You and your spouse have your eye on the perfect 3BR, 2BA brick ranch near your child's school, and it won't be on the market long. An hour later, you burst through the front door with a bottle of champagne: "We're qualified!"

Also celebrating is your bank's branch manager. She was skeptical when headquarters analysts equipped branches for "cloud-based applications using SAS," saying it would speed up loan applications. But your quick, frictionless transaction proved them right. The bank's accountants are happy too. The new pay-as-you-go model of using SAS software in the cloud means big savings.

The above scenario is possible now through serverless functions, which enable your SAS Viya applications to take input from end users, score the loan application, and return results.

The rest of this post gets into the nitty-gritty of serverless functions and SAS Viya, detailing what happens in a bank's computers after a customer applies for a loan. The qualification process starts by running a previously built scoring model to generate a score. You will see how the combination of REST APIs in SAS Viya, analytic models and the restaf library makes the task of building the serverless function relatively simple.

The blog titled "SAS REST APIs: a sample application" demonstrated building a SAS Viya application using REST APIs, SAS Visual Analytics and SAS Operational Research. That is a typical web application, with an application server and SAS Viya running on premises.

If you are one of the many users using (or considering) a cloud provider, serverless functions are a useful alternative way to deliver your applications to your users. This eliminates the need to manage the application server associated with your application. Additionally, you get zero administration and auto-scaling, among other benefits. Many SAS applications that respond quickly to user requests are ideal candidates to be deployed as serverless functions.

The example in this article is available on SAS software’s GitHub site in the viya-apps-serverless-score repository. If you want to see the end application for a frame of reference, see the Using the serverless functions section at the bottom of this article.

Let’s begin with a bit of background on serverless computing and then dig into the details of the application and functions.

Serverless computing explained

Here is how AWS, Azure and serverless.com describe the benefits of serverless functions:

AWS Lambda

AWS Lambda lets you run code without provisioning or managing servers. You pay only for the compute time you consume– there is no charge when your code is not running. With Lambda, you can run code for virtually any type of application or backend service – all with zero administration. Just upload your code and Lambda takes care of everything required to run and scale your code with high availability. You can set up your code to automatically trigger from other AWS services or call it directly from any web or mobile app.

What is serverless computing?

According to Azure, serverless computing is the abstraction of servers, infrastructure, and operating systems. When you build serverless apps you don’t need to provision and manage any servers, so you can take your mind off infrastructure concerns. Serverless computing is driven by the reaction to events and triggers happening in near-real-time in the cloud. As a fully managed service, server management and capacity planning are invisible to the developer, and billing is based just on resources consumed or the actual time your code is running.

Four core benefits of serverless computing from serverless.com:

  1. Zero administration – Deploy code without provisioning anything beforehand or managing anything afterward. There is no concept of a fleet, an instance, or even an operating system. No more bothering the Ops department.
  2. Auto-scaling – Let your service providers manage the scaling challenges. No need to fire alerts or write scripts to scale up and down. Handle quick bursts of traffic and weekend lulls the same way — with peace of mind.
  3. Pay-per-use – Function-as-a-service compute and managed services charged based on usage rather than pre-provisioned capacity. You can have complete resource utilization without paying a cent for idle time. The results? 90% cost-savings over a cloud VM, and the satisfaction of knowing that you never pay for resources you don’t use.
  4. Increased velocity – Shorten the loop between having an idea and deploying to production. Because there’s less to provision up front and less to manage after deployment, smaller teams can ship more features. It’s easier than ever to make your idea live.

OK, so there is a server involved in serverless computing. The beauty in this technology is that once you deploy your code, you don't have to worry about the underlying infrastructure. You just know that the app should work and you only incur costs when the app is running.

Basic flow

Serverless functions are loaded and executed based on the occurrence of one of the triggers/events supported by the cloud vendor. In this example, the API Gateway triggers the serverless functions when an HTTP call invokes the function. The API Gateway calls the handler for the function and passes in the user data. On return from the handler, the response is sent to the client. This article focuses on the code inside the Serverless Function box in Figure 1 below.

Figure 1: Request Workflow

This example utilizes two key functions:

  1. app – This function serves up an HTML application for the user to enter the data. This is an example of a web application delivered as a serverless function (see the sketch after this list).
  2. score – This function takes user input from the web app, executes scoring on a Viya Server and returns the results.
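The score handler is examined in detail later in this article; the app handler is not shown. Here is a minimal sketch of what it might look like, assuming the same Node.js runtime and API Gateway proxy response shape used by the score function; the HTML string is a placeholder for the real loan-entry form in the repository.

module.exports.app = async function (event, context) {
    /* Placeholder page; the real handler serves the loan-entry form from loan.html */
    let html = '<html><body><h1>Loan scoring demo</h1></body></html>';
    return {
        statusCode     : 200,
        headers        : { 'Content-Type': 'text/html' },
        isBase64Encoded: false,
        body           : html
    };
};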

Serverless.yml

The serverless.yml file defines the serverless functions, the handlers used to execute them, and other system-related information. We will focus only on the application-specific information.

The code snippet below shows the definition of the path and handler for the two functions in the serverless.yml file.

functions:
  app: 
    handler: src/app.app
    events:
      - http:
          path: app
          method: get
          cors: 
            origin: '*'
          request:
            parameters:
              paths:
                id: true  
 
  score:
    handler: src/score.score
    events:
      - http:
          path: score
          method: post
          cors: 
            origin: '*'

The functions(app & score) in the yaml define:

  1. event - http event will trigger this function
  2. path - this is path to the function - similar to what you define in Express or hapijs
  3. method - http standard GET, PUT etc...
  4. others - refer to the cloud vendor's documentation for other available options.

The serverless.yml file also sets application-related information using environment variables. In this particular use case we define how to access SAS Viya and which scoring model (astore) to use.

environment:
#
# Information for logging into SAS Viya
#
  VIYA_SERVER: http://example.viya.server.com
  CLIENTID: raf
  CLIENTSECRET: raf
  USER: rafuser
  PASSWORD: rafpass
 
#
# astore to be used for scoring
#
  ASTORE_CASLIB: casuser
  ASTORE_NAME: GRADIENT_BOOSTING___BAD_2

A note on securing your password

In this example we store the user ID and password in environment variables. This is to keep the focus on the internals of serverless functions for SAS Viya. Locally you can use "serverless variables" to secure the information during development. However, for production deployment, refer to your provider's recommendations and the user community for best practices.

Sounds like a followup blog in the future 🙂

Anatomy of the serverless function

Figure 2 shows the flow inside the serverless function for this example. This pattern will repeat itself in your serverless functions.

Figure 2: Serverless Function Flow

Serverless function score

The code below is the handler for the score function. The rest of this section will discuss each of the key features of the handler.

//
// See src/score.js for the full code
//
module.exports.score = async function (event, context) {
 
   let store   = restaf.initStore();   /* initialize restaf      */
   let inParms = parseEvent(event);    /* get user input         */
   let payload = getLogonPayload();    /* get logon information  */
 
   return store.logon(payload)                     /* logon to SAS Viya   */
        .then(()     => scoreMain(store, inParms)) /* score               */
        .then(result => setPayload(result))        /* return results      */
        .catch(err   => setError(err));            /* else return errors  */
}

Step 1: Parse the input

The event parameter contains the input from the caller (a web application, another serverless function, etc.).
The content of the event parameter is whatever the designer of the serverless function desires. In this particular case, sample event data is shown below.

{
    "input": {
        "JOB"    : "J1",
        "CLAGE"  : 100,
        "CLNO"   : 20,
        "DEBTINC": 20,
        "DELINQ" : 2,
        "DEROG"  : 0,
        "MORTDUE": 4000,
        "NINQ"   : 1,
        "YOJ"    : 10,
        "LOAN"   : 10000,
        "VALUE"  : 1000000
    }
}

The parseEvent function validates the incoming information.

module.exports = function parseEvent(event) {
    let input = null;
    let body  = {};
    let rstore = {
        caslib: process.env.ASTORE_CASLIB,
        name  : process.env.ASTORE_NAME
    };
    if (event.body != null) {
        body = (typeof event.body === 'string') ? JSON.parse(event.body) : Object.assign({}, event.body);
        if (body.hasOwnProperty('input') === true) {
            input = body.input;
        }
    }
    return { rstore: rstore, input: input };
}

Step 2: Logon to SAS Viya

The serverless.yml file defines the SAS Viya logon information. Note there are more secure ways to manage sensitive information like passwords; refer to your provider’s documentation.

module.exports = function getLogonPayload() {
    let p = {
        authType    : 'password',
        host        : `${process.env.VIYA_SERVER}`,
        user        : process.env['USER'],
        password    : process.env['PASSWORD'],
        clientID    : process.env['CLIENTID'],
        clientSecret: (process.env.hasOwnProperty('CLIENTSECRET')) ? process.env[ 'CLIENTSECRET' ] : ''
        };
    return p;
 }

The line store.logon(payload) in the handler code logs on to the SAS Viya server using this information.

Step 3 and Step 4: Create Payload and make REST API calls

On successful logon, the server is called to do the scoring. This particular example uses the sccasl.runcasl action to run CAS language (CASL) statements and return the scores. Creating the score has two steps:

  1. Upload the user input: the input is converted to a csv file and uploaded to a CAS table
  2. Submit CASL statements to SAS Viya (CAS) to do the scoring

The code in src/scoreMain in the repository accomplishes both these steps.

Each of these steps use a CAS action:

    • table.upload – uploads the user data into a CAS table. The input data is converted into a comma-delimited file (csv) and then uploaded. The REST call using restaf looks like this:
    let csv = makecsv(input); /* create a csv */
    let JSON_Parameters = {
        casout: {
            caslib : 'casuser', /* a valid caslib */
            name   : 'INPUTDATA', /* name of output file on cas server */
            replace: true
        },
 
        importOptions: {
            fileType: 'csv' /* type of the file being uploaded */
        }
    };
 
    let payload = {
        headers: { 'JSON-Parameters': JSON_Parameters },
        data   : csv,
        action : 'table.upload'
    };
 
    let result = await store.runAction(session, payload);
    • sccasl.runcasl – executes CASL statements to do the scoring:
    // Setup CASL statements
    let caslStatements = `
        loadactionset "astore";
        action table.loadTable /
            caslib = "${rstore.caslib}"
            path   = "${rstore.name}.sashdat"
            casout = { caslib = "${rstore.caslib}" name = "${rstore.name}" replace = TRUE };
 
        action astore.score /
            table  = { caslib = 'casuser' name = 'INPUTDATA' }
            rstore = { caslib = "${rstore.caslib}" name = '${rstore.name}' }
            out    = { caslib = 'casuser' name = 'OUTPUTDATA' replace = TRUE };
 
        action table.fetch r = result /
            format = TRUE
            table  = { caslib = 'casuser' name = 'OUTPUTDATA' };
 
        send_response(result);
    `;
 
    // Execute the CAS actions
    payload = {
        action : 'sccasl.runcasl',
        data   : { code: caslStatements }
    };
    result = await store.runAction(session, payload);
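Putting these pieces together, here is a hedged sketch of how src/scoreMain.js might tie the two calls into a single function. It is assembled only from the calls already shown in this article; createCasSession() and buildCaslStatements() are hypothetical helpers standing in for the restaf session-creation code and the CASL template above, so check the repository for the actual implementation.

module.exports = async function scoreMain(store, inParms) {
    let { rstore, input } = inParms;
    let session = await createCasSession(store);        /* hypothetical helper: obtain a CAS session with restaf */
 
    /* Step 3: upload the user input as a csv into the CAS table INPUTDATA (table.upload) */
    let csv = makecsv(input);                            /* helper referenced earlier in this article */
    await store.runAction(session, {
        headers: { 'JSON-Parameters': {
            casout       : { caslib: 'casuser', name: 'INPUTDATA', replace: true },
            importOptions: { fileType: 'csv' }
        } },
        data   : csv,
        action : 'table.upload'
    });
 
    /* Step 4: load the astore and score INPUTDATA (sccasl.runcasl) */
    let result = await store.runAction(session, {
        action : 'sccasl.runcasl',
        data   : { code: buildCaslStatements(rstore) }   /* hypothetical helper wrapping the CASL shown above */
    });
    return result;
};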

Step 5: Create response

AWS serverless functions must return data and errors in a specific form. The two functions setPayload.js and setError.js accomplish this. The setPayload.js function is shown here:

module.exports = function setPayload (body) {
    return {
        "statusCode": 200,
        "headers"   : {
            'Access-Control-Allow-Origin'     : '*',
            'Access-Control-Allow-Credentials': true
          },
        "isBase64Encoded": false,
        "body"           : JSON.stringify(body)
    }
  }
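setError.js is not reproduced in this article. Below is a minimal sketch, assuming it mirrors setPayload with a non-200 status code and the error message in the body; see the repository for the actual implementation.

module.exports = function setError (err) {
    return {
        "statusCode": 500,
        "headers"   : {
            'Access-Control-Allow-Origin'     : '*',
            'Access-Control-Allow-Credentials': true
          },
        "isBase64Encoded": false,
        "body"           : JSON.stringify({ message: (err && err.message) ? err.message : err })
    };
};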

Using the serverless functions

When the serverless functions are deployed you will get a link for each of them. In our case we received the links shown below (with xxxx replaced with the appropriate information).

GET - https://xxxx.amazonaws.com/demo/app

The first link serves up the web application. The user enters some values and the app calls the score serverless function to get the results.
Alternatively, you can write your own application and make an http POST call to the score function using a link such as:

POST - https://xxxx.amazonaws.com/demo/score

To invoke the web application, you will visit the link

https://xxxx.amazonaws.com/demo/app

with your browser. You should see the display shown in Figure 3:

Figure 3: Application Input Screen

Entering values into the two fields and pressing Submit calls the second serverless function, score, and results in a pie chart as seen in Figure 4:

Figure 4: Score Report Screen

Please see the loan.html file in the GitHub repository for details on the application. Displayed below is the relevant part of the JavaScript in loan.html. The score-function-url placeholder is the URL of the deployed score function. The payload was described earlier in this article. The HTTP call is made using axios.

async function runScore(inputValues) {
 
    let payload = {
        astore: {
            caslib: 'Public',
            name  : 'GRADIENT_BOOSTING___BAD_2'
        },
        input: inputValues
    };
    let config = {
        url   : '{score-function-url}',  /* replace with the URL of the deployed score function */
        method: 'POST',
        data  : payload
    };
    let r = await axios(config);
    return r.data.score;
 
}
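For completeness, here is a hedged sketch of how the Submit button might wire the form to runScore(). The element id and the collectFormValues and drawPieChart helpers are hypothetical stand-ins; see loan.html in the repository for the actual wiring.

document.getElementById('submit').addEventListener('click', async () => {
    let inputValues = collectFormValues();    /* hypothetical helper: read the form fields into an object */
    let score = await runScore(inputValues);  /* call the score serverless function */
    drawPieChart(score);                      /* hypothetical helper: render the pie chart shown in Figure 4 */
});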

Porting to other cloud providers

The cloud-provider-dependent information is handled in the following functions: score.js, parseEvent.js, setPayload.js and setError.js. The rest of the code is host agnostic. When writing your own functions, the recommendation is to follow the same pattern as much as possible. The generic code is then available in its own repository for reuse with other providers and applications.
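As an illustration of that pattern, here is a hedged sketch of how the score handler might look when ported to Azure Functions with the Node.js programming model, reusing the provider-agnostic modules unchanged. The require paths and the exact shapes of context and req are assumptions to verify against Azure's documentation and the restaf package.

const restaf          = require('restaf');            /* assumption: adjust to how restaf is installed in the repository */
const parseEvent      = require('./parseEvent');
const getLogonPayload = require('./getLogonPayload');
const scoreMain       = require('./scoreMain');
 
module.exports = async function (context, req) {
    let store   = restaf.initStore();
    let inParms = parseEvent({ body: req.body });      /* reuse the generic parser */
    let payload = getLogonPayload();
 
    try {
        await store.logon(payload);
        let result  = await scoreMain(store, inParms);
        context.res = { status: 200, body: result };   /* Azure Functions HTTP response */
    } catch (err) {
        context.res = { status: 500, body: { message: err.message } };
    }
};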

Go try it yourself

I have shown you how to deliver your SAS Viya applications as serverless functions. To access more examples please see the GitHub restaf-demos repository.

Supporting Resources

Serverless functions and SAS Viya - a good match was published on SAS Users.

2月 022019
 

SAS Visual Analytics

I don't know about you, but when I read challenges like:

  • Detecting hidden heart failure before it harms an individual
  • Can SAS Viya AI help to digitalize pension management?
  • How to recommend your next adventure based on travel data
  • How to use advanced analytics in building a relevant next best action
  • Can SAS help you find your future home?
  • When does a customer have their travel mood on, and to which destination will they travel?
  • How can SAS Viya, Machine Learning and Face Recognition help find missing people?

…I can continue with the list of ideas provided by the teams participating in the SAS Nordics User Group’s Hackathon. But one thing is for sure: I become enthusiastic, and I'm eager to discover the answers and how analytics can help in solving these questions.

When the Nordics team asked for support in providing SAS Viya infrastructure on the Azure Cloud platform, I didn't hesitate to agree and started planning the environment.

Environment needs

Colleagues from the Nordic countries informed us that their Hackathon currently included fourteen registered teams. Hence, they needed at least fourteen different environments with the latest and greatest SAS Viya tools like SAS Visual Analytics, SAS VDMML and SAS Text Analytics. In addition, participants wanted the chance to use open source technologies with SAS and asked us to install RStudio and Jupyter. This would allow data scientists to develop models in a programming language of choice and provide access to SAS predictive modeling capabilities.

The challenge I faced was how to automate this installation process. We didn't want to repeat an exact installation fourteen times! Also, in case of a failure we needed a way to quickly reinstall a fresh virtual machine in our environment. We wanted to create the virtual machines on the Azure Cloud platform. The goal was to quickly get SAS Viya instances up and running on Azure, with little user interaction. We ended up with a single script expecting one parameter: the name of the instance. Next, I provide an overview of how we accomplished our task.

The setup

As we need to deploy fourteen identical copies of the same SAS Viya software, we decided to make use of the SAS Mirror Manager, which is a utility for synchronizing SAS software repositories. After downloading the mirror repository, we moved the complete file structure to a web server hosted on a separate Nordics Hackathon repository virtual machine within the same private network where the SAS Viya instances run. This guarantees low latency when downloading the software.

Once the repository server is up and running, we have what we needed to create a SAS Viya base image. Within that image, we first need to make sure to meet the requirements described in the SAS Viya Deployment Guide. To complete this task, we turned to the Viya Infrastructure Resource Kit (VIRK). The VIRK is a collection of tools, created by Erwan Granger, that assist in infrastructure and readiness-verification tasks. The script is located in a repository on SAS software’s GitHub page. By running the VIRK script before creation of the base image, we guarantee all virtual machines based on the image meet the necessary requirements.

Next, we create the SAS Viya playbook within the base image, as described in the SAS Viya Deployment Guide. That allows us to kick off a SAS Viya installation later, during the initial launch of a new VM based on that image. We cannot install SAS Viya beforehand because one of the requirements is a static IP address and a static hostname, which are different for each VM we launch. However, we can install RStudio Server on the base image. Another important file we make available on this base image is a script to initiate the Ansible installations of OpenLDAP, SAS Viya and Jupyter.

Deployment

After the common components are in place, we follow the instructions from Azure on how to create a custom image of an Azure VM. This capability is available on other public cloud providers as well. Now all the prerequisites to create working SAS Viya environments for the Hackathon are complete. Finally, we create a launch script that installs a full SAS Viya environment with a single command and one parameter, the hostname, from the Azure CLI.

$ ./launchscript.sh viya01
$ ./launchscript.sh viya02
$ ./launchscript.sh viya03
...
$ ./launchscript.sh viya12
$ ./launchscript.sh viya13
$ ./launchscript.sh viya14

The script

The main parts of this launch script are listed below, followed by a sketch of how they might fit together:

  1. Test whether the Nordics Hackathon Repository VM is running, because we must download software from our own locally created repository.
  2. Launch a new VM, based on the SAS Viya Image we created during preparation, assign a public static IP address, and choose a Standard_E32-16s_v3 Azure VM.
  3. Launch our own Viya-install script to perform the following three sub-steps:
    • Install openLDAP as the identity provider
    • Install SAS Viya just as you would do by following the SAS Viya Deployment Guide.
    • Install Jupyter with a customized Ansible script made by my colleague Alexander Koller.
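A hedged sketch of what such a launch script might look like is shown below. The resource group, image name, repository URL, and the path of the install script baked into the image are all placeholders, and the Azure CLI options should be checked against current az documentation; the real script is not published in this article.

#!/bin/bash
set -euo pipefail
 
HOSTNAME=${1:?usage: ./launchscript.sh <hostname>}
RESOURCE_GROUP="hackathon-rg"               # placeholder
IMAGE_NAME="sas-viya-base-image"            # placeholder: the custom image created earlier
REPO_URL="http://repo-vm/sas-repo"          # placeholder: the Nordics Hackathon repository web server
 
# 1. Check that the local software repository is reachable
curl --silent --fail --head "$REPO_URL" > /dev/null || { echo "Repository VM is not reachable"; exit 1; }
 
# 2. Launch a VM from the custom SAS Viya image with a static public IP address
az vm create \
  --resource-group "$RESOURCE_GROUP" \
  --name "$HOSTNAME" \
  --image "$IMAGE_NAME" \
  --size Standard_E32-16s_v3 \
  --public-ip-address-allocation static
 
# 3. Kick off the install script baked into the image (OpenLDAP, SAS Viya, Jupyter)
az vm run-command invoke \
  --resource-group "$RESOURCE_GROUP" \
  --name "$HOSTNAME" \
  --command-id RunShellScript \
  --scripts "/opt/sas/install/viya-install.sh"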

The result is fourteen full SAS Viya installations ready in about one hour and 45 minutes. We recently posted a LinkedIn video describing the entire process.

Final thoughts

I am planning to write a blog on SAS Communities to share more technical insight on how we created the script. I am honored I was asked to be part of the jury for the Hackathon. I am looking forward to the analytical insights that the different teams will discover and how they will make use of SAS Viya running on the Azure Cloud platform.

Additional resources

Series of Webinars supporting the Nordic Hackathon

Installing SAS Viya Azure virtual machines with a single click was published on SAS Users.

9月 072018
 

PC Magazine defines the broad industry term Software as a Service (SaaS) as, “Software that is rented rather than purchased. Instead of buying applications and paying for periodic upgrades, SaaS is subscription based, and upgrades are automatic during the subscription period.” SaaS, according to the same source, is ideally suited [...]

Does software as a service work for analytics? was published on SAS Voices by David Annis