Frederik Vandenberghe

6月 192019
 

As a data scientist, did you ever come to the point where you felt the need for an evolved analytics platform bringing together the disparate skills of open source and commercial software? A system that can enable advanced analytic capabilities. This is now possible and easy to implement. With many deployment possibilities, SAS Viya allows you to choose the data storage location where compute happens, and the deployment methods for models.

Let’s say you want to expand your model development process with SAS Viya analytical capabilities and you don’t want to wait for getting such environment up and running. Unfortunately, you have no infrastructure, nor the experience to install SAS Viya. Moving the traditional way, you could go for:

  • Protracted hardware procurement and provisioning
  • Deployment planning and coordination with IT
  • Effort and time required for software installation/configuration

This solution may be the right path for many organizations, but I think we all recognize this: the traditional approach could take days, weeks and yes sometimes months.

What if you could get up and running with a full SAS Viya platform in two hours? If you have some affinity for cloud-based solutions, SAS offers you the AWS SAS Viya Cloud Rapid Deployment tool. SAS released this AWS Quick Start as a rapid deployment architecture for SAS Viya on AWS. Deployable products include SAS Visual Data Mining and Machine Learning, SAS Visual Statistics and SAS Visual Analytics.

The goal of this article is to brief you how I launched such an AWS SAS Viya Quickstart. I strongly advise you to watch this related video by my colleague Erwan Granger. Much of what is covered here appears in Erwan's video. The recording predates the SAS Viya 3.4 release, but main concepts are still the same.

What you will need

The following is a list of items you need to complete this task.

  • AWS Account with appropriate creation privileges
  • A valid SAS Viya License; this means you will need a SAS Software Order Confirmation e-mail
  • Optional: you deploy with your own DNS Name and SSL Certificate. In that case you need to register a domain managed by Amazon Route 53. For instructions on registering the domain, see the Route 53 documentation. And you can request and register a certificate with AWS Certificate Manager.

Furthermore, it’s good to know this Quick Start provides two deployment options. You can deploy SAS Viya into a new Virtual Private Cloud (VPC) or into an existing VPC. The first option builds a new AWS environment consisting of the VPC, private and public subnets, NAT gateways, security groups, Ansible controllers, and other infrastructure components, and then deploys SAS Viya into this new VPC. The second option provisions SAS Viya in your existing AWS infrastructure. I decided to go for the first option.

What you will build

Here's an architectural overview of what we will build:

SAS Viya architecture on the AWS Cloud

You can find exactly the same architecture on the SAS Viya AWS Quick Start landing page.

Configure the build

We’ll be following the build process outlined in the Quick Start guide. On the landing page, next to the "What you’ll build tab" you can click on "How to deploy". From there launch the "Deploy into a new VPC" wizard.

Deploy into a new VPC wizard

Prerequisite prep

Make sure you sign in with your AWS account and you have chosen the region where you want to deploy. On that first screen you can leave the Amazon S3 template URL default. That template is the basics for the AWS CloudFormation we are launching. CloudFormation is a tool from AWS that allows you to spin up resources in the right order. The template is the blueprint document for your CloudFormation. By keeping the default template, we will build exactly the architecture displayed above.

Pre-req prep template

Now click "Next" and move to the page where we can specify more details and the required parameters of the CloudFormation parameters.

Cloudfourmation parameters

The first parameter is the SAS Viya Software Order file, which is the Amazon S3 location of the Software Order e-mail attachment.

SAS Viya install package location

In the Administration section, you provide parameters to configure your AWS architecture. That way, you control access, instance type, and if you will use a SAS Viya Mirror repository.

CloudFormation administration parameters

Administration parameter definitions:

  • The name of an Amazon EC2 key pair, so you can access the Ansible controller
  • The Amazon Availability zone for the public and private subnet
  • Allowable IP range for HTTP traffic; must be a valid IP CIDR range
  • Allowable IP Range for SSH traffic to the Ansible controller; must be a valid IP CIDR range
  • SAS Administrator password
  • Password for Default (sasuser) user
  • Amazon EC2 Instance type for CAS Compute VM
  • Amazon EC2 Instance type for SAS Viya Services VM
  • (Optional) Location of SAS Viya Deployment Repository data
  • (Optional) Operator Email

If you want to work with custom DNS names and SSL, you will need to provide the next three parameters as well.

DNS and SSL configuration (optional)

DNS and SSL parameters:

You may accept the defaults on the remaining parameters.

Optional parameters

After clicking "Next" another set of optional parameters are available. I mostly go with accepting the default parameters provided. The lone exception is the Rollback on failure.

Optional administration parameters

Based on what I’ve learned from Erwan's video, the safer choice is "No" on the Rollback option. This way, if the deployment process encounters issues, the log will identify in which step the error occurred. Of course this means you are responsible to manually delete AWS created resources that are not longer necessary. The easiest way to do this is by deleting the CloudFormation Stacks afterward.

Kick off the build

To conclude the deployment wizard, click "Next" once more and acknowledge the necessary AWS resources to create. By clicking "Create stack" the deployment process starts.

Start the build process

You can monitor the deployment log using AWS CloudWatch. In his video, Erwan demonstrates this at around minute 23.

After a successful formation you will find two AWS CloudFormation Stacks created. The Outputs gives you the direct links to SAStudioV and SASDrive.

SAS Studio and SAS Drive stacks

That’s it. You are deployed and ready to begin using your SAS Viya environment!

Additional Reference

Alexander Koller writes about SAS on AWS and takeaways for preparing for the AWS associate solution architect exam.

Your experiences and opinion matter

New forces are shaping the analytics ecosystem. Because of increased competition, rise in customer expectations and new, emerging technology such as AI and Machine Learning, challenging IT departments with evolving their analytic ecosystems to meet the demands of their business partners.

How is your organization doing this? How does your Analytics Cloud strategy compare to the market? And what do your peers think about migrating Analytics to the cloud? We can give you some insights and an industry benchmark on the topic.

Tell us about your experience in this 5 minute survey and we will be happy to share a detailed industry insight report with you, to answer these questions.

Deploy SAS Viya on AWS - Quick Start was published on SAS Users.

4月 122019
 

At the end of my SAS Users blog post explaining how to install SAS Viya on the Azure Cloud for a SAS Hackathon in the Nordics, I promised to provide some technical background. I ended up with only one manual step by launching a shell script from a Linux machine and from there the whole process kicked off. In this post, I explain how we managed to automate this process as much as possible. Read on to discover the details of the script.

Pre-requisite

The script uses the Azure command-line interface (CLI) heavily. The CLI is Microsoft's cross-platform command-line experience for managing Azure resources. Make sure the CLI is installed, otherwise you cannot use the script.

The deployment process

The process contains three different steps:

  1. Test the availability of the SAS Viya installation repository.
  2. Launch a new Azure Virtual Machine. This action uses a previously created custom Azure image.
  3. Perform the actual installation.

Let’s examine the details of each step.

Test the availability of the SAS Viya installation repository

When deploying software in the cloud, Red Hat Enterprise Linux recommends using a mirror repository. Since the SAS Viya package allows for this installation method, we decided to use the mirror for the hackathon images. This is optional, but optimal, say if your deployment does not have access to the Internet or if you must always deploy the same version of software (such as for regulatory reasons or for testing/production purposes).

In our Azure Subscription we created an Azure Resource group with the name ‘Nordics Hackathon.’ Within that resource group, there is an Azure VM running a web server hosting the downloaded SAS Viya repository.

Azure VM running HTTPD Server and hosting a SAS Viya Mirror Repository

Of course, we cannot start the SAS Viya installation before being sure this VM – hosting all rpms to install SAS Viya – is running.
To validate that the VM is running, we issue the start command from the CLI:

az vm start -g [Azure Resource Group] -n [AZ VM name]

Something like:

az vm start -g my_resourcegroup -n my_viyarepo34

If the server is already running, nothing happens. If not, the command starts the VM. We can also check the Azure console:

Azure Console with 'Running' VMs

Launching the VM

The second part of the script launches a new Azure VM. We use the custom Azure image we created earlier. The SAS Viya image creation is explained in the first blog post.

The Azure image used for the Nordics hackathon was the template for all other SAS Viya installations. On this Azure image we completed several valuable tasks:

  • We performed a SAS Viya pre-deployment assessment using the SAS Viya Administration Resource Kit (Viya ARK) utility tool. The Viya ARK - Pre-installation Playbook is a great tool that checks all prerequisites and performs many pre-deployment tasks before deploying SAS Viya software.
  • Installed R-Server and R-Studio
  • Installed Ansible
  • Created a SAS Viya Playbook using the SAS Orchestration CLI.
  • Customized Ansible playbooks created by SAS colleagues used to kickoff OpenLdap & JupyterHub installation.

Every time we launch our script, an exact copy of a new Azure Virtual machine launches, fully customized according to our needs for the Hackathon.
Below is the Azure CLI command used in the script which creates a new Azure VM.

az vm create --resource-group [Azure Resource Group]--name $NAME --image viya_Base \
--admin-username azureuser --admin-password [your_pw] --subnet [subnet_id] \
--nsg [optional existing network security group] --public-ip-address-allocation static \
--size [any Azure size] --tags name=$NAME

After the creation of the VM, we install SAS Viya in the third step of the process.

Installation

After running the script three times (using a different value for $NAME), we end up with the following high-level infrastructure:

SAS Viya on Azure Cloud deployemnt

After the launch of the Azure VM, the viya-install.sh script starts the install script using the original image in the /opt/sas/install/ location.
In the last step of the deployment process, the script installs OpenLdap, SAS Viya and JupyterHub. The following command runs the script:

az vm run-command invoke -g [Azure Resource Group] -n $NAME --command-id RunShellScript --scripts "sudo /opt/sas/install/viya-install.sh &"

The steps in the script should be familiar to those with experience installing SAS Viya and/or Ansible playbooks. Below is the script in its entirety.

#!/bin/bash
touch /start
####################################################################
echo "Starting with the installation of OPENLDAP. Check the openldap.log in the playbook directory for more information" > /var/log/myScriptLog.txt
####################################################################
# install openldap
cd /opt/sas/install/OpenLDAP
ansible-playbook openldapsetup.yml
if [ $? -ne 0 ]; then { echo "Failed the openldap setup, aborting." ; exit 1; } fi
cp ./sitedefault.yml /opt/sas/install/sas_viya_playbook/roles/consul/files/sitedefault.yml
if [ $? -ne 0 ]; then { echo "Failed to copy file, aborting." ; exit 1; } fi
####################################################################
echo "Starting Viya installation" >> /var/log/myScriptLog.txt
####################################################################
# install viya
cd /opt/sas/install/sas_viya_playbook
ansible-playbook site.yml
if [ $? -ne 0 ]; then { echo "Failed to install sas viya, aborting." ; exit 1; } fi
####################################################################
echo "Starting jupyterhub installation" >> /var/log/myScriptLog.txt
####################################################################
# install jupyterhub
cd /opt/sas/install/jupy-azure
ansible-playbook deploy_jupyter.yml
if [ $? -ne 0 ]; then { echo "Failed to install jupyterhub, aborting." ; exit 1; } fi
####################################################################
touch /finish 
####################################################################

Up next

In a future blog, I hope to show you how get up and running with SAS Viya Azure Quick Start. For now, the details I provided in this and the previous blog post is enough to get you started deploying your own SAS Viya environments in the cloud.

Script for a SAS Viya installation on Azure in just one click was published on SAS Users.

2月 022019
 

SAS Visual Analytics

I don't know about you, but when I read challenges like:

  • Detecting hidden heart failure before it harms an individual
  • Can SAS Viya AI help to digitalize pension management?
  • How to recommend your next adventure based on travel data
  • How to use advanced analytics in building a relevant next best action
  • Can SAS help you find your future home?
  • When does a customer have their travel mood on, and to which destination will he travel?
  • How can SAS Viya, Machine Learning and Face Recognition help find missing people?

…I can continue with the list of ideas provided by the teams participating in the SAS Nordics User Group’s Hackathon. But one thing is for sure, I become enthusiastic and I'm eager to discover the answers and how analytics can help in solving these questions.

When the Nordics team asked for support for providing SAS Viya infrastructure on Azure Cloud platform, I didn't hesitate to agree and started planning the environment.

Environment needs

Colleagues from the Nordics countries informed us their Hackathon currently included fourteen registered teams. Hence, they needed at least fourteen different environments with the latest and greatest SAS Viya Tools like SAS Visual Analytics, SAS VDMML and SAS Text Analytics. In addition, participants wanted to get the chance to use open source technologies with SAS and asked us to install R-Studio and Jupyter. This would allow data scientists develop models in a programming language of choice and provide access to SAS predictive modeling capabilities.

The challenge I faced was how to automate this installation process. We didn't want to repeat an exact installation fourteen times! Also, in case of a failure we needed a way to quickly reinstall a fresh virtual machine in our environment. We wanted to create the virtual machines on the Azure Cloud platform. The goal was to quickly get SAS Viya instances up and running on Azure, with little user interaction. We ended up with a single script expecting one parameter: the name of the instance. Next, I provide an overview of how we accomplished our task.

The setup

As we need to deploy fourteen identical copies of the same SAS Viya software, we decided to make use of the SAS Mirror Manager, which is a utility for synchronizing SAS software repositories. After downloading the mirror repository, we moved the complete file structure to a Web Server hosted on a separate Nordics Hackathon repository virtual machine, but within a similar private network where the SAS Viya instances will run. This guarantees low latency when downloading the software.

Once the repository server is up and running, we have what we needed to create a SAS Viya base image. Within that image, we first need to make sure to meet the requirements described in the SAS Viya Deployment Guide. To complete this task, we turned to the Viya Infrastructure Resource Kit (VIRK). The VIRK is a collection of tools, created by Erwan Granger, that assist in infrastructure and readiness-verification tasks. The script is located in a repository on SAS software’s GitHub page. By running the VIRK script before creation of the base image, we guarantee all virtual machines based on the image meet the necessary requirements.

Next, we create within the base image the SAS Viya Playbook as described in the SAS Viya Deployment Guide. That allows us to kick off a SAS Viya installation later. The Viya installation must occur later during the initial launch of a new VM based on that image. We cannot install SAS Viya beforehand because one of the requirements is a static IP address and a static hostname, which is different for each VM we launch. However, we can install R-Studio server on the base image. Another important file we make available on this base image is a script to initiate the Ansible installations of OpenLdap, SAS Viya and Jupyter.

Deployment

After the common components are in place we follow the instructions from Azure on how to create a custom image of an Azure VM. This capability is available on other public cloud providers as well. Now all the prerequisites to create working Viya environments for the Hackathon are complete. Finally, we create a launch script to install a full SAS Viya environment with single command and one parameter, the hostname, from the Azure CLI.

$ ./launchscript.sh viya01
$ ./launchscript.sh viya02
$ ./launchscript.sh viya03
...
$ ./launchscript.sh viya12
$ ./launchscript.sh viya13
$ ./launchscript.sh viya14

The script

The main parts of this launch script are:

  1. Testing if the Nordics Hackathon Repository VM is running because we must download software from our own locally created repository.
  2. Launch a new VM, based on the SAS Viya Image we created during preparation, assign a public static IP address, and choose a Standard_E32-16s_v3 Azure VM.
  3. Launch our own Viya-install script to perform the following three sub-steps:
    • Install openLDAP as the identity provider
    • Install SAS Viya just as you would do by following the SAS Viya Deployment Guide.
    • Install Jupyter with a customized Ansible script made by my colleague Alexander Koller.

The result of this is we have fourteen full SAS Viya installations ready in about one hour and 45 minutes. We recently posted a Linkedin video describing the entire process.

Final thoughts

I am planning to write a blog on SAS Communities to share more technical insight on how we created the script. I am honored I was asked to be part of the jury for the Hackathon. I am looking forward to the analytical insights that the different teams will discover and how they will make use of SAS Viya running on the Azure Cloud platform.

Additional resources

Series of Webinars supporting the Nordic Hackathon

Installing SAS Viya Azure virtual machines with a single click was published on SAS Users.

10月 312018
 

An important step of every analytics project is exploring and preprocessing the data.  This transforms the raw data to make it useful and quality.  It might be necessary, for example, to reduce the size of the data or to eliminate some columns. All these actions accelerate the analytical project that comes right after.  But equally important is how you "productionize" your data science project.  In other words, how you deploy your model so that the business processes can make use of it.

SAS Viya can help with that.  Several SAS Viya applications have been engineered to directly add models to a model repository including SAS® Visual Data Mining and Machine Learning, SAS® Visual Text Analytics, and SAS® Studio. While the recent post on publishing and running models in Hadoop on SAS Viya outlined how to build models, this post will focus on the process to deploy your models with SAS Model Manager to Hadoop.

SAS Visual Data Mining and Machine Learning on SAS Viya contains a pipeline interface to assist data scientists in finding the most accurate model.  In that pipeline interface, you can do several tasks such as import score code, score your data, download score API code or download SAS/BASE scoring code.  Or you may decide – once you have a version ready - to store the model out of the development environment by registering your analytical model in a model repository.

Registered models will show up in SAS Model Manager and are copied to the model repository.   That repository provides long-term storage and includes version control.  It's a powerful tool for managing and governing your analytical models.  A registered version of your model will never get lost, even it's deleted from your development environment.   SAS models are not the only kind of models that SAS Model Manager can handle:  Python, R, Matlab models can also be imported.

SAS Model Manager can read, write, and manage the model repository and provide actions for model editing, comparing, testing, publishing, validating, monitoring, lineage, and history of the models.  It also allows you to easily demonstrate your compliance with regulations and policies. You can organize models into different projects.   Within a project it's feasible to test, deploy and monitor the performance of the registered models.

Deploying your models

Deploying, a key step for any data scientist and model manager, can assist in bringing the models into production processes. Kick off deployment by publishing your models.  SAS Model Manager can publish models to systems being used for batch processing or publish to applications where real-time execution of the models is required.   Let's have a look at how to publish the analytical model to a Hadoop cluster and run the model into the Hadoop cluster.  In doing so, you can score the data where it resides and avoid any data movement.

  1. Create the Hadoop public destination.

The easiest way to do this is via the Visual Interface.  Go to SAS Environment Manager and click on the Publish destinations icon:

Click on the new destination icon:

Important: