Tech

2月 222019
 

Have you heard? SAS recently announced a new practical programming credential, SAS® Certified Specialist: Base Programming Using SAS® 9.4. This new practical programming credential is different than our previous exams. Now it requires you to take a performance-based exam, in which you access a SAS environment and then write and execute SAS code. Practical, right? At the end, your answers are scored for correctness.

This new SAS Certified Specialist credential will run in parallel with the current SAS Certified Base Programmer credential until June 2019. So make sure to check out the complete exam content guide for a list of objectives that are tested on the exam!

A new certification guide


SAS is also releasing an accompanying certification guide to help you prepare for the new programming credential: SAS® Certified Specialist Prep Guide: Base Programming Using SAS® 9.4. This new certification guide has been streamlined to remove redundancy throughout the chapters and has been aligned with the courses, SAS® Programming 1: Essentials and SAS® Programming 2: Data Manipulation Techniques, along with the exam content guide.

The new certification guide also includes a workbook that provides programming scenarios to help you prepare for the performance-based portion of the exam. Make sure to also review the SAS® 9.4 Base Programming Exam Experience Tutorial to help you prepare!

Here are some of the changes made to the certification guide:

    • Removal of INFILE/INPUT statements. These statements are no longer taught in SAS® Programming 1 and SAS® Programming 2 courses or tested in the exam. Thus, these statements have been replaced with the SET statement and the IMPORT procedure.
    • Removal of arrays. Arrays are no longer taught in SAS® Programming 1 and SAS® Programming 2 courses or tested in the exam.
    • Addition of the TRANSPOSE and EXPORT procedures, as well as macro variables. These additions are tested and are taught in the SAS® Programming 1 and SAS® Programming 2 courses.
    • Updating of existing examples and the addition of more annotated examples that are easier to follow and have better quality graphics. These changes help to improve the readability of examples, and there is a closer relationship between the sample questions in the book and the questions that appear on the actual exam.
    • Installation of sample data, and the way you work with SAS libraries has been simplified.
    • A new companion piece, the Quick Syntax Reference Guide, is also available for download.

More opportunities to test your skills

For even more programming practice check out Ron Cody’s Learning SAS by Example. This is full of practical examples, and exercises with solutions which allow you to test your programming skills for the exam.

Not ready for the certification exam just yet? You can also browse our books for getting started with SAS. There you will find, of course, The Little SAS Book: A Primer 5th Edition and Exercises and Projects for The Little SAS® Book, Fifth Edition. Both are great guides for new users wanting to learn SAS and practice before getting to the New Base Programming Specialist Exam!

Thinking about getting SAS® certified? Check out the new SAS certification guide! was published on SAS Users.

2月 152019
 
Beginning with SAS® 9.4, you can embed graphics output within HTML output using the ODS HTML5 destination. This technique works with SAS/GRAPH® procedures (such as GPLOT and GCHART), SG procedures (such as SGPLOT and SGRENDER), and when you create graphics output with ODS Graphics enabled. Most (if not all) existing web browsers support graphics output embedded in HTML5 output.

Note: The default graphics output format for the ODS HTML5 destination is Scalable Vector Graphics (SVG). SVG documents display clearly at any size in any viewer or browser that supports SVG. So, SVG files are ideal for display on a computer monitor, PDA, or cell phone; or printed documents. Because it's a vector graphic, a single SVG document can be transformed to any screen resolution without compromising the clarity of the document. Here's an example:

The same SVG graph, scaled at 90% and then at 200%. But 100% crisp!

SAS/GRAPH procedures

When you use the ODS HTML5 destination with a SAS/GRAPH procedure, specify a value of SVG, PNG, or JPEG for the DEVICE option in the GOPTIONS statement. The following sample PROC GPLOT code embeds SVG graphics inside the resulting HTML output:

goptions device=svg;
ods _all_ close;  
ods html5 path="c:\temp" file="svg_graph.html"; 
symbol1 i=none v=squarefilled; 
proc gplot data=sashelp.cars; 
  plot mpg_city * horsepower;   
  where make="Honda"; 
run;
quit;  
ods html5 close; 
ods preferences;

Note that the ODS PREFERENCES statement above resets the ODS environment back to its default settings when you use the SAS windowing environment.

When you use the PNG or JPEG device driver with the ODS HTML5 destination, add the BITMAP_MODE="INLINE" option to the ODS HTML5 statement. Here is an example:

goptions device=png;
ods _all_ close; 
ods html5 path="c:\temp" file="png_graph.html"     options(bitmap_mode="inline");
symbol1 i=none v=squarefilled; 
proc gplot data=sashelp.cars; 
  plot mpg_city * horsepower;   
  where make="Honda"; 
run;
quit;  
ods html5 close; 
ods preferences;

ODS Graphics and SG procedures

When you use SG procedures and ODS Graphics, specify a value of SVG, PNG, or JPEG for the OUTPUTFMT option in the ODS GRAPHICS statement. The following sample code uses PROC SGPLOT to embed SVG graphics inside the HTML output with the ODS HTML5 destination:

ods _all_ close; 
ods html5 path="c:\temp" file="svg_graph.html"; 
ods graphics on / reset=all outputfmt=svg;
proc sgplot data=sashelp.cars; 
  scatter y=mpg_city x=horsepower / markerattrs=(size=9PT symbol=squarefilled);   
  where make="Honda"; 
run;
ods html5 close; 
ods preferences;  

The following sample code uses PROC SGPLOT to embed PNG graphics inside the HTML output with the ODS HTML5 destination:

ods _all_ close; 
ods html5 path="c:\temp" file="png_graph.html" options(bitmap_mode="inline");   
      ods graphics on / reset=all outputfmt=png;
proc sgplot data=sashelp.cars; 
  scatter y=mpg_city x=horsepower / markerattrs=(size=9PT symbol=squarefilled);   
  where make="Honda"; 
run;
      ods html5 close; 
      ods preferences; 

The technique above also works when you use the ODS GRAPHICS ON statement with other procedures that produce graphics output (such as the LIFETEST procedure).

Note that the ODS HTML5 destination supports the SAS Graphics Accelerator. The SAS Graphics Accelerator enables users with visual impairments or blindness to create, explore, and share data visualizations. It supports alternative presentations of data visualizations that include enhanced visual rendering, text descriptions, tabular data, and interactive sonification. Sonification uses non-speech audio to convey important information about the graph.

You can use the ODS HTML5 destination in most situations where you need to embed all of your output into a single HTML output location. For example, when you email HTML output as an attachment or when you create graphics output via a SAS stored process. If you currently use the ODS HTML destination, you might want to experiment with the ODS HTML5 destination to see whether it meets your needs even if you cannot completely switch to it yet.

Embed scalable graphics using the ODS HTML5 destination was published on SAS Users.

2月 122019
 

Multi-tenancy is one of the exciting new capabilities of SAS Viya. Because it is so new, there is quite a lot of misinformation going around about it. I would like to offer you five key things to know about multi-tenancy before implementing a project using this new paradigm.

All tenants share one SAS Viya deployment

Just as apartment units exist within a larger, common building, all tenants, including the provider, exist within one, single SAS Viya deployment. Tenants share some SAS Viya resources such as the physical machines, most microservices, and possibly the SAS Infrastructure Data Server. Other SAS Viya resources are duplicated per tenant such as the CAS server and compute launcher. Regardless, the key point here is that because there is one SAS Viya deployment, there is one, and only one, SAS license that applies to all tenants. Adding a new tenant to a multi-tenant deployment could have licensing ramifications depending upon how the CAS server resources are allocated.

Decision to use multi-tenancy must be made at deployment time

Many people, myself included, are not very comfortable with commitment. Making a decision that cannot be changed is something we avoid. Deciding whether your SAS Viya deployment supports multi-tenancy cannot be put off for later.

This decision must be made at the time the software is deployed. There is currently no way to convert a multi-tenant deployment to a single-tenant deployment or vice versa short of redeployment, so choose wisely. As with marriage, the decision to go single-tenant or multi-tenant should not be taken lightly and there are benefits to each configuration that should be considered.

Each tenant is accessed by separate login

Let’s return to our apartment analogy. Just as each apartment owner has a separate key that opens only the apartment unit they lease, SAS Viya requires users to log on (authenticate) to a specific tenant space before allowing them access.

SAS Viya facilitates this by accessing each tenant by way of a separate sub-domain address. As shown in the diagram below, a user wishing to use the Acme tenant must access the deployment with a URL of acme.viya.sas.com while a GELCorp user would use a URL of gelcorp.viya.sas.com.

This helps create total separation of tenant access and allows administrators to define and restrict user access for each tenant. It does, however, mean that each tenant space is authenticated individually and there is no notion of single sign-on between tenants.

No content is visible between tenants

You will notice in both images above that there are brick walls between each of the tenants. This is to illustrate how tenants are completely separated from one another. One tenant cannot see any other tenant’s content, data, users, groups or even that other tenants exist in the system.

One common scenario for multi-tenancy is to keep business units within a single corporation separated. For example, we could set up Sales as a tenant, Finance as a tenant, and Human Resources as a tenant. This works very well if we want to truly segregate the departments' work. But what happens when Sales wants to share a report with Finance or Finance wants to publish a report for the entire company to view?

There are two options for this situation:
• We could export content from one tenant and import it into the other tenant(s). For example, we would export a report from the Sales tenant and import it into the Finance tenant, assuming that data the report needs is available to both. But now we have the report (and data) in two places and if Sales updates the report we must repeat the export/import process.
• We could set up a separate tenant at the company level for shared content. Because identities are not shared between tenants, this would require users to log off the departmental tenant and log on to the corporate tenant to see shared reports.

There are pros and cons to using multi-tenancy for departmental separation and the user experience must be considered.

Higher administrative burden

Managing and maintaining a multi-tenancy deployment is more complex than taking care of a single-tenant deployment. Multi-tenancy requires additional CAS servers, additional micro-services, possibly additional machines, and multiple administrative personas. The additional resources can complicate backup strategies, authorization models, operating system security, and resource management of shared resources.

There are also more levels of administration which requires an administrator persona for the provider of the environment and separate administrator personas for each tenant. Each of these administration personas have varying scope into which aspects of the entire deployment they can interact with. For example, the provider administrator can see all system resources, all system activity, logs and tenants, but cannot see any tenant content.

Tenant administrators can only see and interact with dedicated tenant resources such as their CAS server and can also manage all tenant content. They cannot, however, see system resources, other tenants, or logs.

Therefore, coordinating management of a complete multi-tenant deployment will require multiple administration personas, careful design of operating system group membership to protect and maintain sufficient access to files and processes, and possibly multiple logins to accomplish administrative tasks.

Now what?

I have pointed out a handful of key concepts that differ between the usual single-tenant deployments and what you can expect with a multi-tenant deployment of SAS Viya. I am obviously just scratching the surface on these topics. Here are a couple of other resources to check out if you want to dig in further.

Documentation: Multi-tenancy: Concepts
Article: Get ready! SAS Viya 3.4 highlights for the Technical Architect

5 things to know about multi-tenancy was published on SAS Users.

2月 122019
 

As one of SAS' newest systems engineers, recently joining the Americas Artificial Intelligence Team, I’m incredibly excited to gain expertise in artificial intelligence and machine learning. I also look forward applying my knowledge to enable others to leverage the advanced technologies that SAS offers.

However, as a recent graduate with no prior experience coding in SAS, I expected a steep learning curve and a slow introduction into the world of AI. When I first arrived on campus right after ringing in 2019, I had no idea how fast I would employ SAS technology to create a tangible AI application.

During my second week on the job, I learned about the development and deployment of effective computer vision models. Not only did I create a demo to distinguish between a valid company logo and a counterfeit version, but I was also amazed by the ease and speed at which it could be done.

Read on for a look at the model and how it works.

Model Development

For this model, I wanted to showcase the potential of using computer vision to protect corporate identity — in this case, a company’s logo. I used the familiar SAS logo as an example of a valid company logo, and for demonstration purposes, I employed the logo used by the Scandinavian Airlines System to represent what a counterfeit could look like. Although this airline is a legitimate business that isn’t knocking off SAS (they are actually one of our customers), the two logos showcase the technology’s ability to distinguish between similar branding. Thus, the model could easily be adapted to detect actual counterfeits.

I first assembled a collection of sample images of both companies’ logos to serve as training data. For each image, I drew a bounding box around the logo to label it as either a “SASlogo” or a “SAS_counterfeit.”

Use SAS to spot a counterfeit logo

This data was then used to train the model to identify the two logos through machine learning. The model was written on the SAS Deep Learning Python Interface, DLPy, using a YOLO (You Only Look Once) algorithm.

Creating a computer vision model

Creating a computer vision model with SAS

To test the model’s effectiveness in object detection, additional validation images were supplied to verify its ability to identify both valid and counterfeit versions of the logo. In the following image of SAS headquarters in Cary, the model correctly identified the displayed logo as the valid version with a confidence level of 0.61.

Model Deployment

One of the key advantages of using the DLPy interface is the ability to easily deploy the model to various SAS engines. I simply created an ASTORE file as the model output, which can be deployed with SAS Event Stream Processing (ESP), Cloud Analytics Service (CAS), or Micro Analytic Service (MAS).

However, the model can also be deployed when SAS technology is not available by creating an ONNX file as the output. This type of file can be used to integrate the model into an iOS application, for example.

Astore and ONNX of YOLO model

As I conclude my first weeks at SAS, I am thrilled by the opportunity to continue to build expertise in SAS technologies and programming. For the next several months, I will be attending the Customer Advisory Academy in Cary, and I look forward to applying the knowledge and skills I gain to the creation of other AI applications in the future!

Want to learn more about object detection in computer vision? Read this blog post!

How to spot counterfeit company logos with AI – no SAS programming experience needed was published on SAS Users.

2月 082019
 

Creating a map with SAS Visual Analytics begins with the geographic variable.  The geographic variable is a special type of data variable where each item has a latitude and longitude value.  For maximum flexibility, VA supports three types of geography variables:

  1. Predefined
  2. Custom coordinates
  3. Custom polygons

This is the first in a series of posts that will discuss each type of geography variable and their creation. The predefined geography variable is the easiest and quickest way to begin and will be the focus of this post.

SAS Visual Analytics comes with nine (9) predefined geographic lookup types.  This lookup method requires that your data contains a variable matching one of these nine data types:

  • Country or Region Names – Full proper name of a country or region (ISO 3166-1)
  • Country or Region ISO 2-Letter Codes – Alpha-2 country code (ISO 3166-1)
  • Country or Region ISO Numeric Codes – Numeric-3 country code (ISO 3166-1)
  • Country or Region SAS Map ID Values – SAS ID values from MPASGFK continent data sets
  • Subdivision (State, Province) Names – Full proper name for level 2 admin regions (ISO 3166-2)
  • Subdivision (State, Province) SAS Map ID Values – SAS ID values from MAPSGFK continent data sets (Level 1)
  • US State Names – Full proper name for US State
  • US State Abbreviations – Two letter US State abbreviation
  • US Zip Codes – A 5-digit US zip code (no regions)

Once you have identified a variable in your dataset matching one of these types, you are ready to begin.  For our example map, the dataset 'Crime' and variable 'State name' will be used.  Let’s get started.

Creating a predefined geography variable in SAS Visual Analytics

  1. Begin by opening VA and navigate to the Data panel on the left of the application.
  2. Select the desired dataset and locate a variable that matches one of the predefined lookup types discussed above. Click the down arrow to the right of the variable and select ‘Geography’ from the Classification dropdown menu.
  3. The ‘Edit Geography Item’ window will open. Depending upon the type of geography variable selected, some of the options on this dialog will vary.  The 'Name' textbox is common for all types and will contain the variable selected from your dataset.  Edit this label as needed to make it more user friendly for your intended audience.
  4. The ‘Geography data type’ drop down list is where you select the desired type of geography variable.  In this example, we are using the default predefined option.
  5. Locate the 'Name or code context' dropdown list.  Select the type of predefined variable that matches the data type of the variable chosen from your data.  Once selected, VA scans your data and does an internal lookup on each data item.  This process identifies latitude and longitude values for each item of your dataset.  Lookup results are shown on the right of the window as a percentage and a thumbnail size map.  The thumbnail map displays the the first 100 matches.
  6. If there are any unmatched data items, the first 5 will be displayed.  This may provide a better understanding of your data.  In this example, it is clear from variable name as to what type should be selected (US State Names).  However, in most cases that choice will not be this obvious.  The lesson here, know your data!

Unmatched data items indicators

Once you are satisfied with the matched results, click the OK button to continue.  You should see a new section in the Data panel labeled ‘Geography’.  The name of the variable will be displayed beside a globe icon. This icon represents the geography variable and provides confirmation it was created successfully.

Icon change for geography variable

Now that the geography variable has been created, we are ready to create a map.  To do this, simply drag it from the Data panel and drop it on the VA report canvas.  The auto-map feature of VA will recognize the geography variable and create a bubble map with an OpenStreetMap background.  Congratulations!  You have just created your first map in VA.

Bubble map created with predefined geography variable

The concept of a geography variable was introduced in this post as the foundation for creating all maps in VA.  Using the predefined geography variable is the quickest way to get started with Geo maps.  In situations when the predefined type is not possible, using one of VA's custom geography types becomes necessary.  These scenarios will be discussed in future blog posts.

Fundamentals of SAS Visual Analytics geo maps was published on SAS Users.

2月 062019
 

Splitting external text data files into multiple files

Recently, I worked on a cybersecurity project that entailed processing a staggering number of raw text files about web traffic. Millions of rows had to be read and parsed to extract variable values.

The problem was complicated by the varying records composition. Each external raw file was a collection of records of different structures that required different parsing programming logic. Besides, those heterogeneous records could not possibly belong to the same rectangular data tables with fixed sets of columns.

Solving the problem

To solve the problem, I decided to employ a "divide and conquer" strategy: to split the external file into many files, each with a homogeneous structure, then parse them separately to create as many output SAS data sets.

My plan was to use a SAS DATA Step for looping through the rows (records) of the external file, read each row, identify its type, and based on that, write it to a corresponding output file.

Like how we would split a data set into many:

 
data CARS_ASIA CARS_EUROPE CARS_USA;
   set SASHELP.CARS;
   select(origin);
      when('Asia')   output CARS_ASIA;
      when('Europe') output CARS_EUROPE;
      when('USA')    output CARS_USA;
   end;   
run;

But how do you switch between the output files? The idea came from SAS' Chris Hemedinger, who suggested using multiple FILE statements to redirect output to different external files.

Splitting an external raw file into many

As you know, one can use PUT statement in a SAS DATA Step to output a character string or a combination of character strings and variable values into an external file. That external file (a destination) is defined by a

 
filename inf  'c:\temp\input_file.txt';
filename out1 'c:\temp\traffic.txt';
filename out2 'c:\temp\system.txt';
filename out3 'c:\temp\threat.txt';
filename out4 'c:\temp\other.txt';
 
data _null_;
   infile inf;
   input REC_TYPE $10. @;
   input;
   select(REC_TYPE);
      when('TRAFFIC') file out1;
      when('SYSTEM')  file out2;
      when('THREAT')  file out3;
      otherwise       file out4;
   end;
   put _infile_;
run;

In this code, the first INPUT statement retrieves the value of REC_TYPE. The trailing @ line-hold specifier ensures that an input record is held for the execution of the next INPUT statement within the same iteration of the DATA Step. It may not be used exactly as written, but the point is you need to capture the filed(s) of interest and stay on the same row.

The second INPUT statement reads the whole raw file record into the _infile_ DATA Step automatic variable.

Depending on the value of the REC_TYPE variable assigned in the first INPUT statement, SELECT block toggles the FILE definition between one of the four filerefs, out1, out2, out3, or out4.

Then the PUT statement outputs the _infile_ automatic variable value to the output file defined in the SELECT block.

Splitting a data set into several external files

Similar technique can be used to split a data table into several external raw files. Let’s combine the above two code samples to demonstrate how you can split a data set into several external raw files:

 
filename outasi 'c:\temp\cars_asia.txt';
filename outeur 'c:\temp\cars_europe.txt';
filename outusa 'c:\temp\cars_usa.txt';
 
data _null_;
   set SASHELP.CARS;
   select(origin);
      when('Asia')   file outasi;
      when('Europe') file outeur;
      when('USA')    file outusa;
   end;
   put _all_; 
run;

This code will read observations of the SASHELP.CARS data table, and depending on the value of ORIGIN variable, put _all_ will output all the variables (including automatic variables _ERROR_ and _N_) as named values (VARIABLE_NAME=VARIABLE_VALUE pairs) to one of the three external raw files specified by their respective file references (outasi, outeur, or outusa.)

You can modify this code to produce delimited files with full control over which variables and in what order to output. For example, the following code sample produces 3 files with comma-separated values:

 
data _null_;
   set SASHELP.CARS;
   select(origin);
      when('Asia')   file outasi dlm=',';
      when('Europe') file outeur dlm=',';
      when('USA')    file outusa dlm=',';
   end;
   put make model type origin msrp invoice; 
run;

You may use different delimiters for your output files. In addition, rather than using mutually exclusive SELECT, you may use different logic for re-directing your output to different external files.

Bonus: How to zip your output files as you create them

For those readers who are patient enough to read to this point, here is another tip. As described in this series of blog posts by Chris Hemedinger, in SAS you can read your external raw files directly from zipped files without unzipping them first, as well as write your output raw files directly into zipped files. You just need to specify that in your filename statement. For example:

UNIX/Linux

 
filename outusa ZIP '/sas/data/temp/cars_usa.txt.gz' GZIP;

Windows

 
filename outusa ZIP 'c:\temp\cars.zip' member='cars_usa.txt';

Your turn

What is your experience with creating multiple external raw files? Could you please share with the rest of us?

How to split a raw file or a data set into many external raw files was published on SAS Users.

2月 022019
 

SAS Visual Analytics

I don't know about you, but when I read challenges like:

  • Detecting hidden heart failure before it harms an individual
  • Can SAS Viya AI help to digitalize pension management?
  • How to recommend your next adventure based on travel data
  • How to use advanced analytics in building a relevant next best action
  • Can SAS help you find your future home?
  • When does a customer have their travel mood on, and to which destination will he travel?
  • How can SAS Viya, Machine Learning and Face Recognition help find missing people?

…I can continue with the list of ideas provided by the teams participating in the SAS Nordics User Group’s Hackathon. But one thing is for sure, I become enthusiastic and I'm eager to discover the answers and how analytics can help in solving these questions.

When the Nordics team asked for support for providing SAS Viya infrastructure on Azure Cloud platform, I didn't hesitate to agree and started planning the environment.

Environment needs

Colleagues from the Nordics countries informed us their Hackathon currently included fourteen registered teams. Hence, they needed at least fourteen different environments with the latest and greatest SAS Viya Tools like SAS Visual Analytics, SAS VDMML and SAS Text Analytics. In addition, participants wanted to get the chance to use open source technologies with SAS and asked us to install R-Studio and Jupyter. This would allow data scientists develop models in a programming language of choice and provide access to SAS predictive modeling capabilities.

The challenge I faced was how to automate this installation process. We didn't want to repeat an exact installation fourteen times! Also, in case of a failure we needed a way to quickly reinstall a fresh virtual machine in our environment. We wanted to create the virtual machines on the Azure Cloud platform. The goal was to quickly get SAS Viya instances up and running on Azure, with little user interaction. We ended up with a single script expecting one parameter: the name of the instance. Next, I provide an overview of how we accomplished our task.

The setup

As we need to deploy fourteen identical copies of the same SAS Viya software, we decided to make use of the SAS Mirror Manager, which is a utility for synchronizing SAS software repositories. After downloading the mirror repository, we moved the complete file structure to a Web Server hosted on a separate Nordics Hackathon repository virtual machine, but within a similar private network where the SAS Viya instances will run. This guarantees low latency when downloading the software.

Once the repository server is up and running, we have what we needed to create a SAS Viya base image. Within that image, we first need to make sure to meet the requirements described in the SAS Viya Deployment Guide. To complete this task, we turned to the Viya Infrastructure Resource Kit (VIRK). The VIRK is a collection of tools, created by Erwan Granger, that assist in infrastructure and readiness-verification tasks. The script is located in a repository on SAS software’s GitHub page. By running the VIRK script before creation of the base image, we guarantee all virtual machines based on the image meet the necessary requirements.

Next, we create within the base image the SAS Viya Playbook as described in the SAS Viya Deployment Guide. That allows us to kick off a SAS Viya installation later. The Viya installation must occur later during the initial launch of a new VM based on that image. We cannot install SAS Viya beforehand because one of the requirements is a static IP address and a static hostname, which is different for each VM we launch. However, we can install R-Studio server on the base image. Another important file we make available on this base image is a script to initiate the Ansible installations of OpenLdap, SAS Viya and Jupyter.

Deployment

After the common components are in place we follow the instructions from Azure on how to create a custom image of an Azure VM. This capability is available on other public cloud providers as well. Now all the prerequisites to create working Viya environments for the Hackathon are complete. Finally, we create a launch script to install a full SAS Viya environment with single command and one parameter, the hostname, from the Azure CLI.

$ ./launchscript.sh viya01
$ ./launchscript.sh viya02
$ ./launchscript.sh viya03
...
$ ./launchscript.sh viya12
$ ./launchscript.sh viya13
$ ./launchscript.sh viya14

The script

The main parts of this launch script are:

  1. Testing if the Nordics Hackathon Repository VM is running because we must download software from our own locally created repository.
  2. Launch a new VM, based on the SAS Viya Image we created during preparation, assign a public static IP address, and choose a Standard_E32-16s_v3 Azure VM.
  3. Launch our own Viya-install script to perform the following three sub-steps:
    • Install openLDAP as the identity provider
    • Install SAS Viya just as you would do by following the SAS Viya Deployment Guide.
    • Install Jupyter with a customized Ansible script made by my colleague Alexander Koller.

The result of this is we have fourteen full SAS Viya installations ready in about one hour and 45 minutes. We recently posted a Linkedin video describing the entire process.

Final thoughts

I am planning to write a blog on SAS Communities to share more technical insight on how we created the script. I am honored I was asked to be part of the jury for the Hackathon. I am looking forward to the analytical insights that the different teams will discover and how they will make use of SAS Viya running on the Azure Cloud platform.

Additional resources

Series of Webinars supporting the Nordic Hackathon

Installing SAS Viya Azure virtual machines with a single click was published on SAS Users.

1月 252019
 

Need to authenticate on REST API calls

In my blog series regarding SAS REST APIs (article 1, article 2, article 3) I outlined how to integrate SAS analytical capabilities into applications. I detailed how to construct REST calls, build body parameters and interpret the responses. I've not yet covered authentication for the operations, purposefully putting the cart before the horse. If you're not authenticated, you can't do much, so this post will help to get the horse and cart in the right order.

While researching authentication I ran into multiple, informative articles and papers on SAS and OAuth. A non-exhaustive list includes Stuart Rogers' article on SAS Viya authentication options, one of which is OAuth. Also, I found several resources on connecting to external applications from SAS with explanations of OAuth. For example, Joseph Henry provides an overview of OAuth and using it with PROC HTTP and Chris Hemedinger explains securing REST API credentials in SAS programs in this article. Finally, the SAS Viya REST API documentation covers details on application registration and access token generation.

Consider this post a quick guide to summarize these resources and shed light on authenticating via authorization code and passwords.

What OAuth grant type should I use?

Choosing the grant method to get an access token with OAuth depends entirely on your application. You can get more information on which grant type to choose here. This post covers two grant methods: authorization code and password. Authorization code grants are generally used with web applications and considered the safest choice. Password grants are most often used by mobile apps and applied in more trusted environments.

The process, briefly

Getting an external application connected to the SAS Viya platform requires the following steps:

  1. Use the SAS Viya configuration server's Consul token to obtain an ID Token to register a new Client ID
  2. Use the ID Token to register the new client ID and secret
  3. Obtain the authorization code
  4. Acquire the access OAuth token of the Client ID using the authorization code
  5. Call the SAS Viya API using the access token for the authentication.

Registering the client (steps 1 and 2) is a one-time process. You will need a new authorization code (step 3) if the access token is revoked. The access and refresh tokens (step 4) are created once and only need to be refreshed if/when the token expires. Once you have the access token, you can call any API (step 5) if your access token is valid.

Get an access token using an authorization code

Step 1: Get the SAS Viya Consul token to register a new client

The first step to register the client is to get the consul token from the SAS server. As a SAS administrator (sudo user), access the consul token using the following command:

$ export CONSUL_TOKEN=`cat /opt/sas/viya/config/etc/SASSecurityCertificateFramework/tokens/consul/default/client.token`
64e01b03-7dab-41be-a104-2231f99d7dd8

The Consul token returns and is used to obtain an access token used to register the new application. Use the following cURL command to obtain the token:

$ curl -k -X POST "https://sasserver.demo.sas.com/SASLogon/oauth/clients/consul?callback=false&serviceId=app" \
     -H "X-Consul-Token: 64e01b03-7dab-41be-a104-2231f99d7dd8"
 {"access_token":"eyJhbGciOiJSUzI1NiIsIm...","token_type":"bearer","expires_in":35999,"scope":"uaa.admin","jti":"de81c7f3cca645ac807f18dc0d186331"}

The returned token can be lengthy. To assist in later use, create an environment variable from the returned token:

$ export IDTOKEN="eyJhbGciOiJSUzI1NiIsIm..."

Step 2: Register the new client

Change the client_id, client_secret, and scopes in the code below. Scopes should always include "openid" along with any other groups this client needs to get in the access tokens. You can specify "*" but then the user gets prompted for all their groups, which is tedious. The example below just uses one group named "group1".

$ curl -k -X POST "https://sasserver.demo.sas.com/SASLogon/oauth/clients" \
       -H "Content-Type: application/json" \
       -H "Authorization: Bearer $IDTOKEN" \
       -d '{
        "client_id": "myclientid", 
        "client_secret": "myclientsecret",
        "scope": ["openid", "group1"],
        "authorized_grant_types": ["authorization_code","refresh_token"],
        "redirect_uri": "urn:ietf:wg:oauth:2.0:oob"
       }'
{"scope":["openid","group1"],"client_id":"app","resource_ids":["none"],"authorized_grant_types":["refresh_token","authorization_code"],"redirect_uri":["urn:ietf:wg:oauth:2.0:oob"],"autoapprove":[],"authorities":["uaa.none"],"lastModified":1547138692523,"required_user_groups":[]}

Step 3: Approve access to get authentication code

Place the following URL in a browser. Change the hostname and myclientid in the URL as needed.

https://sasserver.demo.sas.com/SASLogon/oauth/authorize?client_id=myclientid&response_type=code

The browser redirects to the SAS login screen. Log in with your SAS user credentials.

SAS Login Screen

On the Authorize Access screen, select the openid checkbox (and any other required groups) and click the Authorize Access button.

Authorize Access form

After submitting the form, you'll see an authorization code. For example, "lB1sxkaCfg". You will use this code in the next step.

Authorization Code displays

Step 4: Get an access token using the authorization code

Now we have the authorization code and we'll use it in the following cURL command to get the access token to SAS.

$ curl -k https://sasserver.demo.sas.com/SASLogon/oauth/token -H "Accept: application/json" -H "Content-Type: application/x-www-form-urlencoded" \
     -u "app:appclientsecret" -d "grant_type=authorization_code&code=YZuKQUg10Z"
{"access_token":"eyJhbGciOiJSUzI1NiIsImtpZ...","token_type":"bearer","refresh_token":"eyJhbGciOiJSUzI1NiIsImtpZC...","expires_in":35999,"scope":"openid","jti":"b35f26197fa849b6a1856eea1c722933"}

We use the returned token to authenticate and authorize the calls made between the client and SAS. We also get a refresh token we use to issue a new token when the current one expires. This way we can avoid repeating all the previous steps. I explain the refresh process further down.

We will again create environment variables for the tokens.

$ export ACCESS_TOKEN="eyJhbGciOiJSUzI1NiIsImtpZCI6ImxlZ..."
$ export REFRESH_TOKEN="eyJhbGciOiJSUzI1NiIsImtpZC..."

Step 5: Use the access token to call SAS Viya APIs

The prep work is complete. We can now send requests to SAS Viya and get some work done. Below is an example REST call that returns user preferences.

$ curl -k https://sasserver.demo.sas.com/preferences/ -H "Authorization: Bearer $ACCESS_TOKEN"
{"version":1,"links":[{"method":"GET","rel":"preferences","href":"/preferences/preferences/stpweb1","uri":"/preferences/preferences/stpweb1","type":"application/vnd.sas.collection","itemType":"application/vnd.sas.preference"},{"method":"PUT","rel":"createPreferences","href":"/preferences/preferences/stpweb1","uri":"/preferences/preferences/stpweb1","type":"application/vnd.sas.preference","responseType":"application/vnd.sas.collection","responseItemType":"application/vnd.sas.preference"},{"method":"POST","rel":"newPreferences","href":"/preferences/preferences/stpweb1","uri":"/preferences/preferences/stpweb1","type":"application/vnd.sas.collection","responseType":"application/vnd.sas.collection","itemType":"application/vnd.sas.preference","responseItemType":"application/vnd.sas.preference"},{"method":"DELETE","rel":"deletePreferences","href":"/preferences/preferences/stpweb1","uri":"/preferences/preferences/stpweb1","type":"application/vnd.sas.collection","itemType":"application/vnd.sas.preference"},{"method":"PUT","rel":"createPreference","href":"/preferences/preferences/stpweb1/{preferenceId}","uri":"/preferences/preferences/stpweb1/{preferenceId}","type":"application/vnd.sas.preference"}]}

Use the refresh token to get a new access token

To use the refresh token to get a new access token, simply send a cURL command like the following:

$ curl -k https://sasserver.demo.sas.com/SASLogon/oauth/token -H "Accept: application/json" \
     -H "Content-Type: application/x-www-form-urlencoded" -u "app:appclientsecret" -d "grant_type=refresh_token&refresh_token=$REFRESH_TOKEN"
{"access_token":"eyJhbGciOiJSUzI1NiIsImtpZCI6ImxlZ...","token_type":"bearer","refresh_token":"eyJhbGciOiJSUzI1NiIsImtpZCSjYxrrNRCF7h0oLhd0Y","expires_in":35999,"scope":"openid","jti":"a5c4456b5beb4493918c389cd5186f02"}

Note the access token is new, and the refresh token remains static. Use the new token for future REST calls. Make sure to replace the ACCESS_TOKEN variable with the new token. Also, the access token has a default life of ten hours before it expires. Most applications deal with expiring and refreshing tokens programmatically. If you wish to change the default expiry of an access token in SAS, make a configuration change in the JWT properties in SAS.

Get an access token using a password

The steps to obtain an access token with a password are the same as with the authorization code. I highlight the differences below, without repeating all the steps.
The process for accessing the ID Token and using it to get an access token for registering the client is the same as described earlier. The first difference when using password authentication is when registering the client. In the code below, notice the key authorized_grant_types has a value of password, not authorization code.

$ curl -k -X POST https://sasserver.demo.sas.com/SASLogon/oauth/clients -H "Content-Type: application/json" \
       -H "Authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZ..." \
       -d '{
        "client_id": "myclientid", 
        "client_secret": "myclientsecret",
        "scope": ["openid", "group1"],
        "authorized_grant_types": ["password","refresh_token"],
        "redirect_uri": "urn:ietf:wg:oauth:2.0:oob"
        }'
{"scope":["openid","group1"],"client_id":"myclientid","resource_ids":["none"],"authorized_grant_types":["refresh_token","authorization_code"],"redirect_uri":["urn:ietf:wg:oauth:2.0:oob"],"autoapprove":[],"authorities":["uaa.none"],"lastModified":1547801596527,"required_user_groups":[]}

The client is now registered on the SAS Viya server. To get the access token, we send a command like we did when using the authorization code, just using the username and password.

curl -k https://sasserver.demo.sas.com/SASLogon/oauth/token \
     -H "Content-Type: application/x-www-form-urlencoded" -u "sas.cli:" -d "grant_type=password&username=sasdemo&password=mypassword"
{"access_token":"eyJhbGciOiJSUzI1NiIsImtpZCI6Imx...","token_type":"bearer","refresh_token":"eyJhbGciOiJSUzI1NiIsImtpZ...","expires_in":43199,"scope":"DataBuilders ApplicationAdministrators SASScoreUsers clients.read clients.secret uaa.resource openid PlanningAdministrators uaa.admin clients.admin EsriUsers scim.read SASAdministrators PlanningUsers clients.write scim.write","jti":"073bdcbc6dc94384bcf9b47dc8b7e479"}

From here, sending requests and refreshing the token steps are identical to the method explained in the authorization code example.

Final thoughts

At first, OAuth seems a little intimidating; however, after registering the client and creating the access and refresh tokens, the application will handle all authentication components . This process runs smoothly if you plan and make decisions up front. I hope this guide clears up any question you may have on securing your application with SAS. Please leave questions or comments below.

Authentication to SAS Viya: a couple of approaches was published on SAS Users.

1月 222019
 

You'll notice several changes in SAS Grid Manager with the release of SAS 9.4M6.
 
For the first time, you can get a grid solution entirely written by SAS, with no dependence on any external or third-party grid provider.
 
This post gives a brief architectural description of the new SAS grid provider, including all major components and their role. The “traditional” SAS Grid Manager for Platform has seen some architectural changes too; they are detailed at the bottom.

A new kid in town

SAS Grid Manager is a complex offering, composed of different layers of software. The following picture shows a very simple, high-level view. SAS Infrastructure here represents the SAS Platform, for example the SAS Metadata Server, SAS Middle Tier, etc. They service the execution of computing tasks, whether a batch process, a SAS Workspace server, and so on. In a grid environment these computing tasks are distributed on multiple hosts, and orchestrated/managed/coordinated by a layer of software that we can generically call Grid Infrastructure or Grid Middleware. That’s basically a set of lower-level components that sit between computing processes and the operating system.

Since its initial design more than a decade ago, the SAS Grid Manager offering has always been able to leverage different grid infrastructure providers, thanks to an abstraction layer that makes them transparent to end-user client software.
 
Our strategic grid middleware has been, since the beginning, Platform Suite for SAS, provided by Platform Computing (now part of IBM).
 
A few years ago, with the release of SAS 9.4M3, SAS started delivering an additional grid provider, SAS Grid Manager for Hadoop, tailored to grid environments co-located with Hadoop.
 
The latest version, SAS 9.4M6, opens up choices with the introduction of a new, totally SAS-developed grid provider. What’s its name? Well, since it’s SAS’s grid provider, we use the simplest one: SAS Grid Manager. To avoid confusion, what we used to call SAS Grid Manager has been renamed SAS Grid Manager for Platform.

The reasons for a choice

The SAS-developed provider for SAS Grid Manager:
 
• Is streamlined specifically for SAS workloads.
 
• Is easier to install (simply use the SAS Deployment Wizard (SDW) and administer.
 
• Extends workload management and scheduling capabilities into other technologies, such as
 
    o Third-party compute workloads like open source.
 
    o SAS Viya (in a future release).
 
• Reduces dependence of SAS Grid Manager on third party technologies.

So what are the new components?

The SAS-developed provider for SAS Grid Manager includes:
 
• SAS Workload Orchestrator
 
• SAS Job Flow Scheduler
 
• SAS Workload Orchestrator Web Interface
 
• SAS Workload Orchestrator Administration Utility

 
These are the new components, delivered together with others also available in previous releases and with other providers, such as the Grid Manager Thin Client Utility (a.k.a. SASGSUB), the SAS Grid Manager Agent Plug-in, etc. Let’s see these new components in more detail.

SAS Workload Orchestrator

The SAS Workload Orchestrator is your grid controller – just like Platform’s LSF is with SAS Grid Manager, it:
 
• Dispatches jobs.
 
• Monitors hosts and spreads the load.
 
• Is installed and runs on all machines in the cluster (but is not required on dedicated Metadata Server or Middle-tier hosts).
 
A notable difference, when compared to LSF, is that the SAS Workload Orchestrator is a single daemon, with its configuration stored in a single text file in json format.
 
Redeveloped for modern workloads, the new grid provider can schedule more types of jobs, beyond just SAS jobs. In fact, you can use it to schedule ANY job, including open source code running in Python, R or any other language.

SAS Job Flow Scheduler

SAS Job Flow Scheduler is the flow scheduler for the grid (just as Platform Process Manager is with SAS Grid Manager for Platform):
 
• It passes commands to the SAS Workload Orchestrator at certain times or events.
 
• Flows can be used to run many tasks in parallel on the grid.
 
• Flows can also be used to determine the sequence of events for multiple related jobs.
 
• It only determines when jobs are submitted to the grid, but they may not run immediately if the right conditions are not met (hosts too busy, closed, etc.)
 
The SAS Job Flow Scheduler provides flow orchestration of batch jobs. It uses operating system services to trigger the flow to handle impersonation of the user when it is time for the flow to start execution.
 
A flow can be built using the SAS Management Console or other SAS products such as SAS Data Integration Studio.
 
SAS Job Flow Scheduler includes the ability to run a flow immediately (a.k.a. “Run Now”), or to schedule the flow for some future time/recurrence.
 
SAS Job Flow Scheduler consists of different components that cooperate to execute flows:
 
SASJFS service is the main running service that handles the requests to schedule a flow. It runs on the middle tier as a dedicated thread in the Web Infrastructure Platform, deployed inside sasserver1. It uses services provided by the data store (SAS Content Server) and Metadata Server to read/write the configuration options of the scheduler, the content of the scheduled flows and the history records of executed flows.
 
Launcher acts as a gateway between SASJFS and OS Trigger. It is a daemon that accepts HTTP connections using basic authentication (username/password) to start the OS Trigger program as the scheduled user. This avoids the requirement to save end-users’ passwords in the grid provider, for both Windows and Unix.
 
OS Trigger is a stand-alone Java program that uses the services of the operating system to handle the triggering of the scheduled flow by providing a call to the Job Flow Orchestrator. On Windows, it uses the Windows Task Scheduler; on UNIX, it uses cron or crontab.
 
Job Flow Orchestrator is a stand-alone program that manages the flow orchestration. It is invoked by the OS scheduler (as configured by the OS Trigger) with the id of the flow to execute, then it connects to the SASJFS service to read the flow information, the job execution configuration and the credentials to connect to the grid. With that information, it sends jobs for execution to the SAS Workload Orchestrator. Finally, it is responsible for providing the history record for the flow back to SASJFS service.

Additional components

SAS Grid Manager provides additional components to administer the SAS Workload Orchestrator:
 
• SAS Workload Orchestrator Web Interface
 
• SAS Workload Orchestrator Administration Utility
 
Both can monitor jobs, queues, hosts, services, and logs, and configure hosts, queues, services, user groups, and user resources.
 
The SAS Workload Orchestrator Web Interface is a web application hosted by the SAS Workload Orchestrator process on the grid master host; it can be proxied by the SAS Web Server to always point to the current master in case of failover.

The SAS Workload Orchestrator Administration Utility is an administration command-line interface; it has a similar syntax to SAS Viya CLIs and is located in the directory /Lev1/Applications/GridAdminUtility. A sample invocation to list all running jobs is:
sas-grid-cli show-jobs --state RUNNING

What has not changed

Describing what has not changed with the new grid provider is an easy task: everything else.
 
Obviously, this is a very generic statement, so let’s call out a few noteworthy items that have not changed:
 
• User experience is unchanged. SAS programming interfaces to grid have not changed, apart from the lower-level libraries to connect to the new provider. As such, you still have the traditional SAS grid control server, SAS grid nodes, SAS thin client (aka SASGSUB) and the full SAS client (SAS Display Manager). Users can submit jobs or start grid-launched sessions from SAS Enterprise Guide, SAS Studio, SAS Enterprise Miner, etc.
 
• A directory shared among all grid hosts is still required to share the grid configuration files.
 
• A high-performance, clustered file system for the SASWORK area and for data libraries is mandatory to guarantee satisfactory performance.

What about SAS Grid Manager for Platform?

The traditional grid provider, now rebranded as SAS Grid Manager for Platform, has seen some changes as well with SAS 9.4M6:
 
• The existing management interface, SAS Grid Manager for Platform Module for SAS Environment Manager, has been completely re-designed. The user interface has completely changed, although the functions provided remain the same.
 
• Grid Management Services (GMS) is not updated to work with the latest release of LSF. Therefore, the SAS Grid Manager plug-in for SAS Management Console is no longer supported. However, the plug-in is included with SAS 9.4M6 if you want to upgrade to SAS 9.4M6 without also upgrading Platform Suite for SAS.

You can find more comprehensive information in these doc pages:
 
What’s New in SAS Grid Manager 9.4
 
• Grid Computing for SAS Using SAS Grid Manager (Part 2) section of Grid Computing in SAS 9.4

Native scheduler, new types of workloads, and more: introducing the new SAS Grid Manager was published on SAS Users.

1月 182019
 
Interested in learning about what's new in a software release? What if you want to know whether anything has changed in a SAS product? Or whether there are steps that you need to take before you upgrade to a new software release?
 
The online SAS product documentation for SAS® 9.4 and SAS® Viya® contains new sections that provide these answers. The following sections are usually listed on the left-hand side of the table of contents for the online Help document:
 
“What’s New in Base SAS: Details”
“What’s New in SAS 9.4 and SAS Viya”
“SAS Guide to Software Updates and Product Changes”
Note: To make the product-change information easier to find, this section was retitled for SAS® 9.4M6 and SAS® Viya® 3.4. For documentation about previous releases of SAS 9.4, this section is called “SAS Guide to Software Updates.” The information about product changes is included in a subsection called “Product Details and Requirements.” Although the title is different in newer documentation, the content remains the same.

What's available in each section?

• “What’s New” contains information about new features. For example, in SAS 9.4M6, “What’s New” discusses a new ODS destination for Word and a new procedure called PROC SGPIE.
 
• The “SAS Guide to Software Updates and Product Changes” includes the following subsections:
      o A section on software updates, which is for customers who are upgrading to a new release. (FYI: A software update is any modification that SAS provides for existing software. An upgrade is a new release of SAS. A maintenance release is a collection of updates that are applied to the currently installed software.)
      o Another subsection discusses product details and requirements. In it, you will find information about values or settings that have changed from one release to the next. For example, for SAS 9.4M6, the default style for HTML5 output changed from HTMLBlue to HTMLEncore. Another example is for SAS® 9.4M0, when the YEARCUTOFF= system option changed from 1920 to 1926.

Other links to these resources

In the “What’s New” documentation, there is a link in each section to the corresponding product topic in “SAS Guide to Software Updates and Product Changes.”

For example, when you scroll through this SAS Studio page, you see references both to various “What’s New” pages for different versions of SAS Studio and to “SAS Guide to Software Updates and Product Changes.”

In “What's New in Base SAS: Details,” you can search by the software and maintenance release to find new features. Beginning with SAS 9.4, new features for maintenance releases are introduced using the SAS 9.4Mx notation. For example, in the search box on the page, you can enter 9.4M6.

Final thoughts

With these new online Help sections, you can find information quickly about new features of the current SAS release, as well as what has changed from the previous release. As always, we welcome your feedback and suggestions for improving the documentation.

Special thanks

Elizabeth Downes and Marie Dexter in SAS Documentation Development were very willing to make the requested wording changes in the documentation. They also contributed to the content of this article. Thanks to both for their time and effort!

Navigating SAS documentation to find out about new, modified, and updated features was published on SAS Users.