
March 30, 2018
 

With SAS Viya 3.3, a new data transfer mechanism, MultiNode Data Transfer, has been introduced to move data between a data source and SAS Cloud Analytic Services (CAS), in addition to the Serial and Parallel data transfer modes. The new mechanism is an extension of the Serial data transfer mode. In MultiNode data transfer mode, each CAS Worker makes its own concurrent connection to read data from, or write data to, the source DBMS or distributed data system.

In CAS, SAS Data Connectors are used for Serial mode and SAS Data Connect Accelerators are used for Parallel mode data transfer between CAS and a DBMS. The SAS Data Connector can also be used for the MultiNode data transfer mechanism. In a multi-node CAS environment where the Data Connector is installed on all nodes, the Data Connector can make concurrent data access connections from each CAS Worker to read and write data from the data source environment.

The CAS Controller controls the MultiNode data transfer. It directs each CAS Worker node on how to query the source data and obtain the needed data. The CAS Controller checks the source data table for the first numeric column and uses its values to divide the table into slices, applying a MOD function with the number of CAS nodes specified. The higher the cardinality of the selected numeric column, the more evenly the data can be divided into slices. If CAS chooses a low-cardinality column, you could end up with poor data distribution on the CAS Worker nodes. The CAS Controller directs each CAS Worker to submit a query to obtain its slice of data. During this process, each CAS Worker makes an independent, concurrent request to the data source environment.

Data is transferred from the source environment to the CAS worker nodes directly using a single thread connection, bypassing the CAS Controller.

The following diagrams describe data access from CAS to the data source environment using MultiNode Data Transfer mode. CAS is hosted in a multi-node environment with the SAS Data Connector installed on each node (CAS Controller and Workers). A CASLIB is defined with NUMREADNODES= and NUMWRITENODES= values other than 1. With each data table access request, the CAS Controller scans the source data table for the first numeric column and uses its values to prepare a query for each CAS Worker to run. Each CAS Worker node submits an individual query to get its slice of the data, something like:

Select * from SourceTable where mod(NumericField, NUMREADNODES) = WorkerNodeNumber

The data moves from the DBMS gateway server to each CAS Worker node directly over a single-threaded connection, bypassing the CAS Controller. It is a kind of parallel load using the serial mechanism, but it is not a massively parallel data load. Note the bottleneck at the DBMS gateway server: the data transfer always passes through the DBMS gateway server on its way to the CAS Worker nodes.
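
As an illustration of the MOD-based slicing described above, the following DATA step is a minimal sketch that assigns made-up values of a numeric column to slices. It is not what CAS runs internally. The data set name slice_demo, the column year with values 1993 to 1998, and the node count of 7 are all assumptions chosen for the example.

```sas
/* Illustration only: how a MOD predicate maps column values to slices. */
/* The column values and the node count are made up for this sketch.    */
data slice_demo;
   numReadNodes = 7;                     /* assumed number of CAS Workers   */
   do year = 1993 to 1998;               /* low cardinality: 6 values only  */
      slice = mod(year, numReadNodes);   /* slice that rows with this value */
      output;                            /* would be assigned to            */
   end;
run;

/* Count how many distinct column values land in each slice */
proc freq data=slice_demo;
   tables slice / nocum nopercent;
run;
```

With only six distinct values, at most six of the seven slices can receive rows, so at least one worker would stay empty no matter how many rows the table has.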

Multi Node Data Transfer

Prerequisites to enable MultiNode Data Transfer include:

  • The CAS environment is a multi-node environment (multiple CAS Worker Nodes).
  • The SAS Data Connector for the data source is installed on each CAS Worker and Controller node.
  • The data source client connection components are installed on each CAS Worker and Controller node.

By default, the SAS Data Connector uses serial data transfer mode. To enable MultiNode Data Transfer mode, you must specify the NUMREADNODES= and NUMWRITENODES= parameters in the CASLIB statement with a value other than 1. If the value is 0, CAS uses all available CAS Worker nodes. MultiNode Data Transfer mode can use only as many nodes as are available; if you specify more than the available number, the log prints a warning message.

The following code example loads data using MultiNode data transfer mode. It assigns a CASLIB in serial mode with NUMREADNODES=10 and NUMWRITENODES=10 and loads data from a Hive table into CAS. Because the NUMREADNODES= value is other than 1, the load follows the MultiNode mechanism. The log contains a warning that the NUMREADNODES value exceeds the number of available Worker nodes. Specifying a number higher than the available CAS Worker nodes is therefore one way to verify that CAS is using MultiNode data transfer mode. If you specify NUMREADNODES=0, CAS uses all available nodes, but the SAS log contains no note or warning about multi-node usage.

/* Start a CAS session */
CAS mySession SESSOPTS=( CASLIB=casuser TIMEOUT=99 LOCALE="en_US" metrics=true);

/* Serial-mode caslib; NUMREADNODES/NUMWRITENODES other than 1 enable MultiNode transfer */
caslib HiveSrl datasource=(srctype="hadoop",
   server="xxxxxxx.xxx",
   username="hadoop",
   dataTransferMode="SERIAL",
   NUMREADNODES=10,
   NUMWRITENODES=10,
   hadoopconfigdir="/opt/MyHadoop/CDH/Config",
   hadoopjarpath="/opt/MyHadoop/CDH/Jars",
   schema="default");

/* Load the Hive table into CAS */
proc casutil;
   load casdata="prdsal2_1G" casout="prdsal2_1G"
        outcaslib="HiveSrl" incaslib="HiveSrl" ;
quit;

SAS Log extract:

….
77 proc casutil;
78 ! load casdata="prdsal2_1G" casout="prdsal2_1G"
79 outcaslib="HiveSrl" incaslib="HiveSrl" ;
NOTE: Executing action 'table.loadTable'.
NOTE: Performing serial LoadTable action using SAS Data Connector to Hadoop.
WARNING: The value of numReadNodes(10) exceeds the number of available worker nodes(7). The load will proceed with numReadNodes=7. 
…
..

On the database side, in this case Hive, note the queries submitted by the CAS Worker nodes. Each includes the MOD function in its WHERE clause, as described above.

In the Hadoop Resource Manager user interface, you can see the corresponding job execution for each query submitted by the CAS Worker nodes.

When you use MultiNode mode to load data to CAS, data distribution depends on the cardinality of the numeric column that CAS selects for the MOD function. The CAS data distribution for the table loaded above is not ideal, because CAS selected a column (‘year’) that is a poor choice, in this case, for distributing data across the CAS Worker nodes. The MultiNode mechanism has no option to specify the column to be used for query preparation and, eventually, for data distribution.
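
One way to inspect how the loaded rows are spread across workers is the table.tableDetails action with level="node". The sketch below assumes the session name mySession and the caslib and table names from the load example above; the exact columns in the output can vary by release.

```sas
/* Sketch: report table details per CAS node for the loaded table.       */
/* mySession, HiveSrl and prdsal2_1G are taken from the example above.   */
proc cas;
   session mySession;
   table.tableDetails /
      caslib="HiveSrl"
      name="prdsal2_1G"
      level="node";        /* one output row per node instead of a total */
quit;
```

A strongly uneven row count across nodes in this output would suggest that the column chosen for the MOD function has low cardinality.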

If CAS cannot find suitable columns for MultiNode data transfer mode, it will use standard Serial mode to transfer data as shown in the following log:

……..
74
74 ! load casdata="prdsal2_char" casout="prdsal2_char"
75 outcaslib="HiveSrl" incaslib="HiveSrl" ;
NOTE: Executing action 'table.loadTable'.
NOTE: Performing serial LoadTable action using SAS Data Connector to Hadoop.
WARNING: The value of numReadNodes(10) exceeds the number of available worker nodes(7). The load will proceed with numReadNodes=7.
WARNING: Unable to find an acceptable column for multi-node reads. Load will proceed with numReadNodes = 1. 
NOTE: Cloud Analytic Services made the external data from prdsal2_char available as table PRDSAL2_CHAR in caslib HiveSrl.
…….

Data platforms supported with MultiNode Data Transfer using a Data Connector:

  • Hadoop
  • Impala
  • Oracle
  • PostgreSQL
  • Teradata
  • Amazon Redshift
  • DB2
  • MS SQL Server
  • SAP HANA

SAS looks for a column to divide data into slices for a MultiNode data read using data types in the following order:

  • INT (includes BIGINT, INTEGER, SMALLINT, TINYINT)
  • DECIMAL
  • NUMERIC
  • DOUBLE

Multi-Node Write:

While this post focused on loading data from a data source into CAS, multi-node data transfer also works when saving from CAS back to the data source. The important parameter when saving is NUMWRITENODES instead of NUMREADNODES. The behavior of multi-node saving is similar to that of multi-node loading.
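
As a minimal sketch of a multi-node save, the PROC CASUTIL step below writes an in-memory table back through the same caslib. It assumes the HiveSrl caslib from the load example (defined with a NUMWRITENODES= value other than 1) and uses a hypothetical target table name, prdsal2_1G_copy.

```sas
/* Sketch: save a CAS table back to the data source through the caslib.  */
/* The caslib's NUMWRITENODES= setting governs how many workers write.   */
/* prdsal2_1G_copy is a hypothetical target name for this example.       */
proc casutil;
   save casdata="prdsal2_1G" incaslib="HiveSrl"
        casout="prdsal2_1G_copy" outcaslib="HiveSrl" replace;
quit;
```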

Summary:

The SAS Data Connector can be used for MultiNode data transfer, without additional license fees, by installing the Data Connector and DBMS client components on all CAS Worker nodes. The source data is transferred directly from the DBMS gateway server to the CAS Worker nodes, divided up by a simple MOD function. With this mechanism, optimal data distribution across CAS nodes is not guaranteed. It is suggested that you use all CAS Worker nodes by specifying NUMREADNODES=0 when loading data to CAS in MultiNode mode.

Multi Node Data Transfer to CAS was published on SAS Users.

March 30, 2018
 

SAS Global Forum 2018 takes place April 8-11 in Denver. The event will attract nearly 6,000 SAS professionals from around the world and from nearly every industry for a learning and networking experience second to none. At this year's event I'll be presenting the paper Getting More Insight into Your Forecast Errors with the GLMSELECT and QUANTSELECT Procedures on Tuesday, April 10 from 2:30-3:30 in Meeting Room 401. SAS Global Forum is my favorite SAS user event for so many reasons. Here are the top 10 reasons I'm particularly excited about this year's event.

Reason #1: Opening Session
It's always impressive to see the latest and greatest that SAS software has to offer. I'm particularly looking forward to hearing the latest SAS Viya news and developments in Machine Learning. Keynotes by Oliver Schabenberger and other SAS executives highlight the event.

Reason #2: Meeting my favorite SAS bloggers
It's inspiring to see how Rick Wicklin, Robert Allison, Chris Hemedinger and Sanjay Matange spend a lot of (spare?) time to provide SAS Tips for the SAS Community and SAS blogs. Thank you!

Reason #3: Presenting my paper
Inspired by Paul Goodwin, my paper will include lots of @SASsoftware code and SAS tips for SAS users and the SAS community.

Reason #4: Co-Authoring paper 2419: "An Easier and Faster Way to Untranspose a Wide File"
You'll learn important data prep and data quality tasks and the benefits of using @SASsoftware code to perform these tasks.

Reason #5: Meeting other SAS Press authors
It'll be a lot of fun meeting other SAS book authors including Tricia Aanderud, Sanjay Matange, Rick Wicklin, Chris Hemedinger and Robert Allison to discuss their experiences and meet the charming ladies of the SAS Press team.

Reason #6: Meeting SAS Press readers
Excited about meeting the SAS users who read SAS books at the SAS Press booth in the QUAD to discuss their experiences with SAS software for analytics.

Reason #7: Taking care of our SAS customers from Austria, Germany and Switzerland
It'll be good to see so many SAS software users from our region, many of whom are travelling to Denver to see news on SAS analytics, Machine Learning, Deep Learning, Artificial Intelligence and SAS Viya.

Reason #8: Meeting students who use SAS
It will be exhilarating exchanging ideas with students using SAS software for their research and classes. It's so refreshing to see the increasing interest in SAS from universities around the world.

Reason #9: Getting inspired
Always a great thing to see nearly 6,000 people with an interest in SAS software and SAS analytics interacting and exchanging best practices. I always return to work with a renewed sense of motivation and energy.

Reason #10: Interest in my SAS books and papers
Thanks a lot for your interest in my work! It is always a pleasure to discuss all things SAS with each and every one of you.

Hope to see many of you in Denver!

10 Reasons why I look forward to SAS Global Forum was published on SAS Users.

March 29, 2018
 

With the release of SAS Viya 3.3, you now have the ability to pass implicit SQL queries to a variety of SQL data sources, including Hive. Under an implicit pass-through, users can write SAS compliant SQL code, and SAS will:

  1. Convert as much code as possible into database native SQL.
  2. Execute the resulting query in-database.
  3. Bring the result back into SAS Viya.

My SAS Viya is co-located within a Hortonworks Hadoop environment. Within this environment, I have set up multiple tables within Hive, which provides structure and a query-like environment for Hadoop data. Using the SAS Data Explorer in SAS Viya, I can easily see the different tables in the Hive environment, and visually inspect them without having to load the data into SAS. The screenshot below shows the Hive table va_service_detail, which contains anonymous data related to recent hospital stays.

SQL Pass-through to Hive in SAS Viya

In my Hive environment, I have a second table called va_member_detail, which contains information about the individuals who were hospitalized in the above table, va_service_detail. A summary of this Hive table can be found in the screenshot below.

Using this data, I would like to perform an analysis to determine why patients are readmitted to the hospital, and understand how we can preventatively keep patients healthy. I will need to join these two tables to allow me to have visit-level and patient-level information in one table. Since medical data is large and messy, I would like to only import the needed information into SAS for my analysis.  The simplest way to do this is through an implicit SQL pass-through to Hive, as shown below:
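
The original code screenshot is not reproduced here. As a minimal sketch of what such an implicit pass-through join could look like, the FedSQL step below assumes a CAS session named mySession, a caslib named hadoop that points to the Hive server, and hypothetical column names (member_id, admit_date, age, gender) and output table name (va_visits_members); only the two Hive table names come from the post.

```sas
/* Sketch of an implicit pass-through join, under the assumptions above. */
/* FedSQL is written against the caslib; SAS decides how much of the     */
/* query can be translated and executed in Hive.                         */
proc fedsql sessref=mySession;
   create table public.va_visits_members as
   select s.member_id,        /* hypothetical key column              */
          s.admit_date,       /* hypothetical visit-level column      */
          m.age,              /* hypothetical patient-level columns   */
          m.gender
   from hadoop.va_service_detail s
        inner join hadoop.va_member_detail m
        on s.member_id = m.member_id;
quit;
```

Selecting only the needed columns keeps the result that travels from Hive to SAS as small as possible.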

With an implicit pass-through, I write normal SAS FedSQL code on top of a SAS Library called “Hadoop” pointing to my Hive Server. Once the code is submitted, the SAS System performs the following steps:

  1. Translates the SAS FedSQL code into HiveQL.
  2. Executes the HiveQL script in Hive.
  3. Loads the resulting data in parallel into SAS.

Looking at the log, we can see that the SQL statement was “Fully offloaded to the underlying data source via full pass-through”, meaning that SAS successfully executed the query, in its entirety, in Hive. With the SAS Embedded Process for Hadoop, the resulting table is then lifted in parallel from Hive into SAS Viya, making it available for analysis.

As we can see in the log, it took 42 seconds to execute the query in Hive, and bring the result into SAS. To compare efficiency, I redid the analysis, loading va_service_detail and va_member_detail into the memory of the SAS server and performed the join there. The execution took 58 seconds, but required three in-memory tables to do so, along with much more data passing through the network. The implicit pass-through has the benefits of increased speed and decreased latency in data transfer by pushing the query to its source, in this case Hive.

Conclusion

The Implicit SQL Pass-through to Hive in SAS Viya is a must-have tool for any analyst working with Hadoop data. With normal SQL syntax in a familiar SAS interface, analysts can push down powerful queries into Hive, speeding up their analysis while limiting data transfer. Analysts can effectively work with large, ever-growing data sizes and shorten the time to value on solving key business challenges.

Implicit SQL Pass-through to Hive in SAS Viya was published on SAS Users.

March 22, 2018
 

Generating HTML output might be something that you do daily. After all, HTML is now the default format for Display Manager SAS output, and it is one of the available formats for SAS® Enterprise Guide®. In addition, SAS® Studio generates HTML 5.0 output as a default. The many faces of HTML are also seen during everyday operations, which can include the following:

  • Creating reports for the corporate intranet.
  • Creating a responsive design so that content is displayed well on all devices (including mobile devices).
  • Emailing HTML within the body of an email message.
  • Embedding figures in a web page, making the page easier to send in an email.

These tasks show the need for and the true power and flexibility of HTML. This post shows you how to create HTML outputs for each of these tasks with the Output Delivery System (ODS). Some options to use include the HTML destination (which generates HTML 4.01 output by default) or the HTML5 destination (which generates HTML 5.0 output by default).

Reports

With the HTML destination and PROC REPORT, you can create a summary report that includes drill-down data along with trafficlighting.

   ods html path="c:\temp" file="summary.html";	
 
   proc report data=sashelp.prdsale;
      column Country  Actual Predict; 
      define Country / group;
      define actual / sum;
      define predict / sum;
      compute Country;
         drillvar=cats(country,".html");
         call define(_col_,"url",drillvar);
      endcomp;
   run;
 
   ods html close;
 
   /* Create Detail data */
 
   %macro detail(country);
   ods html path="c:\temp" file="&country..html";

   proc report data=sashelp.prdsale(where=(country="&country"));
      column Country region product Predict Actual;
      /* Traffic lighting: shade Actual green when it exceeds Predict */
      compute actual;
         if actual.sum > predict.sum then
            call define(_col_,"style","style={background=green}");
      endcomp;
   run;

   ods html close;
   %mend;

   %detail(CANADA)
   %detail(GERMANY)
   %detail(U.S.A.)

Generating HTML output

In This Example

  • The first ODS HTML statement uses a COMPUTE block to create drill-down data for each Country variable. The CALL DEFINE statement within the COMPUTE block uses the URL access method.
  • The second ODS HTML statement creates targets for each of the drill-down values in the summary table by using SAS macro language to subset the data. The filename is based on the value.
  • Trafficlighting is added to the drill-down data. The added color is set to occur within a row when the data value within the Actual Sales column is larger than the data value for the Predicted Sales column.

HTML on Mobile Devices

One approach to generating HTML files is to assume that users access data from mobile devices first. Therefore, each user who accesses a web page on a mobile device should have a good experience. However, the viewport (visible area) is smaller on a mobile device, which often creates a poor viewing experience. Using the VIEWPORT meta tag in the METATEXT= option tells the mobile browser how to size the content that is displayed. In the following output, the content width is set to be the same as the device width, and the initial-scale property controls the zoom level when the page first loads.

<meta name="viewport" content="width=device-width, initial-scale=1">

   /* Keep the METATEXT= value on one line so that no line break */
   /* ends up inside the quoted string                           */
   ods html path="C:\temp" file="mobile.html"
      metatext='name="viewport" content="width=device-width, initial-scale=1"';

   proc print data=sashelp.prdsale;
      title "Viewing Output Using Mobile Device";
   run;

   ods html close;

In This Example

  • The HTML destination and the METATEXT= option set the width of the output to the width of the mobile device, and the zoom level for the initial load is set.

HTML within Email

Sending SMTP (HTML) email enables you to send HTML within the body of a message. The body can contain styled output as well as embedded images. To generate HTML within email, you must set the EMAILSYS= option to SMTP, and the EMAILHOST= option must be set to the email server. To generate the email, use a FILENAME statement with the EMAIL access method, along with an HTML destination. You can add an image by using the ATTACH= option along with the INLINED= option to add a content identifier, which is defined in a later TITLE statement. For content to appear properly in the email, the CONTENT_TYPE= option must be set to text/html.

The MSOFFICE2K destination is used here instead of the HTML destination because it holds the style better for non-browser-based applications, like Microsoft Office. The ODSTEXT procedure adds the text to the message body.
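
The EMAILSYS= and EMAILHOST= system options mentioned above have to be in effect before the FILENAME statement runs. A minimal sketch follows; the host name smtp.example.com and port 25 are placeholders to be replaced with your site's mail server settings.

```sas
/* Sketch: route SAS email through an SMTP server.              */
/* smtp.example.com and port 25 are placeholder values.         */
options emailsys=smtp
        emailhost="smtp.example.com"
        emailport=25;
```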

   filename mymail email to="chevell.parker@sas.com"
                       subject="Forecast Report"
                       attach=('C:\SAS.png' inlined="logo")
                       content_type="text/html";   
 
   ods msoffice2k file=mymail rs=none style=htmlblue options(pagebreak="no");
     title j=l '<img src="cid:logo" width="120" height="100" />';
     title2 "Report for Company XYZ";
 
 
   proc odstext;
      H3 "Confidential!";
   run;
 
   title;   
   proc print data=sashelp.prdsale;
   run;
 
   ods msoffice2k close;

In This Example

  • The FILENAME statement with the EMAIL access method is used.
  • The ATTACH= option specifies the image to include.
  • The INLINED= option specifies a content identifier.
  • The CONTENT_TYPE= option is text/html for HTML output.
  • The ODSTEXT procedure adds the text before the table.
  • The TITLE statement defines the “logo” content identifier.

Graphics within HTML

The ODS HTML5 destination has many benefits, such as the ability to embed graphics directly in an HTML file (and the default file format is SVG). The ability to embed the figure is helpful when you need to email the HTML file, because the file is self-contained. You can also add a table of contents inline to this file.

ods graphics / height=2.5in width=4in;
ods html5 path="c:\temp" file="html5output.html";
   proc means data=sashelp.prdsale;
   run;
 
   proc sgplot data=sashelp.prdsale;
      vbar product / response=actual;
   run;
 
   ods html5 close;

In This Example

  • The ODS HTML5 statement creates a table along with an embedded figure. The image is stored as an SVG file within the HTML file.

Conclusion

HTML is used in many ways when it comes to reporting. Various ODS destinations can accommodate the specific output that you need.

The many faces of HTML was published on SAS Users.

March 17, 2018
 

This is a continuation of my previous blog post on SAS Data Studio and the Code transform. In this post, I will review some additional examples of using the Code transform in a SAS Data Studio data plan to help you prepare your data for analytic reports and/or models.

Create a Unique Identifier Example

The DATA step code below combines the _THREADID_ and the _N_ variables to create a UniqueID for each record.
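
The original code screenshot is not reproduced here. The following is a minimal sketch of such a DATA step, assuming the Data Studio plan variables described in Part 1 of this series; the length of 40 and the underscore separator are choices made for this example.

```sas
/* Sketch: build a UniqueID from the thread number and the row counter. */
/* _THREADID_ and _N_ are automatic variables; nothing else is assumed  */
/* about the input table.                                               */
data {{_dp_outputTable}} (caslib={{_dp_outputCaslib}});
   length UniqueID $ 40;
   set {{_dp_inputTable}} (caslib={{_dp_inputCaslib}});
   /* _N_ restarts in every thread, so it is paired with _THREADID_ */
   UniqueID = catx('_', _threadid_, _n_);
run;
```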

SAS Data Studio Code Transform

The variable _THREADID_ returns the number that is associated with the thread that the DATA step is running in a server session. The variable _N_ is an internal system variable that counts the iterations of the DATA step as it automatically loops through the rows of an input data set. The _N_ variable is initially set to 1 and increases by 1 each time the DATA step loops past the DATA statement. The DATA step loops past the DATA statement for every row that it encounters in the input data. Because the DATA step is a built-in loop that iterates through each row in a table, the _N_ variable can be used as a counter variable in this case.

_THREADID_ and _N_ are variables that are created automatically by the SAS DATA step and saved in memory. For more information on automatic DATA step variables refer to its

Cluster Records Example

The DATA step code below combines the _THREADID_ and the counter variables to create a unique ClusterNum for each BY group.
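
The original code screenshot is not reproduced here. A minimal sketch of the idea, assuming a hypothetical BY variable named State and the Data Studio plan variables for table and caslib names:

```sas
/* Sketch: assign one ClusterNum per BY group.                        */
/* State is a hypothetical BY variable; replace it with your own.     */
data {{_dp_outputTable}} (caslib={{_dp_outputCaslib}});
   length ClusterNum $ 40;
   set {{_dp_inputTable}} (caslib={{_dp_inputCaslib}});
   by State;
   /* Sum statement: counter is retained and starts at 0 in each thread */
   if first.State then counter + 1;
   /* Prefix with the thread number so values differ across threads */
   ClusterNum = catx('_', _threadid_, counter);
   drop counter;
run;
```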

This code uses the concept of FIRST.variable to increase the counter if it is the beginning of a new grouping. FIRST.variable and LAST.variable are variables that CAS creates for each BY variable. CAS sets FIRST.variable when it is processing the first observation in a BY group, and sets LAST.variable when it is processing the last observation in a BY group. These assignments enable you to take different actions, based on whether processing is starting for a new BY group or ending for a BY group. For more information, refer to the topic

De-duplication Example

The DATA step code below outputs the last record of each BY group; therefore, de-duplicating the data set by writing out only one record per grouping.
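
The original code screenshot is not reproduced here. A minimal sketch, again assuming a hypothetical BY variable named State and the Data Studio plan variables:

```sas
/* Sketch: keep only the last record of each BY group.               */
/* State is a hypothetical BY variable; replace it with your own.    */
data {{_dp_outputTable}} (caslib={{_dp_outputCaslib}});
   set {{_dp_inputTable}} (caslib={{_dp_inputCaslib}});
   by State;
   if last.State then output;   /* one row per BY group is written */
run;
```

Using first.State instead of last.State would keep the first record of each group rather than the last.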

Below are the de-duplication results on the data set used in the previous Cluster Records Example section.

For more information about DATA step, refer to the

Below is the resulting customers2.xlsx file in the Public CAS library.

For more information on the available action sets, refer to the SAS® Cloud Analytic Services 3.3: CASL Reference guide.

For more information on SAS Data Studio and the Code transform, please refer to this

SAS Data Studio Code Transform (Part 2) was published on SAS Users.

March 17, 2018
 

SAS Data Studio is a new application in SAS Viya 3.3 that provides a mechanism for performing simple, self-service data preparation tasks to prepare data for use in SAS Visual Analytics or other applications. It is accessed via the Prepare Data menu item or tile on SAS Home. Note: A user must belong to the Data Builders group in order to have access to this menu item.

In SAS Data Studio, you can either select to create a new data plan or open an existing one. A data plan starts with a source table and consists of transforms (steps) that are performed against that table. A plan can be saved and a target table can be created based on the transformations applied in the plan.

SAS Data Studio Code Transform

SAS Data Studio

In a previous blog post, I discussed the Data Quality transforms in SAS Data Studio. This post is about the Code transform, which enables you to create custom code to perform actions or transformations on a table. To add custom code using the Code transform, select the code language from the drop-down menu, and then enter the code in the text box. The following code languages are available: CASL or DATA step.

Code Transform in SAS Data Studio

Each time you run a plan, the table and library names might change. To avoid errors, you must use variables in place of table and caslib names in your code within SAS Data Studio. Indicating variables in place of table and library names eliminates the possibility that the code will fail due to name changes.  Errors will occur if you use literal values. This is because session table names can change during processing.  Use the following variables:

  • _dp_inputCaslib – variable for the input CAS library name.
  • _dp_inputTable – variable for the input table name.
  • _dp_outputCaslib – variable for the output CAS library name.
  • _dp_outputTable –  variable for the output table name.

Note: For DATA step only, variables must be enclosed in braces, for example, data {{_dp_outputTable}} (caslib={{_dp_outputCaslib}});.

The syntax of “varname”n is needed for variable names with spaces and/or special characters. Refer to the Avoiding Errors When Using Name Literals help topic for more information. There are also several

CASL Code Example
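
The original code screenshot is not reproduced here. The following CASL is a minimal sketch of a summary query run through the fedSQL action set. The column name State_STND and the result column name Count are hypothetical; the _dp_ variables are the plan variables listed above, used here as plain CASL variables.

```sas
/* Sketch: count rows per standardized state value with fedSQL.      */
/* State_STND and Count are hypothetical column names.               */
loadActionSet "fedSQL";

/* Build the query text from the plan variables so that no table or  */
/* caslib name is hard-coded                                          */
sqlText = 'create table "' || _dp_outputCaslib || '"."' || _dp_outputTable || '" '
       || '{options replace=true} as '
       || 'select "State_STND", count(*) as "Count" '
       || 'from "' || _dp_inputCaslib || '"."' || _dp_inputTable || '" '
       || 'group by "State_STND"';

fedSQL.execDirect / query=sqlText;
```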

The CASL code example above uses the ActionSet fedSQL to create a summary table of counts by the standardized State value.  The results of this code are pictured below.

Results from CASL Code Example

For more information on the available action sets, refer to the SAS® Cloud Analytic Services 3.3: CASL Reference guide.

DATA Step Code Example

In this DATA step code example above, the BY statement is used to group all records with the same BY value. If you use more than one variable in a BY statement, a BY group is a group of records with the same combination of values for these variables. Each BY group has a unique combination of values for the variables.  On the CAS server, there is no guarantee of global ordering between BY groups. Each DATA step thread can group and order only the rows that are in the same data partition (thread).  Refer to the help topic

Results from DATA Step Code Example

For more information about DATA step, refer to the

In my next blog post, I will review some more code examples that you can use in the Code transform in SAS Data Studio. For more information on SAS Data Studio and the Code transform, please refer to this

SAS Data Studio Code Transform (Part 1) was published on SAS Users.

March 15, 2018
 

In the medical field, an autopsy is valuable because it helps you understand the cause of death. But what’s more valuable is identifying the leading indicators of an illness so that you can address it before the Grim Reaper comes knocking. Best-in-class organizations are taking a similar approach to their fraud detection, shifting from a purely hindsight view to insight and even foresight: getting out in front of the fraud before it happens, before revenue is lost, reputation is damaged and regulators apply even more pressure.

Proactively detecting fraud isn’t easy though. There is the nature of the challenge itself: Fraud is a behavioral problem and one that is dynamic, complex and often sophisticated. Then, there is the data challenge – lots of it and in many different formats, including structured and unstructured. Next is the analytics. There are many techniques available, and some might be good, and others not. Finally, the technology. There is no shortage of solutions, but they can be expensive and organizations need to beware of ending up with a collection of siloed, single-point solutions that don’t tell the full story.

That said, unless you’re willing to close your business, which is the only surefire way to get to 0% fraud, you’ve got to tackle it.

How to tackle fraud?

For starters, I advise leaders to define their risk appetite and tolerance. What is the level of risk that you – and the organization – can live with? If you can live with 5%, let’s say, then that’s your true North and benchmark to measure against. Once the risk appetite is set, next comes the balancing act of strategic long-term view and tactical short-term needs plus balancing fraud prevention against the customer experience, and more. Then, make sure you have the data, technology, people, processes, governance and analytics in place to continuously measure and refine.

What we are seeing today is that analytics is a key component of moving fraud detection from hindsight to foresight. It starts with dividing risk into three classes. The first is what you know. I have fraud, it’s happening, and I can put business rules in place to detect it. It’s a repeatable pattern that usually responds well to the “if x, then y” formula. The second class is what you do not know. This is about anomaly detection: highlighting things that don’t happen often but stand out when they do. The third, and most challenging class, is when you don’t even know what you’re looking for. Is it a needle in a haystack? Maybe a rusty nail? This is where AI and ML come into play.

Applying best-in-class tools allows organizations to ingest enormous sets of data, including text, voice, social, structured and unstructured data. Adding best-in-class analytics helps to sort the noise from signals, and advanced analytics including Artificial Intelligence, Machine Learning and Natural Language Processing enable organizations to move faster, by processing in real time, and benefit from iterative learning, where humans help models become smarter and smarter until they can improve themselves every single time. And, of course, the best solutions provide an end-to-end analytics lifecycle from data to analytics to insights.

There’s no question that fraud is complex and challenging, but unless you’re willing to send your business to the morgue – and close your doors forever – you’ve got to tackle it. And, thanks to advances in analytics, we can help stop fraud before it starts.

Find out more at SAS Global Forum 2018

Join Constantine Boyadjiev for his “Suspect Behavior Identification through Sentiment Analysis and Communication Surveillance” Breakout Session at SAS Global Forum 2018 on April 10 at 3 p.m. in Mile High Ballroom Theater C.

Move fraud detection from hindsight to insight to foresight was published on SAS Users.

March 13, 2018
 

SAS Visual Analytics 8.2 introduces the Hidden Data Role. This role can accept one or more category or date data items which will be included in the query results but will not be displayed with the object. You can use this Hidden Data Role in:

  • Mapping Data Sources.
  • Color-Mapped Display Rules.
  • External Links.

Note that the Hidden Data Role is not available for all objects. A data item cannot be used as both a Hidden Data Role and a data tip value; it can be assigned to only one role.

In this example, we will look at how to use the Hidden Data Role for an External Link.

Here are a few applications of this example:

  • You want to show an index of available assets, and you have a URL to point directly to that asset.
  • Your company sells products, and you want to show a table summary of product profit with a URL that points to each product’s development page.
  • As the travel department, you want to see individual travel reports rolled up to owner, but have a URL that can link out to each individual report.

The applications are endless when applied to our customer needs.

In my blog example, I have NFL data for Super Bowl wins. I have attached two columns of URLs for demonstration purposes:

  • One URL is for each Super Bowl event, so I have 52 URLs, one for each row of data.
  • The second URL is for each winning team. There have been 20 unique Super Bowl winning teams, so I have 20 unique URLs.

Hidden Data Role in SAS Visual Analytics

In previous versions of SAS Visual Analytics, if you wanted to link out to one of these URLs, you had to include it in the visualization, as in the List Table shown above. Now, using SAS Visual Analytics 8.2, you can assign a column containing these URLs to the Hidden Data Role and it will be available as an External URL.

Here is our target report. We want to be able to link to the Winning Team’s website.

In Visual Analytics 8.2, for the List Table, assign the Winning Team URL column to the Hidden Data Role.

Then, for the List Table, create a new URL Link Action. Give the Action a name and leave the URL section blank, because my data column contains a fully qualified URL. If you were linking to a destination and only needed to append a name-value pair, you could enter the partial URL and pass the parameter value, but that’s a different example.

That example uses the column with 20 URLs, one per winning team, in the Hidden Data Role. Now, what if we use the column that has the 52 URLs that link out to the individual Super Bowl events?

That’s right, the cardinality of the Hidden Data Role item does impact the object. Even though the hidden data item is not visible on the object, it is still included in the query results, so its cardinality affects how the data is aggregated.
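The effect can be sketched outside Visual Analytics with a tiny made-up table. The rows, URLs, and the helper name count_groups below are all hypothetical; the point is only that grouping by a higher-cardinality hidden column yields more result rows.

```shell
# Hypothetical rows: winning team, team URL, event URL (made-up values).
rows='TeamA,http://example.com/teams/a,http://example.com/events/1
TeamA,http://example.com/teams/a,http://example.com/events/2
TeamB,http://example.com/teams/b,http://example.com/events/3'

# count_groups COLS: count the distinct combinations of the given
# comma-separated column numbers, i.e. the rows a grouped query returns.
count_groups() {
    printf '%s\n' "$rows" | cut -d',' -f"$1" | sort -u | awk 'END { print NR }'
}

count_groups 1,2   # team plus team URL: one row per team
count_groups 1,3   # team plus event URL: one row per event
```

With the team URL as the hidden item the grouping stays at one row per team, while the event URL splits each team into one row per event.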

Notice that some objects will just present an information warning that a duplicate classification of the data has caused a conflict.

In conclusion, the Hidden Data Role is an exciting addition to the SAS Visual Analytics 8.2 release. I know you'll enjoy and benefit from it.

The power behind a Hidden Data Role in SAS Visual Analytics was published on SAS Users.

March 9, 2018

SAS Viya 3.3 introduces a set of command-line interfaces that SAS Viya administrators will find extremely useful. The command-line interfaces (CLIs) allow administrators to perform numerous administrative tasks in batch as an alternative to using the SAS Environment Manager interface. In addition, calls to the CLIs can be chained together in scripts to automate more complex administration tasks. In this post I will introduce the administration CLIs and look at a few useful examples.

The sas-admin CLI is the main interface; it acts as a wrapper for the other CLIs. The individual CLIs operate as interfaces to functionality from within sas-admin. The CLIs provide a simplified interface to the SAS Viya REST services. They abstract the functionality of the REST services, allowing an administrator to enter commands on a command line and receive a response back from the system. If the CLIs do not surface all the functionality you need, calls to the REST API can be made to fill in the gaps.

In SAS Viya 3.3 the available interfaces (plug-ins) within sas-admin are:

Plugin          Purpose
audit           Gets SAS audit information.
authorization   Gets general authorization information, creates and manages rules and permissions on folders.
backup          Manages backups.
restore         Manages restore operations.
cas             Manages CAS administration and authorization.
configuration   Manages the operations of the configuration service.
compute         Manages the operations of the compute service.
folders         Gets and manages SAS folders.
fonts           Manages VA fonts.
devices         Manages mobile device blacklist and whitelist actions and information.
identities      Gets identity information, and manages custom groups and group membership.
licenses        Manages SAS product license status and information.
job             Manages the operations of the job flow scheduling service.
reports         Manages SAS Visual Analytics 8.2 reports.
tenant          Manages tenants in a multi-tenant deployment.
transfer        Promotes SAS content.

 

The command-line interfaces are located on a SAS Viya machine (any machine in the commandline host group in your Ansible inventory file) in the directory /opt/sas/viya/home/bin.

There are two preliminary steps required to use the command-line interface: you need to create a profile and authenticate.

To create a default profile (you can also create named profiles):

sas-admin profile set-endpoint "http://myserver.demo.myco.com"
sas-admin profile set-output text

You can also simply enter the following and respond to the prompts.

sas-admin profile init

The default profile will be stored in the user’s home directory in the file <homedir>/.sas/config.json.

The output options range from text, which provides a simplified text output of the result, to fulljson, which provides the full JSON output returned by the REST call that the CLI submits. The full JSON output is useful if you’re piping the output from one command into a tool that expects JSON.

To authenticate:

sas-admin auth login --user sasadm --password ********

The authentication step creates a token in a file stored in the user’s home directory which is valid for, by default, 12 hours.  The file location is <homedir>/.sas/credentials.json.
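Scripts that call the REST services directly need to read that token back. Here is a minimal sketch, assuming the credentials file stores the value under an "access-token" key with the key and its quoted value on one line (which is what the grep in the script later in this post relies on). The function name get_token is hypothetical.

```shell
# get_token FILE: print the value stored under the "access-token" key.
# Assumes the key and its quoted value sit on the same line of FILE.
get_token() {
    if [ ! -r "$1" ]; then
        echo "credentials file not readable: $1" >&2
        return 1
    fi
    sed -n 's/.*"access-token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Usage (run "sas-admin auth login" first):
#   TOKEN=$(get_token "$HOME/.sas/credentials.json")
```

If the token has expired, log in again to refresh the file before reading it.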

The syntax of a call to the sas-admin CLI is shown below. The CLI requires an interface (plug-in) and a command.

The example shows a call to the identities interface. This command will list all the users who are members of the SAS Administrators custom group.

SAS Viya 3.3 command-line interfaces

In this execution of sas-admin:

  • the interface is identities.
  • there is a global option --output set so that the result is returned in basic text.
  • the command is list-members.
  • the command option --group-id specifies the group whose members you wish to list.
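Read together, those four parts imply a command along the following lines. This is a sketch rather than a copy of the original screenshot: the group ID value SASAdministrators is an assumption, and the snippet only prints the assembled command so that the ordering of the parts is visible.

```shell
# Assemble a sas-admin call in the order the CLI expects:
#   sas-admin [global options] <plug-in> <command> [command options]
global_opts="--output text"                   # global option: basic text output
plugin="identities"                           # the interface (plug-in)
command="list-members"                        # the command within that plug-in
command_opts="--group-id SASAdministrators"   # assumed ID of the custom group

cmdline="./sas-admin $global_opts $plugin $command $command_opts"
echo "$cmdline"
```

Global options go before the plug-in name, and command options go after the command.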

The built-in help of the CLIs is a very useful feature.

./sas-admin --help

This command provides help on the commands and interfaces (plug-ins) available, and the global options that may be used.

You can also display help on a specific interface by adding the interface name and then specifying --help.

./sas-admin authorization --help

Let’s look at an example of using the command-line interface to perform some common administrative tasks. In this example I will:

  • create a new folder that is a sub-folder of an existing folder.
  • create a rule to set authorization on a folder.
  • create and secure a caslib.

Many of the folders commands require the ID of a folder as an argument. The ID of the folder is displayed when you create the folder, when you list folders using the CLI, and in SAS Environment Manager.

To return a folder ID based on its path, you can use a REST call to the /folders/folders endpoint. The JSON that is returned can be parsed to retrieve the ID. The folder ID can then be used in subsequent calls to the CLI. The REST API call below requests the ID of the /gelcontent folder.

curl -X GET "http://myserver.demo.myco.com/folders/folders/@item?path=/gelcontent" -H "Authorization: bearer $TOKEN" | python -mjson.tool

It returns the following JSON (partial):

{
    "creationTimeStamp": "2017-11-17T15:20:28.563Z",
    "modifiedTimeStamp": "2017-11-20T23:03:19.939Z",
    "createdBy": "sasadm",
    "modifiedBy": "sasadm",
    "id": "e928249c-7a5e-4556-8e2b-7be8b1950b88",
    "name": "gelcontent",
    "type": "folder",
    "memberCount": 2,
    "iconUri": "/folders/static/icon",
    "links": [
        {
            "method": "GET",
            "rel": "self",

NOTE: the authentication token ($TOKEN) in the REST call is read from the credentials.json file created when the user authenticated via sas-admin auth login. To see how this is done, check out the script at the end of this post.
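One way to pull the folder ID out of that response without a JSON library is sketched below. It assumes the pretty-printed layout produced by json.tool, with one key per line and the folder’s own "id" appearing before any nested one. The helper name first_id is hypothetical.

```shell
# first_id: read pretty-printed JSON on stdin and print the first "id" value.
# Splitting each line on double quotes makes field 2 the key and field 4 the value.
first_id() {
    awk -F'"' '$2 == "id" { print $4; exit }'
}

# Sample input taken from the partial response shown earlier.
printf '%s\n' \
  '{' \
  '    "createdBy": "sasadm",' \
  '    "id": "e928249c-7a5e-4556-8e2b-7be8b1950b88",' \
  '    "name": "gelcontent"' \
  '}' | first_id
# prints e928249c-7a5e-4556-8e2b-7be8b1950b88
```

In practice the curl call would be piped through python -mjson.tool and then into first_id.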

The next step is to create a folder that is a sub-folder of the /gelcontent folder. The ID of the parent folder and the name of the new folder are passed to the create command of the folders interface.

./sas-admin --output json folders create --description "Orion Star" --name "Orion" --parent-id e928249c-7a5e-4556-8e2b-7be8b1950b88

Next, using the ID of the new folder returned by the previous step, set authorization on the folder. In this call to the authorization interface I will grant full control to the group gelcorpadmins on the new folder and its content.

./sas-admin authorization create-rule grant --permissions read,create,update,delete,add,remove,secure --group gelcorpadmins --object-uri /folders/folders/49b7ba6a-0b2d-4e32-b9b9-2536d84cfdbe/** --container-uri /folders/folders/49b7ba6a-0b2d-4e32-b9b9-2536d84cfdbe

Now in Environment Manager, check that the folder has been created and check the authorization settings. The authorization setting on the folder shows that a new rule has been created and applied providing explicit full access to gelcorpadmins (whose user-friendly name is “GELCorp Admins”).

The next task we might perform is to add a caslib and set authorization on it. We can do that with the following calls to the cas interface.

./sas-admin cas caslibs create path --name ordata --path /tmp/orion --server cas-shared-default
./sas-admin cas caslibs add-control --server cas-shared-default --caslib ordata --group gelcorpadmins --grant ReadInfo
./sas-admin cas caslibs add-control --server cas-shared-default --caslib ordata --group gelcorpadmins --grant Select
./sas-admin cas caslibs add-control --server cas-shared-default --caslib ordata --group gelcorpadmins --grant LimitedPromote
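
Because the three add-control calls differ only in the permission granted, they can be generated in a loop. The sketch below only prints the commands (a dry run) and reuses the server, caslib, and group names from the calls above. The helper name grant_cmds is hypothetical.

```shell
# grant_cmds SERVER CASLIB GROUP PERMISSION...
# Print one add-control command per permission; nothing is executed here.
grant_cmds() {
    server=$1; caslib=$2; group=$3
    shift 3
    for perm in "$@"; do
        echo "./sas-admin cas caslibs add-control --server $server --caslib $caslib --group $group --grant $perm"
    done
}

grant_cmds cas-shared-default ordata gelcorpadmins ReadInfo Select LimitedPromote
```

Once the printed lines look right, they can be piped to sh from the CLI directory to run them.
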

The following script combines the folder lookup, folder creation, and authorization steps:

#!/bin/bash
# Location of the CLI and the SAS Viya endpoint
clidir=/opt/sas/viya/home/bin
endpoint=http://sasserver.demo.sas.com
# Read the access token written by "sas-admin auth login"
export TOKEN=$(grep access-token ~/.sas/credentials.json | cut -d':' -f2 | sed 's/[{}", ]//g')
# Get the gelcontent folder ID
curl -X GET "$endpoint/folders/folders/@item?path=/gelcontent" -H "Authorization: bearer $TOKEN" | python -mjson.tool > /tmp/newfolder.txt
id=$(grep '"id":' /tmp/newfolder.txt | head -n 1 | cut -d':' -f2 | sed 's/[{}", ]//g')
echo "The folder ID is $id"
# Create the Orion folder
$clidir/sas-admin --output text folders create --name Orion --parent-id "$id" > /tmp/folderid.txt
orionid=$(grep "Id " /tmp/folderid.txt | tr -s ' ' | cut -f2 -d " ")
echo "The Orion folder ID is $orionid"
# Set permissions: full control for gelcorpadmins, read for gelcorp
$clidir/sas-admin authorization create-rule grant --permissions read,create,update,delete,add,remove,secure --group gelcorpadmins --object-uri "/folders/folders/$orionid/**" --container-uri "/folders/folders/$orionid"
$clidir/sas-admin authorization create-rule grant --permissions read --group gelcorp --object-uri "/folders/folders/$orionid"

The SAS Viya command-line interfaces are a very valuable addition to the administrator’s toolbox. There is much more that can be done with the CLIs than we can cover in this article. For more information and details of the available interfaces, please check out the SAS Viya 3.3 administration documentation.

SAS Viya 3.3 command-line interfaces for Administration was published on SAS Users.
