12月 222017
 

When I was growing up, our family had a bookcase containing a set of encyclopedias - it was where I went to obtain information and facts about various things, to impress my friends. Now that we have the Internet, Wikipedia has taken the place of encyclopedias for me - and [...]

The post What Wikipedia pages are the most popular? appeared first on SAS Learning Post.

12月 222017
 
In keras, we can visualize activation functions' geometric properties using backend functions over layers of a model.

We all know the exact function of popular activation functions such as 'sigmoid', 'tanh', 'relu', etc, and we can feed data to these functions to directly obtain their output. But how to do that via keras without explicitly specifying their functional forms?

This can be done following the four steps below:

1. define a simple MLP model with a one dimension input data, a one neuron dense network as the hidden layer, and the output layer will have a 'linear' activation function for one neuron.
2. Extract layers' output of the model (fitted or not) via iterating through model.layers
3. Using backend function K.function() to obtain calculated output for a given input data
4. Feed desired data to the above functions to obtain the output from appropriate activation function.

The code below is a demo:




from keras.layers import Dense, Activation
from keras.models import Sequential
import keras.backend as K
import numpy as np
import matplotlib.pyplot as plt



# 以下设置显示中文文方法根据 http://blog.csdn.net/rumswell/article/details/6544377
plt.rcParams['font.sans-serif'] = ['SimHei'] #指定默认字体
plt.rcParams['axes.unicode_minus'] = False #解决图像中中文符号显示为方块的问题

def NNmodel(activationFunc='linear'):
'''
定义一个神经网络模型。如果要定义不同的模型,可以直接修改该函数
'''
if (activationFunc=='softplus') | (activationFunc=='sigmoid'):
winit='lecun_uniform'
elif activationFunc=='hard_sigmoid':
winit='lecun_normal'
else:
winit='he_uniform'
model = Sequential()
model.add(Dense(1, input_shape=(1,), activation=activationFunc,
kernel_initializer=winit,
name='Hidden'))

model.add(Dense(1, activation='linear', name='Output'))
model.compile(loss='mse', optimizer='sgd')
return model

def VisualActivation(activationFunc='relu', plot=True):
x = (np.arange(100)-50)/10
y = np.log(x+x.max()+1)

model = NNmodel(activationFunc = activationFunc)

inX = model.input
outputs = [layer.output for layer in model.layers if layer.name=='Hidden']
functions = [K.function([inX], [out]) for out in outputs]

layer_outs = [func([x.reshape(-1, 1)]) for func in functions]
activationLayer = layer_outs[0][0]

activationDf = pd.DataFrame(activationLayer)
result=pd.concat([pd.DataFrame(x), activationDf], axis=1)
result.columns=['X', 'Activated']
result.set_index('X', inplace=True)
if plot:
result.plot(title=f)

return result


# Now we can visualize them (assuming default settings) :
actFuncs = ['linear', 'softmax', 'sigmoid', 'tanh', 'softsign', 'hard_sigmoid', 'softplus', 'selu', 'elu']

from keras.layers import LeakyReLU
figure = plt.figure()
for i, f in enumerate(actFuncs):
# 依次画图
figure.add_subplot(3, 3, i+1)
out=VisualActivation(activationFunc=f, plot=False)
plt.plot(out.index, out.Activated)
plt.title(u'激活函数:'+f)

This figure is the output from above code. As we can see, the geometric property of each activation function is well captured.

 Posted by at 4:44 下午
12月 222017
 

Another report requirement came my way and I wanted to share how to use our Visual Analytics’ out-of-the-box relative period calculations to solve it.

Essentially, we had a customer who wanted to see a metric for every month, the previous month’s value next to it, and lastly the difference between the two.

Relative Period Report in SAS Visual Analytics

To do this in SAS Visual Analytics, which is available in versions 7.3 and above, use the relative periodic operators. I am going to use the Mega_Corp data which has a date data item called Date by Month using the format: MMMYYYY. SAS Visual Analytics supports relative period calculations for month, quarter and year.
The first two columns, circled in red, are straight from the data. The metric we are interested in for this report is Profit.

Next, we will create the last column, Profit (Difference from Previous Period), which is an aggregated measure that uses the periodic operators.

From the Data pane, select the metric used in the list table, Profit. Then right-click on Profit and navigate the menus: Create / Difference from Previous Period / Using: Date by Month.

A new aggregated measure will be created for you:

If you right-click on the aggregated measure and select Edit Aggregated Measure…, you will see this relative period calculation, where it is taking the current period (notice the 0) minus the value for the previous period (notice the -1).

Okay – that’s it. This out-of-the-box relative period calculation is ready to be added to the list table. Notice the other Period Operators available in the list. These support SAS Visual Analytics’ additional out-of-the-box aggregated measure calculations such as the Difference between Parallel Periods, Year to Date cumulative calculations, etc.

Now we have to create the final column to meet our report requirement: the Previous Period column.

To do this we are going to leverage the out-of-the-box functionality of the relative period calculation. Since this aggregated measure calculates the previous period for the subtraction – let’s use this to our advantage.

Duplicate the out-of-the-box relative period calculation by right-clicking on Profit (Difference from Previous Period) and select Duplicate Data Item.

Then right-click on the new data item, and select Edit Aggregated Measure….

Now delete everything highlighted in yellow below, remember to also delete the minus sign. And give the data item a new name. Click OK. This will create an aggregated measure that will calculate the previous period.

The final result should look like this from either the Visual tab or Text tab:

Now we have all the columns to meet our report requirement:

Now that I’ve piqued your interest, I’m sure you are wondering if you could use this technique to create aggregated data items to represent the Period -1, -2, -3 offset? YES! This is absolutely possible.
Also, I went ahead and plotted the Difference from Previous Period on a line chart. This is an extremely useful visualization to gage if the variance between periods is acceptable. You can easily assign display rules to this visualization to flag any periods that may need further investigation.

Relative Period Report in SAS Visual Analytics was published on SAS Users.

12月 212017
 

With SAS Data Management, you can set up SAS Data Remediation to manage and correct data issues. SAS Data Remediation allows user- or role-based access to data exceptions.

When a data issue is discovered it can be sent automatically or manually to a remediation queue where it can be corrected by designated users. The issue can be fixed from within SAS Remediation without the need of going to the affected source system. For more efficiency, the remediation process can also be linked to a purpose designed workflow.

It involves a few steps to set up a remediation process that allows you to correct data issues from within SAS Remediation:

  • Set up Data Management job to retrieve data and correct data in remediation.
  • Set up a Workflow to control the remediation process.
  • Register the remediation service.

Set up Data Management job to retrieve and correct data in remediation

To correct data issues from within Data Remediation we need two real-time data management jobs to retrieve and send data. The retrieve job will read the record in question to populate its data in the remediation UI and a send job to write the corrected data back to the data source or a staging area first.

Retrieve and sent job

If the following remediation fields are available in the retrieve or send job’s External Data Provider node, data will be passed to the fields. The field values can be used to identify and work with the correct record:

REM_KEY (internal field to store issues record id)
REM_USERNAME (the current remediation user)
1.  REM_ITEM_NAME
2.  REM_ISSUE_NAME
3.  REM_APPLICATION
4.  REM_SUBJECT_AREA
5.  REM_PACKAGE_NAME
6.  REM_STATUS
7.  REM_ASSIGNEE

Retrieve Action

The "retrieve” action occurs when the issue is opened in SAS Data Remediation. Data Remediation will only pass REM_ values to the data management job if the fields are present in the External Data Provider node. Although the REM_ values are the only way the data management job can communicate with SAS Data Remediation but they are not all required, meaning you can just call the fields in the External Data Provider node you need.

The job’s output fields will be displayed in the Remediation UI as edit fields to correct the issue record. it's best to use a Field Layout node as the last job node to pass out the wanted fields with desired labels.

Note: The retrieve job should only return one record.

A simple example of a retrieve job would be to have the issue record id coming from REM_KEY into the data management job to select the record from the source system.

Send Action

The “send” action occurs when pressing the “Commit Changes” button in the Data Remediation UI. All REM_ values in addition to the output fields of the retrieve job (the issue record edit fields) are passed to the send job. The job will receive values for those fields present in the External Data Provider node.

The send job can now work with the remediation record and save it to a staging area or submit it to the data source directly.

Note: Only one row will be sent as an input to the send job. Any data values returned by the send job will be ignored by Data Remediation.

Move jobs the Data Management Server

When both jobs are written, and tested you need to move them to Data Management Server into a Real-Time Data Services sub-directory for Data Remediation to call them.

When Data Remediation is calling the jobs, it will use the user credentials for the person logged on to Data Remediation. Therefore, you need to make sure that the jobs on Data Management Server have been granted the right permission.

Set up a Workflow to control the remediation process

Although you don’t need to involve a workflow in SAS Data Remediation but to improve efficiency it might be a good using one.

You can design your own workflow using SAS Workflow Studio or you can use a prepared workflow already coming with Data Remediation. You need to make sure that the desired workflow is loaded on to Workflow Server to link it to the Data Remediation Service.

Using SAS Workflow will help you to better control Data Remediation issues.

Register the remediation service

We can now register our remediation service in SAS Data Remediation. Therefore, we go to Data Remediation Administrator “Add New Client Application.”

Under Properties we supply an ID, which can be the name of the remediation service as long as it is unique, and a Display name, which is the name showing in the Remediation UI.

Next we set up the edit UI for the issue record. Under Issue User Interface we go to: User default remediation UI…. Using Data Management Server:

The Server address is the fully qualified address for Data Management Server including the port it is listening on. For example: http://dmserver.company.com:21036.

The Real-time service to retrieve item attributes and Real-time service to send item attributes needs to point to the retrieve/send job respectively on Data Management Server, including the job suffix .ddf as well as any directories under Real-Time Data Services where the jobs are stored.

 

 

 

 

 

 

Under the tab Subject Area, we can register different subject categories for this remediation service.  When calling the remediation service we can categorize different remediation issues by setting different subject areas.

Under the tab Issues Types, we can register issue categories. This enables us to categorize the different remediation issues.

At Task Templates/Select Templates you can set the workflow to be used for a particular issue type.

By saving the remediation service you will be able to use it. You can now assign data issues to the remediation service to efficiently correct the data and improve your data quality from within SAS Data Remediation.

Manage remediation issues using SAS Data Management was published on SAS Users.

12月 202017
 

US health care costs are the highest in the developed world, despite per-capita healthcare spending roughly double that of the others, but with outcomes worse-or-equal-to comparable countries. A 2014 Commonwealth Fund study ranked the United States 11th out of 11 major industrial nations in overall healthcare. An aging population is [...]

Using data to discover what’s driving hospital readmissions was published on SAS Voices by Leo Sadovy

12月 202017
 

I previously showed an easy way to visualize a regression model that has several continuous explanatory variables: use the SLICEFIT option in the EFFECTPLOT statement in SAS to create a sliced fit plot. The EFFECTPLOT statement is directly supported by the syntax of the GENMOD, LOGISTIC, and ORTHOREG procedures in SAS/STAT. If you are using another SAS regression procedure, you can still visualize multivariate regression models:

  • If a procedure supports the STORE statement, you can save the model to an item store and then use the EFFECTPLOT statement in PROC PLM to create a sliced fit plot.
  • If a procedure does not support the STORE statement, you can manually create the "slice" of observations and score the model on the slice.

Use PROC PLM to score regression models

Most parametric regression procedures in SAS (GLM, GLIMMIX, MIXED, ...) support the STORE statement, which enables you to save a representation of the model in a SAS item store. The following program creates sample data for 500 patients in a medical study. The call to PROC GLM fits a linear regression model that predicts the level of cholesterol from five explanatory variables. The STORE statement saves the model to an item store named 'GLMModel'. The call to PROC PLM creates a sliced fit plot that shows the predicted values versus the systolic blood pressure for males and females in the study. The explanatory variables that are not shown in the plot are set to reference values by using the AT option in the EFFECTPLOT statement:

data Heart;    /* create example data */
set sashelp.heart(obs=500);
where cholesterol < 400;
run;
 
proc glm data=Heart;
   class Sex Smoking_Status BP_Status;
   model Cholesterol = Sex      Smoking_Status BP_Status  /* class vars  */
                       Systolic Weight;                   /* contin vars */
   store GLMModel;                    /* save the model to an item store */
run;
 
proc plm restore=GLMModel;                       /* load the saved model */
   effectplot slicefit / at(Smoking_Status='Non-smoker' BP_Status='Normal'
                            Weight=150);   /* create the sliced fit plot */
run;
Visualize multivariate regression models: Sliced fit plot by using PROC PLM in SAS

The graph shows a sliced fit plot. The footnote states that the lines obtained by slicing through two response surfaces that correspond to (Smoking_Status, BP_Status) = ('Non-smoker', 'Normal') at the value Weight = 150. As shown in the previous article, you can specify multiple values within the AT option to obtain a panel of sliced fit plots.

Create a sliced fit plot manually by using the SCORE statement

The nonparametric regression procedures in SAS (ADAPTIVEREG, GAMPL, LOESS, ...) do not support the STORE statement. Nevertheless, you can create a sliced fit plot using a traditional scoring technique: use the DATA step to create observations in the plane of the slice and score the model on those observations.

There are two ways to score regression models in SAS. The easiest way is to use PROC SCORE, the SCORE statement, or the CODE statement. The following DATA step creates the same "slice" through the space of explanatory variables as was created by using the EFFECTPLOT statement in the previous example. The SCORE statement in the ADAPTIVEREG procedure then fits the model and scores it on the slice. (Technical note: By default, PROC ADAPTIVEREG uses variable selection techniques. For easier comparison with the model from PROC GLM, I used the KEEP= option on the MODEL statement to force the procedure to keep all variables in the model.)

/* create the scoring observations that define the slice */
data Score;
length Sex $6 Smoking_Status $17 BP_Status $7; /* same as for data */
Cholesterol = .;             /* set response variable to missing   */
Smoking_Status='Non-smoker'; /* set reference levels ("slices")    */
BP_Status='Normal';          /*     for class vars                 */
Weight=150;                  /*     and continuous covariates      */
do Sex = "Female", "Male";       /* primary class var */
   do Systolic = 98 to 272 by 2; /* evenly spaced points for X variable */
      output;
   end;
end;
run;
 
proc adaptivereg data=Heart;
   class Sex Smoking_Status BP_Status;
   model Cholesterol = Sex      Smoking_Status BP_Status
                       Systolic Weight / nomiss 
   /* for comparison with other models, FORCE all variables to be selected */
                       keep=(Sex Smoking_Status BP_Status Systolic Weight);
   score data=Score out=ScoreOut Pred;   /* score the model on the slice */
run;
 
proc sgplot data=ScoreOut;             
   series x=Systolic y=Pred / group=Sex;  /* create sliced fit plot */
   xaxis grid; yaxis grid;
run;

The output, which is not shown, is very similar to the graph in the previous section.

Create a sliced fit plot manually by using the missing value trick

If your regression procedure does not support a SCORE statement, an alternative way to score a model is to use "the missing value trick," which requires appending the scoring data set to the end of the original data. I like to add an indicator variable to make it easier to know which observations are data and which are for scoring. The following statements concatenate the original data and the observations in the slice. It then calls the GAMPL procedure to fit a generalized additive model (GAM) by using penalized likelihood (PL) estimation.

/* missing value trick: append score data to original data */
data All;
set Heart         /* data to fit the model */
    Score(in=s);  /* grid of values on which to score model */
ScoreData=s;      /* SCoreData=0 for orig data; =1 for scoring observations */
run;
 
proc gampl data=All;
   class Sex Smoking_Status BP_Status;
   model Cholesterol = Param(Sex Smoking_Status BP_Status)
                       Spline(Systolic Weight);
   output out=GamOut pred;
   id ScoreData Sex Systolic; /* include these vars in output data set */
run;
 
proc sgplot data=GamOut(where=(ScoreData=1)); /* plot only the scoring obs */
   series x=Systolic y=Pred / group=Sex;  /* create sliced fit plot */
   xaxis grid; yaxis grid;
run;
Visualize multivariate regression models: create sliced fit plot in SAS by using the missing value trick

The GAMPL procedure does not automatically include all input variables in the output data set; the ID statement specifies the variables that you want to output. The OUTPUT statement produces predicted values for all observations in the ALL data set, but the call to PROC SGPLOT creates the sliced plot by using only the observations for which ScoreData = 1. The output shows the nonparametric regression model from PROC GAMPL.

You can also use the ALL data set to overlay the original data and the sliced fit plot. The details are left as an exercise for the reader.

Summary

The EFFECTPLOT statement provides an easy way to create a sliced fit plot. You can use the EFFECTPLOT statement directly in some regression procedures (such as LOGISTIC and GENMOD) or by using the STORE statement to save the model and PROC PLM to display the graph. For procedures that do not support the STORE statement, you can use the DATA step to create "the slice" (as a scoring data set) and use traditional scoring techniques to evaluate the model on the slice.

The post How to create a sliced fit plot in SAS appeared first on The DO Loop.

12月 202017
 

Running SAS programs in batchWhile SAS program development is usually done in an interactive SAS environment (SAS Enterprise Guide, SAS Display Manager, SAS Studio, etc.), when it comes to running SAS programs in a production or operations environment, it is routinely done in batch mode.

Why run SAS programs in batch mode?

First and foremost, this is done for automation, as the batch process does not require human participation at the time of run. It can be scheduled to run (using Operating System scheduler or other scheduling software) while we sleep, at any time of the day or at any time interval between two consecutive runs.

Running SAS programs in batch mode allows streamlining SAS processing by eliminating the possibility of human error, submitting multiple SAS jobs (programs) all at once or in a sequence securing programs and/or data dependencies.

SAS batch processing also takes care of self-documenting, as it automatically generates and stores SAS logs and outputs.

Imagine the following scenario. Every night, a SAS batch process “wakes up” at 3 a.m. and runs an ETL process on a SAS Application server that extracts multiple tables from a database, transforms, combines, and loads them into a SAS datamart; then moves some data tables across the network and loads them into SAS LASR server, so when you are back to work in the morning your SAS Visual Analytics application has all its data refreshed and ready to roll. Of course, the process schedule can be custom-tailored to your particular needs; your batch jobs may run every 15 minutes, once a week, every first Friday of the month – you name it.

What is a batch script file?

To submit a single SAS program in batch mode manually, you could submit an OS command that looks something like the following:

Unix/Linux

sas /sas/code/proj1/job1.sas -log /sas/code/proj1/job1.log

DOS/Windows

"C:\Program Files\SASHome\SASFoundation\9.4\Sas.exe" -SYSIN c:\proj1\job1.sas -NOSPLASH -ICON -LOG c:\proj1\job1.log

However, submitting an OS command manually has too many drawbacks: it’s too much typing, it only submits one SAS program at a time, and most importantly – it is manual, which means it is prone to human error.

Usually, these OS commands are packaged into so called batch files (shell scripts in Unix) that allow for sequential, parallel, as well as conditional execution of multiple OS line commands. They can be run either manually, or automatically – on schedule, or called by other batch scripts.

In a Windows/DOS Operating System, these script files are called batch files and have .bat filename extensions. In Unix-like operating systems, such as Linux, these script files are called shell scripts and have .sh filename extensions.

Since Windows batch files are similar, but slightly different from the Unix (and its open source cousin Linux) shell scripts, in the below examples we are going to use Unix/Linux shell scripts only, in order to avoid any confusion. And we are going to use terms Unix and Linux interchangeably.

Here is the typical content of a Linux shell script file to run a single SAS program:

#!/usr/bin/sh
dtstamp=$(date +%Y.%m.%d_%H.%M.%S)
pgmname="/sas/code/project1/program1.sas"
logname="/sas/code/project1/program1_$dtstamp.log"
/sas/SASHome/SASFoundation/9.4/sas $pgmname -log $logname

Note, that the shell script syntax allows for some basic programming features like current datetime function, formatting, and variables. It also provides some conditional processing similar to “if-then-else” logic. For detailed information on the shell scripting language you may refer to the following BASH shell script tutorial or any other source of many dialects or flavors of the shell scripting (C Shell, Korn Shell, etc.)

Let’s save the above shell script as the following file:
/sas/code/project1/program1.sh

How to submit a SAS program via Unix script

In order to run this shell script we would submit the following Linux command:
/sas/code/project1/program1.sh

Or, if we navigate to the directory first:
cd /sas/code/project1

then we can submit an abbreviated Linux command
./program1.sh
When run, this shell script not only executes a SAS program (program1.sas), but for every run it also creates and saves a uniquely named SAS Log file. You may create the SAS log file in the same directory where the SAS code is stored, as specified in the script shell above, or specify another directory of your choice.

For example, it creates the following SAS log file:
/sas/code/project1/program1_2017.12.06_09.15.20.log

The file name uniqueness is achieved by adding a date/time stamp suffix between the SAS program name and .log file name extension, in this particular case indicating that this SAS log file was created on December 6, 2017, at 09:15:20 (hours:minutes:seconds).

Unix script for submitting multiple SAS programs

Unix scripts may contain not only OS commands, but also other Unix script calls. You can mix-and-match OS commands and other script calls.

When scripts are created for each individual SAS program that you intend to run in a batch, you can easily combine them into a program flow by creating a flow script containing those single program scripts. For example, let’s create a script file /sas/code/project1/flow1.sh with the following contents:

/sas/code/project1/program1.sh
/sas/code/project1/program2.sh
/sas/code/project1/program3.sh

When submitted as

/sas/code/project1/flow1.sh

it will sequentially execute three scripts - program1.sh, program2.sh, and program3.sh, each of which will execute the corresponding SAS program - program1.sas, program2.sas, and program3.sas, and produce three SAS logs - program1.log, program2.log, and program3.log.

Unix script file permissions

In order to be executable, UNIX script files must have certain permissions. If you create the script file and want to execute it yourself only, the file permissions can be as follows:

-rwxr-----, or 740 in octal representation.

This means that you (the Owner of the script file) have Read (r), Write (w) and Execute (x) permission as indicated by the green highlighting; Group owning the script file has only Read (r) permission as indicated by yellow highlighting;  Others have no permissions to the script file at all as indicated by red highlighting.

If you want to give yourself (Owner) and Group execution permissions then your script file permissions can be as:

-rwxr-x---, or 750 in octal representation.

In this case, your group has Read (r) and Execute (x) permissions as highlighted in yellow.

In Unix, file permissions are assigned using the chmod Unix command.

Note, that in both examples above we do not give Others any permissions at all. Remember that file permissions are a security feature, and you should assign them at the minimum level necessary.

Conditional execution of scripts and SAS programs

Here is an example of a Unix script file that allows running multiple SAS programs and OS commands at different times.

#!/bin/sh

#1 extract data from a database
/sas/code/etl/etl.sh

&gt;#2 copy data to the Visual Analytics autoload directory
scp -B userid@sasAPPservername:/sas/data/*.sas7bdat userid@sasVAservername:/sas/config/.../AutoLoad

#3 run weekly, every Monday
dow=$(date +%w)
if [ $dow -eq 1 ]
then
   /sas/code/alerts_generation.sh
fi

#4 run monthly, first Friday of every month
dom=$(date +%d)
if [ $dow -eq 5 -a $dom -le 7 ]
then
   /sas/code/update_history.sh
   /sas/code/update_transactions.sh
fi

In this script, the following logical operators are used: -eq (equal), -le (less or equal), -a (logical and).

As you can see, the script logic takes care of branching to execute different SAS programs when certain timing conditions are met. With such an approach, you would need to schedule only this single script to run at a specified time/interval, say daily at 3 a.m.

In this case, the script will “wake up” every morning at 3 a.m. and execute its component scripts either unconditionally, or conditionally.

If one of the included programs needs to run at a different, lesser frequency (e.g. every Monday, or monthly on first Friday of every month) the script logic will trigger those executions at the appropriate times.

In the above script example steps #1 and #2 will execute every time (unconditionally) the script runs (daily). Step #1 runs ETL program to extract data from a database, step #2 copies the extracted data across the network from SAS Application server to the SAS LASR Analytic server’s drop zone from where they are automatically loaded (autoloaded) into the LASR.

Step #3 will run conditionally every Monday ( $dow -eq 1). Step #4 will run conditionally every first Friday of a month ($dow -eq 5 -a $dom -le 7).

For more information on how to format date for use in shell scripts please refer to this post.

Do you run your SAS programs in batch?

Please share your batch experiences in the comment section below. I am sure the rest of us will really appreciate it!

Running SAS programs in batch under Unix/Linux was published on SAS Users.

12月 192017
 

SAS' tag line is The Power to Know©, But what makes SAS so powerful? Ask our users and they'll tell you -- it's because SAS allows them to answer questions which previously could not be answered. How does SAS do this? SAS built a 4th generation programming language which makes [...]

What makes SAS so powerful was published on SAS Voices by David Pope

12月 192017
 

In my last article, Managing SAS Configuration Directory Security, we stepped through the process for granting specific users more access without opening up access to everyone. One example addressed how to modify security for autoload. There are several other aspects of SAS Visual Analytics that can benefit from a similar security model.

You can maintain a secure environment while still providing one or more select users the ability to:

  • start and stop a SAS LASR Analytic Server.
  • load data to a SAS LASR Analytic Server.
  • import data to a SAS LASR Analytic Server.

Requirements for these types of users fall into two areas: metadata and operating system.

The metadata requirements are very well documented and include:

  • an individual metadata identity.
  • membership in appropriate groups (for example: Visual Analytics Data Administrators for SAS Visual Analytics suite level administration; Visual Data Builder Administrators for data preparation tasks; SAS Administrators for platform level administration).
  • access to certain metadata (refer to the SAS Visual Analytics 7.3: Administration Guide for metadata permission requirements).

Operating System Requirements

Users who need to import data, load data, or start a SAS LASR Analytic Server need the ability to authenticate to the SAS LASR Analytic Server host and write access to some specific locations.

If the SAS LASR Analytic Server is distributed users need:

If the compute tier (the machine where the SAS Workspace Server runs) is on Windows, users need the Log on as a batch job user right on the compute machine.

In addition, users need write access to the signature files directory, the path for the last action logs for the SAS LASR Analytic Server, and the PIDs directory in the monitoring path for the SAS LASR Analytic Server.

Signature Files

There are two types of signature files: server signature files and table signature files. Server signature files are created when a SAS LASR Analytic Server is started. Table signature files are created when a table is loaded into memory. The location of the signature files for a specific SAS LASR Analytic Server can be found on the Advanced properties of the SAS LASR Analytic Server in SAS Management Console.

SAS Configuration Directory Security for SAS Visual Analytics

On Linux, if your signature files are in /tmp you may want to consider relocating them to a different location.

Last Action Logs and the Monitoring Path

In the SAS Visual Analytics Administrator application, logs of interactive actions for a SAS LASR Analytic Server are written to the designated last action log path. The standard location is on the middle tier host in <SAS_CONFIG_ROOT>/Lev1/Applications/SASVisualAnalytics/VisualAnalyticsAdministrator/Monitoring/Logs. The va.lastActionLogPath property is specified in the SAS Visual Analytics suite level properties. You can access the SAS Visual Analytics suite level properties in SAS Management Console under the Configuration Manager: expand SAS Applicaiton Infrastructure, right-click on Visual Analytics 7.3 to open the properties and select the Advanced tab.

The va.monitoringPath property specifies the location of certain monitoring process ID files and logs. The standard location is on the compute tier in <SAS_CONFIG_ROOT>/Lev1/Applications/SASVisualAnalytics/VisualAnalyticsAdministrator/Monitoring/. This location includes two subdirectories: Logs and PIDs. You can override the default monitoring path by adding the va.monitoringPath extended attribute to the SAS LASR Analytic Server properties.

Host Account and Group

For activities like starting the SAS LASR Analytic Server you might want to use a dedicated account such as lasradm or assign the access to existing users. If you opt to create the lasradm account, you will need to also create the related metadata identity.

For group level security on Linux, it is recommended that you create a new group, for example sasusers, to reserve the broader access provided by the sas group to only platform level administrators. Be sure to include in the membership of this sasusers group any users who need to start the SAS LASR Analytic Server or that need to load or import data to the SAS LASR Analytic Server.

Since the last action log path, the monitoring path, and the autoload scripts location all fall under <SAS_CONFIG_ROOT>/Lev1/Applications/SASVisualAnalytics/VisualAnalyticsAdministrator, you can modify the ownership of this folder to get the right access pattern.

A similar pattern can also be applied to the back-end store location for the data provider library that supports reload-on-start.

Don’t forget to change the ownership of your signature files location too!

SAS Admin Notebook: Managing SAS Configuration Directory Security for SAS Visual Analytics was published on SAS Users.