December 12, 2016
 

Historically, before data was managed, it was moved to a central location. For a long time that central location was the staging area for an enterprise data warehouse (EDW). While EDWs and their staging areas are still in use – especially for structured, transactional and internally generated data – big […]

The post Managing data where it lives appeared first on The Data Roundtable.

December 12, 2016
 

In a previous blog about SAS Studio, I briefly introduced the concept of using the Webwork library instead of the default Work library. I also suggested, in my SAS Global Forum 2016 paper, Deep Dive with SAS Studio into SAS Grid Manager 9.4, saving intermediate results in the Webwork library, because this special library is automatically assigned at start-up and is shared across all workspace server sessions. Recently, I have received some requests to expand on the properties of this library and how it is shared across different sessions. What better way to share this information than to write it up in a blog?

As always, I’d like to start with a reference to the official documentation. The SAS(R) Studio 3.5: User’s Guide describes the Webwork library, along with its differences from the Work library, in the section about interactive mode. The main points are:

  • Webwork is the default output library in interactive mode. If you refer to a table by name only, without specifying a libref, SAS Studio assumes it is stored in the Webwork library.
  • The Webwork library is shared between interactive mode and non-interactive mode. Any data that you create in the Webwork library in one mode can be accessed in the other mode (see the sketch after this list).
  • The Work library is not shared between interactive mode and non-interactive mode. Each workspace server session has its own separate Work library, and data cannot be shared between them.
  • Any data that you save to the Work library in interactive mode cannot be accessed from the Work library in non-interactive mode. Also, you cannot view data in the Work library from the Libraries section of the navigation pane if the data was created in interactive mode.
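
To make the first two points concrete, here is a minimal sketch (the table name is made up, and sashelp.class is just example data) that stores an intermediate table in the Webwork library. Because Webwork is shared, code submitted in the other mode can read it, and in interactive mode you could even omit the libref, since Webwork is the default output library there:

/* store an intermediate table in the shared Webwork library */
data webwork.intermediate_results;
   set sashelp.class;
run;

/* any other workspace server session of this SAS Studio session can read it */
proc print data=webwork.intermediate_results;
run;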

In addition to what the documentation says, here are some further considerations:

  • The Webwork library is shared between every workspace server session started when using parallel process flows from the Visual Programming perspective.
  • The Webwork library is not shared between different SAS Studio sessions. When using multiple SAS Studio sessions, each one has a different Webwork, just like traditional SAS Foundation sessions do not share their Work libraries.
  • The Webwork library is cleared at the end of the SAS Studio session and its content is temporary in nature, just like the Work library.
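
If you want to check this in your own environment, a minimal sketch along these lines prints the physical path of each library (this is illustrative, and not necessarily the exact code captured in the screenshots below):

/* print the physical paths of the Work and Webwork libraries */
%put NOTE: WORK path    = %sysfunc(pathname(work));
%put NOTE: WEBWORK path = %sysfunc(pathname(webwork));

Submit it in non-interactive mode, then in interactive mode, and again in a second SAS Studio session to see which directories are shared.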

Here are the logs of the same few lines of code executed in different SAS Studio sessions, showing the actual paths, on a Windows machine, of the Work and Webwork directories:

First SAS Studio session, non-interactive mode

[Screenshot of the SAS log]

Same session, interactive mode

[Screenshot of the SAS log]

Second SAS Studio session, non-interactive mode

[Screenshot of the SAS log]

And since a picture is worth a thousand words, the following diagram depicts the relationship between SAS Studio sessions, Work libraries and Webwork libraries.

[Diagram: SAS Studio sessions and their Work and Webwork libraries]

Finally, I’d like to remind you that, in distributed environments where the workspace server sessions are load balanced across multiple hosts, it is imperative to configure the Webwork library on a shared filesystem, following the instructions explained in the SAS Studio tips for SAS Grid Manager Administrators blog.

tags: SAS Grid Manager, SAS Professional Services, sas studio, SAS Studio Webwork library

SAS Studio Webwork library demystified was published on SAS Users.

December 12, 2016
 

Being an industry disruptor is a lonely place to be – but if you’re successful, the rewards are well worth the initial risk. Betting big on your new way of doing things takes courage, and is only the first step in a risky process. Your next critical step is to […]

4 Basics disruptive innovators must get right was published on SAS Voices.

December 10, 2016
 
Koch Snowflake in SAS

I have a fondness for fractals.

In previous articles, I've used SAS to create some of my favorite fractals, including a fractal Christmas tree and the "devil's staircase" (Cantor) function. Because winter is almost here, I think it is time to construct the Koch snowflake fractal in SAS.

A Koch snowflake is shown to the right. The Koch snowflake has a simple construction. Begin with an equilateral triangle. For every line segment, replace the segment by four shorter segments, each of which is one-third the length of the original segment. The two middle segments are offset by 60 degrees from the direction of the original segment. You then iterate this process, with each step replacing every line segment with four others.
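
A quick note on how fast this grows: each step multiplies the number of segments by 4 while dividing their length by 3, so after k steps the initial triangle has 3·4^k segments and its perimeter has grown by a factor of (4/3)^k. With the five iterations used later in this article, that is 3·4^5 = 3,072 segments.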

The Koch division process

One step in the construction of a Koch Snowflake

For simplicity, the graph at right shows the first step of the construction on a single line segment that has one endpoint at (0,0) and another endpoint at (1,0). The original horizontal line segment of length one is replaced by four line segments of length 1/3. The first and fourth segments are in the same direction as the original segment (horizontal). The second segment is rotated -60 degrees from the original direction. It, too, is 1/3 the original length. The third segment connects the second and fourth line segments.

The following program defines a function that uses the vector capabilities of SAS/IML software to carry out the fundamental step on a line segment. The function takes two row vectors as arguments, which are the (x,y) coordinates of the endpoints of the line segment. The function returns a 5 x 2 matrix, where the rows contain the endpoints of the four line segments that result from the Koch construction.

Two statements in the function might be unfamiliar. The direct product operator (@) quickly constructs points that are k/3 units along the original line segment for k=0, 1, 2, and 3. The ATAN2 function computes the direction angle for a two-dimensional vector.

proc iml;
/* divide a line segment (2 pts) into 4 segments (5 pts).  
   Create middle point by rotating through -60 degrees */ 
start KochDivide(A, E);                    /* (x,y) coords of endpoints */
   segs = j(5, 2, .);                      /* matrix to hold 4 shorter segs */
   v = (E-A) / 3;                          /* vector 1/3 as long as orig */
   segs[{1 2 4 5}, ] = A + v @ T(0:3);     /* endpoints of new segs */
   /* Now compute middle point. Use ATAN2 to find direction angle. */
   theta = -constant("pi")/3 + atan2(v[2], v[1]); /* change angle by pi/3 */
   w = cos(theta) || sin(theta);           /* vector to middle point */
   segs[3,] = segs[2,] + norm(v)*w;
   return segs;
finish;
 
/* test on line segment from (0,0) to (1,0) */
s = KochDivide({0 0}, {1 0});
title  "Fundamental Step in the Koch Snowflake Construction";
ods graphics / width=600  height=300; 
call series(s[,1], s[,2]) procopt="aspect=0.333";

Construct the Koch snowflake in SAS

Now that we have defined a function that can replace a segment by four shorter segments, let's define a function that applies that function repeatedly to each segment of a polygon. The following function takes two arguments: the polygon and the number of iterations. An N-sided closed polygon is represented as an (N+1) x 2 matrix of vertices.

/* create Koch Snowflake from an equilateral triangle */
start KochPoly(P0, iters=5);    
P = P0;
do j=1 to iters;
   N = nrow(P) - 1;             /* old number of segments */
   newP = j(4*N+1, 2);          /* new number of segments + 1 */
   do i=1 to N;                 /* for each segment... */
      idx = (4*i-3):(4*i+1);                    /* rows for 4 new segments */
      newP[idx, ] = KochDivide(P[i,], P[i+1,]); /* generate new segments */
   end;
   P = newP;                    /* update polygon and iterate */
end;
return P;
finish;

To test the function, define an equilateral triangle. The call to the KochPoly function creates a Koch snowflake from the equilateral triangle.

/* create equilateral triangle as base for snowflake */
pi = constant("pi");
angles = -pi/6 // pi/2 // 7*pi/6;  /* angles for equilateral triangle */
P = cos(angles) || sin(angles);    /* vertices of equilateral triangle */
P = P // P[1,];                    /* append first vertex to close triangle */
K = KochPoly(P);                   /* create the Koch snowflake */

The Koch snowflake that results from this iteration is shown at the top of this article. For completeness, here are the statements that write the snowflake to a SAS data set and graph it by using PROC SGPLOT:

S = j(nrow(K), 3, 1);      /* add ID variable with constant value 1 */
S[ ,1:2] = K;
create Koch from S[colname={X Y ID}];  append from S;  close;
QUIT;
 
ods graphics / width=400px height=400px;
footnote justify=Center "Koch Snowflake";
proc sgplot data=Koch;
   styleattrs wallcolor=CXD6EAF8;               /* light blue */
   polygon x=x y=y ID=ID / outline fill fillattrs=(color=white);
   xaxis display=none;
   yaxis display=none;
run;

Generalizations of the Koch snowflake

The KochPoly function does not check whether the polygon argument is an equilateral triangle. Consequently, you can use the function to "Koch-ify" any polygon! For example, pass in the rectangle with vertices P = {0 0, 2 0, 2 1, 0 1, 0 0}; to create a "Koch box," as sketched below.
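
Here is a minimal sketch, assuming the KochDivide and KochPoly modules are still defined in your PROC IML session (that is, you run these lines before the QUIT statement above):

/* "Koch-ify" a rectangle to create a "Koch box" */
P = {0 0, 2 0, 2 1, 0 1, 0 0};   /* closed rectangle: 4 segments, 5 vertices */
K = KochPoly(P, 4);              /* use 4 iterations instead of the default 5 */
call series(K[,1], K[,2]);       /* quick plot of the resulting curve */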

Enjoy playing with the program. Let me know if you generate an interesting shape!

tags: Fractal, Just for Fun

The post Create a Koch snowflake with SAS appeared first on The DO Loop.

December 10, 2016
 

During a recent customer visit, I was asked how to include a calculated variable within SAS Asset Performance Analytics’ (APA) Root Cause Analysis workflow. This is a simple request. Is there a simple approach to do this?

As a reminder, in the APA workflow an ETL Administrator makes a Data Mart available in the solution for the APA users, who can then select variables to explore, analyze and create models based on the columns present in the Data Mart.

But if you want to analyze a calculated column, like a difference between two variables, do you need to change the Data Mart? Yes, if you want the APA Data Selection to include this calculated column. But this takes time, and do you really need to change the Data Mart? No!

A simpler and faster approach to adding a calculated column is modifying the APA Root Cause Analysis workflow. And, this is simple!

SAS Asset Performance Analytics is highly configurable. You can easily customize analytical workflows by modifying their underlying stored processes. Let me show you how to customize an existing analysis and add a calculation step that enhances APA’s Root Cause Analysis with a calculated column.

Benefits

The main purpose of this customized analysis is to avoid requiring SAS Enterprise Guide. APA users are rarely SAS experts, so asking them to switch between tools depending on which functionality is available in the APA GUI isn’t recommended. The more you can do within the APA interface via wizard-guided workflows, the easier it will be.

The second benefit is keeping the Data Mart limited to crucial variables. Instead of asking an ETL Administrator to add non-validated and/or infrequently used calculated columns to the Data Mart, let the APA user test and create meaningful tags to enhance workflows as needed. Once the APA user identifies new and meaningful calculated variables, they can easily be added to the Data Mart and made available to APA Explorations and APA Stability Monitoring. Limiting the Data Mart to critical variables ensures the data size stays optimized and is adjusted only as needed.

Root Cause Analysis Use

Using this new Root Cause Analysis is very easy: instead of selecting “Root Cause Analysis,” select “Root Cause Analysis Calculated Columns” when required.


Figure 1: Stored Process Folder allows us to choose the appropriate Analysis

Clicking on “Ok” triggers the same steps as the original RCA.


Figure 2: steps triggered by the Root Cause Analysis

After specifying a data selection, the “Filter Data” step contains six new user prompts:

Figure 3: New “Filter Data” interface

These prompts allow the user to define the calculation that is needed. Six calculation choices are currently available, and they can be further customized as needed:

  1. Difference between two variables if you want to study a gap between two tags
  2. Absolute difference of two variables if large gaps are suspicious regardless of the order
  3. Ratio of two variables if you want to control the proportionality
  4. Multiplication of two variables
  5. Summation of two variables
  6. None

By default, the Calculation option is set to “None,” to perform a classical Root Cause Analysis.

Figure 4: Dropdown list with calculation details

After choosing your variables, you can apply the variable coefficients you want to test. By default, APA performs the absolute difference Variable 1 – Variable 2, with both coefficients set to 1. If the variables don’t have the same order of magnitude, you can apply coefficients to put them on the same level. The newly created calculated variable then fluctuates around 0, which makes it easy to interpret.

In the example below, the goal is to compute a simple absolute difference between Variable 2 (V.Four_Press_Clap) and Variable 1 (V.Four_Press_C). By default, the newly created column is named DIFF_TAGS. You can rename it to something more descriptive for your purposes; just don’t forget that the new column name must follow standard SAS column naming conventions.
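
Conceptually, and purely as an illustration (the data set name and the coefficients of 1 are hypothetical; this is not the shipped stored process code), the absolute-difference choice amounts to a DATA step of this form:

/* add the calculated column for the absolute-difference choice */
data work.rca_input;
   set work.rca_input;
   DIFF_TAGS = abs(1*Four_Press_Clap - 1*Four_Press_C);
run;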

Before processing the step, be careful to check that the data is not missing during the time period you've selected. If it is missing for the full time period, the DIFF_TAGS variable will not be created and the output will be the same as if you'd selected "None" for the calculation prompt.

Figure 5: DIFF_TAGS variable creation corresponding to the result of Four_Press_Clap – Four_Press_C

Click “Run and Save Step.” As a result, the newly created calculated column is added to your input RCA data and is available in the “Expand and Build Statistics” step. Now you can apply minimum and maximum value bands to the calculated column, just as in the original RCA workflow.

Figure 6: minimum and maximum values to consider as atypical

As a result, the calculated column is used like a standard input tag during the full Root Cause Analysis workflow.

Things to keep in mind

If you have a specific need, SAS Asset Performance Analytics is easily customizable. In France, for example, we offer an unlimited, modular APA interface.

Creating columns within the APA user interface analysis workflow has two main benefits:

  1. The calculation is intuitive for a non-SAS user
  2. You don’t need to change the data mart

Only two steps are required to implement Root Cause Analysis including Calculated Columns within APA:

  1. Copy the pam_rca_filterData_NEW.sas file into the SASHome/SASFoundation/9.4/pamsrmv/sasstp directory.
  2. Using the SAS Management Console, import the RCA_CC.spk file at the following location: /Products/SAS Asset Performance Analytics/Analytical Workbench in the Folders tab.

If you have any questions about this process, feel free to add a comment below.

tags: Global Technology Practice, SAS Asset Performance Analytics

Enhancing SAS Asset Performance Analytics’ Root Cause Analysis with Calculated Columns was published on SAS Users.

December 10, 2016
 

Are you ready for the supermoon on December 14? This will actually be the 3rd supermoon in 2016! With these big-looking full moons we're having this year, I got to wondering exactly how big the moon is compared to Earth. This seems like a good question to answer with some […]

The post Is the Moon really 1/4 the size of the Earth? appeared first on SAS Learning Post.

December 9, 2016
 

It's that festive time of year again, so you may want to build yourself a fire and grab a cup of hot chocolate as you prepare for a rousing round of holiday/IT joy. Grab your co-workers and gather around the water cooler while singing along to this post and others […]

Over the wires and through the cloud was published on SAS Voices.