November 9, 2018
 

In parts one and two of this blog post series, we introduced machine learning models and the complexity that comes along with their extraordinary predictive abilities. Following this, we defined interpretability within machine learning, made the case for why we need it, and where it applies. In part three of [...]

SAS Customer Intelligence 360: A look inside the black box of machine learning [Part 3] was published on Customer Intelligence Blog.

November 9, 2018
 

With the recent midterm elections here in the US, I frequently saw congressional district maps popping up in the news. And being a GraphGuy, I wanted to fully understand these maps, to see if I could improve them. If you're interested in congressional district maps, follow along as I [...]

The post Enhancing a congressional district map - one layer at a time appeared first on SAS Learning Post.

November 7, 2018
 

How different is different? Descriptive analytics is powerful. You get to see your data, and seeing is believing, as they say. But blindly believing what you see is not always the best strategy. Asking someone else if they see what you see helps, but is still subjective. Objectivity starts with [...]

Segment comparisons – Seeing is believing but measuring is knowing was published on Customer Intelligence Blog.

November 7, 2018
 

Migration, version road maps and configurations were the themes of several questions that came up in a recent webinar about combining SAS Grid Manager and SAS Viya. You’ll see in this blog post that we were ready to get into the nitty-gritty details in our answers below – just as we did in the previous FAQs post. We hope you find them useful in your work using SAS Grid Manager and SAS Viya together.

1. Can we migrate SAS programs that are currently on SAS PC environments into the SAS Grid environment – or do we need to rewrite the programs for SAS Grid Manager?

No, you don’t need to rewrite your SAS programs to run in a SAS Grid environment. Many customers migrate their programs from other environments (like PCs or servers) and submit them to SAS Grid Manager from SAS Display Manager, SAS Studio or any other application of their choice.

If you already use SAS Enterprise Guide to run jobs on a remote server, the process may be as simple as changing your server configuration to use a grid-launched workspace server (information that your SAS Administrator would provide) and continuing to work in much the same way as always, requiring no changes to your code.

Depending on other changes that take place at the same time SAS Grid Manager is implemented, you may need to make some small adjustments to your programs. For example, if your organization consolidates source data onto new storage, you may need to change paths associated with your LIBNAME statements. These should be housekeeping items rather than significant rewrites of the logic in your SAS code.

If you plan to continue to use the programming environment provided by Base SAS itself (DMS) and have been using SAS/CONNECT, you will need to add the:

  • SIGNON statement to start a session on the grid
  • RSUBMIT statement to begin the block of code to be run on the grid
  • ENDRSUBMIT statement to end the block of code to be run on the grid

For more on this approach, see the paper Divide and Conquer – Writing Parallel SAS Code to Speed Up Your SAS Program.

    2. Is there a version of SAS Grid Manager that runs on the SAS Viya architecture?

    The SAS Grid Manager roadmap includes a release of SAS Grid Manager on the SAS Viya Architecture late in 2019.

    3. Will I be able to migrate my SAS Grid Manager configuration and jobs from SAS 9.4 to the SAS Viya-based release of SAS Grid Manager?

    The plan to deliver SAS Grid Manager on the SAS Viya architecture includes automation to migrate jobs, flows, and schedule information from your SAS 9.4 environment to your SAS Viya environment. Our goal is to make this transition as straightforward and easy as possible – especially where there is feature parity between SAS 9-based and SAS Viya-based solutions. Since each product delivers solution-specific PROCs and other functionality that can be used within a job executed by SAS Grid Manager, you should work with your SAS team to understand which jobs can be migrated and which may need to continue to run against your SAS 9.4 environment.

    * * *

    These were all great questions that we thought deserved more detail than we could offer in a webinar.  If you have more questions that weren’t covered here, or in our previous post on this topic, just post them in the comments section.  We’ll answer them quickly.  Thanks for your interest!

    3 questions about implementing SAS Grid Manager and SAS Viya was published on SAS Users.

November 7, 2018
     

    When solving optimization problems, it is harder to specify a constrained optimization than an unconstrained one. A constrained optimization requires that you specify multiple constraints. One little typo or a missing minus sign can result in an infeasible problem or a solution that is unrelated to the true problem. This article shows two ways to check that you've correctly specified the constraints for a two-parameter optimization. The first is a program that translates the linear constraint matrix in SAS/IML software into a system of linear equations in slope-intercept form. The second is a visualization of the constraint region.

    For simplicity, this article only discusses two-dimensional constraint regions. For these regions, the SAS/IML constraint matrix is a k x 4 matrix. We also assume that the constraints are linear inequalities that define a 2-D feasible region.

    The SAS/IML matrix for boundary and linear constraints

    In SAS/IML software you must translate your boundary and linear constraints into a matrix that encodes the constraints. (The OPTMODEL procedure in SAS/OR software uses a more natural syntax to specify constraints.) The first and second rows of the constraint matrix specify the lower and upper bounds, respectively, for the variables. The remaining rows specify the linear constraints. In general, when you have p variables, the first p columns contain a matrix of coefficients; the (p+1)st column encodes whether each constraint is an equality or inequality; and the (p+2)nd column specifies the right-hand side of the linear constraints. In this article, p = 2.

    The following matrix is similar to the one used in a previous article about how to find an initial guess in a feasible region. The matrix encodes three inequality constraints for a two-parameter optimization. The first two rows of the first column specify bounds for the first variable: 0 ≤ x ≤ 10. The first two rows of the second column specify bounds for the second variable: 0 ≤ y ≤ 8. The third through fifth rows specify linear inequality constraints, as indicated in the comments. The third column encodes the direction of the inequality constraints. A value of -1 means "less than"; a value of +1 means "greater than." You can also use the value 0 to indicate an equality constraint.

    proc iml;
    con = {  0   0   .   .,        /* lower bounds */
            10   8   .   .,        /* upper bounds */
             3  -2  -1  10,        /* 3*x1 + -2*x2 LE 10 */
             5  10  -1  56,        /* 5*x1 + 10*x2 LE 56 */
             4   2   1   7 };      /* 4*x1 +  2*x2 GE  7 */

    Verify the constraint matrix

    A pitfall of encoding the constraints is that you might wonder whether you made a mistake when you typed the constraint matrix. Also, each linear constraint (row) in the matrix represents a linear equation in "standard form" such as 3*x - 2*y ≤ 10, whereas you might be more comfortable visualizing constraints in the equivalent "slope-intercept form" such as y ≥ (3/2)x - 5.

    It is easy to transform the equations from standard form to slope-intercept form. If c2 > 0, then the standard equation c1*x + c2*y ≥ c0 is transformed to y ≥ (-c1/c2)*x + (c0/c2). If c2 < 0, you need to remember to reverse the sign of the inequality: y ≤ (-c1/c2)*x + (c0/c2). Because the (p+1)st column has values ±1, you can use the SIGN function to perform all transformations without writing a loop. The following function "decodes" the constraint matrix and prints it in "human readable" form:

    start BLCToHuman(con);
       nVar = ncol(con) - 2;
       Range = {"x", "y"} + " in [" + char(con[1,1:nVar]`) + "," + char(con[2,1:nVar]`) + "]";
       cIdx = 3:nrow(con);              /* rows for the constraints */
       c1 = -con[cIdx,1] / con[cIdx,2];
       c0 =  con[cIdx,4] / con[cIdx,2];
       sign = sign(con[cIdx,3] # con[cIdx,2]);
       s = {"<=" "=" ">="};
       idx = sign + 2;                 /* convert from {-1 0 1} to {1 2 3} */
       EQN = "y" + s[idx] + char(c1) + "x + " + char(c0);
       print Range, EQN;
    finish;
     
    run BLCToHuman(con);
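If you want to verify the same decoding logic outside of SAS/IML, here is a small Python sketch of the algorithm (an illustrative cross-check; the NumPy representation and function name are my own, not part of the original program):

```python
import numpy as np

# Same constraint matrix as the IML example; the last two columns of the
# bounds rows are unused, so any placeholder value works there.
con = np.array([
    [ 0.0,  0.0,  0.0,  0.0],   # lower bounds
    [10.0,  8.0,  0.0,  0.0],   # upper bounds
    [ 3.0, -2.0, -1.0, 10.0],   # 3*x - 2*y <= 10
    [ 5.0, 10.0, -1.0, 56.0],   # 5*x + 10*y <= 56
    [ 4.0,  2.0,  1.0,  7.0],   # 4*x + 2*y >= 7
])

def blc_to_human(con):
    """Rewrite each linear-constraint row in slope-intercept form."""
    ops = {-1: "<=", 0: "=", 1: ">="}
    eqns = []
    for c1, c2, op, rhs in con[2:]:
        slope = -c1 / c2
        intercept = rhs / c2
        # dividing by a negative c2 reverses the inequality
        direction = ops[int(np.sign(op * c2))]
        eqns.append(f"y {direction} {slope:g}*x + {intercept:g}")
    return eqns

for eqn in blc_to_human(con):
    print(eqn)
# y >= 1.5*x + -5
# y <= -0.5*x + 5.6
# y >= -2*x + 3.5
```

The first printed line matches the hand-derived slope-intercept form of 3*x - 2*y ≤ 10 from the discussion above.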

    Visualize the feasible region

    In a previous article, I showed a technique for visualizing a feasible region. The technique is crude but effective. You simply evaluate the constraints at each point of a dense, regular grid. If a point satisfies the constraints, you plot it in one color; otherwise, you plot it in a different color.

    For linear constraints, you can perform this operation by using matrix multiplication. If A is the matrix of linear coefficients and all inequalities are "greater than" constraints, then you simply check that A*X ≥ b, where b is the right-hand side (the (p+2)nd column). If any of the inequalities are "less than" constraints, you can multiply that row of the constraint matrix by -1 to convert it into an equivalent "greater than" constraint. This algorithm is implemented in the following SAS/IML function:

    start PlotFeasible(con);
       L = con[1, 1:2];                   /* lower bound constraints */
       U = con[2, 1:2];                   /* upper bound constraints */
       cIdx = 3:nrow(con);                /* rows for linear inequality constraints */
       C = con[cIdx,] # con[cIdx,3];      /* convert all inequalities to GT */ 
       M = C[, 1:2];                      /* coefficients for linear constraints */
       RHS = C[, 4];                      /* right hand side */
       x = expandgrid( do(L[1], U[1], (U[1]-L[1])/50),    /* define (x,y) grid from bounds */
                       do(L[2], U[2], (U[2]-L[2])/50) );
       q = (M*x`>= RHS);                  /* 0/1 indicator matrix */
       Feasible = (q[+,] = ncol(cIdx));   /* are all constraints satisfied? */
       call scatter(x[,1], x[,2]) group=Feasible grid={x y} option="markerattrs=(symbol=SquareFilled)";
    finish;
     
    ods graphics / width=430px height=450px;
    title "Feasible Region";
    title2 "Formed by Bounds and Linear Constraints";
    run PlotFeasible(con);

    The graph shows the feasible region as a red pentagonal region. The left and lower edges are determined by the bounds on the parameters. The three diagonal edges are determined by the linear inequality constraints. By inspection, you can see that the point (2, 2) is in the feasible region. Similarly, the point (8, 6) is in the blue region and is not feasible.
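To spot-check individual points without drawing the full grid, you can apply the same sign-flipping idea numerically. Below is a Python sketch under the same constraints (my own cross-check, not part of the original IML code):

```python
import numpy as np

A = np.array([[3.0, -2.0],         # coefficient rows of the linear constraints
              [5.0, 10.0],
              [4.0,  2.0]])
op = np.array([-1.0, -1.0, 1.0])   # -1 means "<=", +1 means ">="
b = np.array([10.0, 56.0, 7.0])    # right-hand sides
lower = np.array([0.0, 0.0])       # bound constraints: 0 <= x <= 10
upper = np.array([10.0, 8.0])      #                    0 <= y <= 8

def is_feasible(pt):
    """True if pt satisfies the bounds and every linear inequality."""
    pt = np.asarray(pt, dtype=float)
    in_bounds = np.all(lower <= pt) and np.all(pt <= upper)
    # multiply each "<=" row by -1 so every constraint reads A*pt >= b
    ok = np.all((op[:, None] * A) @ pt >= op * b)
    return bool(in_bounds and ok)

print(is_feasible((2, 2)))   # True: inside the red region
print(is_feasible((8, 6)))   # False: violates 3*x - 2*y <= 10
```

The two test points agree with what the graph shows by inspection.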

    The technique uses the bounds on the parameters (the first two rows) to specify the range of the axes in the plot. If your problem does not have explicit upper/lower bounds on the parameters, you can "invent" bounds such as [0, 100] or [-10, 10] just for plotting the feasible region.

    In summary, this article shows two techniques that can help you verify that a constraint matrix is correctly specified. The first translates the constraint matrix into "human readable" form. The second draws a crude approximation of the feasible region by evaluating the constraints at each location on a dense grid. Both techniques are shown for two-parameter problems. However, with a little effort, you could incorporate the second technique when searching for a good initial guess in higher-dimensional problems.

    The post Visualize the feasible region for a constrained optimization appeared first on The DO Loop.

    November 6, 2018
     

    The only constant is change – Heraclitus As online user behavior continues to evolve, user expectations are growing as well. More and more, users expect to be known; thus, web personalization and targeted content are becoming mission critical and an expectation, not an exception. Using SAS Customer Intelligence 360, we [...]

    Using SAS at SAS: How content targeting drives better UX was published on Customer Intelligence Blog.

    November 6, 2018
     

    A few weeks ago I posted a cliffhanger-of-a-blog-post. I left my readers in suspense about which of my physical activities are represented in different sets of accelerometer data that I captured. In the absence of more details from me, the internet fan theories have been going wild. Well, it's time for the big reveal! I've created a SAS Visual Analytics report that shows each of these activity streams with the proper label:

    Accelerometer measurements per activity -- click to enlarge!

    Were your guesses confirmed? Any surprises? Were you more impressed with my safe driving or with my reckless behavior on the trampoline?

    Collecting and preparing accelerometer data

    You might remember that this entire experiment was inspired by a presentation from Analytics Experience 2018. That's when I learned about an insurance company that built a smartphone app to collect data about driving behavior, and that the app relies heavily on accelerometer readings. I didn't have time or expertise to build my own version of such an app, but I found that there are several good free apps that can collect and export this data. I used an app called AccDataRec on my Android phone.

    Each "recording session" generates a TSV file -- a tab-separated file that contains a timestamp and a measurement for each of the accelerometer axes (X, Y, and Z). In my previous post, I shared tips about how to import multiple TSV files in a single step. Here's the final version of the program that I wrote to import these data:

    filename tsvs "./accel/*.tsv";
    libname out "./accel";
     
    data out.accel;
      length 
        casefile $ 100 /* to write to data set */
        counter 8 
        timestamp 8 
        timestamp_sec 8
        x 8 y 8 z 8 
        filename $ 25        
        tsvfile $ 100 /* to hold the value */
      ;
      format timestamp datetime22.3 timestamp_sec datetime20.;
     
      /* store the name of the current infile */
      infile tsvs filename=tsvfile expandtabs;
      casefile=tsvfile;
      input counter timestamp x y z filename;
     
      /* convert epoch time into SAS time */
      timestamp=dhms('01jan1970'd, 0, 0, timestamp / 1000);
     
      /* create a timestamp with the precision of one second */
      timestamp_sec = intnx('second',timestamp,0);
    run;

    Some notes:

    • I converted the timestamp value from the data file (an epoch time value) to a native SAS datetime value by using this trick.
    • Following advice from readers on my last post, I changed the DLM= option to the simpler EXPANDTABS option on the INFILE statement.
    • Some of the SAS time-series analysis doesn't like the more-precise timestamp values with fractions of seconds. I computed a less precise field, rounding down to the second, just in case.
    • For my reports in this post, I really need only 5 fields: counter (the ordinal sequence of measurements), x, y, z, and the filename (mapping to activity).
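If you want to sanity-check the epoch-to-datetime arithmetic outside of SAS, the same conversion can be sketched in Python (illustrative only; the function names and the sample timestamp are mine, not from the original program):

```python
from datetime import datetime, timezone

def epoch_ms_to_datetime(ms):
    """Convert an epoch timestamp in milliseconds to a UTC datetime,
    mirroring the dhms('01jan1970'd, 0, 0, timestamp / 1000) trick."""
    return datetime.fromtimestamp(ms // 1000, tz=timezone.utc).replace(
        microsecond=(ms % 1000) * 1000)

def truncate_to_second(dt):
    """Mirror of intnx('second', timestamp, 0): drop fractional seconds."""
    return dt.replace(microsecond=0)

stamp = epoch_ms_to_datetime(1541520000123)   # a made-up sample reading
print(stamp)                      # 2018-11-06 16:00:00.123000+00:00
print(truncate_to_second(stamp))  # 2018-11-06 16:00:00+00:00
```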

    The new integrated SAS Viya environment makes it simple to move from one task to another, without needing to understand the SAS product boundaries. I used the Manage Data function (that's SAS Data Management, but does that matter?) to upload the ACCEL data set and make it available for use in my reports. Here's a preview:

    Creating a SAS Visual Analytics report

    With the data now available and loaded into memory, I jumped to the Explore and Visualize Data activity. This is where I can use my data to create a new SAS Visual Analytics report.

    At first, I was tempted to create a Time Series Plot. My data does contain time values, and I want to examine the progression of my measurements over time. However, I found the options of the Time Series Plot to be too constraining for my task, and it turns out that for this task the actual time values really aren't that important. What's important is the sequence of the measurements I've collected, and that's captured as an ordinal in the counter value. So, I selected the Line Plot instead. This allowed for more options in the categorical views -- including a lattice row arrangement that made it easy to see the different activity patterns at a glance. This screen capture shows the Role assignments that I selected for the plot.

    Adding a closer view at each activity

    With the overview Line Plot complete, it's time to add another view that allows us to see just a single activity and provide a close-up view of its pattern. I added a second page to my report and dropped another Line Plot onto the canvas. I assigned "counter" to the category and the x, y, and z values to the Measures. But instead of adding a Lattice Row value, I added a Button Bar to the top of the canvas. My idea is to use the Button Bar -- which is good for navigating among a small number of values -- as a way to trigger a filter for the accelerometer data.

    I assigned "filename" to the Category value in the Button Bar role pane. Then I used the Button Bar options menu (the vertical dots on the right) to add a New filter from selection, selecting "Include only selection".

    With this Button Bar control and its filter in place, I can now switch among the data values for the different activities. Here's my "drive home" data -- it looks sort of exciting, but I can promise you that it was a nice, boring ride home through typical Raleigh traffic.

    Phone mounted in my car for the drive home

    The readings from the "kitchen table" activity surprised me at first. This activity was simply 5 minutes of my phone lying flat on my kitchen table. I expected all readings to hover around zero, but the z axis showed a relatively flat line closer to 10 meters-per-second-per-second. Then I remembered: gravity. This sensor registers Earth's gravity, which we are taught is 9.8 meters-per-second-per-second. The readings from my phone hovered around 9.6 -- maybe my house is in a special low-gravity zone, or the readings are a bit off.

    Phone at rest on my kitchen table

    Finally, let's take a closer look at my trampoline workout. Since I was holding my phone upright, it looks like the x-axis felt the brunt of the acceleration forces. According to these readings, my phone was subjected to a g-force of 7 or 8 times that of Earth's gravity -- but just for a split second. And since my phone was in my hand and my arm was flailing around (I am not a graceful rebounder), my phone was probably experiencing more force than my body was.

    Bounding on the trampoline as high as I can
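For readers who want to turn raw x, y, z readings into the g-force multiples mentioned above, the standard calculation is the magnitude of the acceleration vector divided by standard gravity. Here's a small Python sketch (mine, not from the original post; the sample values are made up to resemble the readings described):

```python
import math

STANDARD_GRAVITY = 9.80665   # m/s^2, the conventional value of g

def g_force(x, y, z):
    """Magnitude of an (x, y, z) accelerometer reading in multiples of g."""
    return math.sqrt(x * x + y * y + z * z) / STANDARD_GRAVITY

# A phone at rest on a table reads roughly 1 g on the vertical axis:
print(round(g_force(0.0, 0.1, 9.6), 2))    # 0.98
# A hard trampoline bounce concentrated on the x axis:
print(round(g_force(75.0, 3.0, 5.0), 1))   # 7.7
```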

    Some love for the Windows 10 app

    My favorite method to view SAS Visual Analytics reports is through the SAS Visual Analytics application that's available for Windows 10 and Windows mobile devices. Even on my desktop, where I have a full web browser to help me, I like the look and feel of the specialized Windows 10 app. The report screen captures for this article were rendered in the Windows 10 app. Check out this article for more information about the app. You can try the app for free, even without your own SAS Viya environment. The app is hardwired with a connection to the SAS demo reports at SAS.com.

    See also

    This is the third (and probably final) article in my series about accelerometer data. See these previous posts for more of the fun background information:

    The post Reporting on accelerometer data with SAS Visual Analytics appeared first on The SAS Dummy.