Business Intelligence

2月 082018
 

By default, SAS Visual Analytics 7.4 supports country and state level polygons for regional geomaps. In SAS Visual Analytics 7.4, custom shape files are now supported, as well. This means that if a site has their own custom polygon data that defines custom regions, it’s possible to create a region geomap that displays those regions.

Implementing the process requires completing some preparatory steps, explicitly execution of some SAS code, but the steps are explained in Appendix 2 of the SAS Visual Analytics 7.4: Administration Guide. The SAS program that completes the steps is provided for download at http://support.sas.com/rnd/datavisualization/vageo/va74polygons.sas.

Two examples using the program are provided in Appendix 2 for US counties and German provinces. The instructions in Appendix 2 assume that the custom polygon data is provided in ESRI shape file format, which is likely the most common use-case. The site will need access to a SAS programming environment and SAS/GRAPH software, and whoever completes the process will need access to the SAS Visual Analytics configuration directory and the ability to restart services—so an administrator-type person will be required.

One common request is to provide a regional geomap, where the regions are site-defined groups of states or provinces of a country. In this example problem, the site has sales data for each sales region in the US and would like to display a geo map of the regions.

Custom regional map in SAS Visual AnalyticsFor this type of region/province example, you will likely be able to use one of the maps already provided by SAS in the MAPSGFK library to produce your region boundaries. For more information on the datasets in the MAPSGFK library, see this paper. 

The MAPSGFK.US_STATES dataset contains the data required to overlay all states of the US on a VA region geomap and has these columns:

The highlighted columns, STATECODE, LONG, and LAT will be particularly useful, but first, the sales region (REGION) column and values must be added using simple data step code. The unnecessary FIPS code (STATE) can be dropped in the same DATA step.  Note that the region values are assigned in upper case, as these will later be converted to ID values, which VA expects to be in upper case.

data regions;
   length region $ 12;
   drop state;
   set mapsgfk.us_states;
      if statecode in ('AK','HI','PR') then delete;
      else if statecode in ('WA','MT','OR','ID','WY')
         then region='NORTHWEST';
      else if statecode in ('CA','NV','UT','AZ','CO','NM')
         then region='SOUTHWEST'; 
      else if statecode in ('ND','SD','NE','MN','WI','MI','IA','IL','IN')
         then region='NORTHCENTRAL'; 
      else if statecode in ('KS','OK','TX','MO','AR')
         then region='SOUTHCENTRAL'; 
      else if statecode in ('ME','NH','VT','MA','RI','CT','NY','PA','NJ','OH','DE',
'MD','DC')then region='NORTHEAST';
      else if statecode in ('KY','WV','VA','TN','NC','MS','AL','LA','GA','SC','FL')
         then region='SOUTHEAST';
      run;

The data is then sorted by the REGION values, a requirement of the SAS/GRAPH GREMOVE procedure, which is used to remove the internal state boundary data points, leaving the region boundary points only.

proc sort data=regions;
   by region;
 proc gremove data=regions out=mapscstm.regions1;
    by region;
    id statecode;
    run;

To complete the process, since the LAT and LONG values are already in the form that VA needs (unprojected) and we are using a SAS dataset rather than the ESRI shape file format, we’ll only use a part of the code from the downloadable program mentioned at the beginning of the blog.

First, create a mapscstm directory under /SASHome/SASFoundation/9.4 to store the custom polygon dataset.  Make sure that the library is accessible to the SAS session by including a libname statement in the appserver_autoexec_usermods.sas file, found in config/Lev1/SASApp, and then restarting the Object Spawner.

Example:

libname MAPSCSTM “SASHome/SASFoundation/9.4/mapscstm”;

Tip:  Be sure to back up the original ATTRLOOKUP and CENTLOOKUP datasets before running any additional code, as you will be modifying the originals.

To complete creation of the polygon dataset, you will need to execute only a part of the downloadable program to:
• Make sure that your polygon dataset has all of the columns expected by SAS Visual Analytics.
• Add the region attributes to the ATTRLOOKUP.
• Add the region center point locations to the CENTLOOKUP dataset.

%let REGION_LABEL=USRegions;   /* The label for the custom region */
 %let REGION_PREFIX=R1; /* unique ISO 2-Letter Code  */
 %let REGION_ISO=000; /* unique ISO Code  */
 %let REGION_DATASET=MAPSCSTM.REGIONS1;  /* Polygon data set to be 
              created - be sure to use suffix "1" */

Note that the downloadable program includes additional macro assignments and additional code, but since our data is already in the form of a SAS dataset, rather than ESRI shape file format, we won’t be using all of the code.

The following datastep adds the necessary columns/values to the polygon dataset so that the form of the data is what is expected by VA.  Note that the LAT and LONG columns are already in unprojected form, so we just assign the same values to X and Y.  (VA doesn’t actually use the X,Y columns from the polygon dataset.)

data &REGION_DATASET.;
   set &REGION_DATASET.;
   where density <= 3; 
   id=region;
   idname=region;
   x=long;  
   y=lat;
   ISO = "&REGION_ISO.";
   RESOLUTION = 1;
   LAKE = 0;
   ISOALPHA2 = "&REGION_PREFIX.";
   AdminType = "regions";
   keep ID SEGMENT IDNAME LONG LAT X Y ISO DENSITY RESOLUTION LAKE ISOALPHA2 AdminType;
   run;

Then PROC SQL steps are executed to add rows relative to the custom polygons to the ATTRLOOKUP and CENTLOOKUP datasets:

This step adds the USRegions row to ATTRLOOKUP:

proc sql;
   insert into valib.attrlookup
      values ( 
         "&REGION_LABEL.",         /* IDLABEL=State/Province Label */
         "&REGION_PREFIX.",        /* ID=SAS Map ID Value */
         "&REGION_LABEL.",         /* IDNAME=State/Province Name */
         "",                       /* ID1NAME=Country Name */
         "",                       /* ID2NAME */
         "&REGION_ISO.",           /* ISO=Country ISO Numeric Code */
         "&REGION_LABEL.",         /* ISONAME */
         "&REGION_LABEL.",         /* KEY */
         "",                       /* ID1=Country ISO 2-Letter Code */
         "",                       /* ID2 */
         "",                       /* ID3 */
         "",                       /* ID3NAME */
         0                         /* LEVEL (0=country level, 1=state level) */
         );
quit;

This step adds a row to ATTRLOOKUP for each individual region:

proc sql;
   insert into valib.attrlookup
      select distinct 
         IDNAME,            /* IDLABEL=State/Province Label */
         ID,                /* ID=SAS Map ID Value */
         IDNAME,            /* IDNAME=State/Province Name */
 
         "&REGION_LABEL.",  /* ID1NAME=Country Name */
         "",                /* ID2NAME */
         "&REGION_ISO.",    /* ISO=Country ISO Numeric Code */
         "&REGION_LABEL.",  /* ISONAME */
         trim(IDNAME) || "|&REGION_LABEL.",  /* KEY */
         "&REGION_PREFIX.",   /* ID1=Country ISO 2-Letter Code */
         "",                  /* ID2 */
         "",                  /* ID3 */
         "",                  /* ID3NAME */
         1                    /* LEVEL (1=state level) */
   from &REGION_DATASET.;
quit;

This step calculates and adds the central location point for each of the regions to the CENTLOOKUP dataset.   The site data contains only the 48 contiguous states (no Alaska or Hawaii). If Alaska and Hawaii had been included, a different algorithm would need to be used to calculate the central location.

proc sql;
   /* Add custom region */
   insert into valib.centlookup
      select distinct
         "&REGION_DATASET." as mapname,
         "&REGION_PREFIX." as ID,
         avg(x) as x,
         avg(y) as y
      from &REGION_DATASET.;
 
   /* Add custom provinces */
   insert into valib.centlookup
      select distinct
         "&REGION_DATASET." as mapname,
         ID as ID,
         avg(x) as x,
         avg(y) as y
      from &REGION_DATASET.
         group by id;
quit;

After executing the code above, you will need to restart the Web Application server, so that SAS Visual Analytics has access to the new polygons.

Code is also included in the downloadable program to create a dataset for validating your results. The validate dataset includes a column for the ID and IDNAME of the regions, in addition to two randomly calculated measures.  In our case, we will instead just use our original REGIONSALES dataset containing the regional sales data.

1. Sign into SAS Visual Analytics and create a new exploration with data source REGIONSALES.
2. Create a Geo data item from State: Right-click Regions, select Geography?Subdivision(State, Province) Names. From the Country or Region drop-down list, select the USRegions region label.
3. Create a geo map visualization. Select Regions for the map style, Regions for the Geography role, and salesamt for the Color role.

Your regions should display, similar to this:

You can also include the region data item in a hierarchy with the state data item to produce a drill-down region map:

Or a bubble or coordinate map:

I hope this example has been helpful to users of SAS Visual Analytics 7.4.  In my next blog, you will see that this process is tremendously simplified by new mapping features in SAS Visual Analytics 8.2.

Creating a custom regional map in SAS Visual Analytics 7.4 was published on SAS Users.

12月 222017
 

Another report requirement came my way and I wanted to share how to use our Visual Analytics’ out-of-the-box relative period calculations to solve it.

Essentially, we had a customer who wanted to see a metric for every month, the previous month’s value next to it, and lastly the difference between the two.

Relative Period Report in SAS Visual Analytics

To do this in SAS Visual Analytics, which is available in versions 7.3 and above, use the relative periodic operators. I am going to use the Mega_Corp data which has a date data item called Date by Month using the format: MMMYYYY. SAS Visual Analytics supports relative period calculations for month, quarter and year.
The first two columns, circled in red, are straight from the data. The metric we are interested in for this report is Profit.

Next, we will create the last column, Profit (Difference from Previous Period), which is an aggregated measure that uses the periodic operators.

From the Data pane, select the metric used in the list table, Profit. Then right-click on Profit and navigate the menus: Create / Difference from Previous Period / Using: Date by Month.

A new aggregated measure will be created for you:

If you right-click on the aggregated measure and select Edit Aggregated Measure…, you will see this relative period calculation, where it is taking the current period (notice the 0) minus the value for the previous period (notice the -1).

Okay – that’s it. This out-of-the-box relative period calculation is ready to be added to the list table. Notice the other Period Operators available in the list. These support SAS Visual Analytics’ additional out-of-the-box aggregated measure calculations such as the Difference between Parallel Periods, Year to Date cumulative calculations, etc.

Now we have to create the final column to meet our report requirement: the Previous Period column.

To do this we are going to leverage the out-of-the-box functionality of the relative period calculation. Since this aggregated measure calculates the previous period for the subtraction – let’s use this to our advantage.

Duplicate the out-of-the-box relative period calculation by right-clicking on Profit (Difference from Previous Period) and select Duplicate Data Item.

Then right-click on the new data item, and select Edit Aggregated Measure….

Now delete everything highlighted in yellow below, remember to also delete the minus sign. And give the data item a new name. Click OK. This will create an aggregated measure that will calculate the previous period.

The final result should look like this from either the Visual tab or Text tab:

Now we have all the columns to meet our report requirement:

Now that I’ve piqued your interest, I’m sure you are wondering if you could use this technique to create aggregated data items to represent the Period -1, -2, -3 offset? YES! This is absolutely possible.
Also, I went ahead and plotted the Difference from Previous Period on a line chart. This is an extremely useful visualization to gage if the variance between periods is acceptable. You can easily assign display rules to this visualization to flag any periods that may need further investigation.

Relative Period Report in SAS Visual Analytics was published on SAS Users.

11月 142017
 

SAS Visual Analytics 7.4 has added the support for date parameters. Recall from my first post,  Using parameters in SAS Visual Analytics, a parameter is a variable whose value can be changed at any time by the report viewer and referenced by other report objects. These objects can be a calculated item, aggregated measure, filter, rank or display rule. And remember, every time the parameter is changed, the corresponding objects are updated to reflect that change.

Here is my updated table that lists the supported control objects and parameter types for SAS Visual Analytics 7.4. The type of parameter is required to match the type of data that is assigned to the control.
Notice that SAS Visual Analytics 7.4 has also introduced the support for multiple value selection control objects. I’ll address these in another blog.

Using Date Parameters in your SAS Visual Analytics Reports

Let’s look at an example of a SAS Visual Analytics Report using date parameters. In this fictitious report, we have been given the requirements that the user wants to pick two independent date periods for comparison. This is not the same requirement as filtering the report between a start and end date. This report requirement is such that a report user can pick two independent months in the source data to be able to analyze the change in Expense magnitude for different aggregation levels, such as Region, Product Line and Product.

In this example, we will compare two different Month,Year periods. This could easily be two different Quarter,Year or Week,Year periods; depending on the report requirements, these same steps can be applied.

In this high level breakdown, you can see in red I will create two date parameters from data driven drop-down lists. From these parameter values, I will create two calculated data items, shown in purple, and one aggregated measure that will be used in three different report objects, shown in green.

Here are the steps:

1.     Create the date parameters.

2.     Add the control objects to the report and assign roles.

3.     Create the dependent data items, i.e. the calculated data items and aggregated measure.

4.     Add the remaining report objects to the canvas and assign roles.

Step 1: Create the date parameters

First we will need to create the date parameters that will hold the values made by the report viewers. From the Data Pane, use the drop-down menu and select New Parameter….

Then create your first parameter as shown below. Give it a name.

Next, select minimum and maximum values allowed for this parameter. I used the min and max available in my data source, but you could select a more narrow range if you wanted to restrict the users to only have access to portions of the data, just so long as the values are in your data source since, in this example, we will use the data source to populate the available values in the drop-down list.

Then select a current value, this will serve as the default value that will populate when a user first opens the report.

Finally, select the format in which you want your data item to be formatted. I selected the same format as my underlying data item I will be using to populate the drop-down list.

Notice how your new parameters will now be available from your Data Pane.

Step 2: Add the control objects to the report and assign roles

Next, drag and drop the drop-down list control objects onto the report canvas. In this example, we are not using the Report or Section Prompt areas since I do not want to filter the objects in the report or section automatically. Instead, I am using these prompt values and storing them in a parameter. I will then use those values to create new calculated data items and an aggregated measure.

Once your control objects are in the report canvas, then use the Roles Pane to assign the data items to the roles. As you can see from the screenshot, we are using the Date by Month data item to seed the values of the drop-down list by assigning it to the Category role, this data item is in our data source.

Then we are going to assign our newly created parameters, Period1Parameter and Period2Parameter to the Parameter role. This will allow us to use the value selected in our calculations.

Step 3: Create the dependent data items, i.e. the calculated data items and aggregated measure

Now we are free to use our parameters as we like. In this example, I am prompting the report viewer for two values: Period 1 and Period 2 which are the two periods the user would like compared in this report. So, we will need to create two calculated data times from a single column in our source data. Since we want to display these as columns next to each other in a crosstab object and use them for an aggregated measure, this technique can be used.

Calculated Data Item: Period 1 Expenses

From the Data Pane, use the drop-down menu and select New Calculated Item…. Then use the editor to create this expression: If the Date by Month for this data row equals the parameter value selected for Period 1, then return the Expenses; else return 0.

Calculated Data Item: Period 2 Expenses

Repeat this using the Period2Parameter in the expression.

Aggregated Measure: Period Difference

Next, we want to calculate the difference between the two user selected Period Expenses. To do this, we will need to create an aggregated measure which will evaluate based on the report object’s role assignments. In other words, it will be calculated “on-the-fly” based on the visualization.

Similar to the calculated data items, use the Data Pane and from the drop-down menu select New Aggregated Measure…. Use the editor to create this expression. Notice that we are using our newly created calculated data items we defined using the parameter values. This expression does not use the parameter value directly, but indirectly through the calculated data item.

Step 4: Add the remaining report objects to the canvas and assign roles

No that we have:

  • our Control Objects to capture the user input.,
  • the Parameters to store the values.,
  • and the Calculated Data Items and Aggregated Measure created…

we can add our report objects to the canvas and assign our roles.

You can see I used all three new measures in the crosstab object. I used the aggregated measure in the bar chart and treemap but notice the different aggregation levels. There is even a hierarchy assigned to the treemap category role. This Period Difference aggregated measure calculation is done dynamically and will evaluate for each visualization with its unique role assignments, even while navigating up and down the hierarchy.

Here are some additional screenshots of different period selections.

In this first screenshot you can see the parallel period comparison between December 2010 and 2011.

In these next two screenshots, we are looking at the Thanksgiving Black Friday month of November. We are comparing the two years 2010 and 2011 again. Here we see that the Board Product from the Game Product Line is bright blue indicating an increase in magnitude of Expenses in the most recent period, Nov2011.

By double clicking on Board in the treemap, we are taken to the next level of the hierarchy, Product Description, where we see a the largest magnitude of Expenses is coming from Backgammon and Bob Board Games.

In these final two screenshots we are comparing consecutive periods, November 2011 with December 2011. We can see from the bar chart easily the Region and Product Line where there is the greatest increase in Expenses.

I’ve configured a brush interaction between all three visualizations so that when I select the tallest bar it will highlight the corresponding data values in the crosstab and treemap.

Conclusion

Now you can use date parameters in your Visual Analytics Reports. There are several applications of this feature and this is only one way you can use parameters to drive business intelligence. Using this technique to create columns based on a user selected value is great when you need to compare values when your source data isn’t structured in this manner.

Using Date Parameters in your SAS Visual Analytics Reports was published on SAS Users.

5月 022017
 

For many years the humble spreadsheet has held many different roles and responsibilities supporting finance, marketing, sales -- pretty much every department in your business. There's always someone with a “magic spreadsheet,” but how effective is this culture that always uses the same format to consume data? My view of [...]

The spreadsheet: Friend or foe? was published on SAS Voices by Tim Clark

4月 102017
 

Earth is an explosive world! Data from the Smithsonian Institution's Global Volcanism Program (GVP) documents Earth's volcanoes and their eruptive history over the past 10,000 years. The GVP database includes the names, locations, types, and features of more than 1,500 volcanoes. Let's look closer into volcanic eruptions across the globe [...]

How to design an infographic about volcanic eruptions using SAS Visual Analytics was published on SAS Voices by Falko Schulz

3月 282017
 

How many meteorites have hit the earth in the last 4,000 years? Where have they landed? And which ones were the biggest? Can we show all of this information - and more in an intuitive data visualization? It turns out NASA provides public data about recorded meteorite impacts on earth all the [...]

How to design a meteorite infographic using NASA data and SAS was published on SAS Voices by Falko Schulz

1月 312017
 

You have all seen, or perhaps even created, some really bad graphics: Cluttered, confusing, too small, incomprehensible. Or worse, the author may have committed one of the three unforgivable sins of data visualization by deceptively distorting a map, truncating the axis so as to misrepresent the data, or used double […]

How to make your pie chart worse was published on SAS Voices.

1月 092017
 

I've long been fascinated by both science and the natural world around us, inspired by the amazing Sir David Attenborough with his ever-engaging documentaries and boundless enthusiasm for nature, and also by the late, great Carl Sagan and his ground-breaking documentary series, COSMOS. The relationships between the creatures, plants and […]

Intelligent ecosystems and the intelligence of things was published on SAS Voices.

12月 122016
 

Being an industry disruptor is a lonely place to be – but if you’re successful, the rewards are well worth the initial risk. Betting big on your new way of doing things takes courage, and is only the first step in a risky process. Your next critical step is to […]

4 Basics disruptive innovators must get right was published on SAS Voices.

12月 072016
 

As technology evolves, so do the c-suite roles related to technology. In particular, the roles of Chief Digital Officer and Chief Data Officer – both referred to as CDO – have seen rapid changes. This post will document the changes I've observed in these two roles, and answer questions I've heard as our customers have been navigating the […]

Rise of the CDO reflects the rising role of data was published on SAS Voices.