Oklahoma State University (OSU) has corralled its data faster than a tumbleweed in a whirlwind, and has bold plans to transform its institutional research efforts. I recently met with OSU's Institutional Research and Information Management (IRIM) team, which provides information, research, decision support, and analysis on demand to the OSU [...]
Data and analytics touch our lives every day. Consider: A call from your bank warning of a suspicious transaction. A well-timed discounted offer for something you need. Most people realize that data and analytics are behind these things, but they remain on the periphery of mainstream conversations. We need to [...]
Can you use a data visualization tool to display building maps, floor designs and other Esri data? With the recent addition of custom polygon support in SAS Visual Analytics 8.2, customers wondered if this feature can be utilized to render different types of regional overlays. A common request is to [...]
Building and visualizing custom polygons in SAS Visual Analytics was published on SAS Voices by Falko Schulz
Are you interested in using SAS Visual Analytics 8.2 to visualize a state by regions, but all you have is a county shapefile? As long as you can cross-walk counties to regions, this is easier to do than you might think.
Here are the steps involved:
Obtain a county shapefile and extract all components to a folder. For example, I used the US Counties shapefile found in this SAS Visual Analytics community post.
Note: Shapefile is a geospatial data format developed by ESRI. Shapefiles are comprised of multiple files. When you unzip the shapefile found on the community site, make sure to extract all of its components and not just the .shp. You can get more information about shapefiles from this Wikipedia article: https://en.wikipedia.org/wiki/Shapefile.
Run PROC MAPIMPORT to convert the shapefile into a SAS map dataset.
libname geo 'C:\Geography'; /*location of the extracted shapefile*/ proc mapimport datafile="C:\Geography\UScounties.shp" out=geo.shapefile_counties; run;
Add a Region variable to your SAS map dataset. If all you need is one state, you can subset the map dataset to keep just the state you need. For example, I only needed Texas, so I used the State_FIPS variable to subset the map dataset:
proc sql; create table temp as select *, /*cross-walk counties to regions*/ case when name='Anderson' then '4' when name='Andrews' then '9' when name='Angelina' then '5' when name='Aransas' then '11', <……> when name='Zapata' then '11' when name='Zavala' then '8' end as region from geo.shapefile_counties /*subset to Texas*/ where state_fips='48'; quit;
Use PROC GREMOVE to dissolve the boundaries between counties that belong to the same region. It is important to sort the county dataset by region before you run PROC GREMOVE.
proc sort data=temp; by region; run; proc gremove data=temp out=geo.regions_shapefile nodecycle; by region; id name; /*name is county name*/ run;
To validate that your boundaries resolved correctly, run PROC GMAP to view the regions. If the regions do not look right when you run this step, it may signal an issue with the underlying data. For example, when I ran this with a county shapefile obtained from Census, I found that some of the counties were mislabeled, which of course, caused the regions to not dissolve correctly.
proc gmap map=geo.regions_shapefile data=geo.regions_shapefile all; id region; choro region / nolegend levels=1; run;
Here’s the result I got, which is exactly what I expected:
Add a sequence number variable to the regions dataset. SAS Visual Analytics 8.2 needs it properly define a custom polygon inside a report:
data geo.regions_shapefile; set geo.regions_shapefile; seqno=_n_; run;
Load the new region shapefile in SAS Visual Analytics.
In the dataset with the region variable that you want to visualize, create a new geography variable and define a new custom polygon provider.
Now, you can create a map of your custom regions:
How to create custom regional maps in SAS Visual Analytics 8.2 was published on SAS Users.
In this education analytics series of blog posts, we have been on a journey to learn how education customers are turning their data into insights to be a more data-informed and analytical organizations. In my first five posts in the education analytics blog series, we learned how education customers are using SAS, [...]
It’s true. The ODS destination for PowerPoint now has the STARTPAGE= option, which provides you with greater control and flexibility when creating presentations.
Added to the ODS POWERPOINT statement in SAS® 9.4TS1M4, the STARTPAGE= option enables you to force the creation of a new slide between procedures and between ODS layout containers. Inserting a slide break between layout containers is one of the most impactful ways that you can use this option.
A new layout container does not automatically trigger a new slide within the presentation. A new slide is started when the current slide is full. That is the default, but the new STARTPAGE= option gives you the ability to start a new slide between containers even if the current slide is not full.
Shown below are four procedures placed within three layout containers.
- The first PROC ODSTEXT step is placed in the first layout container. Its purpose is to generate a slide of text, with that text roughly centered in the slide.
- The second PROC ODSTEXT step is in the second container. Its purpose is to provide useful information about the table and graph.
- The PROC TABULATE and SGPLOT steps make up the third container. They are the results of the analysis and, as such, need to be displayed side by side.
Default Behavior of ODS POWERPOINT, Without STARTPAGE=
Let’s look at the default behavior. In this example, the STARTPAGE= option is not used.
ods powerpoint file='example1.pptx' options(backgroundimage="saslogo_pptx.png"); title; ods layout gridded x=10% y=25%; proc odstext; p "Have you heard?" /style=[just=c fontsize=42pt color=RoyalBlue]; p "The STARTPAGE= option has been added to the ODS POWERPOINT statement!" /style=[just=c fontsize=24pt]; run; ods layout end; ods layout gridded rows=1 columns=1; ods region; proc odstext; p 'Table Shows Total Runs and Hits for Each League'; p 'The Graph Contains One Bubble for Each Player. The Size of Each Bubble Represents the Magnitude of the RBIs.'; run; ods layout end; ods graphics / width=4.5in height=4in; ods layout gridded columns=2 column_widths=(47% 47%) column_gutter=1pct; ods region; proc tabulate data=sashelp.baseball; class league; var nruns nhits nrbi; tables league='', sum='Totals'*(nruns nhits)*f=comma12.; run; ods region; proc sgplot data=sashelp.baseball; bubble x=nhits y=nruns size=nrbi/ group=league transparency=.3; run; ods layout end; ods powerpoint close;
Here is the resulting slide output:
Those results are not what we hoped they would be. The output from the second PROC ODSTEXT step, which is to provide information about the table and graph, is on the first slide. So is the graph!!! And the graph does not look good because it is the wrong size. The table is by itself on the second slide. This is not the desired output at all.
Here Is Where STARTPAGE= Helps!
In this example, an ODS POWERPOINT statement with the STARTPAGE= option is added. It is placed after the ODS LAYOUT END statement for the first container.
ods powerpoint file='example2.pptx' options(backgroundimage="saslogo_pptx.png"); title; ods layout gridded x=10% y=25%; proc odstext; p "Have you heard?" /style=[just=c fontsize=42pt color=RoyalBlue]; p "The STARTPAGE= option has been added to the ODS POWERPOINT statement!" /style=[just=c fontsize=24pt]; run; ods layout end; ods powerpoint startpage=now; /* <---- Triggers a new slide */ ods layout gridded rows=1 columns=1; ods region; proc odstext; p 'Table Shows Total Runs and Hits for Each League'; p 'The Graph Contains One Bubble for Each Player. The Size of Each Bubble Represents the Magnitude of the RBIs.'; run; ods layout end; ods graphics / width=4.5in height=4in; ods layout gridded columns=2 column_widths=(47% 47%) column_gutter=1pct; ods region; proc tabulate data=sashelp.baseball; class league; var nruns nhits nrbi; tables league='', sum='Totals'*(nruns nhits)*f=comma12.; run; ods region; proc sgplot data=sashelp.baseball; bubble x=nhits y=nruns size=nrbi/ group=league transparency=.3; run; ods layout end; ods powerpoint close;
The STARTPAGE= option gave us exactly what we need. The first slide contains just the text (from the first layout container). The second slide contains more text along with the table and graph (from the second and third layout containers).
Use It Wisely!
The most important thing to know about using the STARTPAGE= option with layout containers is that it has to be placed between containers. It cannot be placed within a container. So you cannot put the statement between the ODS LAYOUT GRIDDED and ODS LAYOUT END statements.
For more information about ODS destination for PowerPoint and all of its options, visit this The Dynamic Duo: ODS Layout and the ODS Destination for PowerPoint. Take a peek at it for more examples of using ODS Layout with the ODS destination for PowerPoint.
The ODS Destination for PowerPoint Has a New Option was published on SAS Users.
SAS Visual Analytics 7.4 has added the support for date parameters. Recall from my first post, Using parameters in SAS Visual Analytics, a parameter is a variable whose value can be changed at any time by the report viewer and referenced by other report objects. These objects can be a calculated item, aggregated measure, filter, rank or display rule. And remember, every time the parameter is changed, the corresponding objects are updated to reflect that change.
Here is my updated table that lists the supported control objects and parameter types for SAS Visual Analytics 7.4. The type of parameter is required to match the type of data that is assigned to the control.
Notice that SAS Visual Analytics 7.4 has also introduced the support for multiple value selection control objects. I’ll address these in another blog.
Let’s look at an example of a SAS Visual Analytics Report using date parameters. In this fictitious report, we have been given the requirements that the user wants to pick two independent date periods for comparison. This is not the same requirement as filtering the report between a start and end date. This report requirement is such that a report user can pick two independent months in the source data to be able to analyze the change in Expense magnitude for different aggregation levels, such as Region, Product Line and Product.
In this example, we will compare two different Month,Year periods. This could easily be two different Quarter,Year or Week,Year periods; depending on the report requirements, these same steps can be applied.
In this high level breakdown, you can see in red I will create two date parameters from data driven drop-down lists. From these parameter values, I will create two calculated data items, shown in purple, and one aggregated measure that will be used in three different report objects, shown in green.
Here are the steps:
1. Create the date parameters.
2. Add the control objects to the report and assign roles.
3. Create the dependent data items, i.e. the calculated data items and aggregated measure.
4. Add the remaining report objects to the canvas and assign roles.
Step 1: Create the date parameters
First we will need to create the date parameters that will hold the values made by the report viewers. From the Data Pane, use the drop-down menu and select New Parameter….
Then create your first parameter as shown below. Give it a name.
Next, select minimum and maximum values allowed for this parameter. I used the min and max available in my data source, but you could select a more narrow range if you wanted to restrict the users to only have access to portions of the data, just so long as the values are in your data source since, in this example, we will use the data source to populate the available values in the drop-down list.
Then select a current value, this will serve as the default value that will populate when a user first opens the report.
Finally, select the format in which you want your data item to be formatted. I selected the same format as my underlying data item I will be using to populate the drop-down list.
Notice how your new parameters will now be available from your Data Pane.
Step 2: Add the control objects to the report and assign roles
Next, drag and drop the drop-down list control objects onto the report canvas. In this example, we are not using the Report or Section Prompt areas since I do not want to filter the objects in the report or section automatically. Instead, I am using these prompt values and storing them in a parameter. I will then use those values to create new calculated data items and an aggregated measure.
Once your control objects are in the report canvas, then use the Roles Pane to assign the data items to the roles. As you can see from the screenshot, we are using the Date by Month data item to seed the values of the drop-down list by assigning it to the Category role, this data item is in our data source.
Then we are going to assign our newly created parameters, Period1Parameter and Period2Parameter to the Parameter role. This will allow us to use the value selected in our calculations.
Step 3: Create the dependent data items, i.e. the calculated data items and aggregated measure
Now we are free to use our parameters as we like. In this example, I am prompting the report viewer for two values: Period 1 and Period 2 which are the two periods the user would like compared in this report. So, we will need to create two calculated data times from a single column in our source data. Since we want to display these as columns next to each other in a crosstab object and use them for an aggregated measure, this technique can be used.
Calculated Data Item: Period 1 Expenses
From the Data Pane, use the drop-down menu and select New Calculated Item…. Then use the editor to create this expression: If the Date by Month for this data row equals the parameter value selected for Period 1, then return the Expenses; else return 0.
Calculated Data Item: Period 2 Expenses
Repeat this using the Period2Parameter in the expression.
Aggregated Measure: Period Difference
Next, we want to calculate the difference between the two user selected Period Expenses. To do this, we will need to create an aggregated measure which will evaluate based on the report object’s role assignments. In other words, it will be calculated “on-the-fly” based on the visualization.
Similar to the calculated data items, use the Data Pane and from the drop-down menu select New Aggregated Measure…. Use the editor to create this expression. Notice that we are using our newly created calculated data items we defined using the parameter values. This expression does not use the parameter value directly, but indirectly through the calculated data item.
Step 4: Add the remaining report objects to the canvas and assign roles
No that we have:
- our Control Objects to capture the user input.,
- the Parameters to store the values.,
- and the Calculated Data Items and Aggregated Measure created…
we can add our report objects to the canvas and assign our roles.
You can see I used all three new measures in the crosstab object. I used the aggregated measure in the bar chart and treemap but notice the different aggregation levels. There is even a hierarchy assigned to the treemap category role. This Period Difference aggregated measure calculation is done dynamically and will evaluate for each visualization with its unique role assignments, even while navigating up and down the hierarchy.
Here are some additional screenshots of different period selections.
In this first screenshot you can see the parallel period comparison between December 2010 and 2011.
In these next two screenshots, we are looking at the Thanksgiving Black Friday month of November. We are comparing the two years 2010 and 2011 again. Here we see that the Board Product from the Game Product Line is bright blue indicating an increase in magnitude of Expenses in the most recent period, Nov2011.
By double clicking on Board in the treemap, we are taken to the next level of the hierarchy, Product Description, where we see a the largest magnitude of Expenses is coming from Backgammon and Bob Board Games.
In these final two screenshots we are comparing consecutive periods, November 2011 with December 2011. We can see from the bar chart easily the Region and Product Line where there is the greatest increase in Expenses.
I’ve configured a brush interaction between all three visualizations so that when I select the tallest bar it will highlight the corresponding data values in the crosstab and treemap.
Now you can use date parameters in your Visual Analytics Reports. There are several applications of this feature and this is only one way you can use parameters to drive business intelligence. Using this technique to create columns based on a user selected value is great when you need to compare values when your source data isn’t structured in this manner.
Using Date Parameters in your SAS Visual Analytics Reports was published on SAS Users.
In my first four posts in the education analytics blog series, we learned how education customers are using SAS, the positive impact for their users and institution, and some of their best practices. In talking to customers for this series, one of the many things we've learned is that they [...]
In this education analytics series of blog posts, we have been on a journey to learn how education customers are turning their data into insights to be a more data-informed and analytical organization. So far on our journey, we learned how education customers are using SAS, the positive impact that SAS [...]
For our third installment in this blog post series, let’s continue our journey to learn more from three SAS education customers. Today, we'll hear about the benefits their users and institutions have received by using SAS for data visualization and analytics. In this post, you'll hear from: Linda Sullivan, Assistant Vice President [...]