SAS Visual Analytics

9月 302022
 

Recently I was on an email thread where someone asked how to do a swimmer plot in SAS Visual Analytics. People replied with other ways using SAS code. Though there is not a standard swimmer plot in VA, I thought it might be possible to create one with a custom graph. So I decided to give it a try.

I reviewed some related materials about the swimmer plot and discovered a useful blog post by my colleague Sanjay Matange. His post provides SAS code to generate the data used by the swimmer plot as well, making things simple. My next step was to create a swimmer plot template in SAS Graph Builder and draw the plot in SAS Visual Analytics.

Create swimmer plot template

This will be done in SAS Graph Builder, I will create a custom graph template named ‘Swimmer Plot’.

The composition of the swimmer plot template

To make the Swimmer plot template, I use two schedule charts and four scatter plots, as shown below.

The template is made up with following charts:

a) Schedule Chart 1 will draw the High/Low bar representing the duration of each subject. It also needs to indicate the type of disease stage – Stage 1, 2, 3, 4.

b) Schedule Chart 2 will draw the Start/End line representing the duration of each response of each subject. It also needs to indicate the type of response - Complete or Partial.

c) Scatter Plot 1 will be used for the Start event, and Scatter Plot 2 for the End event.

d) Scatter Plot 3 will be used to indicate the Durable responder, and Scatter Plot 4 will show if the response is a Continued response.

Creating the swimmer plot template

In SAS Graph Builder, drag the above plots one by one to the work area. And then perform the settings as listed for each object in the ‘Option’ pane.

Next, we need to define roles for these plots in the ‘Roles’ pane.

1 - In the ‘Shared Roles’ section, click the toolbox icon next to the role ‘Shared Role 1’. Edit the role to update the Role Name to ‘Item’ and click OK button.

2 - In the ‘Schedule Chart 1’ section:

a) Click the toolbox icon next to the role ‘Schedule Chart 1 Start’. Edit the role to update the Role Name to ‘Low’ and click OK button.

b) Click the toolbox icon next to the role ‘Schedule Chart 1 Finish’. Edit the role to update the Role Name to ‘High’ and click OK button.

c) Add role by clicking the ‘+ Add Role’ link, update the Role Name to ‘Stage’, leave the Role Type as ‘Group’, check the ‘Required’ checkbox, and click OK button.

3 - In the ‘Schedule Chart 2’ section:

a) Click the toolbox icon next to the role ‘Schedule Chart 2 Start’. Edit the role to update the Role Name to ‘Start’ and click OK button.

b) Click the toolbox icon next to the role ‘Schedule Chart 2 Finish’. Edit the role to update the Role Name to ‘Endline’ and click OK button.

c) Add role by clicking the ‘+ Add Role’ link, update the Role Name to ‘Status’, leave the Role Type as ‘Group’, check the ‘Required’ checkbox, and click OK button.

4 - In the ‘Scatter Plot 1’ section, click the toolbox icon next to ‘Scatter Plot 1 X’, and select ‘Create Shared Role with Another Role’ > ‘Start’. Update the Role Name to ‘Start’ and click OK button.

5 - In the ‘Scatter Plot 2’ section, click the toolbox icon next to ‘Scatter Plot 2 X’, and select ‘Edit Role’, update the Role Name to ‘End’, and click OK button.

6 - In the ‘Scatter Plot 3’ section, click the toolbox icon next to ‘Scatter Plot 3 X’, and select ‘Edit Role’, update the Role Name to ‘Durable’, and click OK button.

7 - In the ‘Scatter Plot 4’ section, click the toolbox icon next to ‘Scatter Plot 4 X’, and select ‘Edit Role’, update the Role Name to ‘Continued’, and click OK button.

Now I am done with the creating the template. Save it as ‘Swimmer Plot’ in ‘My Folder’.

Prepare the data for the swimmer plot

I generated the data set from the Swimmer plot codes by Sanjay and updated the missing values in the ‘Stage’ column. This will avoid the missing value shown in VA. I put the generated CSV file here. Next, I need to prepare the data so it can be directly used to draw the swimmer plot in VA.

1 - Change the Classification of the ‘item’ column, from Measure to Category as shown below.

2 - Create a custom sort for the ‘item’ column. RMB the ‘item’ column in the ‘Data’ Pane and select ‘Custom sort…’ from the menu. In the ‘Add Custom Sort’ pop-up page, click the ‘Add all’ icon to have all the items sorted as below.

3 - Create a calculated item named ‘Continued’ as shown below, its expression is IF ( 'highcap'n NotMissing ) RETURN ( 'high'n + 0.2 ) ELSE ..

That’s all for the data preparation.

Create the swimmer plot in VA

We will first import the ‘Swimmer Plot’ template. In SAS Visual Analytics, go to the ‘Object’ pane, and click the toolbox icon. Select the ‘Import custom graph…’ from the pop-up menu and choose the ‘Swimmer Plot’ in the open dialog. Click OK button to import the graph template we just created. Now the ‘Swimmer Plot’ will be listed in the ‘Graph’ section in the ‘Object’ pane.

Next, drag the ‘Swimmer Plot’ object to canvas, and assign the corresponding data columns to the roles, SAS Visual Analytics will render the Swimmer Plot. To show more legends for the marks in the plot, I use an Image object. I put the ‘Swimmer Plot’ and the legend image in a Precision container. Now we see the chart as shown below.

Summary

With SAS Graph builder, we create the swimmer plot template using two schedule charts and four scatter plots. After importing the template in SAS Visual Analytics, we can create the swimmer plot easily by assigning corresponding data roles.

How to draw a swimmer plot in SAS Visual Analytics was published on SAS Users.

8月 112022
 

If you haven't explored data with Automatic Actions enabled yet, then read on to learn about some fast and easy ways to gain insights using One-way filters, Two-way filters or Linked Selection.



Part 1 – Easy exploration for multiple data sources

In this post, we focus on One-way filters.

Here are the benefits and behaviors of using Automatic Actions: One-way filters:

  • Applied at the Page level. This means all objects on the page will be filtered if possible.
  • Same data source objects are filter automatically, different data source objects require column mappings to be defined under the global Map Data.
  • Filters are combined with the AND operator and applied in the order in which they are selected.

We will also cover some Pro Tips for when using Automatic Actions:

  • For Control Objects:
    • If you want to filter multiple data sources, use the columns from the Global Mapped Data definition for the Role assignment.
    • Use the mapped column with the highest cardinality.
  • Use the original column in visualizations and not the Custom Category used for mapping data.
  • Use Object Titles to label the data source.

Part 2 - Dive into multidimensional or hierarchical data easily

In this post, we cover Automatic Actions: Two-way filters.

The term “two-way” sounds like you get only two opportunities to select a value, but this is not the case. The first value selected will automatically filter all of the objects with either the same data source or mapped data sources on the page. Then all subsequent value selections will filter all objects on the page including the first source object with an AND condition.

This type of compound filter is especially beneficial for exploring multidimensional or hierarchical data easily. This is because the objects can display different levels of hierarchy and the filtering will simultaneously be applied and you can continue to dive into the hierarchy or dimension.

Part 3 - Find related data fast

In Part 3 of this series, I cover the Automatic Action: Linked Selection. The Linked Selection Action highlights, or “selects”, the same data simultaneously for all of the objects on the page that support manual linked selections.

The key to using Linked Selection is using related data and data items, but not the exact same Role assignments. Let SAS Visual Analytics do all the work to simultaneously highlight the selected data points. It’s important for report viewers to understand that the selected/highlighted data may represent the entire value or part of the value of the selected row, cell, data point, bubble, tile, etc.

Here are some of the key tips I will cover for Linked Selections:

  • Automatic Actions are applied at the Page level in the report.
  • All objects which support actions will be automatically included in the selected Automatic Action.
  • Map Data for different data sources. Include all of the columns you want to be linked.
  • Create Custom Categories to assist with Mapping Data for different data sources.
  • Hold the Ctrl key to select multiple values.
  • Use color meaningfully. Since objects are not filtered but values are selected/highlighted it is important to coordinate colors for data values with purpose.

Demonstration

Watch this video tutorial as I walk through an example of each Automatic Action type.

View more of my Communities posts.

SAS Visual Analytics: Automatic Actions: a three-part series was published on SAS Users.

6月 152022
 

Have you ever heard the phrase “beggars can’t be choosers”? Basically, it means that if you ask for something, be grateful for what you get, especially if you don’t have the means to acquire it yourself. This phrase can be widely applicable to most areas of our lives, but when it comes to preparing data in SAS Viya, beggars can be choosers.

SAS Viya has many tools for preparing your data: you can write code in SAS Studio, create plans in SAS Data Studio, or prepare data within reports in SAS Visual Analytics. Because each of these tools has its own features and functions, you may find yourself wondering, which tool should I use?

The answer, like the answer to many questions, is it depends. What are you trying to do? What skills do you have? How quickly does the data need to be ready to use? How many people will be using the data and how?

The tool you choose will boil down to two factors: (1) simplicity and (2) control. For example, let’s say you need to create a calculated column quickly. The easiest way to create this column is within SAS Visual Analytics. You can create the new column without having to switch tools. SAS Visual Analytics is great for preparing data with little effort (no code needed) and limited control (limited options available).

On the other hand, let’s say you need to perform a more complex action like transposing or appending data sets, or standardizing values. SAS Visual Analytics, while easy to use, doesn’t have the features available for this level of data preparation. Instead, you can create a plan in SAS Data Studio to prepare your data. SAS Data Studio is great for preparing data with moderate effort (little to no code needed) and moderate control (more options available).

If you want to have full control, you can write code in SAS Studio to prepare your data. SAS Studio is great for preparing your data with more effort (may require some coding) and full control (unlimited options).

So, you can really think about it as a tradeoff between simplicity and control.  To perform more complex data preparation tasks, you need more control.

I like to think about it in the context of drawing. Let’s say I want to draw an elephant. Before I start to draw, I need to think about how much time I have available (simplicity) and the level of detail I want to include in my drawing (control).

If I’m playing Pictionary, a charades-inspired word-guessing game where you draw a picture so your teammate can identify a specific word, I want to draw the simplest picture possible so my teammate can guess the word as quickly as possible. I don’t need to create the best-looking picture in the world; I just need my teammate to guess correctly in the shortest amount of time. SAS Visual Analytics is the Pictionary version of my elephant.

On the other hand, if I want to sketch a cute picture of an elephant for my niece’s birthday card, I may want to spend a bit more time on the drawing and add some additional details like eyes and a mouth. SAS Data Studio is the birthday card sketch version of my elephant.

If I’m interested in creating a real-life approximation of an elephant, however, I may want to spend a lot more time and add a lot more detail like shading. SAS Studio is the real-life drawing version of my elephant.

Note: The SAS Visual Analytics version took me about 5 seconds to draw, the SAS Data Studio version took me about 5 minutes, and the SAS Studio version took me an hour.

Now you’re probably wondering, when do I use each tool?

SAS Visual Analytics

Use SAS Visual Analytics in the following scenarios:

  • Modifying properties: modify properties of data items (names, formats, classifications, aggregations).
  • Filters: create subsets of the data for the report (data source filters) or specific objects (basic filters, advanced filters, post-aggregate filters, common filters). Note: Post-aggregate filters, basic filters, and common filters can only be created in SAS Visual Analytics.
  • Calculated items: create a simple calculation, like number of days to delivery (calculated item) or percent of total (aggregated measure) or custom groupings (custom categories). Note: Due to their dynamic nature, aggregated measures can only be created in SAS Visual Analytics.
  • Hierarchies and geography data. Note: Hierarchies and geography data items can only be created in SAS Visual Analytics.
  • A simple join between two tables based on equality condition (for example, T1.Date = T2.OrderDate).
  • Mapping data sources: select a value in an object that uses Table A and filter the associated values in an object that uses Table B (mapping) and the common column between Table A and Table B is used by both objects. Note: Mapping data sources can only be performed in SAS Visual Analytics.
  • Aggregated data sources: create an aggregated version of the table.
  • Data views for ad-hoc analysis: changes made to data in SAS Visual Analytics are available only for the specific report. To apply data changes to other reports that use the same data source, you can create and share a data view, but you cannot use that prepared data outside of SAS Visual Analytics.

SAS Data Studio

Use SAS Data Studio in the following scenarios:

  • Improve data quality with casing, parsing, extraction, identification analysis, generating matchcodes, standardizing, or matching and clustering.
  • To discover and address unknown problems in your data using the Suggestions.
  • Append tables or transpose.
  • Create a join between two or more tables (maximum is 32) based on equality condition (for example, T1.Date = T2.OrderDate) or when the common column between the data sources is calculated (for example, datepart(T1.Date) = T2.OrderDate).
  • Write custom DATA step or CASL code to perform any required data preparation action.
  • To create tables used for many reports, by many analysts, and in different Viya tools. Note: Jobs can be created from plans and scheduled to run at specific times to ensure the data is up-to-date.

Note: You can also modify the properties of data items, create subsets of the data, and create calculated items in SAS Data Studio.

SAS Studio

Use SAS Studio in the following scenarios:

  • Programming with SAS Data Quality: improve data quality programmatically with casing, parsing, extraction, identification analysis, generating matchcodes, standardizing, or matching and clustering.
  • Create a non-equi join (for example T1.Date > T2.EndDate) between two or more tables or when the common column between the data sources is calculated (for example, datepart(T1.Date) = T2.OrderDate).

Note: You can use the SQL procedure or CASL code to perform the join.

Note: The DATA step can be used to combine tables where the name of the column is the same in each table.

  • Programming with the DATA step: perform complex conditional logic, use arrays, or process data iteratively using the DATA step.
  • Programming with SQL: perform any action that requires the use of SQL code (creating subqueries, using set operators, inserting rows into tables, updating values in a table, using DICTIONARY tables, and more).
  • Using SAS procedures: use any SAS procedure not associated with a transform in SAS Data Studio (for example, FORMAT, REPORT, FREQ, UNIVARIATE, and more).
  • Programming with macros: generate dynamic code based on data or user input using the SAS macro language.
  • Create tables used for many reports, by many analysts, and in different Viya tools.

Note: Jobs can be created from SAS code and scheduled to run at specific times to ensure the data is up-to-date.

Note: You can also modify properties of data items, create subsets of the data, create calculated items, create custom groupings, create aggregated tables, append and transpose data, and write custom DATA step or CASL code in SAS Studio.

Summary

The following table summarizes the data preparation actions you can perform in each tool.

Action SAS Visual Analytics SAS Data Studio SAS Studio
Modify properties of data items (names, formats, classifications, aggregations)
 
Filter the data
Filter specific objects in a report
 
Create simple calculations
Create custom groupings
Create dynamic aggregated measures
Create hierarchies
Create geography data items
 
Map data sets to apply interactions between two objects that use different data sources
Perform simple joins between two tables based on equality condition
Perform joins between more than two tables (up to 32) based on equality condition
Perform joins when the common column needs to be calculated
Perform non-equi joins
 
Create an aggregated table
Append tables
Transpose tables
 
Improve data quality with transforms
Improve data quality programmatically
Determine data issues using Suggestions
 
Write DATA step code
Write CASL code
Write SQL code
Use SAS procedures
Use SAS macro language
 
Perform ad-hoc analysis
Create tables used for many reports
Schedule actions to run on a repeatable basis

 

For more information about how to use SAS Visual Analytics to prepare your data and create advanced reports, check out my book Interactive Reports in SAS Visual Analytics: Advanced Features and Customization.

Beggars can’t be choosers…Unless they’re using SAS Viya was published on SAS Users.

5月 172022
 

Here at SAS, we understand the importance of having access to cutting-edge professional resources. That’s why, for more than 40 years, we’ve provided individuals in programming, data management and analytics fields with low-cost and no-cost materials that promote success in their educational and professional journeys. And today, as the demand for employees with advanced skill sets and global certifications grows, we get it – having the ability to easily access the tools you need to succeed is more important than ever.

We’ve got you covered.

As part of our ongoing commitment to helping individuals enhance their skills, further their careers and increase their chance of success in the field, we’re now offering SAS Certification Practice Exams for free. Yes, free.

Over the years, candidates who have taken advantage of our practice exams have found them to be a valuable, effective tool for gauging their preparedness for SAS Certification Exams. When combined with other SAS training resources – like webinars, content guides, training courses and web tutorials – these free exams greatly increase candidates’ chances of success. Exams are currently available in:

    • Programming
    • Advanced Analytics
    • Data Management
    • Visual Analytics
    • Administration

Representative of the live exams, our online practice exams go through the same rigorous development process and are designed to give candidates an idea of what they should expect in the actual test questions. SAS practice exams also provide the rationale behind correct and incorrect answers, giving participants even more insight and opportunity for exam success.

And the numbers speak for themselves.*

    • Those who passed the practice exam had a 17.5% higher pass rate on the live exam than those who did not take or pass the practice version.
    • Those who took the practice exam – regardless of score – had an 8.15% higher pass rate on the live exam than those who opted not to take it.

Not to mention taking a practice test just might make the difference between passing the SAS Certification Exam on the first try or having to retake it. We’ve found that many who don’t pass the Certification Exam miss the mark by only a few questions, which they could have avoided with a bit more preparation.

Keep in mind, in order to make the best use of this resource, we recommend taking a practice exam as your final method of study in order to test your preparedness before diving into the actual exam. If you are curious about the types of questions typically on the certification exams, we have sample questions available for you to review.

So, what are you waiting for? A free resource to prepare you for a globally-recognized certification and make your resume stand out from the rest? That’s a no-brainer.

For more information about free SAS Certification Practice Exams, visit: https://www.sas.com/en_us/certification/resources/sas-practice-exams.html

*Based on data from 20,000 practice exams taken since 2020.

Access for success: SAS Certification practice exams now offered for free was published on SAS Users.

4月 052022
 

Have you ever heard something referred to as the bee’s knees? Most likely the person uttering that expression meant that it was truly amazing and extraordinary. Maybe you stopped and pondered the origin of the phrase. Well wonder no more! In the 1920s in the United States, people were obsessed with rhyming and anthropomorphism (giving human traits or attributes to animals). This era saw the introduction of multiple animal-centric idioms: the ant’s pants, the cat’s pajamas, and (you guessed it) the bee’s knees. Most likely this idiom refers to the pollen basket on the lower section of a worker bee’s legs (where the tibia and the tarsi connect). When bees are out foraging, they carry all the pollen they find in the pollen baskets on their legs. If you ever see a bee with lots of pollen on her legs, you know she’s been working hard… one might even say she’s an amazing and extraordinary worker!

SAS Visual Analytics has a lot of features that you might call the bee’s knees, but one of the most amazing and extraordinary features is the AggregateTable operator. This operator, introduced in SAS Visual Analytics 8.3, enables you to perform aggregations on data crossings that are independent of (or changed from) the data in your objects. This means you can use this operator to compare aggregations for different groups of data in one single object.

To illustrate the AggregateTable operator in action (and to keep with the theme), let’s consider an example.

I’m a hobby beekeeper in Texas. This means that I own and maintain a few hives in my backyard from which I collect honey and wax to sell to family and friends. I’m interested in learning about honey production in the United States for different years. I’m pretty sure I’m not the biggest honey producer in my state (or even my county), but I want to look at different crossings of total production (by state and year, by county and year, and by state).

The United States Department of Agriculture’s National Agricultural Statistics Service has Census data on honey production (measured in pounds) for all counties in the United States for 2002, 2007, 2012, and 2017.

Type of Aggregation: Add

To start, I would like to calculate the total production for each state (Maximum by State) and display the total production for the state that produced the most honey in that year. For example, in 2002 California produced the most honey of any state (15,585,925 pounds) and in 2017 North Dakota produced the most honey of any state (24,296,437 pounds).

Because the table contains details by county and I’m interested in the total production by state, I will either need to create an aggregated data source that contains details by state, or I will need to use the AggregateTable operator. Since this post is about the wonders of the AggregateTable operator, let’s focus on that.

The AggregateTable operator requires five parameters:

Parameter Description
Aggregation- Aggregated The aggregation applied to the aggregated item when it is used in an object that displays fewer group-by crossings than the table in the expression.
Aggregation- Measure The aggregation applied to the measure in the inner table context.
Type of aggregation The type of aggregation that is performed. Values are Fixed, Add, or Remove.
Categories The list of categories used to alter the data crossing for the aggregation.
Measure The measure that is aggregated. A Table operator can be added as the measure to perform a nested aggregation.

It also has a nested operator, Table, that creates an intermediate table defined by the Type of aggregation, Categories, Measure, and Aggregation- Measure parameters.

For this example, I want to use a Key value object to see the total production values for the state that produced the most honey in each year. The object will contain one group-by crossing (or category): Year. The calculation, however, will need to use two group-by crossings to determine the highest producing state for each year: Year and State. Therefore, the Aggregation-Measure is _Sum_ (to calculate the total production by state), the Type of aggregation is Add (because I want to add a crossing for State to the calculation), Categories is set to State, and Measure is Production (lb).

The intermediate table will contain one row for each state and year and contain total production values.

Then, for each year, I want the highest production value (for 2002, 15,585,925 pounds). Therefore, the Aggregation- Aggregated parameter should be _Max_ to grab the maximum values for each year from the intermediate table.

Then, I can display the Maximum by State with Year in a Key value object.

Note: Beginning in SAS Visual Analytics 2021.1.1 (May 2021), a new role is available for the Key value object, Lattice category. This enables you to display a key value object for each distinct value of a category data item (in this example, Year).

Now that I have a data item that contains the production amount for the highest producing state for each year, I can create some more complex calculations, like the percentage of total production for each year by the state that had the highest production. This will enable me to see if the highest producing state is doing all the heavy lifting or if all the states are producing a near-equal amount.

Type of Aggregation: Remove

The Type of aggregation parameter also enables you to remove categories from the calculation. For this example, suppose I want to compare the production in each county to the production from the highest producing county in that state (Maximum by County). I want to use a list table to compare these values.

The object will contain three group-by crossings: Year, State, and County. The calculation, however, will only use two group-by crossings to determine the highest producing county in each state for each year: Year and State. Therefore, the Aggregation-Measure is _Max_ (to calculate the maximum production in each state), the Type of aggregation is Remove (because I want to remove a crossing for County from the calculation), Categories is set to County, and Measure is Production (lb).

The intermediate table will contain one row for each state and year and contain the production values for the county in that state with the highest production. Notice that for this table, the aggregation for Production was set to Maximum to show the maximum production for each state.

Because the number of groupings in the object (County, Year, and State) is not fewer than the number of groupings in the calculation (Year and State), the Aggregation-Aggregated parameter is not applied and can be set to any value.

Then, I can display Maximum by County in a list table with total production by county to easily compare each county’s production with the production of the county in that state that had the highest production.

Now, I can calculate each county’s difference from the county with the highest production in that state for each year.

Type of Aggregation: Fixed

The Type of aggregation parameter also enables you to set a fixed category for the calculation. For this example, suppose I want to compare the production in each county to the total production across all years by state (Total by State). I want to use a list table to compare these values.

The object will contain three group-by crossings: Year, State, and County. The calculation, however, will only use one group-by crossing to determine the total production by state across all years: State. Therefore, the Aggregation-Measure is _Sum_ (to calculate the total production by state across all years), the Type of aggregation is Fixed (because I want to fix the crossing to State for the calculation), Categories is set to State and Measure is Production (lb).

The intermediate table will contain one row for each state and total production values across all years.

Because the number of groupings in the object (County, Year, and State) is not fewer than the number of groupings in the calculation (State), the Aggregation-Aggregated parameter is not applied and can be set to any value.

Then, I can display Total by State in a list table with total production by county to easily compare each county’s production with total production in the state across all years.

I can even compare total state values for each year with total production in that state across all years.

Then, I can calculate the share of total production produced each year.

For more information about how to create advanced data items and filters for your SAS Visual Analytics reports, including examples that use different types of operators, check out my book Interactive Reports in SAS Visual Analytics: Advanced Features and Customization.

Bees are important contributors in pollinating most of the food that you eat and are in desperate need of your help! There are many ways you can help the honeybee population thrive:

  • Become a beekeeper
  • Plant a garden for all pollinators (including bumblebees, butterflies, bats, and moths)
  • Reduce or eliminate pesticide use
  • Support your local beekeepers by buying local honey
  • Contact a beekeeping group if you see a swarm
  • Volunteer as a citizen data scientist by taking pictures and posting information on the iNaturalist app
  • Create a bee bath for those hot summer days
  • Build homes for native bees

 

The AggregateTable operator… It’s the bee’s knees! was published on SAS Users.

2月 012022
 

When we moved out to the country with our two dogs, our oldest dog Todd suddenly decided he liked to howl…. And he would do so every time we left the house. Maybe it was the country air? Maybe it was a time-lapse gene? Maybe he just wanted to learn something new?

If you’ve been using SAS Visual Analytics for a while, it’s possible you might have never created reports that linked to external URLs. SAS Visual Analytics can do so much on its own, perhaps you never thought about extending its functionality outside the product itself! Well, it’s time to learn a new trick.

To illustrate how this can be done (and to keep with the theme), let’s consider an example.

I’m interested in adding a new member to my family (a dog!), and I know I would like to adopt an animal in need. I’m not sure, however, which breed will suit my lifestyle. I need a dog that’s playful and sweet, but one that also likes to sleep late.

I have a report in SAS Visual Analytics that shows details about animals surrendered at an Austin animal shelter. I’d like to select various characteristics (like animal type, sex, whether the dog is spayed or neutered, and condition) and see what breeds they have available. Then, I’d like to see additional details about each breed at the American Kennel Club website (www.akc.org). On this website, you can find information about various breeds of dogs (and cats!), including average sizes, life expectancy, personality, and many other traits.

I’ll add an interactive link to the report, so when a user selects a specific breed, the page for that breed appears. The interactive link will use parameters to pass a selected value from the report to the web page.

To create interactive links, I like to follow four simple steps:

  1. Research the structure of the URL
  2. Use a hardcoded value to test the link
  3. Parameterize the link
  4. Test the parameterized value

Step 1: Research

Before adding interactive links to a report, you need to understand how the target web page structures the URL. I typically do this by accessing the target web page and searching for a specific subject. For some websites, you might need to view the Developer Guide for the website to fully understand the structure.

Typically, URLs are constructed in one of three ways:

  • Path: In these URLs, the subject is added at the end of the URL. For example, to view a country page on Wikipedia, you use the following URL: https://en.wikipedia.org/wiki/country where country is the full name of the country of interest.
  • Query: In these URLs, the subject is assigned to the value of a URL parameter using a sequence of attribute-value pairs: ?parameter1=value1&parameter2=value2. Multiple parameters can be assigned by separating the attribute-value pairs with an ampersand (&). For example, to search Etsy for a specific type of item, you use the following URL: https://www.etsy.com/search?q=item where item is the specific search string.
  • File: In these URLs, the subject is a part of a file name at the end of the URL. For example, to view a country profile on CIA Factbook, you use the following URL: https://www.cia.gov/library/publications/the-world-factbook/geos/country-code.html where country-code is the 2-letter abbreviation of the country of interest.

To start, I’ll select one of the breeds in the list: Australian Cattle Dog (my current dog’s breed). The American Kennel Club website has a drop-down selector at the top of the page where you can select a breed.

Australian cattle dogs are alert, curious, and pleasant. Tell me about it! He won’t let a leaf fly by outside without raising the alarm.

The URL is constructed with the breed as part of the path: https://www.akc.org/dog-breeds/australian-cattle-dog/. Notice that for breeds with multiple words (like Australian Cattle Dog), the link uses hyphens (-) instead of spaces.

Step 2: Hardcode

Now that you understand the structure of the URL, you can test the link using various hardcoded values. For example, to view details about dachshunds, go to https://www.akc.org/dog-breeds/dachshund/. Dachshunds are friendly, curious, and spunky. They must be! Why else would they be chosen to star in dog races at Oktoberfest celebrations around the world?

Step 3: Parameterize

After you have tested the URL using hardcoded values, you need to replace the hardcoded value with parameters. These are values that will be passed from your report to the external URL to make the links interactive. For the report, I’ll add the link to the word cloud and replace the hardcoded breed with the breed I select in the report.

Because the URL replaces spaces with hyphens, I have created a calculated item in SAS Visual Analytics that has breeds with multiple words separated by slashes instead of spaces, Breed (ForLink).

Because I want to pass Breed (ForLink) from the word cloud to the URL, I need to add it to one of the roles for the word cloud. I don’t want the breeds to appear with hyphens in the word cloud, so I’ll add the calculated item to the Hidden role.

Tip: Data items assigned to the Hidden role are available for color-mapped display rules, external links, and mapping data sources and should only be assigned if it will not increase the number of rows in the query. In this example, the word cloud shows details about breeds. Adding Breed (ForLink) to the Hidden role makes the value available for the external link and does not increase the number of rows in the query.

Then, to add the link:

  1. With the word cloud selected, click Actions in the right pane and expand URL Links.
  2. Click New URL Link.
  3. Specify a descriptive name for the link.
  4. For the URL, enter the URL up to, but not including, the breed (https://www.akc.org/dog-breeds/); this value will be passed from the selected breed in the word cloud.
  5. Next to Parameters, click the Add icon.
  6. For the Source field, select Breed (ForLink) and leave the Target value blank. Adding nothing to the Target field indicates that the value of Breed (ForLink) should be appended to the end of the URL.

When a viewer selects a breed in the word cloud, the breed value will be appended to the end of the URL and details for that breed will be displayed.

Step 4: Test

After the interactive link has been created, you need to ensure that the link works by testing it in the report.

I’ll select both Cat and Dog as the type of animal, Male for sex, Yes for spayed or neutered, and Aged for condition. There are 143 animals in the Austin animal shelter that meet these criteria. I’m thinking a Labrador retriever might be good for my family, so I’ll double-click Labrador Retriever in the word cloud to see the traits and characteristics for that breed. It looks like Labrador retrievers are friendly, active, and outgoing, and they are also highly adaptable (meaning I can train them to sleep late). It sounds like a perfect fit!

For more information about how to add interactive links to your SAS Visual Analytics reports, including examples on creating links with different URL structures, check out my book Interactive Reports in SAS Visual Analytics: Advanced Features and Customization.

You can’t teach an old dog new tricks… or can you? was published on SAS Users.

2月 012022
 

When we moved out to the country with our two dogs, our oldest dog Todd suddenly decided he liked to howl…. And he would do so every time we left the house. Maybe it was the country air? Maybe it was a time-lapse gene? Maybe he just wanted to learn something new?

If you’ve been using SAS Visual Analytics for a while, it’s possible you might have never created reports that linked to external URLs. SAS Visual Analytics can do so much on its own, perhaps you never thought about extending its functionality outside the product itself! Well, it’s time to learn a new trick.

To illustrate how this can be done (and to keep with the theme), let’s consider an example.

I’m interested in adding a new member to my family (a dog!), and I know I would like to adopt an animal in need. I’m not sure, however, which breed will suit my lifestyle. I need a dog that’s playful and sweet, but one that also likes to sleep late.

I have a report in SAS Visual Analytics that shows details about animals surrendered at an Austin animal shelter. I’d like to select various characteristics (like animal type, sex, whether the dog is spayed or neutered, and condition) and see what breeds they have available. Then, I’d like to see additional details about each breed at the American Kennel Club website (www.akc.org). On this website, you can find information about various breeds of dogs (and cats!), including average sizes, life expectancy, personality, and many other traits.

I’ll add an interactive link to the report, so when a user selects a specific breed, the page for that breed appears. The interactive link will use parameters to pass a selected value from the report to the web page.

To create interactive links, I like to follow four simple steps:

  1. Research the structure of the URL
  2. Use a hardcoded value to test the link
  3. Parameterize the link
  4. Test the parameterized value

Step 1: Research

Before adding interactive links to a report, you need to understand how the target web page structures the URL. I typically do this by accessing the target web page and searching for a specific subject. For some websites, you might need to view the Developer Guide for the website to fully understand the structure.

Typically, URLs are constructed in one of three ways:

  • Path: In these URLs, the subject is added at the end of the URL. For example, to view a country page on Wikipedia, you use the following URL: https://en.wikipedia.org/wiki/country where country is the full name of the country of interest.
  • Query: In these URLs, the subject is assigned to the value of a URL parameter using a sequence of attribute-value pairs: ?parameter1=value1&parameter2=value2. Multiple parameters can be assigned by separating the attribute-value pairs with an ampersand (&). For example, to search Etsy for a specific type of item, you use the following URL: https://www.etsy.com/search?q=item where item is the specific search string.
  • File: In these URLs, the subject is a part of a file name at the end of the URL. For example, to view a country profile on CIA Factbook, you use the following URL: https://www.cia.gov/library/publications/the-world-factbook/geos/country-code.html where country-code is the 2-letter abbreviation of the country of interest.

To start, I’ll select one of the breeds in the list: Australian Cattle Dog (my current dog’s breed). The American Kennel Club website has a drop-down selector at the top of the page where you can select a breed.

Australian cattle dogs are alert, curious, and pleasant. Tell me about it! He won’t let a leaf fly by outside without raising the alarm.

The URL is constructed with the breed as part of the path: https://www.akc.org/dog-breeds/australian-cattle-dog/. Notice that for breeds with multiple words (like Australian Cattle Dog), the link uses hyphens (-) instead of spaces.

Step 2: Hardcode

Now that you understand the structure of the URL, you can test the link using various hardcoded values. For example, to view details about dachshunds, go to https://www.akc.org/dog-breeds/dachshund/. Dachshunds are friendly, curious, and spunky. They must be! Why else would they be chosen to star in dog races at Oktoberfest celebrations around the world?

Step 3: Parameterize

After you have tested the URL using hardcoded values, you need to replace the hardcoded value with parameters. These are values that will be passed from your report to the external URL to make the links interactive. For the report, I’ll add the link to the word cloud and replace the hardcoded breed with the breed I select in the report.

Because the URL replaces spaces with hyphens, I have created a calculated item in SAS Visual Analytics that has breeds with multiple words separated by slashes instead of spaces, Breed (ForLink).

Because I want to pass Breed (ForLink) from the word cloud to the URL, I need to add it to one of the roles for the word cloud. I don’t want the breeds to appear with hyphens in the word cloud, so I’ll add the calculated item to the Hidden role.

Tip: Data items assigned to the Hidden role are available for color-mapped display rules, external links, and mapping data sources and should only be assigned if it will not increase the number of rows in the query. In this example, the word cloud shows details about breeds. Adding Breed (ForLink) to the Hidden role makes the value available for the external link and does not increase the number of rows in the query.

Then, to add the link:

  1. With the word cloud selected, click Actions in the right pane and expand URL Links.
  2. Click New URL Link.
  3. Specify a descriptive name for the link.
  4. For the URL, enter the URL up to, but not including, the breed (https://www.akc.org/dog-breeds/); this value will be passed from the selected breed in the word cloud.
  5. Next to Parameters, click the Add icon.
  6. For the Source field, select Breed (ForLink) and leave the Target value blank. Adding nothing to the Target field indicates that the value of Breed (ForLink) should be appended to the end of the URL.

When a viewer selects a breed in the word cloud, the breed value will be appended to the end of the URL and details for that breed will be displayed.

Step 4: Test

After the interactive link has been created, you need to ensure that the link works by testing it in the report.

I’ll select both Cat and Dog as the type of animal, Male for sex, Yes for spayed or neutered, and Aged for condition. There are 143 animals in the Austin animal shelter that meet these criteria. I’m thinking a Labrador retriever might be good for my family, so I’ll double-click Labrador Retriever in the word cloud to see the traits and characteristics for that breed. It looks like Labrador retrievers are friendly, active, and outgoing, and they are also highly adaptable (meaning I can train them to sleep late). It sounds like a perfect fit!

For more information about how to add interactive links to your SAS Visual Analytics reports, including examples on creating links with different URL structures, check out my book Interactive Reports in SAS Visual Analytics: Advanced Features and Customization.

You can’t teach an old dog new tricks… or can you? was published on SAS Users.

11月 102021
 

Time is a free resource to people yet is the most precious one. We all have 24 hours every day in our lives. We do not need to pay for getting these hours, and we do not have ways to pay for getting more than 24 hours a day. Have you ever noticed how you spend your time? Or how other people spend their time?

Certainly, there will be commonalities – for example, all people need time to sleep, to eat and many people need time to work and study. Also, for sure there are differences in how people divide their time for activities in each day. There might be some pattern of time use in different countries and different cultures. I am interested in exploring this, so I found some data from the web to explore.

What is a Time Use Survey?

Over the last 30 years, an increasing number of countries around the world have been conducting large-scale time use surveys. The Time Use Survey is designed to measure the amount of time people spend on various activities in their daily life, across a total duration of 24 hours (or 1,440 minutes). These activities, such as work, relaxing, and exercising, are classified into a set of descriptive categories, and the time on these activities are interviewed from some respondents. Then the data was recorded, calculated and edited.

I got the time use data from OECD (Organization for Economic Co-operation and Development) site, and the time use survey was conducted in more than 30 countries from 2009 to 2016. I also got the American Time Use Survey data for 2020 for my exploration. I am aware that the data quality might not be good enough for serious research, but that’s not a problem for me. I just want to explore it for fun, while practicing SAS Visual Analytics usage.

How do people around the world spend their time?

Download the excel file from OECD site, import it in SAS Visual Analytics. I will explore how people in different countries spend their time, how many minutes do they averagely spend on among the five categories (according to OECD, they put different activities into five categories).

We can easily draw a bar chart in SAS VA like below. Note: the downloaded OECD data has the time use data for American, but I eliminate it from this chart due to its total is 1,469 minutes (more than 24 hours a day). And that leads me to explore the American time use data separately.

See the green bars are the longest one among the five colored bars? They represent Personal Care. It seems people across these countries pay the most time in Personal Care. Unbelievable? Check the activities in the Personal Care category: sleeping, eating, dressing, and others personal care activities. All right, people sleep about 8 hours (480 mins) every day on average, that’s about 30% of a day. It makes sense that the Personal Care category occupies the most time (about 661 mins on average) in our daily life.

Now from another perspective, let’s see the top and bottom countries where people spend time on Personal Care, as well as on paid work/study. From below charts, I guess you won’t be surprised when seeing France sits on the top one country with most Personal Care time, and Japan sits on the top one country with most time on paid work/study, while Italy is the country where people spend least time on paid work/study.

Note in above charts, I use the same scale for the X axis intentionally. This is to make sure people get direct feel on the differences between the two categories, the ‘Paid work or study’ time on the right is less than half of the ‘Personal Care’ time on the left.

Furthermore, we can look at the distribution of these five categories across all these countries. Calculate the percentage for each major category using calculated items in VA and show them in a box plot. We see people on average spend about 46% of their time on Personal Care, about 20% on Leisure, and 19% on Paid work/study. The highest percentage in ‘Personal Care’ is about 52%, more than 12 hours every day. The least percentage is about 42%, that’s about 10 hours every day. Also notice that time on Personal Care, Leisure and Paid work/study are the top 3 categories that takes more than 85% time each day.

How do Americans spend their time?

As I mentioned, the American data from OECD is not ideal for me, so I downloaded the American Time Use Survey (ATUS) data, and using the 2020 data file for further exploration. The ATUS data was organized in different categories using different methodology than OECD data, so I must do data preparation in SAS Studio, and then explore in SAS Visual Analytics.

Prepare the data

The raw 2020 data file has 399 columns, and 8,782 rows. It contains data for the total number of minutes that each respondent spent on the 6-digit activity (per ATUS code rule). The column names with the letter "t" precedes the 6-digit codes, identify activities by their 6-digit codes. The first and second digits of the 6-digit code correspond to some tier1 code; the second and third digits correspond to some tier 2 code, etc. Each row corresponds to a unique respondent.

So my data preparation includes:

  • Classify the 6-digit activities to their corresponding tier1 codes, which comes to about 18 categories.
  • Calculate the means and 99% confidence interval for each of the 18 categories.
  • Transpose the dataset and merge the datasets. If you are interested in how I did this, you can get the code on GitHub..
  • The ATUS data set contains one column on Age, so I can make a custom category of age group in VA and divide the ages to three categories: less than 18, great than 65, and between 18 and 65. This will enable me to compare the ATUS data with the OECD data (whose ages are between 18 and 65).

Aggregate the data

ATUS contains detailed data from thousands of respondents with hundreds of columns. I need to aggregate the data for my exploration. Here are some tips when doing the aggregation for each of these hundreds of columns:

  1. The default aggregation for measure items in VA is Sum. We can easily change the aggregation in the data pane by clicking the ‘Edit properties’ icon and choosing other aggregation (I will use ‘Average’) in the ‘Aggregation’ dropdown list. But when I have hundreds of measure items in the ATUS data set, how can I quickly set the average aggregation for them instead of one-by-one? The tip is clicking on the first measure item, and scroll to the last item, press ‘shift’ when clicking the last item. This will select all these measure items. Right click the mouse, and from the pop-up menu, choose Aggregation > Average. This will set the aggregation to average for all the chosen items.
  2. I need a bunch of calculated items; each comprise lots of measure items. In SAS Visual Analytics, we can manually add each item in visual mode. But it’s too tedious to add so many measure items. The tip here is to write some SAS macro codes to generate the calculation expressions in text for me, then copy/paste the expression in text mode.

Explore the data

According to the ATUS code rule, ATUS uses different categories than OECD categories. To be able to compare the time used in major activity categories, I make the similar major activity categories like those from OECD, based on my personal understanding of the ATUS activities. Then with the bunch of calculated items, I get the time for these major activity categories. Due to methodology difference, be aware that this may lead the results to be partially inaccurate.

Now starts my ATUS exploration. Below charts show how people in American divide their daily time. The dataset has information on gender, so the bottom one shows the average percentage for Male and Female respectively.

When I put the percentage data (calculated for major activities categories) in a box plot, it has lot of outliers for each category. Considering different methodology and personal classification to the major activity categories (here is the OECD code), I see some difference than the OECD box plot. Note that the ranking for top two major activity categories are Personal Care and Leisure, the same pattern as in OECD data.

Identify the outliers

Notice those outliers in above box plot? I’d better explore more. In latest version, SAS Visual Analytics will automatically detect outliers in data items. This ‘Insights’ will list the data items in report objects that might be affected by outliers.

For example, in below screenshot, I made a histogram of the ‘Personal Care %’, which shows its distribution looks like normal. If I click the ‘Insight’ icon at top-right corner, VA will show all the data items that might be affected by outliers. If clicking the icon next to the ‘Personal Care %’ item at the bottom, a message will pop up saying that there are 243 outliers in this data item.

Create a custom graph

I saw lot of outliers in columns of ATUS data when exploring it, so I decide to use the mean value with confidence intervals. I created a custom graph with a scatter plot and a schedule chart. In SAS VA, assign the black dot in the custom graph to show the mean value and make the beginning and ending of each blue bar show the 99% confidence intervals.

Below is the top 10 ATUS activity categories (here are the ATUS tier 1 code categories) American people spend time on. We see the largest average time is the Personal Care, about 586 mins (nearly 10 hours) with 99% confidence intervals ranging from 583.5 to 588.4 mins.

That’s my initial exploration of Time Use Survey data, but much more can be done. For example, because ATUS data is collected on an ongoing, monthly basis, we can perform time-series analysis to identify changes in how people spend their time.

Would you like to give it a try? Visit the SAS Visual Analytics Gallery on the SAS Support Communities to see more ways you can use SAS Visual Analytics to explore data. Then sign up for a two-week free trial of SAS Visual Analytics.

EXPLORE NOW | SAS Visual Analytics Gallery
START FREE TRIAL | SAS Visual Analytics

How do people divide their time among daily activities? was published on SAS Users.

10月 192021
 

The social and economic impact of COVID-19 has dramatically affected supply chains and demand planning across all industries. Then there’s the Amazon effect, which has led to sky-high consumer expectations of the ordering and delivery process. Demand planners for retailers and consumer goods companies have quickly realized they have no [...]

What does it take to become an analytic-driven demand planning organization? was published on SAS Voices by Charlie Chase

8月 102021
 

Underperforming schools can cast a shadow over the community. Not only do students suffer, but the downstream effects of poor performance can be felt across school funding, teacher retention, morale and the local economy. When one of the top 20 K-12 school districts in the US received a “C” ranking [...]

Improving student outcomes though visual data exploration and AI was published on SAS Voices by Jennifer Griess