December 5, 2018
 

Recently a SAS programmer wanted to obtain a table of counts that was based on a histogram. I showed him how you can use the OUTHIST= option on the HISTOGRAM statement in PROC UNIVARIATE to obtain that information. For example, the following call to PROC UNIVARIATE creates a histogram for the MPG_City variable in the Sashelp.Cars data set. The histogram has 11 bins. The OUTHIST= option writes the counts for each bin to a SAS data set:

proc univariate data=Sashelp.Cars noprint;
   var MPG_City;
   histogram MPG_City / barlabel=count outhist=MidPtOut;
run;
 
proc print data=MidPtOut label;
   label _MIDPT_ = "Midpoint" _COUNT_="Frequency";
   var _MIDPT_ _COUNT_;
run;

Endpoints versus midpoints

As I've previously discussed, PROC UNIVARIATE supports two options for specifying the locations of bins. The MIDPOINTS option specifies that "nice" numbers (for example, multiples of 2, 5, or 10) are used for the midpoints of the bins; the ENDPOINTS option specifies that nice numbers are used for the endpoints of the bins. By default, midpoints are used, as shown in the previous section. The following call to PROC UNIVARIATE uses the ENDPOINTS option and writes the new bin counts to a data set. The histogram is not shown.

proc univariate data=Sashelp.Cars noprint;
   var MPG_City;
   histogram MPG_City / barlabel=count endpoints outhist=EndPtOut;
run;
 
proc print data=EndPtOut label;   /* LABEL option is required for the labels to display */
   label _MINPT_ = "Left Endpoint" _COUNT_="Frequency";
   var _MINPT_ _COUNT_;
run;
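
If you already know which endpoints you want, a variation on the previous call is to pass explicit values to the ENDPOINTS= option instead of letting the procedure choose them. The following is a minimal sketch, not part of the original example: the values 10 to 65 by 5 and the output data set name KnownPtOut are assumptions chosen for illustration, and the endpoints must span the range of the data.

```sas
/* Sketch: supply explicit bin endpoints.
   The endpoint list and the data set name KnownPtOut are illustrative assumptions. */
proc univariate data=Sashelp.Cars noprint;
   var MPG_City;
   histogram MPG_City / endpoints=10 to 65 by 5 outhist=KnownPtOut;
run;

proc print data=KnownPtOut label;
   label _MINPT_ = "Left Endpoint" _COUNT_ = "Frequency";
   var _MINPT_ _COUNT_;
run;
```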

Tabulating counts in the SAS/IML language

If you want to "manually" count the number of observations in each bin, you have a few choices. If you already know the bin width and anchor position for the bins, then you can use a DATA step array to accumulate the counts. You can also use PROC FORMAT to define a format to bin the observations and use PROC FREQ to tabulate the counts.
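
The two Base SAS approaches mentioned above might look something like the following. This is only a sketch under stated assumptions: the anchor (10), bin width (5), and number of bins (11) are assumed to be known in advance, and the names BinCounts, LeftEndpoint, Count, and MPGBin are hypothetical names introduced for illustration.

```sas
/* Approach 1: DATA step array. Anchor, width, and bin count are assumed known. */
%let anchor = 10;    /* left endpoint of the first bin */
%let width  = 5;     /* bin width */
%let nBins  = 11;    /* bins [10,15), [15,20), ..., [60,65) */

data BinCounts(keep=LeftEndpoint Count);
   array cnt[&nBins] _temporary_ (&nBins*0);          /* counters start at 0 */
   set Sashelp.Cars(keep=MPG_City) end=eof;
   if not missing(MPG_City) then do;
      k = floor((MPG_City - &anchor) / &width) + 1;   /* index of half-open bin */
      if 1 <= k <= &nBins then cnt[k] = cnt[k] + 1;
   end;
   if eof then do k = 1 to &nBins;                    /* write one row per bin */
      LeftEndpoint = &anchor + (k-1)*&width;
      Count = cnt[k];
      output;
   end;
run;

proc print data=BinCounts noobs;
run;

/* Approach 2: a user-defined format groups the values; PROC FREQ counts them. */
proc format;
   value MPGBin
      10 -< 15 = "[10, 15)"
      15 -< 20 = "[15, 20)"
      20 -< 25 = "[20, 25)"
      25 -< 30 = "[25, 30)"
      30 -< 35 = "[30, 35)"
      35 -< 40 = "[35, 40)"
      40 -< 45 = "[40, 45)"
      45 -< 50 = "[45, 50)"
      50 -< 55 = "[50, 55)"
      55 -< 60 = "[55, 60)"
      60 -< 65 = "[60, 65)";
run;

proc freq data=Sashelp.Cars;
   format MPG_City MPGBin.;
   tables MPG_City / nocum nopercent;
run;
```

Both approaches use half-open intervals. The DATA step writes a row for every bin, including bins whose count is zero, whereas PROC FREQ lists only the formatted values that occur in the data.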

The harder problem is when you do not have a prior set of "nice" values to use as the endpoints of bins. It is usually not satisfactory to use the minimum and maximum data values as endpoints of the binning intervals because that might result in intervals whose endpoints are long decimal values such as [3.4546667, 4.0108333].

Fortunately, the SAS/IML language provides the GSCALE subroutine, which computes "nice" values from a vector of data and the number of bins. The GSCALE routine returns a three-element vector. The first element is the minimum value of the leftmost interval, the second element is the maximum value of the rightmost interval, and the third element is the bin width. For example, the following SAS/IML statements compute nice intervals for the data in the MPG_City variable:

proc iml;
use Sashelp.Cars;
   read all var "MPG_City" into X;
close;
 
/* GSCALE subroutine computes "nice" tick values: s[1]<=min(x); s[2]>=max(x) */
call gscale(s, x, 10);  /* ask for about 10 intervals */
print s[rowname={"Start" "Stop" "Increment"}];

The output from the GSCALE subroutine suggests that a good set of intervals to use for binning the data is [10, 15), [15, 20), ..., [55, 60]. These are the same endpoints that are generated by using the ENDPOINTS option in PROC UNIVARIATE. (Actually, the procedure uses half-open intervals for all bins, so it adds the extra interval [60, 65) to the histogram.)

I've previously shown how to use the BIN and TABULATE functions in SAS/IML to count the observations in a set of bins. The following statements use the values from the GSCALE routine to form evenly spaced cutpoints for the binning:

cutPoints = do(s[1], s[2], s[3]);    /* use "nice" cutpoints from GSCALE */
*cutPoints = do(s[1], s[2]+s[3], s[3]);  /* ALTERNATIVE: add additional cutpoint to match UNIVARIATE */
b = bin(x, cutPoints);               /* find bin for each obs */
call tabulate(bins, freq, b);        /* count how many obs in each bin */
binLabels = char(cutPoints[bins]);   /* use left endpoint as labels for bins */
print freq[colname = binLabels label="Count"];

Except for the last interval, the counts are the same as for the ENDPOINTS option in PROC UNIVARIATE. It is a matter of personal preference whether you want to treat the last interval as a closed interval or whether you want all intervals to be half open. If you want to exactly match PROC UNIVARIATE, you can modify the definition of the cutPoints variable, as indicated in the program comments.

Notice that the TABULATE routine only reports the bins that have nonzero counts. If you prefer to obtain counts for ALL bins—even bins with zero counts—you can use the TabulateLevels module, which I described in a previous blog post.
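
If you only need the zero-filled counts for this one example, a simple alternative is to allocate a vector of zeros and copy the nonzero counts into the positions that TABULATE reports. The following self-contained sketch is not the TabulateLevels module; the names nBins and allFreq are hypothetical, and the program assumes that every observation falls inside the range of the cutpoints.

```sas
proc iml;
use Sashelp.Cars;
   read all var "MPG_City" into x;
close;

call gscale(s, x, 10);                 /* "nice" start, stop, and increment */
cutPoints = do(s[1], s[2], s[3]);
b = bin(x, cutPoints);                 /* bin index for each observation */
call tabulate(bins, freq, b);          /* only bins with nonzero counts */

nBins = ncol(cutPoints) - 1;           /* number of intervals */
allFreq = j(1, nBins, 0);              /* start with a zero count for every bin */
allFreq[bins] = freq;                  /* fill in the observed counts */

binLabels = char(cutPoints[1:nBins]);  /* left endpoints as labels */
print allFreq[colname=binLabels label="Count"];
quit;
```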

In summary, you can use PROC UNIVARIATE or SAS/IML to create a tabular representation of a histogram. Both procedures provide a way to obtain "nice" values for the bin endpoints. If you already know the endpoints for the bins, you can use other techniques in SAS to produce the table.

The post When is a histogram not a histogram? When it's a table! appeared first on The DO Loop.

December 4, 2018
 

It might snow this weekend here at the SAS headquarters! This would be the first snow of the season for us, and it got me thinking about snow. Apparently these thoughts have manifested themselves in my computer graphics work ... in the form of a snow animation. Follow along, and [...]

The post Let it snow, let it snow, let it snow! appeared first on SAS Learning Post.

December 4, 2018
 

In this blog series, we’ve spoken directly to professors to find out why it’s so important to teach analytics, their advice for students, and to learn how they create interest in analytics programs at their universities. For this third and final post, we’ll hear how SAS has played a role [...]

Two professors’ perspectives on SAS and the future of analytics was published on SAS Voices by Georgia Mariani

December 4, 2018
 

When a Visual Analytics 8.3 report moves on a screen from one page to the next – all by itself, without a human hovering over a keyboard – you're seeing the Report Playback feature of SAS Visual Analytics Viewer 8.3 in action.

Reasons for using visual movement

Playable dashboards are easy to create and use. But let's ponder for a moment: Why would you want to set your report in motion? You might want it to scroll automatically:

  • At a kiosk or booth where folks linger for short periods of time.
  • During a presentation to an audience so you're hands-free. You decide how long each page displays and are free to focus on explaining key facts and figures in the moving report without the distraction of manually flipping through each page. Sort of like your car's cruise control – you take your foot off the pedal and the vehicle keeps going.

Design considerations for playable dashboards

If the intent is to let the report run on its own in a kiosk or a booth, be mindful that such environments require information to move fast. Those watching the playable dashboard expect to grasp key facts and figures quickly. Time is of the essence.

A short attention span benefits from a report design whereby each report page contains one report object that quickly conveys the essence of the message in a few seconds. If you use a complex report design with multiple report objects and a small font, chances aren't good that your audience will absorb meaning from your report.

Any report object (for example, a scatter plot) that requires your user to first look at the legend and then comprehend the data in the graph would be unsuitable for playable dashboards that are set to move at a fast rate, such as three or four seconds per page.

Example of a playable dashboard

I designed a report to illustrate carbon dioxide (CO2) emissions for 20 countries. I added five report objects that are easy to comprehend in about five seconds (a subjective estimate, of course). I also added a scatter plot and a geomap whose legends are challenging to comprehend, to illustrate why report objects with legends can be unsuitable for a playable dashboard.

For the scatter plot, the presenter would have to expand the legend tooltip to show the legends for the country data in that report object – not realistic in a fast-paced dashboard. In the geomap, the audience needs to look at the legends at the bottom (icons, colors, etc.) and associate that legend with the display in the graph. That’s a lot of brain activity for five seconds – unrealistic. It makes sense, then, to use report objects here that don’t depend on user comprehension of legends to understand the data.

Let the show begin!

When the scatter plot or geomap is displayed, notice how it’s hard to comprehend such report objects in five seconds. In such a short timeframe, it's impossible to process legends and the data, all at once.

How to create a playable dashboard in the web-based viewer

  1. In SAS Visual Analytics Viewer, I opened the report and chose Edit playback from the main menu.
  2. In the Edit Playback dialog, I chose the following options:

a. Transition unit – I can choose to display one page at a time or one object at a time. I chose to display one page at a time.

b. Seconds per unit – I chose to display each page for five seconds.

c. Show canvas only – I chose this option because it hides the report control area, page tabs, and page controls for a nicer look.

d. Show timer – This option would display a countdown for each page or object transition. I did not choose this option.

e. Show navigation controls for the report playback – I chose this option because it displays navigation controls in the bottom right corner of the viewer when I hover over the report with my mouse. Personally, I really like this feature because it gives me the flexibility to intervene and move the report pages forward or backward, pause the playback, or exit the playback.

Finally, I saved and exited, and the playable dashboard began to play on my monitor.


How to create a playable dashboard with SAS Visual Analytics was published on SAS Users.

December 3, 2018
 

Did you ever try to find articles about a topic in a library before computers came along? You might have had to manually look through several hundred bound periodicals, or perhaps you were lucky enough to find the topic in a master index that pointed you directly to the year [...]

The post How to find SAS blog posts about your favorite topics appeared first on SAS Learning Post.