SAS Visual Analytics

September 8, 2017
 

In SAS Visual Analytics 8.1, report creators have the ability to include drive-distance and drive-time in their geographical maps, but only if their site has an Esri ArcGIS Online account and they have valid credentials for the account.

In the user Settings for SAS Visual Analytics Geographic Mapping 8.1 release, there are three choices for selection of a geographic map provider.  The map provider creates the background map for geo maps and for network diagrams that display a map.

The map provider options are:

  • OpenStreetMap service, hosted at SAS.
  • Esri ArcGIS Online Services, which only requires acceptance of the terms and conditions.
  • Esri premium services, which requires credential validation.

If Esri premium services is selected, there is an additional prompt for valid credentials, and you must still accept the Esri ArcGIS Online Services terms in order to select the premium services checkbox.

It’s also worth pointing out that even if you have Esri premium credentials, they can only be validated in SAS Visual Analytics if you are also a member of the Esri Users custom group.  Users can be added to this group in SAS Environment Manager, as shown below.

Note that without the Esri ‘premium’ service and validated credentials, when you right-click and Create geographic selection in your report map, you are only able to select the Distance selection, which displays the radial distance for the selection point.

With premium services in effect, you can also select drive-time or drive-distance.  An example of a drive-time selection is shown here.  Drive-time creates an irregular selection based on the distance that can be driven in the specified amount of time.

A drive-distance example is shown below.  Drive-distance creates an irregular selection based on the driving distance using roads.

When selecting drive-time or drive-distance, you can also add breaks to show, as in the example below, the 5-mile distance, the 10-mile distance, and the 15-mile distance on the maps.

It’s also worth pointing out that a viewer of the report who has not had Esri premium credentials validated will be unable to view the drive-distance and drive-time features.  The settings for users of the report viewer are also stored in the Report Viewer Geographic Mapping user settings.

If a user is adding a connection to the server in SAS Mobile BI 8.15 and their account is a member of the Esri Users group, they will be prompted for their Esri premium credentials when adding the server connection:

I hope you’ve found this post helpful.

How do I access the Premium Esri Map Service for my SAS Visual Analytics reports? was published on SAS Users.

August 31, 2017
 

In my first post of this blog series, we learned how three education customers are using SAS. Today, we'll hear about the positive impact that SAS and analytics are providing for users and the education institutions. In this post, you'll hear from: Linda Sullivan, Assistant Vice President for Institutional Knowledge Management, [...]

Education analytics: The impact of using SAS and analytics was published on SAS Voices by Georgia Mariani

August 17, 2017
 

In this blog post, I cover an example of importing data into SAS Viya using Cloud Analytic Services (CAS) actions via REST API. This approach lets you load data into a CASLib outside of the SAS Self-Service Import user interface.  Once the data is loaded into CAS, it is available for use in applications such as SAS Visual Analytics and SAS Visual Data Builder.

Introduction

To import data into SAS Viya via REST API, you need to make a series of REST API calls:

1.     Start CAS Session
2.     Load Data into a CASLib
3.     End CAS Session

I will walk through these various REST API calls in the sections below using the REST API testing application HTTPRequestor, which is a free add-on to the Mozilla Firefox browser.

Before I perform any of my REST API calls, I need to Base-64 encode my credentials. The input for encoding the credentials is the user name and password joined by a colon (username:password), which is the standard form for HTTP Basic authentication. I used the site https://www.base64encode.org/ to encode my credentials.  Note: You can use other methods (e.g., Python) to encode your credentials. Use your organization's preferred method to ensure you are meeting its security protocols.
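As one way to do the encoding locally instead of on a website, here is a minimal Python sketch. It assumes the credentials take the standard HTTP Basic form of username:password; the function name and the sample credentials are placeholders of mine, not part of the original example.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header from a user name and password."""
    # HTTP Basic authentication encodes the string "username:password".
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials, for illustration only.
print(basic_auth_header("myuser", "mypassword"))
```

The returned dictionary can be passed as the headers of each request in the steps that follow.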

Below is the header Authorization information I will be sending with each of my requests.

Authorization Header

1.     Start CAS Session

First, I need to start a CAS Session. Below is an example request for starting a CAS Session:

POST https://<YourCASServer:Port>/cas/sessions

Authorization: Basic <Base-64EncodedCredentials>
Content-Type: application/json

{}

This request returns the CASSessionUUID needed in the next step.
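If you would rather script this call than use a browser add-on, the same request could be built with Python's standard library. This is only a sketch: the server address and token below are hypothetical, and the request is constructed but not sent.

```python
import urllib.request


def build_start_session_request(base_url: str, encoded_credentials: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts a CAS session."""
    return urllib.request.Request(
        url=f"{base_url}/cas/sessions",
        data=b"{}",  # empty JSON body, matching the example request
        headers={
            "Authorization": f"Basic {encoded_credentials}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical server address and token, for illustration only.
request = build_start_session_request("https://cas.example.com:8777", "dXNlcjpwYXNz")
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would submit it; the session UUID then has to be read from the response body.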

I construct my request in HTTPRequestor as follows and submit the request:

Start CAS Session Request/Response

Here is a screenshot of the raw transaction information.

Start CAS Session Raw Transaction

I need to copy the CAS Session UUID information that was returned for use in the subsequent REST API calls since their CAS Actions must be performed within a CAS Session.

2.     Load Data into a CASLib

Now that I have started my CAS session and have its UUID, I can load the table to CAS. Below is an example request for the table.loadTable CAS Action:

POST https://<YourCASServer:Port>/cas/sessions/<CASSessionUUID>/actions/table.loadTable

Authorization: Basic <Base-64EncodedCredentials>
Content-Type: application/json

{"casLib":"<InputCASLib>","importOptions":{"fileType":"<FileType>"},"path":"<InputFilePathAndName>",
 "casout":{"caslib":"<OutputCASLib>","name":"<OutputTableName>","promote":true}}

 

This request returns a log message: “NOTE: Cloud Analytic Services made the file <InputFilePathAndName> available as table <OutputTableName> in caslib <OutputCASLib>.”
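The request body and URL for this step can also be assembled programmatically. The sketch below mirrors the JSON shown in the example request; the function names are mine, and the file name and file type in the sample call are assumptions, since the post does not state them.

```python
import json


def load_table_url(base_url: str, session_id: str) -> str:
    """URL of the table.loadTable action for an existing CAS session."""
    return f"{base_url}/cas/sessions/{session_id}/actions/table.loadTable"


def load_table_body(in_caslib: str, path: str, file_type: str,
                    out_caslib: str, out_table: str) -> str:
    """JSON body with the same keys as the example request."""
    return json.dumps({
        "casLib": in_caslib,
        "importOptions": {"fileType": file_type},
        "path": path,
        "casout": {"caslib": out_caslib, "name": out_table, "promote": True},
    })


# Caslib and table names follow the BASEBALL example in this post;
# the path "baseball.sas7bdat" and file type "basesas" are assumed values.
body = load_table_body("helpdata", "baseball.sas7bdat", "basesas", "Public", "SAS_BASEBALL")
print(load_table_url("https://cas.example.com:8777", "<CASSessionUUID>"))
print(body)
```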

For my example, I will load the SAS data set BASEBALL located in the helpdata CASLib to the Public CASLib and call the CAS Table SAS_BASEBALL.  I am copying the data to the Public CASLib to make it more readily available to all CAS users. Let’s first confirm that the SAS_BASEBALL table does not currently exist in the Public CASLib.

Public CASLib Before LoadTable CAS Action Called

I construct my request in HTTPRequestor as follows and submit the request:

Load Table Request/Response

Here is a screenshot of the raw transaction information.

Load Table Raw Transaction

Next, I will confirm that the SAS_BASEBALL data set is now loaded in the Public CASLib.

Public CASLib After LoadTable CAS Action Called

The SAS_BASEBALL data set is now available for use in applications such as SAS Visual Analytics and SAS Visual Data Builder.

3.     End CAS Session

Finally, I need to terminate my CAS Session. Below is an example request for the session.endSession CAS Action:

POST https://<YourCASServer:Port>/cas/sessions/<CASSessionUUID>/actions/session.endSession

Authorization: Basic <Base-64EncodedCredentials>
Content-Type: application/json

{}

 

This request returns a status of 0 indicating there was no error and the CASSessionUUID specified in the request has ended.

I construct my request in HTTPRequestor as follows and submit the request:

End CAS Session Request/Response

Here is a screenshot of the raw transaction information.

End CAS Session Raw Transaction

Conclusion

These calls can be strung together so you could schedule their execution. For more information on SAS Viya and REST APIs, refer to the SAS Cloud Analytic Services REST API documentation.
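As a rough sketch of what stringing the three calls together might look like in Python: the function below starts a session, runs table.loadTable, and always ends the session. The sender is passed in as a parameter so the flow can be tried without a server. The function names are mine, and reading the session UUID from a "session" key in the response is an assumption; check the raw response from your own server.

```python
import json
import urllib.request
from typing import Callable

# A sender takes (url, headers, body) and returns the parsed JSON response.
Sender = Callable[[str, dict, dict], dict]


def send_with_urllib(url: str, headers: dict, body: dict) -> dict:
    """POST a JSON body and parse the JSON response (not called in this sketch)."""
    req = urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def import_table(base_url: str, encoded_credentials: str,
                 load_params: dict, send: Sender) -> dict:
    """Run the three steps: start session, load table, end session."""
    headers = {
        "Authorization": f"Basic {encoded_credentials}",
        "Content-Type": "application/json",
    }
    # 1. Start CAS session.
    started = send(f"{base_url}/cas/sessions", headers, {})
    session_id = started["session"]  # ASSUMPTION: UUID is returned under "session"
    actions = f"{base_url}/cas/sessions/{session_id}/actions"
    try:
        # 2. Load data into a CASLib.
        return send(f"{actions}/table.loadTable", headers, load_params)
    finally:
        # 3. End CAS session, even if the load failed.
        send(f"{actions}/session.endSession", headers, {})
```

Calling import_table(..., send=send_with_urllib) from a scheduled job would run the whole sequence; wrapping the load in try/finally is a design choice of this sketch so that a failed load does not leave a session open.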

Load Data into SAS Viya via REST API was published on SAS Users.

August 15, 2017
 

This post covers the CAS physical data model, i.e. what features CAS offers for data storage, and how to use them to maximize performance in CAS (and consequently in SAS Visual Analytics 8.1 too).

So, specifically let’s answer the question:

What CAS physical table storage features can we use to get better performance in CAS and SAS Visual Analytics/CAS?

CAS Physical Table Storage Features

The following data storage features affect how CAS tables are physically structured:

  • Compression
  • Partitioning
  • Sorting
  • Repeated Tables
  • Extended Data Types (Varchar)
  • User Defined Formats

Compression — the Storage Option that Degrades Performance

data public.MegaCorp (compress=yes);
   set baselib.MegaCorp;
run;

Partitioning and Sorting

Partitioning is a powerful tool for improving BY-group processing. [...] (Bar Charts, Decision Tree, Linear Regression) provide grouping as well as classification functionality.

When performing analyses/processing, CAS first groups the data into the required BY-groups. Pre-partitioning on commonly-used BY-groups means CAS can skip this step, vastly improving performance.

Within partitions, tables can be sorted by non-partition-key variables. Pre-sorting by natural ordering variables (e.g. time) allows CAS to skip the ordering step in many cases just like partitioning allows CAS to skip the grouping step.

For a full use-case, consider a line graph that groups sales by region and plots by date. This graph object would benefit greatly from a CAS table that is pre-partitioned by region and pre-sorted by date.
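To make the idea concrete, here is a loose analogy in plain Python (this is not CAS code, and the rows are made up). When rows are already stored grouped by region and ordered by date within each region, a consumer can build one series per region in a single pass, with no grouping or sorting step of its own.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows, already grouped by region and ordered by date within each region.
rows = [
    ("East", "2017-01", 100),
    ("East", "2017-02", 120),
    ("West", "2017-01", 80),
    ("West", "2017-02", 95),
]


def series_by_region(pre_partitioned_rows):
    """Build one (date, sales) series per region in a single pass, with no sort step."""
    return {
        region: [(date, sales) for _, date, sales in group]
        for region, group in groupby(pre_partitioned_rows, key=itemgetter(0))
    }


print(series_by_region(rows))
```

Note that itertools.groupby only merges adjacent rows with the same key, so this shortcut is valid only because the input is already grouped, which is the point of pre-partitioning.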

Join Optimization

Partitioning can also support join operations, since both the DATA step MERGE and the CAS FedSQL merge join algorithm use BY-GROUP operations to support their processing.

Pre-partitioning tables in anticipation of joins will greatly improve join performance. A good use case is partitioning both a large transaction table and an equally large reference table (e.g. an enormous Customer table) by the common field, customerID. When a DATA Step MERGE or a FedSQL join is performed between the two tables on that field, the join/merge will take advantage of partitioning for the BY-GROUP operation resulting in something similar to a partition-wise join.

Like compression, partitioning and sorting can be implemented via CAS actions as well as data set options. The data set options are demonstrated below:

data mycas.bigOrderTable (partition=(region division) orderby=(year quarter month));
   set CASorBase.bigOrderTable;
run;

Repeated Tables

By default, in distributed CAS Server deployments, CAS divides incoming tables into blocks and distributes those blocks among its worker nodes. [...] Repeated tables have two main use-cases in CAS:

1.     Join Optimization
2.     Small Table Operation Optimization

Join Optimization

For join operations, the default data distribution scheme can result in significant network traffic as matching records from the two tables travel between worker nodes to meet. If one of the two tables was created with the DUPLICATE/REPEAT option, then every possible record from that table is available on every node to the other table. There is no need for any network traffic.

Small Table Operation Optimization

For small tables, even single table operations can perform better with repeated instead of divided distribution. LASR actually implemented the “High Volume Access to Smaller Tables” feature for the same reason. When a table is repeated, CAS runs any required operation on a single worker node against the full copy of the table that resides there, instead of distributing the work.

Repeated tables can be implemented with the DUPLICATE data set option or with the REPEAT option on the PROC CASUTIL LOAD statement. The CASUTIL method is shown below:

proc casutil ;
   load data=sashelp.prdsale outcaslib="caspath"
           casout="prdsale" replace REPEAT ;
quit ;

Extended Data Types (VARCHAR)

With Viya 3.2 comes SAS’ first widespread implementation of variable length character fields. While Base SAS offers variable length character fields through compression, Viya 3.2 is the first major SAS release to include a variable length character data type (VARCHAR). Not only does VARCHAR save storage space, it also improves performance by reducing the size of the record being processed. CAS, like any other processing engine, will process narrower records more quickly than wide records.

User Defined Formats

User defined formats (UDFs) exist in CAS in much the same way they do in Base SAS. Their primary function, of course, is to provide display formatting for raw data values. Think about a format for direction. The raw data might be: “E”, “W”, “N”, “S” while the corresponding format values might be “East”, “West”, “North”, “South.”

So how might user defined formats improve performance in CAS? The same way they do in Base SAS, and the same way that VARCHAR does: by reducing the size of the record that CAS has to process. Imagine replacing multiple 200-byte description fields with 1-byte codes. If you had 10 such fields, the record length would decrease by 1,990 bytes ((10 × 200) − 10). This is an extreme example, but it illustrates the point: user defined formats can reduce the amount of data that CAS has to process and, consequently, will lead to performance gains.
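The arithmetic and the lookup idea can be sketched in a few lines of Python. This is an illustration only, not how CAS stores formats: the dictionary stands in for the direction format described above, and the function names are mine.

```python
# Stand-in for a user defined format: short stored code -> display value.
DIRECTION_FORMAT = {"E": "East", "W": "West", "N": "North", "S": "South"}


def format_direction(code: str) -> str:
    """Return the display value for a stored code (unknown codes pass through)."""
    return DIRECTION_FORMAT.get(code, code)


def bytes_saved_per_record(num_fields: int, description_width: int, code_width: int = 1) -> int:
    """Bytes saved per record when each description field is replaced by a short code."""
    return num_fields * (description_width - code_width)


print(format_direction("E"))
print(bytes_saved_per_record(10, 200))
```

With 10 fields of 200 bytes replaced by 1-byte codes, the second call reproduces the 1,990-byte reduction from the example.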

CAS data modeling for performance was published on SAS Users.

July 26, 2017
 

In a previous blog, I described a few new features related to report and page prompts in SAS Visual Analytics 8.1, namely the ability to configure cascading prompts (see VA 8.1: Cascading Prompts as Report and Page Prompts).

In this blog, I will cover how to configure prompts, either report, page, or report canvas prompts, that use different data sources.

Different Data Sources with overlapping data values

First, you must have two different data sources added to your Visual Analytics report. These data sources must have values that overlap that you wish to prompt on. All of the values do not need to map, but they must have some values in common if you wish to use a shared prompt.

In this example, we will prompt for Product Line. Let’s examine the column values:

I’ve color coded the values that I would like to map together. I see that the only value that matches “out-of-the-box” is Game.

One workaround to get all of the values to match is to create a Custom Category and use that column for the mapping.

In a “real world” scenario, this may not be ideal. The cardinality of the two columns may be so large that you may have to go back to either the source data or ETL job to produce better matching values.

However, if you are using date columns as the mapping columns, things are considerably easier, as year, month, and quarter are standard values that match without extra steps.

Here is my new Custom Category that I will use for my mapping:

Here are my mappings now. I will be using Product Line (New) for the Insight Toys data source moving forward.

Add prompts

There are two different locations where you can add prompts, i.e. Control Objects, which means there are two different ways to configure prompts with different data sources:

1.     Report and Page Prompts

2.     Report Canvas Prompts

Report and Page Prompt configuration for different data sources

For this first example, I will configure a Button Bar object placed in the Page Prompt area to filter two different data sources. For the Button Bar’s Category Role, I will use the data source with the largest available selection, in this case, the Product Line (New) from Insight Toys.

Now let’s configure this button bar to filter both data sources. You must activate the button bar by clicking on it, then right-click and select Edit data source mappings.

Then you simply have to pick your source table’s column to map to your target table’s column.

That’s it. The mapping is complete. Here is what the report would look like with different selections made for the button bar. Notice that I used the Insight Toys data source for the Role assignment, and it has more values than are available in the Mega Corp data. If a selection is made where nothing matches in Mega Corp, as in the Gift example, then the Mega Corp bar chart is blank.
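The filtering behavior can be pictured with a small Python sketch. It is only a model of the idea, not how Visual Analytics implements it: each data source is a list of rows, the mapping names the column to filter in each source, and all of the numbers are made up. Game and Gift are the two values mentioned in this post.

```python
# Made-up rows; "Game" exists in both sources, "Gift" only in Insight Toys.
sources = {
    "Insight Toys": [
        {"Product Line (New)": "Game", "sales": 10},
        {"Product Line (New)": "Gift", "sales": 4},
    ],
    "Mega Corp": [
        {"Product Line": "Game", "revenue": 7},
    ],
}

# Data source mapping: which column the prompt filters in each source.
column_map = {"Insight Toys": "Product Line (New)", "Mega Corp": "Product Line"}


def apply_prompt(selected_value, sources, column_map):
    """Return, per data source, the rows whose mapped column equals the selection."""
    return {
        name: [row for row in rows if row[column_map[name]] == selected_value]
        for name, rows in sources.items()
    }


print(apply_prompt("Gift", sources, column_map))
```

Selecting Gift returns rows for Insight Toys and an empty list for Mega Corp, which corresponds to the blank Mega Corp bar chart.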

Report Canvas Prompt configuration for different data sources

In this second example, I am going to use a List Control object within the report canvas to filter two different data sources. Again, I will use the Insight Toys’ Product Line (New) column as the List Role Category assignment since it has the most values.

Now to configure the list to filter both bar charts. Click on the list control object to activate the window. Then select the Actions pane, and use the Add button to select Add filter.

Then select both bar charts as the target of the filter Action.
Next, select the Map data option.

Select the source data’s column to map to the target data’s column. Use the + to add additional column mapping criteria.

Here is how the report would look with a few of the values selected from the list table. You can see how both Mega Corp and Insight Toys display overlapping values for Product Line but for any unique Product Lines, such as Gift, its values are only displayed on the Insight Toys bar chart.

Now you know how to configure your control objects for multiple data sources. This works no matter how many data sources you add to your report; simply use the Map data option and select the mappings between the source data and target data.

As I mentioned earlier, a frequently used application of mapping prompts for multiple data sources is for date columns. Here is a screenshot of one example using year and month. I also styled the button bar’s selected background and text color to coordinate with the graphs.

 

SAS Visual Analytics 8.1: Configuring prompts with different source data was published on SAS Users.

July 10, 2017
 

In SAS Viya 3.2, SAS Visual Data Builder provides a mechanism for performing simple, self-service data preparation tasks for SAS Visual Analytics or other applications. SAS Visual Data Builder is NOT an Extract, Transform and Load (ETL) or data quality tool. You may still need one of those tools to perform more complex data preparation.

SAS Visual Data Builder can perform the following tasks:

  • View table and column profiles – provides information on the number of rows and columns in the table, as well as standard and advanced metrics for the columns.
  • Perform data transformations – includes items such as joining tables, transposing columns, creating calculated columns, filtering data and splitting columns.
  • Create plans – a plan is a collection of data transformations (actions) performed on one or more tables.  Plans can be saved and executed again.

SAS Visual Data Builder

To access SAS Visual Data Builder from SAS Home, select ≡ > SAS Visual Data Builder from the menu.
Note: The user must belong to the pre-defined custom user group Data Builders to have permission to access the application.

For SAS Visual Data Builder, the user can select their preferred default start screen in their application Settings.

The options are:

  • Show welcome dialog.
  • Start with data.
  • Start with new plan.
  • Choose existing plan.

With the SAS Viya 3.2 release, SAS Visual Data Builder is now a separate application from Visual Analytics (VA). There is not a one-to-one mapping of the feature set in SAS 9.4: VA 7.3 Data Preparation to SAS Viya 3.2: SAS Visual Data Builder.

For more information on SAS Visual Data Builder, refer to SAS Viya 3.2: Visual Data Builder [...]

[...] was published on SAS Users.

July 7, 2017
 

For colleges and universities, awarding financial aid today requires sophisticated analysis. When higher education leaders ask, “How can we use financial aid to help meet our institutional goals?” they need to consider many scenarios to balance strategic enrollment goals, student need, and institutional finances in order to optimize yield and [...]

Meet student enrollment goals by optimizing your financial aid strategy was published on SAS Voices by Georgia Mariani

July 1, 2017
 

As a technical consultant for SAS, I have the privilege of meeting with SAS customers, learning more about how they use our software, and then helping them solve their problems. Recently, a client of mine was having trouble finding a way to implement the use of multiple application servers in SAS Visual Analytics 7.3 based on what business unit a user is a part of. Guessing that some of our SAS Visual Analytics Administrators may have the same question, I thought I’d share how we solved his problem.

The outline below is the order of operations of how a server context is selected. The example provided in the screenshots is how you would control access so that user requests will only be sent through the appropriate business unit server.

Preliminary Requirements

  • The Server is registered with the job execution service.
  • The Server is visible to the requesting user.

Assuming that all servers are registered with the job execution service, an application server is selected through the following steps:

Step 1: The server associated with the target LASR library will be used. If the server is not visible to the user, proceed to step 2.

Figure 1: In this example the server context associated with the target library is SASApp which the user is denied access to in the server context permissions

Step 2: The suite-level default server defined in the SAS Visual Analytics configuration properties at “va.defaultWorkspaceServer” will be used. If the server is not visible to the user, proceed to step 3.

Figure 2: In this example the configuration property is also set to SASApp which the user is denied access to in the server context permissions

Step 3: Lastly, use any server that is registered with the job execution service and visible to the requesting user.

Figure 3: In this example the BU server context would end up being chosen because it is the only context that the user has permission to access
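The selection order above can be summarized as a short Python sketch. This is my reading of the three steps, not SAS code: the function and parameter names are hypothetical, and the server names in the sample call follow the figures (SASApp is denied to the user, the BU context is permitted).

```python
from typing import Iterable, Optional, Set


def select_server(library_server: Optional[str],
                  default_server: Optional[str],
                  registered: Iterable[str],
                  visible: Set[str]) -> Optional[str]:
    """Pick a server context following the three-step order described above."""
    registered = list(registered)

    def usable(server: Optional[str]) -> bool:
        # Preliminary requirements: registered with the job execution
        # service and visible to the requesting user.
        return server is not None and server in registered and server in visible

    # Step 1: the server associated with the target LASR library.
    if usable(library_server):
        return library_server
    # Step 2: the suite-level default (va.defaultWorkspaceServer).
    if usable(default_server):
        return default_server
    # Step 3: any registered server that the user can see.
    for server in registered:
        if usable(server):
            return server
    return None


# Mirrors the figures: SASApp is denied, so the BU context is chosen.
print(select_server("SASApp", "SASApp", ["SASApp", "BU"], {"BU"}))
```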

Some additional Info to Keep in Mind

  • The preferences in the administrator and data builder tabs allow the forced use of a specific server by opting out of automatic selection. These preferences can be set separately, and one does not affect the other.
  • In SAS Visual Analytics Explorer, the pooled workspace server is used to populate the available data to import field, and the workspace server is used to perform the actual importing.

Figure 4: The pooled workspace server is denied access so the import data area will not be populated

I hope you found this blog helpful. Please feel free to leave a question or comment below.

Using multiple server contexts in SAS Visual Analytics was published on SAS Users.

June 29, 2017
 

In my last blog, we examined the data pane in SAS Visual Analytics 8.1. That blog discussed how to have the data pane display the data items of your active data source, and how to perform tasks such as viewing measure details, changing data item properties, and creating geographic data items, hierarchies, and custom categories.  In this blog, we’ll look at creating new calculated data items and calculated aggregations.

If you recall, you display the Data pane in the Visual Analytics interface by clicking the Data icon on the left menu.

A calculated data item is a new data item created from existing data by using an expression.

  • Calculations are performed on un-aggregated data—the expression is evaluated on each row before aggregations are calculated.
  • Calculated data items can accept parameters.
  • A hierarchy can contain calculated category data items.
  • Calculated data items can be changed to geography data items and used in geo maps.

You can create a derived calculation from a category or measure data item by right-clicking on the data item and selecting Create calculation from data item.

For a category data item, you can create a distinct count, count, or number missing. Creating a derived calculation from a category data item:

For a measure data item, you can create a percent of total, or a periodic calculation based on one of your date data items. Creating a derived calculation from a measure data item:

Notice that in both cases, the new data item is an aggregation, so the new item will appear under the Aggregated Measure category in the data pane.

Note:  In order to use the periodic calculation types, your selected data item must include the year.

You can also edit these new data items by right-clicking on the data item and selecting Edit. Editing a derived calculation:

There is now a single interface for creating calculated data items of type Numeric, Character, Date or Datetime or Aggregated measures.

  • This interface provides both Visual mode and Text mode for viewing and editing the expression.
  • You can drag and drop data items or parameters and operators onto the expression in either mode.
  • In text mode, you can also type in your expression.

Creating a calculated data item or aggregated measure:

Specifying the calculation result type and format:

Some notes for using operators in calculations and aggregations:

  • Operators are provided for both calculations and aggregations.
  • You can expand and collapse each category of operators.
  • If you add an Aggregated operator to an expression, the result type will be changed to Aggregated Measure.
  • You cannot have nested aggregations in an expression.

In the same interface, you have access to simple and advanced numeric operators, simple and advanced text operators, along with boolean, date and time, and comparison operators for your calculations.

You also have access to periodic operators and simple and advanced aggregated operators for calculation aggregations.

The most important point to remember in using this interface is to think ahead as to whether you are creating a calculation (operating on each row) or an aggregation (operating across rows) and specify the data type and format before you begin to drag and drop data items and operators.  The default data type is Numeric, but if you add an aggregation operator, the type will automatically switch to Aggregated Measure.
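To see why the distinction matters, here is a small Python sketch with made-up numbers (it is not Visual Analytics expression syntax). A margin computed on each row and then averaged gives a different answer from a margin computed on the totals, so it pays to decide up front which one you mean.

```python
# Made-up rows for illustration.
rows = [
    {"revenue": 100.0, "cost": 60.0},
    {"revenue": 50.0, "cost": 45.0},
]


def average_of_row_margins(rows):
    """Calculation: evaluate the expression on each row, then aggregate the results."""
    margins = [(r["revenue"] - r["cost"]) / r["revenue"] for r in rows]
    return sum(margins) / len(margins)


def margin_of_totals(rows):
    """Aggregation: aggregate across rows first, then evaluate the expression."""
    total_revenue = sum(r["revenue"] for r in rows)
    total_cost = sum(r["cost"] for r in rows)
    return (total_revenue - total_cost) / total_revenue


print(average_of_row_margins(rows))
print(margin_of_totals(rows))
```

With these rows the per-row approach averages margins of 40% and 10%, while the aggregated approach divides a total margin of 45 by total revenue of 150, so the two results differ.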

Remember that you can also create calculated items of character, date, and datetime data types, and you can choose from a list of date and datetime formats for those data types.

The SAS Visual Analytics 8.1 Data Pane: Creating Calculations and Aggregations was published on SAS Users.

June 21, 2017
 

Today in higher education, savvy users expect to have the information they need to make data-informed decisions at their fingertips. As such, leaders in institutional research (IR) are under pressure to provide these users with accurate data, reports and analyses. IR has been tasked with transforming data and reports in [...]

6 examples of data management, reporting and analytics in higher education was published on SAS Voices by Georgia Mariani