Tech

6月 302022
 

Welcome to the seventh installment in my series Getting Started with Python Integration to SAS Viya. In previous posts, I discussed how to connect to the CAS serverhow to execute CAS actions, and how to work with the results. Now it's time to learn how to filter CAS tables.

Load and explore data

First, I'll load the cars.csv file into memory from the sassfotware GitHub page. I'll start by using my CAS connection object conn, followed by the upload_file method from the SWAT package. The upload_file method loads a client-side file into memory on the CAS server and returns a reference to the CAS table. The data in this example is small, for training purposes. Processing data in the CAS server is typically reserved for larger data.

tbl = conn.upload_file('https://raw.githubusercontent.com/sassoftware/sas-viya-programming/master/data/cars.csv',
      casout={'name':'cars', 'caslib':'casuser'})

The response from the command is minimal, but there's much going on. Specifically:

  • Cloud Analytic Services renders the uploaded file available as table CARS in caslib CASUSER(Peter).
  • The table CARS is created in caslib CASUSER(Peter) from binary data uploaded to Cloud Analytic Services.

Next, I'll view the type and value of the tbl variable to confirm it's a CASTable object that references the CARS table in the CAS server.

display(type(tbl), tbl)
# and the command results:
swat.cas.table.CASTable
CASTable('CARS', caslib='CASUSER(Peter)')

The results show tbl is a CASTable object and references the CARS table in the CAS server. Now, when I execute methods on the tbl object, the processing occurs in CAS.

Further, let's preview the CAS table using the SWAT package head method on the tbl object.

tbl.head()

The CAS server returns 5 rows from the CAS table to the Python client as a DataFrame as expected.

Finally, I'll execute the SWAT shape attribute to view the number of rows and columns in the CAS table.

tbl.shape
# and the results
(428, 15)

Notice, the CARS CAS table contains 428 rows and 15 columns.

Filter a CAS Table

Now that we have created the table in the caslib and are familiar with the data, let's learn how to filter CAS tables!

Python Technique

First, I'll start with the traditional Pandas filtering technique. Since the SWAT package tries to blend the world of Pandas and SAS, this is a good starting point. This will filter the CAS table for rows where the Make of the car is Acura or Toyota,

makeFilter = (tbl['Make']=='Acura') | (tbl['Make']=='Toyota')
tbl[makeFilter].shape
# and the results
(35, 15)

The results show that 35 rows in the CARS CAS table have the Make Toyota or Acura.

Query Method

You can also filter a CAS table using the query method in the SWAT package. Wait a minute, does this look familiar? I mentioned SWAT blends the world of Pandas as SAS. I'll again filter for a Make of Toyota or Acura .

tbl.query("Make = 'Acura' or Make = 'Toyota'").shape
# and the results
(35, 15)

Notice, the results are the same.

isin Method

All right. Sorry, but again I'll show you another familiar method from the SWAT package. Since I am filtering for specific categories, I can also use the SWAT isin method, which works similarly to the Pandas isin method!

tbl[tbl.Make.isin(['Acura', 'Toyota'])].shape
# and the results
(35, 15)

Notice, again the results are the same.

CAS Table WHERE Parameter

Lastly, I'll show you a specific CAS technique, which is very efficient if you need to filter data for multiple methods or actions. To begin, let's display the type and value of the tbl object again.

display(type(tbl), tbl)
# and the results
swat.cas.table.CASTable
CASTable('CARS', caslib='CASUSER(Peter)')

Notice the tbl variable is a CASTable object, which is a reference to a CAS table in the CAS server, not an actual table. In this example, it references the CARS table in the Casuser caslib.

The CASTable object enables you to add a variety of parameters to the object to filter, create columns, group and more. In this example, I'll add the where parameter to the CASTable object to filter for car makes of Toyota and Acura. Then I'll display the value of the CASTable object again.

tbl.where = 'Make = "Acura" or Make = "Toyota"'
display(tbl)

CASTable('CARS', caslib='CASUSER(Peter)', where='Make = "Acura" or Make = "Toyota"')

Notice the CASTable still references the CARS table in the Casuser caslib, but the where parameter has been added to the object. The where parameter is applied when the object is used with an action or method.

Let's check it out. Here I'll specify the tbl variable (the CASTable object) then the shape attribute.

tbl.shape
# and the results
(35, 15)

Notice the results again show 35 cars are either Toyota or Acura.

Now, what is nice about using this method is if I want to continue to use this filter with other methods or actions, all I need to do is use the tbl object. Perhaps you want the value counts of each Make? The SWAT package has the value_counts method available.

tbl.Make.value_counts()
# and the results
Toyota    28
Acura      7

The response displays the counts only for rows where Make is Acura or Toyota. This occurs because the where parameter is applied to the CAS table when the value_counts method is executed.

Delete the WHERE parameter

Lastly, if you want to delete the where parameter from the CASTable object, use the del_params method.

tbl.del_params('where')
display(tbl)
# and the results
CASTable('CARS', caslib='CASUSER(Peter)')

Once the parameter is deleted you can revert back to analyzing the entire table. Here I'll use the shape attribute again to view how many rows are in the original CAS table.

tbl.shape
# and the results
(428, 15)

Summary

SAS Viya offers various filtering options for CAS tables. The SWAT package blends the world of Pandas and SAS, making available many familiar Pandas techniques. You also have specific CAS techniques handy when working with the CASTable object directly. Choosing the correct method may depend on your greater code structure, but you now have the right tools to make that decision.

Additional and related resources

Getting Started with Python Integration to SAS® Viya® - Part 7 - Filtering CAS Tables was published on SAS Users.

6月 162022
 

MS Excel logo No matter which powerful analytical tools data professionals use for their data processing, MS Excel remains the output of choice for many users and whole industries.

In banking and finance, for example, I have seen many SAS users create quite sophisticated data queries and data analysis projects in SAS Enterprise Guide. Yet, at the end, when the final datasets have been produced and validated, comes the manual part when users export those tables into Excel, and then combine and rearrange them by copying-pasting into a desired workbook for storing and distributing.

However, this heavily manual process can be not just fully automated, but also enhanced compared with the point-and-click “export to Excel” and “copy-paste” interactive process. Here is how.

Creating a single simple Excel sheet

Suppose, we want to convert SASHELP.CLASS data table to Excel. Here is a bare-bone solution using SAS Output Delivery System:

ods excel file='C:\Projects\SAS_to_Excel\Single_sheet.xlsx';
 
proc print data=SASHELP.CLASS noobs;
run;
 
ods excel close;

This code is pretty much self-explanatory. It will produce Single_sheet.xlsx Excel workbook file in the folder C:\Projects\SAS_to_Excel. When opened in Excel, it will look as follows:

If you browse through the ODS EXCEL documentation you will find a variety of options that allow you to customize Excel output. Let’s get a little fancy by utilizing some of them.

Creating several customized sheets in Excel workbook

The following code example creates two sheets (tabs) in a single workbook. In addition, it demonstrates some other features to enhance data visualization.

/* -------------------------------------------- */
/* Two sheets workbook with enhanced appearance */
/* -------------------------------------------- */
 
/* Formats for background & foreground coloring */
proc format;
   value hbg 
      50 <- 60   = #66FF99
      60 <- 70   = #FFFF99
      70 <- high = #FF6666
      ;
   value hfg
      low -< 50, 70 <- high = white;
run;
 
/* Define custom font style for ODS TEXT */
proc template;
   define style styles.MyStyle;
   parent=styles.htmlblue;
   style usertext from usertext /
      foreground  = #FF33CC
      font_weight = bold
      ;
   end;
run;
 
/* ODS Excel output file destination */
ods excel file = 'C:\Projects\SAS_to_Excel\Two_sheets_fancy.xlsx';
 
   /* Excel options for 1st sheet (tab) */
   ods excel options
      ( sheet_name      = 'SASHELP.CLASS'
        frozen_headers  = 'on'
        embedded_titles = 'on' )
      style = styles.MyStyle;
 
   title justify=left color='#4D7EBF' 'This is TITLE for SASHELP.CLASS';
 
   ods text='This is TEXT for SASHELP.CLASS';
 
   proc print data=SASHELP.CLASS noobs;
      var NAME;
      var SEX AGE / style = {just=C};
      var HEIGHT  / style = {background=hbg. foreground=hfg.};
      var WEIGHT;
   run;
 
   /* Excel options for 2nd sheet (tab) */
   ods excel options
      ( sheet_name      = 'SASHELP.CARS'
        frozen_headers  = 'on'
        embedded_titles = 'on' );
 
   title 'This is TITLE for SASHELP.CARS';
 
   proc print data=SASHELP.CARS noobs;
   run;
 
ods excel close;

Here are the code highlights:

  • PROC FORMAT creates two formats, HBG and HFG, which are used in the first PROC PRINT to illustrate cell text and background coloring.

    NOTE: In SAS, colors are specified as hexadecimal RGB values (cxRRGGBB). However, I found (although it seems undocumented) that in PROC FORMAT and PROC TEMPLATE these colors can be written as quoted (double or single) or unquoted values, as well as prefixed with either ‘#’ or ‘cx’ (for hexadecimal). For example, the following are all valid values: #FFAADD, '#FFAADD', cxFFAADD, 'cxFFAADD'.
  • PROC TEMPLATE defines usertext custom font style for ODS TEXT, which used in ODS EXCEL along with and as alternative to the TITLE statement.
  • ODS EXCEL FILE=file-specification statement opens the EXCEL destination and specifies the output file for ODS EXCEL. The output file-specification can be either a physical file name (must be in quotes) or a fileref (without quotes) assigned with FILENAME statement. It can point to a location on the machine where SAS is run (SAS server), or a network drive accessible from the SAS server.

    This statement follows by two separate ODS EXCEL OPTIONS statements – one per corresponding sheet (tab).
  • ODS EXCEL OPTIONS statement specifies destination-specific suboptions with space-delimited name='value' pairs. In particular, we use the following options:

    • SHEET_NAME= specifies the name for the next worksheet (worksheet names can be up to 28 characters long).
    • FROZEN_HEADERS='ON' specifies that headers are not scrolled when table is vertically scrolled (default is OFF). It is very convenient feature that keeps the title(s) and column names in view while user scrolls through the table rows.
    • EMBEDDED_TITLES='ON' specifies whether titles should appear in the worksheet (default is OFF).

    There is a variety of other useful ODS EXCEL options allowing further customization of your Excel workbook appearance and functionality.

  • TITLE statement is highly customizable with Output Delivery System as shown in this custom TITLE with ODS example.
  • ODS TEXT= statement inserts text into your ODS output. Unlike TITLE for which ODS EXCEL merges several cells, ODS TEXT places its text in a single cell (see screenshot below). The UserText style element that we modified using PROC TEMPLATE controls the font style, font color, and other attributes of the text that the ODS TEXT= statement produces.
  • In PROC PRINT, we use multiple VAR statements to select variables, determine their order and apply styles (text and background colors) to the printed values. There are much more ODS styles with PROC PRINT available for further customizations.
  • The next section of the code ( /* Excel options for 2nd sheet (tab) */ ) creates the second sheet in the same Excel workbook. Similarly, you can create as many sheets/tabs as you wish.
  • The last statement ods excel close; closes the ODS Excel destination so nothing more is written to the output file.

The following are the two screenshots illustrating the two sheets in the produced Excel workbook:

SAS creates Excel workbook with several sheets/tabs

Questions

Do you find this post useful? Do you have questions, comments, suggestions, other tips or tricks about creating MS Excel workbooks in SAS? Please share with us below.

Additional Resources

TUNE IN NOW | LEARN HOW TO READ AND WRITE EXCEL FILES WITH SAS

Automating Excel workbooks creation using SAS was published on SAS Users.

6月 152022
 

Have you ever heard the phrase “beggars can’t be choosers”? Basically, it means that if you ask for something, be grateful for what you get, especially if you don’t have the means to acquire it yourself. This phrase can be widely applicable to most areas of our lives, but when it comes to preparing data in SAS Viya, beggars can be choosers.

SAS Viya has many tools for preparing your data: you can write code in SAS Studio, create plans in SAS Data Studio, or prepare data within reports in SAS Visual Analytics. Because each of these tools has its own features and functions, you may find yourself wondering, which tool should I use?

The answer, like the answer to many questions, is it depends. What are you trying to do? What skills do you have? How quickly does the data need to be ready to use? How many people will be using the data and how?

The tool you choose will boil down to two factors: (1) simplicity and (2) control. For example, let’s say you need to create a calculated column quickly. The easiest way to create this column is within SAS Visual Analytics. You can create the new column without having to switch tools. SAS Visual Analytics is great for preparing data with little effort (no code needed) and limited control (limited options available).

On the other hand, let’s say you need to perform a more complex action like transposing or appending data sets, or standardizing values. SAS Visual Analytics, while easy to use, doesn’t have the features available for this level of data preparation. Instead, you can create a plan in SAS Data Studio to prepare your data. SAS Data Studio is great for preparing data with moderate effort (little to no code needed) and moderate control (more options available).

If you want to have full control, you can write code in SAS Studio to prepare your data. SAS Studio is great for preparing your data with more effort (may require some coding) and full control (unlimited options).

So, you can really think about it as a tradeoff between simplicity and control.  To perform more complex data preparation tasks, you need more control.

I like to think about it in the context of drawing. Let’s say I want to draw an elephant. Before I start to draw, I need to think about how much time I have available (simplicity) and the level of detail I want to include in my drawing (control).

If I’m playing Pictionary, a charades-inspired word-guessing game where you draw a picture so your teammate can identify a specific word, I want to draw the simplest picture possible so my teammate can guess the word as quickly as possible. I don’t need to create the best-looking picture in the world; I just need my teammate to guess correctly in the shortest amount of time. SAS Visual Analytics is the Pictionary version of my elephant.

On the other hand, if I want to sketch a cute picture of an elephant for my niece’s birthday card, I may want to spend a bit more time on the drawing and add some additional details like eyes and a mouth. SAS Data Studio is the birthday card sketch version of my elephant.

If I’m interested in creating a real-life approximation of an elephant, however, I may want to spend a lot more time and add a lot more detail like shading. SAS Studio is the real-life drawing version of my elephant.

Note: The SAS Visual Analytics version took me about 5 seconds to draw, the SAS Data Studio version took me about 5 minutes, and the SAS Studio version took me an hour.

Now you’re probably wondering, when do I use each tool?

SAS Visual Analytics

Use SAS Visual Analytics in the following scenarios:

  • Modifying properties: modify properties of data items (names, formats, classifications, aggregations).
  • Filters: create subsets of the data for the report (data source filters) or specific objects (basic filters, advanced filters, post-aggregate filters, common filters). Note: Post-aggregate filters, basic filters, and common filters can only be created in SAS Visual Analytics.
  • Calculated items: create a simple calculation, like number of days to delivery (calculated item) or percent of total (aggregated measure) or custom groupings (custom categories). Note: Due to their dynamic nature, aggregated measures can only be created in SAS Visual Analytics.
  • Hierarchies and geography data. Note: Hierarchies and geography data items can only be created in SAS Visual Analytics.
  • A simple join between two tables based on equality condition (for example, T1.Date = T2.OrderDate).
  • Mapping data sources: select a value in an object that uses Table A and filter the associated values in an object that uses Table B (mapping) and the common column between Table A and Table B is used by both objects. Note: Mapping data sources can only be performed in SAS Visual Analytics.
  • Aggregated data sources: create an aggregated version of the table.
  • Data views for ad-hoc analysis: changes made to data in SAS Visual Analytics are available only for the specific report. To apply data changes to other reports that use the same data source, you can create and share a data view, but you cannot use that prepared data outside of SAS Visual Analytics.

SAS Data Studio

Use SAS Data Studio in the following scenarios:

  • Improve data quality with casing, parsing, extraction, identification analysis, generating matchcodes, standardizing, or matching and clustering.
  • To discover and address unknown problems in your data using the Suggestions.
  • Append tables or transpose.
  • Create a join between two or more tables (maximum is 32) based on equality condition (for example, T1.Date = T2.OrderDate) or when the common column between the data sources is calculated (for example, datepart(T1.Date) = T2.OrderDate).
  • Write custom DATA step or CASL code to perform any required data preparation action.
  • To create tables used for many reports, by many analysts, and in different Viya tools. Note: Jobs can be created from plans and scheduled to run at specific times to ensure the data is up-to-date.

Note: You can also modify the properties of data items, create subsets of the data, and create calculated items in SAS Data Studio.

SAS Studio

Use SAS Studio in the following scenarios:

  • Programming with SAS Data Quality: improve data quality programmatically with casing, parsing, extraction, identification analysis, generating matchcodes, standardizing, or matching and clustering.
  • Create a non-equi join (for example T1.Date > T2.EndDate) between two or more tables or when the common column between the data sources is calculated (for example, datepart(T1.Date) = T2.OrderDate).

Note: You can use the SQL procedure or CASL code to perform the join.

Note: The DATA step can be used to combine tables where the name of the column is the same in each table.

  • Programming with the DATA step: perform complex conditional logic, use arrays, or process data iteratively using the DATA step.
  • Programming with SQL: perform any action that requires the use of SQL code (creating subqueries, using set operators, inserting rows into tables, updating values in a table, using DICTIONARY tables, and more).
  • Using SAS procedures: use any SAS procedure not associated with a transform in SAS Data Studio (for example, FORMAT, REPORT, FREQ, UNIVARIATE, and more).
  • Programming with macros: generate dynamic code based on data or user input using the SAS macro language.
  • Create tables used for many reports, by many analysts, and in different Viya tools.

Note: Jobs can be created from SAS code and scheduled to run at specific times to ensure the data is up-to-date.

Note: You can also modify properties of data items, create subsets of the data, create calculated items, create custom groupings, create aggregated tables, append and transpose data, and write custom DATA step or CASL code in SAS Studio.

Summary

The following table summarizes the data preparation actions you can perform in each tool.

Action SAS Visual Analytics SAS Data Studio SAS Studio
Modify properties of data items (names, formats, classifications, aggregations)
 
Filter the data
Filter specific objects in a report
 
Create simple calculations
Create custom groupings
Create dynamic aggregated measures
Create hierarchies
Create geography data items
 
Map data sets to apply interactions between two objects that use different data sources
Perform simple joins between two tables based on equality condition
Perform joins between more than two tables (up to 32) based on equality condition
Perform joins when the common column needs to be calculated
Perform non-equi joins
 
Create an aggregated table
Append tables
Transpose tables
 
Improve data quality with transforms
Improve data quality programmatically
Determine data issues using Suggestions
 
Write DATA step code
Write CASL code
Write SQL code
Use SAS procedures
Use SAS macro language
 
Perform ad-hoc analysis
Create tables used for many reports
Schedule actions to run on a repeatable basis

 

For more information about how to use SAS Visual Analytics to prepare your data and create advanced reports, check out my book Interactive Reports in SAS Visual Analytics: Advanced Features and Customization.

Beggars can’t be choosers…Unless they’re using SAS Viya was published on SAS Users.

6月 102022
 

There are several alternatives to writing to a Microsoft Word file when you use the SAS® Output Delivery System (ODS). The RTF, TAGSETS.RTF and TAGSETS.RTF_SAMPLE destinations create an RTF formatted file. The WORD destination, which is still considered preproduction, creates a DOCX formatted file. The destinations provide many of the same functions, although some features are unique to each one. The destination that you choose might vary depending on your desired final output.

Both the Tagsets and the Word destination enable you to specify suboptions using the OPTIONS option. These suboptions are described in the documentation. For the Tagsets destinations, you can also see a list of the available suboptions and their values in the log by using the DOC='HELP' suboption as follows:

Ods tagsets.rtf file='test.rtf'
 options(doc='help');

What is common across destinations

There are some often-used features that are common across all ODS destinations that write to Microsoft Word. All ODS statements support the following options:

  • SASDATE – This option uses the date and time that the SAS session started if the DATE system option is turned on. Otherwise, the date and time listed are when the file is created. This is the default setting for the Tagsets destinations.
  • COLUMNS= – This option allows for multi-column output.
  • None of the destinations support the Report Writing Interface or the ODS LAYOUT ABSOLUTE option.
  • All destinations can create a table of contents, although the options for this task vary. To show the Table of Contents once the file is created, right-click in the page and select Update Field:
    • Ods rtf file='file.rtf' contents toc_data;
    • Ods tagsets.rtf file='file.rtf' options(contents='yes' toc_data='on');
    • Ods word file='file.docx' options(contents='yes' toc_data='on);
    • The TOC_LEVEL='n' suboption in the Tagsets and Word destinations controls the level of expansion of the table of contents.

Unique features, strengths and limitations for each destination

ODS RTF

The RTF destination was introduced in SAS® 8.1. The style template inherits from Styles.Printer. By default, title and footnote text is placed in the header and footer sections of the document. The BODYTITLE option in the ODS statement places the title and footnote text in the body of the document. The KEEPN option controls where tables are split on the page and the NOTRKEEP option controls whether table rows can be split by a page break.

A significant limitation of the RTF destination is that paging can be difficult because there is no vertical measurement. SAS allows Word to determine paging. In addition, the RTF destination does not support the UNIFORM option.

Here is an example:

ods _all_ close;
ods rtf file='paging_rtf.rtf' ;
 
proc print data=sashelp.cars(obs=15) noobs;
title 'Shows paging difficulties where columns wrap on the same page';
run;
 
ods _all_ close;

Shows paging difficulties where columns wrap on the same page

ODS Tagsets.RTF

The Tagsets.RTF destination was introduced in SAS® 9.2, partly to provide more control over paging. The style template inherits from Styles.Printer. Bodytitle is the default behavior, which means that the title and footnote text is placed in the body of the document. This destination supports the UNIFORM option. Another useful option is TABLEROWS=n, which specifies the number of rows in each table before ODS inserts a page break. Suboptions that are often used include the following:

  • CONTINUE_TAG= controls whether the “(Continued)” text is displayed at the end of a page when a table crosses a page boundary.
  • VSPACE= controls vertical space in the document.
  • WATERMARK= adds a diagonal text string as an argument.

Here is an example:

ods tagsets.rtf file='watermark_vspace_tagsets_rtf.rtf'
 options(doc='help' watermark="Draft" vspace='off') ;
 
proc print data=sashelp.class noobs;
title 'Shows watermark and no space between title and table';
run;
 
ods _all_ close;

Shows watermark and no space between title and table

ODS Tagsets.RTF_Sample

The Tagsets.RTF_Sample destination was also introduced in SAS 9.2. The style template inherits from Styles.Printer. The main distinction between this tagset and Tagsets.RTF is that this tagset places the title and footnote text in the header and footer of the Word document. This behavior makes it similar to the RTF destination without the BODYTITLE option. It includes the same suboptions listed in the ODS Tagsets.RTF section.

Here is an example:

ods tagsets.rtf_sample file='title_header_tagsets_rtf_sample.rtf'
 options(doc='help');
 
proc print data=sashelp.class noobs;
title 'Shows title in Header section of document';
run;
 
ods _all_ close;

Shows title in Header section of document

ODS WORD

The ODS WORD destination is still considered preproduction in the current release of SAS. It produces Microsoft Word output that is compatible with Microsoft Office 2010 and later. The style template inherits from Styles.Word and uses the SCHEME statement in the TEMPLATE procedure to define its style.

Some strengths of the ODS WORD destination are that it creates a native DOCX file and the file size can be smaller than that of a file created by the RTF destinations. The following file created with ODS WORD is 12 KB. The same file that is created with ODS RTF is 22 KB. There are several suboptions that help control where text splits on a page, including KEEP_LINES and KEEP_NEXT. The ODS WORD destination also supports accessibility and SVG output.

The ODS WORD destination currently does not support watermarks.

Here is an example:

ods word file='word_output.docx' sasdate;
 
proc print data=sashelp.class noobs;
title 'ODS WORD output';
run;
 
ods _all_ close;
ods listing;

ODS WORD output

Conclusion

Multiple ODS destinations can create a Microsoft Word file. Any method that you choose can create presentation-quality output for Microsoft Word. Thank you for reading and please check out the documentation links below.

Learn more

A comparison of the ODS destinations for writing to Microsoft Word was published on SAS Users.

6月 082022
 

When generating animation files, the number of frames shown per second is called the frame rate (FPS - Frames per second). Generally, the higher the frame rate, the smoother the animation. Human eyes can process 10 to 12 frames per second and perceive them individually, while higher frame rates are perceived as motion due to "persistence of vision." Modern movies are usually shot and presented at 24 FPS, and there are currently five commonly used frame rates in the film and television industry: 24 FPS, 25 FPS, 30 FPS, 50 FPS and 60 FPS for HDTV.

Animations can present richer time-varying analysis results than static images. It can consider and compare analysis results from different angles to present more details. Animations generated in SAS can be in GIF or SVG format. Various SAS/Graph PROC and SG PROC steps support generating animated GIF files. This short post discusses some timing control issues that authors have found while generating animations for rigorous physics simulations with SAS. Preserving time accuracy is important in physics simulation and some time-sensitive animation generation.

GIF format review

The Graphics Interchange Format (GIF) is a bitmap image format with animation features created by CompuServer in 1987, and GIF89a is version 89a (July 1989) of the format. Its advantages include smaller file sizes and wider support for web browsers and platforms, but the smallest unit of frame duration in the GIF89a file format is hundredths of a second (0.01 seconds) instead of milliseconds (0.001 seconds). Many GIF tools allow specifying the frame duration in milliseconds, even though they will be rounded to hundredths of seconds anyways. They unintendedly mask the limitations of this file format. On the other hand, some GIF-playing software will automatically modify a time interval that is too small. For example, Microsoft's IE rounds up time intervals under 50ms to 100ms. This creates certain challenges for producing time-accurate animations.

Research in 2014[1] indicated that GIF89a does not produce substantial delays when an animation interval is larger than 100ms, regardless of the web browser. Below this threshold, the resulting performance falls considerably. IE9 reinterprets any delay below 100ms as 100ms, so a 50ms interval shows a 50ms delay, while a 16.67ms interval shows around 85ms delay. Firefox 10 and Chrome 17 can execute a GIF89a animation correctly (except for an 85ms delay on 16.67ms intervals), and their mean number of missed frames is smaller and stabilizes over time. The refresh interval is 16.67 milliseconds for a screen with a refresh rate of 60Hz.

Due to the need to control the size of the generated file, 8 FPS (125ms) is enough for most GIF files. When there are many dynamics at play, we can use 15 FPS (66.67ms). The file size for 24 FPS (41.67ms) GIFs without compression will be larger than a standard MP4 video file, so then it would be better to produce the animation as an MP4 instead of a GIF format with a higher FPS.

What is the proper FPS?

When generating animations in SAS, you can specify the duration of a frame through the SAS system option ANIMDURATION. It can accept up to 10 characters in length, that is, there can be 8 significant digits after the decimal point. This commonly leads to programmers incorrectly thinking that the duration of the animation can be specified in milliseconds. The actual test shows that in SAS, the duration of a frame specified via ANIMDURATION option with value 1/FPS, and the actual duration of the animation will be quite different. That is, if the programmer wants to generate relatively time-accurate animation directly in the simulation, they should pay attention when selecting a specific FPS to generate the animation and use the following relationship to specify the target DURATION value:

/*Calculate the accurate ANIMDURATION from/to FPS in SAS*/
 
proc fcmp inlib=work.proto outlib=work.fcmp.animation;
  function fps_animduration(fps);
    dur=1/fps;
    return( floor( dur * 100)/100 ); *Keep the first 2 digits after decimal point instead of round(dur, 0.01);
  endsub;
 
  function animduration_fps(dur);
    return( floor(<strong>1</strong>/dur) ); *Cut the integer part;
  endsub;
run;
 
option cmplib=(work.fcmp);

Accordingly, the following conversion can be used in SAS Macro:

%let DUR=%sysevalf( %sysfunc(floor((1/&FPS)*100 ))/100 );

In a time-accurate physics simulation, the duration or period is the absolute time, so the number of frames that need to be generated is the duration divided by the frame rate. Since the minimum unit of GIF per frame duration mentioned above is one-hundredth of a second, the actual generated frame rate is larger than expected in a generated GIF file, which means that the actual number of frames generated is less than expected for a specified simulation duration. Experiments show that when the FPS is less than or equal to 100, if no error in the number of generated frames is allowed, there are only 19 frame rates that can be used: 1-10, 11, 12, 14, 16, 20, 25, 33, 50 and 100 FPS. If 5% of the generated frames or time error is allowed, you can have an additional 8 frame rates to choose from: 24, 32, 48, 49, 96, 97, 98 and 99 FPS. Based on the background knowledge mentioned earlier, combined with actual needs, GIF file size limitations and playback smoothness requirements, we’d better generally choose a frame rate between 8 and 25 to produce time-accurate animations. 8, 10, 16, 20 and 25 FPS are all good choices. 24 FPS can be chosen if the error rate of frames generated is allowed to be <=5%. The following table lists the error between the actual number of frames generated and the expected number of frames when the duration is 1s under different FPS settings:

Figure1: Frame lost rate (<=5%) and FPS in GIF generation

The relationship between the actual loss rate of generated frames and specified FPS in the figure below reveals which frame rate we should specify when generating time-accurate animations:

Figure2: Frame lost rate and FPS specified in GIF generation

The figure below is a single-cycle animation generated by solving the figure-8 solution of the three-body problem in SAS. The physical time of a single cycle is about 6.3259 seconds, and the actual animation time is also 6.32 seconds with a time scale factor of 1. The duration is 0.04s per frame, for a total of 158 frames. In fact, the problem discussed in this article was discovered when trying to solve the three-body problem with SAS. Of course, in addition to the above frame rate selection, introducing compression techniques to SAS GIF generating is recommended to reduce the file size. Several experiments have shown that third-party tools such as GIF Optimizer can be used to reduce a GIF file's size by 90% without a significant reduction in graphics quality.

Figure3: An animation with galaxy background for the figure-8 solution of three body problem in SAS

Conclusion

When generating GIF animations in SAS that need to be time-accurate such as physics simulation, pay attention to the selection of the ANIMDURATION corresponding to the appropriate frame rate. When programming a GIF animation to display the analysis results, do not choose a random frame rate, or else there will be a large loss of generated frames or a time error. 8, 10, 16, 20 and 25 FPS can be chosen when there is no generated frame loss allowed and 24 FPS can be selected when a generated frame loss rate below 5% is allowed. The limitations discussed here are inherent to the GIF file specification; not due to SAS or third-party tools.

Learn more

Considerations for generating time-accurate animation with SAS was published on SAS Users.

6月 062022
 

This article was co-authored by Deva Kumar and Ali Dixon.

The launch of SAS Analytics Pro Advanced Programming offers key statistical capabilities in a docker container. The product bundles selected executables from SAS Viya to create the container, which eases or streamlines the setup required for fixes and updates to the software.

Current and new users can use the platform for data management, advanced analytics, reporting and batch processing. It includes the products in SAS Analytics Pro, Base SAS, SAS Studio 5.2, SAS/STAT, SAS/GRAPH. Integration with Python is available and includes Proc Python and access to saspy from python clients (ex: JupyterLab). Analytics Pro Advanced Programming offers the addition of SAS/ETS, SAS/IML, SAS/OR and SAS/QC.

For the remainder of this post, a reference to “Analytics Pro” includes the features of both Analytics Pro and Analytics Pro Advanced Programming on Viya.

Easy as 1-2-3 OR Simple is Good

Analytics Pro is delivered as a Docker image. This makes it easy to deploy, use and maintain. The diagram below shows that it takes just 3 steps to begin programming with SAS on your workstation.

Getting Started With APro

For a current Analytics Pro 9.x user, Analytics Pro is the next generation with all the goodness of Analytics Pro 9.x with many improvements from SAS Viya.

What is the same?

Users can run all of their existing programs with Analytics Pro. A few things like having a filename to a C: drive must change, but not the logic of the program. Users can run interactively using SAS Studio or run batch jobs.

Users can continue to program using the SAS Language consisting of DataStep, Procedures and Macros.

What is different?

Analytics Pro is “carved” out of SAS Viya so that it is familiar to current SAS 9.x users. In keeping with this philosophy, Analytics Pro does not include many of the advanced capabilities of SAS Viya.

  • The Cloud Analytic Server (CAS) is not included. However, Analytics Pro programs can connect to CAS.
  • To keep the size small, Analytics Pro does not include the latest version of SAS Studio or any of the other great visual products on SAS Viya.
  • Analytics Pro includes the database engines that are on SAS Viya. There are no additional charges for these.
  • Analytics Pro includes Proc Python from SAS Viya allowing users to include Python code as part of the SAS program.
  • Python programmers can use SASPy to access Analytics Pro from their Python code.
  • Updates are shipped monthly giving users the latest fixes and updates. This will include any new database drivers that are added to SAS Viya and other SAS Viya features like Proc Python.

What if users want to run it on the cloud?

One of the advantages of Analytics Pro is it will fit into the user’s cloud strategy. The container can be in docker on virtual machines – Linux or Windows. See this article for an example. The container can also be deployed on Kubernetes clusters. Users can use the platform that suits their needs.

View a demo of these new features and learn more on the SAS Viya Release Highlights Show.



SAS Analytics Pro Advanced Programming provides a powerful package to explore and solve problems for those seeking new statistical capabilities.

Learn more

Check out these Analytics Pro articles or take a deep dive into this additional Analytics Pro documentation.

Introducing Analytics Pro Advanced Programming: The power of SAS on your desktop was published on SAS Users.

5月 312022
 

If you use formatted variables in a CLASS statement in procedures such as MEANS or UNIVARIATE, SAS will use the formatted values of those variables and not the internal values. For example, consider you have a data set (Health) with variables Subj, Age, Weight, and Height. You want to see the mean Weight and Height for three age groups: 0 to 40, 41-70, and greater than 70. A program to create the Health data set and compute the means is shown below.

Data Health;
   input Subj $ Age Weight Height;
datalines;
001 23 120 72
002 35 200 80
003 77 98 63
004 51 230 75
005 15 98 54
006 44 160 68
007 83 111 65
008 10 . 45
009 60 200 .
;
proc format;
   value AgeGrp 0-40 = '0 to 40'
                41-70 = '41 to 70'
                71-high = '&gt;70';
run;
 
proc means data=Health n nmiss mean;
   class Age;
   format Age AgeGrp.;
   var Weight Height;
run;

Here is the output:

Notice that there was no need to run another DATA step and create a variable representing age ranges. Not only is this an easier way to get these results—it is also much more efficient.

The other day, I received an email from a person and he posed the following question. How can I see the mean of the variable Sales (in the SASHELP data set Shoes) for three ranges of Sales: 0 to 100,000; 100,000 to 200,000; 200,000 and above?

My first thought was to create a variable (called Range) and then use that variable in a CLASS statement with PROC MEANS. However, after some thought, I wondered if I could use the Sales variable in the CLASS statement (and supply a format) and use the same variable in the VAR statement. As is often the case, the best way to answer a SAS programming question is to write a short program and see what happens. Here is what I suggested:

proc format;
  value SalesRange 0 - &lt;100000 = 'Lower'
              100000 - &lt;200000 = 'Middle'
              200000 - high = 'Upper';
run;
 
proc means data=sashelp.Shoes n nmiss mean std;
   class Sales;  
   format Sales SalesRange.;
   var Sales;
run;

Here is the output:

As you can see from the output, it worked just fine.

Before we leave this topic, let's use the first example, referencing the Health data set, to demonstrate a Multi-Label Format. Suppose you want to see means of Weight and Height for two different ranges of Age. One range is the original 0–40, 41–70, and 70+; the other is 0–50 and 51+. Here is a way to accomplish this using a single format:

proc format;
   value AgeGrp (multilabel) 0-40 = '0 to 40'
                             41-70 = '41 to 70'
                             71-high = '&gt;70'
 
                             0-50 = '0 to 50'
                             51-high = '&gt;50';
run;
 
proc means data=Health n nmiss mean;
   class Age / MLF;
   format Age AgeGrp.;
   var Weight Height;
run;

You need to add the option "multilabel" following the format name on the VALUE statement.  Without this option, you will get an error message if you have overlapping ranges. Next, use the option MLF (multi-label format) on the CLASS statement. Here is the output:

You now have the mean Weight and Height for two separate ranges of Age.

Consider using a formatted variables in CLASS statements for those procedures that allow this statement. It will save you time and reduce CPU time as well.

If you liked this blog, then you might like my latest book: Cody’s Collection of SAS Blogs. It gathers all of my best tips and tricks from my blogs in one place for easy reference.

Using formatted CLASS variables was published on SAS Users.

5月 202022
 

In the previous article, we demonstrated how to run CMS-HCC Risk Adjustment Models using SAS Analytics Pro on Viya. Recall that we used the SAS Studio interface to access the demographic and diagnoses input data, set up the model score code and score the input data.

In this third article, we will introduce an alternative approach that surfaces the CMS-HCC Risk Adjustment Model execution through SASPy integration to a Flask application. We will demonstrate how this integration allows a user to score an individual patient/member on-demand, using inputs to an interactive web form to execute the model score code, surfacing the resulting score to the user.

You may be thinking, "Why would I want to do this? Isn’t it sufficient to run the CMS-HCC Risk Adjustment Models manually as necessary?" Of course, there is nothing wrong with running the models via SAS Studio as needed and relying on a person to do so.

However, there are scenarios where an automated approach, less dependent on the active participation of a person and integrated with a third-party application is preferred. For example, you may need your CMS-HCC Risk Adjustment Model to run on-demand based on a 3rd party application that was developed by your clients or you may need to run the models as part of a process that is detached from your SAS deployment.

In these cases, you can take advantage of SAS, SASPy integration and cloud infrastructure to achieve the level of automation and integration required. Rather than the MacBook deployment leveraged in the prior article, we will use a deployment of SAS Analytics Pro Viya on a Linux server instance in Azure.

The following lists the step-by-step process to run the CMS-HCC Risk Adjustment Model through SASPy API integration to a Flask application.

  1. The Flask application receives the input form data from the web browser as JSON
  2. Python uses SASPy to connect to SAS Analytics Pro Viya via SSH
  3. SASPy parses the JSON input into SAS tables
  4. SASPy submits SAS code with the SAS tables as input to SAS Analytics Pro on Viya
  5. SAS Analytics Pro Viya runs the code and generates an output SAS table in the output folder
  6. SASPy accesses the output SAS table and converts it to a Pandas data frame
  7. Flask exposes the data frame as JSON back to the web browser

To achieve this integrated process, we are going to walk through all the necessary deployment and configuration steps in a video. To follow along, the following items are required:

  • Azure subscription
  • Azure Linux instance deployed
  • License for SAS Analytics Pro Viya – save it under the sasinside folder
  • Knowledge of Linux to execute scripts
  • Familiarity with Docker containers
  • Clone the provided GitHub repository

Watch the video below to see a detailed demonstration of the deployment and configuration steps.

The video follows the whole process with the high-level steps outlined here.

  1. Deploy an Azure instance with Ubuntu
  2. Install Docker on this instance
  3. Clone the GitHub repository
  4. Install SAS Analytics Pro on Viya container
  5. Configure password-less SSH
  6. Build the Flask application container
  7. Run the Flask application container
  8. Test password-less SSH from the Flask application container to the SAS Analytics Pro on Viya container
  9. Configure ports on the Azure instance to confirm it is accessible from my machine
  10. Run the application on the browser to score a patient/member with the CMS-HCC model

This is the overall architecture of the application found on this GitHub repository.

If you followed along with the deployment and configuration, you may have additional questions. If you are wondering about the purpose of the sascfg_personal.py file, it tells SASPy how to connect via SSH to the SAS Analytics Pro on Viya container. You have to specify the host, localhost and luser options to make sure SASPy can successfully connect.

If you run the SAS Analytics Pro on Viya container first, Docker will probably assign 172.17.0.2 as the IP for that container and 172.17.0.3 for the Flask container. If you suspect the IP is different, inspect the container to make sure the IP is correct.

 
SAS_config_names   = ['ssh']
ssh                = {'saspath'  : '/opt/sas/viya/home/SASFoundation/sas',
                      'ssh'      : '/usr/bin/ssh',
                      'host'     : '172.17.0.2', # sas apro container host
                      'luser'    : 'sasdemo',
                      'localhost': '172.17.0.3', # local Python container ip address
                      'options'  : ["-fullstimer"]
                     }

For more information, check out the official documentation page.

Thanks to everyone that contributed to this blog series and to those that followed it from the beginning. I hope this was helpful and that you learned something along the way.

*If you are interested in a license for SAS Analytics Pro on Viya, contact your SAS representative or get in contact with SAS through the chatbot on sas.com.

Running CMS Risk Adjustment Models via API with SAS Analytics Pro (Viya) on Azure was published on SAS Users.

5月 172022
 

Here at SAS, we understand the importance of having access to cutting-edge professional resources. That’s why, for more than 40 years, we’ve provided individuals in programming, data management and analytics fields with low-cost and no-cost materials that promote success in their educational and professional journeys. And today, as the demand for employees with advanced skill sets and global certifications grows, we get it – having the ability to easily access the tools you need to succeed is more important than ever.

We’ve got you covered.

As part of our ongoing commitment to helping individuals enhance their skills, further their careers and increase their chance of success in the field, we’re now offering SAS Certification Practice Exams for free. Yes, free.

Over the years, candidates who have taken advantage of our practice exams have found them to be a valuable, effective tool for gauging their preparedness for SAS Certification Exams. When combined with other SAS training resources – like webinars, content guides, training courses and web tutorials – these free exams greatly increase candidates’ chances of success. Exams are currently available in:

    • Programming
    • Advanced Analytics
    • Data Management
    • Visual Analytics
    • Administration

Representative of the live exams, our online practice exams go through the same rigorous development process and are designed to give candidates an idea of what they should expect in the actual test questions. SAS practice exams also provide the rationale behind correct and incorrect answers, giving participants even more insight and opportunity for exam success.

And the numbers speak for themselves.*

    • Those who passed the practice exam had a 17.5% higher pass rate on the live exam than those who did not take or pass the practice version.
    • Those who took the practice exam – regardless of score – had an 8.15% higher pass rate on the live exam than those who opted not to take it.

Not to mention taking a practice test just might make the difference between passing the SAS Certification Exam on the first try or having to retake it. We’ve found that many who don’t pass the Certification Exam miss the mark by only a few questions, which they could have avoided with a bit more preparation.

Keep in mind, in order to make the best use of this resource, we recommend taking a practice exam as your final method of study in order to test your preparedness before diving into the actual exam. If you are curious about the types of questions typically on the certification exams, we have sample questions available for you to review.

So, what are you waiting for? A free resource to prepare you for a globally-recognized certification and make your resume stand out from the rest? That’s a no-brainer.

For more information about free SAS Certification Practice Exams, visit: https://www.sas.com/en_us/certification/resources/sas-practice-exams.html

*Based on data from 20,000 practice exams taken since 2020.

Access for success: SAS Certification practice exams now offered for free was published on SAS Users.

5月 162022
 

Just getting started with this series? Make sure to explore the earlier posts Part 1, Part 2 and Part 3.

Up until now, you have seen how ModelOps can solve your biggest machine learning challenges and that SAS and Microsoft, together, can help you deploy, govern and monitor your models anywhere in the Microsoft Cloud. But what does that look like in practice and what value does that provide you?

In this blog, we will share how to register SAS and open-source models from SAS Model Manager to Azure Machine Learning and then call that model directly from Power BI.

SAS Model Manager is a centralized, searchable repository for all types of models and analytical assets. This solution gives you complete visibility into your analytical processes, ensuring traceability and governance. SAS Viya on Azure allows you to efficiently move your analytical models from the innovation lab into production. With SAS and Microsoft, you can now easily and seamlessly deploy models in Azure Machine Learning (AML).

Step 1: SAS Model Manager governs, deploys, and monitors all types of SAS and open-source models (i.e., Python, R). Deploy your model registered in SAS Model Manager to Azure Machine Learning.

Once a model is deployed into Azure Machine Learning it can be leveraged just like any other model in Azure Machine Learning while maintaining the governance and control that Viya offers.

Step 2: Refresh Azure Machine Learning to find your model with associated artifacts and variable metrics that can be used to publish the model in an executable container runtime supporting SAS and open-source models.

After you publish a model to an Azure Machine Learning publishing destination you can deploy the published model to AKS. This enables you to make the published model available so that you score it using custom code or another application outside the SAS Viya environment. It also enables you to deploy the published model in a production environment.

Step 3: Go back to SAS Model Manager to deploy the model to AKS (Azure Kubernetes Services) which is the Azure Machine Learning compute target. This steps ensures that the model you have published is ready to be deployed.

Step 4: Score your model in Azure Machine Learning.

Step 5: Score your model from Power BI.

When a new model is developed and ready for production, it is easy to register that model in Azure Machine Learning and replace the existing model that is in production. This allows business intelligence users to quickly gain access to the most accurate insights. By leveraging the strengths of the ModelOps lifecycle that SAS has developed the deployment mechanism and distribution of that model using the Microsoft Cloud, models can be pushed out into production in an automated and repeatable way. The combination of SAS Viya and Azure Machine Learning accelerates the time to value for customers who are looking to incorporate DevOps principles into their model development and deployment.

To learn more about ModelOps and our partnership with Microsoft, see our whitepaper: ModelOps with SAS Viya on Azure.

ModelOps in practice: How to call SAS models from Microsoft Power BI, part 4 was published on SAS Users.