books

September 13, 2021
 

The Day of the Programmer is not enough time to celebrate our favorite code-creators. That’s why at SAS, we celebrate an entire week with SAS Programmer Week! If you want to extend the fun and learning of SAS Programmer Week year-round, SAS Press is here to support you with books for programmers at every level.

2021 has been a big year for learning, so we wanted to share the six most popular books for programmers this year. There are some old favorites on this list as well as some brand-new books on a variety of topics. Check out the list below, and see what your fellow programmers are reading this year!

  1. The Little SAS Book: A Primer, Sixth Edition

This book is at the top of almost every list of recommended books for anyone who wants to learn SAS. And for good reason! It breaks down the basics of SAS into easy-to-understand chunks with tons of practice questions. If you are new to SAS or are interested in getting your basic certification, this is the book for you.

  2. Learning SAS by Example: A Programmer’s Guide, Second Edition

Whether you are learning SAS for the first time or just need a quick refresher on a single topic, this book is organized so that you can read it from start to finish or skip to your topic of interest. Filled with real-world examples, this is a book that should be on every SAS programmer’s bookshelf!

  3. Text Mining and Analysis: Practical Methods, Examples, and Case Studies Using SAS

If you work with big data, then you probably work with a lot of text. The third book on our list is for anyone who handles unstructured data. This book focuses on practical solutions to real-life problems. You’ll learn how to collect, cleanse, organize, categorize, explore, analyze, and interpret your data.

  4. End-to-End Data Science with SAS: A Hands-On Programming Guide

This book offers a step-by-step explanation of how to create machine learning models for any industry. If you want to learn how to think like a data scientist, wrangle messy data, choose a model, and evaluate models in SAS, then this book has the information that you need to be a successful data scientist.

  5. Cody's Data Cleaning Techniques Using SAS, Third Edition

Every programmer knows that garbage in = garbage out. Take out the trash with this indispensable guide to cleaning your data. You’ll learn how to find data errors and develop techniques for correcting them.

  6. SAS Graphics for Clinical Trials by Example

If you are a programmer who works in the health care and life sciences industry and want to create visually appealing graphs using SAS, then this book is designed specifically for you. You’ll learn how to create a wide range of graphs using Graph Template Language (GTL) and statistical graphics procedures to solve even the most challenging clinical graph problems.

An honorable mention also goes to the SAS Certification Guides. They are a great way to study for the certification exams for the SAS Certified Specialist: Base Programming and SAS Certified Professional: Advanced Programming credentials.

We have many books available to support you as you develop your programming skills – and some of them are free! Browse all our available titles today.

Top Books for SAS Programmers was published on SAS Users.

August 3, 2021
 

Thousands of SAS users are migrating from SAS University Edition to SAS OnDemand for Academics (ODA). I thought I would share some of my thoughts, having just finished two books using ODA (Getting Started with SAS Programming: Using SAS Studio in the Cloud and A Gentle Introduction to Statistics Using SAS Studio in the Cloud). Users of SAS University Edition had to download software to create a virtual computer on their local computer, then download the SAS software, and finally, set up a way to read and write files from their “real” computer to their “virtual computer.” This caused many people massive headaches (including me). The great news about SAS OnDemand for Academics is that you don’t have to download anything! You access SAS on a cloud platform. Also, reading data from your real computer is quite simple.

ODA was developed so that people can use it to learn how to program and run tasks using SAS. It is not to be used for commercial purposes. There is also a 5-gigabyte limit for data files, but that is certainly not a problem for learning how to use SAS.

Registering for ODA

To gain access to ODA, you need to register with SAS. Part of the registration process is to create a SAS profile. If you already have a SAS profile, skip that portion of the instructions.

To start, point your browser to:

https://welcome.oda.sas.com

You will have to wait for an email from SAS with an approval, but once you have access, it is a delight to use. Let me show you step-by-step how easy it is to upload an Excel workbook from your local computer to SAS Studio using ODA.

This first figure shows a worksheet (Drug_Study.xlsx) with information on four individuals.

Figure 1: Excel Spreadsheet

All the variable names should be self-explanatory except for SBP (systolic blood pressure) and DBP (diastolic blood pressure). It is a good idea to choose variable names that are compatible with the SAS V7 naming rules (up to 32 characters, beginning with a letter or underscore, and containing only letters, digits, and underscores). The next step is to select a folder in the Server Files and Folders tab in SAS Studio and upload the file. This tab is shown in Figure 2 (with the UPLOAD icon circled).

Figure 2: Upload the File to a Folder in SAS Studio

Once you click the UPLOAD icon, you are given the option to select a file.

Figure 3: Select Your File

Select your file, click UPLOAD, and the file on your local hard drive (Drug_Study.xlsx) will appear in the SAS Studio folder (Figure 4). (Note: you can upload multiple files at one time if you choose.)

Figure 4: The File Drug_Study.xlsx Is Now in the SAS Studio Folder

You can now use the IMPORT utility (under Tasks and Utilities) to convert this Excel file to a SAS data set. In this example, the default name of the WORK file (IMPORT) was changed to Drug_Study. A list of variables is shown in Figure 5.

Figure 5: List of Variable Names in Data Set Drug_Study

You can now write programs or use SAS Studio tasks to analyze this data set.
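If you prefer code to the point-and-click utility, the same import can be sketched with PROC IMPORT. This is only a minimal sketch, not the code that the Import utility itself generates. The folder name MyFolder is a hypothetical placeholder for wherever you uploaded the workbook, and the sketch assumes that SBP and DBP are read as numeric columns.

```sas
/* Assumption: Drug_Study.xlsx was uploaded to a folder named MyFolder
   (hypothetical) under your home directory; ~ stands for that directory. */
proc import datafile="~/MyFolder/Drug_Study.xlsx"
            out=work.Drug_Study
            dbms=xlsx
            replace;
   getnames=yes;   /* use the first worksheet row as variable names */
run;

/* List the variables that were created */
proc contents data=work.Drug_Study varnum;
run;

/* Summarize the two blood pressure variables (assumed numeric) */
proc means data=work.Drug_Study n mean min max;
   var SBP DBP;
run;
```

The REPLACE option lets you rerun the step after uploading a corrected workbook without first deleting the WORK data set.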

To learn more about how to use SAS Studio as part of SAS OnDemand for Academics, write SAS programs, or use SAS Studio tasks, please take a look at my new book, Getting Started with SAS Programming: Using SAS Studio in the Cloud.

I welcome comments and questions on this blog.

Migrating from SAS University Edition to SAS OnDemand for Academics was published on SAS Users.

July 27, 2021
 

In the past, the COMPRESS function was useful. Since SAS version 9, it has become a blockbuster, and you might not have noticed. The major change was the addition of a new optional parameter called MODIFIERS.

The traditional use of the COMPRESS function was to remove blanks or a list of selected characters from a character string. The addition of the MODIFIERS argument does two things. First, you can specify classes of characters to remove, such as all letters, all punctuation marks, or all digits. That is extremely useful. Second, and this is why I used the term blockbuster, the 'k' modifier flips the function from one that removes characters from a string to one that keeps a list of characters and removes everything else. Let me show you some examples.

This first example stems from a real problem I encountered while trying to read values that contained units. My data looked something like this:

ID     Weight 
001    100lbs.
002     59Kgs.
003    210LBS
004    83kg

My goal was to create a variable called Wt that represented the person's weight in pounds as a numeric value.

First, let’s look at the code. Then, I’ll give an explanation.

data Convert;
   length ID $3 Weight $8;
   input ID Weight;
 
   Wt = input(compress(Weight,,'kd'),8.);
   /* The COMPRESS function uses two modifiers, 'k' and 'd'.  This means
      keep the digits, remove anything else.  The INPUT function does the
      character-to-numeric conversion.
   */
 
   If findc(Weight,'k','i') then Wt = Wt * 2.2;
 
   /* the FINDC function is looking for an upper or lowercase 'k' in the
      original character string.  If found, it converts the value in
      kilograms to pounds (note: 1 kg = 2.2 pounds).
   */
 
datalines;
001    100lbs.
002     59Kgs.
003    210LBS
004    83kg
;
title "Listing of Data Set Convert";
footnote "This program was run using SAS OnDemand for Academics";
proc print data=Convert noobs;
run;

The program reads the value of Weight as a character string. The COMPRESS function uses 'k' and 'd' as modifiers. Notice the two commas in the list of arguments. With a single comma, SAS would interpret 'kd' as the second argument (the list of characters to remove). Including two commas tells the function that 'kd' is the third argument (modifiers). You can list these modifiers in any order, but I like to use 'kd', and I think of it as "keep the digits." What remains is the string of digits. The INPUT function does the character-to-numeric conversion.

Your next step is to figure out if the original value of Weight contained an upper or lowercase 'k'. The FINDC function can take three arguments: the first is the string that you are examining, the second is a list of characters that you are searching for, and the third argument is the 'i' modifier that says, "ignore case" (very useful).

If the original character string (Weight) contains an uppercase or lowercase 'k', you convert from kilograms to pounds.

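Here is the output (the listing itself is not reproduced here). The two values recorded in kilograms are converted to pounds: 59 becomes 129.8 and 83 becomes 182.6. The values 100 and 210 are unchanged.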

There is one more useful application of the COMPRESS function that I want to discuss. Occasionally, you might have a text file in ASCII or EBCDIC that contains non-printing characters (usually placed there in error). Suppose you want just the letters, digits, decimal points (periods), blanks, and commas. You need to read the original value as a text string. Let's call the original string Contains_Junk. All you need to convert these values is one line of code like this:

Valid = compress(Contains_Junk,'.,','kdas');

In this example, you are using all three arguments of the COMPRESS function. As in pre-9 versions of SAS, the second argument is a list of characters that you want to remove. However, because the third argument (modifiers) contains a 'k', the second argument is a list of characters that you want to keep. In addition to periods and commas, you use modifiers to include all digits, uppercase and lowercase letters (the 'a' modifier - 'a' for alpha), and space characters (these include spaces, tabs, and a few others such as carriage returns and linefeeds). If you did not want to include tabs and other "white space" characters, you could rewrite this line as:

Valid = compress(Contains_Junk,'., ','kd');

Here you are including a blank in the second argument and omitting the 's' in the modifier list. Note that this version also omits the 'a' modifier, so letters are removed as well.
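To see the difference between the two calls, you can run a small test like the one below. It is only a sketch: the test string is made up for illustration, with '#' and '@' standing in for unwanted characters and '09'x supplying a tab.

```sas
data _null_;
   length Contains_Junk Keep_1 Keep_2 $20;

   /* Made-up test value: letters, junk characters, a number, and a tab */
   Contains_Junk = 'Abc#1,234.5@' || '09'x || 'xyz';

   /* Keep digits (d), letters (a), space characters (s), periods, commas */
   Keep_1 = compress(Contains_Junk, '.,', 'kdas');

   /* Keep only digits, periods, commas, and ordinary blanks */
   Keep_2 = compress(Contains_Junk, '., ', 'kd');

   /* $HEX makes non-printing characters visible (a tab shows as 09) */
   put Keep_1= $hex40.;
   put Keep_2=;   /* writes Keep_2=1,234.5 to the log */
run;
```

Writing the first result with the $HEX format is a convenient way to confirm that the tab survives the 's' modifier, because a tab is otherwise hard to spot in the log.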

You can read more about the COMPRESS function in any of the following books, available from SAS Press as an e-book or from Amazon in print form:

Or my latest programming book:

 

Questions and/or comments are welcome.

The Amazing COMPRESS Function was published on SAS Users.

July 21, 2021
 

In my new book, I explain how segmentation and clustering can be accomplished in three ways: coding in SAS, point-and-click in SAS Visual Statistics, and point-and-click in SAS Visual Data Mining and Machine Learning using SAS Model Studio. These three analytical tools allow you to do many diverse types of segmentation, and one of the most common methods is clustering. Clustering is still among the top 10 machine learning methods used based on several surveys across the globe.

One of the best methods for learning about your customers, patrons, clients, or patients (or simply observations in almost any data set) is to perform clustering: finding groups whose members share similar characteristics, while each group has a distinct combination of attributes. You can use this method to aid in understanding your customers or to profile various data sets. This can be done in an environment where SAS and open-source software work together seamlessly in a unified platform. (While open source is not discussed in my book, stay tuned for future blog posts where I will discuss more fun and exciting things that should be of interest to you for clustering and segmentation.)

Let’s look at an example of clustering. Being able to look at your data quickly and easily is a real benefit of SAS Visual Statistics.

Initial data exploration and preparation

To demonstrate the simplicity of clustering in SAS Visual Statistics, the data set CUSTOMERS is used here and also throughout the book. I have loaded the CUSTOMERS data set into memory, and it is now listed in the active tab. I can easily explore and visualize this data by right-mouse-clicking and selecting Actions and then Explore and Visualize. This will take you to the SAS Visual Analytics page.

I have added four new computed items by taking the natural logarithm of four attributes and will use these newly transformed attributes in the clustering.

Performing simple clustering

Clustering in SAS Visual Statistics can be found by selecting the Objects icon on the left and scrolling down to see the SAS Visual Statistics menus as seen below. Dragging the Cluster icon onto the Report template area will allow you to use that statistic object and visualize the clusters.

Once the Cluster object is on the template, you add data items to the Data Roles simply by checking the four computed data items.

Click the OK icon, and the clustering results appear immediately, as in the report below, where five clusters were found using the four data items.

There are 105,456 total observations in the data set; however, only 89,998 were used for the analysis. The remaining observations were dropped because the natural logarithm could not be computed for them (the logarithm is undefined for zero and negative values). To see how to handle that situation easily, please pick up a copy of Segmentation Analytics with SAS Viya. Let me know if you have any questions or comments.
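For readers who want to try the coding route, here is a rough sketch of why observations drop out. It is not the approach described in the book. The data set, the variable names (Revenue, Visits), and the values are all hypothetical, and PROC FASTCLUS with the NOMISS option is used only as one way to exclude observations that have a missing log value.

```sas
/* Hypothetical data: zero values have no natural logarithm */
data work.customers_demo;
   input Cust_ID Revenue Visits;
   /* Compute the log only where it is defined; otherwise the new
      variable stays missing for that observation                  */
   if Revenue > 0 then Log_Revenue = log(Revenue);
   if Visits  > 0 then Log_Visits  = log(Visits);
datalines;
1 1200 14
2    0  3
3  450  0
4 9800 52
5  300  6
6 7600 40
7  520  9
8 8900 47
;

/* k-means style clustering on the transformed variables;
   NOMISS excludes observations with any missing value      */
proc fastclus data=work.customers_demo maxclusters=2 nomiss
              out=work.clustered;
   var Log_Revenue Log_Visits;
run;
```

In this made-up example, customers 2 and 3 are left out of the analysis because one of their log values is missing, which mirrors the gap between total and used observations described above.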

 

 

Clustering made simple was published on SAS Users.

March 2, 2021
 

The more I use SAS Studio in the cloud via SAS OnDemand for Academics, the more I like it. To demonstrate how useful the Files tab is, I'm going to show you what happens when you drag a text file, a SAS data set, and a SAS program into the Editor window.

I previously created a folder called MyBookFiles and uploaded several files from my local computer to that folder.  You can see a partial list of files in the figure below.

Notice that there are text files, SAS data sets, SAS programs, and some Excel workbooks. Look what happens when I drag a text file (Blank_Delimiter.txt) into the Editor window.

No need to open Notepad to view this file—SAS Studio displays it for you. What about a SAS data set? As an example, I dragged a SAS data set called blood_pressure into the Editor.

You see a list of variables and some of the observations in this data set.  There are vertical and horizontal scroll bars (not shown in the figure) to see more rows or columns. If you want to see a listing of the entire data set or the first 'n' observations, you can run the List Data task, located under the Tasks and Utilities tab.

For the last example, I dragged a SAS program into the editor. It appears exactly the same as if I opened it in my stand-alone version of SAS.

At this point, you can run the program or continue to write more SAS code. By the way, the tilde (~) used in the INFILE statement is a shortcut for your home directory. Follow it with the folder name and the file name.
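As a small illustration of the tilde shortcut, the sketch below points at the MyBookFiles folder mentioned earlier. The libref name mybook and the variables in the INPUT statement are assumptions, since the layout of Blank_Delimiter.txt is not shown here.

```sas
/* Assign a libref (hypothetical name) to the folder in your home directory */
libname mybook "~/MyBookFiles";

/* Code alternative to the List Data task: print the first 5 observations */
proc print data=mybook.blood_pressure(obs=5);
run;

/* Read the blank-delimited text file. Var1-Var3 are placeholder
   names; replace them with the actual fields in your file.        */
data work.blank_delim;
   infile "~/MyBookFiles/Blank_Delimiter.txt";
   input Var1 $ Var2 Var3;
run;
```

Using the tilde keeps the program independent of your user ID, so the same code should run unchanged for anyone who creates a folder with the same name.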

You can read more about SAS Studio in the cloud in my latest book, Getting Started with SAS Programming: Using SAS Studio in the Cloud.

Viewing files, programs, and data sets in SAS Studio was published on SAS Users.

December 14, 2020
 

Do you need to see how long patients have been treated for? Would you like to know if a patient’s dose has changed, or if the patient experienced any dose interruptions? If so, you can use a Napoleon plot, also known as a swimmer plot, in conjunction with your exposure data set to find your answers. We demonstrate how to find the answer in our recent book SAS® Graphics for Clinical Trials by Example.

You may be wondering what a Napoleon plot is. Have you ever heard of the map of Napoleon’s Russian campaign? It was a map that displayed six types of data, including troop movement, temperature, latitude, and longitude, on one graph (Wikipedia). In the clinical setting, we try to mimic this approach by displaying several different types of safety data on one graph: hence, the name “Napoleon plot.” The plot is also known as a swimmer plot because each patient has a row in which their data is displayed, which looks like swimming lanes.

Code

Now that you know what a Napoleon plot is, how do you produce it? In essence, you are merely writing GTL code to produce the graph you need. The key GTL statements used to generate a Napoleon plot are DISCRETEATTRMAP, HIGHLOWPLOT, SCATTERPLOT, and DISCRETELEGEND. Other plot statements can be used, but these four are typically used for all Napoleon plots. In our recent book, one of the chapters carefully walks you through each step to show you how to produce the Napoleon plot. Program 1, below, gives a small teaser of some of the code used to produce the Napoleon plot.

Program 1: Code for Napoleon Plot That Highlights Dose Interruptions

         /* Map each dose level to a fill and line color */
         discreteattrmap name = "Dose_Group";
            value "54" / fillattrs = (color = orange)
                         lineattrs = (color = orange pattern = solid);
            value "81" / fillattrs = (color = red)
                         lineattrs = (color = red pattern = solid);
         enddiscreteattrmap;

         /* Associate the attribute map with the dose variable */
         discreteattrvar attrvar = id_dose_group var = exdose attrmap = "Dose_Group";

         legenditem type = marker name = "54_marker" /
            markerattrs = (symbol = squarefilled color = orange)
            label = "Xan 54mg";

         < Other legenditem statements >

         layout overlay / yaxisopts = (type = discrete
                                       display = (line label)
                                       label = "Patient");

            /* One horizontal bar per dosing interval; days converted to months */
            highlowplot y = number
                        high = eval(aendy/30.4375)
                        low = eval(astdy/30.4375) /
               group = id_dose_group
               type = bar
               lineattrs = graphoutlines
               barwidth = 0.2;
            scatterplot y = number x = eval((max_aendy + 10)/30.4375) /
               markerattrs = (symbol = completed size = 12px);
            discretelegend "54_marker" "81_marker" "completed_marker" /
               type = marker
               autoalign = (bottomright) across = 1
               location = inside title = "Dose";
         endlayout;
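Program 1 is an excerpt, so it will not run on its own. To experiment with the core idea, a stripped-down, self-contained version might look like the sketch below. It is not the book's program: the exposure records are invented, the template name swimmer_sketch is hypothetical, the dose variable is read as character so that it matches the attribute map values exactly, and the custom "completed" marker and the legend items are left out.

```sas
/* Invented exposure records: one row per patient per dosing interval */
data work.exposure;
   input number $ exdose $ astdy aendy;
datalines;
P01 54   1  90
P01 81 105 240
P02 54   1 180
P03 81   1 150
;

proc template;
   define statgraph swimmer_sketch;
      begingraph;
         /* Same color mapping idea as in Program 1 */
         discreteattrmap name = "Dose_Group";
            value "54" / fillattrs = (color = orange)
                         lineattrs = (color = orange pattern = solid);
            value "81" / fillattrs = (color = red)
                         lineattrs = (color = red pattern = solid);
         enddiscreteattrmap;
         discreteattrvar attrvar = id_dose_group var = exdose
                         attrmap = "Dose_Group";

         layout overlay / yaxisopts = (type = discrete label = "Patient")
                          xaxisopts = (label = "Months on Treatment");
            /* Study days are converted to months (30.4375 days per month) */
            highlowplot y = number
                        high = eval(aendy/30.4375)
                        low  = eval(astdy/30.4375) /
               group = id_dose_group
               type = bar
               barwidth = 0.4
               name = "exposure";
            discretelegend "exposure" / title = "Dose (mg)";
         endlayout;
      endgraph;
   end;
run;

proc sgrender data = work.exposure template = swimmer_sketch;
run;
```

In this invented data, patient P01 has a gap between day 90 and day 105, which should appear as white space between the orange and red bars, the same visual cue for a dose interruption that the full plot relies on.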

Output

Without further ado, Output 1 shows you an example of a Napoleon plot. You can see that there are many patients, and so the patient labels have been suppressed. You also see that the patient who has been on the study the longest has a dose delay indicated by the white space between the red and orange bars. While this example illustrates a simple Napoleon plot with only two types, dose exposure and treatment, the book has more complex examples of swimmer plots.

Output 1: Napoleon Plot that Highlights Dose Interruptions

Napoleon plot with orange and red bars showing dose exposure and treatment

How to create a Napoleon plot with Graph Template Language (GTL) was published on SAS Users.

September 22, 2020
 

Everyone knows that SAS has been helping programmers and coders build complex machine learning models and solve complex business problems for many years, but did you know that you can also now build machine learning models without a single line of code using SAS Viya?

Building on the vision and commitment to democratize analytics, SAS Viya offers multiple ways to support non-programmers and empowers people with no programming skills to get up and running quickly and build machine learning models. I touched on some of the ways this can be done via SAS Visual Analytics in my previous post on analytics for everyone with SAS Viya. In addition, SAS Viya also supports more advanced pipeline-based visual modeling via SAS Visual Data Mining and Machine Learning. The combination of these different tools within SAS Viya supporting a low-code/no-code approach to modeling makes SAS Viya an incredibly flexible and powerful analytics platform that can help drive analytics usage and adoption throughout an organization.

As analytics and machine learning become more pervasive, an analytics platform that supports a low-code/no-code approach can get more people involved, drive ongoing innovations, and ultimately accelerate digital transformation throughout an organization.

Speed

I have met my fair share of coding ninjas who blew me away with their ability to build models using keyboards with lightning speed. But when it comes to being able to quickly get an idea into a model and generate all the assessment statistics and charts, there is nothing quite like a visual approach to building machine learning models.

In SAS Viya, you can build a decision tree model literally just by dragging and dropping the relevant variables onto the canvas as shown in the animated screen flow below.

Building a machine learning model via drag and drop

In this case, we were able to quickly build a decision tree model that predicts child mortality rates around the world. Not only do we get the decision tree in all its graphics glory (on the left-hand side of the image), we also get the overall model fit measure (Average Squared Error in this case), a variable importance chart, and a lift chart, all in under 5 seconds and without entering a single line of code!

You also get a bunch of detailed statistical outputs, including a detailed node statistics table without having to do anything extra. This is useful for when you need to review the distribution and characteristics of specific nodes when using the decision tree.

Detailed node statistics table

 

What’s more, you can leverage the same drag-and-drop paradigm to quickly tune the model. In our case, you can do simple modifications like adding a new variable by simply dragging a new data item onto the canvas or more complex techniques like manually splitting or pruning a node just by clicking and selecting a node on the canvas. The whole model and visualization refreshes instantly as you make changes, and you get instant feedback on the outputs of your tuning actions, which can help drive rapid iteration and idea testing.

Governance and collaboration

A graphical and components-based approach to modeling also has the added benefits of providing a stronger level of governance and fostering collaboration. Building machine learning models is often a team sport, and the ability to share and reuse models easily can dramatically reduce the cost and effort involved in building and maintaining models.

SAS Visual Data Mining and Machine Learning enables users to build complex, enterprise-grade pipeline models that support sophisticated variable selection, feature engineering techniques, as well as model comparison processes all within a single, easy-to-understand, pipeline-based design framework.

Pipeline modeling using SAS VDMML

The graphical, pipeline-based modeling framework within SAS Visual Data Mining and Machine Learning leverages common components, supports self-documentation, and allows users to leverage a template-based approach to building and sharing machine learning models quickly.

More importantly, as a new user or team member who needs to review, tune or reuse someone else’s model, it is much easier and quicker to understand the design and intent of the various components of a pipeline model and make the needed changes.

Communication and storytelling

Finally, and perhaps most importantly, a graphical, low-code/no-code approach to building machine learning models makes it much easier to communicate both the intent and potential impact of the model. Figures and numbers represent facts, but narratives and stories convey emotion and build connections. The visual modeling approaches supported by SAS Viya enable you to tell compelling stories, share powerful ideas, and inspire valuable actions.

SAS Viya enables you to make changes and apply filters on the fly within its various visual modeling environments. With the model training process and model outputs all represented visually, it makes it extremely easy to discuss business scenarios, test hypotheses, and test modeling strategies and approaches, even with people without a deep machine learning background.

There is no question that a programmatic approach to building machine learning models offers the ultimate power and flexibility and enables data scientists to build the most complex and advanced machine learning models. But when it comes to speed, governance, and communications, a graphical, low-code/no-code approach to building machine learning models definitely has a lot to offer.

To learn more about a low-code/no-code approach to building machine learning models using SAS Viya, check out my book Smart Data Discovery Using SAS® Viya®.

The value of a low-code/no-code approach to building machine learning models was published on SAS Users.

August 25, 2020
 

Analytics is playing an increasingly strategic role in the ongoing digital transformation of organizations today. However, to succeed and scale your digital transformation efforts, it is critical to enable analytics skills at all tiers of your organization. In a recent blog post covering 4 principles of analytics you cannot ignore, SAS COO Oliver Schabenberger articulated the importance of democratizing analytics. By scaling your analytics efforts beyond traditional data science teams and involving more people with strong business domain knowledge, you can gain more valuable insights and make more significant impacts.

SAS Viya was built from the ground up to fulfill this vision of democratizing analytics. At SAS, we believe analytics should be accessible to everyone. While SAS Viya offers tremendous support for advanced users and programmers and will continue to be the tool of choice for many of them, it is also highly accessible for business analysts and insights teams that prefer a more visual approach to analytics and insights discovery.

Self-service data management

First of all, SAS Viya makes it easy for anyone to ingest and prepare data without a single line of code. The integrated data preparation components within SAS Viya support ad-hoc, agile-oriented data management tasks where you can profile, cleanse, and join data easily and rapidly.

Automatically Generated Data Profiling Report

You can execute complex joins, create custom columns, and cleanse your data via a completely drag-and-drop interface. The automation built into SAS Viya eases the often tedious task of data profiling and data cleansing via automated data type identification and transform suggestions. In an area that can be both complex and intimidating, SAS Viya makes data management tasks easy and approachable, helping you to analyze more data and uncover more insights.
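
To give a feel for what automated data type identification involves, here is a minimal Python sketch. It is not how SAS Viya does it. The function names, the candidate types, and the sample columns are all assumptions made for illustration. Each column is tried against progressively looser parsers, and a small profile (type, missing count, distinct count) is reported.

```python
from datetime import datetime

def _parses(value, parser):
    """Return True if parser accepts the string without raising ValueError."""
    try:
        parser(value)
        return True
    except ValueError:
        return False

def infer_type(values):
    """Guess a column type from raw strings, ignoring blank entries."""
    present = [v.strip() for v in values if v and v.strip()]
    if not present:
        return "empty"
    # Candidate types are tried from strictest to loosest (an illustrative choice).
    candidates = [
        ("integer", int),
        ("decimal", float),
        ("date", lambda v: datetime.strptime(v, "%Y-%m-%d")),
    ]
    for name, parser in candidates:
        if all(_parses(v, parser) for v in present):
            return name
    return "text"

def profile(values):
    """Summarize one column: inferred type, missing count, distinct count."""
    present = [v.strip() for v in values if v and v.strip()]
    return {
        "type": infer_type(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }

# Hypothetical raw columns, as they might arrive from a CSV file.
print(profile(["12", "7", "", "12"]))
print(profile(["2020-08-25", "2020-08-10"]))
```

A real profiler would also look at value distributions, patterns, and outliers. The point here is only that the type guess and the basic counts can be derived from the data itself rather than declared by hand.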

Data Join Using a Visual Interface

A visual approach supporting low-code and no-code programming

Speaking of no-code, SAS Viya’s visual approach and support extend deep into data exploration and advanced modeling. Not only can you quickly build charts such as histograms and box plots using a drag-and-drop interface, but you can also build complex machine learning models using algorithms such as decision trees and logistic regression on the same visual canvas.

Building a Decision Tree Model Using SAS Viya

By putting appropriate guardrails in place and providing relevant, context-rich help, SAS Viya empowers users to undertake data analysis using other advanced analytics techniques such as forecasting and correlation analysis. These techniques enable users to ask more complex questions and can help uncover more actionable and valuable insights.

Correlation Analysis Using the Correlation Matrix within SAS Viya

Augmented analytics

Augmented analytics is an emerging area of analytics that leverages machine learning to streamline and automate the process of doing analytics and building machine learning models. SAS Viya leverages augmented analytics throughout the platform to automate various tasks. My favorite use of augmented analytics in SAS Viya, though, is the hyperparameters autotuning feature.

In machine learning, hyperparameters are parameters that you need to set before the learning process can begin. They are used only during training, and they significantly influence how training proceeds. It can often be challenging to find the optimal hyperparameter settings, especially if you are not an experienced modeler. This is where SAS Viya can help, making machine learning models easier for everyone to build, one hyperparameter at a time.

Here is an example of using the SAS Viya autotuning feature to improve my decision tree model. Using the autotuning window, all I needed to do was tell SAS Viya how long I wanted the autotuning process to run. It then works its magic and determines the best hyperparameters to use, which, in this case, include the Maximum tree level and the number of Predictor bins. In most cases, you get a better model after coming back from getting a glass of water!

Hyperparameters Autotuning in SAS Viya

Under the hood, SAS Viya uses complex optimization techniques to search for the best hyperparameter combinations, all without you having to understand how it manages this impressive feat. I should add that hyperparameters autotuning is supported with many other algorithms in SAS Viya, and you have even more autotuning options when using it via the programmatic interface!
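
If you are curious what a hyperparameter search looks like in its simplest form, here is a small Python sketch. It uses plain random search, which is a deliberately simple stand-in and not the optimization technique SAS Viya uses. The search ranges, the `validation_error` scoring function, and the trial budget (used in place of a time limit so that the result is repeatable) are all assumptions made for illustration.

```python
import random

# Hypothetical search space, named after the two settings mentioned above.
SEARCH_SPACE = {
    "max_tree_level": list(range(2, 16)),
    "predictor_bins": list(range(5, 101, 5)),
}

def validation_error(max_tree_level, predictor_bins):
    """Toy stand-in for 'train a tree, then score it on holdout data'.

    Invented for illustration: the error is lowest at depth 6 with 20 bins.
    """
    return 0.10 + 0.01 * abs(max_tree_level - 6) + 0.001 * abs(predictor_bins - 20)

def autotune(max_trials, seed=0):
    """Random search: sample settings and keep the combination with the lowest error."""
    rng = random.Random(seed)
    best_params, best_error = None, float("inf")
    for _ in range(max_trials):
        params = {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}
        error = validation_error(**params)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error

best_params, best_error = autotune(max_trials=50)
print(best_params, round(best_error, 3))
```

A larger budget can only match or improve the best result found, which is the same trade-off you make when you tell the autotuner how long it may run.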

By leveraging a visually oriented framework and augmented analytics capabilities, SAS Viya is making analytics easier and machine learning models more accessible for everyone within an organization. For more on how SAS Viya enables everyone to ask more complex questions and uncover more valuable insights, check out my book Smart Data Discovery Using SAS® Viya®.

Analytics for everyone with SAS Viya was published on SAS Users.

August 10, 2020

The most fundamental concept that students learning introductory SAS programming must master is how SAS handles data. This might seem like an obvious statement, but it is often overlooked by students in their rush to produce code that works. I often tell my class to step back for a moment and "try to think like SAS" before they even touch the keyboard. There are many key topics that students must understand in order to be successful SAS programmers. How does SAS compile and execute a program? What is the built-in loop that SAS uses to process data observation by observation? What are the coding differences when working with numeric and character data? How does SAS handle missing observations?

One concept that is a common source of confusion for students is how to tell SAS to treat rows versus columns. An example that we use in class is how to write a program to calculate a basic descriptive statistic, such as the mean. The approach that we discuss is to identify our goal (rows or columns) and then decide which SAS programming statements are appropriate by thinking like SAS. First, we decide whether we want to calculate the mean of an observation (a row) or the mean of a variable (a column). We also pause to consider other issues, such as the type of variable (in this case, numeric) and how SAS evaluates missing data. Once these concepts are understood, we can proceed with an appropriate method: using DATA step programming, a procedure such as MEANS, TABULATE, REPORT, or SQL, and so on. For more detailed information about this example, see the excellent user group paper "Many Means to a Mean," written by Shannon Pileggi for the Western Users of SAS Software conference in 2017. In addition, The Little SAS® Book and its companion book, Exercises and Projects for the Little SAS® Book, Sixth Edition, address these types of topics with easy-to-understand examples followed by thought-provoking exercises.

Here is an example of the type of question that our book of exercises and projects uses to address this type of concept.

Short answer question

  1. Is there a difference between calculating the mean of three variables X1, X2, and X3 using the three methods as shown in the following examples of code? Explain your answer.
    Avg1 = MEAN(X1,X2,X3);
    Avg2 = (X1 + X2 + X3) / 3;
    PROC MEANS; VAR X1 X2 X3; RUN;

Solution

In the book, we provide solutions for odd-numbered multiple choice and short answer questions, and hints for the programming exercises. Here is the solution for this question:

  1. The variable Avg1 that uses the MEAN function returns the mean of nonmissing arguments and will provide a mean value of X1, X2, and X3 for each observation (row) in the data set. The variable Avg2 that uses an arithmetic equation will also calculate the mean for each observation (row), but will return a missing value if any of the variables for that observation have a missing value. Using PROC MEANS will calculate the mean of nonmissing data for each variable (column) X1, X2, and X3 vertically.
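
The same contrast can be reproduced outside SAS. The following is a rough Python analogue, not SAS code: `None` stands in for a SAS missing value, and the function names and the two sample observations are hypothetical.

```python
from statistics import mean

# Hypothetical data: each tuple is one observation (X1, X2, X3); None marks a missing value.
rows = [(10, 20, 30), (10, None, 30)]

def mean_function(values):
    """Analogue of MEAN(X1,X2,X3): average only the nonmissing arguments."""
    present = [v for v in values if v is not None]
    return mean(present) if present else None

def arithmetic_mean(values):
    """Analogue of (X1 + X2 + X3) / 3: the result is missing if any operand is missing."""
    if any(v is None for v in values):
        return None
    return sum(values) / len(values)

def column_means(data):
    """Analogue of PROC MEANS: one mean per variable, over its nonmissing values."""
    return [mean_function(column) for column in zip(*data)]

avg1 = [mean_function(r) for r in rows]       # one value per row
avg2 = [arithmetic_mean(r) for r in rows]     # one value per row, None where a value is missing
col_means = column_means(rows)                # one value per column

print(avg1, avg2, col_means)
```

For the second observation, the row-wise function still returns a mean of the two values that are present, while the arithmetic version returns a missing result. The column-wise summary gives one mean per variable instead of one per observation.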

For more information about The Little SAS Book and its companion book of exercises and projects, check out these blogs:

Learning to think like SAS was published on SAS Users.