January 13, 2020
 

Are you ready to get a jump start on the new year? If you’ve been wanting to brush up on your SAS skills or learn something new, there’s no time like a new decade to start! SAS Press is releasing several new books in the upcoming months to help you stay on top of the latest trends and updates. Whether you are a beginner who is just starting to learn SAS or a seasoned professional, we have plenty of content to keep you at the top of your game.

Here is a sneak peek at what’s coming next from SAS Press.
 
 

For students and beginners

For beginners, we have Exercises and Projects for The Little SAS® Book: A Primer, Sixth Edition, the best-selling workbook companion to The Little SAS Book by Rebecca Ottesen, Lora Delwiche, and Susan Slaughter. The workbook will be updated to match the new The Little SAS® Book: A Primer, Sixth Edition. This hands-on workbook is designed to hone your SAS skills whether you are a student or a professional.

 

 

For data explorers of all levels

This free e-book explores the features of SAS® Visual Data Mining and Machine Learning, powered by SAS® Viya®. Users of all skill levels can visually explore data on their own while drawing on powerful in-memory technologies for faster analytic computations and discoveries. You can manually program with custom code or use the features in SAS® Studio, Model Studio, and SAS® Visual Analytics to automate your data manipulation and modeling. These programs offer a flexible, easy-to-use, self-service environment that can scale on an enterprise-wide level. This book introduces some of the many features of SAS Visual Data Mining and Machine Learning, including programming in the Python interface; new, advanced data mining and machine learning procedures; pipeline building in Model Studio; and model building and comparison in SAS® Visual Analytics.

 

 

For health care data analytics professionals

If you work with real world health care data, you know that it is increasingly common and comes from sources like observational studies, pragmatic trials, patient registries, and databases. Real World Health Care Data Analysis: Causal Methods and Implementation in SAS® by Doug Faries et al. brings together, in a single location, best practices for causal-based comparative effectiveness analyses based on real world data. Example SAS code is provided to make the analyses relatively easy and efficient. The book also presents several emerging topics of interest, including algorithms for personalized medicine, methods that address the complexities of time-varying confounding, extensions of propensity scoring to comparisons between more than two interventions, sensitivity analyses for unmeasured confounding, and implementation of model averaging.

 

For those at the cutting edge

Are you ready to take your understanding of IoT to the next level? Intelligence at the Edge: Using SAS® with the Internet of Things, edited by Michael Harvey, begins with a brief description of the Internet of Things, how it has evolved over time, and the importance of SAS’s role in the IoT space. The book continues with a collection of chapters showcasing SAS’s expertise in IoT analytics. Topics include using SAS Event Stream Processing to process real world events, connectivity, using the ESP Geofence window, applying analytics to streaming data, using SAS Event Stream Processing in a typical IoT reference architecture, the role of SAS Event Stream Manager in managing ESP deployments in an IoT ecosystem, how to use deep learning with Your IoT Digital, accounting for data quality variability in streaming GPS data for location-based analytics, and more!

 

 

 

Keep an eye out for these titles releasing in the next two months! We hope this list will help in your search for a SAS book that will get you to the next step in updating your SAS skills. To learn more about SAS Press, check out our up-and-coming titles. To receive exclusive discounts, make sure to subscribe to our newsletter.

Foresight is 2020! New books to take your skills to the next level was published on SAS Users.

December 12, 2019
 

Parts 1 and 2 of this blog post discussed exploring and preparing your data using SASPy. To recap, Part 1 discussed how to explore data using the SASPy interface with Python. Part 2 continued with an explanation of how to prepare your data to use it with a machine-learning model. This final installment continues the discussion about preparation by explaining techniques for normalizing your numerical data and one-hot encoding categorical variables.

Normalizing your data

Some of the numerical features in the examples from parts 1 and 2 have different ranges. The variable Age spans from 0 to 100 and Hours_per_week from 0 to 80. These ranges affect how much weight each feature carries when you apply them to a supervised learner. To ensure the equal treatment of each feature, you need to scale the numerical features.

The following example uses the SAS STDIZE procedure to scale the numerical features. PROC STDIZE is not a standard procedure available in the SASPy library. However, the good news is you can add any SAS procedure to SASPy! This feature enables Python users to become a part of the SAS community by contributing to the SASPy library, and it gives them the chance to use the vast number of powerful SAS procedures. To add PROC STDIZE to SASPy, see the instructions in the blog post Adding SAS procedures to the SASPy interface to Python.

After you add PROC STDIZE to SASPy, run the following code to scale the numerical features. The resulting values will be between 0 and 1.

# Create a SASPy stat object (sas is the SAS session started in part 1)
stat = sas.sasstat()
# Use the stdize function that was added to SASPy to scale the features
stat_result = stat.stdize(data=cen_data_logTransform,
                          procopts='method=range out=Sasuser.afternorm',
                          var='age education_num capital_gain capital_loss hours_per_week')

To use the STDIZE procedure in SASPy, you need to specify the method and the output data set in the statement options. To do this, use the "procopts" option: specify range as the method and set the "out" option to a new SAS data set, afternorm.

After running the STDIZE procedure we assign the new data set into a SAS data object.

norm_data = sas.sasdata('afternorm', libref='SASuser')

Now let's verify that the numerical features were transformed successfully:

norm_data.head(obs=5)


 

 

 

 

 
The output looks great! You have normalized the numerical features. So, it's time to tackle the last data-preparation step.
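To make the range method more concrete, here is a minimal pandas sketch of the same arithmetic. It is not the SASPy call itself. It assumes that method=range rescales each value as (x - min) / (max - min), and the small data frame and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample standing in for two numerical columns of the census data
df = pd.DataFrame({
    "age": [25, 40, 60, 90],
    "hours_per_week": [20, 40, 60, 80],
})


def range_scale(frame: pd.DataFrame) -> pd.DataFrame:
    """Rescale every column to the interval [0, 1] as (x - min) / (max - min)."""
    return (frame - frame.min()) / (frame.max() - frame.min())


scaled = range_scale(df)
print(scaled)
```

After scaling, the smallest value in each column becomes 0 and the largest becomes 1, so no single feature dominates simply because it is measured on a wider scale.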

One-Hot Encoding

Now that you have adjusted the numerical features, what do you do with the categorical features? The categories Relationship, Race, Sex, and so on are in string format. Statistical models cannot interpret these values, so you need to transform the values from strings into a numerical representation. The one-hot encoding process provides the transformation you need for your data.

To use one-hot encoding, use the LOGISTIC procedure from the SASPy Stat class. SASPy natively includes the LOGISTIC procedure, so you can go straight to coding. To generate the syntax for the code below, I followed the instructions from Usage Note 23217: Saving the coded design matrix of a model to a data set.

stat_proc_log = stat.logistic(data=norm_data,
                              procopts='outdesign=SASuser.test1 outdesignonly',
                              cls="workclass education_level marital_status occupation relationship race sex native_country / param=glm",
                              model="age = workclass education_level marital_status occupation relationship race sex native_country / noint")

To view the results from this code, create a SAS data object from the newly created data set, as shown in this example:

one_hot_data = sas.sasdata('test1', libref='SASuser')
display(one_hot_data.head(obs=5))

The output:

 

 

 

 

Our data was successfully one-hot encoded! For future reference, thanks to SAS’ analytical power, this step is not required. When you include a categorical feature in a CLASS statement, the procedure automatically generates a design matrix with the one-hot encoded feature. For more information, I recommend reading this post about different ways to create a design matrix in SAS.
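If you want to see what one-hot encoding produces without a SAS session, the following pandas sketch shows the idea on a tiny, made-up data frame. It uses pandas.get_dummies rather than PROC LOGISTIC, so treat it as an illustration of the concept and not as a reproduction of the design matrix above. The rows are hypothetical; only the column names come from the data set.

```python
import pandas as pd

# Hypothetical rows; the column names mirror two of the categorical features
df = pd.DataFrame({
    "sex": ["Male", "Female", "Female"],
    "race": ["White", "Black", "White"],
})

# Create one 0/1 indicator column per category level
one_hot = pd.get_dummies(df, columns=["sex", "race"], dtype=int)
print(one_hot)
```

Each original column is replaced by one indicator column per level, and exactly one indicator in each group is set to 1 for every row.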

Finally

You made it to the end of the journey! I hope everyone who reads these blogs can see the value that SASPy brings to the machine-learning community. Give SASPy a try, and you'll see the power it can bring to your SAS solutions.

Stay curious, keep learning, and (most of all) continue innovating.

Machine Learning with SASPy: Exploring and Preparing your data - Part 3 was published on SAS Users.

October 30, 2019
 

I suffer from arthritis. You can tell just by watching me walk: Depending on the day, I have a slight limp, which varies in severity based on a number of factors such as the time of day and recent physical activity. Years of treatment for my condition have shown me [...]

I applied AI to my arthritis assessment. Here’s what happened. was published on SAS Voices by Mark Wolff

September 28, 2019
 

This article continues a series that began with Machine learning with SASPy: Exploring and preparing your data (part 1). Part 1 showed you how to explore data using SASPy with Python. Here, in part 2, you will learn how to begin to prepare your data to use it within a machine-learning model.

Review part 1 if needed and ensure you still have the ADULT data set ready to use. (The data set is available from the UCI Machine Learning Repository.) If not, take some time to download and explore the data again, as described in part 1.

Preparing your data

Preparing data is a necessary step before you apply the data to a model. There are string values, skewed data, and missing data points to consider. Be sure to clear the missing values from the data set first, so you can move on to the other methods.

For this exercise, you will explore how to transform skewed features using SASPy and Pandas.

First, you must separate the income data from the data set, because the income feature will later become your target variable to model.

Drop the income data and turn the pandas data frame back into a SAS data object, with the following code:
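As a rough sketch of the pandas half of this step, the snippet below separates a target column from the remaining features. The data frame, its values, and the column name income are assumptions made for illustration. Converting the result back into a SAS data object needs a live SAS session (SASPy sessions provide a df2sd method for that), so that part is left out here.

```python
import pandas as pd

# Hypothetical rows standing in for the census data; "income" is an assumed column name
df = pd.DataFrame({
    "age": [39, 50, 38],
    "capital_gain": [2174, 0, 0],
    "income": ["<=50K", "<=50K", ">50K"],
})

# Keep the target on its own, then drop it from the feature set
income_raw = df["income"]
features_raw = df.drop(columns="income")

print(features_raw.columns.tolist())
```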

Now, let's take a second look at the numerical features. You will use SASPy to create a histogram of all numerical features. Typically, the Matplotlib library is used, but SASPy provides great opportunities to visualize the data.

The following graphs represent the expected output.

Taking a look at the numerical features, two stick out: CAPITAL_GAIN and CAPITAL_LOSS are highly skewed. Highly skewed features can affect your model, because many models work best when feature values are roughly normally distributed rather than dominated by a long tail. To fix this, you will apply a logarithmic transformation using pandas and then visualize the change using SASPy.

Transforming skewed features

First, you need to change the SAS data object back into a pandas data frame and assign the skewed features to a list variable:

Then, use pandas to apply the logarithmic transformation and convert the pandas data frame back into a SAS data object:
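The logarithmic transformation itself can be sketched in pandas as follows. This is a minimal illustration under a few assumptions: the values are made up, the column names are written in lowercase, and the transformation is log(x + 1) so that zero values stay defined. The conversion to and from a SAS data object is omitted because it needs a live SAS session.

```python
import numpy as np
import pandas as pd

# Hypothetical, heavily right-skewed values standing in for the real columns
df = pd.DataFrame({
    "capital_gain": [0, 0, 0, 0, 500, 2000, 15000, 99999],
    "capital_loss": [0, 0, 0, 0, 0, 1500, 1900, 4000],
})
skewed = ["capital_gain", "capital_loss"]

transformed = df.copy()
# log(x + 1) keeps zeros defined, because log(0) is undefined
transformed[skewed] = df[skewed].apply(np.log1p)

# Compare the sample skewness before and after the transformation
print(df[skewed].skew())
print(transformed[skewed].skew())
```

Comparing the skewness before and after gives a quick numerical check to go with the histograms.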

Display transformed data

Now, you are ready to visualize these changes using SASPy. In the previous section, you used histograms to display the data. To display this transformation, you will use the SASPy SASUTIL class. Specifically, you will use a procedure typically used in SAS, the UNIVARIATE procedure.

To use the SASUTIL class with SASPy, you first need to create a Python object that uses the SASUTIL class:

 

Now, use the univariate function from SASPy:

 

Using the UNIVARIATE procedure, you can set axis limits to the output histograms so that you can see the data in a clearer format. After running the selected code, you can use the dir() function to verify successful submission:

 

Here is the output:

 

 

 

The function calculates various descriptive statistics and plots. However, for this example, the focus is on the histogram.

 

Here are the results:

Wrapping up

You have now transformed the skewed data. Pandas applied the logarithmic transformation and SASPy displayed the histograms.

Up next

In the next and final article of this series, you will continue preparing your data by normalizing numerical features and one-hot encoding categorical features.

Machine learning with SASPy: Exploring and preparing your data (part 2) was published on SAS Users.

September 10, 2019
 

SASPy is a powerful Python library that interfaces with SAS and can help with your machine-learning solutions. SASPy was created for Python programmers to leverage the power of SAS within their Python scripts. If you are not familiar with SASPy, see the following resources:

This blog post shows you how powerful SASPy can be. SASPy helps you produce visuals and descriptive statistics quickly and accurately. To demonstrate this capability, let’s explore and prepare your data using SASPy.

Prerequisites

To get started, here is what you need:

  • The Census Income data set from the University of California Irvine’s Machine Learning Repository
    • Download the adult.data data set from the data folder.
    • Remove the missing values prior to exploring and preparing.
  • SAS® 9.4, SAS® Viya® 3.1, or a later version of either
  • Jupyter Notebook
  • SASPy (To install SASPy, refer to the installation and configuration documentation.)

After verifying you have completed the above requirements, you can start your Jupyter Notebook and begin coding using SASPy.

Let's start by importing the libraries we will use in this example.

  1. Import the libraries:
  2. Start your SAS session. Use the command below to establish a connection.

A "SAS Connection established" message returns once connected. This example uses a local connection to SAS. However, you can use an STDIO connection or an IOM connection to SAS if you prefer. For more information, see SAS Configuration.

  3. Read in your data set. You have two options: You can either read in the data set using pandas and then read the data into a SAS data object, or you can read it directly into a SAS data object. This example shows reading the data directly into a SAS data object.

To access existing data in a SAS session, use the SAS data object. A SAS data object can be used to do the following:

  • Create various graphs such as histograms, scatter plots, heatmaps, and so on.
  • Display descriptive statistics.
  • Transfer data in between a pandas data frame and a SAS data object.

The SAS data object is versatile. To view all of its capabilities, refer to the SAS Data Object documentation.

  4. Verify whether you successfully read in your data set:

Similar to pandas, SASPy has a head function to display data points. The only difference is in how you specify the number of data points you would like to see: with a SAS data object, you need to include “obs=n”.

Exploring your data

SASPy provides many options to explore your data. This example uses a combination of SASPy functions and pandas to explore the data.

  1. Determine the number of records in your data:
  2. Determine how many individuals earn more or less than $50,000. For this step, this example uses pandas to demonstrate how you can switch between using SASPy and pandas seamlessly.
    1. Change your SAS data object into a pandas data frame:
    2. Use the value_counts function to determine how many individuals earn more or less than $50,000:
    3. View the percent of individuals whose income is greater than $50,000:
    4. Display all your values to gain an understanding of your data:

As you can see from the output above, there are 30,162 records. Of these, 7,508 individuals earn more than $50,000, and 22,654 individuals earn $50,000 or less. Overall, about 25% of individuals earn more than $50,000.

  3. It is also important to look at your numerical features. Use SASPy to get a quick description of your data:

As you can see above, the table lists calculated values for the mean, median, and other valuable statistical values.
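The counting and percentage steps above can be sketched in pandas as follows. This is an illustration, not the original listing: it rebuilds an income column from the counts reported above, and the labels ">50K" and "<=50K" and the column name income are assumptions.

```python
import pandas as pd

# Rebuild an income column from the counts reported above (labels are assumed)
income = pd.Series([">50K"] * 7508 + ["<=50K"] * 22654, name="income")

n_records = len(income)
counts = income.value_counts()
n_greater_50k = counts[">50K"]
n_at_most_50k = counts["<=50K"]
percent_greater = n_greater_50k / n_records * 100

print(f"Total records: {n_records}")
print(f"Earning more than $50,000: {n_greater_50k}")
print(f"Earning at most $50,000: {n_at_most_50k}")
print(f"Percent earning more than $50,000: {percent_greater:.2f}%")
```

With these counts, the share comes to 7,508 / 30,162, or roughly 24.89%, which matches the figure of about 25% given above.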

Exploring your data is just the first step in generating your machine-learning solutions. This blog post described how to generate basic statistical values and display output using SASPy, pandas, and Python. Parts 2 and 3 of this blog post cover how to prepare your data using SASPy and then apply it to a machine-learning model.

For more information about the data set, see the UC Irvine Machine Learning Repository.

Machine learning with SASPy: Exploring and preparing your data (part 1) was published on SAS Users.

September 9, 2019
 

Editor's Note: This article was translated and edited by SAS USA and was originally written by Makoto Unemi. The original text is here.

SAS previously provided SAS Scripting Wrapper for Analytics Transfer (SWAT), a package for using SAS Viya functions from various general-purpose programming languages such as Python.

In addition to SWAT, SAS launched Deep Learning Python (DLPy), a higher-level API package for Python, making it possible to use SAS Viya functions more efficiently from Python. In this article, I outline what DLPy is and how it is implemented.

About DLPy

DLPy is a high-level Python API package created for the deep learning and image action sets in SAS Viya 3.3 and later. DLPy provides an API similar to Keras to make deep learning and image processing code more efficient to write. With just a little rewriting of existing Keras code, it is possible to execute the processing on SAS Viya.

For example, below is a Convolutional Neural Network (CNN) layer definition; you can see that it is very similar to Keras.

The layers supported by DLPy are: InputLayer, Conv2d, Pooling, Dense, Recurrent, BN, Res, Proj, and OutputLayer. The following is an example of training.

DLPy functions

This section introduces some of DLPy's functions (partial excerpts), using as an example a CNN that is trained on multiple dolphin and giraffe images and then applied to test images.

Implementation of major deep learning networks

DLPy offers the following pre-built deep learning models: VGG11/13/16/19, ResNet34/50/101/152, wide_resnet, and dense_net.

The following models also offer weights pre-trained on ImageNet data (you can use these weights for your own tasks through transfer learning): VGG16, VGG19, ResNet50, ResNet101, and ResNet152. The following is an example of transferring ResNet50 pre-trained weights.

Basis for CNN predictions

Using the heat_map_analysis() method, you can output a color heat map and check which parts of the image the model focused on.

In addition, you can use the get_feature_maps() method to get the feature maps for each layer of the CNN, and then use the feature_maps.display() method to display and inspect the feature map of a specified layer.

The following is the feature map output for layer 1.

The following is the feature map output for layer 18.

Support functions for deep learning and image processing tasks

resize() method: Resizes image data

as_patches() method: Augments image data (generates patches from the original image)

two_way_split() method: Splits data (training, testing)

plot_network() method: Draws the structure of the defined deep learning layers (network) as a graphical diagram

plot_training_history() method: Displays the training history across iterations

predict() method: Displays prediction (scoring) results

plot_predict_res() method: Displays classification results

And of course, you can use DLPy to get data from a SAS Viya in-memory session, pass it to your local client, and convert it to common data formats like numpy arrays and Pandas DataFrames. The converted data can be smoothly supplied to models of other open source packages such as scikit-learn.

Regarding image classification using DLPy, videos are also available in the Deep Learning with Python (DLPy) Demo Series section of the DLPy product page.

SAS Viya: Package for Python API for deep learning and image processing: DLPy was published on SAS Users.

September 3, 2019
 

The startup ecosystem is dynamic and the flow of venture capital into tech is at an all-time high. Billions of dollars are invested in tech startups every year. Many tech startups market themselves as ‘powered by AI’ and pitch investors with buzzword laden phrases such as, ‘we leverage state of [...]

7 ways SAS empowers startups with artificial intelligence and machine learning was published on SAS Voices by Avinash Sooriyarachchi

September 3, 2019
 

In part one of this blog series, we introduced hybrid marketing as a method that combines both direct and digital marketing capabilities while absorbing insights from machine learning. In part two, we will share perspectives on: How SAS Customer Intelligence 360 completes analytic's last mile. How campaign management processes can easily [...]

SAS Customer Intelligence 360: Hybrid marketing and analytic's last mile [Part 2] was published on Customer Intelligence Blog.

August 26, 2019
 

The marketing industry has never had greater access to data than it does today. However, data alone does not drive your marketing organization. Decisions do. And with all the recent hype regarding the potential of AI, a successful cross-channel campaign is propelled by a personalized, data-driven approach injected with machine [...]

SAS Customer Intelligence 360: Hybrid marketing and analytic's last mile [Part 1] was published on Customer Intelligence Blog.