artificial intelligence

1月 162020

Using Customer Lifetime Value in your business decision making is often important and crucial for success. Businesses that are customer-centric often spend thousands of dollars acquiring new customers, “on-boarding” new customers, and retaining those customers. If your business margins are thin, then it can often be months or quarters before you start to turn a profit on a particular customer. Additionally, some business models will segment the worth of their customers into categories that will often give different levels of service to the more “higher worth” customers. The metric most often used for that is called Customer Lifetime Value (CLV). CLV is simply a balance sheet look at the total cost spent versus the total revenue earned over a customer’s projected tenure or “life.”

In this blog, we will focus on how a business analyst can build a functional analytical dashboard for a fictional company that is seeing its revenue, margins, and a customer’s lifetime value decrease and what steps they can take to correct that.

We will cover 3 main areas of interest:

  1. First, screenshots of SAS Visual Analytic reports, using Customer Lifetime Value and how you can replicate them.
  2. Next, we will look at the modeling that we did in the report, with explanations on how we got used the results in subsequent modeling.
  3. Lastly, we talk about one example of how we scored and deployed the model, and how you can do the same.

Throughout this blog, I will also highlight areas where SAS augments our software with artificial intelligence to improve your experience.

1. State of the company

First, we will look at the state of the company using the dashboard and take note of any problems.

Our dashboard shows the revenue of our company over the last two years as well as a forecast for the next 6 months. We see that revenue has been on the decline in recent years and churns have been erratically climbing higher.

Our total annual revenue was 112M last year with just over 5,000 customers churning.

So far this year, our revenue is tracking low and sits at only 88M, but the bad news is that we have already tripled last year's churn total.

If these trends continue, we stand to lose a third of our revenue!

2. The problems

Now, let’s investigate as to where the problems are and what can be done about them.

If we look at our current metrics, we can see some interesting points worth investigating.

The butterfly chart on the right shows movement between customer loyalty tiers within each region of the country with the number of upgrades (on the right) and downgrades (on the left).

The vector plots show us information over multiple dimensions. These show us the difference between two values and the direction it is heading. For example, on the left, we see that Revenue is pointed downward while churns (x axis) are increasing.

The vector plot on the right shows us the change in margin from year to year as well as the customer lifetime value.

What’s interesting here is that there are two arrows that are pointing up, indicating a rise in customer lifetime value. Indeed, if we were to click on the map, we would see that these two regions are the same two that have a net increase in Loyalty Tier.

This leads me to believe that a customer’s tier is predictive of margin. Let’s investigate it further.

3. Automated Analysis

We will use the Automated Analysis feature within Visual Analytics to quickly give us the drivers of CLV.

This screenshot shows an analysis that SAS Visual Analytics(VA) performed for me automatically. I simply told VA which variable I was interested in analyzing and within a matter of seconds, it ran a series of decision trees to produce this summary. This is an example of how SAS is incorporating AI into our software to improve your experience.

Here we can see that loyalty tier is indeed the most important factor in determining projected annual margin (or CLV).

4. Influential driver

Once identified, the important driver will be explored across other dimensions to assess how influential this driver might be.

A cursory exploration of Loyalty Tier indicates that yes, loyalty tier, particularly Tier 5, has a major influence on revenue, order count, repeat orders, and margin.

5. CLV comparison models

We will create two competing models for CLV and compare them.

Here on our modeling page are two models that I’ve created to predict CLV. The first one is a Linear Regression and the second is a Gradient Boosting model. I've used Model Comparison to tell me that the Linear Regression model delivers a more accurate prediction and so I use the output of that model as input into a recommendation engine.

6. Recommendation engine

Based on our model learnings and the output of the model, we are going to build a recommendation engine to help us with determine what to do with each customer.

Represented here, I built a recommendation engine model using the Factorization Machine algorithm.

Once we implement our model, customers are categorized more appropriately and we can see that it has had an impact on revenue and the number of accounts is back on the rise!


Even though Customer Lifetime Value has been around for years, it is still a valuable metric to utilize in modeling and recommendation engines as we have seen. We used it our automated analysis, discovered that it had an impact on revenue, we modeled future values of CLV and then incorporated those results into a recommendation engine that recommended new loyalty tiers for our customers. As a result, we saw positive changes in overall company revenue and churn.

To learn more, please check out these resources:

How to utilize Customer Lifetime Value with SAS Visual Analytics was published on SAS Users.

10月 302019

I suffer from arthritis. You can tell just by watching me walk: Depending on the day, I have a slight limp, which varies in severity based on a number of factors such as the time of day and recent physical activity. Years of treatment for my condition have shown me [...]

I applied AI to my arthritis assessment. Here’s what happened. was published on SAS Voices by Mark Wolff

9月 092019

Editor's Note: This article was translated and edited by SAS USA and was originally written by Makoto Unemi. The original text is here.

SAS previously provided SAS Scripting Wrapper for Analytics Transfer (SWAT), a package for using SAS Viya functions from various general-purpose programming languages ​​such as Python.

In addition to SWAT, SAS launched Deep Learning Python (DLPy), a higher-level API package for Python, making it possible to use SAS Viya functions more efficiently from Python. In this article I outline more about what DLPy is and how it's implementation.

About DLPy

DLPy is a high-level package for the Python API created for deep learning and image action set after Viya3.3. DLPy provides an API similar to Keras to improve the efficiency of deep learning and image processing coding. With just a little rewriting of the existing Keras code, it is possible to execute the processing on SAS Viya.

For example, below is an example of a Convolutional Neural Network (CNN) layer definition; you can see that it is very similar to Keras.

The layers supported by DLPy are: InputLayer, Conv2d, Pooling, Dense, Recurrent, BN, Res, Proj, and OutputLayer. The following is an example of learning.

DLPy functions

Introducing DLPy's functions (partial excerpts), taking as an example the learning of multiple dolphins and giraffe images using CNN and applying test images to the model.

Implementation of major deep learning networks

DLPy offers the following pre-built deep learning models: VGG11/13/16/19, ResNet34/50/101/152, wide_resnet, and dense_net.

The following models also offer pre-trained weights using ImageNet data (these weights can be used for unique tasks by transfer learning): VGG16, VGG19, ResNet50, ResNet101, and ResNet152. The following is an example of transferring ResNet50 pre-trained weights.

CNN judgment basis information

Using the heat_map_analysis() method, you can output a colorful heat map and check where you focused on the image.

In addition, the get_feature_maps() method is used to get the feature map of each layer of CNN, and feature_maps.display() method is used to specify and display the obtained feature map layer and check can also do.

The following is the output result of layer 1 feature map.

The following is the output result of layer 18 feature map.

Deep learning & image processing related task support function

resize() method: Resize image data

as_patches() method: Image data expansion (generates a patch from the original image)

two_way_split() method: Data split (learning, testing)

plot_network() method: draws the structure of the defined deep learning layer (network) as a graphical diagram

plot_training_history() method: Iterative learning history display

predict() method: Display prediction (scoring) results

plot_predict_res() method: Display classification results

And of course, you can use DLPy to get data from a SAS Viya in-memory session, pass it to your local client, and convert it to common data formats like numpy arrays and Pandas DataFrames. The converted data can be smoothly supplied to models of other open source packages such as scikit-learn.

Regarding image classification using DLPy, videos are also available in the Deep Learning with Python (DLPy) Demo Series section of the DLPy product page.

SAS Viya: Package for Python API for deep learning and image processing: DLPy was published on SAS Users.

9月 032019

The startup ecosystem is dynamic and the flow of venture capital into tech is at an all-time high. Billions of dollars are invested in tech startups every year. Many tech startups market themselves as ‘powered by AI’ and pitch investors with buzzword laden phrases such as, ‘we leverage state of [...]

7 ways SAS empowers startups with artificial intelligence and machine learning was published on SAS Voices by Avinash Sooriyarachchi

4月 302019

Artificial intelligence is the attention-grabbing, overhyped, shiny object that every organization is searching to make use of. Yes, it is overhyped, but it’s also very real and very powerful. “We do not want to add to the hype. We do not want to add to the confusion. We want to [...]

Bringing AI, ML and analytics to life - SAS Global Forum Tech Connection was published on SAS Voices by Shannon Heath

4月 122019

At the risk of oversimplifying, I think of artificial intelligence as what becomes possible after you’ve fully embraced analytics and you’re starting to get bolder about how to use it. Your models are getting better, your predictions are more accurate, your results are stronger and over all, confidence grows in [...]

6 thinks you didn't know about AI was published on SAS Voices by John Balla

4月 122019

At the risk of oversimplifying, I think of artificial intelligence as what becomes possible after you’ve fully embraced analytics and you’re starting to get bolder about how to use it. Your models are getting better, your predictions are more accurate, your results are stronger and over all, confidence grows in [...]

6 thinks you didn't know about AI was published on SAS Voices by John Balla

4月 062019

Recently, the North Carolina Human Trafficking Commission hosted a regional symposium to help strengthen North Carolina’s multidisciplinary response to human trafficking. One of the speakers shared an anecdote from a busy young woman with kids. She had returned home from work and was preparing for dinner; her young son wanted [...]

Countering human trafficking using text analytics and AI was published on SAS Voices by Tom Sabo

4月 062019

Recently, the North Carolina Human Trafficking Commission hosted a regional symposium to help strengthen North Carolina’s multidisciplinary response to human trafficking. One of the speakers shared an anecdote from a busy young woman with kids. She had returned home from work and was preparing for dinner; her young son wanted [...]

Countering human trafficking using text analytics and AI was published on SAS Voices by Tom Sabo

4月 032019

Structuring a highly unstructured data source

Human language is astoundingly complex and diverse. We express ourselves in infinite ways. It can be very difficult to model and extract meaning from both written and spoken language. Usually the most meaningful analysis uses a number of techniques.

While supervised and unsupervised learning, and specifically deep learning, are widely used for modeling human language, there’s also a need for syntactic and semantic understanding and domain expertise. Natural Language Processing (NLP) is important because it can help to resolve ambiguity and add useful numeric structure to the data for many downstream applications, such as speech recognition or text analytics. Machine learning runs outputs from NLP through data mining and machine learning algorithms to automatically extract key features and relational concepts. Human input from linguistic rules adds to the process, enabling contextual comprehension.

Text analytics provides structure to unstructured data so it can be easily analyzed. In this blog, I would like to focus on two widely used text analytics techniques: information extraction and entity resolution.

Information Extraction

Information Extraction (IE) automatically extracts structured information from an unstructured or semi-structured text data type -- for example, a text file, to create new structured text data. IE works at the sub-document level, in contrast with techniques such as categorization, that work at the document or record level. Therefore, the results of IE can further feed into other analyses, like predictive modeling or topic identification, as features for those processes. IE can also be used to create a new database of information. One example is the recording of key information about terrorist attacks from a group of news articles on terrorism. Any given IE task has a defined template, which is a (or a set of) case frame(s) to hold the information contained in a single document. For the terrorism example, a template would have slots corresponding to the perpetrator, victim, and weapon of the terroristic act, and the date on which the event happened. An IE system for this problem is required to “understand” an attack article only enough to find data corresponding to the slots in this template. Such a database can then be used and analyzed through queries and reports about the data.

In their new book, SAS® Text Analytics for Business Applications: Concept Rules for Information Extraction Models, authors Teresa Jade, Biljana Belamaric Wilsey, and Michael Wallis, give some great examples of uses of IE:

"One good use case for IE is for creating a faceted search system. Faceted search allows users to narrow down search results by classifying results by using multiple dimensions, called facets, simultaneously. For example, faceted search may be used when analysts try to determine why and where immigrants may perish. The analysts might want to correlate geographical information with information that describes the causes of the deaths in order to determine what actions to take."

Another good example of using IE in predictive models is analysts at a bank who want to determine why customers close their accounts. They have an active churn model that works fairly well at identifying potential churn, but less well at determining what causes the churn. An IE model could be built to identify different bank policies and offerings, and then track mentions of each during any customer interaction. If a particular policy could be linked to certain churn behavior, then the policy could be modified to reduce the number of lost customers.

Reporting information found as a result of IE can provide deeper insight into trends and uncover details that were buried in the unstructured data. An example of this is an analysis of call center notes at an appliance manufacturing company. The results of IE show a pattern of customer-initiated calls about repairs and breakdowns of a type of refrigerator, and the results highlight particular problems with the doors. This information shows up as a pattern of increasing calls. Because the content of the calls is being analyzed, the company can return to its design team, which can find and remedy the root problem.

Entity Resolution and regular expressions

Entity Resolution is the technique of recognizing when two observations relate to the same entity (thing, person, company), despite having been described differently. And conversely, recognizing when two observations do not relate to the same entity, despite having been described similarly. For example, you are listed in one data base as S Roberts, Sian Roberts, S.Roberts. All refer to the same person but would be treated as different people in an analysis unless they are resolved (combined to one person).

Entity resolution can be performed as part of a data pre-processing step or as text analysis. Basically one helps resolve multiple entries (cleans the data) and the other resolves reference to a single entity to extract meaning, for example, pronoun resolution - when “it” refers to a particular company mentioned earlier in the text. Here is another example:

Assume each numbered item is a separate observation in the input data set:
1. SAS Institute is a great company. Our company has a recreation center and health care center for employees.
2. Our company has won many awards.
3. SAS Institute was founded in 1976.

The scoring output matches are below; note that the document ID associated with each match aligns with the number before the input document where the match was found.

Unstructured data clean-up

In the following section we focus on the pre-processing clean-up of the data. Unstructured data is the most voluminous form of data in the world, and analysts rarely receive it in perfect condition for processing. In other words, textual data needs to be cleaned, transformed, and enhanced before value can be derived from it.

A regular expression is a pattern that the regular expression engine attempts to match in input. In SAS programming, regular expressions are seen as strings of letters and special characters that are recognized by certain built-in SAS functions for the purpose of searching and matching. Combined with other built-in SAS functions and procedures, such as entity resolution, you can realize tremendous capabilities. Matthew Windham, author of Unstructured Data Analysis: Entity Resolution and Regular Expressions in SAS®, gives some great examples of how you might use these techniques to clean your text data in his book. Here we share one of them:

"As you are probably familiar with, data is rarely provided to analysts in a form that is immediately useful. It is frequently necessary to clean, transform, and enhance source data before it can be used—especially textual data."

Extract, Transform, and Load (ETL) ETL is a general set of processes for extracting data from its source, modifying it to fit your end needs, and loading it into a target location that enables you to best use it (e.g., database, data store, data warehouse). We’re going to begin with a fairly basic example to get us started. Suppose we already have a SAS data set of customer addresses that contains some data quality issues. The method of recording the data is unknown to us, but visual inspection has revealed numerous occurrences of duplicative records. In this example, it is clearly the same individual with slightly different representations of the address and encoding for gender. But how do we fix such problems automatically for all of the records?

First Name Last Name DOB Gender Street City State Zip Robert Smith 2/5/1967 M 123 Fourth Street Fairfax, VA 22030 Robert Smith 2/5/1967 Male 123 Fourth St. Fairfax va 22030

Using regular expressions, we can algorithmically standardize abbreviations, remove punctuation, and do much more to ensure that each record is directly comparable. In this case, regular expressions enable us to perform more effective record keeping, which ultimately impacts downstream analysis and reporting. We can easily leverage regular expressions to ensure that each record adheres to institutional standards. We can make each occurrence of Gender either “M/F” or “Male/Female,” make every instance of the Street variable use “Street” or “St.” in the address line, make each City variable include or exclude the comma, and abbreviate State as either all caps or all lowercase. This example is quite simple, but it reveals the power of applying some basic data standardization techniques to data sets. By enforcing these standards across the entire data set, we are then able to properly identify duplicative references within the data set. In addition to making our analysis and reporting less error-prone, we can reduce data storage space and duplicative business activities associated with each record (for example, fewer customer catalogs will be mailed out, thus saving money).

Your unstructured text data is growing daily, and data without analytics is opportunity yet to be realized. Discover the value in your data with text analytics capabilities from SAS. The SAS Platform fosters collaboration by providing a toolbox where best practice pipelines and methods can be shared. SAS also seamlessly integrates with existing systems and open source technology.

Further Resources:
Natural Language Processing: What it is and why it matters

White paper: Text Analytics for Executives: What Can Text Analytics Do for Your Organization?

SAS® Text Analytics for Business Applications: Concept Rules for Information Extraction Models, by Teresa Jade, Biljana Belamaric Wilsey, and Michael Wallis

Unstructured Data Analysis: Entity Resolution and Regular Expressions in SAS®, by Matthew Windham

Text analytics explained was published on SAS Users.