In my new book, I explain how segmentation and clustering can be accomplished in three ways: coding in SAS, point-and-click in SAS Visual Statistics, and point-and-click in SAS Visual Data Mining and Machine Learning using SAS Model Studio. These three analytical tools support many types of segmentation, and one of the most common methods is clustering, which still ranks among the top 10 machine learning methods in several surveys across the globe.
One of the best ways to learn about your customers, patrons, clients, or patients (or the observations in almost any data set) is clustering: finding groups whose members share similar characteristics while the groups themselves differ in their combinations of attributes. You can use this method to understand your customers or to profile various data sets, and you can do it in an environment where SAS and open-source software work seamlessly on a unified platform. (While open source is not discussed in my book, stay tuned for future blog posts on clustering and segmentation.)
Let’s look at an example of clustering. Being able to examine your data quickly and easily is a real benefit of SAS Visual Statistics.
Initial data exploration and preparation
To demonstrate the simplicity of clustering in SAS Visual Statistics, the CUSTOMERS data set is used here, as it is throughout the book. I have loaded CUSTOMERS into memory, and it is now listed in the active tab. I can explore and visualize this data by right-clicking it and selecting Actions and then Explore and Visualize, which opens the SAS Visual Analytics page.
I have added four new computed items by taking the natural logarithm of four attributes and will use these transformed attributes in the clustering.
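Outside of SAS, the same transformation step can be sketched in a few lines of Python with pandas (the column names and values here are hypothetical, purely for illustration); note that rows with non-positive values cannot be log-transformed:

```python
import numpy as np
import pandas as pd

# hypothetical customer attributes (for illustration only)
customers = pd.DataFrame({
    "purchase_amt": [120.0, 45.5, 0.0, 310.2, -15.0, 78.9],
    "visits":       [3, 0, 12, 7, 5, 1],
})

# the natural log is undefined for non-positive values, so keep only rows
# where every attribute is strictly positive before transforming
positive = (customers > 0).all(axis=1)
log_customers = np.log(customers[positive])

print(f"{positive.sum()} of {len(customers)} rows usable after the log transform")
```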
Performing simple clustering
Clustering in SAS Visual Statistics can be found by selecting the Objects icon on the left and scrolling down to see the SAS Visual Statistics menus as seen below. Dragging the Cluster icon onto the Report template area will allow you to use that statistic object and visualize the clusters.
Once the Cluster object is on the template, adding data items to the Data Roles pane is as simple as checking the four computed data items.
Click OK, and the cluster analysis runs immediately, producing a report like the one below, in which five clusters were found using the four data items.
There are 105,456 total observations in the data set; however, only 89,998 were used for the analysis. The remaining observations were dropped because the natural logarithm could not be computed for them. To see how to handle that situation easily, please pick up a copy of Segmentation Analytics with SAS Viya. Let me know if you have any questions or comments.
Marketers have used segmentation as a technique to target customers for communications, products, and services since the introduction of customer relationship management (i.e., CRM) and database marketing. Within the context of segmentation, there are a variety of applications, spanning consumer demographics, geography, behavior, psychographics, life events, and cultural background. Over time, segmentation has proven its value, and brands continue to use this strategy across every stage of the customer journey:
Let's provide a proper definition for this marketing technique. As my SAS peer and friend Randy Collica stated in his influential book on this subject:
"Segmentation is in essence the process by which items or subjects are categorized or classified into groups that share similar characteristics. These techniques can be beneficial in classifying customer groups. Typical marketing activities seek to improve their relationships with prospective and current customers. The better you know about your customer's needs, desires, and their purchasing behaviors, the better you can construct marketing programs designed to fit their needs, desires, and behaviors."
"In an era of big data, hyperconnected digital customers and hyper-personalization, segmentation is the cornerstone of customer insight and understanding across the modern digital business. The question is: Is your segmentation approach antiquated or advanced?"
This provides a nice transition to review the types of segmentation methods I observe with clients. It ultimately boils down to two categories:
Business rules for segmentation (i.e., non-quantitative)
Analytical segmentation (i.e., quantitative)
Let's dive deeper into each of these...
Business Rules For Segmentation
This technique centers on a qualitative (non-quantitative) approach: customer attributes are conceptualized through conversations with business stakeholders and customer focus groups. This information represents consumers' experiential behavior, and analysts assign subjective segments for targeted campaign treatments. Although directionally useful, in today's era of data-driven marketing this approach will, in my opinion, produce suboptimal results.
Analytical Segmentation
Within this category, there are two approaches marketing analysts can select from:
Supervised (i.e., classification)
Unsupervised (i.e., clustering)
Supervised segmentation refers to a family of pattern-analysis approaches. Supporters of this method stress that the actionable deliverable of the analysis is a set of homogeneous segments that can be profiled and that inform targeting strategies across the customer lifecycle. The term supervised refers to specific data mining (or data science) techniques, such as decision trees, random forests, gradient boosting, or neural networks. One key difference is that a supervised analysis requires a dependent (or target) variable, whereas no dependent variable is designated in unsupervised models. The dependent variable is usually a 1/0 (or yes/no) flag that matches the objective of the segmentation. Examples include:
Product purchase to identify segments with higher probabilities to convert on what you offer.
Upsell/cross-sell to identify segments who are likely to deepen their relationship with your brand.
Retention to identify segments most likely to unsubscribe, attrite, or defect.
Click behavior to identify segments of anonymous web traffic likely to click on your served display media.
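To make the supervised idea concrete, here is a minimal sketch in Python with scikit-learn (the attributes and the churn flag are synthetic stand-ins, not data from any of the applications above). A shallow decision tree is grown against a 1/0 target, and each leaf becomes a profileable segment:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 500

# synthetic customer attributes: tenure in months and monthly spend
X = np.column_stack([rng.uniform(1, 60, n), rng.uniform(10, 200, n)])

# synthetic 1/0 target: shorter-tenure customers churn more often
churn = (rng.random(n) < 0.8 / (1 + 0.05 * X[:, 0])).astype(int)

# a shallow tree keeps the number of segments small and easy to profile
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, churn)

# each leaf of the fitted tree is a segment; assign customers to leaves
segments = tree.apply(X)
for leaf in np.unique(segments):
    mask = segments == leaf
    print(f"segment {leaf}: n={mask.sum()}, churn rate={churn[mask].mean():.2f}")
```

The max_depth setting is the practical lever here: deeper trees predict better but produce more segments than a campaign team can act on.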
After applying these techniques, analysts can deliver a visual representation of the segments to help explain the results to nontechnical stakeholders. Here is a video demonstration of SAS Visual Analytics within the context of supervised segmentation applied to a brand's digital traffic through the use of analytical decision trees:
Critics of this approach argue that the resulting model is actually a predictive model rather than a segmentation model because of the probability prediction output. The distinction lies in the use of the model. Segmentation is classifying customer bases into distinct groups based on multidimensional data and is used to suggest an actionable roadmap to design relevant marketing, product and customer service strategies to drive desired business outcomes. As long as we stay focused on this premise, there is nothing to debate.
On the other hand, unsupervised approaches, such as clustering, association/apriori, principal components, or factor analysis, are multivariate segmentation techniques that group consumers based on similar characteristics. The goal is to explore the data to find intrinsic structure. K-means cluster analysis is the most popular technique I see with clients for interdependent segmentation, in which all applicable data attributes are considered simultaneously and there is no split between dependent (target) and independent (predictor) variables. Here is a video demonstration of SAS Visual Statistics for unsupervised segmentation applied to a brand's digital traffic (including inferred attributes sourced from a digital data management platform) through the use of K-means clustering:
Keep in mind that unsupervised applications are not given training examples (i.e., there is no 1/0 or yes/no flag to bias the formation of the segments). It is therefore fair to interpret the results of a K-means clustering analysis as more data-driven, and hence more natural and better suited to the underlying structure of the data. This advantage is also its major drawback: it can be difficult to judge the quality of clustering results conclusively without running live campaigns.
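The mechanics of the unsupervised side can be sketched just as briefly. The Python example below (scikit-learn; the synthetic attributes merely stand in for the digital-traffic data mentioned above) standardizes the inputs and runs K-means with all attributes considered simultaneously:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 600

# synthetic behavioral attributes: page views, session minutes, recency (days)
X = np.column_stack([
    rng.poisson(8, n),
    rng.gamma(2.0, 3.0, n),
    rng.integers(0, 90, n),
]).astype(float)

# standardize first: K-means uses Euclidean distance, so attribute scale matters
Xz = StandardScaler().fit_transform(X)

# no dependent variable -- all attributes are considered simultaneously
km = KMeans(n_clusters=4, n_init=10, random_state=1).fit(Xz)

# profile each cluster by its mean attribute values on the original scale
for k in range(4):
    mask = km.labels_ == k
    print(f"cluster {k}: n={mask.sum()}, means={X[mask].mean(axis=0).round(1)}")
```

Profiling the clusters on the original (unstandardized) scale, as in the last loop, is what turns the statistical output into segments a marketer can name and act on.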
Naturally, the question is which technique is better to use in practice – supervised or unsupervised approaches for segmentation? In my opinion, the answer is both (assuming you have access to data that can serve as the dependent or target variable). You can use an unsupervised technique to find natural segments in your marketable universe, and then use a supervised technique (or more than one, via champion-challenger applications) to build a unique model for how to treat each cluster segment based on goals defined by internal business stakeholders.
Now, let me pose a question I have been receiving more frequently from clients over the past couple of years.
"Our desired segmentation strategies are outpacing our ability to build supporting analytic models. How can we overcome this?"
Does this sound familiar? For many of my clients, this is a painful reality limiting their potential. That's why I'm personally excited about new SAS technology to address this challenge. SAS Factory Miner allows marketers to dream bigger when it comes to analytical segmentation. It fosters an interactive, approachable environment that supports working relationships between strategic visionaries and analysts/data scientists. The benefit for the marketing campaign manager is the ability to expand segmentation strategies from 5 or 10 segments to hundreds or thousands, while remaining actionable within the demands of today's modern marketing ecosystem. The advantage for the supporting analyst team is greater efficiency and the ability to exploit modern analytical methods and processing power without the need for incremental resources.
Here is a video demonstration of SAS Factory Miner for supersizing your data-driven segmentation capabilities:
I'll end this post by revisiting a question we shared at the beginning:
Is your segmentation approach antiquated or advanced?
Dream bigger, my friends. The possibilities are inspiring!
If you enjoyed this article, be sure to check out my other work here. Lastly, if you would like to connect on social media, link with me on Twitter or LinkedIn.
Rounding off our reports on major new developments in SAS 9.3, today we'll talk about proc mcmc and the random statement.
Stand-alone packages for fitting very general Bayesian models using Markov chain Monte Carlo (MCMC) methods have been available for quite some time now. The best known of these are BUGS and its derivatives WinBUGS (last updated in 2007) and OpenBUGS. There are also packages that call these tools from R.
Today we'll consider a relatively simple model: clustered Poisson data where the cluster means are a constant plus a cluster-specific exponentially-distributed random effect. To be clear:

y_ij ~ Poisson(mu_i)
log(mu_i) = B_0 + r_i
r_i ~ Exponential(lambda)

Of course in Bayesian thinking all effects are random-- here we use the term in the sense of cluster-specific effects.
SAS

Several SAS procedures have a bayes statement that allows some specific models to be fit. For example, in section 6.6 and example 8.17, we show Bayesian Poisson and logistic regression, respectively, using proc genmod. But our example today is a little unusual, and we could not find a canned procedure for it. For these more general problems, SAS has proc mcmc, which in SAS 9.3 allows random effects to be easily modeled.
We begin by generating the data, and fitting the naive (unclustered) model. We set B_0 = 1 and lambda = 0.4. There are 200 clusters of 10 observations each, which we might imagine represent 10 students from each of 200 classrooms.
data test2;
  truebeta0 = 1;
  randscale = .4;
  call streaminit(1944);
  do i = 1 to 200;
    randint = rand("EXPONENTIAL") * randscale;
    do ni = 1 to 10;
      mu = exp(truebeta0 + randint);
      y = rand("POISSON", mu);
      output;
    end;
  end;
run;
proc genmod data = test2;
  model y = / dist=poisson;
run;
                          Standard     Wald 95%
Parameter    Estimate     Error        Confidence Limits
Intercept    1.4983       0.0106       1.4776    1.5190
Note the inelegant SAS syntax for fitting an intercept-only model. The result is pretty awful-- 50% bias with respect to the global mean. Perhaps we'll do better by acknowledging the clustering. We might try that with normally distributed random effects in proc glimmix.
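As an aside, the size of the naive bias is exactly what theory predicts: with r ~ Exponential(scale 0.4), E[exp(r)] = 1/(1 - 0.4), so the marginal mean of y is exp(1)/0.6 and the intercept-only fit converges to 1 + log(1/0.6), about 1.51. Here is a quick check in Python (a sketch outside SAS, using more clusters to tighten the estimate):

```python
import numpy as np

rng = np.random.default_rng(1944)
b0, scale, n_clusters, n_per = 1.0, 0.4, 5000, 10

# simulate the same clustered Poisson process, with many clusters
r = rng.exponential(scale, n_clusters)      # cluster-specific effects
mu = np.exp(b0 + r)                         # cluster means
y = rng.poisson(np.repeat(mu, n_per))       # n_per observations per cluster

# the intercept-only Poisson MLE is just the log of the overall mean
naive_intercept = np.log(y.mean())
theory = b0 + np.log(1 / (1 - scale))       # 1 + log(E[exp(r)]) = about 1.51
print(f"naive fit: {naive_intercept:.3f}, theoretical limit: {theory:.3f}")
```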
proc glimmix data = test2 method=laplace;
  class i;
  model y = / dist = poisson solution;
  random int / subject = i type = un;
run;
Cov                               Standard
Parm      Subject    Estimate     Error
UN(1,1)   i          0.1682       0.01841
                         Standard
Effect       Estimate    Error       t Value    Pr > |t|
Intercept    1.3805      0.03124     44.20      <.0001
No joy-- still a 40% bias in the estimated mean. And the variance of the random effects is biased by more than 50%! Let's try fitting the model that generated the data.
proc mcmc data = test2 nmc = 100000 thin = 100 seed = 1944;
  parms fixedint 1 gscale 0.5;
  /* vague priors on the shared intercept and the exponential scale */
  prior fixedint ~ normal(0, var = 10000);
  prior gscale ~ uniform(0, 100);
  random rint ~ gamma(shape=1, scale=gscale) subject = i initial=0.0001;
  mu = exp(fixedint + rint);
  model y ~ poisson(mu);
run;
The key points of the proc mcmc statement are nmc, the total number of Monte Carlo iterations to perform, and thin, which retains only every nth sample for inference. The prior and model statements are fairly obvious; we note that in more complex models, parameters that are listed within a single prior statement are sampled as a block. We're placing priors on the fixed (shared) intercept and on the scale of the exponential. The mu line is actually just a programming statement-- it uses the same syntax as data step programming. The newly available statement is random. Its syntax is similar to that of the other priors, with the addition of the subject option, which generates a unique parameter for each level of the subject variable. The random effects themselves can be used in later statements, as shown, to enter into data distributions. A final note: the exponential distribution isn't explicitly available, but since a gamma distribution with shape fixed at 1 is an exponential, this is not a problem. Here are the key results.
                           Standard
Parameter    N      Mean       Deviation
fixedint     1000   1.0346     0.0244
gscale       1000   0.3541     0.0314
The 95% HPD regions include the true values of the parameters and the posterior means are much less biased than in the model assuming normal random effects.
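For readers without SAS, the same random-effects model can also be fit with a hand-rolled sampler. The Python sketch below (NumPy only; every name and tuning choice here is mine, not drawn from proc mcmc) simulates the same data and runs a Metropolis-within-Gibbs scheme: vectorized random-walk updates for the cluster effects, a random-walk update for the shared intercept, a "ridge" move that trades the intercept against all of the random effects at once, and a conjugate gamma update for the exponential rate.

```python
import numpy as np

rng = np.random.default_rng(1944)

# --- simulate clustered Poisson data (mirroring the SAS data step) ---
n_clusters, n_per = 200, 10
true_b0, true_scale = 1.0, 0.4
r_true = rng.exponential(true_scale, n_clusters)
y = rng.poisson(np.repeat(np.exp(true_b0 + r_true), n_per)).reshape(n_clusters, n_per)
S = y.sum(axis=1)                           # per-cluster sufficient statistics

def loglik_b0(b0, r):
    # Poisson log likelihood as a function of the shared intercept
    return b0 * S.sum() - n_per * np.exp(b0) * np.exp(r).sum()

n_iter, burn = 8000, 4000
b0, scale = 0.0, 1.0
r = np.full(n_clusters, 0.5)
draws = np.empty((n_iter, 2))

for t in range(n_iter):
    # 1. random-walk Metropolis on each cluster effect r_i (vectorized)
    rp = r + rng.normal(0, 0.15, n_clusters)
    logratio = (S * (rp - r)
                - n_per * np.exp(b0) * (np.exp(rp) - np.exp(r))
                - (rp - r) / scale)         # exponential prior term
    accept = (rp >= 0) & (np.log(rng.uniform(size=n_clusters)) < logratio)
    r = np.where(accept, rp, r)

    # 2. random-walk Metropolis on the shared intercept (flat prior)
    b0p = b0 + rng.normal(0, 0.05)
    if np.log(rng.uniform()) < loglik_b0(b0p, r) - loglik_b0(b0, r):
        b0 = b0p

    # 3. ridge move: shift b0 up and all r_i down by d (likelihood unchanged,
    #    so only the exponential prior on r enters the acceptance ratio)
    d = rng.normal(0, 0.05)
    if d <= r.min() and np.log(rng.uniform()) < n_clusters * d / scale:
        b0, r = b0 + d, r - d

    # 4. conjugate gamma update for the exponential rate (vague Gamma prior)
    rate = rng.gamma(0.01 + n_clusters, 1.0 / (0.01 + r.sum()))
    scale = 1.0 / rate

    draws[t] = b0, scale

post = draws[burn:]
print("posterior means (b0, scale):", post.mean(axis=0).round(3))
```

This is only an illustration of the sampling ideas, not a reproduction of proc mcmc's actual algorithms, and it needs the same convergence diagnostics discussed above.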
As usual, MCMC models should be evaluated carefully for convergence and coverage. In this example, I have some concerns (see default diagnostic figure above) and if it were real data I would want to do more.
R

The CRAN task view on Bayesian Inference includes a summary of general and model-specific MCMC tools. However, there is nothing like proc mcmc in terms of a general, easy-to-use tool that is native to R. The nearest options are to use R front ends to WinBUGS/OpenBUGS (R2WinBUGS) or JAGS (rjags). (A brief worked example of using rjags was posted last year by John Myles White.) Alternatively, with some math and a little sweat, the mcmc package would also work. We'll explore an approach through one or more of these packages in a later entry, and would welcome a collaboration from anyone who would like to take that on.