clustering

Apr 27, 2018
 

Analyzing ticket sales and customer data for large sports and entertainment events is a complex endeavor. But SAS Visual Analytics makes it easy, with location analytics, customer segmentation, predictive artificial intelligence (AI) capabilities – and more. This blog post covers a brief overview of these features by using a fictitious event company [...]

Analyze ticket sales using location analytics and customer segmentation in SAS Visual Analytics was published on SAS Voices by Falko Schulz

Sep 9, 2016
 

From time to time, the addition of new features requires a review of how capabilities are organized and presented in JMP. Are they located where it makes the most sense and where users would expect to find them? For example, in JMP 12 there was enough new material combined with […]

The post JMP 13 Preview: Improvements to the Analyze menu for a better user experience appeared first on JMP Blog.

Jun 8, 2016
 

In a previous post, I wrote how pedigree might be used to help predict outcomes of horse races. In particular, I discussed a metric called the Dosage Index (DI), which appeared to be a leading indicator of success (at least historically). In this post, I want to introduce the Center […]

The post What does a winning thoroughbred horse look like? appeared first on JMP Blog.

Jan 19, 2016
 

Marketers have used segmentation to target customers with communications, products, and services since the introduction of customer relationship management (CRM) and database marketing. Segmentation draws on a variety of inputs, including consumer demographics, geography, behavior, psychographics, life events and cultural background. Over time, segmentation has proven its value, and brands continue to use this strategy across every stage of the customer journey:

  • Acquisition
  • Upsell/cross-sell
  • Retention
  • Winback

Let's provide a proper definition for this marketing technique. As my SAS peer and friend Randy Collica stated in his influential book on this subject:

"Segmentation is in essence the process by which items or subjects are categorized or classified into groups that share similar characteristics. These techniques can be beneficial in classifying customer groups. Typical marketing activities seek to improve their relationships with prospective and current customers. The better you know about your customer's needs, desires, and their purchasing behaviors, the better you can construct marketing programs designed to fit their needs, desires, and behaviors."

Moving beyond the academic interpretation, in today's integrated marketing ecosystem, SAS Global Customer Intelligence director Wilson Raj provides a modern viewpoint:

"In an era of big data, hyperconnected digital customers and hyper-personalization, segmentation is the cornerstone of customer insight and understanding across the modern digital business. The question is: Is your segmentation approach antiquated or advanced?"

This provides a nice transition to the types of segmentation methods I observe clients using. They ultimately boil down to two categories:

  1. Business rules for segmentation (i.e., non-quantitative)
  2. Analytical segmentation (i.e., quantitative)

Let's dive deeper into each of these...

Business Rules For Segmentation

This technique centers on a qualitative (non-quantitative) approach: customer attributes are conceptualized through conversations with business stakeholders and customer focus groups. The resulting information reflects consumers' experiential behavior, and analysts assign subjective segments for targeted campaign treatments. Although directionally useful, in today's era of data-driven marketing this approach will, in my opinion, produce suboptimal results.

Analytical Segmentation

Within this category, there are two approaches marketing analysts can select from:

  1. Supervised (i.e., classification)
  2. Unsupervised (i.e., clustering)

Supervised segmentation refers to a family of pattern-classification approaches. Supporters of this method stress that the actionable deliverable of the analysis is a set of homogeneous segments that can be profiled and used to inform targeting strategies across the customer lifecycle. The term supervised refers to specific data mining (or data science) techniques, such as decision trees, random forests, gradient boosting or neural networks. One key difference is that a supervised analysis requires a dependent (or target) variable, whereas no dependent variable is designated in unsupervised models. The dependent variable is usually a 1-0 (or yes/no) flag-type variable that matches the objective of the segmentation. Examples include the following (a short code sketch follows the list):

  • Product purchase to identify segments with higher probabilities to convert on what you offer.
  • Upsell/cross-sell to identify segments who are likely to deepen their relationship with your brand.
  • Retention to identify segments most likely to unsubscribe, attrite, or defect.
  • Click behavior to identify segments of anonymous web traffic likely to click on your served display media.
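
As a minimal, hypothetical sketch of the supervised flavor, the code below fits a decision tree with PROC HPSPLIT (SAS/STAT). None of these names come from the examples above: the data set customers, the 1-0 target purchase_flag, and the predictors channel, region, recency, frequency and monetary are placeholders you would swap for your own campaign data.

/* Hypothetical sketch: decision tree for supervised segmentation.
   purchase_flag is an assumed 1-0 target; all other names are placeholders. */
proc hpsplit data=customers maxdepth=4;
  class purchase_flag channel region;
  model purchase_flag(event='1') = channel region recency frequency monetary;
  grow entropy;
  prune costcomplexity;
  code file='tree_segments.sas';  /* DATA step scoring code to assign leaves (segments) to new records */
run;

Each terminal leaf can be read as a segment with its own event probability, which is what makes the output usable for targeting rather than only for prediction.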

After applying these techniques, analysts can deliver a visual representation of the segments to help explain the results to nontechnical stakeholders. Here is a video demonstration example of SAS Visual Analytics within the context of supervised segmentation being applied to a brand's digital traffic through the use of analytical decision trees:

 

Critics of this approach argue that the resulting model is actually a predictive model rather than a segmentation model because of the probability prediction output. The distinction lies in the use of the model. Segmentation is classifying customer bases into distinct groups based on multidimensional data and is used to suggest an actionable roadmap to design relevant marketing, product and customer service strategies to drive desired business outcomes.  As long as we stay focused on this premise, there is nothing to debate.

On the other hand, unsupervised approaches such as clustering, association rules (apriori), principal components and factor analysis are multivariate segmentation techniques that group consumers based on similar characteristics. The goal is to explore the data and find its intrinsic structure. K-means cluster analysis is the most popular technique I see clients use for interdependent segmentation, in which all applicable data attributes are considered simultaneously and there is no split into dependent (or target) and independent (or predictor) variables. Here is a video demonstration example of SAS Visual Statistics for unsupervised segmentation being applied to a brand's digital traffic (including inferred attributes sourced from a digital data management platform) through the use of K-means clustering:

 

Keep in mind that unsupervised applications are not given training examples (i.e., there is no 1-0 or yes/no flag variable to bias the formation of the segments). Consequently, it is fair to interpret the results of a K-means clustering analysis as more data-driven, and hence more natural and better suited to the underlying structure of the data. This advantage is also its major drawback: it can be difficult to judge the quality of clustering results in a conclusive way without running live campaigns.
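
For readers who prefer code to point-and-click tools, here is a minimal k-means sketch using PROC STDIZE and PROC FASTCLUS. The data set and variables (customers, recency, frequency, monetary) and the choice of five clusters are assumptions for illustration only.

/* Hypothetical sketch: standardize the inputs, then run k-means with k = 5. */
proc stdize data=customers method=std out=customers_std;
  var recency frequency monetary;
run;

proc fastclus data=customers_std maxclusters=5 maxiter=100 out=clustered;
  var recency frequency monetary;
run;

Standardizing first keeps any single attribute from dominating the Euclidean distances, and the OUT= data set carries a CLUSTER variable holding each customer's segment assignment.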

Naturally, the question is which technique is better to use in practice – supervised or unsupervised approaches for segmentation? In my opinion, the answer is both (assuming you have access to data that can be used as the dependent or target variable). You can use an unsupervised technique to find natural segments in your marketable universe, and then use a supervised technique (or more than one, via champion-challenger testing) to build a model for how to treat each cluster segment based on goals defined by internal business stakeholders, as sketched below.
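
Continuing the hypothetical names from the k-means sketch above, the fragment below fits one supervised model per cluster. Logistic regression simply stands in for whichever supervised technique (tree, gradient boosting, and so on) you would actually champion.

/* Hypothetical sketch: one supervised model per k-means segment. */
proc sort data=clustered;
  by cluster;
run;

proc logistic data=clustered;
  by cluster;  /* separate model for each segment assigned by PROC FASTCLUS */
  model purchase_flag(event='1') = recency frequency monetary;
run;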

Now, let me pose a question I have been receiving more frequently from clients over the past couple of years.

"Our desired segmentation strategies are outpacing our ability to build supporting analytic models. How can we overcome this?"

Does this sound familiar? For many of my clients, this is a painful reality limiting their potential. That's why I'm personally excited about new SAS technology that addresses this challenge. SAS Factory Miner allows marketers to dream bigger when it comes to analytical segmentation. It fosters an interactive, approachable environment that supports working relationships between strategic visionaries and analysts/data scientists. The benefit for the marketing campaign manager is the ability to expand segmentation strategies from 5 or 10 segments to hundreds or thousands, while remaining actionable within the demands of today's modern marketing ecosystem. The advantage for the supporting analyst team is the ability to be more efficient and to exploit modern analytical methods and processing power without needing incremental resources.

Here is a video demonstration example of SAS Factory Miner for supersizing your data-driven segmentation capabilities:

 

I'll end this post by revisiting a question we shared at the beginning:

Is your segmentation approach antiquated or advanced?

Dream bigger, my friends. The possibilities are inspiring!

If you enjoyed this article, be sure to check out my other work here. Lastly, if you would like to connect on social media, link with me on Twitter or LinkedIn.

 

tags: Clustering, CRM, Data Driven Marketing, Data Mining, data science, Decision Trees, marketing analytics, personalization, segmentation

Analytical segmentation for data-driven marketing was published on Customer Analytics.

Nov 22, 2011
 
Yes, it’s a holiday week, which means Thanksgiving-related posts and people telling you what they’re thankful for. You know it, you love it. So here’s my shot: I’m thankful for new editions. That’s right—second editions, third editions—if it’s new and updated, I’m all for it. Go with me on this…. [...]
Oct 4, 2011
 



Rounding off our reports on major new developments in SAS 9.3, today we'll talk about proc mcmc and the random statement.

Stand-alone packages for fitting very general Bayesian models using Markov chain Monte Carlo (MCMC) methods have been available for quite some time now. The best known of these are BUGS and its derivatives WinBUGS (last updated in 2007) and OpenBUGS. There are also some packages available that call these tools from R.

Today we'll consider a relatively simple model: clustered Poisson data in which the log of each cluster mean is a constant plus a cluster-specific, exponentially distributed random effect. To be clear:
y_ij ~ Poisson(mu_i)
log(mu_i) = B_0 + r_i
r_i ~ Exponential(lambda)
Of course in Bayesian thinking all effects are random-- here we use the term in the sense of cluster-specific effects.

SAS
Several SAS procedures have a bayes statement that allows some specific models to be fit. For example, in Section 6.6 and Example 8.17, we show Bayesian Poisson and logistic regression, respectively, using proc genmod. But our example today is a little unusual, and we could not find a canned procedure for it. For these more general problems, SAS has proc mcmc, which in SAS 9.3 allows random effects to be easily modeled.

We begin by generating the data and fitting the naive (unclustered) model. We set B_0 = 1 and lambda = 0.4, where lambda is the scale (mean) of the exponential. There are 200 clusters of 10 observations each, which we might imagine represent 10 students from each of 200 classrooms.

/* simulate 200 clusters of 10 Poisson observations each;
   each cluster's log mean is 1 plus an exponential(scale = 0.4) random effect */
data test2;
  truebeta0 = 1;
  randscale = .4;
  call streaminit(1944);
  do i = 1 to 200;
    randint = rand("EXPONENTIAL") * randscale;
    do ni = 1 to 10;
      mu = exp(truebeta0 + randint);
      y = rand("POISSON", mu);
      output;
    end;
  end;
run;

/* naive model ignoring the clustering: intercept-only Poisson regression */
proc genmod data = test2;
  model y = / dist=poisson;
run;

Parameter    Estimate    Standard Error    Wald 95% Confidence Limits
Intercept    1.4983      0.0106            1.4776    1.5190

Note the inelegant SAS syntax for fitting an intercept-only model. The result is pretty awful-- 50% bias with respect to the global mean. Perhaps we'll do better by acknowledging the clustering. We might try that with normally distributed random effects in proc glimmix.

/* Poisson GLMM with a normally distributed random intercept per cluster */
proc glimmix data = test2 method=laplace;
  class i;
  model y = / dist = poisson solution;
  random int / subject = i type = un;
run;

Cov Parm    Subject    Estimate    Standard Error
UN(1,1)     i          0.1682      0.01841

Effect       Estimate    Standard Error    t Value    Pr > |t|
Intercept    1.3805      0.03124           44.20      <.0001

No joy-- still a 40% bias in the estimated mean. And the variance of the random effects is biased by more than 50%! Let's try fitting the model that generated the data.

/* fit the generating model: cluster-level random effects that are
   exponential (gamma with shape fixed at 1, scale to be estimated) */
proc mcmc data=test2 nmc=10000 thin=10 seed=2011;
  parms fixedint 1 gscale 0.4;

  prior fixedint ~ normal(0, var = 10000);
  prior gscale ~ igamma(.01, scale = .01);

  random rint ~ gamma(shape=1, scale=gscale) subject = i initial=0.0001;
  mu = exp(fixedint + rint);
  model y ~ poisson(mu);
run;

The key points of the proc mcmc statement are nmc, the total number of Monte Carlo iterations to perform, and thin, which includes only every nth sample for inference. The prior and model statements are fairly obvious; we note that in more complex models, parameters that are listed within a single prior statement are sampled as a block. We're placing priors on the fixed (shared) intercept and the scale of the exponential. The mu line is actually just a programming statement-- it uses the same syntax as data step programming.
The newly available statement is random. The syntax here is similar to that of the other priors, with the addition of the subject= option, which generates a unique parameter for each level of the subject variable. The random effects themselves can be used in later statements, as shown, to enter into data distributions. A final note here is that the exponential distribution isn't explicitly available, but since the gamma distribution with shape fixed at 1 is the exponential, this is not a problem. Here are the key results.

Posterior Summaries

Parameter    N       Mean      Standard Deviation
fixedint     1000    1.0346    0.0244
gscale       1000    0.3541    0.0314

Posterior Intervals

Parameter    Alpha    HPD Interval
fixedint     0.050    0.9834    1.0791
gscale       0.050    0.2937    0.4163

The 95% HPD regions include the true values of the parameters and the posterior means are much less biased than in the model assuming normal random effects.

As usual, MCMC models should be evaluated carefully for convergence and coverage. In this example I have some concerns (see the default diagnostic figure above), and if this were real data I would want to do more.
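
As a rough sketch of what "doing more" might look like, the call below reruns the same model with a longer chain (the nmc value is arbitrary) and asks proc mcmc for its standard plots and the full set of convergence diagnostics.

ods graphics on;
/* same model as above, longer run, with trace/autocorrelation/density plots
   and convergence diagnostics requested explicitly */
proc mcmc data=test2 nmc=50000 thin=10 seed=2011
          plots=(trace autocorr density) diagnostics=all;
  parms fixedint 1 gscale 0.4;
  prior fixedint ~ normal(0, var = 10000);
  prior gscale ~ igamma(.01, scale = .01);
  random rint ~ gamma(shape=1, scale=gscale) subject = i initial=0.0001;
  mu = exp(fixedint + rint);
  model y ~ poisson(mu);
run;
ods graphics off;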

R
The CRAN task view on Bayesian Inference includes a summary of general and model-specific MCMC tools. However, there is nothing quite like proc mcmc in terms of a general, easy-to-use tool that is native to R. The nearest options are to use R front ends to WinBUGS/OpenBUGS (R2WinBUGS) or JAGS (rjags). (A brief worked example of using rjags was posted last year by John Myles White.) Alternatively, with some math and a little sweat, the mcmc package would also work. We'll explore an approach through one or more of these packages in a later entry, and would welcome a collaboration from anyone who would like to take that on.