data science

September 26, 2017
 

In Part 1 and Part 2 of this blog posting series, we discussed: Our current viewpoints on marketing attribution and conversion journey analysis in 2017. The selection criteria of the best measurement approach. Introduced our vision on handling marketing attribution and conversion journey analysis. We would like to conclude this [...]

Algorithmic marketing attribution and conversion journey analysis [Part 3] was published on Customer Intelligence Blog.

September 21, 2017
 
A previous entry (http://sas-and-r.blogspot.com/2017/07/options-for-teaching-r-to-beginners.html) describes an approach to teaching graphics in R that also “get[s] students doing powerful things quickly”, as David Robinson suggested.

In this guest blog entry, Randall Pruim offers an alternative way based on a different formula interface. Here's Randall: 

For a number of years I and several of my colleagues have been teaching R to beginners using an approach that includes a combination of
  • the lattice package for graphics,
  • several functions from the stats package for modeling (e.g., lm(), t.test()), and
  • the mosaic package for numerical summaries and for smoothing over edge cases and inconsistencies in the other two components.
Important in this approach is the syntactic similarity that the following “formula template” brings to all of these operations.  

    goal ( y ~ x , data = mydata, ... )


Many data analysis operations can be executed by filling in the four slots of this template (goal, y, x, and mydata) with the appropriate information for the desired task. This allows students to become fluent quickly with a powerful, coherent toolkit for data analysis.
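To make the template concrete, here is a minimal sketch that fills in the four slots for three different goals. The built-in mtcars data set and the chosen variables are assumptions made for illustration; they do not come from the original post.

```r
library(lattice)  # ships with standard R installations

# mtcars is used purely as a stand-in for "mydata" (an assumption).
data(mtcars)

# goal = lm: model mpg (y) as a function of weight (x)
fit <- lm(mpg ~ wt, data = mtcars)

# goal = t.test: compare mean mpg (y) across the two transmission types (x)
tt <- t.test(mpg ~ am, data = mtcars)

# goal = xyplot: a lattice scatterplot built from the same kind of formula
p_scatter <- xyplot(mpg ~ wt, data = mtcars)
```

In each call only the goal and the variable names change; the shape of the command stays the same, which is the point of the template.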

Trouble in paradise
As the earlier post noted, the use of lattice has some drawbacks. While basic graphs like histograms, boxplots, scatterplots, and quantile-quantile plots are simple to make with lattice, it is challenging to combine these simple plots into more complex plots or to plot data from multiple data sources. Splitting data into subgroups and either overlaying with multiple colors or separating into sub-plots (facets) is easy, but the labeling of such plots is less convenient (and takes more space) than in the equivalent plots made with ggplot2. And in our experience, students generally find the look of ggplot2 graphics more appealing.
On the other hand, introducing ggplot2 into a first course is challenging. The syntax tends to be more verbose, so it takes up more of the limited space on projected images and course handouts. More importantly, the syntax is entirely unrelated to the syntax used for other aspects of the course. For those adopting a “Less Volume, More Creativity” approach, ggplot2 is tough to justify.
ggformula: The third-and-a-half way
Danny Kaplan and I recently introduced ggformula, an R package that provides a formula interface to ggplot2 graphics. Our hope is that this provides the best aspects of lattice (the formula interface and lighter syntax) and ggplot2 (modularity, layering, and better visual aesthetics).
For simple plots, the only thing that changes is the name of the plotting function. Each of these functions begins with gf. Here are two examples, either of which could replace the side-by-side boxplots made with lattice in the previous post.
We can even overlay these two types of plots to see how they compare. To do so, we simply place what I call the "then" operator (%>%, also commonly called a pipe) between the two layers and adjust the transparency so we can see both where they overlap.
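The plots described above can be sketched as follows. The mtcars data set and the variable choices are assumptions (the original post's data is not shown here); gf_boxplot() and gf_violin() are ggformula functions.

```r
library(ggformula)

# Stand-in data (an assumption): treat the number of cylinders as a grouping factor.
cars <- transform(mtcars, cyl = factor(cyl))

# Two simple plots: only the function name differs.
p_box    <- gf_boxplot(mpg ~ cyl, data = cars)
p_violin <- gf_violin(mpg ~ cyl, data = cars)

# Overlay the two layers with the "then" operator; alpha adjusts the
# transparency so both layers stay visible where they overlap.
p_both <- gf_boxplot(mpg ~ cyl, data = cars, alpha = 0.3) %>%
  gf_violin(alpha = 0.3, fill = "navy")
```

Printing any of these objects (for example, typing p_both at the console) draws the plot.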

Comparing groups
Groups can be compared either by overlaying multiple groups distinguishable by some attribute (e.g., color)
or by creating multiple plots arranged in a grid rather than overlaying subgroups in the same space. The ggformula package provides two ways to create these facets. The first uses | very much like lattice does. Notice that the gf_lm() layer inherits information from the gf_point() layer in these plots, saving some typing when the information is the same in multiple layers.
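A minimal sketch of both comparisons, again using mtcars as an assumed stand-in data set: the first plot maps color to a grouping variable, the second uses | in the formula to create facets.

```r
library(ggformula)

# Stand-in data (an assumption): cylinders as a grouping factor.
cars <- transform(mtcars, cyl = factor(cyl))

# Overlay groups in one panel, distinguished by color.
# A one-sided formula (~ cyl) maps a variable to the attribute.
p_color <- gf_point(mpg ~ wt, data = cars, color = ~ cyl) %>%
  gf_lm()

# Separate groups into facets with |, much as lattice does.
# gf_lm() reuses the formula and data from the gf_point() layer.
p_facet <- gf_point(mpg ~ wt | cyl, data = cars) %>%
  gf_lm()
```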


The second way adds facets with gf_facet_wrap() or gf_facet_grid() and can be more convenient for complex plots or when customization of facets is desired.
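The second way can be sketched like this, with mtcars as an assumed example data set. Extra arguments such as ncol and labeller are passed through to the underlying ggplot2 faceting functions.

```r
library(ggformula)

# Stand-in data (an assumption): cylinders as a grouping factor.
cars <- transform(mtcars, cyl = factor(cyl))

# Wrap facets by one variable, controlling the number of columns.
p_wrap <- gf_point(mpg ~ wt, data = cars) %>%
  gf_lm() %>%
  gf_facet_wrap(~ cyl, ncol = 3)

# Lay facets out in a grid: rows by transmission (am), columns by cyl.
# label_both shows the variable name alongside each value.
p_grid <- gf_point(mpg ~ wt, data = cars) %>%
  gf_facet_grid(am ~ cyl, labeller = ggplot2::label_both)
```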
Fitting into the tidyverse work flow
ggformula also fits into a tidyverse-style workflow (arguably better than ggplot2 itself does). Data can be piped into the initial call to a ggformula function and there is no need to switch between %>% and + when moving from data transformations to plot operations.
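As a rough illustration of that workflow, the sketch below pipes a data frame through two dplyr steps and straight into ggformula layers. The data set, the filter condition, and the axis labels are assumptions chosen for the example.

```r
library(dplyr)
library(ggformula)

p <- mtcars %>%
  filter(cyl != 6) %>%                  # data transformation steps...
  mutate(cyl = factor(cyl)) %>%
  gf_point(mpg ~ wt, color = ~ cyl) %>% # ...flow directly into plot layers
  gf_lm() %>%
  gf_labs(x = "Weight (1000 lb)", y = "Miles per gallon")
```

The same operator is used from the first line to the last, so students never need to decide where %>% ends and + begins.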
Summary
The “Less Volume, More Creativity” approach is based on a common formula template that has served well for several years, but the arrival of ggformula strengthens this approach by bringing a richer graphical system into reach for beginners without introducing new syntactical structures. The full range of ggplot2 features and customizations remains available, and the ggformula package vignettes and tutorials describe these in more detail.
-- Randall Pruim
September 19, 2017
 

In Part 1 of this blog posting series, we discussed our current viewpoints on marketing attribution and conversion journey analysis in 2017. We concluded on a cliffhanger, and would like to return to our question of which attribution measurement method should we ultimately focus on. As with all difficult questions [...]

Algorithmic marketing attribution and conversion journey analysis [Part 2] was published on Customer Intelligence Blog.

September 14, 2017
 

Editor's note: This blog post was authored by Malcolm Lightbody (SAS Customer Intelligence Product Management) and Suneel Grover (SAS Principal Solutions Architect).

Everyone has a marketing attribution problem, and all attribution measurement methods are wrong. We hear that all the time. Like many urban myths, it is founded in truth. Most organizations believe they can do better on attribution. They all understand that there are gaps, for example, missing touchpoint data, multiple identities across devices, arbitrary decisions on weightings for rules, and uncertainty about what actions arise from the results.

Broadly speaking, the holy grail of media measurement is to analyze the impact and business value of all company-generated marketing interactions across the complex customer journey. In this post, our goal is to take a transparent approach in discussing how SAS is building data-driven marketing technology to help customers progress beyond typical attribution methods to make the business case for customer journey optimization.

Being SAS, we advocate an analytic approach to addressing the operational and process-related obstacles that we commonly hear from customers. We want to treat them as two sides of the same coin. The output of attribution analytics informs marketers about which touch points and sequences of activities drive conversions. This leads marketers to make strategic decisions about future investment levels, as well as more tactical decisions about what activities to run. In an ideal world, the results of subsequent actions are fed back into the attribution model to increase not only its explanatory power, but also its predictive abilities, as shown below:

The diagram above shows the main parts of an attribution project. The actual analysis is just part of the process, with upstream and downstream dependencies. But this doesn’t always happen as it should. Consider a standard attribution report. Let us for the moment ignore what technique was used to generate the result and place ourselves in the shoes of the marketer trying to figure out what to do next.

In the graph above, we see the results of an attribution analysis based on a variety of measurement methods. Before answering the question of which method we should focus on, let's do a quick review of rules-based and algorithmic measurement techniques.

Last-touch and first-touch attribution

This type of attribution allocates 100 percent of the credit to either the last or first touch of the customer journey. This approach has genuine weaknesses, and ignores all other interactions with your brand across a multi-touch journey.

Linear attribution


Linear attribution arbitrarily allocates an equal credit weight to every interaction along the customer journey. Although slightly better than the last- and first-touch approaches, linear attribution will undercredit and overcredit specific interactions.

Time-decay and position-based attribution

Time-decay attribution arbitrarily biases the channel weighting based on the recency of the channel touches across the customer journey. If you support the concept of recency within RFM analysis, there is some merit to this approach. Position-based attribution places more weight on the first and last touches, while providing less value to the interactions in between.
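The rules-based methods described so far can be sketched as a single weighting function. This is an illustration only, not how any SAS product computes attribution: the function name, the half-life parameter (decay by touch order rather than by timestamp), and the 40/20/40 position-based split are all assumptions.

```r
# Returns the share of conversion credit for each touch in a journey of
# n_touches interactions, ordered from first to last.
attribution_weights <- function(n_touches,
                                method = c("last_touch", "first_touch", "linear",
                                           "time_decay", "position_based"),
                                half_life = 2,    # assumed: credit halves every 2 touches back
                                end_share = 0.4)  # assumed: 40% each to first and last touch
{
  method <- match.arg(method)
  stopifnot(n_touches >= 1)
  position <- seq_len(n_touches)
  raw <- switch(method,
    last_touch  = as.numeric(position == n_touches),
    first_touch = as.numeric(position == 1),
    linear      = rep(1, n_touches),
    time_decay  = 0.5^((n_touches - position) / half_life),
    position_based = if (n_touches <= 2) {
      rep(1, n_touches)
    } else {
      c(end_share,
        rep((1 - 2 * end_share) / (n_touches - 2), n_touches - 2),
        end_share)
    }
  )
  raw / sum(raw)  # normalize so the credit sums to 1
}

# A hypothetical four-touch journey
journey <- c("display", "email", "search", "direct")
credit  <- setNames(attribution_weights(length(journey), "position_based"), journey)
```

Every rule here is fixed in advance by the analyst, which is exactly the arbitrariness the text criticizes: none of the weights are learned from converting and non-converting paths.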

Algorithmic attribution

In contrast, algorithmic attribution (sometimes referred to as custom models) assigns data-driven conversion credit across all touch points preceding the conversion, and uses math typically associated with predictive analytics or machine learning  to identify where credit is due. It analyzes both converting and non-converting consumer paths across all channels. Most importantly, it uses data to uncover the correlations and success factors within marketing efforts. Here is a video summarizing a customer case study example to help demystify what we mean.

Why doesn’t everyone use algorithmic attribution?

Although many marketers recognize the value and importance of algorithmic attribution, adopting it hasn’t been easy. There are several reasons:

  • Much-needed modernization. The volume of data that you can collect is massive and may overwhelm outdated data management and analytical platforms, especially when you need to integrate multiple data sources. Organizations have a decision to make regarding modernization.
  • Scarcity of expertise. Some believe the talent required to unlock the marketing value in data is scarce. However, there are more than 150 universities offering business analytics and data science programs. Talent is flooding into industry. The synergy between analysts and strategically minded marketers is the key to unlocking this door.
  • Effective use of data. Organizations are rethinking how they collect, analyze and act on important data sources. Are you using all your crucial marketing data? How do you merge website and mobile app visitor data with email and display campaign data? If you accomplish all of this, how do you take prescriptive action between data, analytics and your media delivery end points?
  • Getting business buy-in. Algorithmic attribution is often perceived as a black box, which vested interest groups can use as a reason to maintain the status quo.

Returning to our question of which method we should ultimately focus on, the answer is: it depends. An attribution report on its own cannot decide this, and it doesn’t even matter if the attribution report is generated using the most sophisticated algorithmic techniques. There are four things that the report won't tell you:

  1. The elasticities of a single touch point.
  2. The interdependencies between different touch points.
  3. Cause and effect and timing dependencies.
  4. Differences between different groups of customers.

In Part 2 of this blog posting series, we will dive into specific detail within these areas, as well as introduce our vision within SAS Customer Intelligence 360 on handling algorithmic marketing attribution and conversion journey analysis.

Algorithmic marketing attribution and conversion journey analysis [Part 1] was published on Customer Intelligence Blog.

August 11, 2017
 

How can you tell if your marketing is working? How can you determine the cost and return of your campaigns? How can you decide what to do next? An effective way to answer these questions is to monitor a set of key performance indicators, or KPIs.

KPIs are the basic statistics that give you a clear idea of how your website (or app) is performing. KPIs vary by predetermined business objectives, and measure progress towards those specific objectives. In the famous words of Avinash Kaushik, KPIs should be:

  • Uncomplex.
  • Relevant.
  • Timely.
  • Instantly useful.

An example that fits this description, with applicability to profit, nonprofit, and e-commerce business models, would be the almighty conversion rate.  In digital analytics this metric is interpreted as the proportion of visitors to a website or app who take action to go beyond a casual content view or site visit, as a result of subtle or direct requests from marketers, advertisers, and content creators.

$$\text{Conversion rate} = \frac{\text{Number of goal achievements}}{\text{Visitors}}$$

Although successful conversions can be defined differently based on your use case, it is easy to see why this KPI is uncomplex, relevant, timely, and useful. We can even splinter this metric into two types:

Macro conversion – Someone completes an action that is important to your business (like making you some money).

Micro conversion – An indicator that a visitor is moving towards a macro conversion (like progressing through a multi-step sales funnel to eventually make you some money)
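As a small worked example of the formula, the sketch below computes a macro and a micro conversion rate. All counts are made up for illustration, and the function name is hypothetical.

```r
# Conversion rate = number of goal achievements / visitors
conversion_rate <- function(goal_achievements, visitors) {
  stopifnot(visitors > 0, goal_achievements >= 0)
  goal_achievements / visitors
}

visitors     <- 20000  # hypothetical monthly visitors
purchases    <- 400    # hypothetical macro conversions (completed orders)
cart_adds    <- 1500   # hypothetical micro conversions (items added to cart)

macro_rate <- conversion_rate(purchases, visitors)  # 400 / 20000
micro_rate <- conversion_rate(cart_adds, visitors)  # 1500 / 20000
```

With these invented numbers, 2 percent of visitors purchase and 7.5 percent add an item to the cart; the gap between the two rates is where a funnel analysis would start.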

Regardless of the conversion type, I have always found that reporting on this KPI is a popular request for analysts from middle management and executives. However, it isn't difficult to anticipate what is coming next from the most important person in your business world:

"How can we improve our conversion rate going forward?"

You can report, slice, dice, and segment away in your web analytics platform, but needles in haystacks are not easily discovered unless we adapt. I know change can be difficult, but allow me to make the case for machine learning and hyperparameters within the discipline of digital analytics. A trendy subject for some, a scary subject for others, but my intent is to lend a practitioner's viewpoint. Analytical decision trees are an excellent way to begin because of their frequent usage within marketing applications, primarily due to their approachability, and ease of interpretation.

Whether your use case is supervised segmentation or propensity scoring, this form of predictive analytics can be labeled as machine learning due to the algorithm's approach to analyzing data. Have you ever researched how trees actually learn before arriving at a final result? It's beautiful math. However, it doesn't end there. We are living in a moment where more sophisticated machine learning algorithms have emerged that can comparatively increase predictive accuracy, precision, and, most importantly, marketing-centric KPIs, while being just as easy to construct.

Using the same data inputs across different analysis types like Forests, Gradient Boosting, and Neural Networks, analysts can compare model fit statistics to determine which approach will have the most meaningful impact on your organization's objectives. Terms like cumulative lift or misclassification may not mean much to you, but they are the keys to selecting the math that best answers how conversion rate can be improved by transparently disclosing accurate views of variable importance.

So is that it? Can I just drag and drop my way through the world of visual analytics to optimize against KPIs? Well, there is a tradeoff to discuss here. For some organizations, simply using a machine learning algorithm enabled by an easy-to-use software interface will help improve conversion rate tactics on a mobile app screen experience as compared to not using an analytic method. But an algorithm cannot be expected to perform well as a one-size-fits-all approach for every type of business problem. It is reasonable to ask whether opportunity is being left on the table, and that question should motivate analysts to refine the math to the use case. Learning to improve how an algorithm arrives at a final result should not be scary just because it can get a little technical. It's actually quite the opposite, and I love learning how machine learning can be elegant. This is why I want to talk about hyperparameters!

Anyone who has ever built a predictive model understands the iterative nature of adjusting various property settings of an algorithm in an effort to optimize the analysis results. As we endlessly try to improve the predictive accuracy, the process becomes painfully repetitive and manual. Because an analyst can spend hours, days, or longer on this task alone, the manual approach rarely arrives at an optimized final solution. Hyperparameter tuning, sometimes referred to as autotuning, addresses this issue by exploring different combinations of algorithm options and training a model for each combination in an effort to find the best model. Imagine running thousands of iterations of a website conversion propensity model across different property threshold ranges in a single execution. As a result, these models can improve significantly across important fit statistics that relate directly to your KPIs.
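The idea can be sketched with a plain grid search over decision tree settings using the rpart package. This is a generic illustration, not the search strategy of SAS or any other product; the simulated visitor data, the variable names, and the grid values are all assumptions.

```r
library(rpart)  # ships with standard R installations

set.seed(42)

# Simulated website visits (entirely synthetic, for illustration only)
n <- 2000
visits <- data.frame(
  pages_viewed    = rpois(n, 4),
  minutes_on_site = rexp(n, 1 / 5),
  returning       = rbinom(n, 1, 0.4)
)
p_convert <- plogis(-3 + 0.3 * visits$pages_viewed +
                      0.1 * visits$minutes_on_site + 0.8 * visits$returning)
visits$converted <- factor(rbinom(n, 1, p_convert),
                           levels = c(0, 1), labels = c("no", "yes"))

# Hold out 30% of the rows to score each candidate model
train_idx <- sample(n, 0.7 * n)
train <- visits[train_idx, ]
valid <- visits[-train_idx, ]

# Every combination of these settings is one candidate "recipe"
grid <- expand.grid(cp = c(0.0005, 0.005, 0.05),
                    maxdepth = c(2, 4, 8),
                    minsplit = c(10, 50))

# Train one tree per combination and record its validation misclassification rate
grid$misclassification <- vapply(seq_len(nrow(grid)), function(i) {
  fit <- rpart(converted ~ ., data = train, method = "class",
               control = rpart.control(cp = grid$cp[i],
                                       maxdepth = grid$maxdepth[i],
                                       minsplit = grid$minsplit[i]))
  pred <- predict(fit, newdata = valid, type = "class")
  mean(pred != valid$converted)
}, numeric(1))

# The best recipe is the combination with the lowest validation error
best <- grid[which.min(grid$misclassification), ]
```

An exhaustive grid grows quickly as options are added, which is why automated tuning tools typically search the space more selectively and run candidates in parallel.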

At the end of running an analysis with hyperparameter tuning, the best recipe will be identified. Just like any other modeling project, you can then act on the insight, from traditional model score code to next-best-action recommendations infused into your mobile app's personalization technology. That's genuinely exciting, courtesy of recent innovations in distributed analytical engines with feature-rich building blocks for machine-learning activities.

If the subject of hyperparameters is new to you, I encourage you to watch this short video.

This will be one of the main themes of my presentations at Analytics Experience 2017 in Washington DC. Using digital data collected by SAS Customer Intelligence 360 and analyzing it with SAS Visual Data Mining & Machine Learning on SAS Viya, I want to share the excitement I am feeling about digital intelligence and predictive personalization. I hope you'll consider joining the SAS family for an awesome agenda from September 18 to 20 in our nation's capital.

Hyperparameters, digital analytics, and key performance indicators was published on Customer Intelligence Blog.

January 28, 2017
 

Digital intelligence is a trending term in the space of digital marketing analytics that needs to be demystified. Let's begin by defining what a digital marketing analytics platform is:

Digital marketing analytics platforms are technology applications used by customer intelligence ninjas to understand and improve consumer experiences. Prospecting, acquiring, and holding on to digital-savvy customers depends on understanding their multidevice behavior, and derived insight fuels marketing optimization strategies. These platforms come in different flavors, from stand-alone niche offerings, to comprehensive end-to-end vehicles performing functions from data collection through analysis and visualization.

However, not every platform is built equally from an analytical perspective. According to Brian Hopkins, a Forrester analyst, firms that excel at using data and analytics to optimize their digital businesses will together generate $1.2 trillion per annum in revenue by 2020. And digital intelligence — the practice of continuously optimizing customer experiences with online and offline data, advanced analytics and prescriptive insights — supports every insights-driven business. Digital intelligence is the antidote to the weaknesses of analytically immature platforms, leaving the world of siloed reporting behind and maturing towards actionable, predictive marketing. Here are a couple of items to consider:

  • Today's device-crazed consumers flirt with brands across a variety of interactions during a customer life cycle. However, most organizations seem to focus on website activity in one bucket, mobile in another, and social in . . . you see where I'm going. Strategic plans often fall short in applying digital intelligence across all channels — including offline interactions like customer support or product development.
  • Powerful digital intelligence uses timely delivery of prescriptive insights to positively influence customer experiences. This requires integration of data, analytics and the systems that interact with the consumer. Yet many teams manually apply analytics and deliver analysis via endless reports and dashboards that look retroactively at past behavior — begging business leaders to question the true value and potential impact of digital analysis.

As consumer behavioral needs and preferences shift over time, the proportion of digital to non-digital interactions is growing. With the recent release of Customer Intelligence 360, SAS has carefully considered feedback from our customers (and industry analysts) to create technology that supports a modern digital intelligence strategy in guiding an organization to:

  • Enrich your first-party customer data with user level data from web and mobile channels. It's time to graduate from aggregating data for reporting purposes to the collection and retention of granular, customer-level data. It is individual-level data that drives advanced segmentation and continuous optimization of customer interactions through personalization, targeting and recommendations.
  • Keep up with customers through machine learning, data science and advanced analytics. The increasing pace of digital customer interactions requires analytical maturity to optimize marketing and experiences. By enriching first-party customer data with infusions of web and mobile behavior, and more importantly, in the analysis-ready format for sophisticated analytics, 360 Discover invites analysts to use their favorite analytic tool and tear down the limitations of traditional web analytics.
  • Automate targeting, channel orchestration and personalization. Brands struggle with too few resources to support the manual design and data-driven design of customer experiences. Connecting first-party data that encompasses both offline and online attributes with actionable propensity scores and algorithmically-defined segments through digital channel interactions is the agenda. If that sounds mythical, check out a video example of how SAS brings this to life.

The question now is: are you ready? Learn more here about why we are so excited about enabling digital intelligence for our customers, and how this benefits testing, targeting, and optimization of customer experiences.

 

tags: Customer Engagement, customer intelligence, Customer Intelligence 360, customer journey, data science, Digital Intelligence, machine learning, marketing analytics, personalization, predictive analytics, Predictive Personalization, Prescriptive Analytics

Digital intelligence for optimizing customer engagement was published on Customer Intelligence.

December 6, 2016
 

As data-driven marketers, you are now challenged by senior leaders to have a laser focus on the customer journey and optimize the path of consumer interactions with your brand. Within that journey there are three trends (or challenges) to focus on:

  • Deeply understanding your target audience to anticipate their needs and desires.
  • Meeting customers’ expectations (although aiming higher can help differentiate your brand from the pack).
  • Addressing their pain points to increase your brand's relevance.

No matter who you chat with, or what marketing conference you recently attended, it's safe to say that the intersection of digital marketing, analytics, optimization and personalization is a popular subject of conversation. Let's review the popular buzzwords at the moment:

  • Predictive personalization
  • Data science
  • Machine learning
  • Self-learning algorithms
  • Segment of one
  • Contextual awareness
  • Real time
  • Automation
  • Artificial intelligence

It's quite possible you have encountered these words at such a high frequency, you could make a drinking game out of it.

There’s a lot of confusion created by these terms and what they mean. For instance, there is hubbub around so-called ‘easy button’ solutions that marketing cloud companies are selling for customer analytics and data-driven personalization. In reaction to this, I set off on a personal quest to research questions like:

  1. Does every technology perform analytics and personalization equally?
    • What are the benefits and drawbacks to analytic automation?
    • What are the downstream impacts to the predictive recommendations marketers depend on for personalized interactions across channels?
    • Should I be comfortable trusting a black-box algorithm and how it impacts the facilitated experiences my brand delivers to customers and prospects?
  2. Do you need a data scientist to be successful in modern marketing?
    • Is high quality analytic talent extremely difficult to find?
    • How valid is the complaint of a data science talent shortage?
    • How do I balance the needs of my marketing organization with recent analytic technology trends?

Have I captivated your interest? If yes, check out this on-demand webcast.

It's time to dive in deep and unleash on these questions. During the video, I share the results of my investigation into these questions, and reactive viewpoints. In addition, you will be introduced to new SAS Customer Intelligence 360 technology addressing these challenges. I believe in a future where approachable technology and analytically-curious people come together to deliver intelligent customer interactions. Analytically curious people can be data scientists, citizen data scientists, statisticians, marketing analysts, digital marketers, creative super forces and more. Building teams of these individuals armed with modern customer analytics software tools will help you differentiate and compete in today's marketing ecosystem.

 

tags: artificial intelligence, Context-aware, customer intelligence, customer journey, Data Driven Marketing, data science, digital marketing, Digital Personalization, machine learning, marketing analytics, Predictive Personalization, Real time Automation, segment of one, Self-learning algorithms

Customer analytics: Think outside the black box was published on Customer Intelligence.

November 28, 2016
 

One aspect of high-quality information is consistency. We often think about consistency in terms of consistent values. A large portion of the effort expended on “data quality dimensions” essentially focuses on data value consistency. For example, when we describe accuracy, what we often mean is consistency with a defined source […]

The post Harmonizing semantics for consistency in interpreting analytical results appeared first on The Data Roundtable.

September 21, 2016
 

We have just launched a free eBook containing a carefully selected collection of chapters from SAS Press books introducing the field of data science. Data science may be a difficult term to define, but data scientists are definitely in great demand! Wayne Thompson, Senior Product Manager at SAS, defines data […]

The post Free eBook: Discovering data science with SAS appeared first on SAS Learning Post.