
January 16, 2019

If you've ever wanted to apply modern machine learning techniques for text analysis, but didn't have enough labeled training data, you're not alone. This is a common scenario in domains that use specialized terminology, or for use cases where customized entities of interest won't be well detected by standard, off-the-shelf entity models.

For example, manufacturers often analyze engineer, technician, or consumer comments to identify the name of specific components which have failed, along with the associated cause of failure or symptoms exhibited. These specialized terms and contextual phrases are highly unlikely to be tagged in a useful way by a pre-trained, all-purpose entity model. The same is true for any types of texts which contain diverse mentions of chemical compounds, medical conditions, regulatory statutes, lab results, suspicious groups, legal jargon…the list goes on.

For many real-world applications, users find themselves at an impasse: it is simply impractical for experts to manually label hundreds of thousands of documents. This post discusses an analytical approach to Named Entity Recognition (NER) that uses rules-based text models to efficiently generate large amounts of training data suitable for supervised learning methods.

Putting NER to work

In this example, we used documents produced by the United States Department of State (DOS) on the subject of assessing and preventing human trafficking. Each year, the DOS releases publicly-facing Trafficking in Persons (TIP) reports for more than 200 countries, each containing a wealth of information expressed through freeform text. The simple question we pursued for this project was: who are the vulnerable groups most likely to be victimized by trafficking?

Sample answers include "Argentine women and girls," "Ghanaian children," "Dominican citizens," "Afghan and Pakistani men," "Chinese migrant workers," and so forth. Although these entities follow a predictable pattern (nationality + group), note that the context must also be that of a victimized population. For example, “French citizens” in a sentence such as "French citizens are working to combat the threats of human trafficking" are not a valid match to our "Targeted Groups" entity.

For more contextually-complex entities, or fluid entities such as People or Organizations where every possible instance is unknown, the value that machine learning provides is that the algorithm can learn the pattern of a valid match without the programmer having to anticipate and explicitly state every possible variation. In short, we expect the machine to increase our recall, while maintaining a reasonable level of precision.

For this case study, here is the method we used:

1. Using SAS Visual Text Analytics, create a rules-based, contextual extraction model on a sample of data to detect and extract the "Targeted Groups" custom entity. Next, apply this rules-based model to a much larger number of observations, which will form our training corpus for a machine learning algorithm. In this case, we used Conditional Random Fields (CRF), a sequence modeling algorithm also included with SAS Visual Text Analytics.
2. Re-format the training data to reflect the JSON input structure needed for CRF, where each token in the sentence is assigned a corresponding target label and part of speech.
3. Train the CRF model to detect our custom entity and predict the correct boundaries for each match.
4. Manually annotate a set of documents to use as a holdout sample for validation purposes. For each document, our manual label captures the matched text of the Targeted Groups entity as well as the start and end offsets where that string occurs within the larger body of text.
5. Score the validation “gold” dataset, assess recall and precision metrics, and inspect differences between the results of the linguistic vs machine learning model.

Let's explore each of these steps in more detail.

1. Create a rules-based, contextual extraction model

In SAS Visual Text Analytics, we created a simple model consisting of a few intermediate, "helper" concepts and the main Targeted Groups concept, which combines these entities to generate our final output.

The Nationalities List and Affected Parties concepts are simple CLASSIFIER lists of nationalities and vulnerable groups that are known a priori. The Targeted Group is a predicate rule which only returns a match if the aforementioned two entities are found in that order, separated by no more than 7 tokens, AND if there is no verb intervening between the two entities (the verb "trafficking" being the only exception). This verb exclusion clause was added to the rule to prevent false matches such as "Turkish Cypriots lacked shelters for victims" and "Bahraini government officials stated that they encouraged victims to participate in the investigation and prosecution of traffickers." We then applied this linguistic model to all the TIP reports leading up to 2017, which would form the basis for our CRF training data.
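For readers who think in code, the predicate logic can be sketched in a few lines of Python. This is a simplified, hypothetical stand-in for the actual Visual Text Analytics concepts: the word lists, the function name, and the whitespace tokenization are all illustrative, and the real model uses proper part-of-speech tagging rather than a hand-written verb list.

```python
# A rough Python re-creation of the Targeted Groups predicate rule:
# nationality + affected group, in order, within 7 tokens, no intervening verb.
NATIONALITIES = {"afghan", "pakistani", "chinese", "ghanaian", "turkish"}
AFFECTED = {"women", "girls", "children", "men", "workers", "citizens", "victims"}
# Naive stand-in verb list; "trafficking" is deliberately absent because it is
# the one verb the rule allows between the two entities.
VERBS = {"lacked", "stated", "encouraged"}

def targeted_group(tokens, max_gap=7):
    """Return (start, end) token indices of the first match, or None."""
    for i, tok in enumerate(tokens):
        if tok.lower() not in NATIONALITIES:
            continue
        # Look ahead for an affected-party term within max_gap tokens.
        for j in range(i + 1, min(i + 1 + max_gap, len(tokens))):
            if tokens[j].lower() in AFFECTED:
                between = {t.lower() for t in tokens[i + 1:j]}
                if between & VERBS:
                    break  # an intervening verb disqualifies the match
                return (i, j)
    return None

print(targeted_group("Afghan and Pakistani men are vulnerable".split()))       # (0, 3)
print(targeted_group("Turkish Cypriots lacked shelters for victims".split()))  # None
```

The second example shows the verb exclusion at work: "Turkish ... victims" would otherwise match, but "lacked" sits between the two entities.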

Nationalities List Helper Concept:

Affected Parties Helper Concept:

Verb Exclusions Helper Concept:

Targeted Group Concept (Final Fact Rule):

2. Re-format the training data

The SAS Visual Text Analytics score code produces a transactional-style output for predicate rules, where each fact argument and the full match are captured in a separate row. Note that a single document may have more than one match, which are then listed according to _result_id_.

Using code, we joined these results back to the original table and the underlying parsing tables to transform the native output you see above into the JSON format required to train a CRF model:

Notice how every single token in each sentence is broken out separately and has both a corresponding label and part of speech. For all the tokens which are not part of our Targeted Groups entity of interest, the label is simply "O", for "Other". But for matches such as "Afghan women and girls," the first token in the match has a label of "B-vic" for "Beginning of the Victim entity," and subsequent tokens in that match are labeled "I-vic" for "Inside the Victim entity."

Note that part of speech tags are not required for CRF, but we have found that including them as an input improves the accuracy of this model type. These three fields are all we will use to train our CRF model.
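The label-assignment logic itself is easy to sketch. This hypothetical Python snippet shows only that step; the real re-formatting also carries each token's part-of-speech tag, and the token-index span would come from the rule-based match offsets:

```python
# Minimal sketch of step 2's labeling: every token gets "O" unless it falls
# inside a rule-based match, which is tagged B-vic / I-vic.
def bio_labels(tokens, match_span=None):
    """match_span is an inclusive (start, end) pair of token indices."""
    start, end = match_span if match_span else (-1, -1)
    labels = []
    for i, tok in enumerate(tokens):
        if i == start:
            labels.append((tok, "B-vic"))
        elif start < i <= end:
            labels.append((tok, "I-vic"))
        else:
            labels.append((tok, "O"))
    return labels

sent = "Traffickers exploit Afghan women and girls in domestic servitude".split()
for tok, lab in bio_labels(sent, match_span=(2, 5)):
    print(f"{tok:12s} {lab}")
```

Here "Afghan" gets B-vic, "women", "and", "girls" get I-vic, and everything else gets O.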

3. Train the CRF model

Because the Conditional Random Fields algorithm predicts a label for every single token, it is often used for base-level Natural Language Processing tasks such as Part of Speech detection. However, we already have part of speech tags, so the task we are giving it in this case is signal detection. Most of the words are "Other," meaning not of interest, and therefore noise. Can the CRF model detect our Targeted Groups entity and assign the correct boundaries for the match using the B-vic and I-vic labels?
After loading the training data to CAS using SAS Studio, we applied the crfTrain action set as follows:

After it runs successfully, we have a number of underlying tables which will be used in the scoring step.

4. Manually annotate a set of documents

For ease of annotation and interpretability, we tokenized and saved the original data by sentence. Using a purpose-built web application which enables a user to highlight entities and save the relevant text string and its offsets to a file, we then hand-scored approximately 2,200 sentences from 2017 TIP documents. Remember, these documents have not yet been "seen" by either the linguistic model or the CRF model. This hand-scored data will serve as our validation dataset.

5. Score the validation “gold” dataset by both models and assess results

Finally, we scored the validation set in SAS Studio with the CRF model, so we could compare human versus machine outcomes.
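Span-level precision and recall for this kind of comparison can be computed along the following lines. This is a generic sketch, not the actual scoring code, and the document ids and character offsets are made up:

```python
# Span-level scoring sketch: a predicted entity counts as a true positive only
# if document id, start offset, and end offset all match a gold annotation.
def span_prf(gold, predicted):
    gold, predicted = set(gold), set(predicted)
    tp = len(gold & predicted)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gold = {("doc1", 10, 32), ("doc1", 88, 104), ("doc2", 5, 27)}
pred = {("doc1", 10, 32), ("doc2", 5, 27), ("doc2", 40, 61)}
print(span_prf(gold, pred))
```

Exact-boundary matching like this is strict; a looser variant could credit partial overlaps, which is worth considering when inspecting boundary disagreements between the two models.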

In a perfect world, we would hope that all the matches found by humans are also found by the model and, moreover, that the model detected even more valid matches than the humans. For example, perhaps we did not include "Rohingyan" or "Tajik" (versus Tajikistani) as nationalities in the CLASSIFIER list in our rules-based model, but the machine learning model nonetheless detected victims from these groups as a valid pattern. This would be a big success, and one of the compelling reasons to use machine learning for NER use cases.

In a future blog, I'll detail the results of the outcomes, including modeling considerations such as:
  o The format of the CRF training template
  o The relative impact of including inputs such as part of speech tags
  o Precision and recall metrics
  o Performance and train times by volumes of training documents

Machine markup provides scale and agility

In summary, although human experts might produce the highest-quality annotations for NER, machine markup can be produced much more cheaply and efficiently -- and, even more importantly, it scales to far greater data volumes in a fraction of the time. Using a rules-based model to generate large amounts of "good enough" labeled data is an excellent way to take advantage of these economies of scale, reduce the cost-barrier to exploring new use cases, and improve your ability to quickly adapt to evolving business objectives.

Reduce the cost-barrier of generating labeled text data for machine learning algorithms was published on SAS Users.

January 10, 2019

Everyone’s excited about artificial intelligence. But most people, in most jobs, struggle to see how AI can be used in their day-to-day work. This post, and others to come, are all about practical AI. We’ll dial the coolness factor down a notch, but we’ll explore some real gains to be made with AI technology in solving business problems in different industries.

This post demonstrates a practical use of AI in banking. We’ll use machine learning, specifically neural networks, to enable on-demand portfolio valuation, stress testing, and risk metrics.


I spend a lot of time talking with bankers about AI. It’s fun, but the conversation inevitably turns to concerns around leveraging AI models, which can have some transparency issues, in a highly-regulated and highly-scrutinized industry. It’s a valid concern. However, there are a lot of ways the technology can be used to help banks, even in regulated areas like risk, without disrupting production models and processes.

Banks often need to compute the value of their portfolios. This could be a trading portfolio or a loan portfolio. They compute the value of the portfolio based on the current market conditions, but also under stressed conditions or under a range of simulated market conditions. These valuations give an indication of the portfolio’s risk and can inform investment decisions. Bankers need to do these valuations quickly on-demand or in real-time so that they have this information at the time they need to make decisions.

However, this isn’t always a fast process. Banks have a lot of instruments (trades, loans) in their portfolios, and the functions used to revalue the instruments under the various market conditions can be complex. To address this, many banks will approximate the true value with a simpler function that runs very quickly. This is often done with first- or second-order Taylor series approximation (also called quadratic approximation or delta-gamma approximation) or via interpolation in a matrix of pre-computed values. Approximation is a great idea, but first- and second-order approximations can be terrible substitutes for the true function, especially under stressed conditions. Interpolation can suffer the same drawback under stress.

An American put option is shown for simplicity. The put option value is non-linear with respect to the underlying asset price. Traditional approximation methods, including this common second-order approximation, can fail to fit well, particularly when we stress asset prices.
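The failure mode is easy to reproduce. The sketch below uses a European put under Black-Scholes as a simpler stand-in for the American put in the figure, with made-up market data: a delta-gamma approximation fitted at the base price tracks the true value well for small moves, but its error grows as the stress gets more severe.

```python
import math

# Illustrative stand-in pricer: a European put under Black-Scholes.
# (A real pricing library would supply the true American put values.)
def bs_put(S, K, r, sigma, T):
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return K * math.exp(-r * T) * N(-d2) - S * N(-d1)

# Hypothetical base-case market data (strike normalized to 1, as in the post).
K, r, sigma, T, S0 = 1.0, 0.02, 0.25, 1.0, 1.0

# Finite-difference delta and gamma at the base price.
h = 1e-3
p0 = bs_put(S0, K, r, sigma, T)
delta = (bs_put(S0 + h, K, r, sigma, T) - bs_put(S0 - h, K, r, sigma, T)) / (2 * h)
gamma = (bs_put(S0 + h, K, r, sigma, T) - 2 * p0 + bs_put(S0 - h, K, r, sigma, T)) / h ** 2

def taylor_put(S):
    """Second-order (delta-gamma) approximation around the base price S0."""
    dS = S - S0
    return p0 + delta * dS + 0.5 * gamma * dS ** 2

for S in (0.95, 0.80, 0.50):  # mild, moderate, severe downward stress
    true_px, approx_px = bs_put(S, K, r, sigma, T), taylor_put(S)
    print(f"S={S:.2f}  true={true_px:.4f}  delta-gamma={approx_px:.4f}  "
          f"error={approx_px - true_px:+.4f}")
```

Running this shows the approximation error at the severe stress level is orders of magnitude larger than near the base price, which is exactly the gap a better-fitting function family can close.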

Improving approximation with machine learning

Machine learning is a technology commonly used in AI. Machine learning is what enables computers to find relationships and patterns among data. Technically, traditional first-order and second-order approximation is a form of classical machine learning, akin to linear regression. But in this post we’ll leverage more modern machine learning, like neural networks, to get a better fit with ease.

Neural networks can fit functions with remarkable accuracy. You can read about the universal approximation theorem for more about this. We won’t get into why this is true or how neural networks work, but the motivation for this exercise is to use this extra good-fitting neural network to improve our approximation.

Each instrument type in the portfolio will get its own neural network. For example, in a trading portfolio, our American options will have their own network and interest rate swaps, their own network.

The fitted neural networks have a small computational footprint so they’ll run very quickly, much faster than computing the true value of the instruments. Also, we should see accuracy comparable to having run the actual valuation methods.

The data, and lots of it

Neural networks require a lot of data to train the models well. The good thing is we have a lot of data in this case, and we can generate any data we need. We’ll train the network with values of the instruments for many different combinations of the market factors. For example, if we just look at the American put option, we’ll need values of that put option for various levels of moneyness, volatility, interest rate, and time to maturity.

Most banks already have their own pricing libraries to generate this data, and they may already have much of it generated from risk simulations. If you don’t have a pricing library, you can work through this example using the QuantLib open-source pricing library. That’s what I’ve done here.

Now, start small so you don’t waste time generating tons of data up front. Use relatively sparse data points for each of the market factors, but be sure to cover the full range of values so that the model holds up under stress testing. If the model was only trained with interest rates of 3 to 5 percent, it’s not going to do well if you stress interest rates to 10 percent. Value the instruments under each combination of values.
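One way to build such a training grid is a simple cross-product of sparse points per factor. The factor names and values below are hypothetical, chosen only to show wide ranges that keep stressed scenarios inside the training domain:

```python
import itertools

# Hypothetical training grid: few points per market factor, but ranges wide
# enough that stress scenarios stay inside the data the model has seen.
moneyness  = [0.5, 0.8, 1.0, 1.2, 1.5]
volatility = [0.10, 0.25, 0.50, 0.80]
rate       = [0.00, 0.03, 0.06, 0.10]   # covers the 10-percent stress level
maturity   = [0.1, 0.5, 1.0, 2.0]

grid = list(itertools.product(moneyness, volatility, rate, maturity))
print(len(grid))  # 5 * 4 * 4 * 4 = 320 combinations to price
```

Each row of the grid would then be priced with the pricing library to produce one training observation; densifying the grid later is how you grow toward the hundreds of thousands of rows used in the post.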

Here is my input table for an American put option. It’s about 800k rows. I’ve normalized my strike price, so I can use the same model on options of varying strike prices. I’ve added moneyness in addition to underlying.

This is the input table to the model. It contains the true option prices as well as the pricing inputs. I used around 800K observations to get coverage across a wide range of values for the various pricing inputs. I did this so that my model will hold up well to stress testing.

The model

I use SAS Visual Data Mining and Machine Learning to fit the neural network to my pricing data. I can use either the visual interface or a programmatic interface. I’ll use SAS Studio and its programmatic interface to fit the model. The pre-defined neural network task in SAS Studio is a great place to start.

Before running the model, I do standardize my inputs further. Neural networks do best if you’ve adjusted the inputs to a similar range. I enable hyper-parameter auto-tuning so that SAS will select the best model parameters for me. I ask SAS to output the SAS code to run the fitted model so that I can later test and use the model.
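The standardization step itself is simple to illustrate. Here is a minimal, generic sketch of the idea, independent of any SAS tooling; the sample column values are made up:

```python
import statistics

# Minimal per-column standardization (zero mean, unit variance) -- the kind
# of input scaling described above that helps neural networks train well.
def standardize(rows):
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

data = [[0.5, 0.10],   # e.g. [moneyness, volatility]; illustrative values
        [1.0, 0.25],
        [1.5, 0.40]]
for row in standardize(data):
    print([round(x, 4) for x in row])
```

After scaling, every input column has mean zero and unit variance, so no single factor dominates the network's weight updates simply because of its units.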

The SAS Studio Neural Network task provides a wizard to specify the data and model hyper parameters. The task wizard generates the SAS code on the right. I’ve allowed auto-tuning so that SAS will find the best model configuration for me.

I train the model. It only takes a few seconds. I try the model on some new test data and it looks really good. The picture below compares the neural network approximation with the true value.

The neural network (the solid red line) fits very well to the actual option prices (solid blue line). This holds up even when asset prices are far from their base values. The base value for the underlying asset price is 1.

If your model’s done well at this point, then you can stop. If it’s not doing well, you may need to try a deeper model, a different model, or more data. SAS offers model interpretability tools, like partial dependency, to help you gauge how the model fits for different variables.

Deploying the model

If you like the way this model is approximating your trade or other financial instrument values, you can deploy the model so that it can be used to run on-demand stress tests or to speed up intra-day risk estimations. There are many ways to do this in SAS. The neural network can be published to run in SAS, in-database, in Hadoop, or in-stream with a single click. I can also access my model via a REST API, which gives me lots of deployment options. What I’ll do, though, is use these models in SAS High-Performance Risk (HPRisk) so that I can leverage the risk environment for stress testing and simulation and use its nice GUI.

HPRisk lets you specify any function, or method, to value an instrument. Given the mapping of the functions to the instruments, it coordinates a massively parallel run of the portfolio valuation for stress testing or simulation.

Remember the SAS code file we generated when we trained the neural network? I can throw that code into an HPRisk method, and now HPRisk will run the neural network I just trained.

I can specify a scenario through the HPRisk UI and instantly get the results of my approximation.


I introduced this as a practical example of AI, specifically machine learning in banking, so let’s make sure we keep it practical, by considering the following:
    • Only approximate instruments that need it. For example, if it's a European option, don’t approximate. The function to calculate its true price, the Black-Scholes equation, already runs really fast. The whole point is that you’re trying to speed up the estimation.
    • Keep in mind that this is still an approximation, so only use this when you’re willing to accept some inaccuracy.
    • In practice, you could be training hundreds of networks depending on the types of instruments you have. You’ll want to optimize the training time of the networks by training multiple networks at once. You can do this with SAS.
    • The good news is that if you train the networks on a wide range of data, you probably won’t have to retrain often. They should be pretty resilient. This is a nice perk of the neural networks over the second-order approximation whereby parameters need to be recomputed often.
    • I’ve chosen neural networks for this example, but be open to other algorithms. Note that different instruments may benefit from different algorithms. Gradient boosting and others may offer simpler, more intuitive models that achieve similar accuracy.

When it comes to AI in business, you’re most likely to succeed when you have a well-defined problem, like our stress testing that takes too long or isn’t accurate. You also need good data to work with. This example had both, which made it a good candidate for demonstrating practical AI.

More resources

Interested in other machine learning algorithms or AI technologies in general? Here are a few resources to keep learning.

Article: A guide to machine learning algorithms and their applications
Blog post: Which machine learning algorithm should I use?
Video: Supervised vs. Unsupervised Learning
Article: Five AI technologies that you need to know

Practical AI in banking was published on SAS Users.

July 25, 2018

I recently joined SAS in a brand new role: I'm a Developer Advocate.  My job is to help SAS customers who want to access the power of SAS from within other applications, or who might want to build their own applications that leverage SAS analytics.  For my first contribution, I decided to write an article about a quick task that would interest developers and that isn't already heavily documented. So was born this novice's experience in using R (and RStudio) with SAS Viya. This writing will chronicle my journey from the planning stages, all the way to running commands from RStudio on the data stored in SAS Viya. This is just the beginning; we will discuss at the end where I should go next.

Why use SAS Viya with R?

From the start, I asked myself, "What's the use case here? Why would anyone want to do this?" After a bit of research and discussion with my SAS colleagues, the answer became clear.  R is a popular programming language used by data scientists, developers, and analysts – even within organizations that also use SAS.  However, R has some well-known limitations when working with big data, and our SAS customers are often challenged to combine the work of a diverse set of tools into a well-governed analytics lifecycle. Combining developers' familiarity with R programming and the power and flexibility of SAS Viya for data storage, analytical processing, and governance seemed like a perfect exercise.  For the purpose of this scenario, think of SAS Viya as the platform, with Cloud Analytic Services (CAS) as the place where all the data is stored and processed.

How I got started with SAS Viya

I did not want to start with the task of deploying my own SAS Viya environment. This is a non-trivial activity, and not something an analyst would tackle, so the major pre-req here is you'll need access to an existing SAS Viya setup.  Fortunately for me, here at SAS we have preconfigured SAS Viya environments available on a private cloud that we can use for demos and testing.  So, SAS Viya is my server-side environment. Beyond that, a client is all I needed. I used a generic Windows machine and got busy loading some software.

What documentation did I use/follow?

I started with the official SAS documentation: SAS Scripting Wrapper for Analytics Transfer (SWAT) for R.

The Process

The first two things I installed were R and RStudio, which I found at these locations:

The installs were uneventful, so I won't list all those steps here. Next, I installed a couple of prerequisite R packages and attempted to install the SAS Scripting Wrapper for Analytics Transfer (SWAT) package for R. Think of SWAT as what allows R and SAS to work together. In an R command line, I entered the following commands:

> install.packages('httr')
> install.packages('jsonlite')
> install.packages('R-swat-1.2.1-linux64.tar.gz', repos=NULL, type='file')

When attempting the last command, I hit an error:

ERROR: dependency 'dplyr' is not available for package 'swat'
* removing 'C:/Program Files/R/R-3.5.1/library/swat'
Warning message:
In install.packages("",  :
installation of package 'C:/Users/sas/AppData/Local/Temp/2/RtmpEXUAuC/downloaded_packages/R-swat-1.2.1-linux64.tar.gz'
  had non-zero exit status

The install failed. Based on the error message, it turns out I had forgotten to install another R package:

> install.packages("dplyr")

(This dependency is documented in the R SWAT documentation, but I missed it. Since this could happen to anyone – right? – I decided to come clean here. Perhaps you'll learn from my misstep.)

After installing the dplyr package in the R session, I reran the swat install and was happy to hit a return code of zero. Success!

For brevity, I decided not to configure an authentication file, which means I'll need to pass user credentials each time I make a connection. I'll configure an authinfo file in a follow-up post.

Testing my RStudio->SAS Viya connection

From RStudio, I ran the following command to connect to the CAS server:

> library(swat)
> conn <- CAS("", 8777, protocol='http', user='user', password='password')

Now that I succeeded in connecting my R client to the CAS server, I was ready to load data and start making API calls.

How did I decide on a use case?

I'm in the process of moving houses, so I decided to find a data set on property values in the area to do some basic analysis, to see if I was getting a good deal. I did a quick google search and downloaded a .csv from a local government site. At this point, I was all set up, connected, and had data. All I needed now was to run some CAS Actions from RStudio.

CAS actions are commands that you submit through RStudio to tell the CAS server to 'do' something. One or more objects are returned to the client -- for example, a collection of data frames. CAS actions are organized into action sets and are invoked via APIs. You can find more detail on CAS actions and action sets in the SAS Viya documentation.

> citydata <- cas.read.csv(conn, "C:\\Users\\sas\\Downloads\\property.csv", sep=';')
NOTE: Cloud Analytic Services made the uploaded file available as table PROPERTY in caslib CASUSER(user).

What analysis did I perform?

I purposefully kept my analysis brief, as I just wanted to make sure that I could connect, run a few commands, and get results back.

My RStudio session, including all of the things I tried

Here is a brief series of CAS action commands that I ran from RStudio:

Get the mean value of a variable:

> cas.mean(citydata$TotalSaleValue)
          Column     Mean
1 TotalSaleValue 343806.5

Get the standard deviation of a variable:

> cas.sd(citydata$TotalSaleValue)
          Column      Std
1 TotalSaleValue 185992.9

Get boxplot data for a variable:

> cas.percentile.boxPlot(citydata$TotalSaleValue)
          Column     Q1     Q2     Q3     Mean WhiskerLo WhiskerHi Min     Max      Std    N
1 TotalSaleValue 239000 320000 418000 343806.5         0    685000   0 2318000 185992.9 5301

Get boxplot data for another variable:

> cas.percentile.boxPlot(citydata$TotalBldgSqFt)
         Column   Q1   Q2   Q3     Mean WhiskerLo WhiskerHi Min   Max      Std    N
1 TotalBldgSqFt 2522 2922 3492 3131.446      1072      4943 572 13801 1032.024 5301

Did I succeed?

I think so. Let's say the house I want is 3,000 square feet and costs $258,000. As you can see in the box plot data, I'm getting a good deal. The house size is in the second quartile, while the house cost falls in the first quartile. Yes, this is not the most in-depth statistical analysis, but I'll get more into that in a future article.

What's next?

This activity has really sparked my interest to learn more, and I will continue to expand my analysis, attempt more complex statistical procedures, and create graphs. A follow-up blog is already in the works. If this article has piqued your interest in the subject, I'd like to ask you: What would you like to see next? Please comment and I will turn my focus to those topics for a future post.

Using RStudio with SAS Viya was published on SAS Users.

April 5, 2017

Emma Warrillow, President of Data Insight Group, Inc., believes analysts add business value when they ask questions of the business, the data and the approach. “Don’t be an order taker,” she said.

Emma Warrillow at SAS Global Forum.

Warrillow held to her promise that attendees wouldn’t see a stitch of SAS programming code in her session Monday, April 3, at SAS Global Forum.

Not that she doesn't believe programming skills and SAS certifications are important. She does.

Why you need communication skills

But Warrillow believes that as technology takes on more of the heavy lifting from the analysis side, communication skills, interpretation skills and storytelling skills are quickly becoming the data analyst’s magic wand.

Warrillow likened it to the centuries-old question: If a tree falls in a forest, and no one is around to hear it, did it make a sound? “If you have a great analysis, but no one gets it or takes action, was it really a great analysis?” she asked.


To create real business value and be the unicorn – that rare breed of marketing technologist who understands both marketing and marketing technology – analysts have to understand the business and its goals and operations.

She offered several actionable tips to help make the transition, including:

1. Never just send the spreadsheet.

Or the PowerPoint or the email. “The recipient might ignore it, get frustrated or, worse yet, misinterpret it,” she said. “Instead, communicate what you’ve seen in the analysis.”

2. Be a POET.

Warrillow is a huge fan of the work of Laura Warren, who recommends an acronym approach to data-based storytelling, making sure every presentation offers:

  • Purpose: The purpose of this chart is to …
  • Observation: To illustrate that …
  • Explanation: What this means to us is …
  • Take-away or Transition: As a next step, we recommend …

3. Brand your work.

“Many of us suffer from a lack of profile in our organizations,” she said. “Take a lesson from public relations and brand yourselves. Just make sure you’re a brand people can trust. Have checks and balances in place to make sure your data is accurate.”

4. Don’t be an order taker.

Be consultative and remember that you are the expert when it comes to knowing how to structure the campaign modeling. It can be tough in some organizations, Warrillow admitted, but asking some questions and offering suggestions can be a great way to begin.

5. Tell the truth.

“Storytelling can be associated with big, tall tales,” she said. “You have to have stories that are compelling but also have truth and resonance.” One of her best resources is “The Four Truths of the Storyteller” by Peter Guber, which first appeared in Harvard Business Review in December 2007.

6. Go higher.

Knowledge and comprehension are important, “but we need to start moving further up the chain,” Warrillow said. She used Bloom’s Taxonomy to describe the importance of making data move at the speed of business – getting people to take action by moving into application, analysis, synthesis and evaluation phases.

7. Prepare for the future.

“Don’t become the person who says, ‘I’m this kind of analyst,’” she said. “We need to explore new environments, prepare ourselves with great skills. In the short term, we’re going to need more programming skills. Over time, however, we’re going to need interpretation, communication and storytelling skills.” She encouraged attendees to answer the SAS Global Forum challenge of becoming a #LifeLearner.

For more from Warrillow, read the post, Making data personal: big data made small.

7 tips for becoming a data science unicorn was published on SAS Users.

August 1, 2016

Would Taylor Swift date her suitors or not? Guess what? Data scientists may know the answer. But this time it was pupils who found the answer. Pupils? Yes, data science is for everyone, kids included. During Tech Week, a UK-wide event in July promoted by the Tech Partnership, organisations were […]

The post The Maths in the Dates – and Who Taylor Swift Will Date! appeared first on Generation SAS.

July 20, 2016

Gareth Hampson, a data scientist who graduated with an MSc in databases and web-based systems from Salford University, recently won a SAS prize for his excellent project using SAS® Enterprise Miner™. He has also been profoundly deaf since the age of 4 due to meningitis. We spoke to Gareth to find out more about […]

The post Disabilities don't mean you can't be an excellent data scientist appeared first on Generation SAS.

May 11, 2016

As the demand for analytical skills continues to grow and the data scientist has been catalogued as the sexiest job of the 21st century, more and more students are showing interest in the analytics and big data world. We asked one of our graduates to share her experiences working as […]

The post How one data scientist turns ideas into reality appeared first on Generation SAS.

April 20, 2016

Just last weekend, I was considering buying a new camera lens. I already had a few brands in mind, so I looked at their websites to learn more about their products. I compared different brands and lenses, and narrowed my choice down to a specific 50mm lens from a major brand. I added the lens to my cart online, but wanted to get a closer look at it, so I chatted online with a representative to see if any were available at stores near me. This digital channel was my first point of interaction with the brand, but what impact did that have on my buying experience? Would responsive design come into play? Would the brand proactively contact me about similar products? Or would they simply react to inquiries I had as a consumer? Today’s consumers expect immediate, individualized messages – would this brand deliver?

The fact of the matter is that a lot of brands don’t have the capabilities to modify messages, offers and interactions across channels, devices and points in time so that they are more relevant to the end consumer.

Enter SAS

SAS Customer Intelligence 360, launching this month to the marketplace, offers an all-encompassing view of customers no matter how they choose to engage with you across digital properties.

A complete customer view

SAS Customer Intelligence 360 gives you detailed insights from the digital channels customers interact with so you can create the most effective and relevant actions. The solution rapidly transforms digital data into a complete 360-degree view of the customer, meeting each customer's needs at the right time, in the right place and in the proper context. Multiple decision-making methods, such as predictive models and multivariate tests, help ensure that customers get the most relevant and personalized offers.
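To make the idea of score-driven offer selection concrete, here is a minimal, hypothetical sketch: a toy propensity model ranks candidate offers for a customer and picks the best one. The offer names, weights and scoring formula are all invented for illustration; this is not how SAS Customer Intelligence 360 is implemented.

```javascript
// Hypothetical next-best-offer sketch. Every name and number below
// is invented for illustration only.
const OFFERS = {
  lens_discount:   { baseScore: 0.30, photoAffinityWeight: 0.50 },
  camera_bundle:   { baseScore: 0.25, photoAffinityWeight: 0.20 },
  newsletter_only: { baseScore: 0.10, photoAffinityWeight: 0.05 },
};

// Toy propensity score: a base response rate plus an adjustment for
// the customer's (0..1) affinity to photography content.
function scoreOffer(offer, customer) {
  return offer.baseScore + offer.photoAffinityWeight * customer.photoAffinity;
}

// Choose the offer with the highest predicted score for this customer.
function nextBestOffer(customer) {
  return Object.keys(OFFERS).reduce((best, name) =>
    scoreOffer(OFFERS[name], customer) > scoreOffer(OFFERS[best], customer)
      ? name
      : best
  );
}

const customer = { id: "cust-42", photoAffinity: 0.8 };
console.log(nextBestOffer(customer)); // lens_discount scores highest here
```

A real deployment would replace the hand-set weights with a trained model and layer a multivariate test on top, so that variants of the winning offer can be compared across randomized customer cells.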

Data integration

Data is also easy to integrate with many offline customer channels through SAS Customer Intelligence 360 and its customer decision hub. Customer interactions are based on previous engagements across all other platforms, and the data hub converts all of this into customer-focused actions. With this data integration, the brand is able to gather my interactions and information from all available sources: not just the website, but the call center, mobile apps, social media and point of sale.

Offline customer data can be appended to digital data to further augment the view of me as a customer. These data sources, typically demographic or transactional in nature, give marketers valuable insight into a customer's true needs so they can create more relevant offers, better targeted activities and more efficient use of marketing resources. This capability allows the brand to see me as more than just page clicks. They'll see me as a father with young children, interested in photography and seeking to buy a 50mm lens to capture fleeting family moments.

Insights into future actions

You don’t need to be a data scientist to harness the power of predictive marketing; SAS Customer Intelligence 360 includes guided analytics to provide marketers a forward-looking view of customer journeys. This enables them to better understand business drivers and incorporate them into segmentation, optimization and other analytic techniques. Marketers can better forecast how customers will perform in the future. The solution acts as the data scientist – enabling marketers to become more efficient and effective in the analytical techniques they embed into marketing initiatives.

Web data collection

Each web page is embedded with a single line of HTML that automatically collects page information without expensive tagging. With this feature, data is captured even as the page changes dynamically: what I click on, the order and timing of my clicks, each keystroke, and so on. Dynamic data collection offers me more relevant content as I navigate through the brand's site. Customer activities are recorded privately and securely over time, so that once a customer is identified, the information is connected automatically.
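The "single line of HTML" pattern generally looks like the fragment below: one script include that loads the vendor's collection library asynchronously. This is a generic illustration only; the hostname, file name and `data-tenant` attribute are invented, and the actual SAS Customer Intelligence 360 snippet differs.

```html
<!-- Generic single-tag pattern (illustrative; not the real CI 360 snippet).
     The loader script, once fetched, instruments clicks, timings and
     form activity on the page without per-element tagging. -->
<script async
        src="https://tags.example.com/ci360-loader.min.js"
        data-tenant="TENANT_ID"></script>
```

The `async` attribute matters in this pattern: the browser fetches and runs the loader without blocking page rendering, so data collection does not slow the visitor's experience.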

Simply put, SAS Customer Intelligence 360 offers marketers the confidence to manage their digital customer journeys in a more personalized and profitable way. Marketers gain a complete view of their customers and transform this data using analytical insight into customer-centric knowledge and future actions. With this solution, brands can interact with customers on a personalized level and customers will be more satisfied with their entire relationship with a brand, not just a single transaction. Customer loyalty goes up and attrition goes down.

And as for me, I got the lens I was looking for, and was satisfied with the customer experience. Of course I have ideas on how to improve it on behalf of this brand, and SAS Customer Intelligence 360 fits into that picture.

tags: customer decision hub, customer journey, data hub, data scientist, Digital Intelligence, Predictive Marketing, Predictive Personalization, SAS Customer Intelligence 360

SAS Customer Intelligence 360: Digital discovery and engagement brought into focus was published on Customer Intelligence.