In-Memory Analytics

November 5, 2015

Like retailers, Communications Service Providers (CSPs) interact with customers in multiple ways during the buy, use and share journey, including interactions at stores, on websites, Facebook pages, call centers, and other channels. However, unlike retailers, CSPs are having a harder time providing a seamless experience across their many channels. For example, have […]

What communications companies need to play in the digital arena was published on SAS Voices.

April 3, 2014

Demand for analytics is at an all-time high. Monster.com has rated SAS as the number one skill to have to increase your salary, and Harvard Business Review continues to highlight why data scientist is the sexiest job of the 21st century. Clearly, if you want to be sexy and rich, you are in the right profession! Jokes aside, I have spent the past five weeks travelling around Australia, Singapore and New Zealand discussing the need to modernise analytical platforms to meet the sharp increase in demand for analytics that support better business and social outcomes.

While there are many aspects to modernisation, the most frequent topic of discussion during the roadshow was Hadoop. About 20% of the 150-plus companies were already up and running with their Hadoop playpen. Questions had moved beyond “What is Hadoop?” to “How do I leverage Hadoop as part of my analytical process?”. Within the region we have live customers using Hadoop in various ways:

  • Exploring new text based data sets like customer surveys and feedback.
  • Replicating core transaction system data to perform ad hoc queries faster, removing the need to grab extra data not currently supported in the EDW.
  • The establishment of an analytical sandpit to explore relationships that can have an impact on marketing, risk, fraud and operations by looking at new data sets and combining them with traditional data sets.

The key challenge was the same for everyone. While Hadoop provided a low-cost way to store and retrieve data, it was still a cost without an obvious business outcome. Customers were looking at how to plug Hadoop into their existing analytical processes, and quickly discovering that Hadoop comes with a complex zoo of capabilities and, consequently, skills gaps.

The SAS /Hadoop Ecosystem


Be assured that this was and is a top priority in our research and development labs. In response to our customers' concerns, our focus has been to reduce the skills needed to integrate Hadoop into the decision-making value chain. SAS offers a set of technologies that enable users to bring the full power of business analytics to Hadoop. Users can prepare and explore data, develop analytical models with the full depth and breadth of techniques, and execute the analytical model in Hadoop. This is best explained using the four key areas of the data-to-decision lifecycle process:

  • Managing data – there are a couple of gaps to address in this area. Firstly, if you need to connect to Hadoop, read and write file data or execute a MapReduce job, you can do so from your existing SAS environment. Using Base SAS, the FILENAME statement reads and writes file data to and from Hadoop. Using PROC HADOOP, users can submit HDFS commands and Pig scripts, as well as upload and execute MapReduce tasks.
    Secondly, SAS 9.4 is able to use Hadoop to store SAS data through the SAS Scalable Performance Data (SPD) Engine within Base SAS. With SAS/ACCESS to Hadoop, you can connect, read and write data to and from Hadoop as if it were any other source that SAS can connect to. From any SAS client, a connection to Hadoop can be made and users can analyse data with their favourite SAS procedures and the DATA step. SAS/ACCESS to Hadoop supports explicit HiveQL calls. This means that rather than extracting the data into SAS for processing, SAS translates these procedures into the appropriate HiveQL, which resolves the results on Hadoop and returns only the results to SAS. SAS/ACCESS to Hadoop allows the SAS user to leverage Hadoop just like they do with an RDBMS today.
  • Exploring and visualising insight – With SAS Visual Analytics, users can quickly and easily explore and visualise large amounts of data stored in the Hadoop Distributed File System, using SAS LASR Analytic Server. This is an extremely scalable, in-memory processing engine that is optimised for interactive and iterative analytics. The engine addresses the gaps in MapReduce-based analysis by persisting data in memory and taking full advantage of computing resources. Multiple users can interact with data in real time because data is not reloaded into memory for each analysis or request, there is no serial sequence of jobs, and the available computational resources can be fully exploited.
  • Building models – SAS High Performance Analytics (HPA) products (Statistics, Data Mining, Text Mining, Econometrics, Forecasting and Optimisation) provide a highly scalable in-memory infrastructure that supports Hadoop. By enabling you to apply domain-specific analytics to large data on Hadoop, it effectively eliminates the data movement between the SAS server and Hadoop. SAS provides a set of procedures that enable users to manipulate, transform, explore, model and score data, all within Hadoop. In addition, SAS In-Memory Statistics for Hadoop is an extremely fast, multi-user, interactive programming environment for data preparation, exploration, modelling and deployment in Hadoop. Users can connect to and interact with LASR from SAS Enterprise Guide, or take advantage of SAS Studio, the new web-based editor from SAS.
  • Deploying and executing models – Conventional model scoring requires the transfer of data from one system to SAS, where it is scored and then written back. In Hadoop, the movement of data from the cluster to SAS can be prohibitively expensive. Instead, you want to keep data in place and integrate SAS scoring processes on Hadoop. The SAS Scoring Accelerator for Hadoop enables analytic models created with Enterprise Miner or with core SAS/STAT procedures to be processed in Hadoop via MapReduce. This requires no data movement and is performed on the cluster in parallel, just like SAS does with other in-database accelerators.
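
To make the idea of scoring data in place more concrete, here is a minimal Python sketch. It is not SAS code and does not use any SAS or Hadoop API; it only illustrates the pattern described in the last bullet, where a scoring function is applied to each partition of the data in parallel (the map step) and only the scores are collected. The coefficients, field names and helper functions are all hypothetical, chosen purely for illustration.

```python
import math
from concurrent.futures import ThreadPoolExecutor

# Hypothetical logistic-regression coefficients, standing in for a model
# exported from a modelling tool. They are not taken from the article.
COEFFICIENTS = {"intercept": -1.0, "balance": 0.002, "late_payments": 0.8}


def score_record(record):
    """Return the predicted probability for a single record."""
    linear = (
        COEFFICIENTS["intercept"]
        + COEFFICIENTS["balance"] * record["balance"]
        + COEFFICIENTS["late_payments"] * record["late_payments"]
    )
    return 1.0 / (1.0 + math.exp(-linear))


def score_partition(partition):
    """Map step: score every record of one partition where it is stored."""
    return [(rec["id"], score_record(rec)) for rec in partition]


def score_in_place(partitions):
    """Score all partitions in parallel and collect only (id, score) pairs."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(score_partition, partitions)
    return [pair for part in results for pair in part]


# Two toy partitions standing in for blocks of data on different nodes.
partitions = [
    [{"id": 1, "balance": 500.0, "late_payments": 0}],
    [{"id": 2, "balance": 0.0, "late_payments": 3}],
]
scores = dict(score_in_place(partitions))
print(scores)
```

In this toy version the threads merely simulate cluster nodes. The point of the design is that the model travels to the data, and only a small result set travels back.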

To be ahead of competitors, we need to act now to leverage the power of Hadoop. SAS has embraced Hadoop and provided a flexible architecture to support deployment with other data warehouse technologies. SAS now enables you to analyse large, diverse and complex data sets in Hadoop within a single environment – instead of using a mix of languages and products from different vendors.

 Click here to find out how SAS can help you innovate with Hadoop.

tags: big data, data visualisation, data warehouse, Hadoop, Harvard business review, HDFS, Hive, In-Memory, in-memory analytics, MapReduce, modernization, Pig, visual analytics
February 15, 2014
I was recently part of a team discussing enterprise architecture with a chief IT architect, and we were explaining how SAS can integrate into their existing infrastructure, add business value on top of it and even fit into their future planned infrastructure. This conversation was one of the reasons I blogged about [...]
October 24, 2013
SAS has had a good week. No. 1 in Analytics: The company remains No. 1 in advanced analytics, per IDC and Forrester, not just according to marketers like me. And we remain committed to innovation. Our data visualization offering, SAS Visual Analytics, is now used by more than 500 [...]
November 27, 2012
With more than 40 insurers using SAS® Risk Management for Insurance, Solvency II requirements are a key factor. Many insurers embrace SAS to pair better risk management for compliance regulation with more efficient business operations and greater investment value.
April 17, 2012

How big does big data need to be before it is valuable?

High Performance Analytics levels the Big Data playing field


The value in big data is within reach of everyone. It could mean wanting to mine a couple of extra fields about the customer, or wanting to improve the customer profile by using Hadoop to analyse unstructured data about customer interactions. Most articles and hype about big data centre on the three Vs: Velocity, Variety and Volume. However, we should never lose sight of the fact that big data is relative to your business plan. The real conversation to be had is about the value in being nimble.

The 4th (V)alue: The intersection of big data and high performance analytics
High performance analytics is the next generation of analytical focus as we work our way through the era of big data looking for optimal ways to gain insight in shorter reporting windows. It is all about getting to the relevant data quicker and delivering that information in real time. High performance analytics is equipping David-sized organisations with the tools to level the playing field.  Examples include:

  • How a bank determines credit risk assessment in seconds instead of hours.
  • Where a government agency improves social welfare by analysing unstructured citizen interaction data.
  • An insurance company that uses census data to improve marketing response rates.
  • How an online business analyses social data to understand sentiment, and behavioral data to improve campaign targeting.

A regional healthcare provider's and an insurer's point of view

I recently listened to an Australian healthcare customer discuss their version of big data and high performance analytics. It went like this.

Through some acquisitions we have increased our database size by approximately 15 percent. This has resulted in our marketing teams being frustrated with longer than usual time-to-market for gaining customer intelligence and executing campaigns. Further compounding the issue is the competitive pressure coming from recent changes in government legislation, which is driving customers to shop around. This increase in competition means marketing needs to be more nimble, meaning more campaigns to fewer people with more relevance.

Another example is a local Insurance customer I met with to discuss their version of big data, high performance analytics and real-time analytics.

We have issued our sales force with iPads.  The challenge we face is, how do we deliver intelligence to our sales representatives in a manner where we know it is relevant, timely and contextual?  We know they are meeting with prospects and customers but how do we analyse customer data, analytical data, transactional data and interaction data to provide a Next Best Offer in seconds?

If we understand how to beat Goliath, do we know what to beat him with? A high performance approach leads us to think about the problem differently and look for a solution that optimises the analytical jobs and the way they are architecturally executed. I expect there is a target value proposition heading my way, now.

Under the hood: high performance analytics is not that scary

We often think of new technology as being like a Ferrari: out of reach or too complex for the average David. The reality is that high performance analytics provides various approaches that span the spectrum of your maturity and size, from:

  • Moving existing analytical models into operational processes for real-time decisions.
  • Optimising analytical jobs to leverage your existing in-database power.
  • Using in-memory analytics to take advantage of cheaper hardware.
  • Building an enterprise analytical platform to drive down TCO while always prioritising business value using a grid-based approach.
  • Visually exploring big data using high-performance, interactive, in-memory capabilities to understand all your data, discover new patterns and publish reports to the web and mobile devices.
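
As a rough illustration of what leveraging in-database power means, the Python sketch below contrasts extracting every row and aggregating on the client with asking the database engine to aggregate and return only the summary. It uses an in-memory SQLite database purely as a stand-in for a data warehouse; the table name, columns and values are hypothetical and not taken from any customer example.

```python
import sqlite3

# In-memory SQLite database standing in for a warehouse (an assumption
# made for illustration; any SQL engine would show the same pattern).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [(1, 10.0), (1, 15.0), (2, 7.5), (2, 2.5), (2, 5.0)],
)

# Extract-then-process: every row is moved to the client before aggregation.
rows = conn.execute("SELECT customer_id, amount FROM transactions").fetchall()
totals_client = {}
for customer_id, amount in rows:
    totals_client[customer_id] = totals_client.get(customer_id, 0.0) + amount

# In-database: the engine aggregates and returns one row per customer.
summary = conn.execute(
    "SELECT customer_id, SUM(amount) FROM transactions "
    "GROUP BY customer_id ORDER BY customer_id"
).fetchall()
totals_db = dict(summary)

print(len(rows), "rows moved versus", len(summary), "rows moved")
```

Both approaches give the same totals, but the second moves far less data. That difference is what matters once tables reach warehouse scale.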

The democratisation of analytics, especially high performance analytics, has allowed every company, whether Goliath or David-sized, to benefit from big data. Over the next few weeks we will be discussing the impact of the intersection of big data and high performance analytics, in particular providing examples relevant to the world we live in left of the date line. Join the discussion to find out what the innovators are doing and the lessons we can learn locally. You can see some more examples here.

What is your big data opportunity? Tell us in the comments below.

tags: big data, customer intelligence, Hadoop, healthcare, high performance analytics, HPA, in-memory analytics