SAS Event Stream Processing

July 26, 2019

In a previous post, Zero to SAS in 60 Seconds: SAS Machine Learning on SAS Analytics Cloud, I documented my experience with a SAS free trial on the SAS Analytics Cloud. Well, the engineers at SAS have been busy and have created another free trial. The new trial covers SAS Event Stream Processing (ESP).

This time last year (when I was just starting at SAS), I only knew ESP as extrasensory perception. I'm more enlightened now. Working through this exercise showed me how event stream processing applies machine learning and streaming analytics to data in motion, uncovering insights for real-time decision making. In a nutshell, you create a model, stream your data, process the results, and make timely decisions based on those results.

The trial uses SAS ESPPy, allowing you to embed an ESP project inside a Python pipeline. To see ESPPy in action, take a look at this video. To learn more about ESP and IoT, see this article on the SAS Communities Library. In this article, I chronicle my journey through the trial while introducing key concepts and operations of ESP.
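To give a feel for what the trial's notebooks do, here is a minimal ESPPy sketch for connecting to an ESP server. The URL is a placeholder; the trial's JupyterLab environment supplies the real host and port for you.

```python
import esppy

# Connect to a running ESP server (placeholder URL -- in the trial,
# the notebooks are pre-wired with the correct host and port).
esp = esppy.ESP('http://esp-server:9900')

# Print basic server metadata to confirm the connection works.
print(esp.server_info)
```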

Register and get started

The registration and initial login process is identical to the one in the machine learning article. You must have a SAS Profile to participate in the trial. The only difference is that you need to follow this link to sign up for the ESP trial. Please refer to the machine learning article for detailed steps for signing up and logging in.

The use case

SAS Solar Farm in Cary

The SAS Solar Farm sits on almost 12 acres of SAS Headquarters property. There are 10,276 solar panels producing more than 3.6 million kilowatt hours annually. That’s enough power for more than 325 average-sized U.S. homes.

As part of managing the environment, it is important to continuously monitor the operation of the solar panels to optimize configuration parameters, detect potential equipment failure, and accurately forecast the amount of energy generated. Factors considered include panel angles, time of day, seasons, and weather patterns, because the energy generated depends directly on the amount of sun available to the panels.

The ESP project in this demo is pre-loaded in the trial and is run through a Jupyter notebook. The project monitors the energy (kWh) and power (kW) generated during a specific time interval, eliminating localized outlier effects and triggering alerts when the energy generated differs between subsequent time intervals by more than a pre-defined amount.

Solar Farm Data represented as digital art

Take two minutes and watch this video on how SAS uses SAS software to create a work of art with solar farm data.

Disclaimer: no sheep were harmed during data collection or writing of this article.

Navigating the trial

Once logged into the trial, you see the Applications screen.

ESP trial Applications screen

The Data and Team options in the left pane behave exactly as those in the machine learning trial. These sections allow you to access data and manage a multi-user system. Select the SAS Event Stream Processing icon to start a JupyterLab session.

JupyterLab home screen

I will not go into the details of JupyterLab here. The left pane contains menus, file management, and other options. The pane on the right displays three options:

  • Python 3 Notebook - a blank Jupyter notebook: documents that combine live, runnable code with narrative text (Markdown), equations (LaTeX), images, interactive visualizations, and other rich output
  • Python 3 Console - a blank Python console: code consoles enable you to run code interactively in a kernel
  • Text File - a basic text editor: enables you to edit text files in JupyterLab

For this article, we're going to follow along and interact with the pre-loaded demo Solar Farm ESP project. To locate the Jupyter notebook, double-click the demo directory in the left pane.

Select the demo directory from the left pane

Next, select Event_Stream_Processing. Before proceeding with the demo, I'd highly suggest opening the README.ipynb file.

Contents of the README notebook

Here you will find an overview of the trial and how the environment is organized. The trial uses SAS ESPPy for designing, testing, and deploying projects on ESP servers.

Step through the demo

Before starting the trial, I needed a little background on event stream processing. I located the SAS ESP product documentation. I recommend referring to it for details on the ESP model, objects, and workflow.

To access the demo, double-click the demo directory in the left pane. The trial comes with five pre-loaded demos. Feel free to try any or all of them. Double-click ESP Basic Project - Solar Farm.ipynb to display the Solar Farm notebook. The notebook walks you through ESP model creation and execution. To run a command, place the cursor in a command cell and select the 'Run' button (the triangle-shaped button at the top of the notebook). If no response returns when running a cell block, assume the commands ran successfully.

Below is a brief description of the steps in the project; a minimal ESPPy sketch follows.

  1. Create the project and query used - this creates dedicated space and objects where the ESP process takes place
  2. Create input and aggregate windows - this action extracts desired data and creates data subsets from the stream
  3. Add a join window - this brings together lag and current values into the project
  4. Add a compute window - this calculates the difference between the previous and current event
  5. Add a filter window - this action filters occurrences outside a threshold value; this creates an alert for potential mechanical issues
  6. Define workflow connections - this defines the workflow between the various windows in the project
  7. Save the project - this generates an XML file for the project
  8. Load the project to the ESP Server - this loads the project and produces a graphical representation of the workflow

    Solar Farm project workflow

  9. Start streaming data - in this example, rather than streaming data in real time, the stream derives from the solar farm table data
  10. View solar farm data - this creates a graphical representation of streaming data

    Solar Farm graph for kW and kWh

While not included in the demo, the streaming data would pass through the filter, and if a threshold breach occurred, an alert would be created. Considering the graph above, alerts could very well have occurred just before 1:15 pm (IntkW drops from 185 to 150) and just before 2:30 pm (IntkW drops from 125 to 35).
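To make those steps concrete, here is a minimal ESPPy sketch in the spirit of the notebook. The window names, schema, and alert threshold are illustrative assumptions rather than the trial's exact code, and the notebook's join/compute pair is approximated here with a single Lag calculation.

```python
import esppy

esp = esppy.ESP('http://esp-server:9900')        # placeholder URL
proj = esp.create_project('solar_farm')

# Steps 1-2: a source window receiving the panel measurements
src = esp.SourceWindow(
    schema=('id*:int64', 'ts:stamp', 'IntkW:double', 'IntkWh:double'),
    index_type='empty', insert_only=True)
proj.windows['w_source'] = src

# Steps 3-4: lag the power reading so the previous and current
# intervals can be compared
lag = esp.CalculateWindow(
    schema=('id*:int64', 'IntkW:double', 'IntkW_lag:double'),
    index_type='empty', produces_only_inserts=True,
    input_map=dict(input=['IntkW']),
    output_map=dict(lagOut=['IntkW_lag']),
    algorithm='Lag')
proj.windows['w_lag'] = lag

# Step 5: alert when power drops by more than a threshold between
# intervals (threshold value and parameter name are assumptions)
alert = esp.FilterWindow(expression='(IntkW_lag - IntkW) > 25')
proj.windows['w_alert'] = alert

# Step 6: define the workflow connections
src.add_target(lag, role='data')
lag.add_target(alert, role='data')

# Step 7: save the project as XML (assumes ESPPy's to_xml export)
with open('solar_farm.xml', 'w') as f:
    f.write(proj.to_xml(pretty=True))

# Step 8: load the project onto the ESP server
esp.load_project(proj)

# Steps 9-10: replay the table data as a stream and watch the results
src.publish_events('solar_farm.csv')   # hypothetical CSV of table data
alert.subscribe()                      # window now behaves like a DataFrame
```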

Your turn

Now that you have a taste of ESP, feel free to step through the rest of the demos. You may also load your own data and create your own ESP models. Feel free to share your experience and what you create by leaving a comment.

SAS Event Stream Processing on SAS Analytics Cloud - my journey was published on SAS Users.

December 15, 2017

What is blockchain and how can you analyze data in a blockchain? This article will discuss various forms of blockchain analytics from a tactical or heuristic perspective. I’ll explain how SAS technologies can provide advanced analytics for operational, value/asset and regulatory viewpoints in the diverse world of open source blockchain [...]

A practical approach to blockchain analytics was published on SAS Voices by Sam Penfield

April 4, 2017

Two minutes in, I knew the 2017 SAS Global Forum Technology Connection would be anything but typical or average. Maybe that’s because SAS’ Chief Technology Officer Oliver Schabenberger was running the show, and nothing he does is ever typical or average. His first surprise of the morning was his entrance. [...]

Impressive technology, surprising connections was published on SAS Voices by Marcie Montague

October 20, 2016

In a previous blog post, I demonstrated combining the power of SAS Event Stream Processing (ESP) and the SAS Quality Knowledge Base (QKB), a key component of our SAS Data Quality offerings. In this post, I will expand on the topic and show how you can work with data from multiple QKB locales in your event stream.

To illustrate how to do this, I will review an example where I have event stream data that contains North American postal codes. I need to standardize the values appropriately depending on where they are from – United States, Canada, or Mexico – using the Postal Code Standardization definition from the appropriate QKB locale. Note: This example assumes that the QKB for Contact Information has been installed and that the license file pointed to by the DFESP_QKB_LIC environment variable contains a valid license for these locales.

In an ESP Compute window, I first need to initialize the call to the BlueFusion Expression Engine Language function and load the three QKB locales needed – ENUSA (English – United States), ENCAN (English – Canada), and ESMEX (Spanish – Mexico).

[Screenshot: Initialize expression loading the ENUSA, ENCAN, and ESMEX locales]

Next, I need to call the appropriate Postal Code QKB Standardization definition based on the country the data is coming from.  However, to do this, I first need to standardize the Country information in my streaming data; therefore, I call the Country (ISO 3-character) Standardization definition.

[Screenshot: calling the Country (ISO 3-character) Standardization definition]

After that is done, I use a series of if/else statements to standardize the Postal Codes using the appropriate QKB locale definition based on the Country_Standardized value computed above. The resulting standardized Postal Code value is returned in the output field named PostalCode_STND.

[Screenshot: if/else logic standardizing Postal Codes by locale]
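Because the screenshots don't reproduce the code itself, here is a rough reconstruction of the two Compute window expressions. They are DataFlux Expression Engine Language code, shown here as Python strings purely for illustration; the exact definition names and the locale-qualified form of the standardize call are assumptions to verify against the QKB documentation.

```python
# Initialize expression: create the BlueFusion instance, declare the
# working fields, and load the three locales.
initialize_expr = """
declare bluefusion bf;
declare string country_stnd;
declare string result;
bf = bluefusion_initialize();
bf.loadqkb('ENUSA');
bf.loadqkb('ENCAN');
bf.loadqkb('ESMEX')
"""

# Field expression for PostalCode_STND: standardize the country first,
# then branch to the Postal Code definition of the matching locale
# (the four-argument, locale-qualified call is an assumption).
postal_code_stnd_expr = """
bf.standardize('Country (ISO 3 Char)', Country, country_stnd);
if country_stnd == 'USA' then
   bf.standardize('ENUSA', 'Postal Code', PostalCode, result)
else if country_stnd == 'CAN' then
   bf.standardize('ENCAN', 'Postal Code', PostalCode, result)
else if country_stnd == 'MEX' then
   bf.standardize('ESMEX', 'Postal Code', PostalCode, result);
result
"""
```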

I can review the output of the Compute window by testing the ESP Studio project and subscribing to the Compute window.

[Screenshot: Compute window output in ESP Studio test mode]

Here is the XML code for the SAS ESP project reviewed in this blog:

[Screenshot: XML code for the ESP project]

Now that the Postal Code values for the various locales have been standardized for the event stream, I can add analyses to my ESP Studio project based on those standardized values.

For more information, please refer to the product documentation:
SAS Event Stream Processing
DataFlux Expression Engine Language
SAS Quality Knowledge Base

Learn more about a sustainable approach to data quality.

tags: data management, SAS Data Quality, SAS Event Stream Processing, SAS Professional Services

Using multiple SAS Quality Knowledge Base locales in a SAS Event Stream Processing compute window was published on SAS Users.

October 5, 2016

SAS Event Stream Processing (ESP) can not only process structured streaming events (collections of fields) in real time, but also has very advanced features for collecting and analyzing unstructured events. Twitter is one of the most well-known social network applications and probably the first that comes to mind when thinking about streaming data sources. On the other hand, SAS has powerful solutions for analyzing unstructured data with SAS Text Analytics. This post is about merging two needs: collecting unstructured data coming from Twitter and doing some text analytics processing on tweets (contextual extraction, content categorization, and sentiment analysis).

Before moving forward, note that SAS ESP is based on a publish-and-subscribe model. Events are injected into an ESP model using an “adapter” or a “connector,” or using Python and the publisher API. Target applications consume enriched events output by ESP using the same technologies, “adapters” and “connectors.” SAS ESP provides many of them in order to integrate with static and dynamic applications.

An ESP model flow is composed of “windows,” which define the types of transformation to perform on streaming events. These can be basic data management (join, compute, filter, aggregate, etc.) as well as advanced processing (data quality, pattern detection, streaming analytics, etc.).

SAS ESP Twitter Adapters background

SAS ESP 4.2 provides two adapters to connect to Twitter as a data source and to publish events from Twitter (one event per tweet) to a running ESP model. There are no equivalent connectors for Twitter.

Both adapters are publishers only:

  • Twitter Publisher Adapter
  • Twitter Gnip Publisher Adapter

The second one is more advanced: it uses a different API (Gnip, acquired by Twitter) and provides additional capabilities (access to the history of tweets) and better performance. The adapter builds event blocks from a Twitter Gnip firehose stream and publishes them to a source window. Access to this Twitter stream is restricted to Twitter-approved parties and requires a signed agreement.

In this article, we will focus on the first adapter. It consumes Twitter streams and injects event blocks into source windows of an ESP engine. This adapter can be used free of charge. The default access level of a Twitter account allows us to use the following methods:

  • Sample: Starts listening on a random sample of all public statuses.
  • Filter: Starts consuming public statuses that match one or more filter predicates.

SAS ESP Text Analytics background

SAS ESP 4.1/4.2 provides three window types (event transformation nodes) to perform Text Analytics in real time on incoming events.

The term text analytics describes a set of linguistic, statistical, and machine learning techniques that model and structure the information content of textual sources for business intelligence, exploratory data analysis, research, or investigation.


Here are the SAS ESP Text Analytics features:

  • “Text Category” window:
    • Content categorization or document classification into topics
    • Automatically identify or extract content that matches predefined criteria to more easily search by, report on, and model/segment by important themes
    • Relies on “.mco” binary files coming from the SAS Contextual Analysis solution
  • “Text Context” window:
    • Contextual extraction of named entities (people, titles, locations, dates, companies, etc.) or facts of interest
    • Relies on “.li” binary files coming from the SAS Contextual Analysis solution
  • “Text Sentiment” window:
    • Sentiment analysis of text coming from documents, social networks, emails, etc.
    • Classify documents and specific attributes/features as having positive, negative, or neutral/mixed tone
    • Relies on “.sam” binary files coming from the SAS Sentiment Analysis solution

Binary files (“.mco”, “.li”, “.sam”) cannot be reverse-engineered. To modify them, use the original projects in their corresponding solutions (SAS Contextual Analysis or SAS Sentiment Analysis).

The ESP project

The following ESP project aims to:

  • Wait for events coming from Twitter in the sourceTwitter window (a source window is the only entry point for streaming events)
  • Perform basic event processing and counting
  • Perform text analytics on tweets (in the input stream, the tweet text is injected as a single field)

[Screenshot: ESP project flow diagram]

Let’s have a look at potential text analytics results.

Here is a sample of the Twitter stream that SAS ESP is able to capture (the tweet text is collected in a field called tw_Text):

[Screenshot: sample of captured tweets in the tw_Text field]

The “Text Category” window, with an associated “.mco” file, is able to classify tweets into topics/categories with a related score:

[Screenshot: Text Category results with topic scores]

The “Text Context” window, with an associated “.li” file, is able to extract terms and their corresponding entity (person, location, currency, etc.) from a tweet:

[Screenshot: Text Context extracted terms and entities]

The “Text Sentiment” window, with an associated “.sam” file, is able to determine a sentiment with a probability from a tweet:

[Screenshot: Text Sentiment results with probabilities]
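Although this post predates it, the ESPPy package now offers a convenient way to watch such enriched events from Python. Here is a sketch; the project, query, and window names are placeholders, and the window path format should be checked against the ESPPy documentation.

```python
import time
import esppy

esp = esppy.ESP('http://esp-server:9900')    # placeholder URL

# Look up the Text Sentiment window by its project/query/window path
# (names assumed for illustration).
sentiment = esp.get_window('twitter_project/cq_twitter/w_text_sentiment')

sentiment.subscribe()     # the window now fills like a streaming DataFrame
time.sleep(10)            # let a few tweets flow through
print(sentiment.head())   # e.g. tw_Text, sentiment, and probability columns
```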

Run the Twitter adapter

To inject events into a running ESP model, the Twitter adapter must be started; it then publishes live tweets into the sourceTwitter window of our model.

[Screenshot: Twitter adapter command line]

Here we search for tweets containing “iphone”, but you can change this to any keyword you want to track (assuming people are tweeting about that keyword…).

There are many additional options: -f allows you to follow specific user IDs, -p allows you to specify locations of interest, and so on.

Consume enriched events with SAS ESP Streamviewer

SAS ESP provides a way to render events graphically in real time. Here is an example of how to consume real-time events in a powerful dashboard.

[Screenshot: SAS ESP Streamviewer dashboard]

Conclusion

With SAS ESP, you can bring the power of SAS Analytics into the real-time world. Performing text analytics (content categorization, sentiment analysis, reputation management, etc.) on the fly on text coming from tweets, documents, emails, and more, and then triggering relevant actions, has never been so simple and so fast.

tags: SAS Event Stream Processing, SAS Professional Services, SAS Text Analytics, twitter

How to perform real time Text Analytics on Twitter streaming data in SAS ESP was published on SAS Users.

September 16, 2016

The SAS Quality Knowledge Base (QKB) is a collection of files that store data and logic defining data cleansing operations such as parsing, standardization, and generating match codes to facilitate fuzzy matching. Various SAS software products reference the QKB when performing data quality operations on your data. One of these products is SAS Event Stream Processing (ESP), which enables programmers to build applications that quickly process and analyze streaming events. In this blog, we will look at combining the power of these two products – SAS ESP and the SAS QKB.

SAS Event Stream Processing (ESP) Studio can call definitions from the SAS Quality Knowledge Base (QKB) in its Compute window. The Compute window enables the transformation of input events into output events through computed manipulations of the input event stream fields. One of the computed manipulations that can be used is calling the QKB definitions by using the BlueFusion Expression Engine Language function.

Before QKB definitions can be used in ESP projects, the QKB must be installed on the SAS ESP server. Also, two environment variables must be set: DFESP_QKB and DFESP_QKB_LIC. The environment variable DFESP_QKB should be set to the path where the QKB data was installed. The environment variable DFESP_QKB_LIC should be set to the path and filename of the file that contains the license(s) for the QKB locale(s).
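For illustration, the two variables might look like this; the paths are hypothetical, and in practice they are exported in the environment that launches the ESP server rather than set from Python afterwards.

```python
import os

# Hypothetical paths -- substitute your actual QKB install and license
# locations, and set these before the ESP server starts.
os.environ['DFESP_QKB'] = '/opt/sas/qkb/ci'               # QKB data directory
os.environ['DFESP_QKB_LIC'] = '/opt/sas/license/qkb.lic'  # locale license file
```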

In this post, I will explore the example of calling the State/Province (Abbreviation) Standardization QKB definition from the English – United States locale in the ESP Compute window. The Source window reads in events that contain US state data that may or may not be standardized to the 2-character US state abbreviation.

[Screenshot: ESP Studio project diagram]

As part of the event stream analysis I want to perform, I need the US_State values to be in a standard format. To do this I will utilize the State/Province (Abbreviation) Standardization QKB definition from the English – United States locale.

First, I need to initialize the call to the BlueFusion Expression Engine Language function and load the ENUSA (English – United States) locale. Note: The license file that the DFESP_QKB_LIC environment variable points to must contain a license for this locale.

[Screenshot: Initialize expression loading the ENUSA locale]

Next, I need to call the QKB definition and return its result. In this case, I am calling the BlueFusion standardize function. This function expects the following inputs: the definition name, the input field to standardize, and the output field for the standardized value. Here, the definition name is State/Province (Abbreviation), the input field is US_State, and the output field is result. Note: The field result was declared in the Initialize expression pictured above. This result value is returned in the output field named US_State_STND.

[Screenshot: standardize call for the State/Province (Abbreviation) definition]
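Putting the two expressions together, here is a sketch of the Compute window code. It is DataFlux Expression Engine Language, shown as Python strings for illustration, and is a reconstruction of the screenshots rather than verbatim code.

```python
# Initialize expression: create the BlueFusion instance, declare the
# result field, and load the English - United States locale.
initialize_expr = """
declare bluefusion bf;
declare string result;
bf = bluefusion_initialize();
bf.loadqkb('ENUSA')
"""

# Field expression for US_State_STND: call standardize with the
# definition name, the input field, and the output field, then
# return the standardized value.
us_state_stnd_expr = """
bf.standardize('State/Province (Abbreviation)', US_State, result);
result
"""
```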

I can review the output of the Compute window by testing the ESP Studio project and subscribing to the Compute window.

[Screenshot: Compute window output in ESP Studio test mode]

Here is the XML code for the SAS ESP project reviewed in this blog:

[Screenshot: XML code for the ESP project]

Now that the US_State values have been standardized for the event stream, I can add analyses to my ESP Studio project based on those standardized values.

For more information, please refer to the product documentation:
SAS Event Stream Processing
DataFlux Expression Engine Language
SAS Quality Knowledge Base

tags: DataFlux Data Management Studio, SAS Event Stream Processing, SAS Professional Services

Using SAS Quality Knowledge Base Definitions in a SAS Event Stream Processing Compute Window was published on SAS Users.

September 29, 2015

SAS has updated its data management suite of software to help data scientists and IT personnel easily identify and use the right data at the right time.
April 30, 2015

SAS Event Stream Processing that is! The latest release of SAS Event Stream Processing will launch May 12, and numerous customers around the globe are already using it. So what’s the big deal?

Why event streams are important to business

SAS Event Stream Processing allows organizations to react to events virtually instantaneously. Consider the following scenarios. Imagine if:

  • An online retailer creates custom offers as customers click around the web
  • An oil company automatically vents a pipeline to a reservoir when sensors detect an increase in pressure
  • A financial regulator reverses predatory trading immediately after it occurs

These are all extremely high-value scenarios, and the reason is the rapid reaction time. If the retailer markets to customers after they have already ended their sessions, it’s less effective. If the oil pipeline breaks, it is a disaster. If the predatory trading isn’t reversed for months, markets and people’s lives are negatively affected.

The point is that the value of information decreases dramatically over time. While it’s often helpful to analyze historical data, the faster you can use that analysis to react to actual events, the more valuable it can be to the bottom line. As the graphic below shows, the quicker you take action following a business event, the more that action will be worth in money earned or saved.

[Graphic: the value of information declines as time elapses after a business event]

Developing event stream models

The recommended path for executing SAS Event Stream Processing models is through the XML Factory Server. To support faster model development, SAS recently added a visual development environment called the SAS Event Stream Processing Studio. Using this graphical interface, model designers can drag and drop windows onto the workspace area to create the appropriate data-centric flow and apply processing rules to address any kind of business need. Simple and intuitive.

SAS Event Stream Processing Studio showing code and associated process flow.

Once the project is designed, it can be tested directly within SAS Event Stream Processing Studio; if everything is fine, the XML model that’s automatically generated from the interface can be published to the appropriate server for execution.

SAS Event Stream Processing Studio supports faster analysis and detection of events with:

  • an intuitive environment for developing and testing projects (aka models)
  • a palette of windows and connectors that can be used to design even the most complex event streaming models
  • a definition and testing environment that reduces the need for programming in XML or C++
  • the ability to easily instantiate visually defined models on the XML factory server, connecting to live data streams for model validation
  • full visibility into the automatically generated XML code, which can be further customized with edits and additions

For more information

tags: event stream processing, SAS Event Stream Processing, SAS Professional Services, SAS Programmers

The post You react so quickly! Do you have ESP? appeared first on SAS Users.