Unless you live under a rock, you've probably seen news reports that Russian trolls have been posting on social media to allegedly conduct what they called "information warfare against the United States, with the stated goal of spreading distrust towards the candidates and the political system in general." NBC recently [...]
Since Trump became the US president, many people have noticed that he posts a lot of tweets. While some people choose to analyze and critique the content of those tweets, I was more curious about something a little less controversial - the timing and quantity. Follow along as I dig into [...]
@philsimon says that old stalwarts sometimes just don't cut it.
The post Everything is not a hammer: How new tools can make sense out of streaming data appeared first on The Data Roundtable.
SAS Event Stream Processing (ESP) can not only process structured streaming events (a collection of fields) in real time, but also has advanced features for collecting and analyzing unstructured events. Twitter is one of the best-known social network applications and probably the first that comes to mind as a streaming data source. SAS also has powerful solutions for analyzing unstructured data with SAS Text Analytics. This post is about combining two needs: collecting unstructured data from Twitter and running text analytics on the tweets (contextual extraction, content categorization and sentiment analysis).
Before moving forward, some background: SAS ESP is based on a publish and subscribe model. Events are injected into an ESP model using an “adapter” or a “connector,” or using Python and the publisher API. Target applications consume the enriched events output by ESP using the same technology, “adapters” and “connectors.” SAS ESP provides many of them in order to integrate with static and dynamic applications.
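To make the publish and subscribe idea concrete, here is a minimal sketch in plain Python. It does not use the SAS ESP publisher API: the `ToyBroker` class and its `subscribe` and `publish` methods are hypothetical names invented for this illustration, and the window name and the `tw_Text` field are simply borrowed from the project described later in this post.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Event = Dict[str, object]


class ToyBroker:
    """Hypothetical in-memory stand-in for a publish/subscribe engine.

    This is NOT the SAS ESP API; it only illustrates the pattern.
    """

    def __init__(self) -> None:
        # Maps a window name to the callbacks subscribed to it.
        self._subscribers: Dict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, window: str, callback: Callable[[Event], None]) -> None:
        """Register a consumer that is called for every event published to `window`."""
        self._subscribers[window].append(callback)

    def publish(self, window: str, event: Event) -> None:
        """Deliver one event to every consumer subscribed to `window`."""
        for callback in self._subscribers[window]:
            callback(event)


# A consumer that simply collects the events it receives.
received: List[Event] = []

broker = ToyBroker()
broker.subscribe("sourceTwitter", received.append)
broker.publish("sourceTwitter", {"tw_Text": "my new iphone is great"})
```

The publisher and the consumer never reference each other directly; they only share a window name, which is the property that lets adapters and connectors be swapped independently of the model.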
An ESP model flow is composed of “windows,” each of which represents a type of transformation to perform on streaming events. A window can do basic data management (join, compute, filter, aggregate, etc.) or advanced processing (data quality, pattern detection, streaming analytics, etc.).
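As a rough mental model of chained windows, the sketch below strings together a filter, a compute and an aggregate step using plain Python generators. It is only an analogy, not how an ESP model is actually defined. The function names and the `lang` field are assumptions made for the example; `tw_Text` is the tweet text field used later in this post.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator

Event = Dict[str, object]


def filter_window(events: Iterable[Event], keyword: str) -> Iterator[Event]:
    """Keep only events whose text mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    for event in events:
        if needle in str(event["tw_Text"]).lower():
            yield event


def compute_window(events: Iterable[Event]) -> Iterator[Event]:
    """Add a derived field (the text length) to each event."""
    for event in events:
        yield {**event, "text_length": len(str(event["tw_Text"]))}


def aggregate_window(events: Iterable[Event]) -> Counter:
    """Count events per value of the (hypothetical) `lang` field."""
    return Counter(str(event["lang"]) for event in events)


# Made-up sample events, for illustration only.
events = [
    {"tw_Text": "Love my new iPhone", "lang": "en"},
    {"tw_Text": "Rainy day in Dublin", "lang": "en"},
    {"tw_Text": "Mon iphone est en panne", "lang": "fr"},
]

# Each step consumes the output of the previous one, like windows in a flow.
counts = aggregate_window(compute_window(filter_window(events, "iphone")))
```

Because each step only sees the events handed to it by the previous one, steps can be reordered or replaced without touching the rest of the flow.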
SAS ESP Twitter Adapters background
SAS ESP 4.2 provides two adapters to connect to Twitter as a data source and to publish events from Twitter (one event per tweet) to a running ESP model. There are no equivalent connectors for Twitter.
Both adapters are publisher-only:
- Twitter Publisher Adapter
- Twitter Gnip Publisher Adapter
The second one is more advanced, using a different API (GNIP, bought by Twitter) and providing additional capabilities (access to history of tweets) and performance. The adapter builds event blocks from a Twitter Gnip firehose stream and publishes them to a source window. Access to this Twitter stream is restricted to Twitter-approved parties. Access requires a signed agreement.
In this article, we will focus on the first adapter. It consumes Twitter streams and injects event blocks into source windows of an ESP engine. Part of its functionality is available at no cost: the default access level of a Twitter account allows us to use the following methods:
- Sample: Starts listening on random sample of all public statuses.
- Filter: Starts consuming public statuses that match one or more filter predicates.
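The difference between the two methods can be sketched in plain Python. This is an illustration of the idea only: it does not call the Twitter API or the adapter, the function names are hypothetical, and the simple substring test is an assumption that stands in for Twitter's real matching rules, which are more elaborate.

```python
import random
from typing import Iterable, Iterator, Optional, Sequence


def sample_stream(tweets: Iterable[str], rate: float,
                  seed: Optional[int] = None) -> Iterator[str]:
    """Sample: pass each tweet through with probability `rate`, whatever its content."""
    rng = random.Random(seed)
    for tweet in tweets:
        if rng.random() < rate:
            yield tweet


def filter_stream(tweets: Iterable[str], keywords: Sequence[str]) -> Iterator[str]:
    """Filter: pass a tweet through only if it matches at least one predicate.

    Here a predicate is a case-insensitive substring match (an assumption).
    """
    lowered = [keyword.lower() for keyword in keywords]
    for tweet in tweets:
        text = tweet.lower()
        if any(keyword in text for keyword in lowered):
            yield tweet


# Made-up tweets, for illustration only.
tweets = [
    "Love my new iPhone",
    "Rainy day in Dublin",
    "Mon iphone est en panne",
]

matched = list(filter_stream(tweets, ["iphone"]))
```

Sample is useful for getting a feel for overall traffic, while Filter is what you want when tracking a specific keyword, as this post does later.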
SAS ESP Text Analytics background
SAS ESP 4.1/4.2 provides three window types (event transformation nodes) to perform Text Analytics in real time on incoming events.
The term text analytics describes a set of linguistic, statistical, and machine learning techniques that model and structure the information content of textual sources for business intelligence, exploratory data analysis, research, or investigation.
Here are the SAS ESP Text Analytics features:
- “Text Category” window:
  - Content categorization or document classification into topics
  - Automatically identify or extract content that matches predefined criteria to more easily search by, report on, and model/segment by important themes
  - Relies on “.mco” binary files coming from SAS Contextual Analysis solution
- “Text Context” window:
  - Contextual extraction of named entities (people, titles, locations, dates, companies, etc.) or facts of interest
  - Relies on “.li” binary files coming from SAS Contextual Analysis solution
- “Text Sentiment” window:
  - Sentiment analysis of text coming from documents, social networks, emails, etc.
  - Classify documents and specific attributes/features as having positive, negative, or neutral/mixed tone
  - Relies on “.sam” binary files coming from SAS Sentiment Analysis solution
Binary files (“.mco”, “.li”, “.sam”) cannot be reverse engineered. The original projects in their corresponding solutions (SAS Contextual Analysis or SAS Sentiment Analysis) should be used to perform modifications on those binaries.
The ESP project
The following ESP project aims to:
- Wait for events coming from Twitter in the sourceTwitter window (this is a source window, the only entry point for streaming events)
- Perform basic event processing and counting
- Perform text analytics on tweets (in the input stream, the tweet text is injected as a single field)
Let’s have a look at potential text analytics results.
Here is a sample of the Twitter stream that SAS ESP is able to catch (the tweet text is collected in a field called tw_Text):
The “Text Category” window, with an associated “.mco” file, is able to classify tweets into topics/categories with a related score:
The “Text Context” window, with an associated “.li” file, is able to extract terms and their corresponding entity (person, location, currency, etc.) from a tweet:
The “Text Sentiment” window, with an associated “.sam” file, is able to determine a sentiment with a probability from a tweet:
Run the Twitter adapter
To inject events into a running ESP model, the Twitter adapter must be started. It then publishes live tweets into the sourceTwitter window of our model.
Here we search for tweets containing “iphone”, but you can change to any keyword you want to track (assuming people are tweeting on that keyword…).
There are many additional options: -f lets you follow specific user IDs, -p lets you specify locations of interest, etc.
Consume enriched events with SAS ESP Streamviewer
SAS ESP provides a way to render events graphically in real time. Here is an example of how to consume real-time events in a powerful dashboard.
With SAS ESP, you can bring the power of SAS Analytics into the real-time world. Performing Text Analytics (content categorization, sentiment analysis, reputation management, etc.) on the fly on text coming from tweets, documents, emails, etc., and then triggering relevant actions, has never been so simple or so fast.
The way your graph looks can make all the difference ... two people can graph the exact same data in essentially the same way, but one of the two graphs can be perceived as much better than the other. Hopefully reading my blogs will help you create the better graph! […]
Disclaimer: before you get overly excited, PROC EXPAT is not really an actual SAS procedure. Sadly, it will not transfer or translate your code based on location. But it does represent SAS’ expansion of the Customer Contact Center, and that’s good news for our users. Here’s the story behind my made-up proc.
“Buon giorno!” “Guten Tag!” “Bonjour!” Excitement is in the air, the team buzzes. I’m not at an international airport, I’m at the new SAS office in Dublin, Ireland. I’d been given a one-month assignment to help expand operations, providing training in the Customer Contact Center across channels to deliver exceptional customer support and create an enhanced customer experience around the globe. It was such a rewarding experience!
SAS is a global company with customers in 148 countries, at more than 80,000 sites. The EXPAT Procedure is what I’ve coined my month-long adventure in Dublin, training and supporting our newly expanded Customer Contact Center team. So, what does this mean for you? It means additional customer care and expanded hours for all your inquiries and requests. Win!
Bringing expanded customer service to Europe, Middle East and Africa
The expansion was announced last fall, when SAS revealed plans to open a new Inside Sales and Customer Contact Center in Dublin—an investment of around €40 million with a projected 150 new jobs to be created—to provide support across Europe, Middle East and Africa (EMEA).
The new office models the US Customer Contact Center (and this is where I come in), providing support for customers in their channel of choice—be it social media, Live Chat, phone, email and/or web inquiries. We field general questions about SAS software, training, certifications or resources, as well as specific issues, like errors in your SAS log. The Customer Contact Center is here to assist, and now our customers in EMEA can benefit from the added support as well.
And we’re not just answering inquiries, we’re listening to our customers. We’re always looking at ways to make things easier to navigate, simpler to find, and faster to share. And we love customer feedback, whether direct or indirect, to enhance your experience with SAS.
The new team in Dublin is made up of multilingual individuals with loads of experience in the tech industry. They have begun covering the United Kingdom, Ireland and Italy, and it’s been amazing working with such a knowledgeable, patient and fun team with a great sense of humor. I think you’ll like them, too.
While I’ve been assisting with training the team on everything SAS, I’ve gotten a little training myself, working in a new office in a different country, surrounded by colleagues from more than 15 countries across the pond. It’s a reminder of the wide reach of SAS, the impact of Big Data analytics, and the importance of our worldwide SAS users.
It’s an exciting time for the Customer Contact Center, SAS and our customers. If you’re located in EMEA, don’t hesitate to reach out to us!
PROC EXPAT – Expanding SAS’ global customer service was published on SAS Users.
.@philsimon on what's next for MDM applications.
The post The next wave of MDM: Integrating structured and unstructured data appeared first on The Data Roundtable.