artificial intelligence

October 30, 2019
 

I suffer from arthritis. You can tell just by watching me walk: Depending on the day, I have a slight limp, which varies in severity based on a number of factors such as the time of day and recent physical activity. Years of treatment for my condition have shown me [...]

I applied AI to my arthritis assessment. Here’s what happened. was published on SAS Voices by Mark Wolff

September 9, 2019
 

Editor's Note: This article was translated and edited by SAS USA and was originally written by Makoto Unemi. The original text is here.

SAS previously provided SAS Scripting Wrapper for Analytics Transfer (SWAT), a package for using SAS Viya functions from various general-purpose programming languages such as Python.

In addition to SWAT, SAS launched Deep Learning Python (DLPy), a higher-level API package for Python that makes it possible to use SAS Viya functions more efficiently from Python. In this article, I outline what DLPy is and how to put it to use.

About DLPy

DLPy is a high-level Python API package for the deep learning and image action sets available in SAS Viya 3.3 and later. DLPy provides an API similar to Keras to improve the efficiency of deep learning and image processing coding. With just a little rewriting of existing Keras code, it is possible to execute the processing on SAS Viya.

For example, below is a Convolutional Neural Network (CNN) layer definition; you can see that it is very similar to Keras.
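A minimal sketch of what such a definition looks like, assuming DLPy's Keras-style Sequential API and an existing SWAT connection (the host, port, and model name are placeholders):

    import swat
    from dlpy import Sequential
    from dlpy.layers import InputLayer, Conv2d, Pooling, Dense, OutputLayer

    # Connect to a running SAS Viya/CAS server (hypothetical host and port)
    sess = swat.CAS('viya-host.example.com', 5570)

    # Keras-like, layer-by-layer model definition
    model = Sequential(sess, model_table='Simple_CNN')
    model.add(InputLayer(3, 224, 224))          # 3-channel, 224x224 images
    model.add(Conv2d(8, 7))                     # 8 filters, 7x7 kernels
    model.add(Pooling(2))                       # 2x2 pooling
    model.add(Dense(16))                        # fully connected layer
    model.add(OutputLayer(act='softmax', n=2))  # two target classes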

The layers supported by DLPy are: InputLayer, Conv2d, Pooling, Dense, Recurrent, BN, Res, Proj, and OutputLayer. The following is an example of training a model.
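A hedged training sketch, reusing the model defined above; train_tbl is assumed to be an image table already loaded into CAS, and the parameter values are illustrative:

    # Train the CNN on the in-memory image table
    model.fit(data=train_tbl,
              mini_batch_size=4,
              max_epochs=20,
              lr=0.001)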

DLPy functions

Below I introduce some of DLPy's functions, taking as an example a CNN trained on images of dolphins and giraffes and then applied to test images.

Implementation of major deep learning networks

DLPy offers the following pre-built deep learning models: VGG11/13/16/19, ResNet34/50/101/152, wide_resnet, and dense_net.

The following models also offer pre-trained weights built from ImageNet data (these weights can be adapted to your own tasks through transfer learning): VGG16, VGG19, ResNet50, ResNet101, and ResNet152. The following is an example of transferring ResNet50 pre-trained weights.
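A sketch of what that transfer might look like, assuming the ResNet50_Caffe helper in dlpy.applications; the weights-file path is a placeholder that must point to a file accessible to the server:

    from dlpy.applications import ResNet50_Caffe

    # Pre-built ResNet50 with ImageNet weights, minus the original
    # classification head, so it can be re-trained on a new task
    model = ResNet50_Caffe(
        sess,
        model_table='ResNet50_transfer',
        n_classes=2,                 # e.g., dolphins vs. giraffes
        pre_trained_weights=True,
        pre_trained_weights_file='/path/ResNet-50-model.caffemodel.h5',
        include_top=False)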

CNN judgment basis information

Using the heat_map_analysis() method, you can output a color heat map and check which regions of the image the model focused on.

In addition, the get_feature_maps() method retrieves the feature maps of each CNN layer, and the feature_maps.display() method lets you specify a layer and display its feature maps for inspection.
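A hedged sketch of these interpretability calls; test_tbl is assumed to be an image table of test data, and the mask and step sizes are illustrative:

    # Occlusion-style heat map showing which image regions drive the prediction
    model.heat_map_analysis(data=test_tbl,
                            mask_width=56, mask_height=56, step_size=8)

    # Retrieve per-layer feature maps, then display a chosen layer
    model.get_feature_maps(data=test_tbl)
    model.feature_maps.display(layer_id=1)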

The following is the feature map output for layer 1.

The following is the feature map output for layer 18.

Support functions for deep learning and image processing tasks

resize() method: Resizes image data

as_patches() method: Augments image data (generates patches from the original images)

two_way_split() method: Splits data into training and test sets

plot_network() method: Draws the structure of the defined deep learning layers (network) as a graphical diagram

plot_training_history() method: Displays the iterative training history

predict() method: Displays prediction (scoring) results

plot_predict_res() method: Displays classification results
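A hedged end-to-end sketch of how these helpers fit together; the image path and parameter values are placeholders:

    from dlpy.images import ImageTable
    from dlpy.splitting import two_way_split

    # Load and resize the raw images (hypothetical server-side path)
    imgs = ImageTable.load_files(sess, path='/data/dolphin_giraffe')
    imgs.resize(width=224, height=224)

    # Split into training and test sets (20% held out for testing)
    train_tbl, test_tbl = two_way_split(imgs, test_rate=20)

    model.plot_network()              # diagram of the network structure
    model.fit(data=train_tbl, max_epochs=20)
    model.plot_training_history()     # loss/error by iteration
    model.predict(data=test_tbl)      # scoring
    model.plot_predict_res()          # classification results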

And of course, you can use DLPy to get data from a SAS Viya in-memory session, pass it to your local client, and convert it to common data formats such as NumPy arrays and pandas DataFrames. The converted data can then be fed smoothly into models from other open-source packages such as scikit-learn.
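As a hedged sketch, assuming tbl is a SWAT CASTable and using hypothetical column names:

    # Pull the in-memory CAS table down to the client as a pandas DataFrame
    df = tbl.to_frame()

    # Convert to NumPy arrays and hand off to scikit-learn
    X = df[['feat1', 'feat2']].values   # hypothetical feature columns
    y = df['label'].values              # hypothetical target column

    from sklearn.linear_model import LogisticRegression
    clf = LogisticRegression().fit(X, y)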

For image classification with DLPy, videos are also available in the Deep Learning with Python (DLPy) Demo Series section of the DLPy product page.

SAS Viya: Package for Python API for deep learning and image processing: DLPy was published on SAS Users.

September 3, 2019
 

The startup ecosystem is dynamic and the flow of venture capital into tech is at an all-time high. Billions of dollars are invested in tech startups every year. Many tech startups market themselves as ‘powered by AI’ and pitch investors with buzzword-laden phrases such as, ‘we leverage state of [...]

7 ways SAS empowers startups with artificial intelligence and machine learning was published on SAS Voices by Avinash Sooriyarachchi

April 30, 2019
 

Artificial intelligence is the attention-grabbing, overhyped, shiny object that every organization is searching to make use of. Yes, it is overhyped, but it’s also very real and very powerful. “We do not want to add to the hype. We do not want to add to the confusion. We want to [...]

Bringing AI, ML and analytics to life - SAS Global Forum Tech Connection was published on SAS Voices by Shannon Heath

April 12, 2019
 

At the risk of oversimplifying, I think of artificial intelligence as what becomes possible after you’ve fully embraced analytics and you’re starting to get bolder about how to use it. Your models are getting better, your predictions are more accurate, your results are stronger and overall, confidence grows in [...]

6 things you didn't know about AI was published on SAS Voices by John Balla


April 6, 2019
 

Recently, the North Carolina Human Trafficking Commission hosted a regional symposium to help strengthen North Carolina’s multidisciplinary response to human trafficking. One of the speakers shared an anecdote from a busy young woman with kids. She had returned home from work and was preparing for dinner; her young son wanted [...]

Countering human trafficking using text analytics and AI was published on SAS Voices by Tom Sabo


April 3, 2019
 

Structuring a highly unstructured data source

Human language is astoundingly complex and diverse. We express ourselves in infinite ways. It can be very difficult to model and extract meaning from both written and spoken language. Usually the most meaningful analysis uses a number of techniques.

While supervised and unsupervised learning, and specifically deep learning, are widely used for modeling human language, there's also a need for syntactic and semantic understanding and domain expertise. Natural Language Processing (NLP) is important because it can help resolve ambiguity and add useful numeric structure to the data for many downstream applications, such as speech recognition or text analytics. Machine learning then runs the outputs of NLP through data mining and machine learning algorithms to automatically extract key features and relational concepts. Human input from linguistic rules adds to the process, enabling contextual comprehension.

Text analytics provides structure to unstructured data so it can be easily analyzed. In this blog, I would like to focus on two widely used text analytics techniques: information extraction and entity resolution.

Information Extraction

Information Extraction (IE) automatically extracts structured information from unstructured or semi-structured text data (for example, a text file) to create new structured text data. IE works at the sub-document level, in contrast with techniques such as categorization, which work at the document or record level. The results of IE can therefore feed into other analyses, such as predictive modeling or topic identification, as features for those processes. IE can also be used to create a new database of information. One example is recording key information about terrorist attacks from a group of news articles on terrorism.

Any given IE task has a defined template, which is a case frame (or set of case frames) that holds the information contained in a single document. For the terrorism example, a template would have slots corresponding to the perpetrator, victim, and weapon of the terrorist act, and the date on which the event happened. An IE system for this problem is required to “understand” an attack article only enough to find data corresponding to the slots in this template. Such a database can then be used and analyzed through queries and reports about the data.
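To make the template idea concrete, here is an illustrative (non-SAS) rendering in Python; the slot values are hypothetical:

    # One case frame (template) per document, with a slot for each key fact
    empty_template = {
        'perpetrator': None,
        'victim': None,
        'weapon': None,
        'date': None,
    }

    # What an IE system might fill in for one hypothetical news article
    extracted = {
        'perpetrator': 'unidentified armed group',
        'victim': 'aid convoy',
        'weapon': 'roadside bomb',
        'date': '1992-03-14',
    }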

In their new book, SAS® Text Analytics for Business Applications: Concept Rules for Information Extraction Models, authors Teresa Jade, Biljana Belamaric Wilsey, and Michael Wallis, give some great examples of uses of IE:

"One good use case for IE is for creating a faceted search system. Faceted search allows users to narrow down search results by classifying results by using multiple dimensions, called facets, simultaneously. For example, faceted search may be used when analysts try to determine why and where immigrants may perish. The analysts might want to correlate geographical information with information that describes the causes of the deaths in order to determine what actions to take."

Another good example of using IE in predictive modeling involves analysts at a bank who want to determine why customers close their accounts. They have an active churn model that works fairly well at identifying potential churn, but less well at determining what causes the churn. An IE model could be built to identify different bank policies and offerings, and then track mentions of each during any customer interaction. If a particular policy could be linked to certain churn behavior, then the policy could be modified to reduce the number of lost customers.

Reporting information found as a result of IE can provide deeper insight into trends and uncover details that were buried in the unstructured data. An example of this is an analysis of call center notes at an appliance manufacturing company. The results of IE show a pattern of customer-initiated calls about repairs and breakdowns of a type of refrigerator, and the results highlight particular problems with the doors. This information shows up as a pattern of increasing calls. Because the content of the calls is being analyzed, the company can return to its design team, which can find and remedy the root problem.

Entity Resolution and regular expressions

Entity Resolution is the technique of recognizing when two observations relate to the same entity (thing, person, company) despite having been described differently, and conversely, recognizing when two observations do not relate to the same entity despite having been described similarly. For example, suppose you are listed in one database as S Roberts, in another as Sian Roberts, and in a third as S.Roberts. All three refer to the same person but would be treated as different people in an analysis unless they are resolved (combined into one person).
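As a toy illustration (not the SAS implementation), even Python's standard difflib can flag such name variants as candidates for the same entity:

    from difflib import SequenceMatcher

    def normalize(name: str) -> str:
        # Drop punctuation differences and case before comparing
        return name.replace('.', ' ').lower().strip()

    def same_entity(a: str, b: str, threshold: float = 0.6) -> bool:
        # Similarity ratio on normalized names; the threshold is illustrative
        return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold

    variants = ['S Roberts', 'Sian Roberts', 'S.Roberts']
    print(all(same_entity(v, variants[0]) for v in variants))  # True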

Entity resolution can be performed as part of a data pre-processing step or as text analysis. Basically, one resolves multiple entries (cleaning the data), and the other resolves references to a single entity in order to extract meaning -- for example, pronoun resolution, as when “it” refers to a particular company mentioned earlier in the text. Here is another example:

Assume each numbered item is a separate observation in the input data set:
1. SAS Institute is a great company. Our company has a recreation center and health care center for employees.
2. Our company has won many awards.
3. SAS Institute was founded in 1976.

The scoring output matches are below; note that the document ID associated with each match aligns with the number before the input document where the match was found.

Unstructured data clean-up

In the following section we focus on the pre-processing clean-up of the data. Unstructured data is the most voluminous form of data in the world, and analysts rarely receive it in perfect condition for processing. In other words, textual data needs to be cleaned, transformed, and enhanced before value can be derived from it.

A regular expression is a pattern that the regular expression engine attempts to match in input text. In SAS programming, regular expressions are strings of letters and special characters that are recognized by certain built-in SAS functions for the purpose of searching and matching. Combined with other built-in SAS functions and capabilities, such as entity resolution, regular expressions enable tremendous capabilities. Matthew Windham, author of Unstructured Data Analysis: Entity Resolution and Regular Expressions in SAS®, gives some great examples in his book of how you might use these techniques to clean your text data. Here we share one of them:

"As you are probably familiar with, data is rarely provided to analysts in a form that is immediately useful. It is frequently necessary to clean, transform, and enhance source data before it can be used—especially textual data."

Extract, Transform, and Load (ETL) is a general set of processes for extracting data from its source, modifying it to fit your end needs, and loading it into a target location that enables you to best use it (e.g., a database, data store, or data warehouse). We'll begin with a fairly basic example. Suppose we already have a SAS data set of customer addresses that contains some data quality issues. The method of recording the data is unknown to us, but visual inspection has revealed numerous occurrences of duplicative records, like the pair below. It is clearly the same individual, with slightly different representations of the address and encoding for gender. But how do we fix such problems automatically for all of the records?

First Name  Last Name  DOB       Gender  Street             City      State  Zip
Robert      Smith      2/5/1967  M       123 Fourth Street  Fairfax,  VA     22030
Robert      Smith      2/5/1967  Male    123 Fourth St.     Fairfax   va     22030

Using regular expressions, we can algorithmically standardize abbreviations, remove punctuation, and do much more to ensure that each record is directly comparable. In this case, regular expressions enable us to perform more effective record keeping, which ultimately impacts downstream analysis and reporting. We can easily leverage regular expressions to ensure that each record adheres to institutional standards. We can make each occurrence of Gender either “M/F” or “Male/Female,” make every instance of the Street variable use “Street” or “St.” in the address line, make each City variable include or exclude the comma, and abbreviate State as either all caps or all lowercase.

This example is quite simple, but it reveals the power of applying some basic data standardization techniques to data sets. By enforcing these standards across the entire data set, we are then able to properly identify duplicative references within the data set. In addition to making our analysis and reporting less error-prone, we can reduce data storage space and duplicative business activities associated with each record (for example, fewer customer catalogs will be mailed out, thus saving money).
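As a minimal sketch of that standardization (the book works in SAS with its PRX regular expression functions; this Python version mirrors the idea, with hypothetical field names):

    import re

    def standardize(record: dict) -> dict:
        rec = dict(record)
        # Normalize gender encoding to single letters
        rec['Gender'] = re.sub(r'(?i)^m(ale)?$', 'M', rec['Gender'])
        rec['Gender'] = re.sub(r'(?i)^f(emale)?$', 'F', rec['Gender'])
        # Standardize a trailing "St." to "Street" in the address line
        rec['Street'] = re.sub(r'\bSt\.?$', 'Street', rec['Street'])
        # Strip the trailing comma from City and upper-case State
        rec['City'] = rec['City'].rstrip(',')
        rec['State'] = rec['State'].upper()
        return rec

    a = {'Gender': 'M', 'Street': '123 Fourth Street', 'City': 'Fairfax,', 'State': 'VA'}
    b = {'Gender': 'Male', 'Street': '123 Fourth St.', 'City': 'Fairfax', 'State': 'va'}
    print(standardize(a) == standardize(b))  # True: the records now match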

Your unstructured text data is growing daily, and data without analytics is opportunity yet to be realized. Discover the value in your data with text analytics capabilities from SAS. The SAS Platform fosters collaboration by providing a toolbox where best practice pipelines and methods can be shared. SAS also seamlessly integrates with existing systems and open source technology.

Further Resources:
Natural Language Processing: What it is and why it matters

White paper: Text Analytics for Executives: What Can Text Analytics Do for Your Organization?

SAS® Text Analytics for Business Applications: Concept Rules for Information Extraction Models, by Teresa Jade, Biljana Belamaric Wilsey, and Michael Wallis

Unstructured Data Analysis: Entity Resolution and Regular Expressions in SAS®, by Matthew Windham

Text analytics explained was published on SAS Users.

April 2, 2019
 

There's been a lot of hype regarding using machine learning (ML) for demand forecasting, and rightfully so, given the advancements in data collection, storage, and processing along with improvements in technology. There's no reason why machine learning can't be utilized as another forecasting method among the collection of forecasting methods [...]

Is machine learning practical for statistical forecasting? was published on SAS Voices by Charlie Chase