January 6, 2018

Deep learning is not synonymous with artificial intelligence (AI) or even machine learning. Artificial Intelligence is a broad field which aims to "automate cognitive processes." Machine learning is a subfield of AI that aims to automatically develop programs (called models) purely from exposure to training data.

Deep Learning and AI

Deep learning is one of many branches of machine learning, where the models are long chains of geometric functions, applied one after the other to form stacks of layers. It is one among many approaches to machine learning but not on equal footing with the others.

What makes deep learning exceptional

Why is deep learning unequaled among machine learning techniques? Well, deep learning has achieved tremendous success in a wide range of tasks that have historically been extremely difficult for computers, especially in the areas of machine perception. This includes extracting useful information from images, videos, sound, and others.

Given sufficient training data (in particular, training data appropriately labelled by humans), it’s possible to extract from perceptual data almost anything that a human could extract. Large corporations and businesses are deriving value from deep learning by enabling human-level speech recognition, smart assistants, human-level image classification, vastly improved machine translation, and more. Google Now, Amazon Alexa, ad targeting used by Google, Baidu and Bing are all powered by deep learning. Think of superhuman Go playing and near-human-level autonomous driving.

In the summer of 2016, an experimental short movie, Sunspring, was directed using a script written by a long short-term memory (LSTM) algorithm, a type of deep learning algorithm.

How to build deep learning models

Given all the success recorded using deep learning, it's important to stress that building deep learning models is more of an art than a science. To build a deep learning model, or any machine learning model for that matter, one needs to consider the following steps:

  • Define the problem: What data does the organisation have? What are we trying to predict? Do we need to collect more data? How can we manually label the data? Make sure to work with a domain expert because you can’t interpret what you don’t know!
  • Choose metrics: What metrics can we use to reliably measure the success of our goals?
  • Prepare the validation process that will be used to evaluate the model.
  • Data exploration and pre-processing: This is where most time will be spent, on tasks such as normalization, manipulation, joining of multiple data sources and so on.
  • Develop an initial model that does better than a baseline model. This gives some indication of whether machine learning is suitable for the problem.
  • Refine the model architecture by tuning hyperparameters and adding regularization. Make changes based on validation data.
  • Avoid overfitting.
  • Once happy with the model, deploy it into a production environment. This may be difficult for many organisations, given that deep learning score code is large. This is where SAS can help. SAS has developed a scoring mechanism called "astore" which allows a deep learning model to be pushed into production with just a click.
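
The validation and baseline steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy data, the function names and the hand-written threshold "model" are all hypothetical and not taken from any SAS tooling. It holds out part of the data, measures a majority-class baseline on it, and compares a candidate model against that baseline.

```python
import random
from collections import Counter

def train_validation_split(examples, validation_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and hold out a fraction for validation."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]

def majority_class_baseline(train_labels):
    """Return the most frequent training label; predicting it everywhere is the baseline."""
    return Counter(train_labels).most_common(1)[0][0]

def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Hypothetical toy data: (feature, label) pairs where the label is 1 when feature > 6.
data = [(x, int(x > 6)) for x in range(10)] * 10
train, val = train_validation_split(data)

baseline_label = majority_class_baseline([y for _, y in train])
baseline_acc = accuracy([baseline_label] * len(val), [y for _, y in val])

# Stand-in "model": a hand-written threshold rule, used only to show the comparison.
model_acc = accuracy([int(x > 6) for x, _ in val], [y for _, y in val])
print(f"baseline={baseline_acc:.2f} model={model_acc:.2f}")
```

If a candidate model cannot beat this kind of trivial baseline on held-out data, that is a hint that the problem framing or the data needs another look before any tuning.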

Is the deep learning hype justified?

We're still in the middle of the deep learning revolution, trying to understand the limitations of this approach. Due to its unprecedented successes, there has been a lot of hype in the field of deep learning and AI. It’s important for managers, professionals, researchers and industrial decision makers to be able to separate the hype created by the media from reality.

Despite the progress on machine perception, we are still far from human-level AI. Our models can only perform local generalization, adapting to new situations that must be similar to past data, whereas human cognition is capable of extreme generalization, quickly adapting to radically novel situations and planning for long-term future situations. To make this concrete, imagine you’ve developed a deep network controlling a human body and you want it to learn to safely navigate a city without getting hit by cars. The net would have to die many thousands of times in various situations until it could infer that cars are dangerous and develop appropriate avoidance behaviors. Dropped into a new city, the net would have to relearn most of what it knows. Humans, on the other hand, are able to learn safe behaviors without having to die even once, again thanks to our power of abstract modeling of hypothetical situations.

Lastly, remember that a deep learning model is a long chain of geometric functions. To learn its parameters via gradient descent, one key technical requirement is that the chain must be differentiable and continuous, which is a significant constraint.
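
To see why differentiability matters, here is a minimal sketch of gradient descent on a two-link chain of functions. The chain, the starting parameters, the single training example and the learning rate are all made up for illustration. The point is that each link has to supply a derivative so the chain rule can pass the error back to every parameter.

```python
import math

def forward(w1, w2, x):
    """A two-link chain: h = tanh(w1 * x), then y = w2 * h."""
    h = math.tanh(w1 * x)
    return h, w2 * h

def gradients(w1, w2, x, target):
    """Chain rule for the squared error (y - target)**2 with respect to w1 and w2."""
    h, y = forward(w1, w2, x)
    dloss_dy = 2.0 * (y - target)
    dloss_dw2 = dloss_dy * h
    dloss_dh = dloss_dy * w2
    dloss_dw1 = dloss_dh * (1.0 - h * h) * x   # d tanh(u)/du = 1 - tanh(u)**2
    return dloss_dw1, dloss_dw2

w1, w2 = 0.5, 0.5          # arbitrary starting parameters
x, target = 1.0, 0.8       # a single hypothetical training example
learning_rate = 0.1

initial_loss = (forward(w1, w2, x)[1] - target) ** 2
for _ in range(200):
    g1, g2 = gradients(w1, w2, x, target)
    w1 -= learning_rate * g1
    w2 -= learning_rate * g2
final_loss = (forward(w1, w2, x)[1] - target) ** 2
print(initial_loss, final_loss)
```

If one link were a step function, its derivative would be zero almost everywhere and no error signal would reach the parameters before it.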

Looking beyond the AI and deep learning hype was published on SAS Users.

November 21, 2017

I recently spent two days with an innovative communications customer explaining exactly what SAS analytics can do to help them take their advertising platform to a whole new level. Media meets data resulting in addressable advertising. SAS would essentially be the brain behind all their advertising decisions, helping them ingest [...]

Analytics = brilliance was published on SAS Voices by Suzanne Clayton

September 7, 2017
In many introductions to image recognition, the famous MNIST data set is typically used. However, there are some issues with this data:

1. It is too easy. For example, a simple MLP model and a 2-layer CNN can each achieve 99% accuracy.

2. It is overused. Nearly every introductory machine learning article or image recognition task uses this data set as a benchmark. But because it is so easy to get a nearly perfect classification result, it is of limited use for modern machine learning/AI tasks.

This is where the Fashion-MNIST dataset comes in. It was developed as a direct replacement for the MNIST data in the sense that:

1. It is the same size and style: 28x28 grayscale image
2. Each image is associated with 1 out of 10 classes, which are:
       9:Ankle boot
3. 60000 training samples and 10000 testing samples

[Figure: snapshot of some samples]

Since its appearance, there have been multiple submissions benchmarking this data, and some of them achieve 95%+ accuracy, most notably residual networks and separable CNNs.
I am also benchmarking against this data, using Keras. Keras is a high-level framework for building deep learning models, with a choice of TensorFlow, Theano or CNTK as the backend. It is easy to install and use. For my application, I used the CNTK backend. You can refer to this article on its installation.

Here, I will benchmark two models. One is an MLP with a layer structure of 256-512-100-10, and the other is a VGG-like CNN. Code is available at my github:

The first model achieved accuracy of [0.89, 0.90] on testing data after 100 epochs, while the latter achieved accuracy of >0.94 on testing data after 45 epochs. First, read in the Fashion-MNIST data:

import numpy as np
import io, gzip, requests
train_image_url = ""
train_label_url = ""
test_image_url = ""
test_label_url = ""

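def readRemoteGZipFile(url, isLabel=True):
    # Download the gzip archive and decompress it in memory
    response = requests.get(url, stream=True)
    gzip_content = response.content
    fObj = io.BytesIO(gzip_content)
    content = gzip.GzipFile(fileobj=fObj).read()
    # Skip the IDX header: 8 bytes for label files, 16 bytes for image files
    if isLabel:
        offset = 8
    else:
        offset = 16
    result = np.frombuffer(content, dtype=np.uint8, offset=offset)
    return result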

train_labels = readRemoteGZipFile(train_label_url, isLabel=True)
train_images_raw = readRemoteGZipFile(train_image_url, isLabel=False)

test_labels = readRemoteGZipFile(test_label_url, isLabel=True)
test_images_raw = readRemoteGZipFile(test_image_url, isLabel=False)

train_images = train_images_raw.reshape(len(train_labels), 784)
test_images = test_images_raw.reshape(len(test_labels), 784)
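
The reader above skips a fixed number of header bytes before the pixel or label data. As a rough illustration of where that header comes from, the sketch below parses the IDX header layout used by MNIST-style files. The helper name parse_idx_header and the tiny synthetic files are assumptions made for this example; they are not part of the original code.

```python
import gzip
import io
import struct

def parse_idx_header(raw_bytes):
    """Read the big-endian IDX header and return (dims, header_length_in_bytes)."""
    # Bytes 0-1 are zero, byte 2 is the data type code, byte 3 is the number of dimensions.
    _, _, dtype_code, n_dims = struct.unpack(">BBBB", raw_bytes[:4])
    dims = struct.unpack(">" + "I" * n_dims, raw_bytes[4:4 + 4 * n_dims])
    return dims, 4 + 4 * n_dims

# Synthetic stand-ins for the real files: 2 labels and 2 images of 28x28 zero pixels.
label_file = struct.pack(">BBBBI", 0, 0, 0x08, 1, 2) + bytes([9, 3])
image_file = struct.pack(">BBBBIII", 0, 0, 0x08, 3, 2, 28, 28) + bytes(2 * 28 * 28)

for name, payload in [("labels", label_file), ("images", image_file)]:
    compressed = gzip.compress(payload)                       # mimic the .gz download
    raw = gzip.GzipFile(fileobj=io.BytesIO(compressed)).read()
    dims, offset = parse_idx_header(raw)
    print(name, dims, offset)
```

A label file has one dimension and an image file has three, which is why the two kinds of file need different offsets.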
Let's first visualize it using tSNE. tSNE is said to be the most effective dimension reduction tool. This plot function is borrowed from a sklearn example.

from sklearn import manifold
from time import time
import matplotlib.pyplot as plt
from matplotlib import offsetbox
plt.rcParams['figure.figsize']=(20, 10)
# Scale and visualize the embedding vectors
def plot_embedding(X, Image, Y, title=None):
    # Rescale the embedding to the unit square
    x_min, x_max = np.min(X, 0), np.max(X, 0)
    X = (X - x_min) / (x_max - x_min)

    ax = plt.subplot(111)
    for i in range(X.shape[0]):
        # Draw each point as its class label, colored by class
        plt.text(X[i, 0], X[i, 1], str(Y[i]),
                 color=plt.cm.Set1(Y[i] / 10.),
                 fontdict={'weight': 'bold', 'size': 9})

    if hasattr(offsetbox, 'AnnotationBbox'):
        # only print thumbnails with matplotlib > 1.0
        shown_images = np.array([[1., 1.]])  # just something big
        for i in range(X.shape[0]):
            dist = np.sum((X[i] - shown_images) ** 2, 1)
            if np.min(dist) < 4e-3:
                # don't show points that are too close
                continue
            shown_images = np.r_[shown_images, [X[i]]]
            imagebox = offsetbox.AnnotationBbox(
                offsetbox.OffsetImage(Image[i], cmap=plt.cm.gray_r),
                X[i])
            ax.add_artist(imagebox)
    plt.xticks([]), plt.yticks([])
    if title is not None:
        plt.title(title)

tSNE is very computationally expensive, so for impatient people like me, I used 1000 samples for a quick run. If your PC is fast enough and you have time, you can run tSNE against the full dataset.

sampleSize = 1000  # number of images to embed for a quick run
samples = np.random.choice(range(len(train_labels)), size=sampleSize)
tsne = manifold.TSNE(n_components=2, init='pca', random_state=0)
t0 = time()
sample_images = train_images[samples]
sample_targets = train_labels[samples]
X_tsne = tsne.fit_transform(sample_images)
t1 = time()
plot_embedding(X_tsne, sample_images.reshape(sample_targets.shape[0], 28, 28), sample_targets,
"t-SNE embedding of the digits (time %.2fs)" %
(t1 - t0))
We see that several features, including mass size, the split at the bottom, and symmetry, separate the categories. Deep learning excels here because you don't have to engineer the features manually; the algorithm extracts them.

In order to build your own networks, we first import some libraries

import keras
from keras.models import Sequential
from keras.layers.convolutional import Conv2D, MaxPooling2D, AveragePooling2D
from keras.layers.advanced_activations import LeakyReLU
from keras.layers import Activation, Dense, Flatten
We also do standard data preprocessing:

X_train = train_images.reshape(train_images.shape[0], 28, 28, 1).astype('float32')
X_test = test_images.reshape(test_images.shape[0], 28, 28, 1).astype('float32')

X_train /= 255
X_test /= 255

X_train -= 0.5
X_test -= 0.5

X_train *= 2.
X_test *= 2.

Y_train = train_labels
Y_test = test_labels
Y_train2 = keras.utils.to_categorical(Y_train).astype('float32')
Y_test2 = keras.utils.to_categorical(Y_test).astype('float32')
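
For readers who want to see what the preprocessing above does to the values, here is a small NumPy sketch. The helper names scale_to_unit_range and one_hot and the sample arrays are hypothetical; one_hot only mimics the general behavior of a one-hot encoder and is not the Keras implementation.

```python
import numpy as np

def scale_to_unit_range(pixels):
    """Map uint8 pixel values in [0, 255] to floats in [-1, 1]."""
    return (pixels.astype("float32") / 255 - 0.5) * 2.0

def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows."""
    encoded = np.zeros((len(labels), num_classes), dtype="float32")
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

pixels = np.array([0, 255], dtype=np.uint8)   # darkest and brightest pixel
labels = np.array([9, 0, 3])                  # hypothetical class ids

print(scale_to_unit_range(pixels))
print(one_hot(labels, num_classes=10))
```

Centering the inputs around zero and giving the targets one column per class matches what a softmax output trained with categorical cross-entropy expects.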
Here is the simple MLP implemented in keras:

mlp = Sequential()
mlp.add(Dense(256, input_shape=(784,)))
mlp.add(Dense(10, activation='softmax'))
mlp.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
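
The text describes a 256-512-100-10 layer structure, while the listing above shows only the first and last layers. As a rough picture of what such a stack computes, here is a NumPy sketch of the forward pass. The random untrained weights, the ReLU activation on the hidden layers and the fake input batch are all assumptions made for illustration; the activations used in the original model are not shown in this post.

```python
import numpy as np

rng = np.random.default_rng(0)
layer_sizes = [784, 256, 512, 100, 10]   # input size followed by the 256-512-100-10 stack

# Randomly initialized, untrained weights and zero biases, one pair per dense layer.
weights = [rng.normal(0.0, 0.05, size=(n_in, n_out))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = np.exp(z - z.max(axis=1, keepdims=True))
    return shifted / shifted.sum(axis=1, keepdims=True)

def mlp_forward(x):
    """Apply each dense layer in turn; ReLU on hidden layers is an assumption."""
    for w, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(x @ w + b, 0.0)
    return softmax(x @ weights[-1] + biases[-1])

batch = rng.uniform(-1.0, 1.0, size=(4, 784))   # four fake flattened 28x28 images
probs = mlp_forward(batch)
print(probs.shape)
```

Each row of probs is a probability distribution over the 10 classes, which is the form the categorical cross-entropy loss works with.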

This model achieved almost 90% accuracy on the test dataset at about 100 epochs. Now, let's build a VGG-like CNN model. We use an architecture that is similar to VGG but differs from it in important ways. Because the image data is small, the original VGG architecture is very likely to overfit and not perform well on testing data, as observed in the publicly submitted benchmarks listed above. Building such a model in Keras is natural and easy:

num_classes = len(set(Y_train))
model3 = Sequential()
model3.add(Conv2D(filters=32, kernel_size=(3, 3), padding="same",
                  input_shape=X_train.shape[1:], activation='relu'))
model3.add(Conv2D(filters=64, kernel_size=(3, 3), padding="same", activation='relu'))
model3.add(MaxPooling2D(pool_size=(2, 2)))
model3.add(Conv2D(filters=128, kernel_size=(3, 3), padding="same", activation='relu'))
model3.add(Conv2D(filters=256, kernel_size=(3, 3), padding="valid", activation='relu'))
model3.add(MaxPooling2D(pool_size=(3, 3)))
# Flatten the feature maps so the classifier outputs one vector per image
model3.add(keras.layers.Flatten())
model3.add(Dense(num_classes, activation='softmax'))
model3.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
This model has 1.5 million parameters. We can call the 'fit' method to train the model:

model3.fit(X_train, Y_train2, validation_data = (X_test, Y_test2), epochs=50, verbose=1, batch_size=500)
After 40 epochs, this model achieves an accuracy of 0.94 on testing data. Obviously, this model also has an overfitting problem. We will address this issue later.
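
A quick way to sanity-check a parameter count is to add up the per-layer counts by hand. The sketch below does this in plain Python for the convolution and pooling layers listed above, and assumes the last feature map is flattened and fed straight into a 10-way dense layer. The full model in the original code may contain layers that are not shown in this post, so this total is not expected to reproduce the 1.5 million figure.

```python
def conv_output(size, kernel, padding):
    """Spatial size after a stride-1 convolution with 'same' or 'valid' padding."""
    return size if padding == "same" else size - kernel + 1

def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return kernel * kernel * in_channels * filters + filters

# (type, arguments) for the layers listed above, in order.
layers = [
    ("conv", 32, "same"), ("conv", 64, "same"), ("pool", 2),
    ("conv", 128, "same"), ("conv", 256, "valid"), ("pool", 3),
]

size, channels, total = 28, 1, 0      # 28x28 grayscale input
for layer in layers:
    if layer[0] == "conv":
        _, filters, padding = layer
        total += conv_params(3, channels, filters)
        size, channels = conv_output(size, 3, padding), filters
    else:
        size //= layer[1]             # non-overlapping max pooling

flattened = size * size * channels
total += flattened * 10 + 10          # assumed: flatten, then a 10-way dense layer
print(size, channels, flattened, total)
```

Most of the weights in this reduced stack sit in the last convolution, which is a common pattern when the filter count doubles from layer to layer.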

 Posted at 8:48 AM
April 3, 2017

At Opening Session, SAS CEO Jim Goodnight and Alexa have a chat using the Amazon Echo and SAS Visual Analytics.

Unable to attend SAS Global Forum 2017 happening now in Orlando? We’ve got you covered! You can view live stream video from the conference, and check back here for important news from the conference, starting with the highlights from last night’s Opening Session.

While the location and record attendance made for a full house this year, CEO Jim Goodnight explained that there couldn’t be a more perfect setting to celebrate innovation than the world of Walt Disney. “Walt was a master innovator, combining art and science to create an entirely new way to make intelligent connections,” said Goodnight. “SAS is busy making another kind of intelligent connection – the kind made possible by data and analytics.”

It’s SAS’ mission to bring analytics everywhere and to make it ambient. That was exactly the motivation that drove SAS nearly four years ago when embarking on a massive undertaking known as SAS® Viya™. But SAS Viya – announced last year in Las Vegas – is more than just a fast, powerful, modernized analytics platform. Goodnight said it’s really the perfect marriage of science and art.

“Consider what would be possible if analytics could be brought into every moment and every place that data exists,” said Goodnight. “The opportunities are enormous, and like Walt Disney, it’s kind of fun to do the impossible.”

Driving an analytics economy

Executive Vice President and Chief Marketing Officer Randy Guard took the stage to update attendees on new releases available on SAS Viya and why SAS is so excited about it. He explained that the reason for SAS Viya comes from the changes being driven in the analytics marketplace. It’s what Guard referred to as an analytics economy, where the maturity of algorithms and techniques progresses rapidly. “This is a place where disruption is normal, a place where you want to be the disruptor; you want to be the innovator,” said Guard. That’s exactly what you can achieve with SAS Viya.

As if SAS Viya didn’t leave enough of an impression, Guard took it one step further by inviting Goodnight back on stage to give users a preview into the newest innovation SAS has been cooking up. Using the Amazon Echo Dot – better known as Alexa – Goodnight put cognitive computing into action as he called up annual sales, forecasts and customer satisfaction reports in SAS® Visual Analytics.

Though still in its infant stages of development, the demo was just another reminder that when it comes to analytics, SAS never stops thinking of the next great thing.

AI: The illusion of intelligence

On his Segway, Executive Vice President and Chief Technology Officer Oliver Schabenberger talks AI at the SAS Global Forum Opening Session.

With his Segway Mini, Executive Vice President and Chief Technology Officer Oliver Schabenberger rolled on stage, fully trusting that his “smart legs” wouldn’t drive him off and into the audience. “I’ve accepted that algorithms and software have intelligence; I’ve accepted that they make decisions for us, but we still have choices,” said Schabenberger.

Diving into artificial intelligence, he explained that today’s algorithms operate with super-human abilities – they are reliable, repeatable and work around the clock without fatigue – yet they don’t behave like humans. And while the “AI” label is becoming trendy, true systems deserving of the AI title have two distinct things in common: they belong to the class of weak AI systems and they tend to be based on deep learning.

So, why are those distinctions important? Schabenberger explained that a weak AI system is trained to do one task only – the system driving an autonomous vehicle cannot operate the lighting in your home.

“SAS is very much engaged in weak AI, building cognitive systems into our software,” he said. “We are embedding learning and gamification into solutions and you can apply deep learning to text, images and time series.” Those cognitive systems are built into SAS Viya. And while they are powerful and great when they work, Schabenberger raised the question of whether or not they are truly intelligent.

Think about it. True intelligence requires some form of creativity, innovation and independent problem solving. The reality is that today’s algorithms and software, no matter how smart, are being used as decision support systems to augment our own capabilities and make us better.

But it’s uncomfortable to think about fully trusting technology to make decisions on our behalf. “We make decisions based on reason, we use gut feeling and make split-second judgment calls based on incomplete information,” said Schabenberger. “How well do we expect machines to perform [in our place] when we let them loose and how quickly do we expect them to learn on the job?”

It’s those kinds of questions that prove that all we can handle today is the illusion of intelligence. “We want to get tricked by the machine in a clever way,” said Schabenberger. “The rest is just hype.”

Creating tomorrow’s analytics leaders

With a room full of analytics leaders, Vice President of Sales Emily Baranello asked attendees to consider where the future leaders of analytics will come from. If you ask SAS, talent will be pulled from universities globally that have partnered with SAS to create 200 types of programs that teach today’s students how to work in SAS software. The commitment level to train up future leaders is evident and can be seen in SAS certifications, joint certificate programs and SAS’ track toward nearly 1 million downloads of SAS® Analytics U.

“SAS talent is continuing to build in the marketplace,” said Baranello. “Our goal is to bring analytics everywhere and we will continue to partner with universities to ready those students to be your successful employees.”

Using data for good

More than just analytics and technology, SAS’ brand is a representation of people who make the world a better place. Knowing that, SAS announced the development of GatherIQ, a customized crowdsourcing app that will begin with two International Organization for Migration (IOM) projects. One project will specifically focus on global migration, using data to keep migrants safe as they search for a better life. With GatherIQ, changing the world might be as easy as opening an app.

There's much more to come, so stay tuned to SAS blogs this week for the latest updates from SAS Global Forum!

SAS Viya, AI star at SAS Global Forum Opening Session was published on SAS Users.