October 28, 2022
 

Payment fraud continues to be a challenge for banks. With the increasing number of digital payment types and the ever-growing volume of real-time payments (RTP), real-time fraud detection and prevention are vital.

At the same time, customers demand a frictionless experience, so fraud detection methods need to be sophisticated enough to maintain the required balance.

While rules-based detection can be a good start, it can be difficult to keep pace with shifting fraud patterns and maintain efficacy. Rules tend to be built reactively from individual fraud cases rather than from full customer data and behavior.

Machine learning combined with behavior profiling, alongside rules, is best deployed as part of a layered fraud prevention approach. Machine learning models use advanced methodologies and statistical techniques to identify risky payments. They are highly predictive not only in identifying fraud but also in recognizing genuine customer transactions, which keeps false positive ratios at the low levels today's environment demands.

Supervised vs. unsupervised learning

Supervised learning is when the model is trained on labeled data. For a fraud model, this means having accurately tagged fraudulent transactions within the training data set. The model learns from this data to predict future outcomes. Supervised methods include linear regression, logistic regression, decision trees and random forests.

In unsupervised learning, the model is not trained on labeled data and instead works to gain its own insights from the data. Unsupervised learning uses clustering and association techniques.

Supervised models generally perform better and are more predictable than unsupervised ones. However, supervised machine learning techniques require an upfront investment to ensure that the data is tagged correctly to achieve the best results. This may not be possible for organizations whose fraud reporting is not well controlled, or, as with RTP, where few historical fraud events exist. In these cases, semi-supervised models may be ideal: supervised learning methods are first used to derive a preliminary set of features or models, followed by unsupervised methods to refine the results.
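
For readers who think in code, a minimal sketch of the supervised case might look like the following. The file and column names are hypothetical, and SAS builds its production models with its own tooling rather than this library; this is illustration only.

```python
# Minimal sketch of supervised fraud-model training with scikit-learn.
# The file and column names are hypothetical placeholders.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

txns = pd.read_csv("transactions.csv")                 # one row per payment
X = txns[["amount", "hour_of_day", "new_beneficiary"]]
y = txns["is_fraud"]                                   # accurately tagged labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.3, random_state=42)

model = RandomForestClassifier(n_estimators=200, class_weight="balanced")
model.fit(X_train, y_train)                            # learn from labeled history
fraud_prob = model.predict_proba(X_test)[:, 1]         # score unseen payments
```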

SAS models

SAS provides industry-leading predictive models using a range of machine learning techniques. These are generally built with the bank's own data to give optimal fraud detection, unique to its customers and to the fraud patterns it sees.

To support behavior profiling, each SAS model is built with SAS’s patented Signature technology, a method of storing an entity’s historical transactional information that allows the model to determine someone’s typical habits. SAS models support multi-entity Signatures; these might include customer, account, beneficiary or device, for instance. When scoring a transaction, the model accounts not only for the current transaction but also for the historical behavioral activity of all relevant entities captured by the Signatures. It is widely recognized that historical behavior is predictive of fraud: there are typically regular patterns of usage or spending, so deviations from these established patterns may indicate suspicious activity and fraud.

SAS uses the information in these Signatures to derive hundreds of statistical variables that target unusual activity and collectively serve as inputs for the fraud detection model.

Examples of behavioral variables that can be derived from Signatures include, but are not limited to, the following:

  • Average number of transactions in a given period
  • Typical spending amounts
  • Typical bill pay usage
  • Geographical variance
  • Spend velocity
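
As a toy illustration of what such variables summarize (SAS’s Signature technology itself is patented and proprietary), per-entity behavioral features can be sketched in pandas with hypothetical field names:

```python
# Toy illustration of signature-style behavioral variables in pandas.
# Field names are hypothetical; this only shows the kind of per-entity
# history such variables summarize.
import pandas as pd

txns = pd.read_csv("transactions.csv", parse_dates=["timestamp"])
txns = txns.sort_values(["customer_id", "timestamp"]).reset_index(drop=True)

# Typical spending amount: mean of the customer's *prior* transactions
txns["avg_amount_to_date"] = (
    txns.groupby("customer_id")["amount"]
        .transform(lambda s: s.expanding().mean().shift())
)
# How far the current payment deviates from the customer's own history
# (NaN for a customer's first transaction, which has no history yet)
txns["amount_vs_history"] = txns["amount"] / txns["avg_amount_to_date"]
```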

Geographic location variables are very useful for fraud detection; however, customer locations on file aren’t always reliable, and sometimes that information is unavailable, so using a fixed home location to calculate distance from home may not be appropriate. SAS therefore developed a proprietary dynamic home, which allows the model to infer the customer’s current locale from the locations of clustered transactions. This reduces false positives where the customer’s on-file home location does not represent their true home or the area in which they usually transact.
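
The general concept, though not SAS’s proprietary method, can be illustrated by density-clustering a customer’s transaction coordinates and taking the densest cluster’s centroid as the inferred locale:

```python
# Toy version of the "dynamic home" idea: infer a customer's usual
# transacting locale from density-clustered transaction coordinates.
# SAS's actual method is proprietary; this shows only the general concept.
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical (lat, lon) pairs: mostly one locale plus a stray transaction
latlon = np.array([[35.68, 139.76]] * 8 + [[51.50, -0.12]])
coords = np.radians(latlon)                      # haversine expects radians

km_per_radian = 6371.0
labels = DBSCAN(eps=25 / km_per_radian, min_samples=5,
                metric="haversine").fit(coords).labels_

if (labels >= 0).any():
    densest = np.bincount(labels[labels >= 0]).argmax()
    dynamic_home = np.degrees(coords[labels == densest].mean(axis=0))
    print(dynamic_home)                          # ~ [35.68, 139.76]
```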

In addition to behavioral variables, the fraud model relies on risk variables that capture relative fraud risk of various aspects of transactions based on historical data. These risk variables are complementary to behavioral characteristics described above and play a particularly important role in scoring customers and users without regular patterns of spending activity.

SAS begins with thousands of candidate model features derived from various input fields. During the modelling process, the number of variables is reduced through a combination of techniques, including transformations, Kolmogorov–Smirnov tests, correlation with the target, linear/logistic regression, linear interdependency checks and missing value imputation.

For each of the steps above, SAS follows a threshold-based approach. For example, SAS picks a threshold for the correlation between each variable and the target; all variables that fall below it are eliminated from modelling. The exact threshold value is based on the SAS modelling team’s previous experience, so that enough candidate variables remain after filtering to proceed with the subsequent steps.
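
A hedged sketch of this kind of threshold-based filtering, with arbitrary placeholder thresholds rather than SAS’s values, might look like this:

```python
# Sketch of threshold-based variable filtering: keep a variable only if its
# Kolmogorov-Smirnov separation and target correlation clear a cutoff.
# Data and thresholds are arbitrary placeholders, not SAS's values.
import numpy as np
import pandas as pd
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(1000, 5)),
                  columns=[f"var{i}" for i in range(5)])
df["target"] = rng.integers(0, 2, size=1000)     # synthetic fraud flag

keep = []
for col in df.columns[:-1]:
    ks = ks_2samp(df.loc[df.target == 1, col],
                  df.loc[df.target == 0, col]).statistic
    corr = abs(df[col].corr(df["target"]))
    if ks > 0.05 and corr > 0.01:                # placeholder thresholds
        keep.append(col)
print(keep)                                      # surviving candidates
```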

Finally, SAS also performs several checks to filter out unstable variables.

This reduction process selects the variables that contribute most to model performance. Among the final set, different variables matter as different scenarios are encountered.

SAS models can also benefit from behavior segmentation. An example of behavior segmentation is to separate business and consumer customers early in the modelling process and essentially build separate models with the variables that are most relevant to each segment.

Different modelling techniques, such as gradient boosting and neural networks, are considered to determine the best approach. In fact, a combination of techniques is often used to achieve the best performance through an ensemble model.

[Model processing graphic]

SAS models output a score from 0 to 999 to indicate the likelihood that a transaction is fraudulent. SAS has developed proprietary reason code-generation technology that allows the model output to also include a list of risk reason codes. These reason codes are designed to give end users insight into the risks behind the model output score. Unlike traditional techniques that use individual variables as reasons, the SAS methodology first groups variables that are correlated and conceptually similar into risk factors, each represented by a reason code. The model produces three such reason codes, indicating the three highest-priority reasons the transaction is likely to be fraudulent.
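
As a hedged sketch of the general idea (the actual reason-code technology is proprietary to SAS), grouping variable contributions into named factors and ranking them might look like this; the factor and variable names are hypothetical:

```python
# Sketch of reason-code generation from grouped risk factors: roll
# correlated variables up into factors, then rank factors by their
# total contribution to the score. All names are hypothetical.
risk_factors = {
    "UNUSUAL_AMOUNT":  ["amount_vs_history", "amount_vs_segment"],
    "NEW_BENEFICIARY": ["benef_age_days", "first_payment_to_benef"],
    "GEO_ANOMALY":     ["dist_from_dynamic_home", "geo_velocity"],
}

def reason_codes(contributions, top_n=3):
    """contributions: dict of variable -> contribution to the score."""
    factor_weight = {
        factor: sum(contributions.get(v, 0.0) for v in variables)
        for factor, variables in risk_factors.items()
    }
    return sorted(factor_weight, key=factor_weight.get, reverse=True)[:top_n]

print(reason_codes({"amount_vs_history": 0.4, "geo_velocity": 0.7}))
```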

Case Study 1:

Client A required a model to score real-time payments transactions.

The modelling period was based on 18 months of historical data, which included non-monetary transactions such as logins and detail changes, payments and deposit transactions, and fraud data. This date range ensures coverage of seasonality and major recurring socio-economic events such as tax payments. As fraud reporting is usually delayed, additional months of fraud data were also included to ensure that all fraud occurring during the 18-month period was captured.

Various data issues were identified by SAS; the modelling impact of each was determined, and the issues were rectified in the data at source where possible or documented for consideration in modelling.

The main categories of fraud seen in the data were account takeover cases, where the customer’s credentials were stolen, and customer-involved cases, where the customer was complicit in giving the fraudster their money or credentials, as with an investment scam, romance scam, phone phishing, remote access fraud or an email hack.

The account takeover cases made up the majority of the fraud transactions, but the customer-involved cases contributed most of the fraud money lost.

Signature entities at the Online user ID and customer level were used to build up a behavioral history across the various channels and transaction types to identify unusual behavior. Variables were developed which looked at the login events and financial transaction patterns. Additionally, SAS’s proprietary dynamic home idea was used to infer the typical locale of the customer to calculate a more accurate distance from home variable.

SAS tried various modelling methods and found that in this case, the gradient boosting method proved to be the most effective at detecting the highly complex and nonlinear patterns in this fraud detection problem.

The model evaluation indicated around a 17% rise in transaction detection rate and an 11% rise in value detection rate over the existing payments model.

Case Study 2:

Client B required a model to score payment transactions made from their mobile, personal and business internet banking channels.

The modelling period was based on 17 months of historical data, which included payments and fraud data. As with Client A, the date range ensured seasonality was considered and that fraud during the period was captured.

A customer account Signature and a separate beneficiary account Signature were developed in this case, allowing the model to consider the holistic picture of the sender account and the receiving account independently. Variables were then developed around the patterns of these entities, considering things such as transaction amount, time of day, sender and receiver relationships, and maturity. Also, although only a portion of the IP address was provided in the data to maintain PII standards, SAS was able to create geolocation centroids from it, which allowed SAS’s proprietary dynamic home idea to inform the model as well.

A strong fraud pattern was detected that could not be identified from Signature information alone: accounts that had only received money for a long time suddenly began to send money. To catch such patterns in real time via the model, SAS designed a new batch-job feature to inform the model.

Behavior segmentation was implemented which split the transactions into two segments based on transaction types. The best model performance was achieved by using a gradient boosting trees model for one segment and a neural network model for the other.

During the evaluation period, the model performed very well, with a transaction detection rate of 79.1% and a value detection rate of 75% at an 11.4:1 false positive ratio.

Conclusion

Between the dynamic payments landscape and the ever-adapting fraudsters who continue to find new ways to exploit technology and customers, identifying fraud is only becoming more complex. Standard fraud prevention approaches are no longer as effective as they once were. SAS’ advanced machine learning models, which can analyze behaviors and detect suspicious patterns in real-time, can go a long way in helping organizations to better prevent fraud and protect customers.

Learn more

Machine learning models for payment fraud was published on SAS Users.

December 2, 2021
 

It’s a hard time to be a decision maker. Unexpected externalities like global pandemics, natural disasters and climate change make it harder to predict – and react to – everyday events. And that’s not just true for the world around us. The organizations we work within are more complex than ever, too.

The volume of communications and channels where we must meet customers and employees has grown exponentially – demanding our attention and reducing our focus. Not to mention added organizational complexity blurring the lines of roles and responsibilities according to geography, product and function.

Gaining control of such complexity requires rapid, streamlined and agile decision making. Technology that enables decision making needs to identify problems and take corrective action in real time to move quickly from questions to decisions.

SAS and Microsoft empower you to make better, faster decisions through unique enterprise decision management: SAS Intelligent Decisioning paired with Microsoft Power Automate via the SAS Decisioning connector gives you the ability to design, deploy and manage automated decisions that improve the customer, employee and partner experience.

Enterprise decision management from SAS and Microsoft allows you to automate with a deliberate focus on decisions. You can combine business rules management with digital process automation and ModelOps, including model management and analytics, to accelerate the decision making process.

Together, Intelligent Decisioning and Power Automate unlock a breadth of use cases across the enterprise, including:

  • Insurance: Claims processing. Improve customer satisfaction and process claims faster. Receive insurance claims via Microsoft Power Apps and use Microsoft Power Automate to seamlessly ingest the claim into SAS Intelligent Decisioning. Using neural network models, SAS Intelligent Decisioning can analyze images of damage and compare with policies. If more information is required, Power Automate can trigger a flow to connect with a representative in Dynamics 365 Omnichannel for Customer Service. Once the decision is rendered, Power Automate can trigger process flows to notify the customer and deposit money into the bank account on file.
  • Banking: Credit decisioning. Reduce lender risk, improve decisioning response times and increase your bottom line. Build risk profiles in SAS Intelligent Decisioning by creating scorecards and decision tables based on external data points, such as credit score, that assign each customer a risk rating. Use risk ratings to render decisions like home equity and line of credit approvals, and to determine the loan amount. Once a decision has been made, Power Automate flows can be used to communicate the loan amount to the customer and help them complete the loan agreement.
  • Retail/Banking: Fraud detection. Enable more secure transactions, reduce losses due to fraud and improve customer trust in your organization. SAS Intelligent Decisioning can identify fraudulent purchases and determine an appropriate course of action based on the level of confidence that a purchase is fraudulent. Power Automate can trigger automated reactions like alerting associated parties, denying a purchase at the point of sale, alerting the vendor, or sending notifications to the card holder.
  • Retail: Contextual Marketing. Increase marketing influence and become more customer centric by curating relevant and timely offers based on individual preferences. Use SAS Intelligent Decisioning to build a profile of tastes and preferences via geolocation, recommendation engines and market basket analysis. Use this profile to trigger Power Automate flows to send specific offers that align with important events, like birthdays or anniversaries, and send emails or push notifications to customers with unique, context-specific offers.

To learn more about what SAS Intelligent Decisioning and Microsoft Power Automate can help you achieve, visit sas.com/microsoft.

4 ways to make better, faster decisions with enterprise decision management from SAS Viya on Azure was published on SAS Users.

November 10, 2021
 

If you are thinking that nobody in their right mind would implement a Calculator API Service with a machine learning model, then yes, you’re probably right. But curiosity is in my DNA, it sometimes works this way, and machine learning is fun. I challenged myself to do it, not to promote such an experiment into production, but simply to accomplish a self-challenge that can easily be achieved with the resources provided by SAS Viya, particularly SAS Model Studio and SAS Intelligent Decisioning.

So, first, let’s define the purpose of my challenge:

deploy a basic calculator API service capable of executing the following operations for two input decimal numbers: addition, subtraction, multiplication, and division

The challenge must be executed under these restrictions:

  • Usage of a machine learning model as the “compute-engine” of the calculator
  • Development under the low code / no code paradigm
  • The end-to-end setup and execution process should not take more than a couple of hours

Use the following tools in SAS Viya:

  • Model Studio
  • Intelligent Decisioning
  • Simple web app (set up not covered)

The plan

The steps that I am identifying to complete my challenge are:
Step 1 - Create a machine learning model representing the compute-engine of my calculator (Model Studio)
Step 2 - Determine how to process other mathematical operations
Step 3 - Embed the needed logic into the decision to perform the calculator operations (Intelligent Decisioning)
Step 4 - Publish the artifact created as an API service (web app created outside of this post)

Step 1. Create a machine learning model as the compute-engine

Our first step is to create a model. We start with the addition operation and build from there. We’ll perform the addition by normal means of adding two numbers. Next, we’ll apply some extra logic which will perform subtraction, multiplication, and division. The diagram below represents the process:

A machine learning model is built from a data set from which it learns what to do. I want my model to learn the addition of two numbers, so I created a training data set in Excel with 100 records, each containing two random numbers between 0 and 1 and their sum. The image below displays the general setup:

The algorithm I chose for my compute engine is linear regression, a simple machine learning algorithm based on the following formula:

y = a + b·X1 + c·X2 + d·X3 + … + n·Xn

Where:

  • y is the result of the model execution – the result of the addition operation
  • X1, X2, X3, …, Xn are the input variables – for the calculator, there only will be X1 and X2 as operands
  • a, b, c, d, …, n are the parameters the machine learning process determines to create the model
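
As a quick check of this setup, and assuming nothing about Model Studio internals, the same experiment can be reproduced in a few lines of NumPy: fitting the no-intercept regression on 100 random sums recovers coefficients of essentially 1.0 per operand, so the fitted model adds its inputs.

```python
# Reproduce the experiment's core idea with NumPy: fit y = b*X1 + c*X2
# (intercept suppressed) on random sums; the coefficients come out at
# essentially 1.0 each, so the "model" adds its inputs.
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((100, 2))              # 100 records, two operands in [0, 1)
y = X.sum(axis=1)                     # the target: their sum

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)                           # ~ [1. 1.]
print(X[0] @ coef, y[0])              # prediction matches the true sum
```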

So, with the training data set created, I open a new machine learning project in SAS Viya Model Studio, select the data set from which the algorithm will learn, assign the target variable, add a linear regression node and a test node, and click “Run pipeline”. Note: if following along in your own environment, make sure to use Selection Method = Adaptive LASSO and toggle Suppress intercept = On in the linear regression node. The resulting model resembles the following:

You can learn more about model creation in the How to Deploy Models in SAS tutorial on the SAS Users YouTube channel.

Once the pipeline completes and I review the initial results, the model seems to behave properly; but when I test specific operands where the result should be zero, I realize the model gets them wrong:

Those are the operations with zero as the result. Suspecting the algorithm hasn’t learned from the proper data, I replace the first seven records of my initial data set with the following operations involving zeros:

Running the pipeline again and letting the magic work: voilà! The model has learned to handle the zeroes and properly sums two input numbers. When I check the results, I verify that the values calculated (predicted) by the model match the originals in the training and test data sets. Now I am sure my new model is ready for use as my calculator compute engine.

Now that I have my compute engine (model) ready, it’s time to use it. We know it can perform the sum operation perfectly, so how do we perform the rest of the operations? We’ll take the sum model, move it into SAS Intelligent Decisioning, and add rules to handle the rest of the operations.

First, let’s explore the logic that will build the operations from our original model. This is where the mathematics comes into play (Step 2). After exploring the operations, we’ll look at the Decision model where we’ll define the logic to run the operations (Step 3).

Addition

Execute the model with the two input numbers, with no additional logic.

Subtraction

By changing the sign of the second input, the model does the rest.

That’s a simple enough solution for subtraction, but how do we handle multiplication and division? Let’s take a look.

Multiplication and division

How can I perform a multiplication or a division operation if my compute engine only executes the addition operation? Here we can apply the magic of logarithmic properties stating:

  • log (x*y) = log(x) + log(y)
  • log (x/y) = log(x) – log(y)

Following this logic, if I want to multiply two numbers, I calculate the logarithm of each one and perform the addition operation in my compute engine. I follow this up by applying the exponential function to reverse the logarithm. The image below outlines the workflow.

For the division, it is the same process, but changing the sign of the second logarithm to the subtraction operation.
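
Here is a quick numeric check of these identities, assuming positive operands; the zero and negative cases need the special treatment described under Additional Logic below.

```python
# Numeric check of the logarithm trick, for positive operands only.
import math

def multiply_via_add(x, y):
    return math.exp(math.log(x) + math.log(y))

def divide_via_add(x, y):
    return math.exp(math.log(x) - math.log(y))

print(multiply_via_add(6, 7))   # ~ 42.0 (up to floating-point error)
print(divide_via_add(1, 8))     # 0.125
```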

Additional Logic

There are also certain cases requiring special treatment. For example, a division by zero generates an error, a logarithm of zero or a negative number cannot be calculated, and the multiplication of two numbers is zero if at least one of them is zero.

Let's now build the calculator in SAS Intelligent Decisioning, including the operations, the model, and the extra logic.

Step 3 - Embed the needed logic into the decision to perform the calculator operations

The diagram below represents the Decision flow for our calculator example. Each node is numbered and followed by a definition.

0 - Overall decision flow - definition doc found here on GitHub
1 - Determine if zero is a factor for multiplication or division operations - definition doc found here on GitHub
2 - Decision handles value of previous step - set Variable = Decision_Internal_EndProcess = True
3 - Process calculations based on operation value - definition doc found here on GitHub
4 - Calculator linear regression model created earlier in the post - model definition file found here on GitHub
5 - Additional logic to finalize processing on multiplication and division operations - definition doc found here on GitHub
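
To make the flow concrete, here is a plain-Python sketch of the same routing logic. It is not the Intelligent Decisioning implementation, and add_model() is a hypothetical stand-in for a call to the published regression model.

```python
# Plain-Python sketch of the decision flow's routing logic. add_model()
# stands in for the deployed regression model, whose learned coefficients
# are essentially 1 and 1.
import math

def add_model(a, b):                 # placeholder for the published model
    return a + b

def calculate(op, x, y):
    if op in ("*", "/") and (x == 0 or y == 0):
        if op == "/" and y == 0:
            raise ZeroDivisionError("division by zero")
        return 0.0                   # a zero factor, or zero numerator
    if op == "+":
        return add_model(x, y)
    if op == "-":
        return add_model(x, -y)      # flip the sign of the second operand
    # multiplication / division via logarithms, handling negative operands
    sign = -1 if (x < 0) ^ (y < 0) else 1
    second = math.log(abs(y)) if op == "*" else -math.log(abs(y))
    return sign * math.exp(add_model(math.log(abs(x)), second))
```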

Step 4 - Publish the final artifact created as an API service

After completing all the work on the decision, click on the Publish button and the Calculator is ready to be consumed via an API.

A colleague of mine created a simple web application which calls models using SAS Viya microservice APIs. I'll use this web app to display the results of my calculator. For brevity, I won't cover the details of the app. If you'd like to see how to create a web app using SAS APIs, I recommend the Build a web application using SAS Compute Server series on the SAS Communities.

The app allows me to choose my decision flow, add my operands and indicate an operation as seen below.

I tested with several operand and operation combinations and they all checked out. It worked!

Final Thoughts

I can consider my self-challenge solved. Through this example we accomplished the following:

  • The Calculator API Service can perform the four operations based on a Machine Learning Model.
  • I created a simple machine learning model to perform the addition of two decimal numbers from a 100-record data set.
  • The model and the extra logic needed to perform the operations was developed under the low code / no code paradigm.
  • I used the visual interface to generate the model and the extra logic, in conjunction with the expression builder, to apply the logarithm and exponential operations.
  • The overall process has taken no more than a couple of hours.

Apart from the usefulness of this API, my principal takeaways of this self-challenge are:

  • In this case, building my data set to obtain the exact behavior I wanted for my model was quite straightforward.
  • Building the model through the graphical user interface was easy and fast.
  • Having the capacity to embed the models with extra logic under the low code / no code paradigm provides “supercharged intelligence” to the model.
  • The publishing feature of the whole artifact as an API service is great, providing instant value to the consumers.
  • SAS Viya is a great platform as it provides all the pieces needed to satisfy your analytical business needs as well as your “curiosity needs”.

 

How I used a SAS ML model and Intelligent Decisioning to build a calculator was published on SAS Users.

October 12, 2021
 

This article was co-written by Jane Howell, IoT Product Marketing Leader at SAS. Check out her blog profile for more information.

As artificial intelligence comes of age and data continues to disrupt traditional industry boundaries, the need for real-time analytics is escalating as organizations fight to keep their competitive edge. The benefits of real-time analytics are significant. Manufacturers must inspect thousands of products per minute for defects. Utilities need to eliminate unplanned downtime to keep the lights on and protect workers. And governments need to warn citizens of natural disasters, like flooding events, providing real time updates to save lives and protect property.

Each of these use cases requires a complex network of IoT sensors, edge computing, and machine learning models that can adapt and improve by ingesting and analyzing a diverse set of high-volume, high-velocity data.

SAS and Microsoft are partnering to inspire greater trust and confidence in every decision, by innovating with proven AI and streaming analytics in the cloud and on the edge. Together, we make it easier for companies to harness hidden insights in their diverse, high volume, high velocity IoT data, and capitalize on those insights in Microsoft Azure for secure, fast, and reliable decision making.

To take advantage of all the benefits that real-time streaming analytics has to offer, it’s important to tailor your streaming environment to your organization’s specific needs. Below, we’ll dive into how to understand the value of IoT in parallel to your organization’s business objectives and then strategize, plan, and manage your streaming analytics environment with SAS Viya on Azure.

Step 1: Understand the value of IoT

While you may already know that IoT and streaming analytics are the right technologies to enable your business’ real time analytics strategy, it is important to understand how it works and how you can benefit. You can think of streaming analytics for IoT in three distinct parts: sense, understand and act.

Sense: Sensors by design are distributed, numerous, and collect data at high fidelity in various formats. The majority of data collected by sensors has a short useful life and requires immediate action. Streaming analytics is well-suited to this distributed sensor environment to collect data for analysis.
Understand: A significant number of IoT use cases require quick decision-making in real time or near-real time. To achieve this, we need to apply analytics to data in motion. This can be done by deploying AI models that detect anomalies and patterns as events occur.
Act: As with any analytics-based decision support, it is critical to act on the insight generated. Once a pattern is detected this must trigger an action to reach a desired outcome. This could be to alert key individuals or change the state of a device, possibly eliminating the need for any human intervention.

The value in IoT is driven by the reduced latency to trigger the desired outcome. Maybe that’s improving production quality in the manufacturing process, recommending a new product to a customer as they shop online, or eliminating equipment failures in a utility plant. Whatever it is, time is of the essence and IoT can help get you there.

Step 2: Strategize

Keeping the “sense, understand, act” framework in mind, the next step is to outline what you hope to achieve. To get the most out of your streaming analytics with SAS and Microsoft, keep your objectives in mind so you can stay focused on the business outcome instead of trying to act on every possible data point.

Some important questions to ask yourself are:

1. What are the primary and secondary outcomes you hope to achieve? Increased productivity? Augmented safety? Improved customer satisfaction?
2. What patterns or events of interest do you want to observe?
3. If your machines and sensors show anomalous behavior, what actions need to be taken? Is there an existing business process that reflects this?
4. What data is important to store as historical data, and what data can expire?
5. What kind of infrastructure exists from the point where data is generated (edge) to the cloud? Is edge processing an option for time-critical use cases, or does processing need to be centralized in the cloud?
6. What are your analytics and application development platforms? Do you have access to high-performance streaming analytics and cloud infrastructure to support this strategy?

Once you’ve identified your outcomes, define which metrics and KPIs you can measure to show impact. Make sure to have some baseline metrics to start from that you can improve upon.

Step 3: Plan and adopt

Now it’s time to take your strategy and plan the adoption of streaming analytics across your business.

Adoption will look different if you already have an IoT platform in place or if you are working to create a net-new solution. If you are going to be updating or iterating upon an existing solution, you will want to make sure you have access to key historical data to measure improvement and use institutional knowledge to maximize performance. If you are working with a net-new solution, you will want to give yourself some additional time to start small and then scale your operations up over time so you can tackle any unforeseen challenges.

In both cases it is important to have key processes aligned to the following considerations:

Data variety, volume and accuracy: Focus here on the “sense” part of the “sense, understand, act” framework. Accessing good data is the foundation to the success of your streaming projects. Make sure you have the right data needed to achieve your desired business outcome. Streaming analytics helps you understand the signals in IoT data, so you can make better decisions. But if you can’t access the right data, or your data is not clean, your project will not be successful. Know how much data you will be processing and where. Data can be noisy, so it is important to understand which data will give you the most insight.
Reliability: Ensure events are processed exactly once so you’re not observing the same events multiple times. When equipment fails or defects occur on the production line, ensure processes are in place to restart automatically and maximize uptime for operations.
Scalability: Data science resources are scarce, so choose a low-code, no-code solution that can address your need to scale. When volume increases, how are you going to scale up and out? Azure simplifies scale with its PaaS offerings, including the ability to auto-scale SAS Viya on Azure.
Operations: Understand how you plan to deploy your streaming analytics models, govern them and decide which processes can be automated to save time.
Choose the right partners and tools: This is critical to the success of any initiative. SAS and Microsoft provide a best-in-class solution for bringing streaming analytics on the most advanced platform for integrated cloud and edge analytics.

Now that you have created your plan, it is time to adopt. Remember to start small and add layers of capability over time.

Step 4: Manage

To get the most value from IoT and streaming analytics, organizations must implement processes for continuous iteration, development, and improvement. That means having the flexibility to choose the most powerful models for your needs – using SAS, Azure cloud services, or open source. It also means simplifying DevOps processes for deploying and monitoring your streaming analytics to maximize uptime for your business systems.

With SAS Viya on Azure, it is easy to do this and more. Seamlessly move between your SAS and Microsoft environment with single sign on authentication. Develop models with a host of no-code, low-code tools, and monitor the performance of your SAS and open-source models from a single model management library.

Maximizing value from your IoT and streaming analytics systems is a continuous, agile process. That is why it is critical to choose the most performant platform for your infrastructure and analytics needs. Together, SAS and Microsoft make it easier for organizations of all sizes and maturity to rapidly build, deploy, and scale IoT and streaming analytics, maximizing up time to better serve customers, employees, and citizens.

If you want to learn more about SAS and streaming analytics and IoT capabilities as well as our partnership with Microsoft, check out the resources below:

• Learn about SAS Viya’s IoT and streaming analytics capabilities
• Discover all the exciting things SAS and Microsoft are working to achieve together at SAS.com/Microsoft
• See how SAS and Microsoft work together to help the town of Cary, North Carolina warn citizens of flood events: Smart city uses analytics and IoT to predict and manage flood events

Your guide for analyzing real time data with streaming analytics from SAS® Viya® on Azure was published on SAS Users.

August 20, 2021
 

This article was co-written by Marinela Profi, Product Marketing Manager for AI, Data Science and Open-Source. Check out her blog profile for more information.

Artificial Intelligence (AI) is changing the way people and organizations improve decision-making and move about their lives – from text translation, to chatbots and predictive analytics. However, many organizations are struggling to realize its potential as model deployment processes remain disconnected, creating unforeseen headaches and manual work. Additionally, other requirements like performance monitoring, retraining, and integration into core business processes must be streamlined for optimal teamwork and resource usage.

SAS and Microsoft are partnering to inspire greater trust and confidence in every decision, by driving innovation and proven AI in the cloud. With a combined product roadmap, SAS and Microsoft are working tirelessly to improve offerings and connectivity between SAS Viya and Microsoft Azure environments across industries. That’s why we are especially excited to announce SAS Viya users can now publish SAS and open-source models in Azure Machine Learning.

The SAS and Microsoft team built a tightly integrated connection between SAS Model Manager and Azure Machine Learning to register, validate, and deploy SAS and open-source models to Azure Machine Learning with just a few clicks. From there, data scientists can enrich their applications with SAS or open-source models within their Azure environment.

This integration will enable users to:

1) Extend SAS models stored in SAS Model Manager into the Azure Machine Learning registry, offering more opportunities for collaboration across the enterprise.

2) Deploy SAS and open-source models from SAS Model Manager to Azure Machine Learning on the same Azure Kubernetes cluster you have already set up in Azure Machine Learning. Before deploying the model, you can validate the model and ensure it meets your criteria.

3) Seamlessly connect your SAS Viya and Microsoft environments without the hassle of verifying multiple licenses with single sign-on authentication via Azure Active Directory (Azure AD).

Get started

Step 1: To get started, use Azure AD for simplified SAS Viya access.

Step 2: SAS Model Manager governs, deploys, and monitors all types of SAS and open-source models (i.e., Python, R). On the home page, you can see the projects you and your team are working on in addition to “What’s new” and “How to” videos with the latest updates.

Step 3: Compare different models to identify the most accurate “champion model.” Deploy the model throughout the Microsoft ecosystem from cloud to edge with customizable runtimes, centralized monitoring, and management capabilities.

Step 4: Using the provided artifacts, Azure Machine Learning creates executable containers supporting SAS and open-source models. You can use the endpoints created through model deployment for the scoring of the data.
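
As a hedged illustration, scoring against a deployed online endpoint is a plain REST call; the URI, key, and payload schema below are placeholders for the values Azure Machine Learning displays for your own deployment, not literal API details.

```python
# Hedged sketch of scoring a deployed model endpoint over REST.
# The URI, key, and payload schema are hypothetical placeholders.
import json
import requests

scoring_uri = "https://<endpoint>.<region>.inference.ml.azure.com/score"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer <endpoint-key>",
}
payload = {"data": [{"amount": 120.5, "hour_of_day": 23}]}

resp = requests.post(scoring_uri, headers=headers, data=json.dumps(payload))
print(resp.json())                      # scored result from the deployment
```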

Step 5: Schedule SAS Model Manager to detect model drift and automatically retrain models in case of poor performance or bias detection.

Discover more

If you want to know more about SAS Model Manager and our partnership with Microsoft, check out the resources below:

“What’s New with SAS Model Manager” article series to find out the latest and greatest updates.
SAS Viya on Azure to solve 100 different use cases on Data for Good, industries and startups.

Let us know what you think!

We would love to hear what you think about this new experience and how we can improve it. If you have any feedback for the team, please share your thoughts and ideas in the comments section below.

Deploying SAS and open-source models to Azure Machine Learning has never been easier was published on SAS Users.

August 18, 2021
 

When people think about sports, many things may come to mind: Screaming fans, the intensity of the game and maybe even the food. Data doesn’t usually make the list. But what people may not realize is that data is behind everything people love about sports. It can help determine how [...]

4 ways analytics are enhancing sports was published on SAS Voices by Olivia Ojeda

July 22, 2021
 

How do you convince decision makers in your enterprise to give a machine learning (ML) project the green light?

You might be super excited about machine learning – as many of us are – and might think that this stuff should basically sell itself! The value proposition can seem totally obvious when you are already invested in it. The improvement to current operations is a "no-brainer." And the core ML technology is nifty as heck.

But to get traction for a new initiative, to sell it to decision makers, you need to take a step back from the excitement that you feel and tell a simple, non-technical business story that is sober rather than fervent.

Start with an elevator pitch

99.5% of our direct mail is ineffective. Only half a percent respond.

If we can lower that nonresponse rate to 98.5% — and increase the response rate to 1.5% — that would mean a projected $500,000 increase in annual profit, tripling the ROI of the marketing campaigns. I can show you the arithmetic in detail.

We can use machine learning to hone down the size of our mailings by targeting the customers more likely to respond. This should cut costs about three times the amount that it will decrease revenue, giving us the gains and ROI I just mentioned.
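
If you want to rehearse the arithmetic behind a pitch like this before delivering it, a few lines of Python make the before-and-after comparison concrete. Every figure below is an assumed placeholder, not the numbers behind the $500,000 claim; substitute your own campaign data.

```python
# Back-of-envelope helper for pitch arithmetic. All inputs are assumed
# placeholders; the pitch's $500,000 and tripled ROI come from the
# author's own figures, not from these.
def campaign(pieces, cost_per_piece, response_rate, profit_per_response):
    cost = pieces * cost_per_piece
    revenue = pieces * response_rate * profit_per_response
    return revenue - cost, (revenue - cost) / cost   # net profit, ROI

print(campaign(1_000_000, 0.60, 0.005, 200))   # mail everyone
print(campaign(  300_000, 0.60, 0.015, 200))   # mail likely responders only
```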

A short pitch like this is the best place to start before asking for questions. Get straight to the point – the business value and the bottom line – and then see where your colleagues are coming from. Remember, they're not necessarily excited about ML, so in this early stage, it is really, really easy to bore them. That’s why you must lead with the value and then get into the ML technology only to the degree necessary to establish credibility.

Keep your pitch focused on accomplishing these three things

  1. Your pitch must lead with the value proposition, expressed in business terms without any real details about ML, models, or data. Nothing about how ML works, only the actionable value that it delivers. Focus on the functional purpose, the operational improvement gained by model deployment – and yet, in this opening, don't use the words "model" or "deployment."
  2. Your pitch must estimate a performance improvement in terms of one or two key performance indicators (KPIs) such as response rate, profit, ROI, costs, or labor/staff requirements. Express this potential result in simple terms. For example, the profit curve of a model is “TMI” (Too Much Information) – it introduces unnecessary complexity during this introductory pitch. Instead, just show a bar chart with only two bars to illustrate the potential improvement. Stick with the metrics that matter, the ones people care about — that is, the ones that actually drive business decisions at your company. Make the case that the performance improvement more than justifies the expense of the ML project. Don't get into predictive model performance measures such as lift.
  3. Stop and listen -- keep your pitch short and then open the conversation. Realize that your pitch isn't the conclusion but rather a catalyst to begin a dialogue. By laying out the fundamental proposition and asking them to go next, you get to find out which aspects are of concern and which are of interest, and you get a read on their comfort level with ML or with analytics in general.

So, does the wondrous technology of machine learning itself even matter in this pitch? Can you really sell ML without getting into ML? Well, yes, it does matter, and usually you will get into it, eventually. But you need to interactively determine when to do so, to what depth, and at what pace.

With machine learning, leading with the scientific virtues and quantitative capabilities of the technology that you are selling – predictive modeling algorithms, the idea of learning from data, probabilities, and so on – is like pitching the factory rather than the sausage. Instead, lead with the business value proposition.

It's more common than you may realize for the business professional to whom you're speaking to feel nervous about their own ability to understand analytical technology. The elevator-pitch format serves as an antidote to this type of "tech aversion." Lead with a simple story about how value is delivered or how processes will improve.

These tactics for green lighting compose just one part of machine learning leadership. For machine learning projects to succeed, a very particular leadership practice must be followed. To fully dive in, enroll in my SAS Business Knowledge Series course, Machine Learning Leadership and Practice – End-to-End Mastery. (This article is based on one of the course’s 142 videos.) I developed this curriculum to empower you to generate value with machine learning, whether you work as a techie, a business leader, or some combination of the two. This course delivers the end-to-end expertise that you need, covering both the core technology and the business-side practice. Why cover both sides? Because both sides need to learn both sides! Click here for more details, the full syllabus, and to enroll.

Getting the green light for a machine learning project was published on SAS Users.

July 21, 2021
 

In my new book, I explain how segmentation and clustering can be accomplished in three ways: coding in SAS, point-and-click in SAS Visual Statistics, and point-and-click in SAS Visual Data Mining and Machine Learning using SAS Model Studio. These three analytical tools allow you to do many diverse types of segmentation, and one of the most common methods is clustering. Clustering is still among the top 10 machine learning methods used based on several surveys across the globe.

One of the best methods for learning about your customers, patrons, clients, or patients (or simply observations in almost any data set) is to perform clustering to find clusters that have similar within-cluster characteristics and each cluster has differing combinations of attributes. You can use this method to aid in understanding your customers or profile various data sets. This can be done in an environment where SAS and open-source software work in a unified platform seamlessly. (While open source is not discussed in my book, stay tuned for future blog posts where I will discuss more fun and exciting things that should be of interest to you for clustering and segmentation.)

Let’s look at an example of clustering. The importance of looking at one’s data quickly and easily is a real benefit when using SAS Visual Statistics.

Initial data exploration and preparation

To demonstrate the simplicity of clustering in SAS Visual Statistics, the data set CUSTOMERS is used here and throughout the book. I have loaded the CUSTOMERS data set into memory, and it is now listed in the active tab. I can easily explore and visualize this data by right-clicking and selecting Actions and then Explore and Visualize. This takes you to the SAS Visual Analytics page.

I have added four new compute items by taking the natural logarithm of four attributes and will use these newly transformed attributes in a clustering.

Performing simple clustering

Clustering in SAS Visual Statistics can be found by selecting the Objects icon on the left and scrolling down to see the SAS Visual Statistics menus as seen below. Dragging the Cluster icon onto the Report template area will allow you to use that statistic object and visualize the clusters.

Once the Cluster object is on the template, adding data items to the Data Roles is simple by checking the four computed data items.

Click the OK icon, and the report below appears immediately, where five clusters were found using the four computed data items.

There are 105,456 total observations in the data set; however, only 89,998 were used for the analysis. Some observations were excluded because the natural logarithm could not be computed for them. To see how to handle that situation easily, please pick up a copy of Segmentation Analytics with SAS Viya. Let me know if you have any questions or comments.
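
For readers who want to replicate the exercise outside the point-and-click interface, here is a hedged open-source sketch of the same steps: log-transform four attributes, drop the rows where the logarithm is undefined, and fit a five-cluster k-means. The file and column names are hypothetical stand-ins for the CUSTOMERS data.

```python
# Open-source sketch of the clustering exercise. File and column names
# are hypothetical; the post itself uses the point-and-click Cluster
# object in SAS Visual Statistics.
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

customers = pd.read_csv("customers.csv")
cols = ["recency", "frequency", "monetary", "tenure"]

usable = customers[(customers[cols] > 0).all(axis=1)].copy()  # log needs > 0
logged = np.log(usable[cols])

usable["cluster"] = KMeans(n_clusters=5, n_init=10,
                           random_state=42).fit_predict(logged)
print(usable["cluster"].value_counts())       # cluster sizes
```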

 

 

Clustering made simple was published on SAS Users.

May 11, 2021
 

It’s safe to say that SAS Global Forum is a conference designed for users, by users. As your conference chair, I am excited by this year’s top-notch user sessions. More than 150 sessions are available, many by SAS users just like you. Wherever you work or whatever you do, you’ll find sessions relevant to your industry or job role. New to SAS? Been using SAS forever and want to learn something new? Managing SAS users? We have you covered. Search for sessions by industry or topic, then add those sessions to your agenda and personal calendar.

Creating a customizable agenda and experience

Besides two full days of amazing sessions, networking opportunities and more, many user sessions will be available on the SAS Users YouTube channel on May 20, 2021 at 10:00am ET. After you register, build your agenda and attend the sessions that most interest you when the conference begins. Once you’ve viewed a session, you can chat with the presenter. Don’t know where to start? Sample agendas are available in the Help Desk.

For the first time, proceedings will live on SAS Support Communities. Presenters have been busy adding their papers to the community. Everything is there, including full paper content, video presentations, and code on GitHub. It all premieres on “Day 3” of the conference, May 20. Have a question about the paper or code? You’ll be able to post a question on the community and ask the presenter.

Want training or help with your code?

Code Doctors are back this year. Check out the agenda for the specific times they’re available and make your appointment, so you’ll be sure to catch them and get their diagnosis of code errors. If you’re looking for training, you’ll be quite happy. Training is also back this year and it’s free! SAS instructor-led demos will be available on May 20, along with the user presentations on the SAS Users YouTube channel.

Chat with attendees and SAS

It is hard to replicate the buzz of a live conference, but we’ve tried our best to make you feel like you’re walking the conference floor. And we know networking is always an important component to any conference. We’ve made it possible for you to network with colleagues and SAS employees. Simply make your profile visible (by clicking on your photo) to connect with others, and you can schedule a meeting right from the attendee page. That’s almost easier than tracking down someone during the in-person event.

We know the exhibit hall is also a big draw for many attendees. This year’s Innovation Hub (formerly known as The Quad) has industry-focused booths and technology booths, where you can interact in real-time with SAS experts. There will also be a SAS Lounge where you can learn more about various SAS services and platforms such as SAS Support Communities and SAS Analytics Explorers.

Get started now

I’ve highlighted a lot in this blog post, but I encourage you to view this 7-minute Innovation Hub video. It goes in depth on the Hub and all its features.

This year there is no reason not to register for SAS Global Forum…and attend as few or as many sessions as you want. Why? Because the conference is FREE!

Where else can you get such quality SAS content and learning opportunities? Nowhere, which is why I encourage you to register today. See you soon!

SAS Global Forum: Your experience, your way was published on SAS Users.