February 13, 2020
 

What do a Pulitzer prize-winning author, an Emmy award-winning TV personality and one of the top 5 influencers in the world have in common? They’ll all be at SAS Global Forum this year. It’s where analytics enthusiasts and executive thought leaders meet to share strategies, get training in abundance, and [...]

3 people I can’t wait to meet at SAS Global Forum was published on SAS Voices by Jenn Chase

February 12, 2020
 

The Johnson system (Johnson, 1949) contains a family of four distributions: the normal distribution, the lognormal distribution, the SB distribution, and the SU distribution. Previous articles explain why the Johnson system is useful and show how to use PROC UNIVARIATE in SAS to estimate parameters for the Johnson SB distribution or for the Johnson SU distribution.

How do you choose between the SB and SU distributions?

The graph to the right shows a histogram with an overlay of a fitted Johnson SB density curve. But why fit the SB distribution? Why not the SU? For a given sample of data, it is not usually clear from the histogram whether the tails of the data are best described as thin-tailed or heavy-tailed. Accordingly, it is not clear whether the SB (bounded) or the SU (unbounded) distribution is a more appropriate model.

One way to answer that question is to plot the sample kurtosis and skewness on a moment-ratio diagram to see if it is in the SB or SU region of the diagram. Unfortunately, high-order sample moments have a lot of variability, so this method is not very accurate. See Chapter 16 of Simulating Data with SAS for examples.

Slifker and Shapiro (1980) devised a method that does not use high-order moments. Instead, they compare the length of the tails of the distribution to the length of the central portion of the distribution. They use four quantiles to define "the tails" and "the central portion," so their method is similar to a robust definition of skewness, which also uses quantile information.

For ease of exposition, in this section I will oversimplify Slifker and Shapiro's results. Essentially, they suggest using the 6th, 30th, 70th, and 94th percentiles of the data to determine whether the data are best modeled by the SU, SB, or lognormal distribution. Denote these percentiles by P6, P30, P70, and P94, respectively. The key quantities in the computation are lengths of the intervals between percentiles of the data. In particular, define

  • m = P94 - P70 (the length of the upper tail)
  • n = P30 - P06 (the length of the lower tail)
  • p = P70 - P30 (the length of the central portion)

Slifker and Shapiro show that if you use percentiles of the distributions, the ratio m*n/p² has the following properties:

  • m*n/p² > 1 for percentiles of the SU distribution
  • m*n/p² < 1 for percentiles of the SB distribution
  • m*n/p² = 1 for percentiles of the lognormal distribution

Therefore, they suggest that you use the sample estimates of the percentiles to compute the ratio. If the ratio is close to 1, use the lognormal distribution. Otherwise, if the ratio is greater than 1, use the SU distribution. Otherwise, if the ratio is less than 1, use the SB distribution.
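
For example, here is a small numerical illustration. The percentile values are invented for illustration and are not taken from real data:

/* Hypothetical percentiles (invented for illustration):
   P06 = 1.7, P30 = 2.4, P70 = 3.5, P94 = 4.6 */
data _null_;
   m = 4.6 - 3.5;        /* upper tail */
   n = 2.4 - 1.7;        /* lower tail */
   p = 3.5 - 2.4;        /* central portion */
   ratio = m*n / p**2;   /* 0.636 < 1 ==> the SB distribution is suggested */
   put ratio=;
run;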

Details of the computation

The previous section oversimplifies one aspect of the computation. Slifker and Shapiro don't actually recommend using the 6th, 30th, 70th, and 94th percentiles of the data. Instead, they recommend choosing a normal variate, z (0 < z < 1), that depends on the size of the sample. They recommend using "a value of z near 0.5 such as z = 0.524" for "moderate-sized data sets" (p. 240). After choosing z, consider the evenly spaced values {-3*z, -z, z, 3*z}. The normal CDF evaluated at these four points gives the percentiles to use.

For z = 0.524, the normal CDF values at these points are 0.058, 0.300, 0.700, and 0.942. These, then, are the percentiles to use for "moderate-sized data sets." This choice assumes that the 5.8th and 94.2nd sample percentiles (which define the "tails") are good approximations of the corresponding percentiles of the distribution. If you have a large data set, another reasonable choice is z = 0.6745, which leads to the 2.2nd, 25th, 75th, and 97.8th percentiles.

A SAS program to determine the Johnson family from data

You can write a SAS/IML program that reads in univariate data, computes the percentiles of the data, and computes the ratio m*n/p². You can use the QNTL function in the SAS/IML language to compute the percentiles, but Slifker and Shapiro use a slightly nonstandard definition of percentiles in their paper. For consistency, the following program uses their definition. Part of the percentile computation requires looking at the integer and fractional part of a number.

The following program analyzes the EngineSize variable in the Sashelp.Cars data set:

/* For Sashelp.Cars, the EngineSize (and Cylinders) variable is SB. Others are SU. */
%let dsname = Sashelp.Cars;
%let varName = EngineSize; /* OR  %let varName = mpg_city; */
 
/* Implement the Slifker and Shapiro (1980) https://www.jstor.org/stable/1268463
   method that uses sample percentiles to assess whether the data are best 
   modeled by the SB, SU, or SL (lognormal) distributions. */
proc iml;
/* exclude any row with missing value: https://blogs.sas.com/content/iml/2015/02/23/complete-cases.html */
start ExtractCompleteCases(X);
   idx = loc(countmiss(X, "row")=0);
   if ncol(idx)>0 then return( X[idx, ] ); else return( {} ); 
finish;
 
/* read the variable into x */
use &dsname;
   read all var {&varName} into x;
close;
 
x = ExtractCompleteCases(x);      /* remove missing values */
if nrow(x)=0 then abort;
call sort(x);
N = nrow(x);
 
/* Generate the percentiles as percentiles of the normal distribution for evenly spaced 
   variates. This computation does not depend on the data values. */
z0 = 0.524;                       /* one possible choice. Another might be 0.6745 */
z = -3*z0 // -z0 // z0 // 3*z0;   /* evenly space z values */
pctl = cdf("Normal", z);          /* percentiles of the normal distribution */
print pctl[f=5.3];                /* These are the percentiles to use */
 
/* Note: for z0 = 0.524, the percentiles are approximately 
   the 5.8th, 30th, 70th, and 94.2th percentiles of the data */
 
/* The following computations are almost (but not quite) the same as 
   call qntl(xz, x, p);   
   Use MOD(k,1) to compute the fractional part of a number. */
k = pctl*N + 0.5;
intk = int(k);          /* int(k) is integer part and mod(k,1) is fractional part */
xz = x[intk] + mod(k,1) # (x[intk+1] - x[intk]); /* quantiles using linear interpol */
 
/* Use xz to compare length of left/right tails with length of central region */
m = xz[4] - xz[3];      /* right tail: length of 94th - 70th percentile */
n = xz[2] - xz[1];      /* left tail: length of 30th - 6th percentile */
p = xz[3] - xz[2];      /* central region: length of 70th - 30th percentile */
 
/* use ratio to decide between SB, SL (lognormal) and SU distributions */
ratio = m*n/p**2;
eps = 0.05;             /* if ratio is within 1 +/- eps, use lognormal */
 
if (ratio > 1 + eps) then 
   type = 'SU';
else if (ratio < 1 - eps) then
   type = 'SB';
else 
   type = 'LOGNORMAL';
 
print type;
call symput('type', type);   /* optional: store value in a macro variable */
quit;

The output of the program is the word 'SB', 'SU', or 'LOGNORMAL', which indicates which Johnson distribution seems to fit the data. When you know this information, you can use the appropriate option in the HISTOGRAM statement of PROC UNIVARIATE. For example, the SB distribution seems to be appropriate for the EngineSize variable, so you can use the following statements to produce the graph at the top of this article.

proc univariate data=Sashelp.Cars;
   var EngineSize;
   histogram EngineSize / SB(theta=0 sigma=EST fitmethod=moments);
   ods select Histogram;
run;

Most of the other variables in the Sashelp.Cars data are best modeled by using the SU distribution. For example, if you change the value of the &varName macro variable to mpg_city and rerun the program, the program prints 'SU'.

Concluding thoughts

Slifker and Shapiro's method is not perfect. One issue is that you have to choose a particular set of percentiles (derived from the choice of "z0" in the program). Different choices of z0 could conceivably lead to different results. And, of course, the sample percentiles—like all statistics—have sampling variability. Nevertheless, the method seems to work adequately in practice.

The second part of Slifker and Shapiro's paper describes a method of using the percentiles of the data to fit the parameters of the distribution. This is the "method of percentiles" that is mentioned in the PROC UNIVARIATE documentation. In a companion article, Mage (1980, p. 251) states: "Although a given percentile point fit of the SB parameters may be adequate for most purposes, the ambiguity of obtaining different parameters by different percentile choices may be unacceptable in some applications." In other words, it would be preferable to "avoid the situation where one choice of percentiles leads to [one conclusion] and a second choice of percentiles leads to [a different conclusion]."

It is worth noting that the UNIVARIATE procedure supports three different methods for fitting parameters in the Johnson distributions. No one method works perfectly for all possible data sets, so experiment with several different methods when you use the Johnson distribution to model data.
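
For example, the following sketch fits the SB model by the method of percentiles by using the FITMETHOD= suboption; rerun it with fitmethod=mle or fitmethod=moments and compare the parameter estimates and fitted curves:

proc univariate data=Sashelp.Cars;
   var EngineSize;
   /* method of percentiles; also try fitmethod=mle or fitmethod=moments */
   histogram EngineSize / SB(theta=0 sigma=EST fitmethod=percentile);
   ods select Histogram ParameterEstimates;
run;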

The post The Johnson system: Which distribution should you choose to model data? appeared first on The DO Loop.

February 10, 2020
 

You can represent every number as a nearby integer plus a decimal. For example, 1.3 = 1 + 0.3. The integer is called the integer part of x, whereas the decimal is called the fractional part of x (or sometimes the decimal part of x). This representation is not unique. For example, you can also write 1.3 = 2 + (-0.7). There are several ways to produce the integer part of a number, depending on whether you want to round up, round down, round towards zero, or use some alternative rounding method.

Just as each rounding method defines the integer part of a number, so, too, does it define the fractional part. If [x] denotes the integer part of x (by whatever rounding method you choose), then the fractional part is defined by frac(x) = x - [x]. For some choices of the rounding function (for example, FLOOR), the fractional part is nonnegative for all x. For other choices (for example, INT), the sign of the fractional part depends on the sign of x.

In applications, two common representations are as follows:

  • Round x towards negative infinity. The fractional part of x is always positive. You can round towards negative infinity by using the FLOOR function in a computer language. For example, if x = -1.3, then FLOOR(x) is -2 and the fractional part is 0.7.
  • Round x towards zero. The fractional part of x always has the same sign as x. You can round towards zero by using the INT function. For example, if x = -1.3, then INT(x) is -1 and the fractional part is -0.3.

Here is an interesting fact: for the second method (INT), you can compute the fractional part directly by using the MOD function in SAS. The expression MOD(x,1) returns the signed fractional part of a number because the MOD function in SAS returns a result that has the same sign as x. This is useful when you are interested only in the fractional portion of a number.

The following DATA step implements both common methods for representing a number as an integer and a fractional part. Notice the use of the MOD function for the second method:

data Fractional;
input x @@;
/* Case 1: x = floor(x) + frac1(x) where frac1(x) >= 0 */
Floor = floor(x);
Frac1 = x - Floor;                  /* always positive */
 
/* Case 2: x = int(x) + frac2(x) where frac2(x) has the same sign as x */
Int = int(x);
Frac2 = mod(x,1);                   /* always same sign as x */
label Floor = 'Floor(x)' Int='Int(x)' Frac2='Mod(x,1)';
datalines;
-2 -1.8 -1.3 -0.7 0 0.6 1.2 1.5 2
;
 
proc print data=Fractional noobs label;
run;

The table shows values for a few positive and negative values of x. The second and third columns represent the number as x = Floor(x) + Frac1. The fourth and fifth columns represent the number as x = Int(x) + Mod(x,1). For non-negative values of x, the two methods are equivalent. When x can be either positive or negative, I often find that the second representation is easier to work with.

In statistics, the fractional part of a number is used in the definition of sample estimates for percentiles and quantiles. SAS supports five different definitions of quantiles, some of which look at the fractional part of a number to decide how to estimate a percentile.
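
For example, the following sketch (which uses Sashelp.Cars only as convenient example data) selects the fourth definition by using the PCTLDEF= option and writes two sample percentiles to a data set:

proc univariate data=Sashelp.Cars pctldef=4 noprint;
   var EngineSize;
   output out=Pctls pctlpts=30 70 pctlpre=P;  /* creates variables P30 and P70 */
run;
 
proc print data=Pctls noobs; run;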

The post Find the fractional part of a number appeared first on The DO Loop.

February 5, 2020
 

One of the first and most important steps in analyzing data, whether for descriptive or inferential statistical tasks, is to check for possible errors in your data. In my book, Cody's Data Cleaning Techniques Using SAS, Third Edition, I describe a macro called %Auto_Outliers. This macro allows you to search for possible data errors in one or more variables with a simple macro call.

Example Statistics

To demonstrate how useful and necessary it is to check your data before starting your analysis, take a look at the statistics on heart rate from a data set called Patients (in the Clean library) that contains an ID variable (Patno) and another variable representing heart rate (HR). This is one of the data sets I used in my book to demonstrate data cleaning techniques. Here is output from PROC MEANS:

The mean of 79 seems a bit high for normal adults, but the standard deviation is clearly too large. As you will see later in the example, there was one person with a heart rate of 90.0 but the value was entered as 900 by mistake (shown as the maximum value in the output). A severe outlier can have a strong effect on the mean but an even stronger effect on the standard deviation. If you recall, one step in computing a standard deviation is to subtract each value from the mean and square that difference. This causes an outlier to have a huge effect on the standard deviation.
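
To see the effect for yourself, here is a small made-up example (the heart-rate values below are invented and are not the data from the book). Run it as-is, then correct the 900 to 90 and compare the mean and standard deviation:

data HR_demo;
   input HR @@;
   datalines;
68 72 75 79 81 84 88 900
;
 
proc means data=HR_demo n mean std maxdec=1;
   var HR;
run;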

Macro

Let's run the %Auto_Outliers macro on this data set to check for possible outliers (that may or may not be errors).

Here is the call:

%Auto_Outliers(Dsn=Clean.Patients,
               Id=Patno,
               Var_List=HR SBP DBP,
               Trim=.1,
               N_Sd=2.5)

This macro call is looking for possible errors in three variables (HR, SBP, and DBP); however, we will only look at HR for this example. Setting the value of Trim equal to .1 specifies that you want to remove the top and bottom 10% of the data values before computing the mean and standard deviation. The value of N_Sd (number of standard deviations) specifies that you want to list any heart rate beyond 2.5 trimmed standard deviations from the mean.
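
Although the macro performs the trimming internally, you can see the idea by using the TRIMMED= option in PROC UNIVARIATE. The following sketch assumes the Clean.Patients data set from the book is available; it is not the macro's actual code:

proc univariate data=Clean.Patients trimmed=0.1;
   var HR;
   ods select TrimmedMeans;   /* 10% trimmed mean vs. the ordinary mean */
run;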

Result

Here is the result:

After checking each value that the macro flagged, it turned out that every flagged value except the one for patient 003 (HR = 56) was a data error. Let's see the mean and standard deviation after these erroneous data points are removed.

Notice that the mean is now 71.3 and the standard deviation is 11.5. You can see why it is so important to check your data before performing any analysis.

You can download this macro and all the other macros in my data cleaning book by going to support.sas.com/cody. Scroll down to Cody's Data Cleaning Techniques Using SAS, and click on the link named "Example Code and Data." This will download a file containing all the programs, macros, and data files from the book.  By the way, you can do this with any of my books published by SAS Press, and it is FREE!

Let me know if you have questions in the comments section, and may your data always be clean! To learn more about SAS Press, check out up-and-coming titles, and subscribe to the newsletter to receive exclusive discounts.

Finding Possible Data Errors Using the %Auto_Outliers Macro was published on SAS Users.

February 5, 2020
 

A SAS programmer wanted to create a graph that illustrates how Deming regression differs from ordinary least squares regression. The main idea is shown in the panel of graphs below.

  • The first graph shows the geometry of least squares regression when we regress Y onto X. ("Regress Y onto X" means "use values of X to predict Y.") The residuals for the model are displayed as vectors that show how the observations are projected onto the regression line. The projection is vertical when we regress Y onto X.
  • The second graph shows the geometry when we regress X onto Y. The projection is horizontal.
  • The third graph shows the perpendicular projection of both X and Y onto the identity line. This is the geometry of Deming regression.

This article answers the following two questions:

  1. Given any line and any point in the plane, how do you find the location on the line that is closest to the point? This location is the perpendicular projection of the point onto the line.
  2. How do you use the SGPLOT procedure in SAS to create the graphs that show the projections of points onto lines?

The data for the examples are shown below:

data Have;
input x y @@;
datalines;
0.5 0.6   0.6 1.4   1.4 3.0   1.7 1.4   2.2 1.7
2.4 2.1   2.4 2.4   3.0 3.3   3.1 2.5 
;

The projection of a point onto a line

Assume that you know the slope and intercept of a line: y = m*x + b. You can use calculus to show that the projection of the point (x0, y0) onto the line is the point (xL, yL), where
xL = (x0 + m*(y0 - b)) / (1 + m²) and yL = m*xL + b.

To derive this formula, you need to solve for the point on the line that minimizes the distance from (x0, y0) to the line. Let (x, m*x + b) be any point on the line. We want to find a value of x so that the distance from (x0, y0) to (x, m*x + b) is minimized. The solution that minimizes the distance also minimizes the squared distance, so define the squared-distance function
f(x) = (x - x0)² + (m*x + b - y0)².
To find the location of the minimum for this function, set the derivative equal to zero and solve for the value of x:

  • f'(x) = 2*(x - x0) + 2*m*(m*x + b - y0)
  • Set f'(x) = 0 and solve for x: (1 + m²)*x = x0 + m*(y0 - b), so the solution is xL = (x0 + m*(y0 - b)) / (1 + m²), which minimizes the distance from the point to the line.
  • Plug xL into the formula for the line to find the corresponding vertical coordinate on the line: yL = m * xL + b.

You can use the previous formulas to write a simple SAS DATA step that projects each observation onto a specified line. (For convenience, I put the value of the slope (m) and intercept (b) into macro variables.) The following DATA step projects a set of points onto the line y = m*x + b. You can use PROC SGPLOT to create a scatter plot of the observations. Use the VECTOR statement to draw the projections of the points onto the line.

/* projection onto general line of the form y = &m*x + &b */
%let b = 0.4;
%let m = 0.8;
data Want;
set Have;
xL = (x + &m *(y - &b)) / (1 + &m**2);
yL = &m * xL + &b;
run;
 
title "Projection onto Line y=&m x + &b";
proc sgplot data=Want aspect=1 noautolegend;
   scatter x=x y=y;
   vector x=xL y=yL / xorigin=x yorigin=y; /* use the NOARROWHEADS option to suppress the arrow heads */
   lineparm x=0 y=&b slope=&m / lineattrs=(color=black);
   xaxis grid; yaxis grid;
run;

You can get the graph for Deming regression by setting b=0 and m=1 in the previous formulas and program.
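
For example, a minimal change to the macro variables produces the Deming version:

%let b = 0;   /* identity line y = x */
%let m = 1;
/* rerun the DATA step and the PROC SGPLOT call above to see the perpendicular projections */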

In summary, you can use that math you learned in high school to find the perpendicular projection of a point onto a line. You can then use the VECTOR statement in PROC SGPLOT in SAS to create a graph that illustrates the projection. Such a graph is useful for comparing different kinds of regressions, such as comparing least-squares and Deming regression.

The post Visualize residual projections for linear regression appeared first on The DO Loop.

February 4, 2020
 

Cancer touches nearly everyone. You probably know at least one person who's been diagnosed with cancer -- many of us know many more than one. It's the second leading cause of death worldwide, behind cardiovascular disease. World Cancer Day is observed this year on February 4, and is meant to [...]

Dispelling cancer myths -- by the numbers was published on SAS Voices by Mary Osborne

February 3, 2020
 

Almost everyone enjoys a good glass of wine after a long day, but did you ever stop to wonder how the exact bottle you're looking for makes its way to the grocery store shelf? Analytics has a lot to do with it, as SAS demonstrated to attendees at the National [...]

How analytics takes wine from grape to grocery was published on SAS Voices by Charlie Chase

February 3, 2020
 

Recently someone on social media asked, "how can I compute the required sample size for a binomial test?" I assume from the question that the researcher was designing an experiment to compare the proportions between two groups, such as a control group and a treatment/intervention group. They wanted to know how big they should make each group.

It's a great question, and it highlights one of the differences between statistics and machine learning. Statistics does not merely analyze data after they are collected. It also helps you to design experiments. Without statistics, a researcher might assign some number of subjects to each treatment group, cross his fingers for luck, and hope that the difference between the groups will be significant. With statistics, you can determine in advance how many subjects you need to detect a specified difference between the groups. This can save time and money: having too many subjects is needlessly expensive; having too few does not provide enough data to confidently answer the research question. To estimate the group size, you must have some prior knowledge (or estimate) of the size of the effect you are trying to measure. You can use smaller groups if you are trying to detect a large effect; you need larger groups to detect a small effect.

Researchers use power and sample size computations to address these issues. "Sample size" is self-explanatory. "Power" is the probability that a statistical test will reject the null hypothesis when the alternative hypothesis is true. In general, the power of a test increases with the sample size. More power means fewer Type II errors (fewer "false negatives").

In SAS, there are several ways to perform computations related to power and sample size, but the one that provides information about binomial proportions is PROC POWER. This article shows how to use PROC POWER to determine the sample size that you need for a binomial test for proportions in two independent samples.

Sample size to detect difference in proportion

Many US states have end-of-course (EOC) assessments, which help school administrators measure how well students are mastering fundamental topics such as reading and math. I read about a certain school district in which only 31% of high school students are passing the algebra EOC assessment. Suppose a company wants to sell the district software that it claims will boost student passing by two percentage points (to 33%). The administrators are interested, but the software is expensive, so they decide to conduct a pilot study to investigate the company's claim. How big must the study be?

The researchers need to calculate how many students are needed to detect a difference in the proportion of 2% (0.02). Before we do any calculations, what does your intuition say? Would 100 students in each group be enough? Would 500 students be enough? The TWOSAMPLEFREQ statement in the POWER procedure in SAS can help answer that question.

PROC POWER and a one-sided test for proportion

The POWER procedure can compute power and sample size for more than a dozen common statistical tests. In addition, you can specify multiple estimates of the parameters in the problem (for example, the true proportions) to see how sensitive the results are to your assumptions. You can also specify whether to perform a two-sided test (the default), a one-sided test, or tests for superiority or inferiority. (For information about inferiority and superiority testing, see Castelloe and Watts (2015).)

Let's analyze the results by using a one-tailed chi-square test for the difference between two proportions (from independent samples). The null hypothesis is that the control group and the "Software" group each pass the EOC test 31% of the time. The alternative hypothesis is that a higher proportion of the software group passes the test.

You can use the TWOSAMPLEFREQ statement in the POWER procedure to determine the sample sizes required to give 80% power to detect a proportion difference of at least 0.02. You use a missing value (.) to specify the parameter that the procedure should solve for, which in this case is the number of subjects in each treatment group (NPERGROUP). The GROUPPROPORTIONS= option specifies the hypothesized proportions, or you can use the REFPROPORTION= and PROPORTIONDIFF= options. The following call to PROC POWER solves for the sample size in a balanced experiment with two groups:

proc power; 
  twosamplefreq test=FM
  groupproportions = (0.31  0.33) /* OR: refproportion=0.31 proportiondiff=0.02 */
  power = 0.8
  alpha = 0.05
  npergroup = . 
  sides = 1;
run;

The output indicates that the school district needs 6,726 students in each group in order to verify the company's claims with 80% power! I was surprised by this number, which is much bigger than my intuition suggested!

The power for various sample sizes

What if I hadn't used PROC POWER? What if I had just assigned 1,000 to each group? Or 2,000 students? What power would the test of proportions have to detect the small difference of proportion (0.02), if it exists? PROC POWER can answer that question, too. It turns out that a sample size of N=1000 only results in 0.25 power and a sample size of N=2000 only results in a power of 0.39. A pilot study based on those smaller samples is a waste of time and money because the study isn't large enough to detect the small effect that the company claims.
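
Those power values come from a computation like the following sketch, which assumes the one-sided Pearson chi-square test and alpha=0.05 used for the plot below; other test options would give slightly different numbers:

proc power; 
  twosamplefreq test=pchi
  refproportion = 0.31 proportiondiff = 0.02
  power = .
  alpha = 0.05
  npergroup = 1000 2000
  sides = 1;
run;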

PROC POWER makes it easy to create a graph that plots the power of the binomial test for proportions against the sample size for a range of samples. You can also display the power for a range of sample sizes, as follows:

proc power; 
  twosamplefreq test=pchi
  refproportion = 0.31 proportiondiff = 0.02
  power = .
  alpha = 0.05
  npergroup = 250 to 8000 by 250 
  sides = 1;
  plot x=n xopts=(ref=1000 2000 5000 6726 crossref=yes);
  ods exclude output;
run;
Power curve showing power as a function of sample size for a (one-sided) test of two proportions

The graph shows that samples that have 1,000 or even 2,000 students in each group do not have enough power to detect a small difference of proportion (0.02) with any confidence. Only for N ≥ 5,000 does the power of the test start to approach reasonable levels.

Quick check: Simulate a sample

Whenever I see a counterintuitive result, I like to run a quick simulation to see whether the simulation agrees with the analysis. The following program generates a random sample from two groups of size N=1,000. The control group has a 31% chance of passing the test; the "Software" group has a 33% chance. A PROC FREQ analysis for the difference in proportions indicates that the empirical difference between the groups is about 0.02, but the p-value for the one-sided test is 0.18, which does not enable you to conclude that there is a significant difference between the proportions of the two groups.

/* simulation of power */
%let N = 1000;             /* group sizes*/
%let p = 0.31;             /* reference proportion */
%let delta = 0.02;         /* true size of difference in proportions */
 
data PowerSim(drop=i);
call streaminit(321);
do i = 1 to &N;
   c='Control '; pass = rand("Bernoulli", &p);        output; /* x ~ Bern(p) */
   c='Software'; pass = rand("Bernoulli", &p+&Delta); output; /* x ~ Bern(p+delta) */
end;
run;
 
proc freq data=PowerSim;
tables c*pass / chisq riskdiff(equal var=null cl=wald) /* Wald test for equality of proportions */
             nocum norow nopct nocol;
run;

Part of the output from PROC FREQ is shown. The output displays a typical two-way frequency table for the simulated experiment. You can see that the raw number of students that pass/fail the test are very similar. In healthcare applications, binomial proportions often correspond to "risks," so a "risk difference" is a difference in proportions. The RISKDIFF option tests whether the difference of proportions (risks) is zero. The output from the test shows both the two-sided and one-sided results. For these simulated data, there is insufficient evidence to reject the null hypothesis of no difference.

If you change the value of the macro variable N to a larger number (such as N = 6,726), it is increasingly likely that the chi-square test will be able to conclude that there is a significant difference between the proportions. In fact, for N = 6,726, you would expect 80% of the simulated samples to correctly reject the null hypothesis. Simulation is a way to create a power-by-sample-size curve even when there is not an explicit formula that relates the two quantities.
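
One way to pursue that idea is sketched below; this is not code from the original post. Simulate many samples by using a BY variable, run the chi-square test for each sample, and estimate the power as the fraction of samples that reject the null hypothesis. (This sketch uses the two-sided Pearson chi-square p-value, so the estimate will be somewhat lower than the power of the one-sided test.)

/* reuses the &N, &p, and &delta macro variables defined above */
%let NumSim = 100;                      /* number of simulated samples */
data PowerSimMany(drop=i);
call streaminit(321);
do SampleID = 1 to &NumSim;
   do i = 1 to &N;
      c='Control '; pass = rand("Bernoulli", &p);         output;
      c='Software'; pass = rand("Bernoulli", &p+&delta);  output;
   end;
end;
run;
 
ods exclude all;                        /* suppress displayed output */
proc freq data=PowerSimMany;
   by SampleID;
   tables c*pass / chisq;
   ods output ChiSq=ChiSqStats;         /* capture the chi-square tests */
run;
ods exclude none;
 
data Rejections;
   set ChiSqStats(where=(Statistic="Chi-Square"));
   Reject = (Prob < 0.05);              /* 1 if this sample rejects H0 */
run;
 
proc means data=Rejections mean;        /* mean of Reject = estimated power */
   var Reject;
run;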

In summary, the statistical concepts of power and sample size can help researchers plan their experiments. This is a powerful idea! (Pun intended!) It can help you know in advance how large your samples should be when you need to detect a small effect.

The post What sample size do you need for a binomial test of proportions? appeared first on The DO Loop.

January 31, 2020
 

Everyone is talking about artificial intelligence (AI) and how it affects our lives -- there are even AI toothbrushes! But how do businesses use AI to help them compete in the market? According to Gartner research, only half of all AI projects are deployed and 90% take more than three [...]

Driving faster value from analytics – how to deploy models and decisions quickly was published on SAS Voices by Janice Newell