Feb 20, 2019
 

Last year I published a series of blog posts about how to create a calibration plot in SAS. A calibration plot is a way to assess the goodness of fit for a logistic model. It is a diagnostic graph that enables you to qualitatively compare a model's predicted probability of an event to the empirical probability. I am happy to report that in SAS/STAT 15.1 (SAS 9.4M6), you can create a calibration plot automatically by using the PLOTS=CALIBRATION option on the PROC LOGISTIC statement.

Calibration plots for a model of a binary response

To demonstrate how to create a calibration plot by using PROC LOGISTIC, consider the simulated data that I analyzed in "Calibration plots in SAS." The data contain a binary response variable, Y, which depends quadratically on a uniformly distributed explanatory variable, X. The following call to PROC LOGISTIC fits a quadratic model to the data. The new GOF option requests an extensive set of goodness-of-fit statistics, and the PLOTS=CALIBRATION option requests a calibration plot:

/* NEW in SAS/STAT 15.1 (SAS 9.4M6): PLOTS=CALIBRATION option in PROC LOGISTIC */
title "Calibration Plot for a Quadratic Model";
title2 "Created by PROC LOGISTIC";
proc logistic data=LogiSim plots=calibration(CLM ShowObs);
   model y(Event='1') = x x*x / GOF;      /* New in 15.1: More goodness-of-fit statistics */
run;
Calibration plot for a quadratic logistic model, created by PROC LOGISTIC in SAS

The calibration plot is shown. The plot contains a gray diagonal line, which represents perfect calibration, and a blue curve, which is a loess fit of the observed responses versus the predicted probabilities. If most of the predicted responses agree with the observed responses, then the blue curve should be close to the diagonal line. That is the case in this example. The light blue band is a 95% confidence region for the loess fit and is created by using the CLM option.

Because I used the SHOWOBS option, the calibration plot displays tiny histograms along the top and bottom of the plot. The histograms indicate the distribution of the Y=0 and Y=1 responses. The article "Use a fringe plot to visualize binary data in logistic models" explains more about how fringe plots can add insight to graphs that involve a binary response variable.

The lower right corner of the calibration plot contains one of the many goodness-of-fit statistics that are computed when you use the GOF option on the MODEL statement. A small p-value would indicate a lack of fit. In this case, there is no reason to suspect a lack of fit. The following table shows other goodness-of-fit tests. None of the p-values are small, so none of the tests indicate lack of fit.

Goodness-of-fit statistics for a quadratic logistic model, created by PROC LOGISTIC in SAS

Calibration plots for a polytomous response

An exciting feature of the calibration plots in PROC LOGISTIC is that you can use them for a polytomous response model. Derr (2013) fits a proportional odds model that predicts the probability of the severity of black-lung disease from the length of exposure to coal dust in 371 coal miners. The response variable, Severity, has the levels 'Severe', 'Moderate', and 'Normal'. The following statements create the data, fit the model, and request calibration plots for the model.

/* Data, from McCullagh and Nelder (1989, p. 179), used in Derr (2013, pp. 8-10).
   The severity of pneumoconiosis (black lung disease) in coal miners
   and the number of years of exposure.
*/
data Coal; 
input Severity $ @@; 
do i=1 to 8; 
   input Exposure freq @@; 
   log10Exposure=log10(Exposure); 
   output; 
end; 
datalines; 
Normal   5.8 98 15 51 21.5 34 27.5 35 33.5 32 39.5 23 46 12 51.5 4 
Moderate 5.8  0 15  2 21.5  6 27.5  5 33.5 10 39.5  7 46  6 51.5 2 
Severe   5.8  0 15  1 21.5  3 27.5  8 33.5  9 39.5  8 46 10 51.5 5 
;
 
title 'Severity of Black Lung vs Log10(Years Exposure)';
proc logistic data=Coal rorder=data plots=Calibration(CLM);
   freq freq; 
   model Severity(descending) = log10Exposure; 
   effectplot / noobs individual;
run;
Panel of calibration plots for a polytomous proportional-odds model, created by PROC LOGISTIC in SAS

Derr (2013) discusses the results of the analysis, which are not shown here. I've displayed only the calibration plot for the model. Notice that PROC LOGISTIC creates a panel of three calibration plots, one for each response level. The calibration curves all lie close to the diagonal, so the diagnostic plots do not indicate a lack of calibration for any part of the model.

Summary

In summary, the PLOTS=CALIBRATION option on the PROC LOGISTIC statement in SAS/STAT 15.1 enables you to automatically create a calibration plot, which is a diagnostic plot that qualitatively compares a model's predicted and empirical probabilities. The CALIBRATION option supports several suboptions, which you can read about in the documentation for the PROC LOGISTIC statement.

You can download the SAS code used in this article, which includes SAS code that demonstrates how to create a calibration plot manually.
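For readers on an earlier release, the manual approach can be sketched roughly as follows. This is a minimal sketch, not the downloadable program: it assumes that the LogiSim data set from the example above exists and that Y is a numeric 0/1 variable. The names LogiOut and PredProb are chosen here for illustration. The idea is to save the predicted probabilities and then overlay a loess fit of the observed response on a diagonal reference line.

```sas
/* Sketch: assumes LogiSim exists and Y is numeric with values 0 and 1 */
proc logistic data=LogiSim noprint;
   model y(Event='1') = x x*x;
   output out=LogiOut predicted=PredProb;   /* PredProb is an illustrative name */
run;

title "Calibration Plot for a Quadratic Model";
title2 "Created Manually by Using a Loess Fit";
proc sgplot data=LogiOut noautolegend aspect=1;
   /* loess fit of the observed 0/1 response versus the predicted probability */
   loess x=PredProb y=y / interpolation=cubic clm nomarkers;
   /* diagonal reference line that represents perfect calibration */
   lineparm x=0 y=0 slope=1 / lineattrs=(color=gray pattern=dash);
   xaxis min=0 max=1 label="Predicted Probability";
   yaxis min=0 max=1 label="Observed Proportion";
run;
```

The smoothing parameter that SGPLOT selects need not match the one that PROC LOGISTIC uses, so the curve might differ slightly from the automatic plot.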

The post An easier way to create a calibration plot in SAS appeared first on The DO Loop.

Feb 19, 2019
 

When it comes to forecasting new product launches, executives say that it's a frustrating, almost futile, effort. The reason? Minimal data, limited analytic capabilities and a general uncertainty surrounding a new product launch. Not to mention the ever-changing marketplace. Nevertheless, companies cannot disregard the need for a new product forecast [...]

Practical approaches to new product forecasting using structured and unstructured data was published on SAS Voices by Charlie Chase

Feb 19, 2019
 

Artificial Intelligence (AI) has caught everyone's attention in recent years, mainly because of its disruptive nature, which gives it enormous potential with countless applications. Among the many possibilities that AI promises, customer experience (CX) is an area that offers immense opportunity for organisations to differentiate. Welcome to the experience economy [...]

Is artificial intelligence the future of customer experience? was published on Customer Intelligence Blog.

Feb 18, 2019
 
Maybe if we think and wish and hope and pray
It might come true.
Oh, wouldn't it be nice?

The Beach Boys

Months ago, I wrote about how to use the EFFECT statement in SAS to perform regression with restricted cubic splines. This is the modern way to use splines in a regression analysis in SAS, and it replaces the need to use older macros such as Frank Harrell's %RCSPLINE macro. I shared my blog post with a colleague at SAS and mentioned that the process could be simplified. In order to specify the placement of the knots as suggested by Harrell (Regression Modeling Strategies, 2010 and 2015), I had to use PROC UNIVARIATE to get the percentiles of the explanatory variable. "Wouldn't it be nice," I said, "if the EFFECT statement could perform that computation automatically?"

I am happy to report that the 15.1 release of SAS/STAT (SAS 9.4M6) includes a new option that makes it easy to place internal knots at percentiles of the data. You can now use the KNOTMETHOD=PERCENTILELIST option on the EFFECT statement to place knots. For example, the following statement places five internal knots at percentiles that are recommended in Harrell's book:
EFFECT spl = spline(x / knotmethod=percentilelist(5 27.5 50 72.5 95));

An example of using restricted cubic splines in regression in SAS

Restricted cubic splines are also called "natural cubic splines." This section shows how to perform a regression fit by using restricted cubic splines in SAS.

For the example, I use the same Sashelp.Cars data that I used in the previous article. For clarity, the following SAS DATA step renames the Weight and MPG_City variables to X and Y, respectively. If you want to graph the regression curve, you can sort the data by the X variable, but this step is not required to perform the regression.

/* create (X,Y) data from the Sashelp.Cars data. Sort by X for easy graphing. */
data Have;
   set sashelp.cars;
   rename mpg_city = Y  weight = X  model = ID;
run;
 
proc sort data=Have;  by X;  run;

The following call to PROC GLMSELECT includes an EFFECT statement that generates a natural cubic spline basis using internal knots placed at specified percentiles of the data. The MODEL statement fits the regression model and the OUTPUT statement writes an output data set that contains the predicted values. The SGPLOT procedure displays a graph of the regression curve overlaid on the data:

/* fit data by using restricted cubic splines using SAS/STAT 15.1 (SAS 9.4M6) */
ods select ANOVA ParameterEstimates SplineKnots;
proc glmselect data=Have;
   effect spl = spline(X / details naturalcubic basis=tpf(noint)
             knotmethod=percentilelist(5 27.5 50 72.5 95)); /* new in SAS/STAT 15.1 (SAS 9.4M6)  */
   model Y = spl / selection=none;       /* fit model by using spline effects */
   output out=SplineOut predicted=Fit;   /* output predicted values */
quit;
 
title "Restricted Cubic Spline Regression";
title2 "Five Knots Placed at Percentiles";
proc sgplot data=SplineOut noautolegend;
   scatter x=X y=Y;
   series x=X y=Fit / lineattrs=(thickness=3 color=red);
run;

In summary, the new KNOTMETHOD=PERCENTILELIST option on the EFFECT statement simplifies the process of using percentiles of a variable to place internal knots for a spline basis. The example shows knots placed at the 5th, 27.5th, 50th, 72.5th, and 95th percentiles of an explanatory variable. These heuristic values are recommended in Harrell's book. For more details about the EFFECT statement and how the location of knots affects the regression fit, see my previous article "Regression with restricted cubic splines in SAS."

You can download the complete SAS program that generates this example, which requires SAS/STAT 15.1 (SAS 9.4M6). If you have an earlier release of SAS, the program also shows how to perform the same computations by calling PROC UNIVARIATE to obtain the location of the knots.
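For an earlier release, the PROC UNIVARIATE approach might look something like the following. This is a minimal sketch, not the downloadable program: the data set name Pctl, the prefix p_, the macro variable knots, and the output data set SplineOut2 are all names chosen here for illustration. It assumes the Have data set from the example above.

```sas
/* Sketch: compute the percentiles of X and pass them to KNOTMETHOD=LIST */
proc univariate data=Have noprint;
   var X;
   output out=Pctl pctlpts=5 27.5 50 72.5 95 pctlpre=p_;
run;

/* Store the five percentiles in a macro variable as a blank-delimited list */
data _null_;
   set Pctl;
   call symputx('knots', catx(' ', of p_:));
run;

%put NOTE: Knots are located at &knots;

proc glmselect data=Have;
   effect spl = spline(X / details naturalcubic basis=tpf(noint)
                knotmethod=list(&knots));
   model Y = spl / selection=none;
   output out=SplineOut2 predicted=Fit;
quit;
```

PROC UNIVARIATE supports several percentile definitions (the PCTLDEF= option), so it is worth comparing the SplineKnots table from this sketch with the one from the PERCENTILELIST example before assuming that the knots are identical.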

The post An easier way to perform regression with restricted cubic splines in SAS appeared first on The DO Loop.

Feb 15, 2019
 
Beginning with SAS® 9.4, you can embed graphics output within HTML output using the ODS HTML5 destination. This technique works with SAS/GRAPH® procedures (such as GPLOT and GCHART), with SG procedures (such as SGPLOT and SGRENDER), and with other procedures that create graphics output when ODS Graphics is enabled. Most (if not all) existing web browsers support graphics output embedded in HTML5 output.

Note: The default graphics output format for the ODS HTML5 destination is Scalable Vector Graphics (SVG). SVG documents display clearly at any size in any viewer or browser that supports SVG. SVG files are therefore ideal for display on a computer monitor, PDA, or cell phone, and for printed documents. Because it's a vector graphic, a single SVG document can be transformed to any screen resolution without compromising the clarity of the document. Here's an example:

The same SVG graph, scaled at 90% and then at 200%. But 100% crisp!

SAS/GRAPH procedures

When you use the ODS HTML5 destination with a SAS/GRAPH procedure, specify a value of SVG, PNG, or JPEG for the DEVICE option in the GOPTIONS statement. The following sample PROC GPLOT code embeds SVG graphics inside the resulting HTML output:

goptions device=svg;
ods _all_ close;  
ods html5 path="c:\temp" file="svg_graph.html"; 
symbol1 i=none v=squarefilled; 
proc gplot data=sashelp.cars; 
  plot mpg_city * horsepower;   
  where make="Honda"; 
run;
quit;  
ods html5 close; 
ods preferences;

Note that the ODS PREFERENCES statement above resets the ODS environment back to its default settings when you use the SAS windowing environment.

When you use the PNG or JPEG device driver with the ODS HTML5 destination, add the BITMAP_MODE="INLINE" option to the ODS HTML5 statement. Here is an example:

goptions device=png;
ods _all_ close; 
ods html5 path="c:\temp" file="png_graph.html" options(bitmap_mode="inline");
symbol1 i=none v=squarefilled; 
proc gplot data=sashelp.cars; 
  plot mpg_city * horsepower;   
  where make="Honda"; 
run;
quit;  
ods html5 close; 
ods preferences;

ODS Graphics and SG procedures

When you use SG procedures and ODS Graphics, specify a value of SVG, PNG, or JPEG for the OUTPUTFMT option in the ODS GRAPHICS statement. The following sample code uses PROC SGPLOT to embed SVG graphics inside the HTML output with the ODS HTML5 destination:

ods _all_ close; 
ods html5 path="c:\temp" file="svg_graph.html"; 
ods graphics on / reset=all outputfmt=svg;
proc sgplot data=sashelp.cars; 
  scatter y=mpg_city x=horsepower / markerattrs=(size=9PT symbol=squarefilled);   
  where make="Honda"; 
run;
ods html5 close; 
ods preferences;  

The following sample code uses PROC SGPLOT to embed PNG graphics inside the HTML output with the ODS HTML5 destination:

ods _all_ close; 
ods html5 path="c:\temp" file="png_graph.html" options(bitmap_mode="inline");
ods graphics on / reset=all outputfmt=png;
proc sgplot data=sashelp.cars; 
  scatter y=mpg_city x=horsepower / markerattrs=(size=9PT symbol=squarefilled);   
  where make="Honda"; 
run;
ods html5 close; 
ods preferences; 

The technique above also works when you use the ODS GRAPHICS ON statement with other procedures that produce graphics output (such as the LIFETEST procedure).
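As a rough illustration of that point, the following sketch embeds an SVG survival plot from PROC LIFETEST. It is only an example under stated assumptions: the file name lifetest_graph.html is made up for illustration, and the analysis uses the Sashelp.BMT sample data (variables T, Status, and Group), which is not part of the examples above.

```sas
ods _all_ close;
ods html5 path="c:\temp" file="lifetest_graph.html";   /* illustrative file name */
ods graphics on / reset=all outputfmt=svg;
proc lifetest data=sashelp.bmt plots=survival;
  time T * Status(0);     /* Status=0 indicates a censored observation */
  strata Group;
run;
ods html5 close;
ods preferences;
```

To embed a PNG image instead, the same pattern as in the SGPLOT example should apply: specify OUTPUTFMT=PNG and add OPTIONS(BITMAP_MODE="INLINE") to the ODS HTML5 statement.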

Note that the ODS HTML5 destination supports the SAS Graphics Accelerator. The SAS Graphics Accelerator enables users with visual impairments or blindness to create, explore, and share data visualizations. It supports alternative presentations of data visualizations that include enhanced visual rendering, text descriptions, tabular data, and interactive sonification. Sonification uses non-speech audio to convey important information about the graph.

You can use the ODS HTML5 destination in most situations where you need to embed all of your output into a single HTML output location. Examples include emailing HTML output as an attachment and creating graphics output via a SAS stored process. If you currently use the ODS HTML destination, you might want to experiment with the ODS HTML5 destination to see whether it meets your needs even if you cannot completely switch to it yet.

Embed scalable graphics using the ODS HTML5 destination was published on SAS Users.

Feb 13, 2019
 

Every day, military intelligence analysts sit behind computers reading a never-ending stream of reports, updating presentation templates and writing assessments. But intelligence is more than documenting events and sharing breaking news. It involves understanding and predicting complexities in human behavior across various organizational constructs and using facets of information to [...]

NLP for military intelligence was published on SAS Voices by Mary Beth Moore

Feb 13, 2019
 

SAS has worked with our exam delivery partners to integrate a live lab into an exam, which can be delivered anywhere, anytime, on-demand.

The post New Performance-Based Certification: Write SAS Code During Your Exam appeared first on SAS Learning Post.