As of version 6.1, SAS Enterprise Miner (SAS/EM) does not report the PSI (Population Stability Index) in its Scorecard node, but the statistic can be computed another way. One solution is the SAS code below, from rpruitt@premierbankcard.com:

/*************************************************************************/

/* This program calculates PSI (Population Stability Index) Statistic */

/* It was originally sent to PREMIER (Jay Kosters) per his request on */

/* 4/2/2009. */

/* Dan Kelly, SAS Institute, provided an example of code with various */

/* SAS Code options. */

/* Jay asked Rex to translate the SAS Code and refine it for use by */

/* PREMIER */

/* Programming was completed between 4/10 & 4/15, 2009 */

/*************************************************************************/

/* Dan Kelly's ancillary instructions: */

/* So a few obvious questions that come up are "how do you define the */

/* buckets" and "how many buckets do I need"? And "what are sample 1 */

/* and sample 2"? */

/* If sample 1 and sample 2 are different months (as you have) then you */

/* just need the bucket definition. */

/* */

/* Most of the time I think people use this on the scores, not the */

/* individual attributes that comprise the score. There's nothing */

/* to stop you from testing whether x1 drifts from month to month, */

/* or x2, or x3, ... */

/* */

/* For the most part when I see people use this they are just looking at */

/* whether the distribution of the score is fairly stable. */

/* */

/* I used 10 buckets just because I like the word "decile"; */

/* often people use "demidecile" for 20 5% buckets. */

/* */

/* Finally, your cutoffs (.1, .25...) sound like what I usually hear. */

/* SAS Global Forum 2010 Posters */

/* This statistic is basically (I think) a divergence type statistic, */

/* like the Information value. So any cutoff that seems reasonable for */

/* those types of stats is probably reasonable here as well. */

/* */

/* You can change the distribution of MODELVAR in one of the data sets */

/* and see what that does to the PSI in the last printout to get a feel */

/* for what kind of differences in the distribution make what kind of */

/* difference in the work. */

/*************************************************************************/

/* Per Jay Kosters' research, a score of <= 0.1 indicates little change, */

/* 0.1 - 0.25 is a change too small to be conclusive, and > 0.25 is */

/* a significant shift. */

/*************************************************************************/

/*************************************************************************/

/* These Macro variables must be changed to represent the PSI Variable */

/* (MODELVAR), PSI Output Library (PSILibrary) for storage of the ODS */

/* Output, Source Data representing the original data file name of the */

/* population being measured for stability (SourceData1), and the */

/* current population file name being used to identify possible */

/* divergence (SourceData2). */

/*************************************************************************/

/*insert the model variable (Interval ONLY) on this line*/

%Let MODELVAR=Receivables;

/*insert the PSI Output Data Library on this line*/

%Let PSILibrary=\\pbidelprd042\DM_Inputs\rpruitt\PSIResults;

/*insert the original population File Name on this line*/

%Let SourceData1=EMWS.Ids_DATA;

/*insert the current population File Name on this line*/

%Let SourceData2=EMWS.Ids4_DATA;

/**********************************************************************/

/* BEGIN Steps to get the data samples for the periods being compared */

LIBNAME PSI "&PSILibrary";

DATA PSI.PSISample1;

SET &SourceData1

(Keep=&MODELVAR)

;

Format &MODELVAR 12.2;

/******************************************************************/

/* This is where you can place more SAS statements to modify your */

/* PSI Variable so it accurately represents the format and value */

/* in your model. */

/******************************************************************/

RUN;

DATA PSI.PSISample2;

SET &SourceData2

(Keep=&MODELVAR)

;

Format &MODELVAR 12.2;

/******************************************************************/

/* This is where you can place more SAS statements to modify your */

/* PSI Variable so it accurately represents the format and value */

/* in your model. */

/******************************************************************/

RUN;

/* END Steps to get the data samples for the periods being compared */

/********************************************************************/

/**********************************/

/*BEGIN establish ODS Output File */

ODS Listing Close;

ODS HTML

Style=default

File="&PSILibrary\PSICode&MODELVAR..htm"

;

Title2 "PSI (Population Stability Index) Calculations for &MODELVAR";

/**************************/

/* BEGIN PSI Calculations */

/************************************/

/* BEGIN break Sample1 into bins */

/* BEGIN Sorting & Ranking process */

Proc Means Noprint Data=PSI.PSISample1 ;

Output

Out=PSI.RankedTotal (rename=(_freq_=RankedTotal))

;

run;

Data _Null_;

Set PSI.RankedTotal (Where=(_Type_=0));

Call Symput('RankedTotal',RankedTotal);

run;

Proc Means Noprint Data=PSI.PSISample2;

Output

Out=PSI.RankedTotal2 (rename=(_freq_=RankedTotal2))

;

run;

Data _Null_;

Set PSI.RankedTotal2 (Where=(_Type_=0));

Call Symput('RankedTotal2',RankedTotal2);

run;

Proc Sort

Data=PSI.PSISample1;

By &MODELVAR;

run;

Proc Sort

Data=PSI.PSISample2;

By &MODELVAR;

run;

/*********************************************************************/

/*BEGIN Use the Program Data Vector to override the binning of Zero's*/

Data PSI.PSISample1 (Keep=BinVar);

Set PSI.PSISample1;

BinVar=Sum(&MODELVAR,(_n_/&RankedTotal));

run;

Data PSI.PSISample2 (Keep=BinVar);

Set PSI.PSISample2;

BinVar=Sum(&MODELVAR,(_n_/&RankedTotal2));

run;

/*END Use the Program Data Vector to override the binning of Zero's*/

/*******************************************************************/

Proc Sort

Data=PSI.PSISample1;

By BinVar;

run;

Proc Sort

Data=PSI.PSISample2;

By BinVar;

run;

Proc Format;

Value DecileF

Low-0='00'

0-.1='01'

.1-.2='02'

.2-.3='03'

.3-.4='04'

.4-.5='05'

.5-.6='06'

.6-.7='07'

.7-.8='08'

.8-.9='09'

.9-1='10'

.='11'

;

Value DemiDecileF

Low-0='00'

0-.05='01'

.05-.1='02'

.1-.15='03'

.15-.2='04'

.2-.25='05'

.25-.3='06'

.3-.35='07'

.35-.4='08'

.4-.45='09'

.45-.5='10'

.5-.55='11'

.55-.6='12'

.6-.65='13'

.65-.7='14'

.7-.75='15'

.75-.8='16'

.8-.85='17'

.85-.9='18'

.9-.95='19'

.95-1='20'

.='21'

;

Value ZeroMiss

0='Zero'

11='Missing'

21='Missing'

;

run;

Data PSI.PSISample1;

Length decile 8.;

Set PSI.PSISample1;

Rank=_n_/&RankedTotal;

Decile=Put(Rank,DecileF.);

run;

/* END Sorting & Ranking process */

/* END break Sample1 into 10 bins */

/**********************************/

/*********************************************************************/

/* BEGIN you can see they are 10 equally sized bins with no ties in */

/* the output of this step. */

proc freq data=PSI.PSISample1;

tables decile / out=PSI.out1;

Title3 'Base-Line Sample Frequency By Decile Bin (Data=PSISample1)';

run;

/* END you can see they are 10 equally sized bins with no ties in */

/* the output of this step. */

/*********************************************************************/

/******************************************************/

/* BEGIN Calculate how the deciles are defined on the */

/* Supplied Variable (MODELVAR) scale */

/* so I want MAX(MODELVAR) in each decile */

proc means data=PSI.PSISample1 nway;

class decile;

var BinVar;

output out=PSI.endpoints max=maxVar;

Title3 'Base-Line Sample Mean, Max & Min Values (Data=PSISample1)';

run;

/* END Calculate how the deciles are defined on the */

/* Supplied Variable (MODELVAR) scale */

/* so I want MAX(MODELVAR) in each decile */

/******************************************************/

/*****************************************************************************/

/* BEGIN Data Step to write code that applies the above decile definition to */

/* the data set with MODELVAR on it */

data _NULL_;

set PSI.endpoints end=last;

file "&PSILibrary\decileSample1.sas";

if _N_ = 1 then put " select;";

put " when (BinVar le " maxVar ") decile = " decile ";" ;

if last then do ;

put " otherwise decile = " decile ";" ;

put "end;";

call symput('maxbin',decile);

end;

run;

data PSI.PSISample2;

set PSI.PSISample2;

%inc "&PSILibrary\decileSample1.sas" / source;

If BinVar=. Then decile=&maxbin;

run;

/* END Data Step to write code that applies the above decile definition to */

/* the data set with MODELVAR on it */

/*********************************************************************/

/*********************************************************************/

/* BEGIN Use the same definition for the buckets to establish how */

/* much data falls in each group for the sample 2 */

proc freq data=PSI.PSISample2;

tables decile / out=PSI.out2;

Title3 'Current Sample Frequency By Decile Bin (Data=PSISample2)';

run;

/* END Use the same definition for the buckets to establish how */

/* much data falls in each group for the sample 2 */

/*********************************************************************/

/************************************************************************************/

/* BEGIN put the % fields on the same file and calculate the terms that make up PSI */

data PSI.PSICompare;

merge PSI.out1 PSI.out2(rename=(percent=percent2));

by decile;

psi = log(percent/percent2)*(percent-percent2)/100;

run;

proc print data=PSI.PSICompare noobs;

var dec: per:;

Format decile ZeroMiss.;

sum psi;

Title3 "NOTE: PSI Calc Accommodates the Binning of Zero And Missing";

run;

/* END put the % fields on the same file and calculate the terms that make up PSI */

/**********************************************************************************/

/* END PSI Calculations */

/************************/

ODS _ALL_ Close;

ODS Listing;

/*END establish ODS Output File */

/********************************/

**From Rex Pruitt, PREMIER Bankcard LLC, Sioux Falls, SD**
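
For reference, the quantity that the `PSICompare` step adds up can be written out as follows. This is a reading of the `psi = log(percent/percent2)*(percent-percent2)/100` line in the program, not a formula quoted from the original poster. The notation is introduced here for illustration: $p_{1,i}$ and $p_{2,i}$ stand for the `percent` and `percent2` values of bin $i$ (on a 0 to 100 scale, as PROC FREQ reports them), and $B$ is the number of bins.

$$
\mathrm{PSI} \;=\; \sum_{i=1}^{B} \ln\!\left(\frac{p_{1,i}}{p_{2,i}}\right)\cdot\frac{p_{1,i}-p_{2,i}}{100}
$$

As a hypothetical three-bin illustration with made-up numbers, baseline shares of 50%, 30% and 20% against current shares of 40%, 30% and 30% give

$$
\ln\!\frac{50}{40}\cdot\frac{10}{100} \;+\; \ln\!\frac{30}{30}\cdot\frac{0}{100} \;+\; \ln\!\frac{20}{30}\cdot\frac{-10}{100} \;\approx\; 0.0223 + 0 + 0.0405 \;\approx\; 0.063
$$

which is below the 0.1 cutoff quoted in the program header. Each term is non-negative because the logarithm and the difference always have the same sign. A bin that appears in only one of the two PROC FREQ outputs has a missing percentage after the merge, so its term is missing and is left out of the printed sum.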
