Some discriminant methods and SAS code

NOTICE: The following text is part of my published paper. Do not distribute it! All rights reserved.

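This post introduces linear discriminant analysis, two dimension-reduction discriminant methods for strongly collinear data, and SAS code that implements them.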

1, Linear discriminant analysis (LDA)
Because of its simplicity and robustness, LDA has been one of the most frequently used classification techniques since 1936.
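LDA:
/* ex1 = training set, ex2 = test set to be classified */
proc discrim data=ex1 testdata=ex2;
   class g;      /* g = group (class) variable */
   var x1-x10;   /* predictor variables */
run;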

2, Combination of PCA and LDA (PCA+LDA), or PLS and LDA (PLS+LDA)
Principal component analysis (PCA) is a fundamental method in chemometrics and is based on vector algebra. Its main purpose is to reduce the dimensionality of a data set with a large number of intercorrelated variables while retaining as much of the information in the original data as possible. It produces a new set of orthogonal variables, the principal components (PCs), which describe the variance in the data. The first few PCs usually retain most of the variation and thus capture the systematic information in all the original variables, so a small subset of PCs is typically used to explore trends among samples with different treatments. Furthermore, because the PCs are mutually uncorrelated, using them as the input variables for linear discriminant analysis (LDA) largely removes the multicollinearity among the original variables. Therefore, the combination of PCA and LDA (PCA+LDA) was used here for classification.

Principal component regression (PCR) is a multiple linear regression method that relates two sets of variables (the PCs and the response variables) for predictive purposes. Partial least squares (PLS) regression can be viewed as an extension of PCR that likewise relates two sets of variables through a regression model. In PLS, however, the components are extracted so that they are more strongly correlated with the response variables, which generally results in more effective prediction of the response. In the same way, the PLS components can be used in conjunction with LDA (PLS+LDA) to tackle classification problems.
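The two combinations described above might be sketched in SAS roughly as follows. This is only an illustrative outline, not the code from the paper. It reuses the data set and variable names from the LDA example (ex1, ex2, g, x1-x10). Everything else is an assumption made for illustration: the choice of three components, the intermediate data set names (pcstat, pc_train, pc_test, all, pls_all, pls_train, pls_test, pred_pca, pred_pls), and, for the PLS part, that g is numeric with exactly two groups coded 1 and 2. The PLS step also relies on PROC PLS computing scores for appended observations whose response is missing, which is worth checking on your SAS release.

```sas
/* ---- PCA+LDA (sketch; n=3 components is an arbitrary assumption) ---- */

/* Fit PCA on the training set only; keep scores and scoring coefficients */
proc princomp data=ex1 out=pc_train outstat=pcstat n=3 noprint;
   var x1-x10;
run;

/* Project the test set onto the training PCs (creates Prin1-Prin3) */
proc score data=ex2 score=pcstat out=pc_test;
   var x1-x10;
run;

/* LDA on the uncorrelated PC scores instead of the original variables */
proc discrim data=pc_train testdata=pc_test testout=pred_pca;
   class g;
   var Prin1-Prin3;
run;

/* ---- PLS+LDA (sketch; assumes g is numeric, two groups coded 1 and 2) ---- */

/* Stack training and test rows; the response y is a 0/1 dummy for the    */
/* training rows and missing for the test rows, so that only the training */
/* rows are used to fit the PLS model                                     */
data all;
   set ex1(in=in_train) ex2;
   train = in_train;
   if in_train then y = (g = 1);
   else y = .;
run;

/* Extract 3 PLS components; xscr1-xscr3 hold the X-scores */
proc pls data=all method=pls nfac=3;
   model y = x1-x10;
   output out=pls_all xscore=xscr;
run;

/* Split the scored data back into training and test sets */
data pls_train pls_test;
   set pls_all;
   if train then output pls_train;
   else output pls_test;
run;

/* LDA on the PLS scores */
proc discrim data=pls_train testdata=pls_test testout=pred_pls;
   class g;
   var xscr1-xscr3;
run;
```

In both cases the components are estimated from the training set alone and then applied to the test set, so the test-set error reported by PROC DISCRIM is not biased by the dimension-reduction step. In practice the number of components would be chosen by cross-validation or a similar criterion rather than fixed at three, and with more than two groups the PLS response would need one dummy variable per group.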