Tag archive: sas

Some discriminant methods and SAS code

NOTICE: The following text is part of my published paper. Do not distribute it! All rights reserved.

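This post introduces linear discriminant analysis, two dimension-reduction discriminant methods for strongly collinear data, and the SAS code that implements them.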

1. Linear discriminant analysis (LDA)
Because of its simplicity and robustness, LDA has been one of the most frequently used classification techniques since 1936.
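LDA:
/* data= is the training set, testdata= is the set to be classified */
proc discrim data=ex1 testdata=ex2;
  class g;      /* grouping (class) variable */
  var x1-x10;   /* predictor variables */
run;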

2. Combination of PCA and LDA (PCA+LDA), or PLS and LDA (PLS+LDA)
Principal component analysis (PCA) is a fundamental method in chemometrics and is based on vector algebra. Its main purpose is to reduce the dimensionality of a data set containing a large number of intercorrelated variables while retaining as much of the information in the original data as possible. A new set of orthogonal variables, the principal components (PCs), describes the variance in the data, and the first few PCs usually retain most of the systematic variation in the original variables. A small subset of PCs is therefore commonly used to explore trends among samples that received different treatments. Furthermore, because the PCs are mutually orthogonal, using them as the input variables of linear discriminant analysis (LDA) largely removes the multicollinearity present in the original data. The combination of PCA and LDA (PCA+LDA) was therefore used here for classification.
Principal component regression (PCR) is a multiple linear regression method that relates two sets of variables (the PCs and the response variables) for predictive purposes. Partial least squares (PLS) is an extension of PCR that also relates two sets of variables through a regression model, but its components are extracted so that they correlate more strongly with the response variables, which generally makes prediction of the response more effective. In the same way, the PLS components can be used in conjunction with LDA (PLS+LDA) to tackle classification problems.
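The two pipelines described above can be sketched in SAS roughly as follows. This is a minimal sketch under stated assumptions, not the code from the paper: the data set `toy`, the 0/1 flag `test`, the choice of three components, and all output data set names are hypothetical. PCA loadings are estimated on the training rows only and applied to the held-out rows with PROC SCORE. For PLS, the held-out rows are given a missing response so that they do not enter the fit; the sketch assumes that PROC PLS still writes X-scores for such rows to the OUTPUT data set, which is worth verifying on your SAS release. The class label is used directly as a numeric response, which is reasonable for two groups; more groups would need dummy-coded responses.

```sas
/* Toy data (hypothetical): two groups, ten strongly collinear predictors */
data toy;
  array x{10} x1-x10;
  call streaminit(2024);
  do i = 1 to 120;
    g = 1 + (i > 60);                    /* group label: 1 or 2 */
    t = rand('normal') + 1.5*(g = 2);    /* shared latent factor, shifted in group 2 */
    do j = 1 to 10;
      x{j} = t + 0.1*rand('normal');     /* near-copies of t, hence collinear */
    end;
    test = (mod(i, 3) = 0);              /* hold out every third row */
    output;
  end;
  drop i j t;
run;

/* ---- PCA+LDA ---- */
/* Estimate the PCs on the training rows only */
proc princomp data=toy(where=(test=0)) out=train_pc outstat=pcstat n=3 noprint;
  var x1-x10;
run;

/* Apply the training loadings to the held-out rows (creates Prin1-Prin3) */
proc score data=toy(where=(test=1)) score=pcstat out=test_pc;
  var x1-x10;
run;

/* LDA on the first three PCs; predicted classes go to pred_pca */
proc discrim data=train_pc testdata=test_pc testout=pred_pca;
  class g;
  var Prin1-Prin3;
run;

/* ---- PLS+LDA ---- */
/* Hide the label of held-out rows so they are excluded from the PLS fit */
data pls_in;
  set toy;
  y = g;
  if test then y = .;
run;

/* Extract three PLS components; X-scores are written as xscr1-xscr3 */
proc pls data=pls_in method=pls nfac=3 noprint;
  model y = x1-x10;
  output out=pls_out xscore=xscr;
run;

/* LDA on the PLS scores (assumes scores exist for the held-out rows) */
proc discrim data=pls_out(where=(test=0)) testdata=pls_out(where=(test=1))
             testout=pred_pls;
  class g;
  var xscr1-xscr3;
run;
```

With real data, `toy` would be replaced by the author's `ex1` and `ex2`, and the number of components would normally be chosen by cross-validation or from the explained variance instead of being fixed at three.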

Simulating random data in SAS to estimate the value of pi

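I recently came across a good book, 《统计模拟》 (Statistical Simulation), by Sheldon M. Ross [English edition: Sheldon M. Ross. Simulation (4th ed.). Elsevier Inc., 2006]. As the title suggests, the book describes how to simulate data that follow statistical theory, and the approach has many uses. The idea is that the distribution of any real-world data can be described by some statistical model, so even before real data are available we can study real-world problems with simulated data.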
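A classic illustration of this idea is the Monte Carlo estimate of pi: draw points uniformly in the unit square and count the fraction that falls inside the quarter circle of radius 1, which has area pi/4. The sketch below assumes that this is the approach the title refers to; it is not the author's code, and the data set name `pi_est`, the seed, and the sample size are arbitrary choices made for illustration.

```sas
/* Monte Carlo estimate of pi from uniform random points (illustrative sketch) */
data pi_est;
  call streaminit(12345);            /* fix the seed so the run is reproducible */
  n = 1000000;                       /* number of simulated points */
  hits = 0;
  do i = 1 to n;
    u = rand('uniform');             /* x coordinate in (0,1) */
    v = rand('uniform');             /* y coordinate in (0,1) */
    if u*u + v*v <= 1 then hits + 1; /* point lies inside the quarter circle */
  end;
  pi_hat  = 4 * hits / n;            /* quarter-circle area is pi/4 */
  abs_err = abs(pi_hat - constant('pi'));
  keep n hits pi_hat abs_err;
  output;
run;

proc print data=pi_est noobs;
run;
```

The standard error of this estimator shrinks only as 1/sqrt(n), so each additional correct digit costs roughly a hundred times more simulated points.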