August 5, 2008
 
That's a fancy way of saying that we are adding a few more RSS feeds for support.sas.com visitors. Feeds are an efficient and easy way to get information about updates made to the SAS Support Web site. Feeds let you quickly stay current on updated content. I believe that if you scan the contents of the feeds every now and then, it will help you to locate information more quickly when you really need it.

How can that be? Well, it is just a theory, so I'd love to hear what you all think. My theory is that my brain holds on to more information than I can ever retrieve. (And it seems that the older I get, the harder it gets to retrieve information when I need it.) But, when I start to search for something -- either on a Web site or in a department store -- I am encouraged to keep looking if I know that I have seen the item I need. I often find that some snippet of the memory will return just in time to help me locate exactly the thing for which I am hunting.

I think that content notification (RSS in this case) can serve this purpose. Maybe you see a sample that looks interesting, so you read it on the spot. But what if you see information about a problem that you haven't encountered yet? Then weeks later, boom, you hit the same snag. Maybe you remember seeing it in the feed. Maybe you even remember part of the title, which will make your Web search much more successful.

Give the feeds a try and let me know if my theory is right or wrong.
Continue reading "New Syndicated Content"
August 3, 2008
 

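Installing SAS 9.2 is really not much different from installing 9.1: the same classic, old-fashioned interface. There is not much to say about it and words would be superfluous, so just take a look at the screens from my installation.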

 Posted at 3:25 AM
August 1, 2008
 
Maybe the question should be "have you noticed the toolbar at the top of every support.sas.com page?" The big yellow arrow is pointing to the toolbar in the image below.

image of toolbar in the header

When you select the Print link, we strip all of the outer edges from the page (top and left navigation, footer and the right informational column if it exists). Your printer window is displayed and the page is sent to your designated printer. This works great for Samples & SAS Notes because we also concatenate the content from the various tabs into one printout!

While I'm talking about the toolbar, I should mention the other two features. The e-mail link will open a new message and will pre-fill the body of the message with the page title and the URL of the page you are viewing.

The bookmark feature opens the browser window that enables you to add a favorite link. When we designed the toolbar, several people wondered about the usefulness of this feature. What do you think? Are shortcut links like this nice to have?

Keep an eye on the toolbar. New features may appear there soon. What feature would you like to see as a toolbar item?
July 30, 2008
 
Does your e-mail signature contain several sets of letters like MCSE or PMP? You see more and more people expressing their educational experience by including qualifiers after their name in e-mail signatures. Some say that this is a result of the tight job market. Honestly, I don't know. I do know that SAS-L, the discussion forums, and the Online Support e-mail box are full of questions about SAS certification.

Maybe I can help by providing a bit of information here. I started by asking my colleagues over in the certification group what I should post in a Did you know post about certification. They provided me with the following:


Did You Know ... With more than 44,000 sites leveraging the power of SAS, the need for highly skilled SAS professionals continues to grow. With a credential from SAS you can accelerate your career to the next level and earn global recognition for your achievement! Join the more than 17,000 SAS Certified Professionals worldwide and list your name on a directory for colleagues, current and future employers, and friends to find you. Visit the directory on support.sas.com.

Then I saw more and more questions about certification. I decided to expand the Did you know post to include links that answer commonly asked certification questions. I hope you find them helpful.

Sample questions on support.sas.com
Exam prep sorted by exam topic
Frequently asked questions about certification
Request more information about the certification program

You may also want to check with your peers to see how certification has helped them or how it is viewed by their companies when they are hiring. One place to start your research is to search previous posts on SAS-L.
July 22, 2008
 
People ask me all of the time where they should look for technical hints and tips. I have a long list of excellent resources; one that always makes the top 5 list is SAS Global Forum papers. These papers are written and presented by knowledgeable SAS software users, SAS employees, and SAS consultants. They cover every topic you can dream up. The thing is, there is always room for another idea, another method, or another approach.

I have seen papers quoted and referenced as answers to tough programming dilemmas. I have seen near riots ensue when the Online Proceedings search stopped working. I have seen mere programming mortals rise to superstar status after presenting a paper. I have seen the call for papers for SAS Global Forum 2009. (Sorry. I got carried away.)

SAS Global Forum 2009 will be held March 22 - 25 at the Gaylord National Resort, National Harbor. If you have attended a SAS users' group conference, you know that they are all in great cities. You also know that what makes the conference truly great is the value of the information and the quality of the attendees and presenters. Don't miss your chance to make National Harbor the best location yet.

The 2009 call for papers is now open. The deadline for submitting your idea, along with an abstract, is October 13, 2008. If you have a topic in mind, read the instructions for submitting a paper and get started.

If you are wondering what kinds of papers are appropriate, I copied this text right from the SAS Global Forum web site:
Each section description explains in detail what the section chairs are anticipating. In general, papers describing real-world applications of SAS software are particularly appropriate submissions. Tips and techniques on effective or innovative uses that others can adapt are also well received papers. Theoretical or general overview papers are also welcome.
July 15, 2008
 
Did you know ... that you can select a section of the support.sas.com site to search before you submit your search? You can limit your search to one of the four major site sections or to one of the many subsections.

Here's how:

  1. Type your search term or phrase in the entry field at the top of any support.sas.com page.
  2. Select the down arrow next to Search support.sas.com.
  3. Select any section from the list.


In the picture below, you can see that I am searching for training that I can take online. I know that all training information is available from the LEARNING CENTER, so I limit my search to only that section.




Some sections, like Samples & SAS Notes, have a dedicated search box that searches only that section. Using this drop box accomplishes the same task as if you went to the Samples & SAS Notes entry page and searched from the dedicated search field.

I have an advantage because I know what content is associated with each section. However, with a small amount of effort, you too can make good use of this filtering mechanism. To get started, review the section definitions at The big four or use the sitemap on support.sas.com to help you locate content.

Try it and let me know if it helps you to locate your answers faster. You can use the voting poll to the right to provide your quick feedback or add a comment to provide even more.
June 30, 2008
 
Did you know ... that we provide a collection of frequently used links at the bottom of the Samples and SAS Notes page? We worked with Technical Support to determine what types of content are frequently requested from the samples and notes database. We have created links to this content to give you a running start when searching for information.

Updated June 02, 2009
The location of the frequently used links has changed since this post was written. New information follows.

The quick links are now on two pages:

We also moved the featured links from the bottom of the page to the top of each of these pages.
End of updated text.

Notice that we have also added search boxes for other SAS note collections, such as SAS/C and older notes that apply only to SAS Version 6.
November 13, 2007
 
The most common question we get regarding Stephen Few's white paper and webcast on visualizing change is about the scripts for showing the connected trails in bubble plots. Often the existing bubble trails are overlapping, so the progression is clear, but when the bubble trails are spaced out it can be helpful to connect the bubbles with a line.

Bubble Plot with Connected Trails

In addition to the journal file that contains the data and plots from the paper, you can also download from the JMP Extras area the graphics script by itself which can be added to any bubble plot (via Right-Click and Customize) if you edit the column names in the script.



Local( {s = "", c = 0, xx = {}, yy = {}},
    For Each Row(
        // Collect the coordinates of each selected row for the current ID.
        If( Selected( Row State() ),
            s = :State;
            c = Color Of( Row State() );
            Insert Into( xx, :Property Rate );
            Insert Into( yy, :Violent Rate );
        );
        // At the end of a run (last row, next row not selected, or next ID differs),
        // draw the collected trail and reset the lists.
        If( N Items( xx ) != 0 & (Row() == N Rows()
            | !Selected( Row State( Row() + 1 ) )
            | s != :State[Row() + 1]),
            Pen Color( c );
            Pen Size( 2 );
            Transparency( 0.3 );
            Line( Matrix( xx ), Matrix( yy ) );
            xx = {};
            yy = {};
        );
    )
);


The script requires that the data be sorted by the ID column, "State" in this case.

Reading Raw Data with SAS (1): Text Files

November 30, 1999
 

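Text files usually carry the extension .txt, .dat, or .csv (in the Unix/Linux world you may also see .data, or no extension at all). They generally come in two kinds. The first looks like the sample below: if you open it in an editor such as UltraEdit, you can see a ruler marking the column positions, and each field occupies a fixed range of columns. This is called a fixed-field text file: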

2810 61 MOD  F
2804 38 HIGH F

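In the data below, the fields need not occupy the same columns, but they are all separated by the same delimiter (here a comma), so this is called a delimited or free-format text file. When the delimiter is a comma (usually with a .csv extension), it is simply called a comma-separated text file: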

1-Mar-90,LON,198
13-Mar-90,FRA,2073

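For these two kinds of text data, SAS provides four basic input styles:

  1. Column input: for fixed-field text files
  2. Formatted input: for fixed-field text files
  3. List input: for delimited text files
  4. Named input

All of these input styles share the same basic statements; the only difference lies in how the INPUT statement is written: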

data name-of-the-data-set-to-create;
    infile 'source-file-name-with-drive-and-path';
    input variable-input-specification;
run;

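1. Column input: for fixed-field text files

For a fixed-field source file, the INPUT statement takes the following form (the optional $ marks a character variable):

input variable1 <$> startcol-endcol variable2 <$> startcol-endcol …;

A working column input step looks like this: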

data work.example1;
    infile 'C:\data\example1.dat' firstobs=2 obs=100;
    input ID 1-3 Name $ 5-10;
run;

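Advantages:

  1. Field selection is very flexible: you can pick any fields and read them in any order;
  2. A whole field, or part of one, can be read more than once;
  3. Fields do not have to be separated by blanks or other delimiters;
  4. Character variables can hold up to 32K characters and may contain blanks;
  5. No special placeholder is required for missing data. A blank field is read as a missing value and does not cause errors in reading the other fields.

Limitations:

  1. You can set the input length but not an informat. For numeric variables, only standard numeric data values can be read, that is, numbers made up solely of digits, a plus or minus sign, a decimal point, and the E of scientific notation. Dates, and numbers containing dollar signs, commas, or other symbols, have to be read with an informat, so column input cannot read them correctly.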
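
The first two advantages can be tried on the fixed-field sample shown at the top of this post. The step below is only a sketch: the data set name work.column_demo and the variable names (ID, Age, Level, Sex, AgeTens) are made up for illustration, because the post does not name the fields, and the rows are read in-stream with DATALINES instead of from a file.

```sas
/* Sketch only: data set and variable names are hypothetical. */
data work.column_demo;
    input Sex     $ 14      /* fields can be read in any order      */
          ID        1-4
          Age       6-7
          Level   $ 9-12
          AgeTens   6;      /* re-reads the first digit of Age only */
    datalines;
2810 61 MOD  F
2804 38 HIGH F
;
run;

proc print data=work.column_demo;
run;
```

Because every field is addressed by its columns, the order of the specifications in the INPUT statement does not have to follow the order of the fields in the record.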

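2. Formatted input: for fixed-field text files

Formatted input is similar to column input:

  1. It applies to fixed-field data files;
  2. It also specifies the starting column of a field, but instead of giving the ending column directly, it gives the length to read through the informat;
  3. It can use informats.

Its INPUT statement has the form:

input <pointer-control> variable informat …;

Note 1: A pointer control positions the input column pointer at a given place, which becomes the starting column of the field to be read. It takes the form @n or +n:

  • @n starts reading at column n (an absolute starting position);
  • +n advances the column pointer by n columns before reading (a relative position).

A working formatted input step looks like this: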

data work.example2;
    infile 'C:\data\example2.dat';
    input Name $2. @3 Job $5. +7 Place $8.;
run;
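
Formatted input is the style to reach for when a fixed-field file holds dates or numbers with currency symbols and commas, which column input cannot read. The sketch below illustrates this with two made-up records; the data set name work.formatted_demo, the variable names, and the data values are assumptions for illustration and are not taken from example2.dat.

```sas
/* Sketch only: names and data are hypothetical.                      */
/* Layout: columns 1-3 ID, 5-13 date, 15-23 amount with "$" and ",".  */
data work.formatted_demo;
    input @1  ID      3.
          @5  Visit   date9.     /* absolute position: start at column 5     */
          +1  Income  comma9.;   /* relative: skip column 14, read 9 columns */
    format Visit date9. Income dollar10.2;
    datalines;
101 01MAR1990 $1,234.50
102 13MAR1990   $20,730
;
run;

proc print data=work.formatted_demo;
run;
```

The COMMAw.d informat strips the dollar sign and the commas before storing the number, and the FORMAT statement only controls how the stored values are displayed.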

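3. List input: for delimited text files

Because delimited data can use the delimiter to identify the fields, the INPUT statement is especially simple:

input variable <$> …;

To read data separated by something other than blanks, say so in the INFILE statement (the default delimiter is a blank):

infile 'file-path' <dlm='delimiter'>;

In list input the default variable length is 8, and character values longer than 8 characters are truncated when they are read. In that case, use a LENGTH statement to set the length:

length variable <$> length;

List input can also use informats; just add the colon modifier when you specify the input variable:

input variable <$> : informat …;

A working list input step looks like this: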

data work.example3;
    infile 'C:\data\example3.dat' dlm=',';
    length item $ 10;
    input ID Name $ item $ income : comma9.;
run;
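
The comma-separated sample from the beginning of this post can be read the same way. This is a sketch under assumptions: the data set name work.list_demo and the variable names FlightDate, Dest, and Boarded are invented here, since the post does not say what the three fields mean, and the rows are read in-stream with DATALINES.

```sas
/* Sketch only: data set and variable names are hypothetical. */
data work.list_demo;
    infile datalines dlm=',';
    length Dest $ 3;
    input FlightDate : date9.   /* colon modifier: read up to the delimiter */
          Dest $
          Boarded;
    format FlightDate date9.;
    datalines;
1-Mar-90,LON,198
13-Mar-90,FRA,2073
;
run;

proc print data=work.list_demo;
run;
```

The two-digit years are expanded according to the session's YEARCUTOFF= option, so the century that is assigned depends on that setting.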

4. Named input

Named input is rarely seen, because raw data in this layout is hard to come by. In the sample below, the three variables are ID, Name, and Score:

1 Name=Tom Score=A
2 Name=Jim  Score=C

The corresponding INPUT statement is:

input ID Name=$3. Score=$1.;
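
For completeness, here is a sketch of a full step built around that kind of INPUT statement. The data set name work.named_demo is an assumption, the lengths are set with a LENGTH statement rather than with informats, and the two sample rows are read in-stream with single blanks between the items.

```sas
/* Sketch only: the data set name is hypothetical. */
data work.named_demo;
    length Name $ 3 Score $ 1;
    input ID Name= $ Score= $;   /* ID uses list input; the rest is named input */
    datalines;
1 Name=Tom Score=A
2 Name=Jim Score=C
;
run;

proc print data=work.named_demo;
run;
```

Once an INPUT statement switches to named input, the remaining variables in that statement have to be read with named input as well, which is why ID comes first.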

References:

  1. Wang Jiagang (汪嘉冈), 《SAS V8基础教程》 (SAS V8 Basic Tutorial), Beijing: China Statistics Press, 2001
  2. SAS OnlineTutor: Basic and Intermediate SAS

Statistics in Plain Language (2): The Central Limit Theorem

November 30, 1999
 

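************* This book gives you an intuition for mathematical statistics ****************************

The material comes from 《数理统计初级教程》 (An Elementary Course in Mathematical Statistics) by G. H. Weinberg (维恩堡) et al. of the United States, translated by Chang Xuejiang (常学将) et al., Taiyuan: Shanxi People's Publishing House, 1986.

Previous post: "Statistics in Plain Language (1): Mean, Median, and Mode"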

*************************************************************************

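Theorem 1 (central limit theorem): Suppose a large number of random samples of equal size are drawn from the same infinite population. Compute the sum of each sample, and put the sums of the different samples together to form a new distribution. This new distribution is asymptotically normal (assuming that each of the random samples producing these sums is sufficiently large).

Traditional probability textbooks usually state the theorem like this:

Theorem 1' (central limit theorem for independent, identically distributed variables): Let the random variables X1, …, Xn, … be mutually independent and identically distributed, with a common expectation and a common finite variance. Then, as n grows, the distribution of the standardized sum of X1, …, Xn approaches the standard normal distribution.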

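An illustrative example

Imagine a very large box filled with small slips of paper that we can draw from endlessly, each slip bearing a number. For simplicity, suppose only the numbers 0, 1, and 2 occur, and each of them is equally likely (probability 1/3) to appear on any given slip. Remember that the box holds so many slips that we can draw any number of slips of any kind without worrying about changing the proportions among the slips that remain.

The box has a small opening through which one slip can be released at a time. It also has a shuffling device that mixes the slips so evenly that whenever we decide to draw, every slip has the same chance of being released. Our observations are therefore independent, and our sample is random.

Now we draw random samples of equal size; suppose each sample contains 200 slips.

We draw 200 slips one at a time. Say the first slip shows 2, the second shows 0, the third shows 2, and so on. Suppose the numbers on the 200 slips of this first sample add up to 210; this sum becomes the first entry in the new distribution.

Say the numbers on the 200 slips of the second sample add up to 194. Repeat the process for a large number of samples, each containing 200 slips. Theorem 1 tells us that as more and more of these sample sums accumulate, their distribution comes to approximate a normal distribution.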
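
The box experiment can be imitated with a short simulation. The step below is one possible sketch in SAS and does not come from the book: the data set name work.sample_sums, the seed 2008, and the choice of 5,000 samples are assumptions made for illustration. Only the 200 slips per sample and the equally likely values 0, 1, and 2 are taken from the text.

```sas
/* Sketch only: 5,000 samples and the seed are arbitrary choices. */
data work.sample_sums;
    call streaminit(2008);
    do sample = 1 to 5000;
        total = 0;
        do slip = 1 to 200;
            /* each slip shows 0, 1, or 2 with probability 1/3 */
            total + floor(3 * rand('uniform'));
        end;
        output;                 /* one row per sample sum */
    end;
    keep sample total;
run;

proc means data=work.sample_sums n mean std min max;
    var total;
run;

proc sgplot data=work.sample_sums;
    histogram total;
    density total / type=normal;   /* overlay a fitted normal curve */
run;
```

If the theorem applies, the reported mean should land near 200 and the standard deviation near 11.5, and the histogram should follow the overlaid normal curve closely.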

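How to apply Theorem 1 in practice

Theorem 1 places essentially no restriction on the population from which the samples are drawn (it only needs a finite variance). Whatever the shape of that population's distribution, the distribution of the sample sums is approximately normal once the samples are large enough.

Theorem 1 explains why the normal distribution turns up in so many different problems. The procedure we used for sampling slips of paper appears to be one that the real world is especially fond of. In each case, the numbers that make up a normal distribution can be regarded as sums of equal-size samples of independent observations.

Example 1. Consider the bullets that form a normal distribution around a target. Where each bullet lands is really the sum of many random influences, such as stance, wind, light, and state of mind. These and similar factors act together on any particular shooter, and they differ from shooter to shooter. A shooter's score records where the bullet finally landed, and that score is the sum of a sample of those random influences. Concretely, suppose each shooter's score is the sum of 70 main influences; then the score of each shot can be regarded as a sample sum of 70 terms (corresponding to the sum of the numbers on 70 slips). Seen this way, the scores of different shooters are sums of different equal-size samples. By Theorem 1, the distribution of the scores is approximately normal.

Example 2. A person's level of intelligence can likewise be viewed as the sum of small influences from different sources, including nutrition, opportunity, personality, heredity, and so on. Seen this way, the intelligence levels of a large number of people are approximately normally distributed.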

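Theorem 2 (a variant of Theorem 1, the central limit theorem for means): Suppose a large number of random samples of equal size are drawn from the same infinite population. Compute the mean of each sample, and put the means of the different samples together to form a new distribution. This new distribution is asymptotically normal (assuming that the random samples producing these means are sufficiently large).

The set of sample means can be obtained directly from the set of sample sums by dividing each sum by the sample size, so the distribution of the means is simply a rescaled version of the distribution of the sums. The distribution of sample means has two useful properties:

  1. Suppose infinitely many random samples of equal size are drawn from the same infinite population, and the means of these samples are put together to form a new distribution. The mean of this new distribution (the distribution of sample means) equals the mean of the original population.
  2. Suppose infinitely many random samples of equal size are drawn from the same infinite population, and let n denote the size of each sample. The distribution of the sample means has a standard deviation equal to the standard deviation of the original population divided by the square root of n.

Theorem 2 and its two properties together are the familiar mean(X) ~ N(μ, σ**2/n), which holds approximately when n is large.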
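
As a worked check, the two properties can be applied to the box of slips from the illustrative example, where each slip shows 0, 1, or 2 with probability 1/3 and each sample has n = 200 slips. The symbols S_n (sample sum) and \bar{X}_n (sample mean) are notation introduced here for convenience; they do not appear in the book excerpt.

```latex
\[
\mu = \frac{0+1+2}{3} = 1, \qquad
\sigma^2 = \frac{(0-1)^2+(1-1)^2+(2-1)^2}{3} = \frac{2}{3}
\]
\[
S_n = \sum_{i=1}^{n} X_i:\quad
E[S_n] = n\mu = 200, \quad
\mathrm{sd}(S_n) = \sigma\sqrt{n} = \sqrt{400/3} \approx 11.55
\]
\[
\bar{X}_n = \frac{S_n}{n}:\quad
E[\bar{X}_n] = \mu = 1, \quad
\mathrm{sd}(\bar{X}_n) = \frac{\sigma}{\sqrt{n}} = \sqrt{\frac{2/3}{200}} \approx 0.058
\]
```

Under these figures, the two sample sums mentioned in the example, 210 and 194, lie about 0.87 and 0.52 standard deviations from the expected sum of 200, which is entirely ordinary for a normal distribution.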