Help us design a better SAS customer support Web site

Michele Reister
October 6, 2010
The folks in charge of our SAS customer support Web site are working on a new and improved home page. Who better to evaluate this Web page than you? You can view the beta page now and take a survey to tell us what you think.

One new section I’d like to point out is the role-based area in the center and right side of the page. SAS offers so many resources to help our customers that it’s often challenging to sort through everything to find what’s right for you. This new section is similar to our SAS Training by job roles, which provides a learning path to guide your SAS training based on your job role. Now, in this new section on the beta home page, you can find a variety of resources, including training, based on your job role. We hope this is helpful. Please tell us in the survey.

You can learn more about the changes we’re proposing and the reasoning behind the changes in this blog post.

On three statistical realms

October 6, 2010

Peter Petocz and Anna Reid (2010) grouped students’ conceptions of statistics into three levels:

  • Level I:   focus on techniques
  • Level II:  focus on using data
  • Level III: focus on meaning

Based on my personal experience, I found that these three conceptions map easily onto three states of learning and using statistics:

  • State I: focus on techniques. As a student of Economics and (then) Software Engineering, I needed some statistical techniques to support my study of data mining and machine learning. So I invested a lot in fancy skills such as logistic regression, decision trees, neural networks and even support vector machines, both in graduate school and at SAS R&D (as an intern). Most of the time, I just threw data at the models and checked their functionality and feasibility (Wula-IT-WORKS! or Oops-crash-again). Looking back, I have to say these techniques were toys played with in labs.
  • State II: focus on using data. Now I work as a SAS programmer (also titled statistical analyst) in pharma. The data are not just rows and columns in tables. They are SUBJECTS! Statistical techniques are used carefully to display and interpret the story of the real world. Why is the denominator 999 when 1000 subjects were recruited in this trial? Because subject 001-127, male, 23 months of age, discontinued due to his father’s wish and opinion!
  • State III: focus on meaning. Peter Petocz and Anna Reid concluded that, regarding the MEANING conception of statistics, “statistics is an inclusive tool used to make sense of the world and develop personal meanings.” The last state in any ideal of realms always sounds like philosophy or religion. It may mean living life in a statistical way or style (if I got there, I would change my blog’s title to From a Statistical Point of View^).

—————-some notes on non-statistics—————————-

1. The three states of Chan

  • just mountain
  • isn’t mountain
  • still mountain

2. The three ideal realms of Wang Guowei

  • heaven is integrated with man:

Last night the west wind shriveled the green-clad trees,

Alone I climb the high tower

To gaze my fill along the road to the horizon.

  • knowledge is integrated with practice

My clothes grow daily more loose, yet care I not.

For you am I thus wasting away in sorrow and pain.

  • feeling is integrated with scenery

I sought her in the crowd a hundred, a thousand times.

Suddenly with a turn of the head [I saw her],

That one there where the lamplight was fading.


Peter Petocz and Anna Reid (2010). On Becoming a Statistician: A Qualitative View. International Statistical Review, 78(2), 271.

Wang Guowei. Ren jian ci hua. Translated by Adele Austin Rickett.

October 6, 2010
The issues marketers face are daunting. Let’s talk about three.

Email.
Response and opt-out rates are awful. So we struggle to push through the noise and stand out. We work to develop a strong list, message, and call-to-action. But we can’t know for certain if our offer will resonate with the recipient. We do our best.

Online ads.
These used to be about volume: interrupt as many people as you can, and a few will inevitably click through. Now, ads need to bring value to the viewer. And that makes sense. We work to segment and target our offers, aligning them to the right publications and the right readers at the right time. But we can’t know for certain if an ad will be accepted as well-timed information or loathed as a disruption. We do our best.

Your website.
Almost every new sale begins with a Google search. But being on the first results page does not mean you’ve won; it only means you get to play in round two. We work to get people to our site and make it easy for them to find the information they need once they arrive. But we don’t know for certain if the content we have or the way we present it is hitting the mark. We do our best.

Or do we?

A new report from the CMO Council and Accenture reveals that nearly “one-third of marketers and IT executives alike” report that they are “either having difficulty integrating critical analytics capabilities or believe they are not integrated at all.” Does that sound like “our best” to you?

Doing our best means realizing that our gut isn’t going to help much in the digital marketing era. Our best demands we employ every tool we have to better serve our customers. Customer data. Math. Analytics. Optimization. Segmentation models. This is where the new-best begins.

Success with Email? The best way forward is to be ruthless in the application of analytics and optimize lists of contacts who are actually interested in your current message.

Success with online ads? Segmentation models that match ads and offers to the best possible readers are what’s needed here, so that we deliver assistance, not interruption.

Success with your Website? Advanced forms of social media and web analytics track mentions and website visitors and analyze behaviors, allowing us to improve customer experience and serve up the right offer of content at the right time.

Marketing is quickly becoming an analytically driven discipline. Do you welcome the explosion of data as a gold mine of information? Or do you drown in details you’re unable to harness? Do you have a deep understanding of your customers and the confidence to act? Or do you struggle to deliver campaigns with little benefit of insight or promise of improvement?
October 5, 2010
This is a special R-only entry.

In Example 8.7, we showed the Hosmer and Lemeshow goodness-of-fit test. Today we demonstrate more advanced computational approaches for the test.

If you write a function for your own use, it hardly matters what it looks like, as long as it works. But if you want to share it, you might build in some warnings or error-checking, since the user won't know its limitations the way you do. (This is likely good advice even if you are the only one to use your code!)

In R, you can add another layer of detail so that your function conforms to standards for built-in functions. This is a level of detail we don't pursue in our book, but it is worth doing in many settings. Here we provide a modified version of a Hosmer-Lemeshow test sent to us by Stephen Taylor of the Auckland University of Technology. We've added a few annotations.

Note that the function accepts a glm object, rather than the two vectors our function used.

hosmerlem2 = function(obj, g=10) {
  # first, check to see if we fed in the right kind of object
  stopifnot(family(obj)$family=="binomial" && family(obj)$link=="logit")
  y = obj$model[[1]]
  # the double bracket (above) gets the index of items within an object
  if (is.factor(y))
    y = as.numeric(y)==2
  yhat = obj$fitted.values
  cutyhat = cut(yhat, quantile(yhat, 0:g/g), include.lowest=TRUE)
  obs = xtabs(cbind(1 - y, y) ~ cutyhat)
  expect = xtabs(cbind(1 - yhat, yhat) ~ cutyhat)
  if (any(expect < 5))
    warning("Some expected counts are less than 5. Use smaller number of groups")
  chisq = sum((obs - expect)^2/expect)
  P = 1 - pchisq(chisq, g - 2)
  # by returning an object of class "htest", the function will perform like the
  # built-in hypothesis tests
  return(structure(list(
    method = c(paste("Hosmer and Lemeshow goodness-of-fit test with", g, "bins", sep=" ")),
    data.name = deparse(substitute(obj)),
    statistic = c(X2=chisq),
    parameter = c(df=g-2),
    p.value = P
  ), class='htest'))
}

We can run this using last entry's data from the HELP study.

ds = read.csv("")
logreg = glm(homeless ~ female + i1 + cesd + age + substance,
  family=binomial, data=ds)

The results are the same as before:

> hosmerlem2(logreg)
Hosmer and Lemeshow goodness-of-fit test with 10 bins

data: logreg
X2 = 8.4954, df = 8, p-value = 0.3866
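
To see why returning an object of class "htest" makes a function behave like the built-in tests, here is a minimal sketch. It uses a hypothetical toy function, ztest(), which is not part of the original entry; the names ztest, mu and sigma are our own, chosen only for illustration. It computes a one-sample z statistic, assuming the standard deviation is known, and wraps the result the same way hosmerlem2() does.

```r
# Hypothetical toy example (not from the original entry): a one-sample z-test
# with sigma assumed known, returned as an object of class "htest"
ztest = function(x, mu = 0, sigma = 1) {
  z = (mean(x) - mu) / (sigma / sqrt(length(x)))
  p = 2 * pnorm(-abs(z))   # two-sided p-value
  structure(list(
    method = "One-sample z-test (sigma assumed known)",
    data.name = deparse(substitute(x)),
    statistic = c(z = z),
    p.value = p
  ), class = "htest")
}

x = c(0.3, -1.2, 0.8, 1.5, -0.4, 0.9)
res = ztest(x)
res           # auto-printing dispatches to the print method for "htest"
res$p.value   # components can be extracted by name, as with t.test()
```

Because the class is "htest", R prints the result in the same layout it uses for t.test() or chisq.test(), and downstream code can pull out statistic or p.value without knowing which test produced them.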
October 5, 2010
Momentum for SAS Customer Intelligence is at an all-time high, and we’re responding in kind by taking it to the streets with a blockbuster line-up for October this year in a city near you. These opportunities have been designed to enable you to interact with both SAS and non-SAS experts and learn more about the difference that customer intelligence solutions can make for your business. We hope you can join us at one or all of these events.

Washington, DC
On October 5, 2010 in our nation’s capital, SAS is hosting a Web Analytics Tuesday reception at the Sheraton National Hotel near Pentagon City at 900 Orme Street, Arlington, VA. This is the single most popular social network and networking event in the digital measurement sector and THE event for Web analysts. This reception is being held in connection with the eMetrics Marketing Optimization Conference. Highlights include:

  • Hear from John Lovett and SAS on social media approaches and why Web Analytics is a natural partner for Social Media Analytics.
  • Enjoy free refreshments.
  • Enter for a chance to win a free Flip HD video recorder.
  • Click here to Register and join us for the fun!

Boston, MA
On October 6-7, 2010 in Boston, SAS is a gold sponsor of the Inbound Marketing Summit.

On Tuesday, Oct. 6 at 4:50pm, SAS will be featured on a panel discussion titled, “Harnessing the Power of Digital Marketing From Traditional to Digital: a B2B Technology Company’s Journey.” Speakers from SAS include:
In addition, we have a display in the exhibit area, where we’re highlighting SAS Social Media Analytics and other SAS Customer Intelligence solutions. The Inbound Marketing Summit is part of FutureM, a marketing event in Boston hailed as the first-ever week-long, multi-location conference offering the opportunity to tap into the freshest thinkers in marketing, media and technology. Come see us in Boston, register here!

San Francisco, CA
On Oct. 9-14, 2010 in San Francisco, SAS is a Thought Leadership sponsor of DMA: 2010 Conference and Exhibition. SAS is featured in 3 sessions at the event, in addition to hosting a hospitality reception that includes a book signing with Chris Brogan. For even more details, click here or check out the basics on the 3 sessions below:

Enhance Consumer Insight with Social Media with Chris Brogan
Monday, Oct. 11, 1:45 – 2:45 p.m. | Room 103 Lower Level
Chris Brogan, President, New Marketing Labs
Mark Chaves, Director of Media Intelligence Solutions, SAS

Social Media Analytics: The Science of Listening
Monday, Oct. 11, 11:15 a.m. – 12:15 p.m. | Room 123 Lower Level
John Bastone, Product Marketing Manager, SAS

Event-Triggered Marketing Solutions: Which One Is Best for You?
Monday, October 11, 11:15 a.m. – 12:15 p.m. | Room 113 Lower Level
Andy Bober, Director of Customer Intelligence Product Management, SAS

The hospitality reception is an exclusive book signing event with social media veteran Chris Brogan:
Monday, October 11, 6 – 9 p.m.
The St. Regis San Francisco | Overlook Terrace – Ninth Floor | 125 Third St. | San Francisco
This event is free, but does require registration if you are interested.

Orlando, FL
On Oct. 13 – 16, 2010 in Orlando, SAS is sponsoring the annual Masters of Marketing Conference of the Association of National Advertisers.
The conference will take place at the Rosen Shingle Creek Resort. The SAS exhibit will feature the Customer Intelligence Online Analytics Suite:
In addition, SAS has secured a full-page advertisement in The Advertiser magazine, which celebrates the 100th anniversary of the ANA and will be distributed at the show. Click here to register for this event. We hope to see you in Orlando!

Cary, NC
On Oct. 19-20, 2010 at SAS World Headquarters in Cary, NC, SAS will host the Customer Intelligence Users Connection Conference. At this event, customers are invited for a two-day forum to connect with SAS experts and learn about the latest SAS offerings, as well as to hear from peers from across industries about how they are solving the most pressing customer marketing issues. The keynote speaker will be Dave Frankland, Vice President and Principal Analyst at Forrester Research.

In addition, SAS will recognize organizations that have most effectively used SAS Analytic Marketing solutions to enable smarter decisions to solve more business challenges with a Customer Intelligence Award. Deadline for submissions is Friday, Oct. 8. The winner will be announced at the conference and all submitting individuals/organizations will be entered into a drawing for an iPad. Click here to learn more about and register for this great customer event!

Las Vegas, NV
On Oct. 26 – 28, 2010 in Las Vegas, NV, SAS will host the Premier Business Leadership Series. Designed specifically for senior-level executives, The Series features thought leaders and experts from a broad spectrum of industries who will share how they have cultivated a corporate culture that inspires innovation, and how they are optimizing resources to transform the way they approach business decisions.

On the main stage, you’ll hear from Chris Brogan, Charlene Li and David Meerman Scott on social media. In sessions, you’ll hear from marketing leaders, such as Staples, UBS, Vistaprint, GE Money, T-Mobile, AEGON Direct Marketing, Harrah’s and Major League Soccer.

In addition, there will be “lunch with the experts” roundtables with social media thought leader Chris Brogan on Social Media Analytics, retail titans Bernie Brennan and Lori Schafer on retail views of social media & mobility, and hospitality industry expert Michael McCall from Cornell University on Building Customer Loyalty.

Attendance at the Premier Business Leadership Series is free of charge, but restricted to executives at director level and above. Click here to register.

We have just about all four corners of the USA covered this month and hope you can come out and see us at one of these events!

You’re the World’s Expert

October 4, 2010
Space is big. You just won't believe how vastly, hugely, mind-bogglingly big it is. I mean, you may think it's a long way down the road to the drug store, but that's just peanuts to space. - Douglas Adams, Hitchhiker's Guide to the Galaxy
SAS is big. It's not quite as big as space, although sometimes it might seem that it's just as vast. You can't know everything about SAS because your brain would explode. But that's why we have user group conferences and proceedings. The expertise isn't simply distributed all over the world. It's scattered over the decades as well, recorded for posterity in the form of SAS user group proceedings.

If you use SAS often, then the chances are pretty good that there is some corner of SAS, some feature or function, that you know better than just about anyone else. Whether it's some special use of the DATA step, a unique database structure that you use for reporting, a production chart that has become mission-critical in your organization, or some unique use of one of the thousands of SAS procedures, statements or functions: there is something that you know about SAS that no one else knows. For that one aspect of SAS, you are the world's foremost expert.

For me, my "World's Expert" debut was at SUGI 21, when I presented "Developing Native Help for SAS/AF Applications." The year was 1996, and I can say it with confidence: nobody knew more than I did about developing help content with Windows Help and OS/2 IPF, and integrating it with a SAS/AF application. That's not Nobel prize material, but I know that my conference paper helped hundreds of people. My presentation was attended by maybe 50 people on a brisk March day in Chicago, but the content lives on within proceedings. Even though I don't think it gets much reference now (at least, I hope it doesn't), it was my first user group talk...and I've written or contributed to dozens of others since then.

You still have time to share your SAS expertise with the world. The Call for Papers is still on for SAS Global Forum 2011 (until October 25). In this case, what happens in Vegas does not stay in Vegas; it is shared with the SAS community around the world and across the years.

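A Glimpse of the SAS Language (SAS_Dream, 2004)

October 4, 2010
This article first appeared on the sasor forum in 2004, and it still reads as a classic today. SAS has been developing for many years, releases now come faster and faster, and new modules and features spring up like mushrooms after rain, yet a classic article is still worth reading again, even if you have read it many times. The previous post reprinted the article "Scattered Impressions of SAS"; every time I read these two pieces I feel how limited my own knowledge is. The word "mastery", whether it describes a person's SAS skills or serves as a book title, therefore deserves careful thought, and I reread these classics to remind myself of that. So although this blog favors original writing, and reposts online are so numerous that author and source have been altered countless times or dropped altogether, I still recommend that everyone reread this classic.

One thing puzzles me: why, five or six years later, has no Chinese-language commentary on SAS surpassed these two articles? Are there no experts like SAS_DREAM, or are the experts simply busy?

Attached: A Glimpse of the SAS Language, by SAS_Dream » March 28, 2004, 00:15

My feeling is that the SAS language system is more sprawling than grand. Many language families that can fairly be called grand, such as the Microsoft family or today's Java family, mostly began with a fairly comprehensive architecture and then expanded gradually through orderly creation, inheritance and variation, so their language elements are related in fairly regular ways. The SAS family has architecture in places, but taken as a whole it mainly formed spontaneously, through more than 20 years of accumulation and inheritance. This is only natural: SAS applications sit close to the end user, where patterns vary endlessly, so it is hard to start with a comprehensive architecture, and a workable solution is enough. Many organized language families sit closer to the system level, where the scope is more concentrated and an architecture is easier to work out.

As a result, the martial skills and weapons of SASORs come from many schools and vary endlessly, but it is hard to find a grandmaster who has mastered all eighteen weapons (if anyone knows of one, please let us know, so that we can bathe, burn incense and go pay our respects).

My rough impression is that the SAS language family can be divided roughly as follows:

The national language: the Base family

This is the language every SASOR can speak, regardless of class or wealth. It includes the familiar Data Step, Proc Step and Macro. The basic language elements of SAS mainly evolved here. This language can be called a masterpiece among the procedural languages of the 1970s and 1980s, and it even carries a strong unstructured flavor. What is remarkable is that SAS, a technology-focused private company, has built Base for more than twenty years by inheriting and developing rather than by repeatedly discarding, so that some functions and procedures at the "advanced age" of twenty-plus years remain fresh and, with the distinctive appeal of procedural programming, still hold a place among today's dominant object-oriented languages.

The Data Step provides the conventions for interacting with data storage engines and can handle a large number of complex data and variable operations. Underneath, the Data Step is developed in C. The Proc Step has a twofold significance. First, it consolidates commonly used combinations of processing into fixed procedure calls, which improves either the writing of code or processing efficiency. Second, it set the conventions for many later SAS module languages: the PROC invocation format and statements such as CLASS, VAR and BY are widely used in statistical modules (such as Proc Reg), data access modules (such as Proc DBLoad), multidimensional modules (Proc MDDB), data sharing modules (such as Proc Server), and the shell commands of many GUI-driven modules (such as Proc Neural in EM). The Proc Step is developed with a combination of the Data Step and C.

Macro is the language mechanism in Base that enhances program flow control. Macro is not function encapsulation; its core idea is text substitution, similar to the mechanism of operating system shell scripts. Executing a macro therefore means first performing text substitution according to the macro definition, and then interpreting and executing the resulting program statements. So no function call stack is formed in memory as in other languages, and in macro development you cannot pass parameters by unwinding a call frame the way a function call does. Although this mechanism does not offer as much programming flexibility as function calls, text substitution involves no complex memory allocation and management, so even very complex macros are substituted efficiently, and memory management errors are less likely. Because the design of Macro contains many unstructured elements, the flow of a program needs extra care; otherwise readability easily suffers (in fact, feeling sick at the sight of % is a widespread phenomenon).

One procedure in Base deserves separate examination: Proc SQL. It implements compatibility with SQL and gives programmers who know SQL another option. As of the V8 series, the SQL used by Proc SQL is a SAS SQL superset based on the SQL92 standard, with a good deal of SAS-specific syntax. Whether SQL or the Data / Proc Step is more efficient for the same task can be discussed separately. In brief, by design SQL is a set-based language, while SAS is a record-based language, and SAS development of SQL and of the Data / Proc Step was not coordinated. In V6, many SQL operations were clearly less efficient than the Data / Proc Step. In V8, SQL improved markedly and in some cases outperforms the Data / Proc Step, but this has to be judged case by case. As data volume grows, Proc SQL is not as good as Data […]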