September 16, 2010
 
When we have a string like "9/01/2010 11:52:54 AM" and want to convert it to a numeric SAS datetime variable, we often use the SCAN function to pull out the pieces and assemble the datetime value. This is tedious. The SAS informats MDYAMPM and ANYDTDTM come to the rescue. Here is how it works.

data test;
    length date $25;
    date="9/01/2010 11:52:54 AM";
    *Convert the character string to SAS datetime value;
    datetimevar =input(date,mdyampm25.2);
    datetimevar1 =input(date,anydtdtm20.);
    *Apply format to the SAS date time value;
    format datetimevar datetimevar1 datetime19.;
run;

Result: 01SEP2010:11:52:54

*ANYDTDTM and MDYAMPM informats work together when the datetime value has AM PM specified or day, month, and year components are not ambiguous. The MDYAMPMw. format writes datetime values with separators in the form mm/dd/yy hh:mm AM PM, and requires a space between the date and the time. The ANYDTDTMw. informat reads datetime values with...
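For comparison, the manual SCAN-based approach mentioned above might look something like the following. This is only a sketch and not part of the original post: the variable names (date_c, time_c, ampm, d, t, dt) are made up for illustration, and it assumes the string always has exactly three space-separated pieces.

```sas
data _null_;
    length date $25 date_c time_c $12 ampm $2;
    date = "9/01/2010 11:52:54 AM";

    /* Split the string into its date, time, and AM/PM pieces */
    date_c = scan(date, 1, " ");
    time_c = scan(date, 2, " ");
    ampm   = scan(date, 3, " ");

    /* Convert each piece with its own informat */
    d = input(date_c, mmddyy10.);
    t = input(time_c, time8.);

    /* Adjust the 12-hour clock to a 24-hour clock */
    if ampm = "PM" and hour(t) < 12 then t = t + 12*3600;
    else if ampm = "AM" and hour(t) = 12 then t = t - 12*3600;

    /* Combine the date and the seconds since midnight into a datetime */
    dt = dhms(d, 0, 0, t);
    put dt= datetime19.;
run;
```

The single INPUT call with the MDYAMPM informat replaces all of these steps.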

[[ This is a content summary only. Visit my website for full links, other content, and more! ]]
 Posted at 10:35 PM

A 64-bit success story: JMP

 64-bit, JMP, SAS Enterprise Guide, Technology
September 16, 2010
 
In my post yesterday about the 64-bit hype and how client apps like SAS Enterprise Guide would see only a limited boost from a 64-bit version, I forgot to point out another offering from SAS that has embraced the 64-bit architecture: JMP. JMP offers a 64-bit version, and it makes a big difference. JMP is a desktop application, the same as SAS Enterprise Guide. But JMP performs most of the data access, analysis and computations in-process on your desktop, whereas SAS Enterprise Guide acts as a gateway to a SAS session where all of that work gets done. Customers often use SAS Enterprise Guide (and thus SAS) together with JMP. In SAS Enterprise Guide 4.3, we made that scenario a little bit easier by adding a Send To JMP feature. If you have JMP installed on the same machine as SAS Enterprise Guide, you can select any data source in your SAS Enterprise Guide project (for example, the result of a query step or a table from a database library) and send it to a new JMP session. From there, you can use the great data visualization features of JMP to gain even more insights about your data. (We prototyped that feature last year in a custom task: now it's baked into the product.)

Reading web page content with SAS

 Expected dataset, Question, Solution, Statement
September 16, 2010
 

Question from: 峥岩

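Use SAS to read the web page http://detail.zol.com.cn/cell_phone_index/subcate57_list_s528_1.html and extract the detailed specifications of each smartphone listed there, including phone name, series, operating system, network mode, main screen size, main screen colors, touch screen, camera, Bluetooth, and release date.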

Expected dataset: want


Solution 1:

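/* Point a fileref at the page through the URL access method */
filename readweb url %nrstr("http://detail.zol.com.cn/cell_phone_index/subcate57_list_s528_1.html") lrecl=60000;

data tmp;
	/* Treat > and < as delimiters so list input stops at tag boundaries */
	infile readweb lrecl=60000 dlm="><";
	retain flag 0;
	length title1 $ 20;
	/* flag=0 means a new phone starts: read its name first */
	if flag=0 then do;
		title="手机名称";
		input @'id="proName_' @'>' content :$50. @@;
		flag=1;
		num+1;
		if title="手机名称" then title1="name";
		output;
	end;
	/* Read the next parameter label */
	input @'<dd class="tit_new">' title :$50. @;
	/* The release date label is treated as the last parameter of a phone */
	if title="上市日期" then do;
		flag=0;
	end;
	/* Read the parameter value */
	input @'<dd class="con_new">' content :$50. @@;
	/* If the value is wrapped in a link, read the link text instead */
	if scan(content,1,"=")="a href" then input content :$50. @@;
	/* Map the Chinese labels on the page to English variable names */
	if title="所属系列" then title1="series";
	if title="操作系统" then title1="os";
	if title="手机类型" then title1="type";
	if title="网络模式" then title1="net";
	if title="主屏尺寸" then title1="screen_size";
	if title="主屏色彩" then title1="screen_color";
	if title="触摸屏:" then title1="screen_type";
	if title="摄像头像" then title1="camera";
	if title="蓝牙功能" then title1="bluetooth";
	if title="上市日期" then title1="date";
	output;
	drop flag;
run;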

proc transpose data=tmp out=want(drop=_name_);
	var content;
	by num;
	id title1;
run;
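The solution leans on the @'character-string' column pointer of the INPUT statement, which moves the pointer to just past the first occurrence of the given text. Here is a minimal sketch of that one technique on two in-line records instead of a live web page. The data set name demo, the variable id, and the sample HTML are made up for illustration.

```sas
data demo;
    /* Use the double quote as the delimiter so list input stops at the closing quote */
    infile datalines dlm='"' truncover;
    length id $10;
    /* Jump to just past the text id=" and read the value that follows */
    input @'id="' id $;
    datalines;
<div id="abc" class="x">
<div id="xyz" class="y">
;
run;

proc print data=demo;
run;
```

Choosing the delimiter to match the character that ends the wanted value is the same idea as dlm="><" in the solution, where values end at a tag boundary.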

Related material: FILENAME Statement, URL Access Method



September 14, 2010
 


An anonymous commenter expressed a desire to see how one might use SAS to draw a bubble plot with bubbles in three colors, corresponding to a fourth variable in the data set (x, y, z for bubble size, and the category variable). In previous entries we discussed bubble plots and showed how to make the bubbles print in two colors depending on a fourth, dichotomous variable.

The SAS approach used there cannot be extended to a fourth variable with more than two values, so we show here a different approach to generating this output. The R version below represents a trivial extension of the code demonstrated earlier.

SAS

We'll start by making some data-- 20 observations in each of 3 categories.

data testbubbles;
    do cat = 1 to 3;
        do i = 1 to 20;
            abscissa = normal(0);
            ordinate = normal(0);
            z = uniform(0);
            output;
        end;
    end;
run;

Our approach will be to make an annotate data set using the annotate macros (section 5.2). The %slice macro easily draws filled circles. For full details on the parameters it needs, check its documentation in the on-line help: SAS Products; SAS/GRAPH; The Annotate Facility; Annotate Dictionary. Here we note that the 5th parameter is the radius of the circle, chosen here as an arbitrary function of z that makes pleasingly sized circles. Other parameters reflect color density, arc, and starting angle, which could be used to represent additional variables.

%annomac;
data annobub1;
set testbubbles;
%system(2,2,3);
%slice(abscissa, ordinate, 0, 360, sqrt(3*z), green, ps, 0);
run;

Unfortunately, due to a quirk of the macro facility, I don't think the color can be changed conditionally in the preceding step. Instead, we need a new data step to do this.

data annobub2;
set annobub1;
if cat=2 then color="red";
if cat=3 then color="blue";
run;
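An alternative is to skip the macro and assign the Annotate variables for the PIE function directly, so that the color can depend on cat in a single data step. This is only a sketch. It assumes that the %slice call above does nothing more than fill in these standard Annotate variables, and it has not been compared against the macro's output. The data set name annobub3 is made up for illustration.

```sas
data annobub3;
    set testbubbles;
    length function color style $8;
    /* Same coordinate systems as %system(2,2,3): data values for x and y,
       percent of the graphics area for size */
    retain xsys '2' ysys '2' hsys '3';
    function = 'pie';
    x = abscissa;
    y = ordinate;
    angle = 0;            /* starting angle */
    rotate = 360;         /* sweep a full circle */
    size = sqrt(3*z);     /* radius, as in the %slice call */
    style = 'ps';         /* solid fill, as in the %slice call */
    line = 0;
    /* Pick the color from the category */
    if cat = 1 then color = 'green';
    else if cat = 2 then color = 'red';
    else color = 'blue';
run;
```

If this works as intended, annobub3 could be passed to the annotate= option in place of annobub2.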

Now we're ready to plot. We use the symbol (section 5.2.2) statement to tell proc gplot not to plot the data, add the annotate data set, and suppress the legend, as the default legend will not look correct here. An appropriate legend could be generated with a legend statement.

symbol1 i=none r=3;
proc gplot data=testbubbles;
plot ordinate * abscissa = cat / annotate = annobub2 nolegend;
run;
quit;

The resulting plot is shown above. Improved axes are demonstrated throughout the book and in many previous blog posts.

R

The R approach merely requires passing three colors to the bg option in the symbols() function. To mimic SAS, we'll start by defining some data, then generate the vector of colors needed.

cat = rep(c(1, 2, 3), each=20)
abscissa = rnorm(60)
ordinate = rnorm(60)
z = runif(60)
plotcolor = ifelse(cat==1, "green", ifelse(cat==2, "red", "blue"))

The nested calls to the ifelse function (section 1.11.2) allow vectorized conditional tests with more than two possibilities. Another option would be to use a for loop (section 1.11.1) but this would be avoiding one of the strengths of R. In this example, I suppose I could have defined the cat vector with the color values as well, and saved some keystrokes.

With the data generated and the color vector prepared, we need only call the symbols() function.

symbols(abscissa, ordinate, circles=z, inches=1/5, bg=plotcolor)

The resulting plot is shown below.

Coming of Age in Text Analytics

 Richard Foley
September 14, 2010
 
With the data mining conference, M2010, coming up on Oct 24, I am reflecting on the past year and thinking about how prominent Text Analytics has become.

I have been fortunate to attend some very good analytics conferences this year, and oddly none of them were specifically focused on Text Analytics. However, text analytics received a good deal of attention and many presentations at those conferences.

Predictive Analytics World
eMetrics Optimization Summit
KDD-2010

At many of these conferences, text analytics dominated the presentations. At all of the conferences, the text analytics papers were presented to packed rooms. All of these conferences had an extraordinary number of text analytics papers, demonstrating the large number of problems that text analytics solves:

•Social Media Analytics
•Fraud Detection
•Healthcare Automation
•Product Reliability
•Voice of Customer
•Enterprise Search
•And many more

Tom Davenport & Jeanne Harris, in their book “Competing on Analytics,” present a great story on how Honda is using text analytics to increase revenue, save money and mitigate brand risk.

I’ve seen companies recognize 4X lift when integrating textual data with their relational information, as opposed to using just relational data. Yet the general adoption of text analytics wasn’t there (if your company achieved a huge advantage over your competitors would you tell everyone your secret sauce?).

Every year Seth Grimes makes a prediction on the growth of the text analytics market. This year in his blog, Seth predicted the text analytics market could grow as much as 200%, a bold prediction indeed. With business users having the ability to put text analytics into their solutions, the growth could even be higher.

This is a very exciting time for us in the text analytics space. With papers covering Claims Management, Integrating Data Mining with Sentiment Analysis and more, how can I not be excited to hear the text analytics presentations at M2010 and discover how analytical companies are using unstructured data to gain a competitive advantage? What are you getting excited about in this marketspace?
September 13, 2010
 
How do marketers help to fight terrorism?

Today I experienced one way...run a top-notch conference that allows private sector banks, law enforcement and regulatory agencies to come together and collaborate with an objective to stop Terrorism Financing.

Just two days before the anniversary of the largest terrorist attack on US soil, the SAS campus hosted 400+ people from the US and Canadian Governments; commercial banks: Bank of America, RBC, Ally; federal and state law enforcement. All came together to discuss how to jointly prevent the financing of terrorist and criminal activities. Where else would you find these institutions, plus Senators from both sides of the aisle, encouraging collaboration toward a common goal? I was awestruck as I listened to the Chief Compliance Officer of Bank of America, Charles Bowman, state "we can make this world a safer place".

James A. Dinkins, Executive Associate Director of Homeland Security told stories of investigation of crimes on a scale I could not imagine. He ended his keynote with a message, again, to the commercial banks. "We cannot do it without you. While implementing the regulations are onerous, your efforts are making a difference."

We can and do make a difference, every day. Big thanks and kudos to our SAS Marketing team who made this collaborative effort possible.
September 12, 2010
 
Here are a few shortcuts you need to know to speed up code writing. These work in both EPG (Enterprise Guide) and the SAS Enhanced Editor. Remember that the keyboard shortcuts listed here are the defaults.

Selection operations:
1) Comment the section with line comments (/): press CTRL + /
2) Undo the comment: press CTRL + SHIFT + /
3) Convert selected text to lowercase: press CTRL + SHIFT + L
4) Convert selected text to uppercase: press CTRL + SHIFT + U
The pre-defined shortcuts CTRL + SHIFT + L and CTRL + SHIFT + U (only for the Enhanced Editor) convert all selected text to lowercase or uppercase respectively. They are very handy when we insert text by copy and paste.
5) Indent selected section: press TAB
6) Un-indent selected section: press SHIFT + TAB
7) Move the cursor to the matching DO/END statement: press ALT + [ or ALT + { or ALT+]...

 Posted at 8:04 PM