Book

August 1, 2022
 

Recently, the SAS Press team moved to a new building on the SAS campus. And when the SAS Press team moves, we bring a lot of books with us! Packing and organizing all of our books gave us a chance to appreciate all of our authors’ hard work during the more than 30 years that SAS Press has existed.

One author has an outsized presence on the SAS Press bookshelves – Ron Cody. He has written over a dozen books that include some of the most popular titles for new SAS users. He taught statistics at the Rutgers Robert Wood Johnson Medical School for many years and is a frequent presenter at SAS conferences.

Few people know more about SAS than Ron, which made him the perfect person to be our first SAS Press Author of the Month. During August, you can get special deals on all of Ron's books at RedShelf. Find out more information on his author page. Ron is also hosting a free webinar on data cleaning tips and tricks on August 11th.

We recently asked Ron to share a little bit about his journey as an author and teacher. As you might imagine, Ron has a lot of SAS knowledge and advice to share.

Ron Cody's books on a bookshelf.

The Cody section of the SAS Press bookshelf.

Q: When did you decide to write your first book?

I decided to write my first SAS book (Applied Statistics and the SAS Programming Language) in 1985. It was published by Elsevier Publishing, which was later bought by Prentice Hall. At the time I was writing the book, there were no other books out there—just SAS manuals. This book is still in print (in a fifth edition).

Q: What made you decide to keep writing more books about SAS?

Once I realized I could write well enough to get published, I got the "writer's bug." Although writing a book is hard work, the reward is substantial. My motivation is more about teaching people SAS and statistics than about the monetary reward. My goal in writing any book is to make enough money to take my wife to dinner!

Q: Is one of your books your favorite? Or do you love them all equally?

I do have some favorites. I would put Learning SAS by Example as one. It contains a section for beginners as well as later chapters that are useful to intermediate or even advanced SAS programmers. I particularly like my latest book, A Gentle Introduction to Statistics Using SAS Studio in the Cloud. I think I made statistical concepts accessible to non-mathematically minded people. There are only a few equations in the entire book.

Q: As a teacher, how did you encourage students who were having a hard time understanding statistics?

I try really hard to convince them that the statistical concepts really make sense, and it is sometimes the terminology that gets in the way.

Q: What do you think students struggle with the most when they are learning SAS?

I believe the overwhelming richness of SAS can intimidate a beginning programmer. That's why I start from simple examples and explain step-by-step how each program works.

Q: What is your best advice for someone who wants to learn SAS?

Buy all of my books, of course! Just kidding. There are many YouTube videos and other online resources that are useful. I think two of the best books would be The Little SAS Book, and either Learning SAS by Example (for the more serious student) or Getting Started with SAS Programming Using SAS Studio in the Cloud. The latter book is more suited to someone using SAS OnDemand for Academics.

Q: You recently published a memoir about your time as an EMT. How did it feel to reflect back on that time of your life? Any more memoirs in your future?

I thoroughly enjoyed writing a book about my years as a volunteer EMT (10-8 Awaiting Crew: Memories of a Volunteer EMT). I was fortunate that I had kept a journal and recorded details of some of the more interesting and exciting calls. As of now, I do not have another memoir in my future, but I am working on a nonfiction novel. I'm not sure how successful it will be, but I'm going to give it a try. I strive to write almost every day. I tell other beginning authors that if you spend a few hours writing every day, it becomes much easier. I call it "getting in the groove."


Ron Cody sitting in a chair with a cat in his lap. Ron is wearing a shirt that says "using Statistics to prove a point is just MEAN."

Ron and Dudley. Credit: Jan Cody

Thanks for talking to us, Ron! We hope you will join us in celebrating Ron's accomplishments as our Author of the Month. Check out his author page to see special deals on all of Ron’s books during August and to discover more resources.

 

Meet our SAS Press Author of the Month – Ron Cody was published on SAS Users.

July 12, 2022
 

SQL (Structured Query Language) is the most widely used programming language for relational databases worldwide. No other programming language produces more hits for a web search than SQL, and interest is growing rapidly: Google showed 135 million hits in June 2010 versus 586 million hits in June 2020.

SQL is the programming language to handle big data. Within a very short time after its invention, SQL developed into an industry quasi-standard. The reasons behind this rapid development include the fast spread of databases, proven performance, popularity among users, and the growing relevance of analysts and programmers. Most analyst positions expect at least knowledge of basic SQL capabilities.

For analysts and programmers, SQL is attractive because once acquired, SQL skills can be applied in basically all environments from finance to healthcare, and in all systems from open source to commercial. Because SQL is an ANSI standard and therefore relatively independent of any one manufacturer, what you learn to do in SQL in Oracle can be applied to SQL in SAS.

PROC SQL in SAS is powerful and versatile. It offers myriad possibilities for working with:

  • descriptive statistics
  • advanced macro programming
  • SAS datasets, views, or complex queries
  • special application areas like integrity constraints or performance tuning
  • thousands of SAS functions and SAS function calls

If you start learning PROC SQL, you will also acquire the basics of PROC FEDSQL, PROC DS2, and PROC CAS. And that will offer you a handy toolbox for SAS platforms and applications like SAS 9.4, SAS Viya, SAS Integration Studio, SAS Studio, Enterprise Guide, and many more.

Is that tempting enough to try your hand at PROC SQL? No? You want to see what you get first? I will show you four short programs that use the same SQL syntax in PROC SQL, PROC FEDSQL, PROC DS2, and PROC CAS. I’ll keep it simple just to prove the point. But have no fear, SQL can accomplish very advanced tasks. I’ve been involved in rewriting complex SQL programs that were thousands of lines long.

Let’s use PROC SQL as a springboard. From there, choose where you want to go.

Example 1: PROC SQL

proc sql;
   select REGION, SUBSIDIARY, SALES
   from work.shoes
      where SALES > 750000 ; 
quit;

With PROC FEDSQL, you can start working in the cloud environment of SAS Viya. Note that PROC FEDSQL is not always a one-to-one match for PROC SQL, even though this simple example might suggest that it is.

Example 2: PROC FEDSQL

proc fedsql;
   select REGION, SUBSIDIARY, SALES
   from work.shoes
      where SALES > 750000 ; 
quit;

DS2 allows you to speed up processing by using its built-in multi-threading capabilities.

Example 3: PROC DS2

proc ds2;
   data LEFT_RIGHT4 (overwrite=yes);
      method run();
         /* The SET statement reads the result of the embedded query */
         set {select LEFT.ID,
                     LEFT.A, RIGHT.F
              from work.LEFT, work.RIGHT
              where LEFT.ID = RIGHT.ID};
         output;
      end;
   enddata;
run;
quit;

PROC CAS enables you to take advantage of SAS Cloud Analytic Services (CAS).

Example 4: PROC CAS

proc CAS; 
session my_CAS_session ; 
  fedsql.execdirect 
  query=
   'select * 
    from 
    CASUSER.CAS_CLASS' ;
Quit ;

Notice that the SQL language elements like the SELECT statement are the same in each example. Once you have learned the basic syntax, you can use it in PROC FEDSQL, PROC DS2, and PROC CAS. And I am pretty sure there are more to come.
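To illustrate how far the same syntax carries outside SAS, here is a minimal Python sketch that runs the SELECT statement from Examples 1 and 2 against an in-memory SQLite database. The table contents are made-up rows for illustration only (they are not taken from any SAS sample data), and SQLite is simply a convenient stand-in for "some other SQL system."

```python
import sqlite3

# Hypothetical rows standing in for a shoes table; the values are invented
# for illustration and do not come from any SAS data set.
rows = [
    ("Africa", "Cairo", 800000),
    ("Asia", "Seoul", 120000),
    ("Canada", "Toronto", 910000),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table shoes (REGION text, SUBSIDIARY text, SALES real)")
conn.executemany("insert into shoes values (?, ?, ?)", rows)

# Same SELECT ... FROM ... WHERE as in Examples 1 and 2, minus the WORK libref.
result = conn.execute(
    "select REGION, SUBSIDIARY, SALES from shoes where SALES > 750000"
).fetchall()
conn.close()

for region, subsidiary, sales in result:
    print(region, subsidiary, sales)
```

Only the surrounding plumbing changes; the query text itself is what you would type inside PROC SQL.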

Why learn SQL? Because it’s a sustainable investment in your future. If you want to learn more about PROC SQL techniques, check out my book Advanced SQL with SAS®.

Why learn SQL? was published on SAS Users.

February 17, 2022
 

This blog serves two purposes: the main purpose is to show you some useful SAS coding techniques, and the second is to show you an interesting method of creating a Beale cipher.

TJ Beale is famous in Virginia for leaving behind three ciphers, supposedly describing the location of hidden gold and treasures. (Most cryptologists and historians believe the whole set of ciphers and treasure was a hoax.) In one of the ciphers, he used a method based on the Declaration of Independence. His coding method was as follows:

  • Get a copy of the Declaration of Independence and number each word.
  • Take the first letter of each word and form a list.
  • Associate each number with that letter.

For example, consider this text:

“Four score and seven years ago, our fathers brought forth upon this continent a new nation, conceived in liberty and dedicated to the proposition that all men are created equal.”

To create a Beale cipher, you would proceed as follows:

Four(1) score(2) and(3) seven(4) years(5) ago(6), our(7) fathers(8) brought(9) forth(10) upon(11) this(12) continent(13) a(14) new(15) nation(16), conceived(17) in(18) liberty(19) and(20) dedicated(21) to(22) the(23) proposition(24) that(25) all(26) men(27) are(28) created(29) equal(30).

Next, you would make a table like this:

Letter  Numbers
F       1,8,10 (all the numbers of words that begin with 'F')
S       2,4
A       3,6,14,20,26,28
Y       5
…and so on

 

You would then want to put the list in alphabetical order like this:

A       3,6,14,20,26,28
B       9
C       13,17,29
D       21
E       30
F       1,8,10
…and so on

 

To create your cipher, select any number at random from the list of numbers corresponding to the letter that you want to encode. The advantage of this over a simple substitution cipher is that the same letter can be encoded by many different numbers, so simple frequency analysis is of little help in guessing what letter a particular number represents.
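As a rough illustration of the whole idea (in Python rather than SAS, and not the program developed later in this blog), the sketch below builds the letter table from the sample sentence above and encodes a short message. The function names and the message "feast" are my own choices for the example; a letter that never starts a word in the text cannot be encoded and would raise a KeyError here.

```python
import random
from collections import defaultdict

TEXT = ("Four score and seven years ago, our fathers brought forth upon this "
        "continent a new nation, conceived in liberty and dedicated to the "
        "proposition that all men are created equal.")

def build_table(text):
    """Map each uppercase first letter to the 1-based numbers of its words."""
    table = defaultdict(list)
    for number, word in enumerate(text.split(), start=1):
        table[word[0].upper()].append(number)
    return dict(table)

def encode(message, table, rng=random):
    """Replace each letter with a randomly chosen word number; skip non-letters."""
    return [rng.choice(table[ch]) for ch in message.upper() if ch.isalpha()]

table = build_table(TEXT)
print(table["F"])               # [1, 8, 10]
print(encode("feast", table))   # five numbers; the choice varies from run to run
```

Because each number is drawn at random, encoding the same message twice will usually give two different ciphers.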

This blog explains how to create a Beale cipher; my next blog will explain how to decipher a Beale cipher.

You need to start out with a book or document that is accessible to the sender and recipient of the cipher. To offer some additional security, you could decide to start from a specific page in a book. For a simple demonstration of how to create a Beale cipher, I have entered part of the Declaration of Independence in a text file called Declare.txt.

A funny aside: I was teaching my Functions course in the UK, in a small town north of London on the Thames. One of the programs demonstrating several SAS character functions was the program I'm using here to demonstrate how to create a Beale cipher. I had completely forgotten that the document was the Declaration of Independence. Whoops! I asked the class, "I hope you're not still angry with us about that." Apparently not, and we all had a good laugh.

Back to the problem. I will break down the program into small steps and provide a partial listing of data sets along the way, so that you can see exactly how the program works. The first step is to read the text file, extract the first letter from each word, change the letter to uppercase, and associate each letter with the count of words in the text.

Here is the first part of the program.

data Beale;
   length Letter $ 1;  
   infile 'c:\Books\Blogs\Declare.txt'; 
   input Letter : $upcase1. @@; ❶
   N + 1; ❷	
   output;
run;
 
title "First Step in the Beale Cipher (first 10 observations)";
proc print data=Beale(obs=10) noobs;
run;

❶ By using the $UPCASE1. informat, you are selecting the first letter of each word and converting it to uppercase. If you are unfamiliar with the $UPCASEn. informat, it is similar to the $n. informat with the additional task of converting the character(s) to uppercase.

❷ You use a SUM statement to associate each letter with the word count.

Here is the listing from this first step:

Next, you need to sort the data set by Letter so that all the As, Bs, and so forth are grouped together.

proc sort data=Beale;
   by Letter;
run;
 
title "The list in sorted order (partial listing)";
proc print data=Beale(obs=10) noobs;
run;

Below is a partial listing of the sorted file:

Any of the numbers 24, 25, 27, and so forth can be used to code an 'A'.

The final step is to list all the letters from A to Z (Z is pronounced Zed in the UK and Canada) in a line, followed by all the possible numbers associated with each letter.

data Next;
   length List $ 40; ❸
   retain List; ❹
   set Beale;
   by Letter; ❺
   if first.Letter then List = ' '; ❻
   List = catx(',',List,N); ❼ 
   if last.Letter then output; ❽ 
run;
 
title "List of Beale Substitutions";
proc print data=next(obs=5) noobs;
   var Letter List;
run;

❸ The variable List will hold all the possible numbers that can be used to code any of the letters. In a real program with a longer source text, List might need a length greater than 40.

❹ You need to RETAIN this variable; otherwise, it would be set back to a missing value for each iteration of the DATA step.

❺ Following the SET statement with a BY statement creates the two temporary variables, First.Letter and Last.Letter. First.Letter is true when you are reading the first observation for each letter; Last.Letter is true when you are reading the last observation for a letter.

❻ For the first A, B, C, and so on, initialize the variable List to a missing value.

❼ Use the CATX function to concatenate all the numbers, separated by commas.

❽ When you are done reading the last A, B, C, and so on, output the string.
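If BY-group processing is new to you, a loose Python analogy may help. In this sketch (an analogy only, not a translation of the DATA step), sorting plays the role of PROC SORT, each group produced by itertools.groupby plays the role of one BY group, and joining the numbers with commas stands in for CATX. The six input pairs are what the first step would produce for the words "Four score and seven years ago."

```python
from itertools import groupby

# (Letter, N) pairs for "Four score and seven years ago"
beale = [("F", 1), ("S", 2), ("A", 3), ("S", 4), ("Y", 5), ("A", 6)]

# Like PROC SORT with BY Letter; the sort is stable, so N stays in
# ascending order within each letter.
beale.sort(key=lambda pair: pair[0])

# Each group corresponds to one BY group: it starts where First.Letter
# would be true and ends where Last.Letter would be true.
substitutions = [
    (letter, ",".join(str(n) for _, n in group))
    for letter, group in groupby(beale, key=lambda pair: pair[0])
]

print(substitutions)
# [('A', '3,6'), ('F', '1'), ('S', '2,4'), ('Y', '5')]
```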

Below are a few lines generated by this program:

For more information about the CATX function and other SAS functions, please take a look at my book, SAS Functions by Example, Second Edition.

Fun with Ciphers (Part 1) was published on SAS Users.

February 1, 2022
 

When we moved out to the country with our two dogs, our oldest dog Todd suddenly decided he liked to howl…. And he would do so every time we left the house. Maybe it was the country air? Maybe it was a time-lapse gene? Maybe he just wanted to learn something new?

If you’ve been using SAS Visual Analytics for a while, it’s possible you might have never created reports that linked to external URLs. SAS Visual Analytics can do so much on its own that perhaps you never thought about extending its functionality outside the product itself! Well, it’s time to learn a new trick.

To illustrate how this can be done (and to keep with the theme), let’s consider an example.

I’m interested in adding a new member to my family (a dog!), and I know I would like to adopt an animal in need. I’m not sure, however, which breed will suit my lifestyle. I need a dog that’s playful and sweet, but one that also likes to sleep late.

I have a report in SAS Visual Analytics that shows details about animals surrendered at an Austin animal shelter. I’d like to select various characteristics (like animal type, sex, whether the dog is spayed or neutered, and condition) and see what breeds they have available. Then, I’d like to see additional details about each breed at the American Kennel Club website (www.akc.org). On this website, you can find information about various breeds of dogs (and cats!), including average sizes, life expectancy, personality, and many other traits.

I’ll add an interactive link to the report, so when a user selects a specific breed, the page for that breed appears. The interactive link will use parameters to pass a selected value from the report to the web page.

To create interactive links, I like to follow four simple steps:

  1. Research the structure of the URL
  2. Use a hardcoded value to test the link
  3. Parameterize the link
  4. Test the parameterized value

Step 1: Research

Before adding interactive links to a report, you need to understand how the target web page structures the URL. I typically do this by accessing the target web page and searching for a specific subject. For some websites, you might need to view the Developer Guide for the website to fully understand the structure.

Typically, URLs are constructed in one of three ways:

  • Path: In these URLs, the subject is added at the end of the URL. For example, to view a country page on Wikipedia, you use the following URL: https://en.wikipedia.org/wiki/country where country is the full name of the country of interest.
  • Query: In these URLs, the subject is assigned to the value of a URL parameter using a sequence of attribute-value pairs: ?parameter1=value1&parameter2=value2. Multiple parameters can be assigned by separating the attribute-value pairs with an ampersand (&). For example, to search Etsy for a specific type of item, you use the following URL: https://www.etsy.com/search?q=item where item is the specific search string.
  • File: In these URLs, the subject is a part of a file name at the end of the URL. For example, to view a country profile on CIA Factbook, you use the following URL: https://www.cia.gov/library/publications/the-world-factbook/geos/country-code.html where country-code is the 2-letter abbreviation of the country of interest.
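The three structures above can be sketched in a few lines of Python. This is only an illustration of how the pieces fit together; the helper names are hypothetical, the sample subjects ("Canada", "dog collar", "ca") are my own, and nothing here checks whether the resulting pages actually exist.

```python
from urllib.parse import quote, urlencode

def path_url(base, subject):
    """Path style: the subject is appended as the last path segment."""
    return f"{base}/{quote(subject)}"

def query_url(base, **params):
    """Query style: attribute-value pairs after '?', joined by '&'."""
    return f"{base}?{urlencode(params)}"

def file_url(base, subject, extension="html"):
    """File style: the subject is part of a file name at the end of the URL."""
    return f"{base}/{quote(subject)}.{extension}"

print(path_url("https://en.wikipedia.org/wiki", "Canada"))
print(query_url("https://www.etsy.com/search", q="dog collar"))
print(file_url("https://www.cia.gov/library/publications/the-world-factbook/geos", "ca"))
```

Note that urlencode also takes care of characters, such as spaces, that cannot appear as-is in a query string.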

To start, I’ll select one of the breeds in the list: Australian Cattle Dog (my current dog’s breed). The American Kennel Club website has a drop-down selector at the top of the page where you can select a breed.

Australian cattle dogs are alert, curious, and pleasant. Tell me about it! He won’t let a leaf fly by outside without raising the alarm.

The URL is constructed with the breed as part of the path: https://www.akc.org/dog-breeds/australian-cattle-dog/. Notice that for breeds with multiple words (like Australian Cattle Dog), the link uses hyphens (-) instead of spaces.

Step 2: Hardcode

Now that you understand the structure of the URL, you can test the link using various hardcoded values. For example, to view details about dachshunds, go to https://www.akc.org/dog-breeds/dachshund/. Dachshunds are friendly, curious, and spunky. They must be! Why else would they be chosen to star in dog races at Oktoberfest celebrations around the world?

Step 3: Parameterize

After you have tested the URL using hardcoded values, you need to replace the hardcoded value with parameters. These are values that will be passed from your report to the external URL to make the links interactive. For the report, I’ll add the link to the word cloud and replace the hardcoded breed with the breed I select in the report.

Because the URL replaces spaces with hyphens, I have created a calculated item in SAS Visual Analytics, Breed (ForLink), that has breeds with multiple words separated by hyphens instead of spaces.
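The transformation behind such a calculated item is simple. The Python sketch below shows the idea only; it is not the SAS Visual Analytics expression itself. The function names are hypothetical, and the lowercasing and trailing slash are assumptions made to match the example URLs shown earlier.

```python
BASE_URL = "https://www.akc.org/dog-breeds/"

def breed_for_link(breed):
    """Lowercase the breed name and join its words with hyphens."""
    return "-".join(breed.lower().split())

def breed_url(breed):
    """Append the link-ready breed to the base URL (trailing slash assumed)."""
    return BASE_URL + breed_for_link(breed) + "/"

print(breed_url("Australian Cattle Dog"))
# https://www.akc.org/dog-breeds/australian-cattle-dog/
```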

Because I want to pass Breed (ForLink) from the word cloud to the URL, I need to add it to one of the roles for the word cloud. I don’t want the breeds to appear with hyphens in the word cloud, so I’ll add the calculated item to the Hidden role.

Tip: Data items assigned to the Hidden role are available for color-mapped display rules, external links, and mapping data sources. A data item should be assigned to the Hidden role only if doing so will not increase the number of rows in the query. In this example, the word cloud shows details about breeds. Adding Breed (ForLink) to the Hidden role makes the value available for the external link and does not increase the number of rows in the query.

Then, to add the link:

  1. With the word cloud selected, click Actions in the right pane and expand URL Links.
  2. Click New URL Link.
  3. Specify a descriptive name for the link.
  4. For the URL, enter the URL up to, but not including, the breed (https://www.akc.org/dog-breeds/); this value will be passed from the selected breed in the word cloud.
  5. Next to Parameters, click the Add icon.
  6. For the Source field, select Breed (ForLink) and leave the Target value blank. Adding nothing to the Target field indicates that the value of Breed (ForLink) should be appended to the end of the URL.

When a viewer selects a breed in the word cloud, the breed value will be appended to the end of the URL and details for that breed will be displayed.

Step 4: Test

After the interactive link has been created, you need to ensure that the link works by testing it in the report.

I’ll select both Cat and Dog as the type of animal, Male for sex, Yes for spayed or neutered, and Aged for condition. There are 143 animals in the Austin animal shelter that meet these criteria. I’m thinking a Labrador retriever might be good for my family, so I’ll double-click Labrador Retriever in the word cloud to see the traits and characteristics for that breed. It looks like Labrador retrievers are friendly, active, and outgoing, and they are also highly adaptable (meaning I can train them to sleep late). It sounds like a perfect fit!

For more information about how to add interactive links to your SAS Visual Analytics reports, including examples on creating links with different URL structures, check out my book Interactive Reports in SAS Visual Analytics: Advanced Features and Customization.

You can’t teach an old dog new tricks… or can you? was published on SAS Users.

January 11, 2022
 

Last year, I wrote a blog demonstrating how to use the %Auto_Outliers macro to automatically identify possible data errors. This blog demonstrates a different approach, one that is useful for variables for which you can identify reasonable ranges of values. For example, you would not expect resting heart rates below 40 or over 100 or adult heights below 45 inches or above 84 inches. Although values outside those ranges might be valid, it would be a good idea to check those out-of-range values to see if they are real or data errors.

In the third edition of my book, Cody's Data Cleaning Techniques, I present two macros: %Errors and %Report. These two macros provide a consolidated error report.

To demonstrate these two macros, I created a data set called Patients. A listing of the first 10 observations is shown below:

Notice that the unique identifier is called Patno, and you see three variables HR (heart rate), SBP (systolic blood pressure), and DBP (diastolic blood pressure).

The calling arguments for the %Errors macro are:

  • Var=     A variable for which you have pre-defined bounds
  • Low=     The lowest reasonable value for this variable
  • High=    The highest reasonable value for this variable
  • Missing= Error or Ignore. The default for this argument is IGNORE, but it is still good practice to include it in the call so that anyone reading your program understands how missing values are being handled.

Because you might be calling this macro for many variables, the values of two macro variables &Dsn (data set name) and &IDVar (the identifying variable such as Patno or Subj) are assigned values once, using two %Let statements. You can then call the %Errors macro for each variable of interest. When you are finished, call the %Report macro to see a consolidated report of your possible errors.

Here is an example:

/* Set values for &Dsn and &Idvar with %LET statements */
%let Dsn = Clean.Patients;   
%let Idvar = Patno;
 
%Errors(Var=HR, Low=40, High=100, Missing=error)
%Errors(Var=SBP, Low=80, High=200, Missing=ignore)
%Errors(Var=DBP, Low=60, High=120)   
 
/* When you are finished selecting variables, create the report */
%Report

You are reporting all heart rates below 40 or above 100 (and considering missing values as errors); values of SBP below 80 or above 200 (and ignoring missing values); and values of DBP below 60 or above 120 (also ignoring missing values—using the default value of IGNORE).

Here is the result:

Notice that several patients with missing values for HR are flagged as errors. I should point out that I have violated a cardinal rule of macro programming: never write a macro that has no arguments—it should be a program. However, I liked the idea of calling %Errors and then %Report. Shown below are the two macros:

/****************************************************************
| PROGRAM NAME: ERRORS.SAS  in c:\Books\Cleans\Patients         |
| PURPOSE: Accumulates errors for numeric variables in a SAS    |
|          data set for later reporting.                        |
|          This macro can be called several times with a        |
|          different variable each time. The resulting errors   |
|          are accumulated in a temporary SAS data set called   |
|          Errors.                                              |
| ARGUMENTS: Dsn=    - SAS data set name (assigned with a %LET) |
|            Idvar=  - Id variable (assigned with a %LET)       |
|                                                               |
|            Var     = The variable name to test                |
|            Low     = Lowest valid value                       |
|            High    = Highest valid value                      |
|            Missing = IGNORE (default) Ignore missing values   |
|                      ERROR Missing values flagged as errors   |
|                                                               |
| EXAMPLE: %let Dsn = Clean.Patients;                           |
|          %let Idvar = Patno;                                  |
|                                                               |
|          %Errors(Var=HR, Low=40, High=100, Missing=error)     |
|          %Errors(Var=SBP, Low=80, High=200, Missing=ignore)   |
|          %Errors(Var=DBP, Low=60, High=120)                   |
|          Test the numeric variables HR, SBP, and DBP in data  |
|          set Clean.patients for data outside the ranges       |
|          40 to 100, 80 to 200, and 60 to 120 respectively.    |
|          The ID variable is PATNO and missing values are to   |
|          be flagged as invalid for HR but not for SBP or DBP. |
****************************************************************/
%macro Errors(Var=,    /* Variable to test     */
              Low=,    /* Low value            */
              High=,   /* High value           */
              Missing=IGNORE 
                       /* How to treat missing values         */
                       /* Ignore is the default.  To flag     */
                       /* missing values as errors set        */
                       /* Missing=error                       */);
   data Tmp;
      set &Dsn(keep=&Idvar &Var);
      length Reason $ 10 Variable $ 32;
      Variable = "&Var";
      Value = &Var;
      if &Var lt &Low and not missing(&Var) then do;
         Reason='Low';
         output;
      end;

      %if %upcase(&Missing) ne IGNORE %then %do;
         else if missing(&Var) then do;
            Reason='Missing';
            output;
         end;
      %end;

      else if &Var gt &High then do;
         Reason='High';
         output;
      end;
      drop &Var;
   run;
 
   proc append base=Errors data=Tmp;
   run;
 
%mend Errors;

The basic idea for the %Errors macro is to test each variable and write every possible error to a data set called Tmp, which PROC APPEND then adds to a data set called Errors. The first time the macro is called, the data set Errors does not exist, so PROC APPEND creates it. From then on, each observation in data set Tmp is added to data set Errors.
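
If you have not used PROC APPEND this way before, the following minimal sketch shows the same create-then-add behavior in isolation. The data set names (Batch1, Batch2, Demo_Errors) and the values are made up for illustration, and the sketch assumes that Demo_Errors does not already exist in the Work library.

```sas
/* Two made-up batches with identical structure */
data Batch1;
   length Reason $ 10;
   Patno = 1;
   Reason = 'Low';
run;

data Batch2;
   length Reason $ 10;
   Patno = 2;
   Reason = 'High';
run;

/* Demo_Errors does not exist yet (assumption), so PROC APPEND */
/* creates it with the structure of Batch1                     */
proc append base=Demo_Errors data=Batch1;
run;

/* Demo_Errors now exists, so the observation in Batch2 is added */
proc append base=Demo_Errors data=Batch2;
run;

proc print data=Demo_Errors;
run;
```

Because both batches define Reason with the same length, the second append does not need the FORCE option. The %Errors macro gets the same guarantee from its LENGTH statement.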

Most of this macro is straightforward. For those readers who are not that comfortable with macro programming, the section beginning with %if %upcase(&Missing) is included in the DATA step only when the value of the macro variable &Missing is not equal to IGNORE. Because of the %upcase function, the comparison works no matter how you type the value (ignore, Ignore, or IGNORE).
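
One way to see what the conditional section does is to write out the DATA step that a single call generates. The sketch below is my hand expansion of %Errors(Var=HR, Low=40, High=100, Missing=error), assuming &Dsn is Clean.Patients and &Idvar is Patno as in the example above; it needs the Clean libref and the Patients data set to run. Submitting options mprint; before calling the macro writes the generated code to the log so that you can compare.

```sas
/* Hand expansion of %Errors(Var=HR, Low=40, High=100, Missing=error) */
/* Assumes &Dsn=Clean.Patients and &Idvar=Patno                       */
data Tmp;
   set Clean.Patients(keep=Patno HR);
   length Reason $ 10 Variable $ 32;
   Variable = "HR";
   Value = HR;
   if HR lt 40 and not missing(HR) then do;
      Reason='Low';
      output;
   end;

   /* This ELSE IF block is generated only because Missing is not IGNORE */
   else if missing(HR) then do;
      Reason='Missing';
      output;
   end;

   else if HR gt 100 then do;
      Reason='High';
      output;
   end;
   drop HR;
run;
```

With Missing=ignore (or with the argument omitted), the middle block is simply not generated, and a missing value falls through both remaining tests without being written to Tmp.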

Below is a listing of the %Report macro:

%macro Report;
   proc sort data=Errors;
      by &Idvar;
   run;
 
   proc print data=Errors;
      title "Error Report for Data Set &Dsn";
      id &Idvar;
      var Variable Value Reason;
   run;
 
   proc delete data=Errors Tmp;
   run;
 
%mend Report;

The %Report macro is mainly a PROC PRINT of the temporary data set Errors. I added PROC DELETE to delete the two temporary data sets Errors and Tmp.

You can cut and paste these macros, or you can download all of the macros, programs, and data sets from Cody's Data Cleaning Techniques Using SAS®, Third Edition: go to support.sas.com/Cody, search for the book, and then click Example Code and Data. You do not have to buy the book to download all the files (although I would be delighted if you did). This is true for all of my books published by SAS Press.

Comments and/or corrections are always welcome.

Two macros for detecting data errors was published on SAS Users.

September 13, 2021
 

The Day of the Programmer is not enough time to celebrate our favorite code-creators. That’s why at SAS, we celebrate an entire week with SAS Programmer Week! If you want to extend the fun and learning of SAS Programmer Week year-round, SAS Press is here to support you with books for programmers at every level.

2021 has been a big year for learning, so we wanted to share the six most popular books for programmers this year. There are some old favorites on this list as well as some brand-new books on a variety of topics. Check out the list below, and see what your fellow programmers are reading this year!

  1. Little SAS Book: A Primer, Sixth Edition

This book is at the top of almost every list of recommended books for anyone who wants to learn SAS. And for good reason! It breaks down the basics of SAS into easy-to-understand chunks with tons of practice questions. If you are new to SAS or are interested in getting your basic certification, this is the book for you.

  2. Learning SAS by Example: A Programmer’s Guide, Second Edition

Whether you are learning SAS for the first time or just need a quick refresher on a single topic, this book is well-organized so that you can read start to finish or skip to your topic of interest. Filled with real-world examples, this is a book that should be on every SAS programmer’s bookshelf!

  3. Text Mining and Analysis: Practical Methods, Examples, and Case Studies Using SAS

If you work with big data, then you probably work with a lot of text. The third book on our list is for anyone who handles unstructured data. This book focuses on practical solutions to real-life problems. You’ll learn how to collect, cleanse, organize, categorize, explore, analyze, and interpret your data.

  4. End-to-End Data Science with SAS: A Hands-On Programming Guide

This book offers a step-by-step explanation of how to create machine learning models for any industry. If you want to learn how to think like a data scientist, wrangle messy code, choose a model, and evaluate models in SAS, then this book has the information that you need to be a successful data scientist.

  5. Cody's Data Cleaning Techniques Using SAS, Third Edition

Every programmer knows that garbage in = garbage out. Take out the trash with this indispensable guide to cleaning your data. You’ll learn how to find and correct errors and develop techniques for correcting data errors.

  6. SAS Graphics for Clinical Trials by Example

If you are a programmer who works in the health care and life sciences industry and want to create visually appealing graphs using SAS, then this book is designed specifically for you. You’ll learn how to create a wide range of graphs using Graph Template Language (GTL) and statistical graphics procedures to solve even the most challenging clinical graph problems.

An honorable mention also goes to the SAS Certification Guides. They are a great way to study for the certification exams for the SAS Certified Specialist: Base Programming and SAS Certified Professional: Advanced Programming credentials.

We have many books available to support you as you develop your programming skills – and some of them are free! Browse all our available titles today.

Top Books for SAS Programmers was published on SAS Users.

December 14, 2020
 

Do you need to see how long patients have been treated for? Would you like to know if a patient’s dose has changed, or if the patient experienced any dose interruptions? If so, you can use a Napoleon plot, also known as a swimmer plot, in conjunction with your exposure data set to find your answers. We demonstrate how to find the answer in our recent book SAS® Graphics for Clinical Trials by Example.

You may be wondering what a Napoleon plot is. Have you ever heard of the map of Napoleon’s Russian campaign? It was a map that displayed six types of data, such as troop movement, temperature, latitude, and longitude on one graph (Wikipedia). In the clinical setting, we try to mimic this approach by displaying several different types of safety data on one graph: hence, the name “Napoleon plot.” The plot is also known as a swimmer plot because each patient has a row in which their data is displayed, and the rows look like swimming lanes.

Code

Now that you know what a Napoleon plot is, how do you produce it? In essence, you are merely writing GTL code to produce the graph you need. The key GTL statements used to generate a Napoleon plot are DISCRETEATTRMAP, HIGHLOWPLOT, SCATTERPLOT, and DISCRETELEGEND. Other plot statements are used as well, but these four are typically used for all Napoleon plots. In our recent book, one of the chapters carefully walks you through each step to show you how to produce the Napoleon plot. Program 1, below, gives a small teaser of some of the code used to produce the Napoleon plot.

Program 1: Code for Napoleon Plot That Highlights Dose Interruptions

         discreteattrmap name = "Dose_Group";
            value "54" / fillattrs = (color = orange)
                         lineattrs = (color = orange pattern = solid);
            value "81" / fillattrs = (color = red)
                         lineattrs = (color = red pattern = solid);
         enddiscreteattrmap;

         discreteattrvar attrvar = id_dose_group var = exdose attrmap = "Dose_Group";

         legenditem type = marker name = "54_marker" /
            markerattrs = (symbol = squarefilled color = orange)
            label = "Xan 54mg";

         < Other legenditem statements >

         layout overlay / yaxisopts = (type = discrete
                                       display = (line label)
                                       label = "Patient");

            highlowplot y = number
                        high = eval(aendy/30.4375)
                        low = eval(astdy/30.4375) /
               group = id_dose_group
               type = bar
               lineattrs = graphoutlines
               barwidth = 0.2;
            scatterplot y = number x = eval((max_aendy + 10)/30.4375) /
               markerattrs = (symbol = completed size = 12px);
            discretelegend "54_marker" "81_marker" "completed_marker" /
               type = marker
               autoalign = (bottomright) across = 1
               location = inside title = "Dose";
         endlayout;
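
Program 1 is an excerpt, so it does not run on its own. As a rough sketch of how statements like these fit together, the stripped-down program below wraps a HIGHLOWPLOT in PROC TEMPLATE and renders it with PROC SGRENDER. It is not the book's program: the data set exposure, its values, the template name swimmer_sketch, and the x-axis label are all made up for illustration, and the legend is driven directly by the grouped bars rather than by LEGENDITEM statements.

```sas
/* Made-up exposure records, for illustration only.          */
/* Patient 1 has a gap between study days 90 and 100, which  */
/* shows up as white space between the two bars.             */
data exposure;
   input number exdose astdy aendy;
   datalines;
1 54 1 90
1 81 100 250
2 54 1 180
3 81 1 60
;
run;

proc template;
   define statgraph swimmer_sketch;   /* hypothetical template name */
      begingraph;
         /* Map each dose value to a fixed color */
         discreteattrmap name = "Dose_Group";
            value "54" / fillattrs = (color = orange)
                         lineattrs = (color = orange pattern = solid);
            value "81" / fillattrs = (color = red)
                         lineattrs = (color = red pattern = solid);
         enddiscreteattrmap;

         discreteattrvar attrvar = id_dose_group var = exdose
                         attrmap = "Dose_Group";

         layout overlay / yaxisopts = (type = discrete label = "Patient")
                          xaxisopts = (label = "Months on Treatment");
            /* One bar per exposure record, converted from days to months */
            highlowplot y = number
                        high = eval(aendy/30.4375)
                        low = eval(astdy/30.4375) /
               group = id_dose_group
               type = bar
               barwidth = 0.2
               name = "dose";
            discretelegend "dose" / title = "Dose";
         endlayout;
      endgraph;
   end;
run;

proc sgrender data = exposure template = swimmer_sketch;
run;
```

The attribute map is what keeps the colors stable: a dose of 54 is drawn in orange and a dose of 81 in red regardless of the order in which the values appear in the data.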

Output

Without further ado, Output 1 shows you an example of a Napoleon plot. You can see that there are many patients, and so the patient labels have been suppressed. You also see that the patient who has been on the study the longest has a dose delay indicated by the white space between the red and orange bars. While this example illustrates a simple Napoleon plot with only two types of data, dose exposure and treatment, the book has more complex examples of swimmer plots.

Output 1: Napoleon Plot that Highlights Dose Interruptions

Napoleon plot with orange and red bars showing dose exposure and treatment

How to create a Napoleon plot with Graph Template Language (GTL) was published on SAS Users.