Susan Slaughter

7月 302020
 

With COVID-19 spreading worldwide, accurate data have become more important than ever. In this blog, I share some of my favorite sources:

The Economist Reported deaths often underestimate actual deaths. One way to get at the real numbers is to compare total deaths from all causes versus the typical death rate. This “mortality tracker” plots excess deaths which is a more reliable measure than reported deaths.

Johns Hopkins University This interactive dashboard by the Coronavirus Resource Center at the Bloomberg School of Public Health shows detailed data about the pandemic worldwide.

National Public Radio These interactive graphics by NPR focus on the pandemic in the US.

SAS Institute This interactive dashboard gives a different look at global COVID-19 data.

Avi Schiffmann This webpage may be the most impressive effort by an individual person, and shows that tabular data can be profoundly thought-provoking too.

These articles are also highly recommended:

The Risks–Know Them–Avoid Them This article explains in plain language how COVID-19 spreads and how to keep yourself safe. Share this with your family.

COVID-19 Superspreader Events in 28 Countries: Critical Patterns and Lessons This fascinating article compiles data about superspreader events (SSEs) and reveals a lot about how this virus is spread.

Temporary reduction in daily global CO2 emissions during the COVID-19 forced confinement Finally, something positive: an article about the reduction in CO2 emissions due to the pandemic.

Knowledge is power. Working together we can all stay healthy.

6月 112020
 

Whether you enjoy debugging or hate it, for programmers, debugging is a fact of life. It’s easy to misspell a keyword, scramble your array subscripts, or (heaven forbid!) forget a semicolon. That’s why we include a chapter on debugging in The Little SAS® Book and its companion book, Exercises and Projects for the Little SAS® Book. We believe that learning to debug makes you a better programmer. Once you understand a bug, you will be better prepared to avoid it in the future.

To help hone your debugging skills, here is an example of the type of problems you can find in our book of exercises and projects. See if you can find the bugs.

Programming exercise

  1. A friend tells you that she is learning SAS and wrote the following program. Unfortunately, the program won’t run. Help her improve her programming skills by finding the mistakes.

TITLE Height, Weight, and BMI;
TITLE2 by Sex and Age Group;
PROC CONTENT DATA = SASHELP.class; RUN;
DATA; SET SASHELP.class;
Height_m = Heigth * 0.0254;
Weight_kg = Weight * 0.4536;
BMI = Weight_kg / Height_m**2;
PROC FORMAT; VALUE
$sex 'M' = 'Boys' 'F' = 'Girls';
VALUE agegp 11-12 = 'Preteens
13-16 = 'Teens';
PROC TABULATE;
CLASS Sex Age; VAR Height_m Weight_kg;
TABLES (Height_m Weight_kg BMI)*
MEAN, Sex Age ALL;
FORMAT Sex $sex. Age agegp.;
RUN;
QUIT;

 

  • a. Examine the SAS data set SASHELP.CLASS including variable attributes.
  • b. Clean up the formatting of the program by adding appropriate indention and line spacing to show the structure of the DATA and PROC steps. Make changes as needed to make the program conform to standard best practices.
  • c. Fix any errors in the code so that the program will run correctly.
  • d. Add comments to the revised program for each bug that you fix so that your friend can understand her mistakes.

Solution

In the book, we provide solutions for odd-numbered multiple choice and short answer questions, and hints for the programming exercises. So here is a hint for this exercise:

  1. Hint: This program contains four bugs. It also contains “red herrings” that are unusual for SAS code, but nonetheless do run properly and so are not actual bugs. Be sure you know how SAS handles data set names by default. SAS Enterprise Guide can format code for you; right-click the Program window and select Format Code from the pop-up menu. To format code in SAS Studio, click the Format Code icon at the top of Program window.

For more about The Little SAS Book and its companion book of exercises and projects, check out these blogs:

What's wrong with this code? was published on SAS Users.

3月 272020
 

Six editions is a lot! If you had told us, back when we wrote the first edition of The Little SAS Book, that someday we would write a sixth; we would have wondered how we could possibly find that much to say. After all, it is supposed to be The Little SAS Book, isn’t it? But the developers at SAS Institute are constantly hard at work inventing new and better ways of analyzing and visualizing data. And some of those ways turn out to be so fundamental that they belong even in a little book about SAS.

Interface independence

One of the biggest changes to SAS software in recent years is the proliferation of interfaces. SAS programmers have more choices than ever before. Previous editions contained some sections specific to the SAS windowing environment (also called Display Manager). We wrote this edition for all SAS programmers whether you use SAS Studio, SAS Enterprise Guide, the SAS windowing environment, or run in batch. That sounds easy, but it wasn’t. There are differences in how SAS behaves with different interfaces, and these differences can be very fundamental. In particular, the system option that sets the rules for names of variables varies depending on how you run SAS. So old sections had to be rewritten, and we added a whole new section showing how to use variable names containing blanks and special characters.

New ways to read and write Microsoft Excel files

Previous editions already covered how to read and write Microsoft Excel files, but SAS developers have created new ways that are even better. This edition contains new sections about the XLSX LIBNAME engine and the ODS EXCEL destination.

More PROC SQL

From the very first edition, The Little SAS Book always covered PROC SQL. But it was in an appendix and over time we noticed that most people ignore appendices. So for this edition, we removed the appendix and added new sections on using PROC SQL to

  • Subset your data
  • Join data sets
  • Add summary statistics to a data set
  • Create macro variables with the INTO clause

For people who are new to SQL, these sections provide a good introduction; for people who already know SQL, they provide a model of how to leverage SQL in your SAS programs.

Updates and additions throughout the book

Almost every section in this edition has been changed in some way. We added new options, made sure everything is up-to-date, and ran every example in every SAS interface noting any differences. For example, PROC SGPLOT has some new options, the default ODS style for PDF has changed, and the LISTING destination behaves differently in different interfaces. Here’s a short list, in no particular order, of new or expanded topics in the sixth edition:

  • More examples with permanent SAS data sets, CSV files, or tab-delimited files
  • More log notes throughout the book showing what to look for
  • LIKE or sounds-like (=*) operators in WHERE statements
  • CROSSLIST, NOCUM, and NOPRINT options in PROC FREQ
  • Grouping data with a user-defined format and the PUT function
  • Iterative DO groups
  • DO WHILE and DO UNTIL statements
  • %DO statements

Even though we have added a lot to this edition, it is still a little book.  In fact, this edition is shorter than the last—by twelve pages! We think this is the best edition yet.

3月 132020
 

If you have been using SAS for long, you have probably noticed that there is generally more than one way to do anything. The Little SAS Book has long covered reading and writing Microsoft Excel files with the IMPORT and EXPORT procedures, but for the Sixth Edition we decided it was time to add two more ways: The ODS EXCEL destination makes it easy to convert procedure results into Excel files, while the XLSX LIBNAME engine allows you to access Excel files as if they were SAS data sets.

With the XLSX LIBNAME engine, you can convert an Excel file to a SAS data set (or vice versa) if you want to, but you can also access an Excel file directly without the need for a SAS data set. This engine works for files created using any version of Microsoft Excel 2007 or later in the Windows or UNIX operating environments. You must have SAS 9.4M2 or higher and SAS/ACCESS Interface to PC Files software. A nice thing about this engine is that it works with any combination of 32 bit and 64 bit systems.

The XLSX LIBNAME engine uses the first line in your file for the variable names, scans each full column to determine the variable type (character or numeric), assigns lengths to character variables, and recognizes dates, and numeric values containing commas or dollar signs. While the XLSX LIBNAME engine does not offer many options, because you are using an Excel file like a SAS data set, you can use some standard data set options. For example, you can use the RENAME= data set option to change the names of variables, and FIRSTOBS= and OBS= to select a subset of rows.

Reading an Excel file as is 

Suppose you have the following Excel file containing data about magnolia trees:


With the XLSX LIBNAME engine, SAS can read the file, without first converting it to a SAS data set. Here is a PROC PRINT that prints the data directly from the Excel file.

* Read Excel spreadsheet XLSX LIBNAME;
LIBNAME exfiles XLSX ‘c:\MyExcel\Trees.xlsx’;
PROC PRINT DATA = exfiles.sheet1;
   TITLE ‘PROC PRINT of Excel File’;
RUN;

Here are the results of the PROC PRINT. Notice that the variable names were taken from the first row in the file.

Converting an Excel file to a SAS data set 

If you want to convert an Excel file to a SAS data set, you can do that too. Here is a DATA step that reads the Excel file. The RENAME= data set option changes the variable name MaxHeight to MaxHeightFeet. Then a new variable is computed which is equal to the height in meters.

* Import Excel into a SAS data set;
DATA magnolia;
SET exfiles.sheet1 (RENAME = (MaxHeight = MaxHeightFeet));
MaxHeightMeters = ROUND(MaxHeightFeet * 0.3048);
RUN;

Here is the SAS data set with the renamed and new variables:

Writing to an Excel file 

It is just as easy to write to an Excel file as it is to read from it.

* Write a new sheet to the Excel file;
DATA exfiles.trees;
   SET magnolia;
RUN;
LIBNAME exfiles CLEAR;

Here is what the Excel file looks like with the new sheet. Notice that the new tab is labeled with the name of the SAS data set TREES.

Another nice thing about the XLSX LIBNAME is that it only locks a spreadsheet while SAS is accessing it. So generally speaking, it’s not necessary to issue a second LIBNAME statement to clear the libref. However, I did find, when I ran this in SAS Enterprise Guide, that I could not open the Excel spreadsheet unless I cleared the libref. So you can probably skip the LIBNAME CLEAR statement if you are using Display Manager or SAS Studio.

The XLSX LIBNAME engine is so flexible and easy to use that we think it’s a great addition to any SAS programmer’s skill set.

For more about the XLSX LIBNAME engine, I recommend this blog by Chris Hemedinger.

3月 102020
 

If you have been using SAS for long, you have probably noticed that there is generally more than one way to do anything. (For an example, see my co-author Lora Delwiche’s blog about PROC SQL.) The Little SAS Book has long covered reading and writing Microsoft Excel files with the IMPORT and EXPORT procedures, but for the Sixth Edition, we decided it was time to add two more ways: The ODS EXCEL destination makes it easy to convert procedure results into Excel files, while the XLSX LIBNAME engine allows you to access Excel files as if they were SAS data sets.

With the XLSX LIBNAME engine, you can convert an Excel file to a SAS data set (or vice versa) if you want to, but you can also access an Excel file directly without the need for a SAS data set. This engine works for files created using any version of Microsoft Excel 2007 or later in the Windows or UNIX operating environments. You must have SAS 9.4M2 or higher and SAS/ACCESS Interface to PC Files software. A nice thing about this engine is that it works with any combination of 32-bit and 64-bit systems.

The XLSX LIBNAME engine uses the first line in your file for the variable names, scans each full column to determine the variable type (character or numeric), assigns lengths to character variables, and recognizes dates, and numeric values containing commas or dollar signs. While the XLSX LIBNAME engine does not offer many options, because you are using an Excel file like a SAS data set, you can use many standard data set options. For example, you can use the RENAME= data set option to change the names of variables, and FIRSTOBS= and OBS= to select a subset of rows.

Reading an Excel file as is 

Suppose you have the following Excel file containing data about magnolia trees:

With the XLSX LIBNAME engine, SAS can read the file, without first converting it to a SAS data set. Here is a PROC PRINT that prints the data directly from the Excel file.

* Read an Excel spreadsheet using XLSX LIBNAME;
LIBNAME exfiles XLSX 'c:\MyExcel\Trees.xlsx';

PROC PRINT DATA = exfiles.sheet1;
   TITLE 'PROC PRINT of Excel File';
RUN;

Here are the results of the PROC PRINT. Notice that the variable names were taken from the first row in the file.

PROC PRINT of Excel File

Converting an Excel file to a SAS data set 

If you want to convert an Excel file to a SAS data set, you can do that too. Here is a DATA step that reads the Excel file. The RENAME= data set option changes the variable name MaxHeight to MaxHeightFeet. Then a new variable is computed which is equal to the height in meters.

* Import Excel into a SAS data set and compute height in meters;
DATA magnolia;
   SET exfiles.sheet1 (RENAME = (MaxHeight = MaxHeightFeet));
   MaxHeightMeters = ROUND(MaxHeightFeet * 0.3048);
RUN;

Here is the SAS data set with the renamed and new variables:


Writing to an Excel file 

It is just as easy to write to an Excel file as it is to read from it.

* Write a new sheet to the Excel file;
DATA exfiles.trees;
   SET magnolia;
RUN;
LIBNAME exfiles CLEAR;

Here is what the Excel file looks like with the new sheet. Notice that the new tab is labeled with the name of the SAS data set TREES.

The XLSX LIBNAME engine is so flexible and easy to use that we think it’s a great addition to any SAS programmer’s skill set.

To learn more about the content in The Little SAS Book, check out the free book excerpt.  To see up-and-coming titles and get exclusive discounts, make sure to subscribe to the SAS Books newsletter.

Accessing Excel files using LIBNAME XLSX was published on SAS Users.

11月 082019
 

Six editions is a lot! If you had told us, back when we wrote the first edition of The Little SAS Book, that someday we would write a sixth; we would have wondered how we could possibly find that much to say. After all, it is supposed to be The Little SAS Book, isn’t it? But the developers at SAS Institute are constantly hard at work inventing new and better ways of analyzing and visualizing data. And some of those ways turn out to be so fundamental that they belong even in a little book about SAS.

Interface independence

One of the biggest changes to SAS software in recent years is the proliferation of interfaces. SAS programmers have more choices than ever before. Previous editions contained some sections specific to the SAS windowing environment (also called Display Manager). We wrote this edition for all SAS programmers whether you use SAS Studio, SAS Enterprise Guide, the SAS windowing environment, or run in batch. That sounds easy, but it wasn’t. There are differences in how SAS behaves with different interfaces, and these differences can be very fundamental. In particular, the system option that sets the rules for names of variables varies depending on how you run SAS. So old sections had to be rewritten, and we added a whole new section showing how to use variable names containing blanks and special characters.

New ways to read and write Microsoft Excel files

Previous editions already covered how to read and write Microsoft Excel files, but SAS developers have created some great new ways. This edition contains new sections about the XLSX LIBNAME engine and the ODS EXCEL destination.

More PROC SQL

From the very first edition, The Little SAS Book always covered PROC SQL. But it was in an appendix and over time we noticed that most people ignore appendices. So for this edition, we removed the appendix and added new sections on using PROC SQL to

  • Subset your data
  • Join data sets
  • Add summary statistics to a data set
  • Create macro variables with the INTO clause

For people who are new to SQL, these sections provide a good introduction; for people who already know SQL, they provide a model of how to leverage SQL in your SAS programs.

Updates and additions throughout the book

Almost every section in this edition has been changed in some way. We added new options, made sure everything is up-to-date, and ran every example in every SAS interface noting any differences. For example, PROC SGPLOT has some new options, the default ODS style for PDF has changed, and the LISTING destination behaves differently in different interfaces. Here’s a short list, in no particular order, of new or expanded topics in the sixth edition:

  • More examples with permanent SAS data sets, CSV files, or tab-delimited files
  • More log notes throughout the book showing what to look for
  • LIKE or sounds-like (=*) operators in WHERE statements
  • CROSSLIST, NOCUM, and NOPRINT options in PROC FREQ
  • Grouping data with a user-defined format and the PUT function
  • Iterative DO groups
  • DO WHILE and DO UNTIL statements
  • %DO statements

Even though we have added a lot to this edition, it is still a little book.  In fact, this edition is shorter than the last—by twelve pages! We think this is the best edition yet.

10月 222019
 

I am excited to announce that the sixth edition of The Little SAS Book is now available. We spent over a year rewriting and updating, and this may well be the best edition yet.

You can download a sample chapter or purchase e-book versions (PDF, EPUB or Kindle) by visiting SAS Press’ site.

If, like me, you like to be able to flip the pages and make notes in the margin, then you can get a hard copy (in paperback or hardback!) from Amazon.

10月 172019
 

It has recently become fashionable to talk about data literacy. This is an important idea so I’m glad to see people discussing it.

To me, data literacy means understanding that data are not dry, dusty, abstract squiggles on a computer screen, but represent living things: people, plants, animals. Having a deep understanding of data enables people to engage with data, to explore data, to interpret data, and to use data to impact their lives and work.

Data literacy necessarily comes with a degree of skepticism, recognizing that data can be not only used, but also misused. In this age of “alternative facts,” it is important to recognize when assertions are supported by data, and when they are not.

Everyone knows that technology is becoming more and more a part of everyday life. Without data literacy, people become passive recipients; with data literacy, you can actively engage with technology.

You know you are fluent in a foreign language when you are comfortable speaking it and can communicate what you want to say. The same is true for data literacy; it is about reaching a level of comfort, about seeing the meaning behind the data, and about being able to communicate what you want to say.

Here is how Wikipedia defines data literacy.