11月 112016
 

If you obtain data from web sites, social media, or other unstandardized data sources, you might not know the form of dates in the data. For example, the US Independence Day might be represented as "04JUL1776", "07/04/1776", "Jul 4, 1776", or "July 4, 1776." Fortunately, the ANYDTDTE informat makes it easy read dates like these into SAS.

The ANYDTDTEw. informat is a flexible alternative to older informats such as DATEw., MMDDYYw., and YYMMDDw. If your dates are in a specific form, the older informats work great and serve to document that all dates must be in that standard form. If the dates are not standardized or you need to read a string like "July 4, 1776", the ANYDTDTE informat is a godsend.

The ANYDTDTE informat for reading dates

The following SAS DATA step shows that the ANYDTDTEw. format combines several older formats into a "super format" that attempts to convert a character string into a date. The ANYDTDTE format can not only replace many of the older formats, but it can be used to convert a string like "Jul 4, 1776" into a date, as follows:

data Dates;
input @1 Style $8.
      @9 Value anydtdte12.;
format Value DATE10.;
datalines;
DATE    04JUL1776
MMDDYY  07041776
MMDDYY  07/04/1776
YYMMDD  17760704 
N/A     Jul 4, 1776
N/A     July 4, 1776
;
 
proc print noobs; run;
Result of using the ANYDTDTE informat to read strings that represent dates

As you can see, the ANYDTDTE informat reads six different strings, but converts all of them to the SAS date value that corresponds to 04JUL1776.

MMDD or DDMM? How does ANYDTDTE interpret ambiguous dates?

The string 07/04/1776 can be interpreted as "April 7, 1776" or "July 4, 1776," depending upon the local convention. Europeans tend to interpret the string as DD/MM/YYYY whereas the US convention is to use MM/DD/YYYY. How does the ANYDTDTEw. informat guess which interpretation might be correct?

The answer is that the informat looks at the DATESTYLE SAS option. By default, the DATESTYLE option uses the LOCALE system option to guess which style to use. You can use PROC OPTIONS to see the value of these options, which are printed to the SAS log:

proc options option=(DATESTYLE LOCALE) value; run;
Option Value Information For SAS Option DATESTYLE
    Value: MDY
    Scope: Default
    How option value set: Locale
 
Option Value Information For SAS Option LOCALE
    Value: EN_US
...

For my system, the DATESTYLE option is set to MDY, which means that the string "07/04/1776" will be interpreted MM/DD/YYYY. If you need to read dates that obey a different convention, you can use the global OPTIONS statement to set the DATESTYLE option:

options DATESTYLE=DMY;    /* change default style convention */
/* Restore default convention: options DATESTYLE=Locale; */

Other "ANY" informats in SAS

There are two other SAS infomats that are similar to the ANYDTDTE informat:

Here's a tip to help you remember these seemingly cryptic names. The first part of the name is "ANYDT", which means that the input string can be ANY datetime (DT) value. The end of the name refers to the numerical value that is produced by the informat. The resulting value can be a date (DTE), a datetime (DTM), or a time (TME) value. Thus the three informats all have the mnemonic form ANYDTXXX where the XXX suffix refers to the value that is produced.

tags: Reading and Writing Data, SAS Programming

The post One informat to rule them all: Read any date into SAS appeared first on The DO Loop.

 Leave a Reply

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

(required)

(required)