2月 272017
 

Longtime SAS programmers know that the SAS DATA step and SAS procedures are very tolerant of typographical errors. You can misspell most keywords and SAS will "guess" what you mean. For example, if you mistype "PROC" as "PRC," SAS will run the program but write a warning to the log: WARNING 14-169: Assuming the symbol PROC was misspelled as PRC.

This feature provided a big productivity boost in the days before GUI program editors. Imagine submitting a program from a command line in the early 1980s. If you mistyped one keyword you would have to retype the entire statement. As a convenience, SAS implemented an algorithm that checks the "spelling distance" between the tokens that you submit and a list of valid keywords for the procedure that you are calling. DATA step programmers might be familiar with the SPEDIS function, which measures how close two words are to each other in the English language. The SAS language parser uses the same algorithm.

Not everyone wants this feature. Many companies in regulated industries (such as pharmaceuticals) turn off the autocorrect feature in SAS because they want to force their programmers to type every keyword correctly. You can determine whether AUTOCORRECT option is enabled on your system by running PROC OPTIONS:

proc options option=AUTOCORRECT value;  run;

The AUTOCORRECT option is turned on by default. You can turn off the option by submitting options NOAUTOCORRECT or by putting -NOAUTOCORRECT in a configuration file.

Today I've invited two people to argue for and against using this feature. Larry Literal is a programmer who believes that no program should ever accept a syntax error. Annie Intel sees nothing wrong with programs that self-correct. She argues that it is desirable for programs to interpret the intention of the programmer. Which do you agree with? Do you have something to add? Leave a comment.

Point: A program should not allow ambiguity

My name is Larry Literal and I believe that computer programming should be an exact science. There is no room for ambiguity. A program that runs because it is "close to" a correct program is an abomination. I do not want a computer to change the code that I write!

When my system administrator installs a new version of SAS, the first thing I do is turn off the autocorrect feature. (I've also turned off the autocorrect feature on my phone. What a pain!) My main argument against the AUTOCORRECT option is that it makes code unreadable. Take a look at the following program:

/* The correct program is:
   proc freq dta=sashelp.class order=freq;
      table sex / chisq;
   run;
*/
prc freq dta=sashelp.class ordor=freqq;
   tble sex / chsq;
runn;

Every keyword in this program is mistyped. The only tokens that are specified correctly are the name of the procedure, the name of the data set, and the name of the variable. The program looks more like the Klingon language than the SAS language, yet this program runs if you use the AUTOCORRECT option!

And what happens if SAS introduces a new keyword that is closer to a mistyped word than a previous keyword? Then the procedure might do something different even though I have not changed the program! The autocorrect feature is an abomination and should never be used!

Counterpoint: Computers should interpret what you say

Really, Larry? "An abomination"? What century are you living in?

My name is Annie Intel, but my friends call me "A.I." I think the SAS autocorrect feature was way ahead of its time. Today we have autocorrecting logic on smartphones and word processors. Applying the same techniques to computer programs is no different. In fact, if you use a modern SAS program editor, the editor will suggest valid keywords and flag any keyword that is not valid.

Let's be real: Larry's example is not realistic. No programmer is going to use that garbled call to PROC FREQ in a production job. The autocorrect feature does not "make code unreadable." It is a convenience while developing a program, not an excuse to write nonsense. Any competent programmer will check the log for warning messages and correct the typos.

Larry claims that he doesn't want a computer munging and altering the code he writes. But optimizing compilers have been doing exactly that for decades! Programmers write instructions in a high-level language and an optimizing compiler maps the code to a set of machine instructions. The compiler will sometimes rearrange the structure of the program to get better performance. If it is okay for a compiler to map a program into an optimal version of itself, why is it not okay for a parser to do the same by correcting misspellings?

I want computers to recognize my intentions. When I give a voice command to my smartphone or personal home device, the audio signal is mapped to an action. I am allowed a certain amount of flexibility. "Turn on the lights" and "turn da light on" are equivalent phrases that should be understood and mapped to the same action. The SAS AUTOCORRECT feature is similar. The interpreter has a context (the name of the procedure) which is used to standardize your input. I think it is very cool. In the future, I think more programming languages will accept ambiguities.

The post Point/Counterpoint: Should a programming language accept misspelled keywords? appeared first on The DO Loop.

 Leave a Reply

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

(required)

(required)