Just for Fun

Solving Scrambled-Word Puzzles

 Just for Fun, Statistical Programming  Solving Scrambled-Word Puzzles已关闭评论
10月 152010
 
Have you ever been stuck while trying to solve a scrambled-word puzzle? You stare and stare at the letters, but no word reveals itself? You are stumped. Stymied.

I hope you didn't get stumped on the word puzzle I posted as an anniversary present for my wife. She breezed through it.

In a previous post, I showed that you can scramble and unscramble words by using permutations. I also showed that you can generate the set of all permutations on N elements. By putting these techniques together, you can write a SAS/IML program that solves scrambled-word puzzles.

Here's what you need:

A Dictionary of Words

In this internet age, it is easy to obtain a file that contains a dictionary of words. Some word lists are small. For example, the "Unix dictionary" contains just over 25 thousand entries. Others are quite a bit larger. I like the official Scrabble™ player's dictionary (OSPD) because it limits itself to words with eight or fewer letters.

Download your favorite wordlist and name the file "wordlist.txt." I used the following DATA step code to convert the word list to a SAS data set:

filename wordlist 'C:\Documents and Settings\...\wordlist.txt';
data Dictionary;
   /** keep first 8 chars unless you need longer words **/
   length word $8.; 
   infile wordlist;
   input word $;
run;

A SAS/IML Program to Unscramble Words

Suppose you have a list of 5- and 6-letter scrambled words that you want to unscramble. For this example, I will use the following set of scrambled statistical words:

proc iml;
target = upcase({NAGER, RRREO, ORRCDE, LROANM});
print target;

target

NAGER

RRREO

ORRCDE

LROANM

You can read the dictionary and the permutations into SAS/IML matrices:

/** read word list; convert to uppercase **/
use Dictionary; read all var {Word}; close Dictionary;
Word = upcase(Word);

/** read permutations for 5- an 6-letter words **/
use perm5; read all var _NUM_ into p5; close perm5;
use perm6; read all var _NUM_ into p6; close perm6;

The following algorithm unscrambles the words:

free solution;
do i = 1 to nrow(target); /** for each word **/
   w = strip(target[i]);
   if length(w)=5 then p = p5; 
   else p = p6; /** use permutation for N elements **/
   idx = loc( length(Word) = length(w) );
   list = Word[idx]; /** limit word list **/

   foundWord = 0;  /** how many permutations result in a word? **/ 
   do n = 1 to nrow(p) while (foundWord<=3); /** for each permutation **/
      perm = p[n, ];  
      s = PermuteChars(w, perm); /** apply permutation **/
      idx = loc(list = s);       /** check if this is a word **/
      if ncol(idx)>0 then do;    /** remember solution **/
         foundWord = foundWord + 1;
         solution = solution // (s + " is a solution for " + w); 
      end;
   end;
end;
print solution;

solution

REGNA is a solution for NAGER

RANGE is a solution for NAGER

ANGER is a solution for NAGER

ERROR is a solution for RRREO

ERROR is a solution for RRREO

ERROR is a solution for RRREO

ERROR is a solution for RRREO

CORDER is a solution for ORRCDE

CORDER is a solution for ORRCDE

RECORD is a solution for ORRCDE

RECORD is a solution for ORRCDE

NORMAL is a solution for LROANM

You can see that people who create scrambled-word puzzles need to be careful that there are not two or more permutations that each lead to valid words. For the letters NAGER, both RANGE and ANGER are common words that solve the puzzle. (I'd exclude REGNA.) Also notice that words with repeated letters, such as RRREO, are easier to solve in the sense that there are multiple permutations that unscramble the letters

So next time you are stuck, just fire up this SAS program to solve scrambled word puzzles.