Fun with Ciphers (Part 2)
In my previous blog, you saw how to create a Beale cipher. In this blog, you will see a program that can decode a Beale cipher. As a reminder, here is a list of numbers that you can use as a substitute for a letter when creating your cipher.
Now, suppose you want to send the following message: "Come to safe house at ten tonight." One possible cipher for this message is:
65 12 81 84 55 46 3 73 88 71 80 11 7 20 57 94 35 84 82 22 29 33 44 16 31 10 67 48 73 60
The first step to decode this cipher is the same as the first step in the program to create the cipher: Make a list of possible numbers to represent each letter. I'll repeat it here:
*Create the list of letters and numbers;
data Decipher;
   length Letter $ 1;
   infile 'c:\Books\Blogs\Declare.txt';
   input Letter : $upcase1. @@;
   N + 1;
   output;
run;

title "Listing of Data Set Decipher";
title2 "First Five Observations";
proc print data=Decipher(obs=5) noobs;
run;
This is the program that created the list of numbers corresponding to each letter. The next step in the program to create a Beale cipher was to sort by Letter. This time you want it in number order. Because it is already in order by the variable N, you don't have to sort it. Here are the first five observations in data set Decipher:
The next step is to read the message and make a SAS data set.
*Make a SAS data set from the Message text;
data Message;
   infile 'c:\books\Blogs\Cipher\Message.txt';
   input NN @@;
run;

title "First 5 Observations from Data Set Message";
proc print data=Message(obs=5) noobs;
run;
The final step is to create a temporary array (long enough to hold all the numbers). Each element in this array will contain the letter corresponding to its position in the array. The DATA step below first loads the temporary array elements with the appropriate letters and then reads each number from data set Message (built from the file Message.txt, which contains the secret code). The temporary array acts as a lookup table to find the letter corresponding to each number. I have annotated the program so that you can see exactly what is going on.
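data Final;
   length Letter $ 1 String $ 200;
   /* The array dimension is missing from the listing as published. 2000 is an
      assumed value; it must be at least the number of words in Declare.txt. */
   array Letters[2000] $ _temporary_;         /* ❶ */
   set Decipher (keep=Letter) end=Last_Obs;   /* ❷ */
   N + 1;
   Letters[N] = Letter;                       /* ❸ */
   if Last_Obs then do i = 1 to N_Message;    /* ❹ */
      set Message Nobs=N_Message;             /* ❺ */
      Letter = Letters[NN];                   /* ❻ */
      String = catx(' ',String,Letter);       /* ❼ */
      if i = N_Message then output;           /* ❽ */
   end;
   keep String;
run;

title "Decoded message";
proc print data=Final noobs;                  /* ❾ */
run;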
❶ Create a temporary array. Each element in the temporary array (Letters) is the letter corresponding to the element number. For example, Letters[1] is 'W', Letters[2] is 'I', and so forth.
❷ Bring in the observations in data set Decipher. Each observation in this data set contains the first letter of one word in the document. The END= option lets you know when you have read the last observation in the Decipher data set.
❸ Load up the temporary array based on the values of N and Letter.
❹ Once the temporary array is loaded, read in the observations in data set Message. Notice that the variable N_Message was set to the number of observations in data set Message at compile time by using the SET option NOBS=.
❺ Bring in the observations from data set Message.
❻ Decipher the number (NN) to determine the letter it represents.
❼ Use the CATX function to add all the letters to the variable String.
❽ After all the numbers from the file Message.txt have been processed, it is time to output an observation containing the variable String.
❾ Use PROC PRINT to print out the message.
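If you want to experiment with the array lookup without the external files, here is a minimal sketch that reads everything from DATALINES. The data set names Key_Demo, Message_Demo, and Final_Demo, the seven-word key, and the four-number message are all made up for illustration; they are not part of the original program. The sketch also adds one check that the program above does not have: it tests whether NN is a valid subscript before using it, so a number with no entry in the key shows up as a question mark rather than risking an array subscript out of range error.

```sas
/* Hypothetical stand-in for Declare.txt: a seven-word key */
data Key_Demo;
   length Letter $ 1;
   input Letter : $upcase1. @@;   /* first letter of each word, uppercased */
datalines;
When in the course of human events
;

/* Hypothetical stand-in for Message.txt; 99 is deliberately not in the key */
data Message_Demo;
   input NN @@;
datalines;
3 6 7 99
;

data Final_Demo;
   length Letter $ 1 String $ 200;
   array Letters[7] $ 1 _temporary_;        /* 7 = number of words in the demo key */
   set Key_Demo (keep=Letter) end=Last_Obs;
   N + 1;
   Letters[N] = Letter;                     /* load the lookup table */
   if Last_Obs then do i = 1 to N_Message;
      set Message_Demo nobs=N_Message;
      /* Guard the subscript before using it */
      if 1 <= NN <= dim(Letters) then Letter = Letters[NN];
      else Letter = '?';
      String = catx(' ', String, Letter);
      if i = N_Message then output;
   end;
   keep String;
run;

proc print data=Final_Demo noobs;
run;
```

With this key, 3, 6, and 7 point at the words "the", "human", and "events", so the decoded string should read T H E, followed by a question mark for 99.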
I showed this program to my friend Mark Jordan (aka SAS Jedi), and he came up with a solution that uses formats to do the table lookup. It is probably an easier and more elegant program than mine (his programs usually are), and I am including his program here.
The first step is once again to create the cipher. Make a list of possible numbers to represent each letter. This time, though, we’ll create the Decipher data set so that it can be used to build a SAS format.
*Create the list of letters and numbers;
data Decipher;
   retain Fmtname 'Decipher' Type 'N';     /* ❶ */
   length Label $ 1;                       /* ❷ */
   infile 'c:\Books\Blogs\Declare.txt';
   input Label : $upcase1. @@;
   N + 1;
   Start=N;                                /* ❸ */
   output;                                 /* ❹ */
   drop N;
run;

title "Listing of Data Set Decipher";
title2 "First 5 Observations";
proc print data=Decipher(obs=5) noobs;     /* ❺ */
run;
❶ FMTNAME and TYPE are required to be the same value for each observation. We accomplish that with a RETAIN statement.
❷ LABEL and START are the other two required variables for a PROC FORMAT control data set.
❸ Set Start to N.
❹ Write one row for each value we want to decode.
❺ Print the first 5 observations of the Decipher data set.
The next step is to create a format from the Decipher data set:
* Make a format from the Decipher data set;
proc format cntlin=Decipher fmtlib;
run;
The FMTLIB option produces a report documenting the format. Here is a sample:
The final step is to use the format on each number in the message text to decode it. The first DATA step below reads each number from the file Message.txt (that contains the secret code) to create the Message data set. The second DATA step reads the Message data set and applies the format to each numeric value using the PUT function. This produces the letter corresponding to the number. I have annotated the program so that you can see exactly what is going on.
*Make a SAS data set from the Message text;
data Message;
   infile 'c:\Books\Blogs\Cipher\Message.txt';
   input NN @@;
run;

data Final;
   length String $200;
   retain String;
   keep String;
   set Message end=last;                              /* ❶ */
   String = catx(' ',String,put(NN,decipher.));       /* ❷ */
   if last then output;
run;

title "Decoded message";
proc print data=Final noobs;                          /* ❸ */
run;
❶ Bring in the observations from the Message data set.
❷ Use the PUT function to produce the correct letter, and the CATX function to combine the letters into the variable String.
❸ Use PROC PRINT to print out the message.
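One thing to think about with the format approach is what happens to a number that is not in the key. Here is a minimal sketch, not part of Mark's program, that adds an OTHER range to the control data set by setting the HLO variable to 'O', so that unmatched numbers are labeled with a question mark. The names DemoKey, Decipher_Demo, and Decipher_Ctl, as well as the seven-word key read from DATALINES, are assumptions made for illustration.

```sas
/* Hypothetical stand-in for Declare.txt: a seven-word key */
data Decipher_Demo;
   retain Fmtname 'DemoKey' Type 'N';
   length Label $ 1;
   input Label : $upcase1. @@;
   Start + 1;                 /* word number is the value to decode */
   output;
datalines;
When in the course of human events
;

/* Copy the key rows and append one row that defines the OTHER range */
data Decipher_Ctl;
   set Decipher_Demo end=Last;
   output;
   if Last then do;
      HLO   = 'O';            /* O = OTHER: any value not listed above */
      Label = '?';
      output;
   end;
run;

proc format cntlin=Decipher_Ctl;
run;

/* Try a few numbers; 99 is deliberately not in the key */
data _null_;
   do NN = 3, 6, 7, 99;
      Letter = put(NN, DemoKey.);
      put NN= Letter=;
   end;
run;
```

Here 3, 6, and 7 should be written to the log as T, H, and E, and 99 should come back as a question mark.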
I hope you enjoy both of these programs. Please add a comment to the blog with your preference. I think I'll vote for Mark's program!
Fun with Ciphers (Part 2) was published on SAS Users.