6 Sep 2004 darkewolf   » (Journeyer)

I discovered this website this morning, and it has given me a distraction to keep me occupied at lunchtime and at other times when my brain needs a kickstart.

It contains a number of enciphered texts and basically no clues on how to solve them, except for the more complex ones (which I have yet to try). Below I shall discuss how I solved Challenge 1 and Challenge 2.

Challenge 1

The cipher text is as follows:

Zl sngure'f snzvyl anzr orvat Cveevc, naq zl Puevfgvna anzr Cuvyvc,
zl vasnag gbathr pbhyq znxr bs obgu anzrf abguvat ybatre be zber
rkcyvpvg guna Cvc.  Fb, V pnyyrq zlfrys Cvc, naq pnzr gb or pnyyrq
Cvc.

Looks pretty evil eh? Thankfully this one is dead easy. It's a common form of encipherment used on Usenet and other places, called Rot-13. It's a substitution cipher that rotates each letter 13 places through the alphabet, modulo 26.

For instance, the letter A becomes the letter N (N is 13 characters ahead of A). Of course, you need to wrap around for the letters N through Z, so N becomes A. This cipher is particularly convenient because 13 is exactly half of the 26-letter alphabet: applying it a second time gives back the original text, so the same operation both enciphers and deciphers.

Thus the decipherment, using the following Unix command:

cat challenge1.txt | tr '[A-M][N-Z][a-m][n-z]' '[N-Z][A-M][n-z][a-m]' > answer1.txt

which gives:

My father's family name being Pirrip, and my Christian name Philip,
my infant tongue could make of both names nothing longer or more
explicit than Pip.  So, I called myself Pip, and came to be called
Pip.

Pretty cool eh? :) As I said, it is often used on Usenet (or at least was in the past) to protect the punchlines of jokes from being read accidentally.
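The rotation described above can also be sketched in a few lines of Python (used here instead of tr purely for illustration). The function name rot13 is my own choice, and the sketch assumes plain ASCII letters; everything else passes through untouched.

```python
import string

def rot13(text: str) -> str:
    """Rotate each ASCII letter 13 places, wrapping modulo 26."""
    out = []
    for ch in text:
        if ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + 13) % 26 + ord("A")))
        elif ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + 13) % 26 + ord("a")))
        else:
            out.append(ch)  # spaces, punctuation and digits pass through
    return "".join(out)

cipher = "Zl sngure'f snzvyl anzr orvat Cveevc"
print(rot13(cipher))                   # My father's family name being Pirrip
print(rot13(rot13(cipher)) == cipher)  # True: applying it twice restores the input
```

Because 13 is half of 26, the same function enciphers and deciphers.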

Challenge 2

This one was moderately trickier, but once I got the solution it was obvious.

THESNFZOGH OA ZIT FSGETAA GY EGHCTSZOHU Q FKQOHZTBZ JTAAQUT OHZG QH QKZTSHQZT EOFITSZTBZ JTAAQUT.  
ZIT EOFITSZTBZ JTAAQUT EGHZQOHA QKK ZIT OHYGSJQZOGH GY ZIT FKQOHZTBZ JTAAQUT, WXZ OA HGZ OH Q       
YGSJQZ STQRQWKT WN Q IXJQH GS EGJFXZTS VOZIGXZ ZIT FSGFTS JTEIQHOAJ ZG RTESNFZ OZ. --- YSGJ         
VOLOFTROQ, ZIT YSTT THENEKGFTROQ.

A quick try of Rot-13 showed it obviously wasn't the same cipher. This time I decided to be a bit trickier and wrote a quick Perl program to do a frequency count of the characters present (mostly to ensure I was dealing with a simple substitution cipher rather than something more complex). The program showed that the character frequencies had roughly the same shape as those of English text, just attached to different letters (see the table below for the character distribution of the English language):

Character    Frequency
sp    0.186550
e    0.108321
t    0.079711
a    0.066101
h    0.062808
o    0.053881
s    0.049366
n    0.048965
r    0.047798
i    0.041987
l    0.036380
d    0.035168
u    0.024981
w    0.023349
m    0.020149
c    0.019151
g    0.017733
y    0.017043
f    0.014561
b    0.013218
p    0.012472
k    0.008703
v    0.008059
j    0.001296
x    0.001119
q    0.000615
z    0.000503

I did ignore the sp (space) entry though. Maybe I should extend my frequency count program to count pairs of letters as well.

I got the following frequencies from the file (only the top 5 shown, rest on request):

 1)cipher (T)   count( 36)   freq (0.1385)
 2)cipher (Z)   count( 30)   freq (0.1154)
 3)cipher (Q)   count( 22)   freq (0.0846)
 4)cipher (O)   count( 20)   freq (0.0769)
 5)cipher (G)   count( 18)   freq (0.0692)
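For anyone wanting to reproduce a count like this, here is a rough Python sketch. It is not the Perl program I actually used; the name ngram_frequencies and the choice to skip everything that isn't a letter are assumptions made for the example. Passing n=2 gives the letter-pair count mentioned above.

```python
from collections import Counter

def ngram_frequencies(text: str, n: int = 1):
    """Return (ngram, count, relative frequency) tuples, most common first.

    Only alphabetic characters are counted, and n-grams never span
    spaces or punctuation.
    """
    counts = Counter()
    cleaned = "".join(c if c.isalpha() else " " for c in text.upper())
    for word in cleaned.split():
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    total = sum(counts.values())
    return [(gram, c, c / total) for gram, c in counts.most_common()]

sample = "ZIT EOFITSZTBZ JTAAQUT"
for rank, (gram, count, freq) in enumerate(ngram_frequencies(sample)[:5], 1):
    print(f"{rank:2d})cipher ({gram})   count({count:3d})   freq ({freq:.4f})")
```

Feeding it the whole challenge file instead of the short sample should give a ranking comparable to the one above.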

Now, what did this lead me to believe? I am fairly certain that T in the ciphertext would be the letter e in the real world. I also made the assumption that Z would be t. Let's make that substitution (using lowercase to indicate deciphered letters, of course):

eHESNFtOGH OA tIe FSGEeAA GY EGHCeStOHU Q FKQOHteBt JeAAQUe OHtG QH QKteSHQte EOFIeSteBt JeAAQUe.
tIe EOFIeSteBt JeAAQUe EGHtQOHA QKK tIe OHYGSJQtOGH GY tIe FKQOHteBt JeAAQUe, WXt OA HGt OH Q
YGSJQt SeQRQWKe WN Q IXJQH GS EGJFXteS VOtIGXt tIe FSGFeS JeEIQHOAJ tG ReESNFt Ot. --- YSGJ
VOLOFeROQ, tIe YSee eHENEKGFeROQ.

Looking good so far. At this stage I made some more substitutions based on frequency, and with one or two hiccups I ended up with a fairly good portion of it done. I also guessed that the ciphertext letter I would be h, since the word tIe is most likely "the".

From the point of having about 30% of it decoded, I used other Unix tools to find potential words. Using the YSee on the last line, I searched the dictionary (well, word list) for four-letter words made of two unknown letters followed by a double e:
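The same bookkeeping (uppercase for unsolved ciphertext, lowercase for guessed plaintext) can be sketched in Python. The helper name apply_partial_key and the guesses dictionary are made up for this example; they just mirror the tr-style substitution described above.

```python
def apply_partial_key(ciphertext: str, key: dict) -> str:
    """Replace known ciphertext letters (uppercase) with lowercase plaintext guesses."""
    table = str.maketrans({c.upper(): p.lower() for c, p in key.items()})
    return ciphertext.translate(table)

# Guesses so far: T -> e and Z -> t from the frequency count, I -> h from "tIe".
guesses = {"T": "e", "Z": "t", "I": "h"}
print(apply_partial_key("ZIT EOFITSZTBZ JTAAQUT", guesses))  # the EOFheSteBt JeAAQUe
```

Adding a new guess is then just one more dictionary entry, and a wrong guess is easy to back out.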

grep "^..ee$" /usr/share/dict/words

Agee
alee
Cree
flee
free
glee
knee
tree

Could be any of these. It would not be 'tree', as t had already been used. And I decided to eliminate Agee, alee and Cree 'cause they were silly.
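The "t is already used" rule can be automated rather than applied by eye. The following Python sketch is hypothetical (the function candidates and its used parameter are names invented here): it keeps only words that fit the pattern, do not reuse an already assigned plaintext letter, and give different ciphertext letters different plaintext letters. It assumes the pattern contains letters only.

```python
def candidates(pattern: str, words, used: str = ""):
    """Return lowercase words that fit a partially deciphered pattern.

    Lowercase letters in `pattern` are known plaintext; uppercase letters are
    unsolved ciphertext. `used` lists plaintext letters already assigned,
    which an unsolved position cannot stand for.
    """
    taken = set(used.lower()) | {c for c in pattern if c.islower()}
    matches = []
    for word in words:
        w = word.strip().lower()
        if len(w) != len(pattern):
            continue
        mapping = {}  # ciphertext letter -> proposed plaintext letter
        ok = True
        for p, ch in zip(pattern, w):
            if p.islower():
                ok = (p == ch)
            else:
                ok = ch not in taken and mapping.setdefault(p, ch) == ch
            if not ok:
                break
        # Distinct ciphertext letters must map to distinct plaintext letters.
        if ok and len(set(mapping.values())) == len(mapping):
            matches.append(w)
    return matches

words = ["Agee", "alee", "Cree", "flee", "free", "glee", "knee", "tree"]
print(candidates("YSee", words, used="eth"))
# ['agee', 'alee', 'cree', 'flee', 'free', 'glee', 'knee']
```

In practice the words argument could be the lines of /usr/share/dict/words; deciding which survivors are "silly" still takes a human.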

And then it continued as I spotted more and more words.

Ultimately the message deciphered to:

encryption is the process of converting a plaintext message into an alternate ciphertext message.
the ciphertext message contains all the information of the plaintext message, but is not in a
format readable by a human or computer without the proper mechanism to decrypt it. --- from
wikipedia, the free encyclopedia.

I had deciphered it without really knowing the method it used. I used the following command to do it:

cat Chal2.txt | tr 'TZIYSGOJXHQRWKNEFVALBUC' 'ethfroimunadblycpwskxgv' > Decoded.chal2

So out of curiosity I wrote another Perl script which showed the plaintext and ciphertext letters side by side with their 'position' in the alphabet, so I could see if any formula was being used, and piped it through sort so it would be in plaintext-alphabetical order.

First the Perl display code I used:

#!/usr/bin/perl
use strict;
use warnings;

# Plaintext letters and the ciphertext letters they map to, in matching order.
my $plain = "ethfroimunadblycpwskxgv";
my $code  = "TZIYSGOJXHQRWKNEFVALBUC";
my @a_plain = split //, $plain;
my @a_code  = split //, $code;

# Print each pair with its 1-based alphabet position (a = 1 ... z = 26).
for my $i (0 .. $#a_plain) {
    printf "%s (%3d) -- %s (%3d)\n",
        $a_plain[$i], ord($a_plain[$i]) - 96,
        $a_code[$i],  ord(lc($a_code[$i])) - 96;
}

I ran ./Cipher.chal2.pl | sort and got the following:

a (  1) -- Q ( 17)
b (  2) -- W ( 23)
c (  3) -- E (  5)
d (  4) -- R ( 18)
e (  5) -- T ( 20)
f (  6) -- Y ( 25)
g (  7) -- U ( 21)
h (  8) -- I (  9)
i (  9) -- O ( 15)
k ( 11) -- L ( 12)
l ( 12) -- K ( 11)
m ( 13) -- J ( 10)
n ( 14) -- H (  8)
o ( 15) -- G (  7)
p ( 16) -- F (  6)
r ( 18) -- S ( 19)
s ( 19) -- A (  1)
t ( 20) -- Z ( 26)
u ( 21) -- X ( 24)
v ( 22) -- C (  3)
w ( 23) -- V ( 22)
x ( 24) -- B (  2)
y ( 25) -- N ( 14)

How bloody obvious. The cipher alphabet is just the QWERTY keyboard layout: a to i run along the top row starting at Q, k to s run backwards along the home row from L to A, and t to y run along the bottom row starting at Z.

Now, the rest of the challenges may be a bit harder. Ideally I'd have a nice PDA with Perl installed and go sit down by the river and hack away on ciphers at lunch, but for now that will have to wait *smirks*
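With the pattern known, the whole key can be built rather than guessed letter by letter. A hedged Python sketch: going by the sorted table, the top row is read left to right, the home row right to left (k is L, s is A), and the bottom row left to right. The letters j, q and z never appear in the message, so their entries (P, D and M) are inferred from that pattern rather than confirmed. The names CIPHER_ALPHABET and decrypt are my own.

```python
import string

# Assumed reading order: top row left to right, home row right to left,
# bottom row left to right. j -> P, q -> D and z -> M are inferred, not observed.
CIPHER_ALPHABET = "QWERTYUIOP" + "ASDFGHJKL"[::-1] + "ZXCVBNM"
DECRYPT = str.maketrans(CIPHER_ALPHABET, string.ascii_lowercase)

def decrypt(ciphertext: str) -> str:
    """Map each uppercase ciphertext letter back to its lowercase plaintext letter."""
    return ciphertext.translate(DECRYPT)

print(decrypt("ZIT YSTT THENEKGFTROQ"))  # the free encyclopedia
```

This generated key agrees with every pair in the tr command used earlier.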
