Training the Next Generation

My day job involves warping, um teaching young minds the art of computer programming. Every now and then I get to slip in something fun.

This assignment is one where the students do a completely automated decryption of what was up until well after WW2 a state of the art cryptosystem.  While I did weaken it a little (a short key rather than a very long one), the linearity of the cipher is a fundamental weakness. More than a few of the students did the whole project, and at least one found it pretty cool. Cool enough to think about working for one of those un-nameable three letter agencies.

The basic techniques were used at Bletchley Park both for Enigma and Lorenz.

DSC_0319

Although largely by hand (until they built the machines) in small dark offices like this one. (Turing’s). Could you do it?


Assignment 8

RWH

due November 5 2015

Another encryption assignment

  1. Vernam Encryption Write a program in C that will encrypt a file use XOR and a Vernam key. A Vernam key is a short string. (The real system used a much longer random sequence.) Read the input at low level, as binary data (hint unsigned char is useful here.) Then xor each character in the binary data with the character in the key. When you get to the end of the key reuse the key from the beginning.

The program should take command line arguments for the key, input and output.

./vern abc input.clear output.encrypted

Note that the cipher should decrypt its own output. The command:

./vern abc output.encrypted input.clear

should recover the original input. If you use open to create the output file, you may want to also set O RDWR or O WRONLY as well as O CREAT. You could also use fread and fwrite for this problem.

  1. Finding the period

This cipher is vulnerable if the key repeats, and with a short key, like abc above, it will repeat many times for any reasonably sized input.

The incidence of coincidence slides the cipher along itself and counts the number of times the same symbol is seen.

ABCABCABC            count is 9

ABCABCABC then shift 1 ABCABCABC        count is 0

ABCABCABC then shift 2 ABCABCABC      count is 0

ABCABCABC then shift 3 ABCABCABC    count is 6

ABCABCABC

Clearly the period is 3.

Write the code to do that. The file classcipher.vrn is in my directory for this.

  1. EXTRA CREDIT The character ’ ’ (space) is most common in English text. After finding the period count the most common character for each period and recover the key by XOR’ing it with ’ ’.

Concealment.

Continuing on from yesterday…

Even if you use a cipher system that does not carry incriminating evidence, you’re still left with the problem of sending the message. There were more than a few German spies picked up by the British in WW2 because one of their suitcases concealed a radio set.

If a message is obviously secret, even if it can’t be read, it’s obvious that the sender was trying to hide something. That could mean a short stay in a nasty prison followed by a short drop on a patented neck stretching machine or a shave with the ‘national razor.’ Neither is recommended for your health.

By the way this is a problem with book codes. Possession of a certain edition of a book, unless it’s dead common, could be evidence.

So what is a refined genteel lady of the regency to do?

Secret inks and concealed communications.

Step 1: write a letter. Not a problem, then as now females wrote lots of communications. Not email, twitter or texting, but on paper. Remember to inquire after everyone’s health and to tell the recipient up front that everyone here is well. (or not).

Step 2: Prick out or underline certain letters or words in the message. These are the real message. If these markings are noticed, and they will be if they’re in plain ink, you may be in trouble. On the other hand, if they spell out a love missive, you will be excused. It was not uncommon in the days when parents read every young ladies correspondence, to use a subterfuge like this.

Step 3: Use a secret ink to mark out the real real message. You could skip step 2 if appropriate.

So what can you use for an ink?

Here’s the problem, possessing the ink is evidence of your intent to conceal. A Lady of Quality would have vinegars, a few cosmetics, and even the contents of the ‘gozunda’ available to her. These make reasonably decent ‘organic’ inks, where the compound alters the browning temperature of the paper. Heat it and the message is revealed. (There were ‘sympathetic’ inks which required a developing reagent available at the time, but they’d be problematic for a Lady to be carrying around.)

Ciphers.

One of the books I’m writing involves a young woman training to be a secret agent in 1803 England. The running title is either “The Art of Deception” or possibly “Pride and Extreme Prejudice.”

Spies like her would need to be versed in secret communications. Unfortunately, possession of a code machine – like Jefferson’s disks – would automatically show that she was a spy. So would possession of secret inks.

She needs a cipher, one that she could easily generate, and one where she could easily destroy the evidence. Given what was available at the time, the best idea seems to be to move the invention date of the “playfair” cipher back a few years and have her use it.

Playfair was the top British cipher for much of the 19th century, which only shows how lame were the opposition. It is one step up from mono-alphabetic encryption (the cryptograms you sometimes have in the better newspapers (although not the AJC)). It encrypts pairs of letters, but in a systematic manner. On a small enough and random enough set of messages this can be really tough to break. A not dissimilar approach occupied Alan Turing for much of 1942 and 1943 when trying to break the German Naval Enigma (they used it to encrypt the rotor settings). Long chunks of text and repeated keys are much easier.

It starts with a keyword and a square (could be a rectangle).  Say “nevermore” and a 5×5 square.

1 2 3 4 5
2
3
4
5

Put the keyword (nevermore in this example) in the first row, dropping repeated letters. Then fill in with the rest of the alphabet. Have I and J in the same square.

n e v r m
o a b c d
f g h i k
l p q s t
u w x y z

Then for each pair of letters do one of three things.

  • If they’re in opposite corners of a rectangle (th above) take the other corners of the rectangle th -> kq and ht -> qk.
  • If they’re in the same row or column (gh above) take the letters to their right or below gh ->hi, is -> sy. Wrap around if you have to. There are alternatives here. It’s a pity they weren’t used because this is a weak point.
  • If the same letter occurs twice in a row, ‘ll’ for example, encode it as lx lx.

That’s all reasonable, except the distribution of pairs of letters is not flat. ‘th’ is by far the most common pair, and much more common than ‘ht’. So given a long enough message we can count pairs and from that deduce the structure of the 5×5 square. Repeated pairs are very useful because they define relationships between infrequent letters.

One simple way to harden playfair (though not enough to make it secure by modern standards) is to make the key ‘progressive.’

a n e v r
m o b c d
f g h i k
l p q s t
u w x y z
a b n e v
r m o c d
f g h I k
l p q s t
u w x y z

And so on until the 13th iteration. After that it begins to lose too much key to be unique. (Iteration 25 would always be just the alphabet in the square.)

a b c d e
f g h i k
l m n v o
r p q s t
u w x y z

Other schemes would be better than this, but you get the idea.