The Autokey cipher is based on the Vigenère cipher but avoids the problem of periodically repeating a keyword. If the keyword is shorter than the plaintext, the plaintext is simply added to the keyword.
Plaintext: THIS IS A SECRET TEXT
Ciphertext: DLGL PA S AWCJIV KIQM
The Autokey cipher was developed by Blaise de Vigenère as well. It is primarily based on the same methods as the Vigenère cipher, but it includes a modification, which increases the security of the cipher. This modification is based on ideas from Gerolamo Cardano.
The Autokey cipher is more secure than the Vigenère cipher, because a pattern search with the Kasiski- or the Friedman-Test leads to no result with the Autokey cipher. On the other hand, this cipher is not very secure when the attacker knows some parts of the plaintext because the plaintext is part of the key. Also, characters can be identified with analytical methods. Because parts of the keyword are derived from the plaintext, one can assume that the keyword contains natural language. If one would further assume that English has been used for the plaintext, the most frequently occurring character in the plaintext as well as in the key should be E. In a sufficiently long encoded text, the most frequently occurring character would then be the I which is the result of mapping the character E according to the key E. With the same procedure, every other character in the message can be determined. ¹
(1) Lang, H.W.: „Klassische Kryptografie“, http://www.iti.fh-flensburg.de/lang/krypto/klassisch.htm, Date: 2009-02-20
What is an alphabet?
An alphabet is an ordered set of all characters which can occur in a plaintext, a secret text, or the key. "Ordered" means that sorting is possible and we can speak of the n-th character of an alphabet.
For classical methods, the alphabet often consists only of the uppercase letters (A-Z). Characters not belonging to the alphabet are not encrypted or allowed as keys.
The handling of non-alphabet characters (convert, skip, ...) can be set in the options - but this is not a function of the actual encryption process itself. This requires additional meta-information of the letters that must be recorded before encryption. Also, there is no general match on how to handle digits or special characters. Instead of performing a transformation before encryption, this implementation allows several alphabets to be specified (see below), thereby accomplishing the same within the encryption process.
Our implementation of Vigenère, Beaufort, etc. does not work internally with letters, but with numbers. Therefore, a translation must take place, which can on the one hand transform letters in numbers and, conversely, re-generate letters again. A function that performs this is called an alphabet function.
Let s be such a reversible function. Then the Vigenère encryption for an input character in and a key key can be described as:
out = s-1(s(in) + s(key))
The letters of in and key are converted into numbers, these numbers are added, and the sum is re-converted to a letter. The conversion to letters takes place modulo to the alphabet length: If a 1 is added to the last character, the result of the sum is the first character of the alphabet.
How to describe an alphabet?
Alphabets (yes, there may be several: more below) can be described by a list L of letters. The alphabet function sL returns the smallest index at which it occurs to a letter that is present in L. The index of the first character can be configured. For letters that do not occur in L, the alphabet function sL is undefined.
Although the function is well-defined when a letter occurs more than once, this makes little sense in encryption algorithms, since the reversibility suffers. A corresponding warning is displayed.
The inverse function returns the n-th character for a number n in L. To n, the length of the list L is added or subtracted as often as necessary until the index lies in the list.
Example of an alphabet
For example, take the list L =
"ABCD", whose length is 4. The message
"ACDC" should be encrypted with the key
"ABBA" according to the Vigenère method. The following steps take place:
|in||sL(in)||key||sL(key)||sL(in) + sL(key)||sL-1(sL(in) + sL(key))|
| ||0|| ||0||0|| |
| ||2|| ||1||3|| |
| ||3|| ||1||4|| |
| ||2|| ||2||0|| |
In the example, an overflow has occurred in the third letter, so that modulo |L| = 4 is calculated.
Shortcuts for alphabets
In order to simplify the representation of the alphabets, the following abbreviation has been introduced: The minus sign in the following letter 1-letter 2 is extended to all the letters between the two flanking letters.
"A-Z"for all uppercase letters,
"a-z"for all lowercase letters,
"z-a"for all lowercase letters in reverse order and
"0-9"for all digits.
The only disadvantage is that the minus sign itself has to be written as
"---", so as not to be confused as a range operator.
In the detailed representation of the alphabets (click on the "..." -button), the alphabets can be edited in the short-write mode.
It is possible to distinguish between 2 types of actions in the plain text: uppercase letters [A-Z] and digits [0-9]
- Complete alphabet:
The two partial alphabets [A-Z] and [0-9] are combined in a certain order to form a new alphabet [A-Z0-9]. If you 'shold' here 3, you get to the '2'.
- Separated partial alphabets:
However, all transformations within the respective sub-alphabet are also carried out. This means that a capital letter in the plain text is also mapped to a capital letter in the encrypted text. If you 'shuffle' here 3, you get to the 'C'.
Separated partial Alphabets
The use of several alphabets does not require the algorithms to distinguish between upper and lower case letters.
The algorithm memorizes the alphabet with which it has determined the number of the plaintext. The same alphabet is used to generate the encrypted text.
When a letter occurs in several alphabets, the first of these alphabets is used.
Options regulate the case when a letter does not appear in any alphabet: it is not encrypted, but transferred directly to the output. This is also the case when the letter is in the key. Alternatively, the non-alphabet letters in the key and the plain text can also be filtered out to increase the security. This, however, limits readability.
Example for several alphabets: Addition at Vigenère
Here both approaches are treated: for separate partial alphabets and for a memorized alphabet.
As a small example we consider Vigenère with the following two alphabets:
- L1 =
- L2 =
In both cases, both the plaintext and the key should both consist of the text
For separate partial alphabets the following results:
- The encrypted text is the smallest digit of an addition of plaintext and key when both are hexadecimal digits.
- The encrypted text is:
- Since the first alphabet L1 has been used for the digits, digits and uppercase letters in the encrypted text are always the numbers in the plain text.
- Note the difference in 'D' and 'd': The index value is the same, but the 'd' is L2, so the results differ in the encrypted text: 'A' and 'a'.
For a merged alphabets, the encrypted text is
The following table shows the calculation for the case of the separated partial alphabets L1, L2 as well as for a merged alphabet L =
|separate alphabets||merged alphabet|
|8||8||8||8||16 = 0||0||8||8||16||a|
|9||9||9||9||18 = 2||2||9||9||18||c|
|a||a||10||10||20 = 4||4||16||16||32 = 10||A|
|b||b||11||11||22 = 6||6||17||17||34 = 12||C|
|c||c||12||12||24 = 8||8||18||18||36 = 14||E|
|d||d||13||13||26 = 10||a||19||19||38 = 16||a|
|A||A||10||10||20 = 4||4||10||10||20||e|
|B||B||11||11||22 = 6||6||11||11||22 = 0||0|
|C||C||12||12||24 = 8||8||12||12||24 = 2||2|
|D||D||13||13||26 = 10||A||13||13||26 = 4||4|
cleartext = key = "0123456789abcdABCD"
Method 1: Separated: In each sub-alphabet, mod 16 is calculated (hex addition), since each sub-alphabet contains 16 elements, and it remains in the same partial alphabet from which the plaintext letter originates.
Method 2: Merged: In the alphabet, mod 22 is calculated because the alphabet contains 22 elements.