Learning Resources for Software Engineering Students »
Authors: Dickson Tan
Cryptography is the science of developing methods for secure communication, while Cryptanalysis is the discipline of breaking such methods. Understanding cryptography will enable you to correctly use cryptography libraries and combine cryptographic primitives to match the security needs of your application.
Modern cryptography involves developing techniques to achieve the following objectives:
In Cryptanalysis, the objective is to exploit weaknesses in cryptographic algorithms - for example, attempting to recover the original message from its encrypted form. The security of cryptographic algorithms is evaluated based on their resistance to these attacks.
Encryption is the process of encoding messages so that it is readable only by the intended recipient. The message being encrypted is called the plaintext. The encryption algorithm "scrambles" the message, producing ciphertext, which should only be readable by the recipient. Decryption is the reverse process of recovering the plaintext from the ciphertext.
A cipher refers to a pair of algorithms - one for encryption and decryption, and is used to refer to symmetric key techniques. A cryptosystem consists of 3 algorithms - a cipher and a key generation algorithm. Though cryptosystems usually refer to asymmetric key techniques, it may also be used to refer to symmetric key techniques.
The key is a parameter, or a piece of secret information, that determines the output of cryptographic algorithms. According to Kerckhoffs' principle, a cryptosystem should be secure even if the attacker knows everything about the cryptosystem, except the key. This means that the attacker knows the algorithms used for encryption and decryption. Although it is tempting to design secure systems by employing "security by obscurity" to keep their details hidden, they are usually easily broken once their design is known. The CSS copy-protection system for DVDs is one such example.
One of the earliest encryption algorithms or ciphers is the substitution cipher, which encrypts text by substituting each letter of the message with another letter. This page introduces the Caesar substitution cipher, and how it can be defeated using statistical analysis, a ciphertext-only attack. Modern ciphers today operate on bits rather than letters.
In a symmetric key cipher, the same key, a shared secret, is used for both encryption and decryption. The parties wishing to communicate securely share the same key. For example, in the substitution cipher, the same key (the substitution table) is used for both encryption and decryption. There are 2 types of symmetric ciphers: stream and block ciphers.
The ciphertext, keystream and plaintext are sequences or streams of bits of equal length. During encryption, each plaintext bit is combined (usually xored) with the corresponding keystream bit to produce the ciphertext bits. For example, the 1st plaintext bit is xored with the 1st keystream bit, 2nd plaintext bit with 2nd keystream bit and so on. During decryption, each ciphertext bit is xored with the corresponding keystream bit to produce the plaintext. Notice that encryption and decryption are the same operation; this is possible since xoring a bit x with another bit y twice recovers the original bit; (x xor y) xor y = x.
The simplest stream cipher is the one-time pad. In this cipher, the keystream used is bits from a truly random source, and is also the key. It is the only known cipher that cannot be cracked, even if the attacker has infinite computing power. This property is known as perfect secrecy; the ciphertext gives no additional information about the plaintext, so knowing the ciphertext does not provide any advantage to the attacker trying to recover the plaintext. Unfortunately, Shannon, renowned cryptographer and founder of Information Theory, proved that any cipher that achieves perfect secrecy must have the following limitations, making them impractical.
In practice, we do not require perfect secrecy, since attackers have limited computational power. Hence, all other ciphers are only computationally secure; their security relies on the assumption that certain problems are difficult to solve.
Modern stream ciphers approximate the operation of the one-time pad. A short key (say 256 bits) is used to seed a cryptographically secure pseudorandom number generator, which is used to generate the keystream for both encryption and decryption. They are more practical, as the key the communicating parties need to share is much shorter.
Keys must never be reused in stream ciphers. Doing so causes the same keystream, k, to be generated, and 2 plaintexts, p and q, to be encrypted with the same keystream. If we xor the ciphertexts for p and q, we get (p xor k) xor (q xor k) = (p xor q) xor (k xor k) = p xor q. This exposes information about the plaintexts, which may lead to the recovery of both. Here is a visual illustration of this attack, and an example of this happening in practice.
Stream ciphers are used for their efficiency, ease of implementation in hardware, and when the length of the plaintext is unpredictable. However, block ciphers are more widely used than stream ciphers. In some modes of operation, they can be used like stream ciphers, reducing the need for dedicated stream ciphers.
RC4 is the most widely used stream cipher. Though its use is now discouraged due to known vulnerabilities. The eSTREAM project, a research effort to develop state-of-the-art stream ciphers, has identified several ciphers suitable for widespread adoption. However, being relatively new, they have not been analyzed as extensively by cryptographers.
Unlike stream ciphers, which operate on individual bits, block ciphers operate on an entire block of bits at a time. In practice, the size of each block is 64 or 128 bits.
Shannon introduced 2 primitives, which modern block ciphers are built on.
Ciphers that use only one of these operations are insecure. For example, the insecure Caesar cipher only uses confusion. But strong ciphers can be built by using both confusion and diffusion - these are called product ciphers.
Block ciphers alone aren't very useful, because they only provide a secure way of encrypting one block of data. Modes of operation are ways of using block ciphers to securely encrypt multiple blocks. The plaintext is padded if it is not an even multiple of the block size. Block ciphers can also provide additional services such as integrity, depending on the mode used, which makes them vercitile. This article provides a nice overview of common modes.
Most modes require a random value called an initialization vector (IV) so that encrypting the same message twice doesn't produce the same ciphertext, which leaks information. It is critical that the IV be random, used only once and unpredictable. Not doing so has caused several vulnerabilities such as the BEAST Attack on TLS and the recovery of WEP keys.
Symmetric cryptography is not practical for the following situations, which motivates the development of asymmetric or public key cryptography:
nusers to securely communicate to every other user, each user must securely store
n-1keys , which is impractical.
In asymmetric cryptography, there are 2 separate keys; one for encryption, and the other for decryption. The key used for encryption is published so that anyone can securely send messages to Alice. Hence, it is called a public key. Alice has the corresponding decryption key, or private key, which is kept secret. Hence, messages encrypted with the public key can only be decrypted by Alice. In contrast, there is a single shared key which must be kept secret in symmetric ciphers.
Public key cryptography has many uses beyond sending encrypted messages, such as key agreement and non-repudiation, which will be covered in future sections.
For this to be secure, it must be computationally infeasible to obtain the private key from the public key. This is achieved using one-way functions, which have the following properties:
RSA is one of the earliest and most widely used public key cryposystems.
Its one-way function is the integer factorization problem; Given 2 large primes
q, it is easy to compute the product
pq, but difficult to factor
The first article of this 2-part series explains how RSA works, as well as the minimal number theory required. The follow-up article explains why RSA works by introducing some important results in number theory.