Cryptography Lectures 8–9
1. Lecture 8: Chosen Plaintext Security and PRFs
1.1. Motivation: Key Reuse
The lecture starts by returning to the question of key reuse.
So far, the encryption schemes studied in the course do not really allow us to reuse a key safely.
For the one-time pad, this is especially clear:
\[ c = k \oplus m \]
where the key \(k\) must be used only once.
Stream ciphers improve the situation by replacing the very long one-time pad key with a short seed:
\[ G(s) \]
where \(G\) is a pseudorandom generator and \(s\) is a short seed.
Then encryption looks like
\[ c = G(s) \oplus m. \]
This allows us to generate long key material from a short seed.
However, this still does not give us fully reusable keys.
The problem is that both the encryptor and decryptor must stay synchronized. They must know how far the PRG output has already been consumed.
For example, if the sender has already used the first \(1000\) bits of \(G(s)\), then the receiver must also know that decryption should start from bit \(1001\).
This requires shared state.
The professor emphasizes that shared state in distributed systems is usually something we want to avoid whenever possible.
So the goal is:
\[ \boxed{\text{We want fully reusable keys.}} \]
But before constructing such schemes, we must first update the security definition.
1.2. Chosen Plaintext Attacks
If a key is reused, then an adversary may see many ciphertexts encrypted under the same key.
Moreover, not all plaintexts are necessarily unpredictable.
For example, in real-world network traffic, packets often contain structured headers or metadata. An adversary may already know or partially predict these parts of the plaintext.
Even worse, a real-world adversary may be able to influence what honest parties encrypt.
The lecture gives an example involving a front-end server and a back-end server.
Suppose:
- a front-end server and a back-end server share a secret key \(K\);
- the front-end encrypts user requests before forwarding them to the back-end;
- an adversary can send a query to the front-end;
- the adversary can observe the encrypted traffic between the front-end and back-end.
Then the adversary can choose a plaintext, for example:
\[ m = \text{"Mother's Day gift ideas"} \]
and cause the front-end server to encrypt it.
Later, if another user sends the same query and the encryption is deterministic, then the same plaintext will produce the same ciphertext.
So the adversary can detect that the second user made the same query.
This motivates a stronger attack model:
\[ \boxed{ \text{The adversary may choose plaintexts and see their encryptions.} } \]
This is called a chosen plaintext attack.
1.3. Encryption Oracle
The conservative modeling approach is to give the adversary an encryption oracle.
The adversary does not get the secret key \(K\), because that would trivially break security.
Instead, the adversary gets black-box access to:
\[ \mathsf{Enc}(K,\cdot). \]
That means the adversary may submit messages \(m\), and receive ciphertexts:
\[ c \leftarrow \mathsf{Enc}(K,m). \]
The adversary may do this any polynomial number of times, and the queries may be adaptive.
Adaptive means:
- the adversary first queries \(m_1\) and receives \(c_1\);
- then, based on \(c_1\), it chooses \(m_2\);
- then, based on \(c_1,c_2\), it chooses \(m_3\);
- and so on.
1.4. IND-CPA Experiment
The new security experiment is called the IND-CPA experiment.
Here:
- IND means indistinguishability;
- CPA means chosen plaintext attack.
The experiment is written as:
\[ \mathsf{IND\text{-}CPA}_{\mathcal A}(\lambda). \]
The experiment proceeds as follows.
1.4.1. Step 1: Key generation
The challenger generates a key:
\[ K \leftarrow \mathsf{KeyGen}(1^\lambda). \]
1.4.2. Step 2: First oracle phase
The adversary \(\mathcal A\) gets oracle access to:
\[ \mathsf{Enc}(K,\cdot). \]
It may query messages of its choice:
\[ m_i \]
and receive:
\[ c_i \leftarrow \mathsf{Enc}(K,m_i). \]
This may happen any polynomial number of times.
1.4.3. Step 3: Challenge messages
At some point, the adversary submits two challenge messages:
\[ m_0, m_1. \]
These are chosen adaptively based on everything the adversary has seen so far.
As usual, the messages must have the same length.
1.4.4. Step 4: Challenge ciphertext
The challenger chooses a random bit:
\[ b \leftarrow \{0,1\}. \]
Then it computes:
\[ c^\ast \leftarrow \mathsf{Enc}(K,m_b) \]
and sends \(c^\ast\) to the adversary.
1.4.5. Step 5: Second oracle phase
After seeing \(c^\ast\), the adversary again gets access to the encryption oracle:
\[ \mathsf{Enc}(K,\cdot). \]
Again, it may make any polynomial number of adaptive queries.
1.4.6. Step 6: Guess
Finally, the adversary outputs a guess:
\[ b'. \]
The experiment outputs \(1\) if the adversary guessed correctly:
\[ b' = b. \]
Otherwise it outputs \(0\).
So:
\[ \mathsf{IND\text{-}CPA}_{\mathcal A}(\lambda)=1 \]
means that \(\mathcal A\) won the experiment.
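The six steps can be sketched as a small Python harness. This is only an illustration: the function name `ind_cpa_experiment`, the two-method adversary interface (`choose`, `guess`), and the `GuessingAdversary` baseline are assumptions made here, not notation from the lecture.

```python
import secrets


def ind_cpa_experiment(keygen, enc, adversary) -> int:
    """Run one IND-CPA experiment; return 1 if the adversary guesses b, else 0."""
    key = keygen()                                   # Step 1: K <- KeyGen
    oracle = lambda m: enc(key, m)                   # black-box access to Enc(K, .)

    # Steps 2-3: the adversary queries the oracle, then names m0 and m1.
    m0, m1, state = adversary.choose(oracle)
    if len(m0) != len(m1):
        raise ValueError("challenge messages must have the same length")

    b = secrets.randbits(1)                          # Step 4: random bit b
    challenge = enc(key, m1 if b else m0)

    # Steps 5-6: more oracle access, then the guess b'.
    b_guess = adversary.guess(oracle, challenge, state)
    return int(b_guess == b)


class GuessingAdversary:
    """Baseline adversary (hypothetical): ignores everything and guesses at random."""

    def choose(self, oracle):
        return b"\x00" * 16, b"\xff" * 16, None

    def guess(self, oracle, challenge, state):
        return secrets.randbits(1)
```

The baseline adversary wins with probability \(1/2\); IND-CPA security says no efficient adversary does noticeably better.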
1.5. IND-CPA Security Definition
An encryption scheme
\[ (\mathsf{KeyGen},\mathsf{Enc},\mathsf{Dec}) \]
is IND-CPA secure if for every PPT adversary \(\mathcal A\), there exists a negligible function \(\nu\) such that for all \(\lambda \in \mathbb N\),
\[ \Pr[\mathsf{IND\text{-}CPA}_{\mathcal A}(\lambda)=1] < \frac12 + \nu(\lambda). \]
The intuition is:
\[ \boxed{ \text{Even with encryption-oracle access, the adversary cannot guess } b \text{ much better than random guessing.} } \]
Random guessing wins with probability \(1/2\).
So the adversary is allowed only negligible advantage over \(1/2\).
1.6. Security Defined by Insecurity
The professor emphasizes again that these game-based definitions define security indirectly by saying what it means to be insecure.
The negation of IND-CPA security is:
There exists a PPT adversary \(\mathcal A\) and a non-negligible function \(\varepsilon\) such that
\[ \Pr[\mathsf{IND\text{-}CPA}_{\mathcal A}(\lambda)=1] \ge \frac12 + \varepsilon(\lambda). \]
So the scheme is insecure if some efficient adversary wins the game with non-negligible advantage.
This is the same paradigm as in earlier IND definitions.
1.7. Why Deterministic Encryption Cannot Be IND-CPA Secure
The professor then discusses an important consequence:
\[ \boxed{ \text{IND-CPA secure encryption cannot be deterministic.} } \]
Suppose encryption is deterministic.
Then the adversary can do the following:
- Choose two distinct challenge messages \(m_0 \ne m_1\) of equal length.
- Query the encryption oracle on \(m_0\) and \(m_1\):
\[ c_0 \leftarrow \mathsf{Enc}(K,m_0), \qquad c_1 \leftarrow \mathsf{Enc}(K,m_1). \]
- Receive the challenge ciphertext:
\[ c^\ast \leftarrow \mathsf{Enc}(K,m_b). \]
- Compare \(c^\ast\) with \(c_0\) and \(c_1\).
If encryption is deterministic, then (using \(c_0 \ne c_1\), which follows from correctness because \(m_0 \ne m_1\)):
\[ c^\ast = c_0 \Longleftrightarrow b=0, \]
and
\[ c^\ast = c_1 \Longleftrightarrow b=1. \]
Therefore, the adversary can recover \(b\) perfectly.
So IND-CPA security requires encryption to be randomized.
This explains why the syntax of encryption was defined as a randomized algorithm.
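The attack can be tried against a toy deterministic scheme. In the sketch below, `det_enc` (XOR with a fixed key-derived pad) is an assumed stand-in for "some deterministic encryption"; it is not a scheme from the lecture, and the message strings are arbitrary.

```python
import hashlib
import hmac
import secrets


def det_enc(key: bytes, m: bytes) -> bytes:
    """Toy deterministic encryption: XOR with a fixed key-derived pad (len(m) <= 32)."""
    pad = hmac.new(key, b"pad", hashlib.sha256).digest()
    return bytes(x ^ y for x, y in zip(m, pad))


def attack_once() -> bool:
    """Play one IND-CPA game against det_enc; return True if the adversary wins."""
    key = secrets.token_bytes(16)
    oracle = lambda m: det_enc(key, m)

    m0, m1 = b"attack at dawn!!", b"retreat at dusk!"   # distinct, equal length
    c0 = oracle(m0)                                      # one oracle query suffices

    b = secrets.randbits(1)                              # challenger's secret bit
    challenge = det_enc(key, m1 if b else m0)

    guess = 0 if challenge == c0 else 1                  # compare with the known c0
    return guess == b
```

Because the scheme is deterministic, `attack_once()` succeeds in every run, not just with probability close to \(1\).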
1.8. Need for a Stronger Primitive
PRGs transform a small amount of randomness into a larger amount of pseudorandomness:
\[ \text{short seed} \quad\Longrightarrow\quad \text{long pseudorandom string}. \]
But for reusable-key encryption, this is not enough.
The adversary may ask for polynomially many encryptions under the same key. So we need a way to generate pseudorandom-looking values on many different inputs.
This leads to the next main primitive:
\[ \boxed{\text{Pseudorandom functions.}} \]
1.9. Pseudorandom Functions
A pseudorandom function, or PRF, is an efficiently computable keyed function ensemble.
For each key \(K\), we get a function:
\[ F_K : \{0,1\}^{\lambda} \to \{0,1\}^{\ell}. \]
Here:
- \(K \in \{0,1\}^{\lambda}\);
- \(\lambda\) is the security parameter;
- \(\ell\) is the output length.
The word ensemble means that we do not have just one function, but a whole family of functions indexed by the key \(K\).
So we can think of \(F\) either as:
\[ F(K,x) \]
or, after fixing \(K\), as:
\[ F_K(x). \]
1.10. Uniformly Random Functions
A PRF is compared to a truly random function with the same signature.
Let
\[ H : \{0,1\}^{\lambda} \to \{0,1\}^{\ell} \]
be a uniformly random function.
This means:
For every input
\[ x \in \{0,1\}^{\lambda}, \]
the value
\[ H(x) \]
is chosen independently and uniformly at random from
\[ \{0,1\}^{\ell}. \]
Equivalently, one can imagine choosing the full truth table of \(H\) uniformly at random.
But the full truth table is exponentially large, so we do not actually sample it all in advance.
1.11. Lazy Sampling
To simulate a truly random function efficiently, we use lazy sampling.
The simulator maintains a table.
Initially, the table is empty.
When a query \(x\) arrives:
- if \(x\) is already in the table, return the previously stored value \(H(x)\);
- if \(x\) is new, choose
\[ y \leftarrow \{0,1\}^{\ell} \]
uniformly at random, store \(H(x)=y\), and return \(y\).
This produces exactly the same distribution as choosing the whole truth table in advance, but only samples the values that are actually queried.
Since a PPT distinguisher can make only polynomially many queries, this is efficient.
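A minimal sketch of lazy sampling in Python, assuming inputs are byte strings and the output length is given in bytes; the class name `LazyRandomFunction` is chosen here for illustration.

```python
import secrets


class LazyRandomFunction:
    """Lazily sampled random function with outputs of `out_bytes` bytes."""

    def __init__(self, out_bytes: int):
        self.out_bytes = out_bytes
        self.table = {}                     # initially empty

    def __call__(self, x: bytes) -> bytes:
        if x not in self.table:             # new input: sample a fresh uniform value
            self.table[x] = secrets.token_bytes(self.out_bytes)
        return self.table[x]                # repeated input: return the stored value
```

The table only ever holds as many entries as there were distinct queries, which is why a polynomial-time simulator can afford it.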
1.12. PRF Security Definition
A keyed function ensemble \(F\) is a pseudorandom function if for every oracle PPT distinguisher \(\mathcal D\),
\begin{equation*} \left| \Pr[\mathcal D^{F_K(\cdot)}(1^\lambda)=1] - \Pr[\mathcal D^{H(\cdot)}(1^\lambda)=1] \right| < \mathsf{negl}(\lambda). \end{equation*}
The probability on the left is over:
- the random key
\[ K \leftarrow \{0,1\}^{\lambda}, \]
- and the random coins of \(\mathcal D\).
The probability on the right is over:
- the random choice of the truly random function \(H\);
- and the random coins of \(\mathcal D\).
The distinguisher gets oracle access either to \(F_K\) or to \(H\), and must decide which world it is in.
The PRF property says that no efficient distinguisher can tell the two worlds apart except with negligible advantage.
1.13. PRFs from PRGs
The lecture states a famous theorem by Goldreich, Goldwasser, and Micali from 1986.
If there exists a length-doubling PRG
\[ G : \{0,1\}^{\lambda} \to \{0,1\}^{2\lambda}, \]
then there exists a PRF.
The proof is not given in full, but the professor explains the idea.
Write the output of \(G\) as two halves:
\[ G(s) = G_0(s) \,\|\, G_1(s), \]
where
\[ G_0(s),G_1(s) \in \{0,1\}^{\lambda}. \]
Then evaluate \(G\) in a binary tree.
For example, starting from key \(K\):
\[ G(K) = G_0(K) \,\|\, G_1(K). \]
Then expand again:
\[ G(G_0(K)) = G_0(G_0(K)) \,\|\, G_1(G_0(K)), \]
\[ G(G_1(K)) = G_0(G_1(K)) \,\|\, G_1(G_1(K)). \]
Inputs are interpreted as paths in the binary tree.
For example, input \(01\) means:
- first go left using \(0\);
- then go right using \(1\).
The value at the reached leaf becomes the function output.
So:
\[ \boxed{ \text{Inputs map to paths in a binary tree.} } \]
The security proof is technical because for exponentially large domains one cannot replace the entire tree at once. Instead, the proof uses lazy evaluation and replaces only the nodes touched by the distinguisher.
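The tree walk can be sketched as follows. Note the assumption: SHA-256 is used only as a placeholder for the length-doubling generator \(G\); it is not a proven PRG, and the names `G` and `ggm_prf` are illustrative.

```python
import hashlib

LAMBDA_BYTES = 16  # seed length in bytes (lambda = 128 bits)


def G(seed: bytes) -> tuple[bytes, bytes]:
    """Placeholder length-doubling generator: returns (G_0(s), G_1(s))."""
    out = hashlib.sha256(seed).digest()          # 32 bytes = 2 * LAMBDA_BYTES
    return out[:LAMBDA_BYTES], out[LAMBDA_BYTES:]


def ggm_prf(key: bytes, x_bits: str) -> bytes:
    """Walk the binary tree from the root `key` along the path given by x_bits."""
    node = key
    for bit in x_bits:
        left, right = G(node)
        node = left if bit == "0" else right     # 0 = go left, 1 = go right
    return node                                  # value at the reached node
```

For example, `ggm_prf(K, "01")` computes \(G_1(G_0(K))\): one call to \(G\) per input bit, so evaluation costs as many PRG calls as the input has bits.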
1.14. Combining PRFs: First Example
Assume \(F\) and \(F'\) are PRFs.
Define:
\[ F^{\ast}_{K,K'}(x) = F_K(x) \oplus F'_{K'}(x), \]
where \(K\) and \(K'\) are independent keys.
Question:
Is \(F^\ast\) a PRF?
Answer:
\[ \boxed{\text{Yes.}} \]
The reasoning is:
Because \(F\) is a PRF and \(K\) is chosen independently of \(K'\), the function \(F_K\) can be replaced by a truly random function \(H\) without any efficient distinguisher noticing.
Then:
\[ F^{\ast}_{K,K'}(x) \approx H(x) \oplus F'_{K'}(x). \]
But \(H(x)\) is uniformly random and independent of \(F'_{K'}(x)\).
For any fixed value \(a\),
\[ H(x) \oplus a \]
is still uniformly random.
Therefore,
\[ H(x) \oplus F'_{K'}(x) \]
is distributed like a uniformly random function.
So \(F^\ast\) is a PRF.
The professor emphasizes the useful proof pattern:
If a PRF component appears once and independently, try replacing it by a truly random function.
1.15. Combining PRFs: Second Example
Now consider another construction:
\[ F^{\ast}_{K}(x) = \bigl(F_K(x), F_K(x \oplus 1^\lambda)\bigr). \]
Here the same key \(K\) is used twice.
Question:
Is \(F^\ast\) a PRF?
Answer:
\[ \boxed{\text{No.}} \]
The professor says that key reuse inside a construction should make us suspicious.
To distinguish this function from a truly random function, the distinguisher queries two inputs:
\[ 0^\lambda \]
and
\[ 1^\lambda. \]
First query:
\[ F^\ast_K(0^\lambda) = \bigl(F_K(0^\lambda), F_K(0^\lambda \oplus 1^\lambda)\bigr) = \bigl(F_K(0^\lambda), F_K(1^\lambda)\bigr). \]
Let:
\[ a = F_K(0^\lambda), \qquad b = F_K(1^\lambda). \]
So the first output is:
\[ (a,b). \]
Second query:
\[ F^\ast_K(1^\lambda) = \bigl(F_K(1^\lambda), F_K(1^\lambda \oplus 1^\lambda)\bigr) = \bigl(F_K(1^\lambda), F_K(0^\lambda)\bigr) = (b,a). \]
So the two outputs are always swapped:
\[ F^\ast_K(0^\lambda) = (a,b), \]
\[ F^\ast_K(1^\lambda) = (b,a). \]
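This structured relation holds with probability \(1\) for \(F^\ast\).
For a truly random function with \(2\ell\)-bit outputs, the two answers are independent and uniform, so the relation holds only with probability \(2^{-2\ell}\), which is negligible when \(\ell \ge \lambda\).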
Therefore, there is an efficient distinguisher, so \(F^\ast\) is not a PRF.
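The distinguisher is short enough to write out. In this sketch, HMAC-SHA256 (truncated) is an assumed stand-in for the PRF \(F\), and `make_random_function` lazily samples the truly random function it is compared against; all names are illustrative.

```python
import hashlib
import hmac
import secrets


def F(key: bytes, x: bytes) -> bytes:
    """Stand-in PRF (assumption): HMAC-SHA256 truncated to 16 bytes."""
    return hmac.new(key, x, hashlib.sha256).digest()[:16]


def F_star(key: bytes, x: bytes) -> tuple[bytes, bytes]:
    """The construction from the text: (F_K(x), F_K(x xor 1^lambda))."""
    flipped = bytes(b ^ 0xFF for b in x)
    return F(key, x), F(key, flipped)


def make_random_function():
    """Lazily sampled random function with outputs in the same format as F_star."""
    table = {}

    def H(x: bytes) -> tuple[bytes, bytes]:
        if x not in table:
            table[x] = (secrets.token_bytes(16), secrets.token_bytes(16))
        return table[x]

    return H


def distinguisher(oracle) -> int:
    """Output 1 iff the answers on 0^lambda and 1^lambda are swapped copies."""
    a, b = oracle(bytes(16))            # query 0^lambda
    c, d = oracle(b"\xff" * 16)         # query 1^lambda
    return int((c, d) == (b, a))
```

The distinguisher makes only two queries and never needs the key, which is all the PRF definition allows it.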
1.16. Encryption from PRFs
Now the lecture constructs IND-CPA secure encryption from a PRF.
Let
\[ F_K : \{0,1\}^{\lambda} \to \{0,1\}^{\ell} \]
be a PRF.
The message space consists of messages
\[ m \in \{0,1\}^{\ell}. \]
The scheme is:
1.16.1. Key Generation
\[ \mathsf{KeyGen}(1^\lambda): \quad K \leftarrow \{0,1\}^{\lambda}. \]
1.16.2. Encryption
To encrypt \(m\), choose fresh randomness:
\[ r \leftarrow \{0,1\}^{\lambda}. \]
Then compute:
\[ y = F_K(r) \oplus m. \]
Output:
\[ c = (r,y). \]
So:
\[ \mathsf{Enc}(K,m) = (r, F_K(r) \oplus m). \]
The random value \(r\) is included in the ciphertext so that the decryptor knows where to evaluate the PRF.
1.16.3. Decryption
Parse the ciphertext as:
\[ c=(r,y). \]
Then compute:
\[ m = F_K(r) \oplus y. \]
So:
\[ \mathsf{Dec}(K,(r,y)) = F_K(r) \oplus y. \]
1.17. Correctness
We check:
\[ \mathsf{Dec}(K,\mathsf{Enc}(K,m)) = m. \]
Indeed:
\[ \mathsf{Enc}(K,m) = (r,F_K(r)\oplus m). \]
Therefore:
\[ \mathsf{Dec}(K,(r,F_K(r)\oplus m)) = F_K(r) \oplus (F_K(r)\oplus m). \]
Since XOR cancels:
\[ F_K(r) \oplus F_K(r) = 0, \]
we get:
\[ F_K(r) \oplus F_K(r)\oplus m = m. \]
So the scheme is correct.
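A minimal sketch of the scheme and its correctness check in Python. As an assumption for illustration, HMAC-SHA256 stands in for the PRF \(F\), with \(\lambda = 128\) bits for \(K\) and \(r\) and \(\ell = 256\) bits for messages; these parameters and names are not fixed by the lecture.

```python
import hashlib
import hmac
import secrets

LAMBDA_BYTES = 16   # length of K and r
ELL_BYTES = 32      # length of messages and pads


def F(key: bytes, r: bytes) -> bytes:
    """Stand-in PRF (assumption): HMAC-SHA256 with a 32-byte output."""
    return hmac.new(key, r, hashlib.sha256).digest()[:ELL_BYTES]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def keygen() -> bytes:
    return secrets.token_bytes(LAMBDA_BYTES)


def enc(key: bytes, m: bytes) -> tuple[bytes, bytes]:
    if len(m) != ELL_BYTES:
        raise ValueError("message must be exactly ELL_BYTES long")
    r = secrets.token_bytes(LAMBDA_BYTES)       # fresh randomness per encryption
    return r, xor(F(key, r), m)                 # c = (r, F_K(r) xor m)


def dec(key: bytes, c: tuple[bytes, bytes]) -> bytes:
    r, y = c
    return xor(F(key, r), y)                    # m = F_K(r) xor y
```

Encrypting the same message twice yields different ciphertexts because \(r\) is resampled each time, while decryption still recovers \(m\) in both cases.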
1.18. Why This Looks Like a One-Time Pad
For each encryption, the scheme chooses a fresh random \(r\).
Then \(F_K(r)\) is used as a pad:
\[ F_K(r) \oplus m. \]
The ciphertext contains \(r\) but not \(F_K(r)\), and computing \(F_K(r)\) from \(r\) requires knowing \(K\).
So the scheme can be viewed as generating a fresh pseudorandom one-time pad for each encryption.
1.19. IND-CPA Security Theorem
The theorem is:
If \(F\) is a PRF, then the above encryption scheme is IND-CPA secure.
Formally:
\[ \boxed{ F \text{ is a PRF} \Longrightarrow (\mathsf{KeyGen},\mathsf{Enc},\mathsf{Dec}) \text{ is IND-CPA secure.} } \]
1.20. Proof Idea
The proof is by contraposition.
Assume the encryption scheme is not IND-CPA secure.
Then there exists a PPT adversary \(\mathcal A\) and a non-negligible function \(\varepsilon\) such that:
\[ \Pr[\mathsf{IND\text{-}CPA}_{\mathcal A}(\lambda)=1] \ge \frac12 + \varepsilon(\lambda). \]
The goal is to build a PPT distinguisher \(\mathcal D\) that distinguishes \(F_K\) from a truly random function \(H\).
This would contradict the assumption that \(F\) is a PRF.
1.21. Original Experiment
In the real IND-CPA experiment, every encryption oracle query works as follows.
Given a message \(m\):
- choose
\[ r \leftarrow \{0,1\}^{\lambda}; \]
- compute
\[ y = F_K(r)\oplus m; \]
- return
\[ (r,y). \]
The challenge ciphertext is computed similarly.
The challenger chooses:
\[ b \leftarrow \{0,1\}, \]
\[ r^\ast \leftarrow \{0,1\}^{\lambda}, \]
and returns:
\[ c^\ast = (r^\ast, F_K(r^\ast)\oplus m_b). \]
1.22. Modified Experiment
Now define a modified experiment where every call to \(F_K\) is replaced by a truly random function \(H\).
So encryption oracle queries return:
\[ (r, H(r)\oplus m). \]
The challenge ciphertext is:
\[ c^\ast = (r^\ast, H(r^\ast)\oplus m_b). \]
Call this modified experiment:
\[ \mathsf{Exp}'_{\mathcal A}(\lambda). \]
1.23. First Step: Replacing the PRF by a Random Function
The professor initially went quickly over this step in lecture 8 and then returned to it at the beginning of lecture 9.
The claim is:
\begin{equation*} \Pr[\mathsf{IND\text{-}CPA}_{\mathcal A}(\lambda)=1] \le \Pr[\mathsf{Exp}'_{\mathcal A}(\lambda)=1] + \mathsf{negl}(\lambda). \end{equation*}
Why?
Suppose this were false.
Then the adversary’s winning probability would noticeably change when replacing \(F_K\) by \(H\).
Then we could build a PRF distinguisher \(\mathcal D\).
The distinguisher \(\mathcal D\) gets oracle access to some oracle \(O\), where \(O\) is either:
\[ O = F_K \]
or
\[ O = H. \]
The distinguisher simulates the IND-CPA experiment for \(\mathcal A\), but whenever the experiment needs to compute \(F_K(r)\), \(\mathcal D\) queries its own oracle:
\[ O(r). \]
If \(O=F_K\), then \(\mathcal D\) perfectly simulates the real IND-CPA experiment.
If \(O=H\), then \(\mathcal D\) perfectly simulates the modified experiment.
Therefore, if \(\mathcal A\)’s success probability differs noticeably between the two experiments, then \(\mathcal D\) distinguishes \(F_K\) from \(H\).
This contradicts PRF security.
So replacing \(F_K\) by \(H\) changes the adversary’s success probability by at most a negligible amount.
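The reduction can be sketched as code. The sketch assumes a hypothetical adversary object with `choose` and `guess` methods and byte-string messages; the point is only that \(\mathcal D\) never touches \(K\) and computes every pad through its oracle \(O\).

```python
import secrets

R_BYTES = 16  # length of r in bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def prf_distinguisher(O, adversary) -> int:
    """D^O: simulate the IND-CPA game for `adversary`, using O(r) as the pad.

    Returns 1 iff the simulated adversary guesses b correctly.
    """
    def enc_oracle(m: bytes) -> tuple[bytes, bytes]:
        r = secrets.token_bytes(R_BYTES)
        return r, xor(O(r), m)                   # O is either F_K or H

    m0, m1, state = adversary.choose(enc_oracle)
    b = secrets.randbits(1)
    challenge = enc_oracle(m1 if b else m0)      # challenge built the same way
    b_guess = adversary.guess(enc_oracle, challenge, state)
    return int(b_guess == b)
```

With \(O = F_K\) the output is \(1\) exactly when \(\mathcal A\) wins the real experiment, and with \(O = H\) exactly when it wins the modified one, so the gap between the two winning probabilities is the distinguishing advantage of \(\mathcal D\).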
1.24. Second Step: Analyze the Modified Experiment
In the modified experiment, the only remaining issue is whether the random value used in the challenge ciphertext collides with one of the random values used in an encryption oracle query.
Let:
\[ r_1,\dots,r_q \]
be the random values generated by the encryption oracle, where \(q\) is polynomial in \(\lambda\).
Let
\[ r^\ast \]
be the random value used in the challenge ciphertext.
Consider the good event:
\[ \forall i,\quad r_i \ne r^\ast. \]
If this event holds, then \(H(r^\ast)\) is used exactly once: only in the challenge ciphertext.
Then:
\[ c^\ast = (r^\ast,H(r^\ast)\oplus m_b). \]
Since \(H(r^\ast)\) is uniform and used only once, it acts exactly like a one-time pad.
Therefore, conditioned on no collision, the challenge ciphertext reveals no information about \(b\).
So:
\[ \Pr[\mathcal A \text{ wins} \mid \forall i,\ r_i\ne r^\ast] = \frac12. \]
1.25. Collision Probability
The bad event is:
\[ \exists i \in \{1,\dots,q\} : r_i = r^\ast. \]
By the union bound:
\[ \Pr[\exists i : r_i = r^\ast] \le \sum_{i=1}^{q} \Pr[r_i = r^\ast]. \]
For each \(i\),
\[ \Pr[r_i = r^\ast] = 2^{-\lambda}. \]
Therefore:
\[ \Pr[\exists i : r_i = r^\ast] \le q \cdot 2^{-\lambda}. \]
Since \(q\) is polynomial in \(\lambda\), the quantity
\[ q \cdot 2^{-\lambda} \]
is negligible.
Thus the bad event happens only with negligible probability.
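To get a feeling for the size of this bound, it can be evaluated exactly. The concrete values below (\(q = 2^{40}\) queries, \(\lambda = 128\)) are chosen purely for illustration.

```python
from fractions import Fraction


def collision_bound(q: int, lam: int) -> Fraction:
    """Union bound q * 2^(-lambda) on Pr[exists i : r_i = r*]."""
    return Fraction(q, 2 ** lam)


# Illustrative parameters: 2^40 oracle queries, 128-bit r.
bound = collision_bound(2 ** 40, 128)   # equals 2^(-88)
```

Even a trillion oracle queries leave the collision probability far below anything observable.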
1.26. Conclusion of the Proof
In the modified experiment:
- except with negligible probability, there is no collision;
- if there is no collision, \(H(r^\ast)\) is a one-time pad;
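- therefore, conditioned on no collision, the adversary wins with probability exactly \(1/2\).
So:
\[ \Pr[\mathsf{Exp}'_{\mathcal A}(\lambda)=1] \le \Pr[\mathcal A \text{ wins} \mid \text{no collision}] + \Pr[\text{collision}] \le \frac12 + q \cdot 2^{-\lambda} = \frac12 + \mathsf{negl}(\lambda). \]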
Combining this with the PRF replacement step:
\[ \Pr[\mathsf{IND\text{-}CPA}_{\mathcal A}(\lambda)=1] \le \frac12 + \mathsf{negl}(\lambda). \]
Therefore the scheme is IND-CPA secure.
1.27. Lecture 8 Summary
The main takeaways are:
- Key reuse requires a stronger security notion.
- This stronger notion is IND-CPA security.
- IND-CPA security gives the adversary an encryption oracle.
- IND-CPA secure encryption cannot be deterministic.
- Pseudorandom functions are introduced as the main workhorse for reusable-key symmetric encryption.
- A simple IND-CPA secure encryption scheme can be built from a PRF:
\[ \mathsf{Enc}(K,m) = (r,F_K(r)\oplus m). \]
- The proof uses two main ideas:
- replace the PRF by a truly random function;
- argue that, except with negligible collision probability, the challenge pad is used only once and therefore behaves like a one-time pad.
2. Lecture 9: Block Ciphers and Modes of Operation
2.1. Beginning of Lecture 9: Clarifying the Previous Proof
Lecture 9 begins by revisiting the proof from Lecture 8.
The professor says that in the previous lecture he went too quickly over the step where the PRF is replaced by a truly random function.
So he expands that step.
The original experiment is the IND-CPA experiment with the PRF-based encryption scheme plugged in.
The modified experiment is the same experiment, except that every occurrence of
\[ F_K(r) \]
is replaced by
\[ H(r), \]
where \(H\) is a truly random function.
The key claim is:
\begin{equation*} \Pr[\mathsf{IND\text{-}CPA}_{\mathcal A}(\lambda)=1] \le \Pr[\mathsf{Exp}'_{\mathcal A}(\lambda)=1] + \mathsf{negl}(\lambda). \end{equation*}
The justification is via reduction.
A distinguisher \(\mathcal D\) gets oracle access to \(O\), where:
\[ O = F_K \]
or
\[ O = H. \]
It simulates the experiment for \(\mathcal A\).
Whenever the simulated experiment wants to compute \(F_K(r)\), the distinguisher queries:
\[ O(r). \]
If \(O=F_K\), then the simulation is exactly the original IND-CPA experiment.
If \(O=H\), then the simulation is exactly the modified experiment.
Thus, if \(\mathcal A\) behaves noticeably differently in the two experiments, then \(\mathcal D\) distinguishes \(F_K\) from \(H\), contradicting PRF security.
The professor also emphasizes that this kind of step will often be done inline in later proofs:
If a PRF is used correctly, we replace it by a truly random function and say that otherwise we would get a PRF distinguisher.
2.2. Recap: Pseudorandom Functions
The lecture then recaps the definition of PRFs.
A PRF is an efficiently computable keyed function ensemble:
\[ F_K : \{0,1\}^{\lambda} \to \{0,1\}^{\ell}. \]
A truly random function with the same signature is:
\[ H : \{0,1\}^{\lambda} \to \{0,1\}^{\ell}. \]
For every input \(x\),
\[ H(x) \]
is chosen independently and uniformly at random from \(\{0,1\}^{\ell}\).
The PRF security definition is:
\begin{equation*} \left| \Pr[\mathcal D^{F_K(\cdot)}(1^\lambda)=1] - \Pr[\mathcal D^{H(\cdot)}(1^\lambda)=1] \right| < \mathsf{negl}(\lambda) \end{equation*}
for every oracle PPT distinguisher \(\mathcal D\).
2.3. From PRFs to PRPs
The next concept is a pseudorandom permutation, or PRP.
A permutation is a bijection from a set to itself.
In this setting, the domain and codomain are both:
\[ \{0,1\}^{\lambda}. \]
So a permutation is a bijective map:
\[ F_K : \{0,1\}^{\lambda} \to \{0,1\}^{\lambda}. \]
Because it is bijective, it has an inverse:
\[ F_K^{-1} : \{0,1\}^{\lambda} \to \{0,1\}^{\lambda}. \]
A PRP is a keyed ensemble of permutations in which both directions are efficiently computable given the key:
\[ F_K \]
and
\[ F_K^{-1}. \]
2.4. Why Random Functions Are Usually Not Permutations
The professor points out that a random function is usually not a permutation.
For example, a random function
\[ H : \{0,1\}^{\lambda} \to \{0,1\}^{\lambda} \]
is likely to have collisions.
That means there may exist two distinct inputs \(x \ne x'\) such that:
\[ H(x)=H(x'). \]
If such a collision exists, then \(H\) is not injective and therefore not a permutation.
So for PRPs, we should not compare against uniformly random functions.
Instead, we compare against uniformly random permutations.
2.5. Uniformly Random Permutations
A uniformly random permutation is chosen uniformly from the set of all bijections:
\[ H : \{0,1\}^{\lambda} \to \{0,1\}^{\lambda}. \]
Since the domain is finite, the set of all permutations is finite, so in principle one can sample uniformly from it.
But just like random functions, we do not want to explicitly sample the whole table.
2.6. Lazy Sampling for Random Permutations
For a random function, lazy sampling works by choosing a fresh random output for each new input.
For a random permutation, we must also ensure that two different inputs do not map to the same output.
So we use rejection sampling.
Maintain a table of already assigned input-output pairs.
When a new input \(x\) is queried:
- choose a candidate output \(y\) uniformly at random from \(\{0,1\}^{\lambda}\);
- if \(y\) was already used as an output, reject it and sample again;
- otherwise store \(H(x)=y\) and return \(y\).
If the same input \(x\) is queried again, return the stored value.
The rejection step enforces injectivity.
Since the range has \(2^{\lambda}\) elements and a PPT adversary makes only polynomially many queries \(q\), each candidate is rejected with probability at most \(q \cdot 2^{-\lambda}\), which is negligible.
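A minimal sketch of this rejection-sampling procedure in Python. The `inverse` method is an addition made here (the text only describes the forward direction); it is included because a PRP distinguisher may also query the inverse, and it uses the symmetric rule. The class name is illustrative.

```python
import secrets


class LazyRandomPermutation:
    """Lazily sampled random permutation on byte strings of length n_bytes."""

    def __init__(self, n_bytes: int):
        self.n_bytes = n_bytes
        self.fwd = {}    # input  -> output
        self.inv = {}    # output -> input

    def forward(self, x: bytes) -> bytes:
        if x not in self.fwd:
            y = secrets.token_bytes(self.n_bytes)
            while y in self.inv:                 # reject outputs already in use
                y = secrets.token_bytes(self.n_bytes)
            self.fwd[x], self.inv[y] = y, x
        return self.fwd[x]

    def inverse(self, y: bytes) -> bytes:
        if y not in self.inv:
            x = secrets.token_bytes(self.n_bytes)
            while x in self.fwd:                 # reject inputs already in use
                x = secrets.token_bytes(self.n_bytes)
            self.fwd[x], self.inv[y] = y, x
        return self.inv[y]
```

Keeping both tables in sync is what guarantees that forward and inverse answers stay consistent with a single permutation.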
2.7. PRP Security Definition
A keyed permutation ensemble \(F\) is a PRP if for every oracle PPT distinguisher \(\mathcal D\),
\begin{equation*} \left| \Pr[ \mathcal D^{F_K(\cdot),F_K^{-1}(\cdot)}(1^\lambda)=1 ] - \Pr[ \mathcal D^{H(\cdot),H^{-1}(\cdot)}(1^\lambda)=1 ] \right| < \mathsf{negl}(\lambda). \end{equation*}Here:
- \(K \leftarrow \{0,1\}^{\lambda}\);
- \(H\) is a uniformly random permutation;
- \(\mathcal D\) gets oracle access to both the forward and inverse directions.
This is stronger than the PRF setting in one important way:
\[ \mathcal D \]
can query both:
\[ F_K(x) \]
and
\[ F_K^{-1}(y). \]
So the distinguisher can test whether the object behaves like an efficiently invertible random-looking permutation.
2.8. Relation Between PRFs and PRPs
The professor mentions that PRPs can be constructed from PRFs via the Luby-Rackoff construction.
However, the proof is not given in the lecture.
The proof strategy would again be by contraposition:
Assume the constructed PRP is insecure. Then there is a distinguisher against the PRP. Use that distinguisher to build a distinguisher against the underlying PRF.
2.9. Block Ciphers
Practitioners call pseudorandom permutations block ciphers.
So, conceptually:
\[ \boxed{ \text{block cipher} \approx \text{pseudorandom permutation}. } \]
However, practical block ciphers usually do not come with a clean security reduction to simpler assumptions.
Instead, they are designed according to best practices and studied by cryptanalysis.
The professor emphasizes:
- practical block ciphers are treated as primitive objects;
- cryptanalysts try to break them;
- if no good attacks are found after extensive analysis, they are considered secure in practice.
Two major design paradigms are mentioned:
- Feistel schemes, historically used in DES;
- substitution-permutation networks, used in AES.
2.10. Feistel Schemes
A Feistel scheme splits the input block into two halves.
For example:
\[ (L,R). \]
A round applies some round function \(f\) to one half and XORs the result into the other half, then swaps roles.
A simplified round has the form:
\[ L' = R, \]
\[ R' = L \oplus f(R). \]
The important point is that a Feistel round is invertible even if \(f\) itself is not invertible.
Given:
\[ (L',R'), \]
we recover:
\[ R = L', \]
and then:
\[ L = R' \oplus f(R). \]
So the inverse is easy to compute by running the structure backward.
This explains why Feistel networks are a natural way to build permutations.
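The invertibility claim can be checked with a round function that is clearly not invertible. In this sketch the round function `f` is a truncated SHA-256 hash, chosen here only for illustration; it is not the DES round function.

```python
import hashlib


def f(half: bytes) -> bytes:
    """Round function: a truncated hash, deliberately NOT invertible."""
    return hashlib.sha256(half).digest()[:len(half)]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def feistel_round(L: bytes, R: bytes) -> tuple[bytes, bytes]:
    """One round: L' = R, R' = L xor f(R)."""
    return R, xor(L, f(R))


def feistel_round_inv(L_new: bytes, R_new: bytes) -> tuple[bytes, bytes]:
    """Undo one round: R = L', L = R' xor f(R)."""
    R = L_new
    L = xor(R_new, f(R))
    return L, R
```

The inverse only ever evaluates `f` in the forward direction, which is why `f` itself never has to be inverted.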
DES is based on this design paradigm.
The professor mentions that DES used parameters suited to older hardware (notably a 56-bit key), and that these parameters are no longer secure by modern standards.
2.11. Substitution-Permutation Networks and AES
The second major paradigm is substitution-permutation networks.
AES is based on this idea.
Instead of splitting the state into two halves as in Feistel schemes, AES treats the state as a matrix and repeatedly applies operations such as:
- local substitutions;
- row shifts;
- mixing/permutation operations.
The lecture refers to AES as the modern standard and says it is still considered secure.
For AES-128, the best known attacks are only slightly better than brute force; the professor describes this as roughly a factor of \(\sqrt{2}\) better than brute force.
The important takeaway is:
\[ \boxed{ \text{AES has a lot of structure, but known attacks do not exploit it enough to break it.} } \]
2.12. S-boxes and Cryptanalysis
The professor mentions S-boxes.
An S-box is a small substitution component inside a block cipher.
He also mentions two major cryptanalytic paradigms:
- differential cryptanalysis;
- linear cryptanalysis.
Differential cryptanalysis studies how input differences propagate to output differences.
Although block ciphers are not functions over continuous domains, the idea is loosely inspired by derivatives:
\[ \Delta x \longmapsto \Delta y. \]
One studies whether certain input differences lead to output differences with non-random probabilities.
Linear cryptanalysis tries to approximate parts of the cipher by linear relations.
The lecture does not develop these attacks in detail, but mentions them as important cryptanalytic tools.
2.13. Block Ciphers Are Primitives
The professor stresses that block ciphers are not themselves the final encryption product.
They are primitives, just like:
- PRGs;
- PRFs;
- PRPs.
A block cipher only gives us a fixed-input-length permutation:
\[ F_K : \{0,1\}^{\lambda} \to \{0,1\}^{\lambda}. \]
But in practice, we want to encrypt long messages:
\[ m = m_1 \,\|\, m_2 \,\|\, \cdots \,\|\, m_t. \]
So we need a way to use a block cipher to encrypt long messages.
These constructions are called modes of operation.
2.14. Modes of Operation
A mode of operation is an encryption scheme built from a block cipher.
The lecture assumes messages are chopped into blocks of size \(\lambda\):
\[ m = m_1 \,\|\, m_2 \,\|\, \cdots \,\|\, m_t, \]
where each
\[ m_i \in \{0,1\}^{\lambda}. \]
If the message length is not a multiple of \(\lambda\), padding is used.
Padding may fill the final block with zeros or use a special terminator format.
The lecture then discusses several modes:
- Electronic Codebook Mode;
- Cipher Block Chaining Mode;
- Chained CBC Mode;
- Output Feedback Mode;
- Counter Mode.
2.15. Electronic Codebook Mode
The slides call this Electronic Codebook Mode, abbreviated ECM.
In standard terminology, this is usually called ECB mode.
Encryption is done independently block by block:
\[ c_i = F_K(m_i). \]
So:
\[ c_1 = F_K(m_1), \]
\[ c_2 = F_K(m_2), \]
\[ c_3 = F_K(m_3), \]
and so on.
This is very simple, but insecure.
If two plaintext blocks are equal:
\[ m_i = m_j, \]
then the ciphertext blocks are also equal:
\[ c_i = c_j. \]
Therefore, plaintext patterns remain visible in the ciphertext.
The professor emphasizes:
\[ \boxed{ \text{Electronic Codebook Mode is not IND-CPA secure.} } \]
In fact:
\[ \boxed{ \text{It is not even IND secure.} } \]
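This holds because encryption is deterministic and equal blocks remain equal: an adversary can submit \(m_0\) consisting of two equal blocks and \(m_1\) consisting of two distinct blocks, and then check whether the two challenge ciphertext blocks are equal.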
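The leak is easy to reproduce. Since the Python standard library has no AES, the sketch below assumes a toy 4-round Feistel permutation built from HMAC-SHA256 as a stand-in for \(F_K\); it is not a vetted cipher and is used only to show the pattern leak.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def F(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation standing in for F_K (NOT a real cipher)."""
    L, R = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        round_out = hmac.new(key, bytes([i]) + R, hashlib.sha256).digest()[:BLOCK // 2]
        L, R = R, xor(L, round_out)
    return L + R


def ecb_encrypt(key: bytes, message: bytes) -> bytes:
    """c_i = F_K(m_i), independently for every block."""
    if len(message) % BLOCK:
        raise ValueError("message must be a whole number of blocks")
    return b"".join(F(key, message[i:i + BLOCK])
                    for i in range(0, len(message), BLOCK))
```

Encrypting a message whose first and third blocks coincide produces a ciphertext whose first and third blocks coincide as well, without any knowledge of the key.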
The lecture uses the famous encrypted penguin image example:
- the original image contains visible structure;
- after electronic-codebook encryption, the structure is still visible;
- with a secure encryption scheme, the encrypted image should look like random noise.
The takeaway is:
\[ \boxed{ \text{Do not use Electronic Codebook Mode for new systems.} } \]
It may still exist in libraries for legacy reasons.
2.16. Why Electronic Codebook Mode Was Once Considered Plausible
The professor says that historically people may have assumed that messages were already random-looking.
If plaintext blocks were random and non-repeating, then ECB-like encryption might appear acceptable.
But this is a dangerous assumption.
Real messages, such as natural-language text or image data, are highly structured.
This is analogous to earlier attacks on classical ciphers, where English text structure leaked information.
2.17. Cipher Block Chaining Mode
Cipher Block Chaining Mode, or CBC mode, introduces randomness through an initialization vector.
Let
\[ IV \leftarrow \{0,1\}^{\lambda} \]
be chosen uniformly at random.
The \(IV\) is included in the ciphertext.
Encryption works as follows:
\[ c_1 = F_K(m_1 \oplus IV), \]
\[ c_2 = F_K(m_2 \oplus c_1), \]
\[ c_3 = F_K(m_3 \oplus c_2), \]
and in general:
\[ c_i = F_K(m_i \oplus c_{i-1}) \]
with
\[ c_0 = IV. \]
The ciphertext is:
\[ (IV,c_1,c_2,\dots,c_t). \]
The random \(IV\) ensures that encrypting the same message twice gives unrelated ciphertexts.
The lecture states:
\[ \boxed{ \text{CBC is IND-CPA secure for uniformly random } IV \text{ if } F \text{ is a PRP.} } \]
2.18. Decryption in CBC Mode
To decrypt, start from the encryption equation:
\[ c_i = F_K(m_i \oplus c_{i-1}), \]
and apply \(F_K^{-1}\) to both sides:
\[ F_K^{-1}(c_i) = m_i \oplus c_{i-1}. \]
Then recover:
\[ m_i = F_K^{-1}(c_i) \oplus c_{i-1}. \]
So each block can be decrypted locally if \(c_i\) and \(c_{i-1}\) are known.
The lecture emphasizes:
- encryption must be sequential;
- decryption is local.
Encryption is sequential because to compute \(c_i\), one needs \(c_{i-1}\).
So \(c_2\) cannot be computed before \(c_1\), and \(c_3\) cannot be computed before \(c_2\).
This is inefficient for large data: even if the hardware supports many parallel cryptographic operations, CBC encryption cannot make use of them.
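A minimal sketch of CBC encryption and decryption may help here. Since decryption needs \(F_K^{-1}\), the sketch uses a toy invertible permutation: a 4-round Feistel network whose round function is truncated HMAC-SHA256. This construction, and the names `prp`, `prp_inv`, `cbc_encrypt`, and `cbc_decrypt`, are assumptions made for illustration; it is not AES and should not be used to protect real data. Messages are assumed to be already split into full 16-byte blocks.

```python
import hashlib
import hmac
import os

BLOCK = 16          # block length in bytes
HALF = BLOCK // 2   # Feistel half-block length
ROUNDS = 4


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key, r, half):
    # Toy round function: HMAC-SHA256 over (round index || half block).
    return hmac.new(key, bytes([r]) + half, hashlib.sha256).digest()[:HALF]


def prp(key, block):
    # Toy stand-in for F_K (forward direction of the Feistel network).
    left, right = block[:HALF], block[HALF:]
    for r in range(ROUNDS):
        left, right = right, xor(left, _round(key, r, right))
    return left + right


def prp_inv(key, block):
    # Toy stand-in for F_K^{-1}: undo the rounds in reverse order.
    left, right = block[:HALF], block[HALF:]
    for r in reversed(range(ROUNDS)):
        left, right = xor(right, _round(key, r, left)), left
    return left + right


def cbc_encrypt(key, blocks):
    iv = os.urandom(BLOCK)          # fresh uniformly random IV, c_0 = IV
    out, prev = [iv], iv
    for m in blocks:                # sequential: c_i needs c_{i-1}
        prev = prp(key, xor(m, prev))
        out.append(prev)
    return out                      # [IV, c_1, ..., c_t]


def cbc_decrypt(key, ct):
    # Local: m_i depends only on c_i and c_{i-1}.
    return [xor(prp_inv(key, ct[i]), ct[i - 1]) for i in range(1, len(ct))]
```

Note that `cbc_decrypt` is a list comprehension over independent terms, whereas `cbc_encrypt` has to carry `prev` from one iteration to the next. This mirrors the sequential-encryption, local-decryption distinction.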
2.19. Chained CBC Mode
The lecture next discusses a variant called Chained CBC Mode.
The idea is that encryption may be paused and resumed.
Suppose the system encrypts:
\[ m_1,m_2,m_3 \]
and produces:
\[ c_1,c_2,c_3. \]
Then later it continues with:
\[ m_4,m_5,\dots \]
using \(c_3\) as the chaining value for the next block.
So:
\[ c_4 = F_K(m_4 \oplus c_3). \]
This seems natural if one wants to encrypt data in a streaming or incremental way.
However, the lecture shows that this is not IND-CPA secure when the adversary can choose later plaintexts after seeing earlier ciphertexts.
2.20. Attack on Chained CBC Mode
Suppose the adversary sees:
\[ IV,c_1,c_2,c_3. \]
Then it can choose the next plaintext block \(m_4\) adaptively.
The transcript discusses two related ways to force a repeated ciphertext.
One way, assuming the adversary knows \(m_1\) (for example because it chose \(m_1\) itself), is to choose:
\[ m_4 = c_3 \oplus IV \oplus m_1. \]
Then:
\[ m_4 \oplus c_3 = (c_3 \oplus IV \oplus m_1)\oplus c_3 = IV \oplus m_1. \]
Therefore:
\[ c_4 = F_K(m_4 \oplus c_3) = F_K(IV \oplus m_1) = c_1. \]
So the adversary can force:
\[ c_4 = c_1. \]
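This should not happen in a secure encryption scheme: with a fresh random \(IV\), a repeated ciphertext block would occur only with negligible probability. The same trick also lets the adversary test a guess for an unknown \(m_1\), since \(c_4 = c_1\) holds exactly when the guess is correct.
The lecture's intuition is that the chaining value \(c_3\) plays the role of the \(IV\) for \(m_4\), but the adversary already knows \(c_3\) when it chooses \(m_4\). A predictable chaining value lets the adversary use earlier ciphertext values to create relationships between future ciphertexts.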
Thus:
\[ \boxed{ \text{Chained CBC Mode is not IND-CPA secure.} } \]
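The attack can be replayed in a few lines. In the sketch below, `F` is a hypothetical stand-in for the forward direction of \(F_K\) (truncated HMAC-SHA256; the attack never needs the inverse), `ChainedCBC` models a stateful encryption oracle, and `check_guess` is the adversary's query. All of these names are illustrative assumptions rather than anything defined in the lecture.

```python
import hashlib
import hmac
import os

BLOCK = 16  # block length in bytes


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def F(key, block):
    # Hypothetical stand-in for the forward direction of F_K.
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


class ChainedCBC:
    """Encryption oracle that keeps the last ciphertext block as chaining value."""

    def __init__(self, key):
        self.key = key
        self.iv = os.urandom(BLOCK)  # public: sent along with the ciphertext
        self.prev = self.iv          # c_0 = IV

    def encrypt_more(self, blocks):
        out = []
        for m in blocks:
            # c_i = F_K(m_i XOR c_{i-1}), continuing from the previous call
            self.prev = F(self.key, xor(m, self.prev))
            out.append(self.prev)
        return out


def check_guess(oracle, iv, c1, c_last, guess):
    # Adversary's query: m_next = c_last XOR IV XOR guess.
    # If guess == m_1, the input to F_K is IV XOR m_1, so the output equals c_1.
    m_next = xor(xor(c_last, iv), guess)
    (c_next,) = oracle.encrypt_more([m_next])
    return c_next == c1, c_next


oracle = ChainedCBC(os.urandom(16))
m1, m2, m3 = b"1" * BLOCK, b"2" * BLOCK, b"3" * BLOCK
c1, c2, c3 = oracle.encrypt_more([m1, m2, m3])
wrong, c4 = check_guess(oracle, oracle.iv, c1, c3, b"X" * BLOCK)
right, c5 = check_guess(oracle, oracle.iv, c1, c4, m1)
```

Here `wrong` comes out `False` and `right` comes out `True`. The adversary never touches the key; it only uses the public values \(IV\), \(c_1\), and the most recent ciphertext block.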
2.21. Output Feedback Mode
Output Feedback Mode, or OFB mode, uses the block cipher to generate a pseudorandom pad stream.
Choose a uniformly random initialization vector:
\[ IV \leftarrow \{0,1\}^{\lambda}. \]
Then define:
\[ s_1 = F_K(IV), \]
\[ s_2 = F_K(s_1) = F_K(F_K(IV)), \]
\[ s_3 = F_K(s_2) = F_K(F_K(F_K(IV))), \]
and so on.
Then encrypt by XORing these values with the message blocks:
\[ c_1 = s_1 \oplus m_1, \]
\[ c_2 = s_2 \oplus m_2, \]
\[ c_3 = s_3 \oplus m_3. \]
The ciphertext includes the \(IV\):
\[ (IV,c_1,c_2,\dots,c_t). \]
The professor points out that the first block is exactly like the PRF-based IND-CPA secure encryption scheme from the previous lecture:
\[ c_1 = F_K(IV) \oplus m_1. \]
If we rename \(IV\) as \(r\), this is the same recipe:
\[ (r,F_K(r)\oplus m). \]
OFB continues by feeding the output back into \(F_K\), thereby generating a stream of pads.
The lecture states:
\[ \boxed{ \text{OFB is IND-CPA secure for uniformly random } IV \text{ if } F \text{ is a PRF.} } \]
Since a PRP over a large domain is also a secure PRF (up to a negligible birthday-bound term), and OFB only ever evaluates \(F_K\) in the forward direction, a block cipher can be used here.
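The definitions above translate directly into code. The sketch below again uses truncated HMAC-SHA256 as a hypothetical stand-in for \(F_K\), which is enough because OFB never inverts \(F_K\). The function names are illustrative, and messages are assumed to be already split into full 16-byte blocks.

```python
import hashlib
import hmac
import os

BLOCK = 16  # block length in bytes


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def F(key, block):
    # Hypothetical stand-in for F_K (a PRF is enough for OFB).
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def ofb_pad(key, iv, t):
    # s_1 = F_K(IV), s_{i+1} = F_K(s_i); the message is not needed here.
    pads, s = [], iv
    for _ in range(t):
        s = F(key, s)
        pads.append(s)
    return pads


def ofb_encrypt(key, blocks):
    iv = os.urandom(BLOCK)
    pads = ofb_pad(key, iv, len(blocks))
    return [iv] + [xor(s, m) for s, m in zip(pads, blocks)]  # [IV, c_1, ..., c_t]


def ofb_decrypt(key, ct):
    iv, body = ct[0], ct[1:]
    pads = ofb_pad(key, iv, len(body))
    return [xor(s, c) for s, c in zip(pads, body)]
```

Because `ofb_pad` takes only the key, the \(IV\), and a length, the whole pad stream can be produced before any message block is available, and decryption reuses exactly the same pad computation as encryption.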
2.22. Efficiency of OFB Mode
In OFB mode, encryption and decryption are essentially the same operation: XOR the same pad stream with the input.
However, both encryption and decryption are sequential because:
\[ s_{i+1} = F_K(s_i). \]
So \(s_3\) cannot be computed before \(s_2\), and \(s_2\) cannot be computed before \(s_1\).
The advantage is that the pad stream can be precomputed before the message is known.
For example, one can compute:
\[ F_K(IV),\quad F_K(F_K(IV)),\quad F_K(F_K(F_K(IV))),\dots \]
in advance.
Then once the message arrives, encryption is just XOR.
So:
\[ \boxed{ \text{OFB is sequential, but its pad can be precomputed.} } \]
2.23. Counter Mode
Counter Mode, or CTR mode, is the final and recommended mode discussed in the lecture.
Instead of feeding the output of \(F_K\) back into itself, CTR mode evaluates \(F_K\) on successive counter values.
Choose a uniformly random counter:
\[ ctr \leftarrow \{0,1\}^{\lambda}. \]
The counter is included in the ciphertext.
For the first block, compute:
\[ s_1 = F_K(ctr+1). \]
Then:
\[ c_1 = s_1 \oplus m_1 = F_K(ctr+1)\oplus m_1. \]
For the second block:
\[ c_2 = F_K(ctr+2)\oplus m_2. \]
For the third block:
\[ c_3 = F_K(ctr+3)\oplus m_3. \]
In general:
\[ c_i = F_K(ctr+i)\oplus m_i. \]
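Here \(ctr+i\) means integer addition (taken modulo \(2^{\lambda}\), so that the result is again a \(\lambda\)-bit string), not XOR. The professor explicitly stresses this point.
If we repeatedly XORed the counter with \(1\), the value would flip back and forth between two values and therefore repeat. Integer addition lets us count forward.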
The ciphertext is:
\[ (ctr,c_1,c_2,\dots,c_t). \]
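A short sketch of CTR mode, under the same assumptions as before: `F` is a hypothetical PRF stand-in (truncated HMAC-SHA256), the counter is encoded as a 16-byte big-endian integer, and addition wraps modulo \(2^{128}\). The encoding and the wrap-around are choices made for this illustration; the lecture only specifies integer addition. The last lines contrast repeated XOR with \(1\) against repeated addition of \(1\).

```python
import hashlib
import hmac
import os

BLOCK = 16                 # block length in bytes
MOD = 1 << (8 * BLOCK)     # 2^lambda with lambda = 128


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def F(key, block):
    # Hypothetical stand-in for F_K (a PRF is enough for CTR).
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def ctr_pad(key, ctr, i):
    # s_i = F_K(ctr + i), with integer addition modulo 2^lambda (assumed encoding).
    x = (ctr + i) % MOD
    return F(key, x.to_bytes(BLOCK, "big"))


def ctr_encrypt(key, blocks):
    ctr = int.from_bytes(os.urandom(BLOCK), "big")  # uniformly random counter
    body = [xor(ctr_pad(key, ctr, i), m) for i, m in enumerate(blocks, start=1)]
    return ctr, body                                 # (ctr, [c_1, ..., c_t])


def ctr_decrypt(key, ctr, body):
    return [xor(ctr_pad(key, ctr, i), c) for i, c in enumerate(body, start=1)]


# Why addition and not XOR: repeated XOR with 1 only toggles the last bit.
a = b = 6
xor_walk, add_walk = [], []
for _ in range(4):
    a ^= 1
    b += 1
    xor_walk.append(a)
    add_walk.append(b)
print(xor_walk)  # [7, 6, 7, 6]
print(add_walk)  # [7, 8, 9, 10]
```

The XOR walk revisits the same two inputs, which would reuse the same two pads, while the addition walk produces a fresh input to \(F_K\) for every block.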
2.24. Security of CTR Mode
CTR mode follows the same high-level recipe as the PRF-based encryption scheme.
It uses values of the form:
\[ F_K(ctr+i) \]
as one-time pads for message blocks.
The proof idea is:
- Replace \(F_K\) by a truly random function \(H\).
- As long as the inputs \(ctr+i\) are never repeated across encryptions, the pads are independent random values.
- Therefore, they behave like one-time pads.
The only failure mode is overlap between counter windows.
For one encryption, if the message has \(t\) blocks, it uses the window:
\[ ctr+1,\ ctr+2,\ \dots,\ ctr+t. \]
For another encryption, using a fresh random counter \(ctr'\), it uses:
\[ ctr'+1,\ ctr'+2,\ \dots,\ ctr'+t'. \]
If these windows overlap, then some PRF output is reused.
If the same pad is reused, then XORing the corresponding ciphertext blocks reveals the XOR of the corresponding plaintext blocks:
\[ c_i \oplus c'_j = (F_K(x)\oplus m_i)\oplus(F_K(x)\oplus m'_j) = m_i \oplus m'_j. \]
So overlap is bad.
However, if counters are chosen uniformly from a huge space such as \(\{0,1\}^{128}\), and the number of encrypted blocks is polynomially bounded, then the probability of overlap is negligible.
The professor gives the intuition:
If a window has size about \(2^{30}\), and the counter space is about \(2^{128}\), then the probability that a random counter lands in that window is roughly:
\[ \frac{2^{30}}{2^{128}} = 2^{-98}. \]
This is tiny.
Even summing over polynomially many windows remains negligible.
Thus:
\[ \boxed{ \text{CTR mode is IND-CPA secure for uniformly random counters if } F \text{ is a PRF.} } \]
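The overlap estimate can be checked with exact arithmetic. The sketch below rests on a simplifying assumption that is not stated in the lecture: every encryption uses a window of exactly \(t\) blocks, and counters are uniform in \(\{0,1\}^{\lambda}\) with addition modulo \(2^{\lambda}\). Under that assumption two windows overlap exactly when the difference of their counters lies in \(\{-(t-1),\dots,t-1\}\), which gives \((2t-1)/2^{\lambda}\) per pair; a union bound over all pairs of \(q\) encryptions then bounds the total. The function names are illustrative.

```python
from fractions import Fraction


def pair_overlap_probability(t, lam):
    # Two windows of t consecutive values overlap iff the counter difference
    # is one of the 2t - 1 values in {-(t-1), ..., t-1} (requires 2t - 1 <= 2^lam).
    return Fraction(2 * t - 1, 2 ** lam)


def overlap_union_bound(q, t, lam):
    # Union bound over all unordered pairs among q encryptions.
    pairs = q * (q - 1) // 2
    return pairs * pair_overlap_probability(t, lam)


def brute_force_pair_overlap(t, lam):
    # Exhaustive check for tiny parameters: fix ctr = 0 and try every ctr'.
    n = 2 ** lam
    base = {i % n for i in range(1, t + 1)}
    hits = sum(
        1 for c in range(n)
        if base & {(c + i) % n for i in range(1, t + 1)}
    )
    return Fraction(hits, n)


print(brute_force_pair_overlap(5, 8) == pair_overlap_probability(5, 8))  # True
print(pair_overlap_probability(2 ** 30, 128) < Fraction(1, 2 ** 97))     # True
print(overlap_union_bound(2 ** 20, 2 ** 30, 128) < Fraction(1, 2 ** 58)) # True
```

For the lecture's numbers, the per-pair probability is just under \(2^{-97}\), slightly larger than the \(2^{-98}\) point-in-window estimate because two windows of length \(t\) can meet at \(2t-1\) relative offsets. Even with about a million encryptions of \(2^{30}\) blocks each, the union bound stays below \(2^{-58}\).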
2.25. Efficiency of CTR Mode
CTR mode has excellent efficiency.
Each block can be encrypted independently:
\[ c_i = F_K(ctr+i)\oplus m_i. \]
So if we want to compute \(c_i\), we only need:
- the key \(K\);
- the counter value \(ctr+i\);
- the message block \(m_i\).
We do not need previous ciphertext blocks or previous pad values.
Therefore:
\[ \boxed{ \text{Encryption and decryption in CTR mode are local and fully parallelizable.} } \]
The pad can also be precomputed.
Moreover, CTR mode only needs the forward direction of \(F_K\).
It never needs:
\[ F_K^{-1}. \]
So a PRF is enough; a full PRP inverse is not required.
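The locality claim can be illustrated as follows. The sketch processes each block with a function that takes only the key, the counter, the block index, and the block itself, and then applies it to all blocks through a thread pool from the standard library. `F` is again a hypothetical PRF stand-in (truncated HMAC-SHA256), and the 16-byte big-endian counter encoding is an assumption. The thread pool is there to show independence of the blocks, not to demonstrate a real speed-up in Python.

```python
import hashlib
import hmac
import os
from concurrent.futures import ThreadPoolExecutor

BLOCK = 16
MOD = 1 << (8 * BLOCK)


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def F(key, block):
    # Hypothetical stand-in for F_K; only the forward direction is ever used.
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def ctr_block(key, ctr, i, block):
    # Encrypts or decrypts block i alone: output = F_K(ctr + i) XOR block.
    x = (ctr + i) % MOD
    return xor(F(key, x.to_bytes(BLOCK, "big")), block)


def ctr_apply_parallel(key, ctr, blocks, workers=4):
    # Each task needs only (key, ctr, i, block); no task waits for another.
    def task(item):
        i, block = item
        return ctr_block(key, ctr, i, block)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(task, enumerate(blocks, start=1)))


key = os.urandom(16)
ctr = int.from_bytes(os.urandom(BLOCK), "big")
message = [bytes([i]) * BLOCK for i in range(8)]
body = ctr_apply_parallel(key, ctr, message)
fifth = ctr_block(key, ctr, 5, body[4])  # random access: decrypt block 5 only
```

Since the same XOR undoes itself, `ctr_apply_parallel` serves for both encryption and decryption, and `fifth` recovers the fifth message block without touching any other ciphertext block.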
2.26. Recommendation: Use CTR Whenever Possible
The professor’s practical takeaway is very clear:
\[ \boxed{ \text{Use CTR mode whenever possible.} } \]
Compared with the other modes:
- Electronic Codebook Mode leaks patterns and should not be used.
- CBC is secure with a uniformly random IV, but encryption is sequential.
- Chained CBC is not IND-CPA secure.
- OFB is secure and pad-precomputable, but sequential.
- CTR is secure, local, parallelizable, and pad-precomputable.
So from the modes discussed, CTR has the best combination of security and efficiency.
2.27. Special Note: Order-Preserving or Leakage-Allowing Scenarios
At the end, the professor briefly mentions that there may be special scenarios where one intentionally wants to leak some information.
For example, order-preserving encryption intentionally preserves some order information.
But this is not ordinary encryption in the strong confidentiality sense studied here.
For almost every standard use case, one should not use modes that leak equality patterns such as Electronic Codebook Mode.
2.28. Lecture 9 Summary
The main takeaways are:
- The PRF-to-IND-CPA proof from the previous lecture relies on replacing a PRF by a truly random function via a reduction.
- PRPs are pseudorandom permutations \(F_K : \{0,1\}^{\lambda} \to \{0,1\}^{\lambda}\).
- A PRP must have an efficiently computable inverse \(F_K^{-1}\).
- PRPs are compared against uniformly random permutations, not uniformly random functions.
- In practice, PRPs are called block ciphers.
- DES is based on Feistel schemes.
- AES is based on substitution-permutation networks.
- Block ciphers are primitives, not complete encryption schemes.
- Modes of operation build encryption schemes from block ciphers.
- Electronic Codebook Mode is insecure because equal plaintext blocks produce equal ciphertext blocks.
- CBC mode is IND-CPA secure with a uniformly random IV if \(F\) is a PRP, but encryption is sequential.
- Chained CBC mode is not IND-CPA secure because later plaintext blocks can be chosen adaptively based on earlier ciphertexts.
- OFB mode uses repeated applications of \(F_K\) to generate a pad stream; it is IND-CPA secure with a random IV if \(F\) is a PRF, but it is sequential.
- CTR mode evaluates \(F_K\) on the counter values \(ctr+1,\ ctr+2,\dots\) and uses the results as pads.
- CTR mode is IND-CPA secure as long as counter windows do not overlap, and such overlap has negligible probability when counters are chosen randomly from a large space.
- CTR mode is local, parallelizable, and precomputable.
Therefore:
\[ \boxed{\text{Use CTR whenever possible.}} \]