Application Layer 2: Email

1. Email

After finishing DNS, the lecture moved to email as another major application-layer service.

The lecturer described email as simple for end users, but very complex internally.

Email is older than many modern Internet services and has accumulated many extensions and security mechanisms over time.

1.1. Email as a communication system

Email is still widely used, especially for:

formal communication,
university communication,
work communication,
asynchronous communication.

The lecturer mentioned that messaging apps are often more common for casual digital communication now, but email remains important.

There are billions of email accounts, and hundreds of billions of emails are sent per day.

A large fraction of email traffic is spam.

1.2. Historical background

The first networked email is associated with Ray Tomlinson in 1971.

Earlier systems already had local message mechanisms:

users on a shared machine could append messages to a local file,
others could later read them.

Ray Tomlinson combined this message idea with network communication.

He also introduced the use of the @ sign in email addresses.

The basic structure is:

\[ user@host \]

or more generally:

\[ local\text{-}part@domain \]

1.3. Email is not simple

The lecturer emphasized that email is even messier than DNS.

Internet standards are specified in RFCs.

There are hundreds of RFCs related to email.

Even reading all of them would not necessarily make email easy to implement, because:

the standards are text written by humans,
many extensions exist,
many real deployments behave differently,
compatibility with old systems matters.

2. Email message format

2.1. RFC 822-style message structure

An email message has two main parts:

header,
body.

They are separated by a blank line.

The structure is:

Header lines
Header lines
Header lines

Body
Body
Body

Important standard header fields:

From:
To:
Subject:
Date:
Message-ID:
MIME-Version:
Content-Type:

The body is the actual message.

Historically, messages were ASCII-only, although later extensions such as MIME allow richer content and encodings.
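
A minimal sketch of the header/body split, using the `email` package from the Python standard library. The sample message and its addresses are made up for illustration.

```python
from email import message_from_string

# A tiny RFC 822-style message: header lines, one blank line, then the body.
raw = (
    "From: alice@example.com\n"
    "To: bob@example.net\n"
    "Subject: Hello\n"
    "\n"
    "First body line\n"
    "Second body line\n"
)

msg = message_from_string(raw)

# Header fields are looked up by name, case-insensitively.
print(msg["Subject"])      # Hello
# Everything after the first blank line is the body.
print(msg.get_payload())
```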

2.2. Mail headers vs. SMTP envelope

A very important distinction:

Email header fields are not the same as SMTP envelope commands.

In the email message, one may see:

From: alice@example.com
To: bob@example.net
Subject: Hello

But SMTP also has commands:

MAIL FROM:<alice@example.com>
RCPT TO:<bob@example.net>

These are not the same thing.

The lecturer compared this to a physical letter:

the email header is like the address written inside the letter;
the SMTP envelope is like the address written on the outside envelope.

The mail transport system uses the envelope.

The user usually sees the headers.

This distinction is crucial for understanding spoofing and spam.

A message can claim one thing in the From: header while the SMTP envelope says something else.
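
The difference can be made concrete with a small sketch. The helper `header_matches_envelope` and the address `bulk@mailer.example` are hypothetical; the envelope sender is passed in as a plain string because it is not stored in the message headers themselves.

```python
from email.message import EmailMessage
from email.utils import parseaddr

def header_matches_envelope(msg, envelope_from):
    """Return True if the From: header address equals the envelope sender."""
    _display_name, header_addr = parseaddr(str(msg["From"]))
    return header_addr.lower() == envelope_from.lower()

msg = EmailMessage()
msg["From"] = "Alice <alice@example.com>"
msg["To"] = "bob@example.net"
msg["Subject"] = "Hello"
msg.set_content("Hi Bob")

# The envelope sender is chosen separately by whoever speaks SMTP.
print(header_matches_envelope(msg, "alice@example.com"))    # True
print(header_matches_envelope(msg, "bulk@mailer.example"))  # False
```

A mismatch is not proof of spoofing by itself: forwarding and mailing lists also produce envelope senders that differ from the From: header.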

2.3. Example of real email headers

The lecture showed an example email that originally had simple visible content:

Subject: Test

test

But the raw email contained many headers.

The path was approximately:

from T.Fiebig@tudelft.nl,
to fwd@engelsystem.de,
forwarded to fwd@fiebig.nl,
forwarded to tfiebig@mpi-inf.mpg.de.

The visible message was tiny, but the full raw message contained many headers added by mail systems.

Examples of headers:

spam filter results,
DKIM signatures,
ARC signatures,
SPF and DMARC results,
Received: headers showing transfer path,
TLS information,
return path,
delivery information.

The lecturer read the headers from bottom to top because each mail server adds new headers at the top.

Thus the oldest transport information is near the bottom.
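
The bottom-to-top reading order can be sketched as follows. The host names and timestamps are invented; real Received: headers carry more detail.

```python
from email import message_from_string

raw = (
    "Received: from relay2.example by mx.final.example; Tue, 2 Jan 2024 10:00:05 +0000\n"
    "Received: from relay1.example by relay2.example; Tue, 2 Jan 2024 10:00:03 +0000\n"
    "Received: from client.example by relay1.example; Tue, 2 Jan 2024 10:00:01 +0000\n"
    "Subject: Test\n"
    "\n"
    "test\n"
)
msg = message_from_string(raw)

# get_all returns the headers in file order (newest hop first),
# so reversing the list gives the chronological transfer path.
hops = list(reversed(msg.get_all("Received", [])))
for hop in hops:
    print(hop)
```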

2.4. Spam-filter headers

Spam filters add diagnostic headers.

They may include:

spam score,
whether the message is considered spam,
Bayesian spam probability,
whether DKIM is valid,
whether SPF passed,
whether the sender domains differ,
neural or rule-based scores.

These headers are useful for debugging why a message was accepted or rejected.

2.5. Authentication-related headers

Modern email may contain:

DKIM signatures,
ARC seals,
ARC authentication results,
DMARC results,
SPF results.

These mechanisms are discussed later.

3. Email components

Email has three major components:

user agents,
mail servers,
SMTP.

In practice, DNS is also an essential fourth component, because mail delivery requires DNS lookups for MX records and other security-related records.

3.1. User agent

A user agent is the user’s mail program.

Examples:

Thunderbird,
Outlook,
Apple Mail,
webmail interfaces such as Gmail in the browser.

A user agent is used for:

composing email,
editing email,
reading email,
organizing email.

The lecturer also called this a “mail reader”.

3.2. Mail server

A mail server stores and transfers email.

It contains:

user mailboxes for incoming messages,
outgoing message queues for messages waiting to be delivered.

When a mail server receives an email for a local user, it places the message in that user’s mailbox.

When a user sends an email to another domain, the mail server places it in the outgoing queue and later sends it onward.

3.3. SMTP

SMTP stands for:

Simple Mail Transfer Protocol

SMTP is used:

from a user’s agent to the user’s outgoing mail server,
from the sending mail server to the receiving mail server.

SMTP is not normally used by the receiver to read mail.

Retrieving mail from the server uses other protocols such as POP3 or IMAP.

3.4. Sending example: Alice to Bob

Suppose Alice sends an email to:

\[ bob@someschool.edu \]

The process is:

Alice uses her user agent to compose the message.
Alice’s user agent sends the message to Alice’s mail server using SMTP.
Alice’s mail server places the message in its outgoing queue.
Alice’s mail server opens a TCP connection to Bob’s mail server.
Alice’s mail server sends the message using SMTP.
Bob’s mail server places the message in Bob’s mailbox.
Bob uses his user agent to read the message.

4. SMTP

4.1. Basic properties

SMTP uses TCP.

TCP provides reliable byte-stream transport.

SMTP default port:

TCP port 25 for server-to-server SMTP.

Other ports exist for user submission and encrypted variants:

TCP 587 is commonly used for mail submission with STARTTLS,
TCP 465 is commonly used for implicit TLS submission.

The lecture emphasized port 25 as the classic SMTP port.

SMTP transfer has three phases:

handshaking,
message transfer,
closure.

SMTP is a command-response protocol, similar in style to HTTP.

Commands are ASCII text.

Responses contain:

status code,
human-readable phrase.
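
A minimal sketch of how a client might split a reply line into these parts. The function names are hypothetical; the meaning of the first digit (2xx success, 3xx intermediate, 4xx temporary failure, 5xx permanent failure) follows the usual SMTP convention.

```python
def parse_reply(line):
    """Split one SMTP reply line into (code, is_last_line, text)."""
    code = int(line[:3])
    # A '-' directly after the code marks a continuation line
    # of a multi-line reply; a space marks the last line.
    is_last = line[3:4] != "-"
    text = line[4:].strip()
    return code, is_last, text

def reply_class(code):
    """Map the first digit of the status code to its general meaning."""
    return {
        2: "success",
        3: "intermediate (server expects more input)",
        4: "temporary failure",
        5: "permanent failure",
    }.get(code // 100, "unknown")

print(parse_reply("250 Hello crepes.fr, pleased to meet you"))
print(reply_class(354))
```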

4.2. SMTP handshake

A simplified SMTP session begins after the TCP connection is established.

The server first sends a greeting:

S: 220 hamburger.edu

Then the client identifies itself:

C: HELO crepes.fr

The command is historically spelled HELO with one L.

Modern SMTP usually supports the extended command:

EHLO crepes.fr

EHLO allows the server to advertise supported extensions.

The server answers:

S: 250 Hello crepes.fr, pleased to meet you

4.3. SMTP envelope commands

The client gives the sender envelope address:

C: MAIL FROM:<alice@crepes.fr>
S: 250 alice@crepes.fr... Sender ok

Then the client gives the recipient envelope address:

C: RCPT TO:<bob@hamburger.edu>
S: 250 bob@hamburger.edu ... Recipient ok

Then the client starts the message transfer (headers and body):

C: DATA
S: 354 Enter mail, end with "." on a line by itself

The client sends the email content.

The email data ends with a single dot on a line by itself:

C: .

More precisely, the end marker is:

\[ CRLF . CRLF \]

Then the server accepts the message:

S: 250 Message accepted for delivery

Finally:

C: QUIT
S: 221 hamburger.edu closing connection

4.4. Complete simple SMTP example

S: 220 hamburger.edu
C: HELO crepes.fr
S: 250 Hello crepes.fr, pleased to meet you
C: MAIL FROM:<alice@crepes.fr>
S: 250 alice@crepes.fr... Sender ok
C: RCPT TO:<bob@hamburger.edu>
S: 250 bob@hamburger.edu ... Recipient ok
C: DATA
S: 354 Enter mail, end with "." on a line by itself
C: From: alice@crepes.fr
C: To: bob@hamburger.edu
C: Subject: Question
C:
C: Do you like ketchup?
C: How about pickles?
C: .
S: 250 Message accepted for delivery
C: QUIT
S: 221 hamburger.edu closing connection

Important point:

The From:, To:, and Subject: lines inside DATA are part of the email message.

They are not SMTP envelope commands.
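
The client side of the session above can be sketched as a list of lines. This is only an illustration: the helper `smtp_client_lines` is hypothetical, a real client waits for the server reply after each command, and the sketch assumes that no line of the message is a lone dot.

```python
def smtp_client_lines(helo_name, envelope_from, envelope_to, message):
    """Return the lines a client would send for one simple SMTP transaction."""
    message_lines = message.splitlines()
    # Assumption: no message line is a lone "."; handling that case
    # is left out of this sketch.
    if "." in message_lines:
        raise ValueError("message contains a line with a single dot")
    return [
        f"HELO {helo_name}",
        f"MAIL FROM:<{envelope_from}>",  # envelope sender
        f"RCPT TO:<{envelope_to}>",      # envelope recipient
        "DATA",
        *message_lines,                  # headers, blank line, body
        ".",                             # end-of-data marker
        "QUIT",
    ]

message = (
    "From: alice@crepes.fr\n"
    "To: bob@hamburger.edu\n"
    "Subject: Question\n"
    "\n"
    "Do you like ketchup?\n"
    "How about pickles?\n"
)
lines = smtp_client_lines("crepes.fr", "alice@crepes.fr", "bob@hamburger.edu", message)

# On the wire, each line is terminated by CRLF.
wire = "".join(line + "\r\n" for line in lines)
print(wire)
```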

4.5. Trying SMTP manually

The slides mention that one can try SMTP manually with:

telnet <server-name> 25

However, in modern deployments:

authentication is often required,
TLS is often required,
port 25 may be blocked,
mail servers may reject unauthenticated or suspicious clients.

The lecturer demonstrated a secure connection using openssl because the available server required TLS.

A typical command for implicit TLS would be similar to:

openssl s_client -connect <server-name>:465

After the TLS handshake, the SMTP commands are still text-based.

The lecturer also demonstrated that some authentication mechanisms use Base64-encoded username and password data.

Base64 is not encryption; it is just encoding.
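
A short sketch of why Base64 offers no confidentiality. The credentials are made up, and encoding the user name and password separately is just one possible arrangement.

```python
import base64

username = "alice"
password = "correct horse"  # made-up credentials

encoded_user = base64.b64encode(username.encode("utf-8")).decode("ascii")
encoded_pass = base64.b64encode(password.encode("utf-8")).decode("ascii")
print(encoded_user)  # YWxpY2U=

# Anyone who sees the encoded strings can reverse them without any key.
print(base64.b64decode(encoded_user).decode("utf-8"))  # alice
print(base64.b64decode(encoded_pass).decode("utf-8"))  # correct horse
```

This is why such authentication exchanges should only happen inside a TLS-protected connection.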

5. DNS in email delivery

DNS is essential for email.

Suppose a user sends email from:

\[ userA@a.com \]

to:

\[ userB@b.com \]

The sending mail server needs to know where to deliver mail for b.com.

It therefore asks DNS for the MX record of b.com.

5.1. MX record

MX stands for:

Mail Exchange

An MX record says which mail server handles email for a domain.

Example:

b.com. IN MX 10 mail.b.com.

This says:

mail for b.com should be sent to mail.b.com,
the preference value (priority) is 10.

When a domain has multiple MX records, the preference value tells the sending mail server which receiving mail server to try first. A lower number means higher priority.

After learning mail.b.com, the sender also needs its IP address.

Therefore it also needs an A or AAAA record:

mail.b.com. IN A 192.168.178.1

or an IPv6 AAAA record.
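
A minimal sketch of how a sender could order MX hosts by preference and then attach addresses. No real DNS queries are made: the second MX host `backup.b.com.`, both IP addresses, and the `addresses` table standing in for A lookups are assumptions for illustration.

```python
def parse_mx(line):
    """Parse 'name IN MX preference host' into (preference, host)."""
    _name, _klass, rtype, preference, host = line.split()
    if rtype != "MX":
        raise ValueError("not an MX record: " + line)
    return int(preference), host

def delivery_order(mx_lines, addresses):
    """Return (host, ip) pairs in the order a sender would try them."""
    records = sorted(parse_mx(line) for line in mx_lines)  # lowest preference first
    return [(host, addresses[host]) for _pref, host in records]

mx_lines = [
    "b.com. IN MX 20 backup.b.com.",
    "b.com. IN MX 10 mail.b.com.",
]
# Stand-in for the A lookups; these addresses are made up.
addresses = {
    "mail.b.com.": "192.0.2.10",
    "backup.b.com.": "192.0.2.20",
}
print(delivery_order(mx_lines, addresses))
```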

5.2. Delivery steps with DNS and SMTP

The process is:

The mail user agent sends the email to mail.a.com.
mail.a.com asks its recursive DNS resolver for the MX record of b.com.
The recursive resolver asks the authoritative DNS server for b.com.
DNS returns something like:
```
b.com. IN MX 10 mail.b.com.
```
The sender resolves mail.b.com to an IP address.
mail.a.com opens a TCP connection to mail.b.com.
mail.a.com starts an SMTP session.
The message is transferred.
mail.b.com stores it in the recipient mailbox.

Thus email delivery depends on DNS.

The lecturer repeatedly emphasized that email security mechanisms add even more DNS dependencies.

6. SMTP problems

SMTP was originally very simple.

That simplicity causes problems in the modern Internet.

Main problems:

spam,
spoofing,
man-in-the-middle attacks,
forwarding complications,
operational complexity.

6.1. Spam

A large amount of email is spam.

Mail servers therefore perform spam filtering.

Spam filtering can be:

inbound, to protect local users,
outbound, to prevent local users or compromised accounts from sending spam.

If a mail server sends spam, other operators may block or distrust it.

Therefore responsible mail server operators also filter outbound mail.

6.2. Spoofing

SMTP by itself does not strongly authenticate that the sender is allowed to send for a given domain.

A sender may put one address in the visible From: header and another in the envelope.

This makes spoofing possible.

Several mechanisms try to reduce spoofing:

forward confirmation,
SPF,
DKIM,
DMARC.

7. Anti-spam and anti-spoofing mechanisms

7.1. Forward confirmation

Forward confirmation checks whether the sending mail server’s name and IP address plausibly match.

Suppose mail.a.com connects from IP address:

\[ 10.23.42.1 \]

The receiving server may check:

what IP address does mail.a.com resolve to?
what domain name does 10.23.42.1 reverse-resolve to?

This involves:

A or AAAA lookup,
PTR lookup.

The idea is that if both directions are consistent, the sender is more likely to be legitimate.

However, this is not a complete security solution:

DNS can be spoofed if not authenticated,
it creates additional DNS queries,
it is operationally annoying,
it does not solve all spoofing problems.
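
The check can be sketched as below. The default lookups use the standard `socket` module, but the demonstration passes in small fake lookup tables (an assumption for illustration) so that it runs without network access. This version starts from the connecting IP address: PTR lookup first, then A/AAAA lookup on the resulting name.

```python
import socket

def reverse_lookup(ip):
    """PTR lookup: IP address -> host name."""
    return socket.gethostbyaddr(ip)[0]

def forward_lookup(name):
    """A/AAAA lookup: host name -> set of IP addresses."""
    return {info[4][0] for info in socket.getaddrinfo(name, None)}

def forward_confirmed(ip, reverse=reverse_lookup, forward=forward_lookup):
    """True if the PTR name of ip resolves back to the same ip."""
    try:
        name = reverse(ip)
        return ip in forward(name)
    except OSError:
        # Lookup failures count as "not confirmed" in this sketch.
        return False

# Fake DNS data, so the example does not depend on real lookups.
fake_ptr = {"10.23.42.1": "mail.a.com"}
fake_a = {"mail.a.com": {"10.23.42.1"}}
print(forward_confirmed("10.23.42.1", fake_ptr.__getitem__, fake_a.__getitem__))
```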

7.2. Greylisting

Greylisting relies on the assumption that legitimate mail servers retry delivery.

If a receiving mail server sees a new sender for the first time, it can temporarily reject the message with a temporary error.

A legitimate mail server will retry later.

A simple spammer often will not retry because it sends huge volumes and does not keep state.

Thus:

first attempt: temporary rejection,
later retry: accept if the sender returns.

Example behavior:

receiving server returns a 4xx temporary error,
legitimate sender queues and retries,
spammer may give up.

This works surprisingly well against simple spam senders.

But it is easy to bypass if the attacker retries.
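
A toy model of this behavior. Keying on the triplet (client IP, envelope sender, envelope recipient), the 300-second delay, and the reply strings are assumptions of this sketch, not details from the lecture. The current time is passed in explicitly to keep the example deterministic.

```python
class Greylist:
    """Toy greylist keyed on (client IP, envelope sender, envelope recipient)."""

    def __init__(self, min_delay_seconds=300):
        self.min_delay = min_delay_seconds
        self.first_seen = {}

    def check(self, client_ip, mail_from, rcpt_to, now):
        """Return the SMTP reply for a delivery attempt at time `now` (seconds)."""
        key = (client_ip, mail_from, rcpt_to)
        # Remember when this triplet was first seen.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "250 OK"
        return "451 Try again later"

greylist = Greylist(min_delay_seconds=300)
triplet = ("192.0.2.1", "alice@a.com", "bob@b.com")
print(greylist.check(*triplet, now=0))    # 451 Try again later
print(greylist.check(*triplet, now=600))  # 250 OK
```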

7.3. SPF: Sender Policy Framework

SPF allows a domain to publish which servers are allowed to send mail for that domain.

The policy is published in DNS using a TXT record.

Example:

a.com. IN TXT "v=spf1 mx -all"

Meaning:

v=spf1 declares SPF version,
mx means the domain’s MX servers are allowed to send,
-all means all other senders should be denied.

SPF mechanisms include:

mx: allow the zone’s MX servers,
a:<name>: allow IPs that a name resolves to,
ip4:<addr>: allow an IPv4 address or prefix,
ip6:<addr>: allow an IPv6 address or prefix,
include:<zone>: include another zone’s SPF policy,
?all, ~all, +all, -all: different policies for all remaining senders.

SPF adds more DNS queries.

For example, to evaluate:

v=spf1 mx -all

the receiver may need:

TXT record for a.com,
MX records for a.com,
A/AAAA records for the MX hosts.

SPF authenticates the envelope sender domain (MAIL FROM), not the visible From: header.

DMARC later connects SPF/DKIM alignment to the visible From: domain.
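
A heavily simplified sketch of SPF evaluation. It only understands the ip4:, ip6: and all terms; mx, a: and include: need DNS lookups and are skipped here, so this is not a complete evaluator. The function name and the example addresses are assumptions.

```python
import ipaddress

QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}

def evaluate_spf(record, client_ip):
    """Evaluate a simplified SPF record against the connecting IP address."""
    terms = record.split()
    if not terms or terms[0] != "v=spf1":
        return "none"
    ip = ipaddress.ip_address(client_ip)
    for term in terms[1:]:
        qualifier = "+"  # default qualifier when none is written
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        if term == "all":
            return QUALIFIERS[qualifier]
        if term.startswith(("ip4:", "ip6:")):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip in network:
                return QUALIFIERS[qualifier]
        # mx, a, include and other terms need DNS lookups; skipped in this sketch.
    return "neutral"

record = "v=spf1 ip4:192.0.2.0/24 -all"
print(evaluate_spf(record, "192.0.2.7"))     # pass
print(evaluate_spf(record, "198.51.100.1"))  # fail
```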

7.4. DKIM: DomainKeys Identified Mail

DKIM stands for:

DomainKeys Identified Mail

DKIM is based on public-key cryptography.

The sending mail server signs parts of the email with a private key.

The public key is published in DNS.

The receiving mail server:

reads the DKIM signature header,
learns which domain and selector to use,
retrieves the public key from DNS,
verifies the signature.

A DKIM signature is placed in a mail header such as:

DKIM-Signature: v=1; a=rsa-sha256; d=a.com; s=default; h=from:to:subject; ...

Important fields:

d= domain,
s= selector,
h= signed headers,
b= signature,
bh= body hash.

The public key is often stored under a DNS name like:

default._domainkey.a.com. IN TXT "v=DKIM1; k=rsa; p=..."

The lecturer emphasized:

the From: header must be signed,
other important headers should also be signed,
the DKIM-Signature: header itself is included in a special way during signing.

Headers that should commonly be signed include:

From:,
Subject:,
Date:,
Message-ID:,
To:,
Cc:,
MIME-Version:,
content-related headers.

7.4.1. Why signing Date matters

The lecture asked what happens if Date: is not signed.

If the date is not protected by DKIM, an attacker may replay the same signed email but modify the date.

This is a replay-style problem.

A signature only protects what it actually covers.

Therefore all security-relevant headers should be signed.
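
A small sketch that reads the tags of a DKIM-Signature header, derives the DNS name of the public key, and reports which headers are not covered by h=. It performs no cryptographic verification. The list of required headers is a local choice for this example, not a rule taken from the standard.

```python
def parse_dkim_signature(value):
    """Parse the 'tag=value; tag=value' list of a DKIM-Signature header."""
    tags = {}
    for part in value.split(";"):
        if "=" in part:
            name, _, val = part.partition("=")
            tags[name.strip()] = val.strip()
    return tags

def key_dns_name(tags):
    """DNS name where the public key for this signature is published."""
    return f"{tags['s']}._domainkey.{tags['d']}."

def unsigned_headers(tags, required=("from", "date", "subject")):
    """Return the required header names that the h= tag does not cover."""
    signed = {h.strip().lower() for h in tags.get("h", "").split(":")}
    return [h for h in required if h not in signed]

value = "v=1; a=rsa-sha256; d=a.com; s=default; h=from:to:subject"
tags = parse_dkim_signature(value)
print(key_dns_name(tags))      # default._domainkey.a.com.
print(unsigned_headers(tags))  # ['date']
```

For this signature, Date: is not covered, so a modified date would not invalidate it.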

7.5. DMARC

DMARC stands for:

Domain-based Message Authentication, Reporting and Conformance

DMARC builds on SPF and DKIM.

It allows a domain to publish a policy saying what receivers should do if SPF/DKIM validation fails or does not align with the visible From: header.

DMARC is published in DNS as a TXT record.

Example:

_dmarc.a.com. IN TXT "v=DMARC1; p=reject; rua=mailto:postmaster@a.com"

Meaning:

v=DMARC1 declares DMARC,
p=reject says failing mail should be rejected,
rua=mailto:postmaster@a.com says aggregate reports should be sent there.

DMARC provides:

policy,
reporting,
validation rules,
alignment between SPF/DKIM and the visible From: domain.

Example:

If a mail claims:

From: userA@a.com

but neither SPF nor DKIM produces a pass that aligns with a.com, and DMARC says p=reject, the receiver should reject the message and may report this to the domain owner.
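
A minimal sketch of the DMARC decision. It assumes strict alignment (exact domain match); real receivers usually also support relaxed alignment, which compares organizational domains and is left out here. The function names and the forwarder domain are hypothetical.

```python
def parse_dmarc(record):
    """Parse 'v=DMARC1; p=reject; ...' into a dict of tags."""
    tags = {}
    for part in record.split(";"):
        name, sep, value = part.partition("=")
        if sep:
            tags[name.strip()] = value.strip()
    return tags

def dmarc_disposition(record, from_domain, spf_result, spf_domain,
                      dkim_result, dkim_domain):
    """Return 'pass' or the published policy action (none/quarantine/reject)."""
    # Simplification: strict alignment, i.e. exact domain match.
    spf_ok = spf_result == "pass" and spf_domain.lower() == from_domain.lower()
    dkim_ok = dkim_result == "pass" and dkim_domain.lower() == from_domain.lower()
    # One aligned pass (SPF or DKIM) is enough for DMARC to pass.
    if spf_ok or dkim_ok:
        return "pass"
    return parse_dmarc(record).get("p", "none")

record = "v=DMARC1; p=reject; rua=mailto:postmaster@a.com"
# Both checks fail: the published policy applies.
print(dmarc_disposition(record, "a.com", "fail", "a.com", "fail", "a.com"))
# SPF passes only for a forwarder domain, but DKIM passes and aligns.
print(dmarc_disposition(record, "a.com", "pass", "forwarder.example", "pass", "a.com"))
```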

7.6. DMARC reporting

DMARC can produce reports showing:

which sources sent mail claiming to be from a domain,
whether SPF passed,
whether DKIM passed,
whether alignment passed,
what actions were taken.

These reports are useful for operators, but they add operational burden.

The lecturer noted that these mechanisms often break or are disabled because they require careful coordination between:

mail server configuration,
DNS records,
key rotation,
forwarding behavior,
cache timing.

7.7. Forwarding breaks SPF/DKIM/DMARC

Forwarding is common.

Example:

a university address forwards to a private mailbox,
a mailing list forwards to members,
a mail server forwards abuse reports.

Forwarding can break SPF because the final receiver sees the forwarder’s IP address, not the original sender’s authorized mail server.

Forwarding can break DKIM if the forwarder modifies signed headers or the body.

Therefore strict DMARC policies can cause legitimate forwarded mail to fail.

The lecture introduced two mechanisms that try to handle this:

SRS,
ARC.

7.8. SRS: Sender Rewriting Scheme

SRS changes the envelope sender when forwarding mail.

The goal:

preserve the original visible From: header,
make SPF checks succeed for the forwarding server,
still allow bounces to be routed correctly.

Example transformation:

Original sender:

T.Fiebig@tudelft.nl

Forwarded envelope sender:

SRS0=oSy4=VY=tudelft.nl=T.Fiebig@engelsystem.de

Meaning:

SRS0: first SRS rewrite,
oSy4: hash, generated with a secret key by the forwarding mail server,
VY: timestamp, used to expire old rewritten addresses,
=tudelft.nl=T.Fiebig: original sender domain and local part are encoded,
@engelsystem.de: final domain is the forwarding domain.

The hash prevents the forwarding server from becoming an open relay.

The forwarding mail server must be able to map later bounces sent to the rewritten SRS address back to the original sender.
Although the SRS address encodes the original sender, the forwarder must still verify that it generated the address itself and that the address has not expired.
This can be done statelessly with a keyed hash and timestamp, or statefully by keeping temporary mapping information for a few days.

If forwarding happens multiple times, SRS can add additional layers such as SRS1.
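
A stateless rewrite can be sketched as follows. This illustrates the idea only: the tag computation (HMAC-SHA256 truncated to 8 hex characters) is an assumption and is not compatible with real SRS implementations, the timestamp is passed in as an opaque string, and the expiry check is omitted.

```python
import hashlib
import hmac

def _tag(secret, timestamp, domain, local):
    """Keyed hash over the fields that must not be forged."""
    data = f"{timestamp}={domain}={local}".lower().encode("utf-8")
    return hmac.new(secret, data, hashlib.sha256).hexdigest()[:8]

def srs_forward(sender, forwarder_domain, secret, timestamp):
    """Rewrite an envelope sender into an SRS0-style address."""
    local, domain = sender.rsplit("@", 1)
    tag = _tag(secret, timestamp, domain, local)
    return f"SRS0={tag}={timestamp}={domain}={local}@{forwarder_domain}"

def srs_reverse(address, secret):
    """Recover the original sender, or raise ValueError if the tag is wrong."""
    local_part, _forwarder = address.rsplit("@", 1)
    prefix, tag, timestamp, domain, local = local_part.split("=", 4)
    if prefix != "SRS0":
        raise ValueError("not an SRS0 address")
    if not hmac.compare_digest(tag, _tag(secret, timestamp, domain, local)):
        raise ValueError("hash does not match")
    return f"{local}@{domain}"

secret = b"forwarder-secret"  # made-up key
rewritten = srs_forward("T.Fiebig@tudelft.nl", "engelsystem.de", secret, "VY")
print(rewritten)
print(srs_reverse(rewritten, secret))  # T.Fiebig@tudelft.nl
```

Because the tag depends on the secret, an outsider cannot construct a valid rewritten address for an arbitrary target.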

7.9. ARC: Authenticated Received Chain

ARC can be understood as “DKIM with extra steps.”

A forwarding mail server records the authentication results it observed when receiving the mail and signs them.

It adds headers such as:

ARC-Seal,
ARC-Message-Signature,
ARC-Authentication-Results.

The idea:

the forwarder says: “When I received this message, SPF/DKIM/DMARC had these results”,
then the forwarder signs that statement,
later receivers can evaluate the chain.

ARC is useful when forwarding would otherwise break DKIM/SPF/DMARC.

However, it requires forwarders to perform cryptographic signing and maintain correct configuration.

8. TLS and email transport security

8.1. Plain SMTP and STARTTLS

Email is old.

Originally, SMTP had no transport encryption.

Modern SMTP often uses TLS, but the transition is messy.

A common pattern is STARTTLS:

client connects in plaintext,
server advertises STARTTLS,
client sends STARTTLS command,
TLS handshake begins,
SMTP continues inside the encrypted channel.

Example:

< 220 mail.b.com ESMTP Postfix
> EHLO mail.a.com
< 250-mail.b.com
< 250-STARTTLS
< 250 CHUNKING
> STARTTLS
< 220 2.0.0 Ready to start TLS
> EHLO mail.a.com
...

After STARTTLS, the client sends EHLO again because the capabilities may differ inside TLS.

8.2. Opportunistic TLS problem

SMTP often uses opportunistic TLS.

This means:

use TLS if available,
but still deliver mail without TLS if not available.

This creates a downgrade problem.

Because the first phase is plaintext, a man-in-the-middle can remove the server’s STARTTLS advertisement.

Then the client may think TLS is unavailable and continue in plaintext.

This is why additional mechanisms are needed.
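
The downgrade can be sketched by comparing the EHLO reply the server sent with the one the client sees after tampering. The function names and the `require_tls` switch are assumptions for illustration.

```python
def advertised_extensions(ehlo_reply_lines):
    """Extract extension keywords from the lines of an EHLO reply."""
    # The first line is the greeting; the following lines name extensions.
    return {
        line[4:].split()[0].upper()
        for line in ehlo_reply_lines[1:]
        if line[4:].strip()
    }

def choose_transport(extensions, require_tls):
    """Decide how the client continues after EHLO."""
    if "STARTTLS" in extensions:
        return "starttls"
    if require_tls:
        return "abort"
    return "plaintext"  # opportunistic TLS falls back silently

genuine = ["250-mail.b.com", "250-STARTTLS", "250 CHUNKING"]
# What the client sees if an on-path attacker deletes the STARTTLS line.
tampered = ["250-mail.b.com", "250 CHUNKING"]

print(choose_transport(advertised_extensions(genuine), require_tls=False))   # starttls
print(choose_transport(advertised_extensions(tampered), require_tls=False))  # plaintext
print(choose_transport(advertised_extensions(tampered), require_tls=True))   # abort
```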

8.3. DANE and TLSA records

DANE stands for:

DNS-based Authentication of Named Entities

DANE uses DNS to publish information about the valid TLS certificate.

For SMTP, this is done with TLSA records.

Example:

_25._tcp.mail.b.com. IN TLSA 3 0 1 D2ABDE240D7CD3EE...

Meaning:

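_25._tcp: the service is SMTP on TCP port 25,
3: certificate usage; the record matches the server’s own certificate, without PKIX path validation,
0: selector; the entire certificate is used for matching,
1: matching type; SHA-256 hash of the selected data,
D2ABDE240D7CD3EE...: the certificate association data, here the hash value.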

The lecturer explained that SMTP certificate validation is messy because:

certificates often do not match cleanly,
SNI support historically came late to SMTP,
mail server operators do not always manage certificates correctly.

DANE gives a DNS-based way to say which certificate is valid.

However, if TLSA records are trusted, DNS itself must be authenticated.

Therefore DNSSEC is required for DANE to be secure.

Without DNSSEC, an attacker could tamper with the TLSA record.
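
For a "3 0 1" record, the comparison itself is simple and can be sketched as below. The byte string used as a certificate is a placeholder, not a real DER-encoded certificate, and the sketch covers neither the DNSSEC validation nor the TLS handshake that provides the certificate.

```python
import hashlib

def tlsa_owner_name(host, port=25, protocol="tcp"):
    """DNS name under which the TLSA record for a service is published."""
    return f"_{port}._{protocol}.{host.rstrip('.')}."

def matches_tlsa_3_0_1(cert_der, association_hex):
    """Check a DER-encoded certificate against '3 0 1' association data."""
    # Selector 0: the whole certificate; matching type 1: SHA-256.
    digest = hashlib.sha256(cert_der).hexdigest()
    return digest == association_hex.strip().lower()

cert_der = b"placeholder bytes, not a real certificate"
published = hashlib.sha256(cert_der).hexdigest().upper()

print(tlsa_owner_name("mail.b.com"))  # _25._tcp.mail.b.com.
print(matches_tlsa_3_0_1(cert_der, published))            # True
print(matches_tlsa_3_0_1(b"another certificate", published))  # False
```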

8.4. MTA-STS

MTA-STS stands for:

SMTP Mail Transfer Agent Strict Transport Security

It is similar in spirit to HSTS for the web.

It lets a domain publish that mail to its MX hosts should use TLS.

The mechanism involves DNS and HTTPS.

Example DNS records:

_mta-sts.b.com. IN TXT "v=STSv1; id=2022041601"
_smtp._tls.b.com. IN TXT "v=TLSRPTv1; rua=mailto:postmaster@b.com"

The first record tells sending mail servers that b.com supports MTA-STS.

_mta-sts.b.com. is the special DNS name used for MTA-STS.
v=STSv1 means this is an MTA-STS version 1 record.
id=2022041601 is a policy version identifier.

The full MTA-STS policy is not stored directly in this DNS record. Instead, the sender fetches it via HTTPS:

https://mta-sts.b.com/.well-known/mta-sts.txt

That policy can say, for example, that mail to b.com must be delivered using TLS.

The second record tells sending mail servers where to report SMTP TLS delivery problems.

_smtp._tls.b.com. is the special DNS name for SMTP TLS reporting.
v=TLSRPTv1 means this is a TLS Reporting version 1 record.
rua=mailto:postmaster@b.com means aggregate TLS reports should be sent to postmaster@b.com.

These reports may describe problems such as TLS handshake failures, certificate errors, or violations of the domain’s MTA-STS policy.

Example policy, as served at https://mta-sts.b.com/.well-known/mta-sts.txt:

version: STSv1
mode: enforce
mx: mail.b.com
max_age: 86400

Meaning:

policy version is STSv1,
enforce TLS,
valid MX host is mail.b.com,
policy can be cached for 86400 seconds.

MTA-STS helps detect downgrade attacks.

If the policy says TLS must be used, and the SMTP server does not advertise STARTTLS, the sender can conclude that something is wrong and stop delivery.
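
A minimal sketch of how a sender might apply such a policy. It ignores wildcard mx patterns, policy caching via max_age, and certificate validation, all of which a real implementation must handle. The function names are hypothetical.

```python
def parse_mta_sts_policy(text):
    """Parse the 'key: value' lines of an MTA-STS policy file."""
    policy = {"mx": []}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "mx":
            policy["mx"].append(value)  # mx may appear several times
        else:
            policy[key] = value
    return policy

def may_deliver(policy, mx_host, starttls_offered):
    """Decide whether delivery to mx_host is acceptable under the policy."""
    if policy.get("mode") != "enforce":
        return True
    return starttls_offered and mx_host.rstrip(".") in policy["mx"]

policy_text = (
    "version: STSv1\n"
    "mode: enforce\n"
    "mx: mail.b.com\n"
    "max_age: 86400\n"
)
policy = parse_mta_sts_policy(policy_text)
print(may_deliver(policy, "mail.b.com.", starttls_offered=True))   # True
print(may_deliver(policy, "mail.b.com.", starttls_offered=False))  # False
```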

8.5. TLS reporting

The _smtp._tls TXT record can specify where TLS reports should be sent.

Example:

_smtp._tls.b.com. IN TXT "v=TLSRPTv1; rua=mailto:postmaster@b.com"

This allows daily reporting digests about TLS delivery problems.

8.6. Summary of SMTP security extensions

The lecturer emphasized that all of these mechanisms solve partial problems, but also add complexity.

Mechanisms include:

spam filters,
forward confirmation,
greylisting,
SPF,
DKIM,
DMARC,
SRS,
ARC,
STARTTLS,
DANE/TLSA,
DNSSEC,
MTA-STS,
TLS reporting.

A recurring theme:

Email security keeps adding more DNS records and more operational complexity.

This makes modern email difficult to implement correctly and completely.

9. Mail access protocols

SMTP delivers mail to the receiver’s mail server.

But SMTP is not the usual protocol used by the receiver’s user agent to read mail.

For retrieval, mail access protocols are used.

Main options:

POP3,
IMAP,
HTTP/webmail.

9.1. POP3

POP3 stands for:

Post Office Protocol version 3

POP3 provides:

authorization,
download of messages.

POP3 has a download-and-delete mode:

the client downloads messages,
the server deletes them,
if the client device dies, the messages may be gone.

The lecturer warned that this mode is annoying and should not be configured casually.

POP3 also has a download-and-keep mode:

the client downloads messages,
the server keeps copies.

However, POP3 is largely stateless across sessions and is less convenient for using multiple devices.

9.2. IMAP

IMAP stands for:

Internet Message Access Protocol

IMAP keeps messages on the server.

It supports:

server-side folders,
manipulating messages on the server,
keeping user state across sessions,
mapping message IDs to folders,
better multi-device use.

The lecturer said that most people today effectively use IMAP, especially when using a mail client across devices.

9.3. HTTP / webmail

Many providers also offer webmail over HTTP/HTTPS.

Examples:

Gmail,
Hotmail / Outlook.com,
Yahoo Mail.

In this case, the user accesses mail through a browser.

Internally the provider may still use other protocols, but the user-facing access is via HTTP.

If using a desktop client such as Thunderbird to access Gmail, the client still usually uses IMAP or POP3.