Input and Output

1. Big Picture: What Is I/O?

1.1. Definition

Input / Output, usually abbreviated as I/O, is the part of the operating system concerned with how the OS communicates with physical devices, and how devices communicate back to the OS.

Examples of I/O devices include:

  • mouse
  • keyboard
  • wireless controller
  • microphone
  • monitor / screen
  • speakers
  • printer
  • scanner
  • USB flash drive
  • SSD / hard disk
  • camera / webcam
  • network interfaces:
    • Ethernet
    • Wi-Fi
    • cellular modem
    • Starlink or similar devices
  • removable storage:
    • CD
    • DVD
    • Blu-ray
    • floppy disk

The key problem is that there are many kinds of devices, many brands, and many hardware-specific protocols.

Applications should not need to understand every possible device and its low-level protocol.

Therefore, the OS hides device-specific details behind a small number of consistent abstractions.

1.2. Main OS Goal in I/O

The OS provides a stable interface to applications.

Instead of forcing every application to know the details of every device, the OS groups devices into a few classes and provides common interfaces for each class.

The device-specific details are handled by drivers inside the kernel.

The main idea is:

Application
    |
    v
Operating System abstraction
    |
    v
Device class subsystem
    |
    v
Device driver
    |
    v
Hardware device

2. Device Classes and Drivers

2.1. Character Devices

Character devices expose a byte stream.

Properties:

  • sequential access
  • no fixed block size
  • usually one byte or character at a time
  • no random access

Examples:

  • keyboard
  • mouse
  • serial port
  • terminal / TTY

Historically, TTY means “teletypewriter”. The name is old, but the abstraction still appears in modern operating systems.

A useful mental model is a tape-like stream: data arrives or is consumed in order.

2.2. Block Devices

Block devices expose fixed-size blocks.

Properties:

  • random access
  • fixed-size units
  • suitable for storage

Examples:

  • hard disk
  • SSD
  • optical disk

The internal mechanism can be very different, for example magnetic disk versus SSD, but the OS exposes a similar block abstraction.

2.3. Network Devices

Network devices send and receive packets or frames.

Properties:

  • asynchronous
  • packet-based rather than byte-stream or block-based
  • often event-driven

Examples:

  • Ethernet
  • Wi-Fi
  • cellular modem

2.4. Drivers

A driver is device-specific kernel code.

Its job is to:

  • program the device’s hardware registers
  • handle interrupts from the device
  • translate between the OS class abstraction and the specific hardware protocol

For example, the OS may have a general network subsystem, but a particular Wi-Fi card still needs a driver that knows how to operate that card.

2.5. Class Subsystems

A class subsystem is the common OS interface for all devices of a particular class.

Examples:

  • character device subsystem
  • block device subsystem
  • network subsystem

The class subsystem provides a shared abstraction.

The driver bridges the gap between that abstraction and a specific device.
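
The split between class subsystem and driver can be sketched as a table of function pointers. This is a minimal illustration, not how any particular kernel defines it: `block_ops`, `block_device`, `block_read`, `block_write`, and the RAM-backed `ramdisk` "driver" are all hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE     512
#define RAMDISK_BLOCKS 8

/* Hypothetical class interface: every block driver fills in this table. */
struct block_ops {
    int (*read_block)(void *dev, uint64_t block_no, uint8_t *out);
    int (*write_block)(void *dev, uint64_t block_no, const uint8_t *in);
};

/* Hypothetical device: a RAM-backed disk standing in for real hardware. */
struct ramdisk {
    uint8_t data[RAMDISK_BLOCKS][BLOCK_SIZE];
};

/* Driver code: knows how this specific "device" stores its blocks. */
static int ramdisk_read(void *dev, uint64_t block_no, uint8_t *out) {
    struct ramdisk *rd = dev;
    if (block_no >= RAMDISK_BLOCKS)
        return -1;
    memcpy(out, rd->data[block_no], BLOCK_SIZE);
    return 0;
}

static int ramdisk_write(void *dev, uint64_t block_no, const uint8_t *in) {
    struct ramdisk *rd = dev;
    if (block_no >= RAMDISK_BLOCKS)
        return -1;
    memcpy(rd->data[block_no], in, BLOCK_SIZE);
    return 0;
}

static const struct block_ops ramdisk_ops = {
    .read_block  = ramdisk_read,
    .write_block = ramdisk_write,
};

/* What the class subsystem hands to the rest of the kernel. */
struct block_device {
    const struct block_ops *ops;  /* driver entry points */
    void *dev;                    /* driver-private state */
};

/* Class-level entry points: callers never see which driver is behind them. */
static int block_read(struct block_device *bd, uint64_t block_no, uint8_t *out) {
    assert(bd != NULL && bd->ops != NULL);
    return bd->ops->read_block(bd->dev, block_no, out);
}

static int block_write(struct block_device *bd, uint64_t block_no, const uint8_t *in) {
    assert(bd != NULL && bd->ops != NULL);
    return bd->ops->write_block(bd->dev, block_no, in);
}
```

Supporting a different disk would mean supplying another `block_ops` table; code that calls `block_read` and `block_write` would stay unchanged.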

2.6. USB and PCIe Are Not Device Classes

USB and PCI Express are not character/block/network classes.

They are I/O interconnects or transports.

That means they are ways of connecting devices to the system.

A USB device can be:

  • a keyboard, i.e. character-like input
  • a storage device, i.e. block device
  • a network adapter, i.e. network device

Similarly, PCIe can carry different kinds of devices, such as GPUs, network cards, storage controllers, and so on.

3. I/O Architecture Overview

3.1. Overall Structure

A simplified I/O architecture looks like this:

Applications
    |
    v
Operating System
    |
    v
Device Class Subsystems
    |
    v
Drivers
    |
    v
I/O Bus
    |
    +--> Hardware Devices
    +--> Main Memory
    +--> CPU

The I/O bus connects the CPU, memory, and devices.

Examples of I/O buses or interconnects:

  • PCI
  • PCI Express
  • ISA
  • USB

ISA is an older IBM PC bus and is mostly historical.

3.2. Function of the I/O Bus

The I/O bus has two main functions:

3.2.1. Communication

The bus carries:

  • commands from the CPU to devices
  • data between devices and memory/CPU
  • interrupts from devices to the CPU

For example, the CPU can command a disk to read a sector, and the device can later notify the CPU that the operation is complete.

3.2.2. Enumeration

Enumeration means discovering what devices are present.

During boot, or when a hot-pluggable device is inserted, the bus helps the OS discover:

  • which device exists
  • what model it is
  • what capabilities it has
  • which driver should be matched to it

For hot-plug buses such as USB, enumeration can happen at runtime when devices are inserted or removed.

4. Device Registers

4.1. Device Registers as the OS-Visible Control Surface

The OS controls devices through device registers.

A device exposes a small set of registers to the OS.

The driver reads and writes these registers to control the device.

Common register types:

Register            Purpose
------------------  ----------------------------------------------------------------
Command register    CPU writes commands here to start operations, e.g. read, write, reset, configure
Status register     CPU reads this to check readiness, completion, or errors
Data / buffer       Used to transfer payload data or identify a buffer
Interrupt-enable    Controls which device events may raise interrupts

4.2. Command Register

The CPU writes to the command register to start an operation.

Examples:

  • read this sector
  • write this data
  • reset the device
  • configure the device

4.3. Status Register

The CPU reads the status register to determine the device state.

It can indicate:

  • device ready
  • operation completed
  • error occurred
  • data available

4.4. Data / Buffer Register

For PIO, the data register may carry one word or byte of payload per access.

For DMA, the device usually gets a memory address, length, and direction, rather than transferring every byte through a data register.

4.5. Interrupt-Enable Register

The interrupt-enable register tells the device which events are allowed to generate interrupts.

For example, a keyboard driver may configure the device to interrupt the CPU whenever a key is pressed.

4.6. Core Point

The mechanisms discussed in this lecture all build on these device registers:

  • synchronous programmed I/O
  • interrupts
  • DMA

The driver interacts with the device by reading and writing these registers.
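
The four register types can be made concrete with a small simulation. The layout, command codes, and bit positions below are assumptions made up for this sketch (real devices define their own), and `fake_device_step` merely imitates what hardware would do on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register layout; real devices define their own. */
struct dev_regs {
    volatile uint8_t command;     /* driver writes here to start an operation */
    volatile uint8_t status;      /* driver reads readiness/completion/errors */
    volatile uint8_t data;        /* one byte of payload per access */
    volatile uint8_t irq_enable;  /* which events may raise an interrupt */
};

enum { CMD_NONE = 0, CMD_RESET = 1, CMD_READ = 2 };            /* assumed codes */
enum { ST_READY = 1 << 0, ST_DATA_AVAIL = 1 << 1, ST_ERROR = 1 << 7 };
enum { IRQ_ON_DATA = 1 << 0 };

/* Stand-in for the hardware: consumes the pending command and updates the
 * status register. Returns 1 if the device would raise an interrupt. */
static int fake_device_step(struct dev_regs *r, uint8_t next_byte) {
    switch (r->command) {
    case CMD_NONE:
        return 0;
    case CMD_RESET:
        r->status = ST_READY;
        r->data = 0;
        break;
    case CMD_READ:
        r->data = next_byte;
        r->status = ST_READY | ST_DATA_AVAIL;
        break;
    default:
        r->status = ST_ERROR;
        break;
    }
    r->command = CMD_NONE;
    return (r->status & ST_DATA_AVAIL) && (r->irq_enable & IRQ_ON_DATA);
}

/* Driver side: everything happens through register reads and writes. */
static void drv_start_read(struct dev_regs *r) {
    r->irq_enable |= IRQ_ON_DATA;  /* allow a "data available" interrupt */
    r->command = CMD_READ;         /* start the operation */
}

static int drv_data_ready(const struct dev_regs *r) {
    return (r->status & ST_DATA_AVAIL) != 0;
}

static uint8_t drv_take_byte(struct dev_regs *r) {
    uint8_t b = r->data;
    /* Models the side effect of reading the data register: on real hardware
     * the device itself would typically clear the flag. */
    r->status &= (uint8_t)~ST_DATA_AVAIL;
    return b;
}
```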

5. Programmed I/O

5.1. Definition

Programmed I/O, abbreviated PIO, means the CPU directly performs device register accesses.

The CPU drives the I/O operation by reading and writing registers.

PIO can be implemented in two main ways:

  • port I/O
  • memory-mapped I/O

5.2. Port I/O

Port I/O uses special CPU instructions to access a separate I/O address space.

On x86, typical instructions include:

  • inb
  • outb

Example idea:

inb(port)          -> read one byte from an I/O port
outb(port, value)  -> write one byte to an I/O port

This is a legacy x86 mechanism, but some devices still use it.

5.3. Memory-Mapped I/O

Memory-mapped I/O, abbreviated MMIO, places device registers into the physical address space.

The CPU accesses device registers using normal load and store instructions.

Important distinction:

  • the CPU instruction looks like a normal memory access
  • but the access goes to the device, not normal RAM

MMIO regions are usually mapped as uncached and order-sensitive.

This matters because device register operations often have side effects, and the CPU must not freely reorder or cache them like normal memory.
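
In C, MMIO register access is usually written through `volatile` pointers so that the compiler performs every access exactly as written. The following is a minimal sketch: the helper names `mmio_read32` / `mmio_write32`, the offsets `REG_CTRL` / `REG_STATUS`, and the `CTRL_ENABLE` bit are assumptions for illustration. Note that `volatile` only constrains the compiler; on weakly ordered CPUs a real kernel additionally needs memory barriers and an uncached mapping, which are omitted here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register offsets (in bytes) and bits for an imaginary device. */
#define REG_CTRL    0x00u
#define REG_STATUS  0x04u
#define CTRL_ENABLE 0x1u

/* Read a 32-bit register at byte offset `off` from the mapped window `base`.
 * `volatile` keeps the compiler from caching, merging, or dropping the access. */
static inline uint32_t mmio_read32(const volatile void *base, uint32_t off) {
    assert(off % 4 == 0);
    return *(const volatile uint32_t *)((const volatile uint8_t *)base + off);
}

/* Write a 32-bit register; the store is emitted even if the value looks unused. */
static inline void mmio_write32(volatile void *base, uint32_t off, uint32_t val) {
    assert(off % 4 == 0);
    *(volatile uint32_t *)((volatile uint8_t *)base + off) = val;
}

/* Read-modify-write of a control register: set the enable bit, keep the rest. */
static void device_enable(volatile void *base) {
    uint32_t ctrl = mmio_read32(base, REG_CTRL);
    mmio_write32(base, REG_CTRL, ctrl | CTRL_ENABLE);
}
```

For experimentation, an ordinary `uint32_t` array can stand in for the device window; on real hardware `base` would come from mapping the device's physical register range.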

5.4. Why PIO Can Be Inefficient

I/O devices are often extremely slow compared with the CPU.

For example:

  • a human typing on a keyboard produces only a few events per second
  • a CPU can execute billions of cycles per second

If the CPU waits in a loop for a device, most CPU cycles are wasted.

6. Synchronous PIO

6.1. Basic Idea

Synchronous PIO means the OS issues a command and then repeatedly polls the device status register until the operation completes.

The sequence is:

  1. The OS writes command registers to issue a request.
  2. The device performs the operation.
  3. The kernel repeatedly reads the status register.
  4. When the device is ready or done, the kernel reads or writes the data register.
  5. The CPU repeats this word by word or byte by byte until the whole transfer is complete.

6.2. Polling

Polling means repeatedly asking the device:

Are you done?
Are you done?
Are you done?
...

In code, this is often a loop that checks a status bit.

This is also called busy waiting because the CPU is actively executing instructions while waiting.

6.3. Example: Serial Port / UART

UART stands for Universal Asynchronous Receiver / Transmitter.

It is hardware for sending and receiving serial bytes.

The serial driver talks to the UART through a few registers.

6.3.1. Sending a byte

Conceptually:

void serial_putc(uint8_t byte) {
    while ((inb(LSR_REG) & LSR_THRE) == 0)
        ;   // spin until UART is ready

    outb(THR_REG, byte);
}

Meaning:

  • read the line status register
  • wait until the transmit-holding register is empty
  • write one byte to the transmit register

6.3.2. Receiving a byte

Conceptually:

uint8_t serial_getc(void) {
    while ((inb(LSR_REG) & LSR_DR) == 0)
        ;   // spin until data arrives

    return inb(RBR_REG);
}

Meaning:

  • spin until the UART reports that data is ready
  • read one byte from the receive buffer register
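
To see the cost of busy waiting, the transmit loop can be run against a simulated UART. This is only a sketch: `sim_inb` / `sim_outb` replace the real port instructions, the port numbers and the `LSR_THRE` bit follow the common PC COM1 / 16550 convention, and the figure of 1000 polls per byte is an arbitrary assumption.

```c
#include <assert.h>
#include <stdint.h>

#define THR_REG  0x3F8  /* transmit holding register (COM1 convention) */
#define LSR_REG  0x3FD  /* line status register */
#define LSR_THRE 0x20   /* "transmit holding register empty" bit */

/* Simulated UART state; all of this is invented for the experiment. */
struct sim_uart {
    unsigned busy_polls;  /* status reads left until the transmitter is free */
    unsigned polls_seen;  /* total status reads performed by the driver */
    unsigned bytes_sent;
    uint8_t  last_sent;
};
static struct sim_uart uart;

/* Stand-in for inb(): only the line status register is modelled. */
static uint8_t sim_inb(uint16_t port) {
    if (port != LSR_REG)
        return 0;
    uart.polls_seen++;
    if (uart.busy_polls > 0) {
        uart.busy_polls--;
        return 0;            /* still transmitting the previous byte */
    }
    return LSR_THRE;
}

/* Stand-in for outb(): writing THR starts a (slow) transmission. */
static void sim_outb(uint16_t port, uint8_t val) {
    if (port != THR_REG)
        return;
    uart.last_sent = val;
    uart.bytes_sent++;
    uart.busy_polls = 1000;  /* assumption: the device needs 1000 polls per byte */
}

/* Same structure as the polling loop in the notes, on the simulated ports. */
static void serial_putc(uint8_t byte) {
    while ((sim_inb(LSR_REG) & LSR_THRE) == 0)
        ;                    /* spin until the UART is ready */
    sim_outb(THR_REG, byte);
}
```

Under these assumptions the first call returns after a single status read, while a second call right after it spins through 1001 status reads before it can write its byte.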

6.4. Main Takeaway from Synchronous PIO

The CPU waits until a status bit changes.

Every byte costs CPU time.

This can be acceptable for tiny transfers or simple early-boot code, but it is inefficient for larger transfers or slow devices.

6.5. Advantages of Synchronous PIO

Advantages:

  • simple to implement
  • little hardware support required
  • low overhead if the device is already ready
  • good enough for very small transfers
  • historically common

6.6. Disadvantages of Synchronous PIO

Disadvantages:

  • busy waiting wastes CPU cycles
  • CPU cannot do useful work while spinning
  • CPU must move every byte or word itself
  • large transfers burn a lot of CPU time

7. Interrupts

7.1. Motivation

The problem with polling is that the CPU has to actively check whether the device is done.

Interrupts solve this by letting the device notify the CPU.

Instead of the CPU repeatedly asking:

Are you done yet?

the device says:

I need attention now.

7.2. Definition

An interrupt is an asynchronous notification from a device to the CPU/OS.

It tells the OS that the device needs attention.

Reasons for an interrupt include:

  • operation completed
  • device ready
  • new input arrived
  • error occurred

7.3. Benefit

The CPU can do other work while the device is busy.

For example:

  • run another process
  • perform computation
  • schedule other I/O
  • manage other devices

The CPU no longer needs to waste time in a polling loop.

8. From Device Interrupt to Driver Handler

8.1. Interrupt Delivery Path

The interrupt path is approximately:

Device raises interrupt
    |
    v
Interrupt controller routes it to a CPU and vector number
    |
    v
CPU enters kernel mode through the IDT
    |
    v
OS dispatches to registered driver handler
    |
    v
Handler acknowledges device/controller and processes or defers work

8.2. Interrupt Controller

The interrupt controller receives interrupt signals from devices and routes them to a CPU.

It also associates the interrupt with a vector number.

The vector number helps the CPU/OS find the correct handler.

8.3. Interrupt Descriptor Table

On x86 systems, the CPU uses the Interrupt Descriptor Table, or IDT, to enter the correct kernel entry point. Other architectures use comparable vector tables.

The CPU interrupts normal execution, saves state, and enters kernel mode.

8.4. Driver Handler

The OS dispatches the interrupt to the corresponding driver handler.

The handler must usually:

  • determine what happened
  • acknowledge the device
  • acknowledge the interrupt controller
  • schedule further work if needed

If the handler does not acknowledge the interrupt properly, the interrupt may keep firing.
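
The dispatch step can be sketched as a table indexed by vector number. The names `irq_register`, `irq_dispatch`, and `kbd_handler` are hypothetical, and the sketch leaves out what the low-level entry code does (saving state) as well as the end-of-interrupt signalling to the controller.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_VECTORS 256  /* x86 defines 256 interrupt vectors */

typedef void (*irq_handler_t)(void *ctx);

struct irq_slot {
    irq_handler_t handler;  /* registered driver handler, or NULL */
    void *ctx;              /* driver-private pointer passed back to it */
};

static struct irq_slot irq_table[NUM_VECTORS];
static unsigned spurious_count;  /* interrupts nobody registered for */

/* A driver claims a vector. Fails if the vector is invalid or already taken. */
static int irq_register(unsigned vector, irq_handler_t h, void *ctx) {
    if (vector >= NUM_VECTORS || h == NULL || irq_table[vector].handler != NULL)
        return -1;
    irq_table[vector].handler = h;
    irq_table[vector].ctx = ctx;
    return 0;
}

/* Called by the (omitted) low-level entry code once CPU state is saved. */
static void irq_dispatch(unsigned vector) {
    assert(vector < NUM_VECTORS);
    if (irq_table[vector].handler == NULL) {
        spurious_count++;
        return;
    }
    irq_table[vector].handler(irq_table[vector].ctx);
    /* A real kernel would acknowledge the interrupt controller around here. */
}

/* Example driver handler: just counts keyboard events. */
struct kbd_state { unsigned events; };

static void kbd_handler(void *ctx) {
    struct kbd_state *k = ctx;
    k->events++;
}
```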

9. Interrupt Context and Its Problems

9.1. Interrupts Interrupt Normal Execution

An interrupt can arrive while the CPU is executing unrelated code.

Example:

  • a user process is computing something
  • a keyboard event or disk completion occurs
  • the CPU stops the current execution path
  • the kernel interrupt handler runs
  • later the original execution resumes

9.2. Interrupt Handlers Should Be Short

Interrupt handlers should not do too much work.

Reasons:

  • they increase interrupt latency
  • other devices may have to wait
  • device buffers may overflow
  • packets may be dropped
  • keystrokes may be lost
  • edge-triggered events may be missed
  • level-triggered devices may keep reasserting the interrupt
  • interrupt context cannot block

9.3. Interrupt Context Cannot Block

Interrupt handlers run in a special context.

They generally cannot sleep or block.

This restricts synchronization.

For example, an interrupt handler cannot simply wait on a normal blocking lock if the lock is unavailable.

This is why interrupt handlers should do only the urgent part of the work.

10. Top Half and Bottom Half

10.1. Motivation

Because interrupt handlers should be short, the OS often splits interrupt handling into two parts:

  • top half
  • bottom half

The basic principle is:

Do the urgent minimum now.
Defer the expensive work until later.

10.2. Top Half

The top half runs in interrupt context.

It cannot sleep.

It should do only urgent work.

Typical responsibilities:

  • acknowledge the device
  • acknowledge the interrupt controller
  • record what happened
  • schedule deferred work

The top half is on the critical path of interrupt handling.

Therefore, it must be fast.

10.3. Bottom Half

The bottom half runs later.

It performs the more expensive or less urgent work.

Typical responsibilities:

  • drain buffers
  • complete requests
  • wake waiting threads
  • perform follow-up processing
  • hand data to higher-level kernel subsystems

Depending on the OS mechanism, bottom halves may or may not be allowed to sleep.

10.4. Examples of Deferred-Work Mechanisms

Linux examples:

  • softirqs
  • tasklets
  • workqueues
  • threaded IRQs

Windows example:

  • Deferred Procedure Calls, abbreviated DPCs

Important distinction:

  • softirqs and tasklets still cannot sleep
  • workqueues and threaded IRQs run in process context and may block
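
The division of labour can be sketched in a few lines. This is a single-threaded simulation with invented names (`dev_sim`, `top_half`, `bottom_half`); it ignores the locking a real kernel needs between the two halves and does not use any actual Linux or Windows API.

```c
#include <assert.h>
#include <stdint.h>

#define RX_SLOTS 16  /* ring size; one slot stays unused to tell full from empty */

struct dev_sim {
    uint8_t  rx_ring[RX_SLOTS];  /* bytes captured by the top half */
    unsigned head;               /* next slot the top half writes */
    unsigned tail;               /* next slot the bottom half reads */
    int      bh_pending;         /* "deferred work has been scheduled" flag */
    unsigned dropped;            /* bytes lost because the ring was full */
    unsigned processed_sum;      /* stand-in for "expensive" processing */
};

/* Top half: runs in interrupt context. Grab the byte, note that work is
 * pending, and return as fast as possible. */
static void top_half(struct dev_sim *d, uint8_t byte_from_device) {
    unsigned next = (d->head + 1) % RX_SLOTS;
    if (next == d->tail) {       /* ring full: the bottom half fell behind */
        d->dropped++;
        return;
    }
    d->rx_ring[d->head] = byte_from_device;
    d->head = next;
    d->bh_pending = 1;
}

/* Bottom half: runs later, outside the urgent path. Drains everything the
 * top half queued and returns how many bytes it handled. */
static unsigned bottom_half(struct dev_sim *d) {
    unsigned handled = 0;
    while (d->tail != d->head) {
        d->processed_sum += d->rx_ring[d->tail];
        d->tail = (d->tail + 1) % RX_SLOTS;
        handled++;
    }
    d->bh_pending = 0;
    return handled;
}
```

If the bottom half is delayed for too long, the ring fills up and the top half starts dropping data, which is the buffer-overflow risk described earlier.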

11. Serial Port with Interrupts

11.1. Basic Idea

Instead of spinning until the UART is ready, the driver can enqueue data and return.

Later, when the UART is ready, it raises an interrupt.

The interrupt handler drains queued bytes into the UART.

11.2. Sending with Interrupts

Conceptually:

void serial_putc(uint8_t byte) {
    intq_putc(&txq, byte);
    outb(IER_REG, inb(IER_REG) | IER_XMIT);
}

Meaning:

  • put the byte into a transmit queue
  • enable the transmit-empty interrupt
  • return immediately

The CPU does not wait for the UART to become ready.

11.3. Interrupt Handler

Conceptually:

void serial_interrupt(void) {
    while (!intq_empty(&txq) &&
           (inb(LSR_REG) & LSR_THRE) != 0) {
        outb(THR_REG, intq_getc(&txq));
    }

    uint8_t ier = inb(IER_REG);

    if (intq_empty(&txq))
        ier &= ~IER_XMIT;
    else
        ier |= IER_XMIT;

    outb(IER_REG, ier);
}

Meaning:

  • while the UART can accept bytes and the queue is non-empty:
    • send bytes from the queue
  • if the queue is empty:
    • disable transmit interrupts
  • otherwise:
    • keep transmit interrupts enabled
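
The code above relies on a queue shared between `serial_putc` and the interrupt handler. A minimal ring-buffer sketch of such a queue is shown here. The function names are taken from the notes, but the bodies, the buffer size, the extra `intq_full` helper, and the `bool` return value of `intq_putc` are assumptions. A real implementation must also protect the queue against concurrent access, for example by disabling interrupts around updates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INTQ_BUFSIZE 64  /* assumed size; one slot is kept free */

struct intq {
    uint8_t buf[INTQ_BUFSIZE];
    size_t head;  /* next slot to write */
    size_t tail;  /* next slot to read */
};

static bool intq_empty(const struct intq *q) {
    return q->head == q->tail;
}

static bool intq_full(const struct intq *q) {
    return (q->head + 1) % INTQ_BUFSIZE == q->tail;
}

/* Append one byte. Returns false instead of blocking when the queue is full. */
static bool intq_putc(struct intq *q, uint8_t byte) {
    if (intq_full(q))
        return false;
    q->buf[q->head] = byte;
    q->head = (q->head + 1) % INTQ_BUFSIZE;
    return true;
}

/* Remove the oldest byte. The caller must check intq_empty() first. */
static uint8_t intq_getc(struct intq *q) {
    assert(!intq_empty(q));
    uint8_t b = q->buf[q->tail];
    q->tail = (q->tail + 1) % INTQ_BUFSIZE;
    return b;
}
```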

11.4. Main Takeaway

The caller returns quickly after enqueueing the byte.

The interrupt handler sends the data when the hardware is ready.

This removes busy waiting, but the CPU is still involved in moving each byte.

12. IDE Disk with PIO and Interrupts

12.1. IDE

IDE stands for Integrated Drive Electronics.

It is a classic parallel disk interface.

With PIO and interrupts, the CPU no longer waits by polling, but it still copies the sector data itself.

12.2. Read Flow

Conceptually:

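int ide_read(uint64_t sec_no) {
    // c: the IDE channel/controller state, assumed to be in scope here
    setup_lba(c, sec_no);                          // program the sector number
    outb(reg_command(c), CMD_READ_SECTOR_RETRY);   // issue the read command
    sema_down(&sector_sema);                       // sleep until ide_interrupt() runs
    return 0;
}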

Meaning:

  1. Program the sector number.
  2. Issue a read command.
  3. Block the requesting thread on a semaphore.
  4. Wait until the disk interrupt wakes the thread.

12.3. Interrupt Handler

Conceptually:

void ide_interrupt(void) {
    uint16_t *buf = (uint16_t *) sector_buffer;

    for (int i = 0; i < SECTOR_SIZE / 2; i++)
        buf[i] = inw(reg_data(c));

    sema_up(&sector_sema);
}

Meaning:

  • the disk interrupt fires when data is ready
  • the CPU copies the data from the device data register into memory
  • for a 512-byte sector, the CPU copies 256 16-bit words
  • after copying, the handler wakes the waiting thread

12.4. Main Takeaway

Interrupts remove busy waiting: the requesting thread sleeps instead of spinning.

But the CPU still performs the data copy.

So PIO with interrupts solves one problem but not all of them.

13. PIO with Interrupts: Trade-Offs

13.1. Advantages

Advantages:

  • asynchronous operation
  • CPU can do other work while the device is busy
  • no busy waiting
  • useful for slow devices and small transfers

13.2. Disadvantages

Disadvantages:

  • CPU still copies data word-by-word or byte-by-byte
  • large transfers still consume CPU time
  • each request has interrupt overhead
  • if interrupts happen very frequently, interrupt processing itself can become expensive

13.3. Core Limitation

Interrupts remove the waiting cost.

They do not remove the copying cost.

This motivates DMA.

14. Direct Memory Access

14.1. Motivation

With PIO, even interrupt-driven PIO, the CPU still copies data between the device and memory.

For large transfers, this wastes:

  • CPU cycles
  • memory bandwidth
  • interrupt-handler time

DMA solves this by letting the device move data directly.

14.2. Definition

Direct Memory Access, abbreviated DMA, means the device reads from or writes to main memory directly.

The CPU does not copy the payload.

Instead, the CPU only manages the I/O operation.

14.3. DMA Flow

A typical DMA operation:

  1. The driver prepares a memory buffer.
  2. The driver maps the buffer for DMA and obtains a device-visible bus address.
  3. The driver programs the device with:
    • address
    • length
    • direction
    • command
  4. The requesting thread may block.
  5. The CPU runs other work.
  6. The device transfers data directly to or from memory.
  7. The device raises a completion interrupt.
  8. The interrupt handler wakes the waiting thread.
  9. The driver unmaps the DMA buffer.
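
The core of this flow (steps 3, 6, 7, and 8) can be simulated in user space. In this sketch `memcpy` stands in for the device's own transfer engine and a plain pointer stands in for the bus address. Mapping, unmapping, and thread blocking are left out, and all names (`sim_disk`, `driver_start`, `device_run`, `driver_complete`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum dma_dir { DMA_TO_DEVICE, DMA_FROM_DEVICE };

/* What the driver programs into the device: address, length, direction. */
struct dma_desc {
    uint8_t *addr;  /* stand-in for a bus address */
    size_t len;
    enum dma_dir dir;
};

struct sim_disk {
    uint8_t media[512];       /* the device's own storage */
    struct dma_desc pending;  /* descriptor of the transfer in flight */
    bool busy;
    bool irq_raised;          /* completion interrupt line */
};

/* Step 3: the driver programs the device and returns immediately. */
static void driver_start(struct sim_disk *d, uint8_t *buf, size_t len,
                         enum dma_dir dir) {
    assert(!d->busy && len <= sizeof d->media);
    d->pending = (struct dma_desc){ buf, len, dir };
    d->irq_raised = false;
    d->busy = true;  /* models writing the command register */
}

/* Steps 6 and 7: the device moves the payload itself, then signals completion.
 * No driver code touches the payload bytes. */
static void device_run(struct sim_disk *d) {
    if (!d->busy)
        return;
    if (d->pending.dir == DMA_FROM_DEVICE)
        memcpy(d->pending.addr, d->media, d->pending.len);
    else
        memcpy(d->media, d->pending.addr, d->pending.len);
    d->busy = false;
    d->irq_raised = true;
}

/* Step 8: the completion handler only acknowledges; a real one would also
 * wake the waiting thread. Returns true if a completion was pending. */
static bool driver_complete(struct sim_disk *d) {
    if (!d->irq_raised)
        return false;
    d->irq_raised = false;
    return true;
}
```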

14.4. CPU Role in DMA

The CPU manages I/O, but does not copy the data.

The CPU is responsible for:

  • validating the request
  • setting up the DMA mapping
  • programming the device
  • handling completion
  • waking the caller
  • cleaning up the mapping

14.5. Device Role in DMA

The device performs the payload movement itself.

For example, a disk controller can write data directly into a memory buffer.

A GPU can use its DMA engine to move data between GPU/device memory and main memory.

15. DMA in Modern Systems

15.1. GPU over PCI Express

Modern GPUs are connected over PCI Express.

The CPU programs a transfer.

Then the GPU’s DMA engine can read and write main memory directly while the CPU continues executing other work.

This is important for:

  • rendering
  • compute acceleration
  • machine learning workloads
  • high-throughput data transfer

15.2. Memory-Mapped Files and DMA

The lecture briefly connected DMA to memory-mapped files.

A memory-mapped file maps part of a file into a process’s virtual address space.

Instead of explicitly calling read/write for every access, the program can access the file through memory loads and stores.

On a page fault or page-cache miss:

  1. The process accesses a mapped page.
  2. The page is not currently in memory.
  3. The kernel issues block I/O.
  4. The storage controller typically uses DMA to load the data into the page-cache page.
  5. The process can then continue.

Dirty pages can later be written back to storage, also typically using DMA.

The important idea is that DMA lets the storage device move the file data directly into the relevant memory page, rather than requiring the CPU to manually copy every byte.

15.3. Bus Addresses, Not Normal CPU Pointers

A device does not necessarily use the same kind of address as the CPU.

The driver often maps a buffer to obtain a bus address, also called a DMA address.

This is the address that the device can use.

The CPU may see:

  • virtual addresses
  • physical addresses

The device may see:

  • bus addresses
  • I/O virtual addresses, depending on whether an IOMMU is used

16. Example: IDE Disk with DMA

16.1. Read Flow

Conceptually:

int ide_dma_read(uint64_t sec_no) {
    dma_addr_t dma = dma_map(sector_buffer,
                             SECTOR_SIZE,
                             DMA_FROM_DEVICE);

    setup_lba(c, sec_no);
    out_dma_addr(reg_dma_addr(c), dma);
    outl(reg_dma_len(c), SECTOR_SIZE);
    outb(reg_command(c), CMD_READ_DMA);

    sema_down(&sector_sema);

    dma_unmap(dma, SECTOR_SIZE, DMA_FROM_DEVICE);
    return 0;
}

Meaning:

  1. Map the sector buffer for DMA.
  2. Obtain a DMA/bus address.
  3. Program the sector number.
  4. Program the DMA address.
  5. Program the transfer length.
  6. Issue the DMA read command.
  7. Block until the interrupt arrives.
  8. Unmap the buffer after completion.

16.2. Interrupt Handler

Conceptually:

void ide_interrupt(void) {
    sema_up(&sector_sema);
}

In the DMA version, the handler does not copy the sector.

The device has already written the data into memory.

The interrupt handler mainly wakes the waiting thread.

16.3. Main Takeaway

With DMA:

  • no CPU payload copy is needed
  • the interrupt handler becomes very small
  • the device writes data directly into memory

The handler did not disappear, but its job became much lighter.

17. DMA Trade-Offs

17.1. Advantages

Advantages:

  • no CPU payload copy
  • simple completion interrupt handler
  • scales well to large transfers
  • suitable for high-throughput devices
  • can use descriptor rings for many queued transfers

17.2. Disadvantages

Disadvantages:

  • setup is more complex
  • buffers must be mapped for DMA
  • pages may need to be pinned
  • one descriptor usually fixes:
    • address
    • length
    • direction
  • scatter-gather transfers need more setup
  • consumes memory bandwidth
  • consumes interconnect bandwidth
  • cache coherence must be considered
  • device failure needs timeout/error handling

17.3. Fixed Direction and Address

Once a DMA descriptor is set up, it has a fixed address, length, and direction.

For example, a descriptor may say:

Read 4096 bytes from the device into this buffer.

Changing the target or direction requires setting up another descriptor or mapping.

17.4. Scatter-Gather

Scatter-gather DMA allows a device to transfer data across multiple memory regions.

This is useful because physical memory may be fragmented.

However, it requires more setup because the driver must provide a list or ring of descriptors.
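
Building such a descriptor list mostly means cutting a buffer at page boundaries, because each page may live in a different physical frame. The sketch below does only the cutting. The names `sg_entry` and `sg_build` and the 4 KiB page size are assumptions, and a real driver would additionally translate each chunk to its own bus address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumed page size */

/* One scatter-gather element: a contiguous piece within a single page. */
struct sg_entry {
    uint64_t addr;
    uint32_t len;
};

/* Split [addr, addr + len) into chunks that never cross a page boundary.
 * Returns the number of entries written, or -1 if `out` is too small. */
static int sg_build(uint64_t addr, size_t len,
                    struct sg_entry *out, size_t max_entries) {
    size_t n = 0;
    while (len > 0) {
        /* Bytes left in the current page, starting at addr. */
        uint32_t room = PAGE_SIZE - (uint32_t)(addr % PAGE_SIZE);
        uint32_t chunk = len < room ? (uint32_t)len : room;
        if (n == max_entries)
            return -1;
        out[n].addr = addr;
        out[n].len = chunk;
        n++;
        addr += chunk;
        len -= chunk;
    }
    return (int)n;
}
```

For a 5000-byte buffer starting at address 0x1F00, this yields three entries of 256, 4096, and 648 bytes.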

17.5. Device Failure and Timeouts

If a device fails during DMA, the expected completion interrupt may never arrive.

Then the waiting process could remain blocked forever unless the kernel has a timeout or error-handling path.

A robust driver must handle:

  • timeout
  • device error status
  • failed transfer
  • waking the caller with an error
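
A bounded wait can be sketched as follows. A real kernel would sleep on a semaphore or wait queue with a timer rather than loop, so the tick-based simulation here (`sim_dev`, `sim_tick`, `wait_for_completion`) and the error codes are assumptions chosen purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes for the waiting caller. */
enum { IO_OK = 0, IO_TIMEOUT = -1, IO_DEVICE_ERROR = -2 };

/* Simulated device: completes after a number of ticks, unless it hangs. */
struct sim_dev {
    unsigned ticks_until_done;  /* tick at which the transfer finishes */
    bool will_fail;             /* finish with an error status */
    bool hung;                  /* never finish at all */
    unsigned now;               /* current simulated time */
    bool done;
    bool error;
};

/* Advance simulated time by one tick and let the device make progress. */
static void sim_tick(struct sim_dev *d) {
    d->now++;
    if (!d->hung && !d->done && d->now >= d->ticks_until_done) {
        d->done = true;
        d->error = d->will_fail;
    }
}

/* Wait for completion, but give up after timeout_ticks. Each loop iteration
 * stands in for "sleep until the interrupt or the timer fires". */
static int wait_for_completion(struct sim_dev *d, unsigned timeout_ticks) {
    for (unsigned waited = 0; waited < timeout_ticks; waited++) {
        sim_tick(d);
        if (d->done)
            return d->error ? IO_DEVICE_ERROR : IO_OK;
    }
    return IO_TIMEOUT;  /* the caller is woken with an error, not left blocked */
}
```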

18. Correctness Problems with DMA

This part is especially important because DMA bypasses the ordinary CPU execution path.

18.1. DMA Bypasses the CPU Copy Path

With DMA, the device accesses memory directly.

This is good for performance, but it creates correctness and protection issues.

The kernel must be careful when telling a device where to read or write.

18.2. Pinning

The OS may move pages, swap pages, or perform copy-on-write.

But a device doing DMA has been given the address of a physical frame or of a device-visible mapping.

If the OS moves or reuses that page while DMA is still in progress, the device may write into the wrong memory.

Therefore, pages used for DMA often need to be pinned for the duration of the transfer.

Pinning means:

Do not move, swap, or reuse this page while the device may access it.

18.3. Bounce Buffers

Some devices cannot address all physical memory.

Example:

  • a 32-bit device may not be able to address memory above 4 GB
  • the machine may have much more memory than that

In such cases, the OS can use a bounce buffer.

A bounce buffer is an intermediate buffer located in memory that the device can access.

Flow:

Original buffer <--> Bounce buffer <--> Device

This solves addressability problems but adds an extra copy.
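
Whether a transfer needs bouncing comes down to an address-range check against the highest address the device can reach. A minimal sketch, with hypothetical names (`needs_bounce`, `receive_via_bounce`) and with `memcpy` standing in for the device's DMA write:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True if a device whose highest reachable address is `dma_mask` cannot
 * access every byte of [addr, addr + len). Written to avoid overflow. */
static bool needs_bounce(uint64_t addr, uint64_t len, uint64_t dma_mask) {
    assert(len > 0);
    return addr > dma_mask || len - 1 > dma_mask - addr;
}

/* Device-to-memory transfer through a bounce buffer.
 * `bounce` is assumed to lie in memory the device can address. */
static void receive_via_bounce(uint8_t *dst, const uint8_t *device_data,
                               uint8_t *bounce, size_t len) {
    memcpy(bounce, device_data, len);  /* stands in for the DMA into low memory */
    memcpy(dst, bounce, len);          /* the extra CPU copy that bouncing costs */
}
```

With a 32-bit mask of 0xFFFFFFFF, a 4096-byte buffer at 0xFFFFF000 still fits, but making it one byte longer pushes its last byte above the limit.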

18.4. Cache Coherence

The CPU sees memory through caches.

A DMA device accesses physical memory directly.

This can create stale-data problems.

Examples:

  • CPU has modified data in cache, but device reads old data from memory
  • device writes new data to memory, but CPU reads old cached data

On cache-coherent platforms, hardware keeps this consistent.

On non-cache-coherent platforms, the OS/driver must explicitly:

  • write back (flush) dirty cache lines before the device reads the buffer
  • invalidate the cache lines covering the buffer before the CPU reads data the device wrote
  • perform required synchronization

19. Security Problems with DMA

19.1. DMA Can Bypass CPU Memory Protection

Normal CPU memory protection uses:

  • page tables
  • privilege rings
  • permission bits
  • user/kernel separation
  • mechanisms such as SMEP/SMAP on some CPUs

But a DMA-capable device does not necessarily go through the CPU’s MMU.

Therefore, without additional protection, a device may be able to access physical memory directly.

19.2. Why This Is Dangerous

A buggy driver, compromised peripheral, or malicious firmware could potentially read or write sensitive memory.

Sensitive data in RAM may include:

  • kernel memory
  • memory of other processes
  • disk encryption keys
  • login screen data
  • credentials
  • private application data

19.3. Hot-Plug Devices Make This Worse

Hot-plug interfaces expose the system bus to devices that may be attached after boot.

Examples:

  • Thunderbolt
  • external PCIe-like interfaces

Tools and research projects such as PCILeech and Thunderclap have demonstrated that malicious DMA-capable devices can read or modify memory they were never meant to access.

The key lesson is:

The CPU's protection model alone is not enough to protect against DMA.

20. IOMMU

20.1. Motivation

Because DMA can bypass normal CPU page-table protection, systems use an IOMMU.

IOMMU stands for Input/Output Memory Management Unit.

It provides address translation and protection for device memory accesses.

20.2. Analogy with MMU

The MMU protects memory accesses made by CPU processes.

The IOMMU protects memory accesses made by devices.

Analogy:

CPU side                  Device side
------------------------  ----------------------------
process virtual address   I/O virtual address
MMU                       IOMMU
process page table        per-device IOMMU page table
physical memory           physical memory

The IOMMU is to devices what the MMU is to processes.

20.3. Address Spaces

There are three relevant address spaces:

20.3.1. Virtual Address

Used by code running on the CPU, most visibly by user processes.

Translated by the MMU through per-process page tables.

20.3.2. Physical Address

Actual address of physical memory.

Used internally by the memory system; it is the target of both MMU and IOMMU translation.

20.3.3. I/O Virtual Address

Used by devices.

Translated by the IOMMU through per-device mappings.

20.4. What the IOMMU Provides

An IOMMU can provide:

  • per-device page tables
  • address translation
  • permission checking
  • isolation between devices
  • prevention of unauthorized DMA
  • faulting on invalid device accesses
  • ability to present scattered physical pages as contiguous I/O virtual ranges
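
Translation, permission checking, and faulting can be illustrated with a toy single-level table. Real IOMMUs use multi-level tables managed through platform-specific interfaces, so everything in this sketch (`io_domain`, `iommu_map`, `iommu_unmap`, `iommu_translate`, the 16-page IOVA space) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IO_PAGE_SHIFT 12
#define IO_PAGE_SIZE  (1u << IO_PAGE_SHIFT)
#define IO_PAGES      16  /* tiny per-device I/O virtual address space */

enum { IOMMU_READ = 1, IOMMU_WRITE = 2 };

struct io_pte {
    bool present;
    uint8_t perms;       /* IOMMU_READ and/or IOMMU_WRITE */
    uint64_t phys_page;  /* physical frame number */
};

/* One translation domain, i.e. the page table of one device. */
struct io_domain {
    struct io_pte pt[IO_PAGES];
    unsigned faults;     /* rejected device accesses */
};

/* Map one page-aligned IOVA to one page-aligned physical address. */
static int iommu_map(struct io_domain *d, uint64_t iova, uint64_t phys,
                     uint8_t perms) {
    uint64_t idx = iova >> IO_PAGE_SHIFT;
    if (idx >= IO_PAGES || iova % IO_PAGE_SIZE != 0 || phys % IO_PAGE_SIZE != 0)
        return -1;
    d->pt[idx] = (struct io_pte){ true, perms, phys >> IO_PAGE_SHIFT };
    return 0;
}

static void iommu_unmap(struct io_domain *d, uint64_t iova) {
    uint64_t idx = iova >> IO_PAGE_SHIFT;
    if (idx < IO_PAGES)
        d->pt[idx].present = false;
}

/* Check and translate one device access. On success stores the physical
 * address and returns true; otherwise records a fault and returns false. */
static bool iommu_translate(struct io_domain *d, uint64_t iova,
                            uint8_t access, uint64_t *phys_out) {
    uint64_t idx = iova >> IO_PAGE_SHIFT;
    if (idx >= IO_PAGES || !d->pt[idx].present ||
        (d->pt[idx].perms & access) != access) {
        d->faults++;
        return false;
    }
    *phys_out = (d->pt[idx].phys_page << IO_PAGE_SHIFT) | (iova % IO_PAGE_SIZE);
    return true;
}
```

Mapping two adjacent IOVA pages to unrelated physical frames shows the last point of the list: the device sees one contiguous range even though the memory behind it is scattered.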

20.5. IOMMU Is a Mechanism, Not a Guarantee

The IOMMU only protects memory if configured correctly.

Problems can still happen.

20.5.1. Passthrough or Identity Mode

The IOMMU may be enabled but configured to map device addresses directly to physical addresses.

In that case, there is little or no protection.

20.5.2. Late Initialization

If the IOMMU is configured late during boot, early drivers or devices may have a window where DMA is not protected.

20.5.3. Missing Unmaps

If the kernel forgets to unmap an IOVA after a DMA transfer, a device may retain access to memory after the buffer is freed.

This can lead to stale access.

20.5.4. Coarse Page Granularity

IOMMU mappings operate at page granularity.

If the intended buffer is smaller than a page, mapping the page may expose neighboring data in the same page.
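
How much neighboring data becomes reachable is simple arithmetic. The helper below is a sketch with a made-up name (`exposed_bytes`) and an assumed 4 KiB page size; it counts the bytes inside the mapped pages that lie outside the intended buffer.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumed IOMMU page size */

/* Bytes that become device-accessible although they are not part of the
 * buffer [addr, addr + len), because whole pages must be mapped. */
static uint64_t exposed_bytes(uint64_t addr, uint64_t len) {
    assert(len > 0);
    uint64_t first_page = addr / PAGE_SIZE;
    uint64_t last_page = (addr + len - 1) / PAGE_SIZE;
    return (last_page - first_page + 1) * PAGE_SIZE - len;
}
```

A page-aligned 4096-byte buffer exposes nothing extra, whereas a 64-byte buffer in the middle of a page exposes the remaining 4032 bytes of that page.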

21. Comparing the Three Data Transfer Strategies

21.1. Synchronous PIO

Best for:

  • already-ready devices
  • tiny transfers
  • simple devices
  • early boot
  • simple serial console usage

Cost:

  • busy waiting
  • CPU moves every byte
  • poor for large transfers

21.2. PIO with Interrupts

Best for:

  • slow devices
  • small transfers
  • keyboards
  • UART / serial ports
  • cases where waiting is expensive but copying is acceptable

Cost:

  • CPU still copies data
  • interrupt overhead per request

21.3. DMA with Interrupts

Best for:

  • large transfers
  • high-throughput devices
  • disks
  • SSDs
  • network cards
  • GPUs

Cost:

  • setup complexity
  • mapping complexity
  • pinning
  • cache coherence issues
  • security issues
  • completion/error handling

21.4. Short Summary Table

Mechanism          What it removes               What remains                        Best use case
-----------------  ----------------------------  ----------------------------------  -------------------------------
Synchronous PIO    Nothing                       Waiting and copying                 Tiny/simple transfers
PIO + interrupts   Busy waiting                  CPU copying                         Slow devices, small transfers
DMA + interrupts   Busy waiting and CPU copying  Setup/mapping/security complexity   Large/high-throughput transfers

22. Conceptual Progression of the Lecture

The lecture develops I/O transfer mechanisms incrementally.

22.1. First Step: Synchronous PIO

The simplest approach:

CPU tells device what to do.
CPU waits.
CPU copies the data.

Problem:

  • CPU wastes time waiting and copying.

22.2. Second Step: Interrupts

Improvement:

CPU tells device what to do.
CPU does other work.
Device interrupts when ready.
CPU copies the data.

Solved:

  • busy waiting

Still not solved:

  • CPU copying

22.3. Third Step: DMA

Further improvement:

CPU tells device where the buffer is.
CPU does other work.
Device copies data directly to/from memory.
Device interrupts when done.
CPU wakes the caller.

Solved:

  • busy waiting
  • CPU payload copying

New problems:

  • mapping
  • pinning
  • cache coherence
  • IOMMU/security
  • error handling

23. Important Terms

23.1. I/O

Input / Output: communication between the OS and physical devices.

23.2. Driver

Device-specific kernel code that controls hardware registers and handles interrupts.

23.3. Device Class

A category of devices with a common OS abstraction, such as character, block, or network.

23.4. I/O Bus

Interconnect that connects CPU, memory, and devices.

23.5. Enumeration

Discovery of devices and their capabilities.

23.6. Device Register

Small OS-visible control, status, or data location on a device, read and written by the driver.

23.7. PIO

Programmed I/O: CPU-driven device register reads and writes.

23.8. Port I/O

PIO through special CPU instructions and a separate I/O address space.

23.9. MMIO

Memory-mapped I/O: device registers mapped into the physical address space.

23.10. Polling

Repeatedly checking a device status register.

23.11. Busy Waiting

Actively spinning while waiting for an event.

23.12. Interrupt

Asynchronous signal from a device to the CPU/OS.

23.13. IRQ

Interrupt request.

23.14. IDT

Interrupt Descriptor Table, used by the CPU to enter the correct kernel interrupt handler.

23.15. Top Half

Fast interrupt-context part of interrupt handling.

23.16. Bottom Half

Deferred part of interrupt handling.

23.17. DMA

Direct Memory Access: device directly reads or writes main memory.

23.18. DMA Mapping

Process of preparing a memory buffer and obtaining a device-visible address.

23.19. Bus Address

Address used by a device for DMA.

23.20. IOMMU

I/O Memory Management Unit: translates and protects device memory accesses.

23.21. Pinning

Preventing a page from being moved, swapped, or reused during DMA.

23.22. Bounce Buffer

Intermediate DMA-accessible buffer used when a device cannot access the real target buffer.

23.23. Cache Coherence

Ensuring CPU caches and device memory accesses see consistent data.

24. Coming Next

The next lecture moves from general I/O mechanisms to block storage.

Topics announced:

  • how the OS talks to a real storage device
  • hard disk drives
    • geometry
    • seek time
    • rotational delay
  • solid-state drives
    • flash translation
    • wear leveling
  • block device abstraction
  • I/O scheduler

Author: Lowtroo

Created on: 2026-05-30 Sat 22:50
