Shared Memory Synchronization 2
1. Why the lecture still studies shared-memory synchronization
1.1. Actor models and software transactional memory are useful, but they are higher-level abstractions
- The lecture begins by addressing a question: why not use actor models or software transactional memory instead of low-level synchronization?
- The answer is that operating systems live directly on top of hardware.
- Hardware fundamentally exposes shared state / shared memory.
- Therefore, kernel and OS developers must understand and handle synchronization at the low level first.
- Higher-level abstractions can be built on top of the OS, but the OS itself cannot assume them.
2. Recap: synchronization and the “too much milk” example
2.1. The central problem
- Two processes (or actors) need to coordinate so that exactly one of them performs some action.
- Example: check whether there is milk; if not, go buy milk.
- This sequence is a critical section.
- The critical section is non-atomic: checking and acting are separated in time.
2.2. Desired properties
2.2.1. Safety
- Safety means mutual exclusion.
- At most one process should execute the critical section at any given time.
2.2.2. Liveness
- Liveness means progress.
- Eventually, some process should make progress and perform the required action.
2.2.3. Combined goal
- “At most one” + “at least one eventually” gives the desired exactly one behavior.
2.3. Recap of failed solution attempts
2.3.1. Naive solution
- Fails because the two processes may interleave step-by-step.
- Both can observe the same state and both may decide to act.
- Thus mutual exclusion is violated.
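The failing interleaving can be replayed deterministically. The following is a minimal sketch, not the lecture's code: it assumes a hypothetical `fridge` dictionary and models each process as a Python generator, so that the "context switch" between checking and acting is an explicit `yield`.

```python
# Deterministic replay of the bad interleaving; no real threads are involved.
# The names fridge and naive_shopper are illustrative assumptions.
fridge = {"milk": 0}

def naive_shopper():
    """One process: check for milk, then (one step later) buy if none was seen."""
    saw_no_milk = fridge["milk"] == 0   # step 1: check
    yield                               # a context switch may happen here
    if saw_no_milk:                     # step 2: act on a possibly stale observation
        fridge["milk"] += 1
    yield

a, b = naive_shopper(), naive_shopper()
next(a)   # A checks: no milk
next(b)   # B checks: still no milk
next(a)   # A buys
next(b)   # B buys as well
print(fridge["milk"])  # 2: both bought, so "at most one" is violated
```

The point of the sketch is only that check and act are separate steps; any scheduler that switches between them can produce this outcome.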
2.3.2. Symmetry-breaking attempt
- Improves safety in some executions.
- But it is not robust: if one process stops participating, the other may fail to make progress.
- So liveness is violated.
2.3.3. Another near-correct attempt
- Preserves “at most one” better.
- Also more robust against one process being absent.
- But there is still a corner case where both conclude that the other process is doing the work.
- Then both wait forever: no progress.
2.3.4. Busy-waiting style solution
- A final solution works by forcing one side to wait while the other completes the whole critical section.
- This can enforce correctness for the example.
- However, it is inefficient because it wastes CPU time.
3. Semaphores as the OS abstraction for synchronization
3.1. Motivation
- Busy waiting is usually not considered a good general solution.
- The operating system provides a better abstraction: the semaphore.
3.2. Semaphore semantics
3.2.1. P operation
- Wait until the semaphore value is positive, then decrement it.
- The check and the decrement happen atomically, so two processes cannot both pass on the same count.
- While the value is zero, the caller blocks instead of spinning.
3.2.2. V operation
- Signal / increment the semaphore.
- It announces that some resource or event is available.
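As an illustration of these semantics, here is a teaching sketch of a counting semaphore built on Python's `threading.Condition`. The class name `CountingSemaphore` and the method names `P` and `V` are chosen to mirror the notes; real programs would normally use `threading.Semaphore` directly, and an OS kernel implements this quite differently.

```python
import threading

class CountingSemaphore:
    """Teaching sketch of P/V semantics (an assumption, not the OS implementation)."""

    def __init__(self, initial=0):
        self._value = initial
        self._cond = threading.Condition()

    def P(self):
        with self._cond:
            # Check and decrement happen under the same lock, hence atomically.
            while self._value == 0:
                self._cond.wait()   # sleep instead of busy waiting
            self._value -= 1

    def V(self):
        with self._cond:
            self._value += 1
            self._cond.notify()     # wake one waiter, if there is one

sem = CountingSemaphore(0)
log = []

def waiter():
    sem.P()                 # blocks until the main thread calls V
    log.append("woken")

t = threading.Thread(target=waiter)
t.start()
log.append("signalling")
sem.V()
t.join()
print(log)  # ['signalling', 'woken']
```

The waiter cannot pass `P` before the main thread has called `V`, which is why the log order is fixed.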
3.3. Two kinds of semaphore usage
3.3.1. Binary semaphore
- Used primarily for mutual exclusion.
- Behaves like a mutex in many examples.
3.3.2. Counting semaphore
- Used for event counting or resource counting.
- Represents how many times something has happened, or how many instances of a resource are available.
3.4. Important distinction
- Binary semaphores and counting semaphores serve different purposes.
- Correct synchronization solutions must keep these roles conceptually separate.
4. Producer-consumer synchronization
4.1. Problem setting
- There is a bounded pool of buffers.
- Producers generate data and place it into buffers.
- Consumers take data out of buffers and return the empty buffers to the pool.
4.2. Shared state and synchronization needs
- The queue / pool state is shared.
- Access to shared queue state must be mutually exclusive.
- Producers must wait until an empty buffer exists.
- Consumers must wait until a full buffer exists.
4.3. Required semaphores
4.3.1. One binary semaphore
- A mutex protects the shared queue / buffer-pool state.
4.3.2. Two counting semaphores
- One counts empty buffers available.
- One counts full buffers available.
4.4. Producer logic
- Wait until an empty buffer exists.
- Enter the mutex-protected critical section.
- Remove an empty buffer / update queue state.
- Leave the critical section.
- Produce / publish data.
- Signal that a full buffer is now available.
4.5. Consumer logic
- Wait until a full buffer exists.
- Enter the mutex-protected critical section.
- Remove a full buffer / update queue state.
- Leave the critical section.
- Process the data.
- Re-enter critical-section protection if needed for shared pool bookkeeping.
- Return the empty buffer to the pool.
- Signal that an empty buffer is now available.
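The producer and consumer steps above can be sketched with Python's `threading.Semaphore`. This is a minimal sketch under assumptions: the names `empty_pool`, `full_queue`, and `N_BUFFERS` are hypothetical, a buffer is modelled as a one-element list, and the producer takes the mutex a second time to enqueue the filled buffer, which the notes fold into "produce / publish data".

```python
import threading
from collections import deque

N_BUFFERS = 3
mutex = threading.Semaphore(1)           # binary semaphore: protects both queues
empty = threading.Semaphore(N_BUFFERS)   # counts empty buffers
full = threading.Semaphore(0)            # counts full buffers
empty_pool = deque([[None] for _ in range(N_BUFFERS)])
full_queue = deque()
consumed = []

def producer(items):
    for item in items:
        empty.acquire()                  # P(empty): wait for an empty buffer
        with mutex:                      # short critical section
            buf = empty_pool.popleft()
        buf[0] = item                    # fill the buffer outside the mutex
        with mutex:
            full_queue.append(buf)       # publish the filled buffer
        full.release()                   # V(full)

def consumer(count):
    for _ in range(count):
        full.acquire()                   # P(full): wait for a full buffer
        with mutex:
            buf = full_queue.popleft()
        consumed.append(buf[0])          # process the data outside the mutex
        with mutex:
            empty_pool.append(buf)       # return the buffer to the pool
        empty.release()                  # V(empty)

p = threading.Thread(target=producer, args=(range(10),))
c = threading.Thread(target=consumer, args=(10,))
p.start(); c.start()
p.join(); c.join()
print(consumed)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

With one producer and one consumer the FIFO queue preserves order; with several consumers the same code still runs correctly, but the order in `consumed` is no longer fixed.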
4.6. Key structural observation
4.6.1. Mutex synchronization is symmetric
- With a mutex-like semaphore, P and V naturally occur in matching pairs.
- Lock, do work, unlock.
4.6.2. Condition synchronization is asymmetric
- For event-counting semaphores, P and V often occur in different places in the program.
- Example:
- producer does V(full),
- consumer does P(full).
- This is normal, because one side causes the event and another side waits for it.
4.7. Multiple consumers
- Adding a second consumer requires no algorithmic change.
- Another consumer can simply run the same code.
- The semaphore-based design already supports multiple consumers correctly.
4.8. Splitting the mutex
- The lecture discusses whether separate mutexes could protect different parts of the state.
- Answer: yes, that is possible.
- In real implementations, doing so may improve performance and concurrency.
- The lecture uses one mutex mainly for conceptual simplicity.
4.9. Reordering V operations
- Reordering certain V operations may be possible if the operations touch different state and are already atomic.
- However, doing so often brings no benefit.
- In general, critical sections should be kept as short as possible.
4.10. Reordering P operations can be dangerous
- If a consumer acquires the mutex before waiting for data, disaster may occur:
- the consumer holds the mutex,
- but waits for a producer to make data available,
- while the producer cannot acquire the mutex to publish the data.
- This creates a deadlock.
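The scenario can be reproduced in a small experiment. The sketch below is an assumption-laden demonstration, not the lecture's code: it uses timeouts so that the program terminates, whereas the real deadlock would block forever. The names `bad_consumer`, `outcome`, and `consumer_has_mutex` are hypothetical.

```python
import threading

mutex = threading.Semaphore(1)
full = threading.Semaphore(0)            # no data available yet
outcome = {}
consumer_has_mutex = threading.Event()   # only used to fix the interleaving

def bad_consumer():
    mutex.acquire()                      # wrong order: mutex first ...
    consumer_has_mutex.set()
    # ... then P(full) while still holding the mutex.
    outcome["consumer_got_data"] = full.acquire(timeout=1.0)
    mutex.release()

def producer():
    consumer_has_mutex.wait()
    got = mutex.acquire(timeout=0.2)     # would block forever without the timeout
    outcome["producer_got_mutex"] = got
    if got:
        full.release()                   # V(full) is never reached here
        mutex.release()

threads = [threading.Thread(target=bad_consumer), threading.Thread(target=producer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(outcome)
```

Both entries come out `False`: the producer never obtains the mutex while the consumer holds it, and the consumer never receives data. Swapping the two acquisitions in `bad_consumer` (first `full`, then `mutex`) removes the problem.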
5. Deadlock
5.1. Intuition
- Deadlock occurs when processes are stuck waiting on one another and no progress is possible.
- Dijkstra called this a deadly embrace.
5.2. Wait-for graph view
- Model processes as nodes.
- Add an edge when one process waits for a resource held by another.
- A cycle in this graph indicates deadlock (for single-instance resources such as locks, a cycle is both necessary and sufficient).
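A wait-for graph check can be sketched as an ordinary depth-first search for a cycle. The function name `find_cycle` and the dictionary representation (process name mapped to the processes it waits on) are assumptions made for illustration.

```python
def find_cycle(wait_for):
    """Return a list of processes forming a cycle, or None if there is none.

    wait_for maps each process to the processes it is currently waiting on.
    """
    WHITE, GREY, BLACK = 0, 1, 2         # unvisited, on current path, finished
    color = {}
    path = []

    def visit(p):
        color[p] = GREY
        path.append(p)
        for q in wait_for.get(p, ()):
            state = color.get(q, WHITE)
            if state == GREY:            # back edge: q is on the current path
                return path[path.index(q):]
            if state == WHITE:
                cycle = visit(q)
                if cycle:
                    return cycle
        path.pop()
        color[p] = BLACK
        return None

    for p in list(wait_for):
        if color.get(p, WHITE) == WHITE:
            cycle = visit(p)
            if cycle:
                return cycle
    return None

deadlocked = {"P1": ["P2"], "P2": ["P3"], "P3": ["P1"], "P4": ["P1"]}
healthy = {"P1": ["P2"], "P2": ["P3"], "P3": []}
print(find_cycle(deadlocked))  # ['P1', 'P2', 'P3']
print(find_cycle(healthy))     # None
```

In the first graph P4 is not part of the cycle, but it is stuck as well because it waits on a deadlocked process.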
5.3. Stability of deadlock
- Once processes are in a deadlocked cycle, nothing inside the cycle can change the situation.
- Therefore, deadlock is stable unless some outside intervention occurs.
5.4. Necessary conditions for deadlock
The lecture develops the standard necessary conditions.
5.4.1. Mutual exclusion
- Some resource must require exclusive access.
5.4.2. No resource preemption
- Once a process holds a resource, the system cannot simply take it away safely.
5.4.3. Hold and wait
- A process holds one resource while waiting for another.
5.4.4. Circular wait
- A cycle exists in the wait-for relation.
5.5. Why simple resource preemption is hard
- A lock usually protects shared state.
- If the OS simply takes the lock away, the process may already have partially modified the shared data.
- Then the protected data structure may become inconsistent or corrupted.
5.6. Relation to transactions
- In database systems, deadlock recovery is more practical because transactions keep enough information to roll back changes.
- Similar ideas inspired software transactional memory.
- But for arbitrary low-level memory manipulation in an OS kernel, rollback is difficult and expensive.
6. Approaches to deadlock
6.1. Detection and recovery
- Detect cycles in the wait-for graph.
- Then break the deadlock, for example by killing one process.
6.1.1. Why this is difficult in kernels
- The wait-for graph changes dynamically.
- Detecting cycles reliably has overhead.
- Deciding when to run detection is tricky.
- Recovery is dangerous if the killed process has already left shared state inconsistent.
6.1.2. Suitable at higher abstraction levels
- Database transaction systems can often do this.
- OS kernels generally avoid relying on this.
6.2. Prevention
- Prevent deadlock from arising in the first place.
- Since the first three necessary conditions are hard to remove in practice, the lecture focuses on preventing circular wait.
7. Deadlock prevention via resource ordering
7.1. Core idea
- Impose a partial order on resources / locks.
- Processes may acquire locks only in an order consistent with that partial order.
- Then cycles become impossible.
7.2. Why a partial order works
- If every lock acquisition moves “forward” in the order, a cycle would imply:
- \(R_1 < R_2 < \dots < R_k < R_1\),
- which is impossible.
7.3. Why only a partial order, not a total order
- Not all locks in a system need to be related.
- If two locks never interact, their relative order does not matter.
- Therefore a partial order is sufficient.
7.4. Practical interpretation
- In real systems, the order may be implicit rather than globally written down.
- Every code path must respect it.
- If one path acquires \(A\) then \(B\), another path must not acquire \(B\) then \(A\).
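One way to make every code path respect the order is to route acquisitions through a helper that sorts the locks first. The sketch below assumes a hypothetical `RankedLock` wrapper with a numeric `rank`. A numeric rank is a total order, which is stricter than the partial order the notes require but simpler to show.

```python
import threading
from contextlib import ExitStack

class RankedLock:
    """A lock with a position in the (assumed) global order."""
    def __init__(self, rank):
        self.rank = rank
        self.lock = threading.Lock()

def acquire_in_order(stack, *locks):
    """Acquire all locks by ascending rank; the ExitStack releases them on exit."""
    for ranked in sorted(locks, key=lambda r: r.rank):
        stack.enter_context(ranked.lock)

A, B = RankedLock(1), RankedLock(2)
balance = {"a": 100, "b": 0}

def transfer(first, second, src, dst, amount):
    # The caller may name the locks in any order; acquisition is always A then B.
    with ExitStack() as stack:
        acquire_in_order(stack, first, second)
        balance[src] -= amount
        balance[dst] += amount

t1 = threading.Thread(target=transfer, args=(A, B, "a", "b", 10))
t2 = threading.Thread(target=transfer, args=(B, A, "b", "a", 5))
t1.start(); t2.start()
t1.join(); t2.join()
print(balance)  # {'a': 95, 'b': 5}
```

Without the sorting step, `t1` could hold A while waiting for B and `t2` could hold B while waiting for A, which is exactly the circular wait the ordering rules out.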
7.5. Linux example: lock dependency checking
- The lecture mentions Linux’s lockdep debugging feature.
- The kernel is too large for humans to write one global order explicitly.
- Instead, runtime checking can detect counterexamples:
- one code path acquires locks in one order,
- another path acquires them in the reverse order.
- Such a pair cannot both belong to the same valid partial order.
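The idea of detecting counterexamples at run time can be illustrated with a toy checker. This is only a sketch of the principle, not how lockdep itself is implemented. The class `OrderChecker` is hypothetical, and it catches only directly reversed pairs, not longer cycles.

```python
class OrderChecker:
    """Records 'held -> newly acquired' pairs and reports any reversed pair."""

    def __init__(self):
        self.seen = set()        # (earlier, later) lock-name pairs observed so far
        self.violations = []

    def acquire(self, held, new):
        """held: list of lock names the current code path already holds."""
        for h in held:
            if (new, h) in self.seen:        # the opposite order was seen before
                self.violations.append((h, new))
            self.seen.add((h, new))
        held.append(new)

checker = OrderChecker()

path1 = []                       # first code path: A, then B
checker.acquire(path1, "A")
checker.acquire(path1, "B")      # records A -> B

path2 = []                       # second code path: B, then A
checker.acquire(path2, "B")
checker.acquire(path2, "A")      # B -> A contradicts the recorded A -> B

print(checker.violations)  # [('B', 'A')]
```

The two paths never have to run concurrently for the report to appear. Observing both orders at any time is enough to show that no valid order contains both.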
7.6. Important detail about V operations
- Deadlock fundamentally arises from waiting.
- P operations may block; V operations do not block.
- Therefore ordering constraints matter primarily for blocking acquisition operations, not for releases.
8. “All-or-nothing” resource acquisition
8.1. Idea
- Instead of acquiring locks incrementally, require a process to request all needed resources at once.
- The request succeeds only if all are available; otherwise it gets none.
8.2. Advantage
- This eliminates the hold-and-wait condition.
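A non-blocking variant of this idea can be sketched as follows. The helper name `try_acquire_all` is an assumption. It never waits while holding a lock: if any lock is unavailable, it gives back everything it has taken so far.

```python
import threading

def try_acquire_all(locks):
    """Take every lock or none of them. Returns True on success."""
    taken = []
    for lock in locks:
        if lock.acquire(blocking=False):
            taken.append(lock)
        else:
            for held in reversed(taken):   # back out: release what we already took
                held.release()
            return False
    return True

a, b = threading.Lock(), threading.Lock()

b.acquire()                                # simulate another process holding b
first = try_acquire_all([a, b])            # fails, and must leave a free again
a_free_after_failure = not a.locked()
b.release()

second = try_acquire_all([a, b])           # now both are available
print(first, a_free_after_failure, second)  # False True True
```

A caller that fails has to retry later, so this sketch removes hold-and-wait but does not by itself guarantee progress.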
8.3. Limitation
- In many OS scenarios, a process discovers which resource it needs next only after inspecting already locked state.
- Therefore it often cannot know all required resources in advance.
8.4. Conservative workaround
- Request every resource that might be needed.
- This is safe but pessimistic.
- It harms concurrency.
8.5. Retry-loop workaround
- Lock something, inspect it, unlock, then request a larger set, check again, maybe retry.
- This may work in practice, but progress is no longer guaranteed: between releasing the lock and reacquiring the larger set of resources, the observed state may have changed, forcing the process to restart.
- Under contention, this restart can repeat indefinitely.
- So it is not a fully satisfactory solution.
9. Reader-writer synchronization
9.1. Problem setting
- Shared data structure, e.g. a database.
- Readers only inspect the structure.
- Writers modify it.
9.2. Desired access policy
9.2.1. Readers
- Multiple readers may read concurrently.
9.2.2. Writers
- A writer must exclude:
- other writers,
- all readers.
9.3. Reader protocol requirements
A reader should:
- Wait until reading is allowed.
- Record its presence as an active reader.
- Perform the read.
- Remove its presence when done.
- Potentially wake a waiting writer when the last reader leaves.
9.4. Writer protocol requirements
A writer should:
- Wait until writing is allowed.
- Record its presence as the active writer.
- Perform the write.
- Remove its presence when done.
- Wake either another writer or the waiting readers.
9.5. State variables
The lecture introduces four counters:
- AR = active readers
- AW = active writers
- WR = waiting readers
- WW = waiting writers
9.6. Semaphores needed
- One mutex to protect these counters.
- One counting semaphore okToRead.
- One counting semaphore okToWrite.
9.7. Reader-side logic
9.7.1. Entry section
- Acquire the mutex.
- If there are no active or waiting writers, the arriving reader may proceed immediately:
- increment AR,
- signal okToRead for itself.
- Otherwise:
- increment WR,
- do not signal okToRead,
- so the reader will really block later.
- Release the mutex.
- Perform P(okToRead).
- Then enter the actual read section.
9.7.2. Exit section
- Acquire the mutex.
- Decrement AR.
- If this reader was the last active reader and writers are waiting:
- signal one writer (V on okToWrite),
- update bookkeeping so that one writer becomes active (decrement WW, increment AW).
- Release the mutex.
9.8. Writer-side logic
9.8.1. Entry section
- Acquire the mutex.
- If there are no active readers and no active writers:
- the writer may proceed,
- increment AW,
- signal okToWrite for itself.
- Otherwise:
- increment WW,
- the writer will later block.
- Release the mutex.
- Perform P(okToWrite).
- Then enter the actual write section.
9.8.2. Exit section
- Acquire the mutex.
- Decrement AW.
- If there are waiting writers:
- wake one writer next.
- Otherwise, if there are waiting readers:
- wake all waiting readers, not just one.
- Release the mutex.
9.9. Why waking all readers requires a loop
- Readers are allowed to proceed concurrently.
- Therefore the writer exit code must signal okToRead once for each waiting reader.
9.10. Important subtlety
- The code sometimes “signals itself”:
- when a process discovers that it may proceed immediately,
- it records that fact by incrementing the corresponding semaphore,
- then later consumes that permission with its own P operation.
- This keeps the structure uniform: both immediate progress and blocking cases use the same synchronization point.
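The reader and writer protocols from sections 9.7 to 9.10 can be put together as a sketch in Python. This is one reading of the notes, not the lecture's own code. The function names (`start_read`, `end_read`, `start_write`, `end_write`) and the test harness are assumptions, and the waker is assumed to update AR/AW/WR/WW on behalf of the process it wakes.

```python
import threading

mutex = threading.Semaphore(1)         # protects the four counters
ok_to_read = threading.Semaphore(0)    # okToRead
ok_to_write = threading.Semaphore(0)   # okToWrite
AR = AW = WR = WW = 0

def start_read():
    global AR, WR
    with mutex:
        if AW + WW == 0:               # no active or waiting writer
            AR += 1
            ok_to_read.release()       # "signals itself"
        else:
            WR += 1                    # will really block below
    ok_to_read.acquire()               # P(okToRead)

def end_read():
    global AR, AW, WW
    with mutex:
        AR -= 1
        if AR == 0 and WW > 0:         # last reader hands over to one writer
            WW -= 1
            AW += 1
            ok_to_write.release()

def start_write():
    global AW, WW
    with mutex:
        if AR + AW == 0:               # nobody active
            AW += 1
            ok_to_write.release()      # "signals itself"
        else:
            WW += 1
    ok_to_write.acquire()              # P(okToWrite)

def end_write():
    global AR, AW, WR, WW
    with mutex:
        AW -= 1
        if WW > 0:                     # writers first
            WW -= 1
            AW += 1
            ok_to_write.release()
        else:
            while WR > 0:              # wake all waiting readers, one V each
                WR -= 1
                AR += 1
                ok_to_read.release()

# Hypothetical harness: record any overlap that the policy forbids.
check = threading.Lock()
inside = {"readers": 0, "writers": 0}
violations = []
shared = {"value": 0}

def reader():
    start_read()
    with check:
        inside["readers"] += 1
        if inside["writers"]:
            violations.append("reader overlapped a writer")
    _ = shared["value"]
    with check:
        inside["readers"] -= 1
    end_read()

def writer():
    start_write()
    with check:
        inside["writers"] += 1
        if inside["writers"] > 1 or inside["readers"]:
            violations.append("writer was not exclusive")
    shared["value"] += 1
    with check:
        inside["writers"] -= 1
    end_write()

threads = [threading.Thread(target=writer if i % 4 == 0 else reader) for i in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared["value"], violations, (AR, AW, WR, WW))
```

With 5 writers among the 20 threads, the run should end with `shared["value"] == 5`, no recorded violations, and all four counters back at zero. Because the waker increments AW when it hands over, a waiting writer implies that some reader or writer is still active, which is why the writer entry check only looks at AR and AW.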
10. Writer preference
10.1. Policy in this lecture
- Once a writer is waiting, newly arriving readers are no longer allowed to enter immediately.
- They must wait.
10.2. Motivation
- Reader-writer locks are typically used when readers are frequent and writers are less frequent but important.
- If readers were always preferred, a continuous stream of readers could starve writers forever.
- Writer preference reduces update latency.
10.3. Alternative policies
10.3.1. Reader preference
- Easier on readers.
- But may starve writers.
10.3.2. Fair reader-writer locks
- Alternate or otherwise enforce fairness between readers and writers.
11. Additional clarifications from discussion
11.1. Waiting vs scheduler preemption
- “Waiting” in the deadlock discussion means blocked / non-runnable waiting for an event.
- This is different from being merely preempted by the scheduler.
- A preempted process is still conceptually runnable.
11.2. Why writer entry does not explicitly check waiting writers
- The lecture notes that checking active readers and active writers is sufficient for correctness in the given logic.
- A question revealed that one must be careful with the precise implication:
- if a writer is waiting, then some blocking condition already existed,
- namely active readers or an active writer.
- So the active-state checks are the essential part.
11.3. Domain of some variables
- AW is effectively binary in the presented design: either 0 or 1.
- okToRead may count multiple readers.
- okToWrite only enables one writer at a time.
12. Practical takeaway of the lecture
12.1. Semaphores enable layered synchronization design
- Once the OS provides semaphores, many higher-level coordination patterns can be built systematically.
12.2. Distinguish two major uses
- Mutual exclusion for protecting shared state.
- Condition/event synchronization for waiting on state changes.
12.3. Deadlock is a design issue, not just a bug in one statement
- Incorrect acquisition order can make perfectly reasonable code deadlock.
- Preventing circular wait by lock ordering is the key practical discipline.
12.4. Reader-writer synchronization is more subtle than mutex-only synchronization
- Because readers may share access, but writers need exclusion.
- This creates richer state and more involved bookkeeping.
13. Administrative note at the end of the lecture
- The first project assignment / milestone is posted.
- Submission deadline: May 13.
- Students are encouraged to start early because:
- the project is work-intensive,
- the next assignment is released as early as May 6,
- so assignments overlap.
- Submission is through a specially named Git branch and an integrated CMS/repository workflow.
- Automatic grading exists, but human inspection also matters.
- Good code quality and style are expected; passing auto-tests alone is not enough.