There are many benefits of virtual memory:
- Simplify software: addresses in one process not affected by other processes
- Fault isolation: contain bugs and improve security
- Today's topic: Better performance and utilization:
- Swapping, demand paging, copy-on-write, shared libraries
Swapping
- The memory consumption of all running apps sometimes exceeds the amount of physical memory.
- Swapping: OS moves some pages from memory to disk and vice versa. (Disk is much larger/cheaper than memory)
- When not to rely on swapping?
- Not a solution if your program simply requires a lot of memory to run.
- Better to optimize your program to run with less memory.
- When is swapping beneficial?
- Applications that are IDLE, but take up space in memory
- Parts of application code/data that are NOT in use, but still placed in memory
- Memory leaks in applications
- The key idea: if the pages moved to the disk are RARELY used again, then swapping
improves memory utilization with little performance degradation.
- How is swapping implemented?
- The hardware must inform the kernel when an application is trying to touch a page not
currently residing in memory. Hardware notifies kernel via a Page Fault.
- Kernel allocates a page in memory and creates the absent va to pa mapping.
- H/w transparently resumes execution by re-executing the faulting instruction.
- The following pseudocode gives an outline of the page fault handler
//the kernel needs to keep track of the whereabouts of a swapped-out page on disk
typedef struct {
    uintptr_t va;
    bool swapped;
    off_t swaploc;    //location of the page in the swap area
} vpageinfo_t;

page_fault_handler(uintptr_t va, int cpl, bool write)
{
    ...
    vp = vpage_info(current, va);
    if (vp was swapped out) {
        pa = eviction_policy();    //choose a victim physical page
        block all victim processes whose page maps contain the victim page    //not runnable
        write the victim physical page to the swap area
        clear the present bit of the victim page's mapping
        set the victim processes to be runnable again
        read the swapped page from vp->swaploc into pa
        pgdir_set(current->pgdir, vp->va, pa, PTE_P | ...);
        run(current);    //re-executes the faulting instruction
    }
}
- Swapping significantly slows down the process.
- Thrashing: thrashing happens when there is high swapping activity. This occurs when either
there is a genuine lack of memory to fit the working set of the active processes, or the eviction policy
repeatedly swaps out the "wrong" pages.
Page Eviction Policies
Which page should the OS choose to swap out to disk?
A strawman policy: First-in, First-out. See example in Saltzer, Kaashoek (6-349)
Reference string: 0 1 2 3 0 1 4 0 1 2 3 4
physical page 1 :*0 0 0 *3 3 3 *4 4 4 4 4 4
physical page 2 : - *1 1 1 *0 0 0 0 0 *2 2 2
physical page 3 : - - *2 2 2 *1 1 1 1 1 *3 3
Now we add one more physical page,
Reference string: 0 1 2 3 0 1 4 0 1 2 3 4
physical page 1 :*0 0 0 0 0 0 *4 4 4 4 *3 3
physical page 2 : - *1 1 1 1 1 1 *0 0 0 0 *4
physical page 3 : - - *2 2 2 2 2 2 *1 1 1 1
physical page 4 : - - - *3 3 3 3 3 3 *2 2 2
FIFO eviction policy suffers from Belady's anomaly: Increasing physical memory results in increased page faults!
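The fault counts above can be checked mechanically. Below is a small, self-contained C simulation (not part of the notes; fifo_faults is just an illustrative name) that replays the reference string under FIFO and prints 9 faults with 3 frames and 10 faults with 4 frames.

#include <stdio.h>

/* Count page faults for a FIFO replacement policy over a reference string. */
static int fifo_faults(const int *refs, int n, int nframes)
{
    int frames[8];                     /* enough frames for this example */
    int next = 0, used = 0, faults = 0;

    for (int i = 0; i < n; i++) {
        int hit = 0;
        for (int j = 0; j < used; j++)
            if (frames[j] == refs[i]) { hit = 1; break; }
        if (hit)
            continue;                  /* page already resident */
        faults++;
        if (used < nframes) {
            frames[used++] = refs[i];  /* still a free frame */
        } else {
            frames[next] = refs[i];    /* evict the oldest resident page */
            next = (next + 1) % nframes;
        }
    }
    return faults;
}

int main(void)
{
    int refs[] = {0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4};
    int n = (int)(sizeof refs / sizeof refs[0]);
    printf("3 frames: %d faults\n", fifo_faults(refs, n, 3));   /* 9  */
    printf("4 frames: %d faults\n", fifo_faults(refs, n, 4));   /* 10 */
    return 0;
}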
What's the optimal policy? Evict the page that will be used furthest in the future.
Reference string: 0 1 2 3 0 1 4 0 1 2 3 4
physical page 1 :*0 0 0 0 0 0 0 0 0 *2 *3 3
physical page 2 : - *1 1 1 1 1 1 1 1 1 1 1
physical page 3 : - - *2 *3 3 3 *4 4 4 4 4 4
Why is it optimal? Intuitively, evicting the page whose next use is furthest away postpones the next unavoidable fault as long as possible; any other choice can only cause a fault sooner.
OPT is not a practical algorithm since we cannot predict the future.
LRU (least-recently used) is a popular approximation
Reference string: 0 1 2 3 0 1 4 0 1 2 3 4
physical page 1 :*0 0 0 *3 3 3 *4 4 4 *2 2 2
physical page 2 : - *1 1 1 *0 0 0 0 0 0 *3 3
physical page 3 : - - *2 2 2 *1 1 1 1 1 1 *4
Does LRU or OPT suffer from Belady's anomaly?
- No. Both maintain the subset (inclusion) property: at any instant in time, the set of pages kept in
the 3-page physical memory is a subset of the pages kept in the 4-page physical memory.
- Does the LFU (least-frequently-used) policy keep the subset property?
Keeping track of every memory reference in kernel software is too expensive. How can the kernel approximate LRU?
- H/W sets the referenced bit in the page table entry for a page upon using the entry
- Kernel periodically clears the reference bit, hence a non-zero reference bit indicates recent use.
The clock page-removal algorithm
- Sweep the clock arm clockwise; if the page encountered has its reference bit set, clear the bit and move on.
- The first page encountered whose reference bit is already clear is replaced (a sketch follows below).
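A minimal sketch of the clock eviction loop, assuming a hypothetical frames[] array in which the kernel mirrors each frame's hardware-set reference bit (NFRAMES, frame_t, and clock_evict are illustrative names, not from any particular kernel):

#include <stdbool.h>

#define NFRAMES 1024               /* illustrative number of physical frames */

typedef struct {
    bool referenced;               /* mirrors the reference bit in the PTE */
    /* ... owning process, mapped virtual address, etc. ... */
} frame_t;

static frame_t frames[NFRAMES];
static int clock_arm = 0;          /* current position of the clock arm */

/* Return the index of the frame to evict. */
int clock_evict(void)
{
    for (;;) {
        frame_t *f = &frames[clock_arm];
        int i = clock_arm;
        clock_arm = (clock_arm + 1) % NFRAMES;   /* advance the arm */
        if (!f->referenced)
            return i;              /* not used since the last sweep: evict it */
        f->referenced = false;     /* recently used: clear the bit, give a second chance */
    }
}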
Demand paging
- Often an application does not need all of its code/data pages
- E.g. many features of Adobe Photoshop are not used all the time.
- Demand paging: move an application's code/data pages into memory only after the application attempts to use them.
- Implementation: set the page's swap location to point to its location in the binary executable on disk (a sketch follows at the end of this list).
- When a process needs certain code that hasn't been loaded, it causes a page fault and the OS swaps in the portion from disk.
- Benefits: faster startup of huge applications!
- Pre-paging: pre-paging is the opposite of demand paging. Here, the OS speculates which pages an application
might need and brings them in before the app attempts to access them.
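A sketch of the demand-paging branch of the page fault handler, in the same pseudocode style as before; the fields backed_by_executable and file_off and the helpers alloc_phys_page and read_from_file are assumed bookkeeping, not real kernel APIs.

if (vp is backed by the executable and not yet loaded) {
    pa = alloc_phys_page();        //may itself invoke eviction_policy()
    read one page from the executable at offset vp->file_off into pa
    pgdir_set(current->pgdir, vp->va, pa, PTE_P | ...);
    run(current);                  //re-executes the faulting instruction
}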
Shared libraries
Most programs in the system use (link to) the C library libc, or other common libraries like KDE, X etc.
There's no need to store multiple copies of the same library, since these libraries mostly consist of
code and are rarely modified. But how do we ensure applications cannot change a shared library's code/data?
Solution: store one copy of the library in memory and map it into applications' address spaces as read-only (see the sketch below).
What parts of the C library cannot be shared?
- Writable data: writable variables must be in their own (private) pages for every process, so those pages cannot be shared.
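A minimal user-level illustration of read-only sharing (the library path is illustrative): every process that maps the same file with PROT_READ and never writes it can be backed by the same physical pages, which is essentially what the dynamic loader does for a shared library's code.

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/usr/lib/libc.so.6", O_RDONLY);    /* illustrative path */
    if (fd < 0)
        return 1;
    struct stat st;
    fstat(fd, &st);
    /* read-only mapping: the kernel can share these pages across processes */
    void *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED)
        return 1;
    printf("mapped %lld bytes read-only at %p\n", (long long)st.st_size, p);
    return 0;
}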
Copy-on-write fork
In Lab 1, we forked a mini-process by copying only the parent's stack to the child. This does not
provide isolation between processes, since the parent and child can easily trash each other's global/heap data.
In Lab 3, you are asked to implement fork() again, with correct isolation this time.
One way to do a fork is Eager Copying, where we copy the address space immediately.
Eager copying is wasteful because
- what's a common syscall right after fork()? exec().
- Exec replaces the existing address space content with a brand new one from disk. Thus, all the copying is wasted!!
How about lazy copying? I.e., we delay copying the content until it is required for the child's execution.
- Clear the PTE_P flag in the child's page table entries and copy the page from the parent's address space upon the child's page fault.
Will it work?
- The parent is allowed to run as soon as fork() finishes. What if the parent modifies the contents of its pages?
- This violates fork() semantics, as the child would then copy a modified page from the parent.
- Thus, we need to mark the parent's address space as read-only too.
Copy-on-write fork makes the child's address space a (shared) copy of the parent's and marks both as read-only. The kernel
then continues execution with either the parent or the child.
How to handle page fault?
page_fault_handler(uintptr_t va, int cpl, bool write)
{
    vp = vpage_info(current, va);
    if (vp is copy_on_write && write) {
        copy the faulting page's contents into a new physical page pa
        //remap va to the new, private page, now writable
        pgdir_set(current->pgdir, va, pa, PTE_P | PTE_W | ...);
    }
}
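For completeness, here is a sketch of the fork-time half of copy-on-write, again in pseudocode; pgdir_set is as above, and the per-page reference count is extra bookkeeping the kernel needs so a shared physical page is freed only once.

cow_fork(parent, child)
{
    for (each present mapping (va -> pa, perm) in parent->pgdir) {
        pgdir_set(child->pgdir,  va, pa, perm & ~PTE_W);   //child shares pa, read-only
        pgdir_set(parent->pgdir, va, pa, perm & ~PTE_W);   //parent's mapping also becomes read-only
        mark va as copy_on_write in both processes' vpage_info
        increment the reference count on pa
    }
}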
You can play many of these virtual memory tricks at user level too, using the right syscalls:
- mmap: Maps files into process memory space. OS does demand paging, so the file content is loaded into memory only when you read it.
- mprotect: allows a process to make parts of its own memory space inaccessible, read-only etc.
- sigaction: registers an application-specific signal handler. The OS sends the process a SIGSEGV signal when it does not know how to handle a page fault,
so the application's signal handler can act as an application-level page fault handler (see the sketch below).
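The three syscalls combine into a user-level "page fault handler". A self-contained sketch: the region is mapped PROT_NONE so the first touch of each page raises SIGSEGV; the handler grants access with mprotect and the faulting access is re-executed. (mprotect is not formally listed as async-signal-safe, but this is the standard user-level paging trick.)

#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static long pagesize;

static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    /* round the faulting address down to a page boundary */
    void *page = (void *)((uintptr_t)info->si_addr & ~(uintptr_t)(pagesize - 1));
    /* "handle" the fault: make the page readable and writable */
    if (mprotect(page, pagesize, PROT_READ | PROT_WRITE) < 0)
        _exit(1);
}

int main(void)
{
    pagesize = sysconf(_SC_PAGESIZE);

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    /* map 4 pages with no permissions; any access will fault */
    char *buf = mmap(NULL, 4 * pagesize, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return 1;

    buf[0] = 'x';                  /* SIGSEGV -> handler grants access -> write retried */
    printf("buf[0] = %c\n", buf[0]);
    return 0;
}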