Hollis Blanchard <hollisb@us.ibm.com>
15 Apr 2008

Various notes on the implementation of KVM for PowerPC 440:

To enforce isolation, host userspace, guest kernel, and guest userspace all
run at user privilege level. Only the host kernel runs in supervisor mode.
Executing privileged instructions in the guest traps into KVM (in the host
kernel), where we decode and emulate them. Through this technique, unmodified
440 Linux kernels can be run (slowly) as guests. Future performance work will
focus on reducing the overhead and frequency of these traps.

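The decode-and-emulate step can be sketched as follows. This is a minimal
illustration, not KVM's actual emulation code: it assumes the standard PowerPC
X-form field layout, and the struct guest_regs type and the function
emulate_priv_inst() are hypothetical names introduced for this example. Only
"mtspr SPRG0" is recognized.

```c
#include <stdint.h>

/* Hypothetical guest register state; field names are illustrative only. */
struct guest_regs {
	uint32_t gpr[32];
	uint32_t sprg0;
};

#define OP_31       31   /* primary opcode shared by many X-form instructions */
#define XOP_MTSPR   467  /* extended opcode of mtspr */
#define SPRN_SPRG0  272  /* SPR number of SPRG0 */

static inline uint32_t get_op(uint32_t inst)  { return inst >> 26; }
static inline uint32_t get_xop(uint32_t inst) { return (inst >> 1) & 0x3ff; }
static inline uint32_t get_rs(uint32_t inst)  { return (inst >> 21) & 0x1f; }

/* The 10-bit SPR number is encoded with its two 5-bit halves swapped. */
static inline uint32_t get_sprn(uint32_t inst)
{
	return ((inst >> 16) & 0x1f) | ((inst >> 6) & 0x3e0);
}

/* Returns 1 if the instruction was emulated, 0 if it was not recognized. */
static int emulate_priv_inst(struct guest_regs *regs, uint32_t inst)
{
	if (get_op(inst) != OP_31 || get_xop(inst) != XOP_MTSPR)
		return 0;
	if (get_sprn(inst) != SPRN_SPRG0)
		return 0;

	/* Apply the write to the guest's virtual SPRG0, not the real one. */
	regs->sprg0 = regs->gpr[get_rs(inst)];
	return 1;
}
```

For instance, the word 0x7C7043A6 encodes "mtspr SPRG0, r3", so passing it to
emulate_priv_inst() copies the guest's r3 into the virtual SPRG0.
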
The usual code flow starts with userspace invoking a "run" ioctl, which
causes KVM to switch into guest context. We use IVPR to hijack the host
interrupt vectors while running the guest, which allows us to direct all
interrupts to kvmppc_handle_interrupt(). At this point, we can do one of
three things:
- handle the interrupt completely (e.g. emulate "mtspr SPRG0"), or
- let the host interrupt handler run (e.g. when the decrementer fires), or
- return to host userspace (e.g. when the guest performs device MMIO)

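The three outcomes listed above can be pictured as a small dispatch function.
This is only a sketch: the enum names and classify_exit() are hypothetical and
do not reproduce the body of kvmppc_handle_interrupt(), which has to
distinguish many more cases.

```c
/* Hypothetical exit causes, one per example given in the text. */
enum exit_cause {
	CAUSE_PRIV_INST,    /* guest executed a privileged instruction */
	CAUSE_DECREMENTER,  /* the decrementer fired while the guest ran */
	CAUSE_MMIO,         /* guest accessed emulated device memory */
};

/* What to do once the interrupt has been examined. */
enum resume_action {
	RESUME_GUEST,       /* handled completely; re-enter the guest */
	RUN_HOST_HANDLER,   /* let the host's own interrupt handler run */
	RETURN_TO_USER,     /* finish the "run" ioctl; userspace takes over */
};

static enum resume_action classify_exit(enum exit_cause cause)
{
	switch (cause) {
	case CAUSE_PRIV_INST:
		return RESUME_GUEST;
	case CAUSE_DECREMENTER:
		return RUN_HOST_HANDLER;
	case CAUSE_MMIO:
		return RETURN_TO_USER;
	}
	/* Assumption: anything unrecognized is punted to userspace. */
	return RETURN_TO_USER;
}
```
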
Address spaces: We take advantage of the fact that Linux doesn't use the AS=1
address space (in host or guest), which gives us virtual address space to use
for guest mappings. While the guest is running, the host kernel remains mapped
in AS=0, but the guest can only use AS=1 mappings.

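On Book E cores such as the 440, the MSR[IS] and MSR[DS] bits select the
address space used for instruction fetches and data accesses. One plausible
way to confine the guest to AS=1, shown here purely as an assumption about the
mechanism, is to force both bits (together with MSR[PR] for user privilege) in
the MSR value the guest actually runs with. The function guest_run_msr() is a
hypothetical name, and the bit positions should be checked against the 440
user's manual.

```c
#include <stdint.h>

/* Book E MSR bits (positions assumed; verify against the 440 manual). */
#define MSR_PR (1u << 14)  /* problem state, i.e. user privilege */
#define MSR_IS (1u << 5)   /* instruction address space select */
#define MSR_DS (1u << 4)   /* data address space select */

/*
 * Derive the MSR the hardware uses while the guest runs from the MSR the
 * guest believes it has. The guest's own view is left untouched elsewhere;
 * only the real value is forced to user mode and AS=1.
 */
static uint32_t guest_run_msr(uint32_t guest_msr)
{
	return guest_msr | MSR_PR | MSR_IS | MSR_DS;
}
```
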
TLB entries: The TLB entries covering the host linear mapping remain
present while running the guest. This reduces the overhead of lightweight
exits, which are handled by KVM running in the host kernel. We keep three
copies of the TLB:
 - guest TLB: contents of the TLB as the guest sees it
 - shadow TLB: the TLB that is actually in hardware while the guest is running
 - host TLB: to restore TLB state when context switching guest -> host
When a TLB miss occurs because a mapping was not present in the shadow TLB,
but was present in the guest TLB, KVM handles the fault without invoking the
guest. Large guest pages are backed by multiple 4KB shadow pages through this
mechanism.

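The shadow TLB miss path described above can be sketched like this. The
structures and function names are hypothetical and do not match the real 440
TLB word layout. The sketch assumes guest pages of at least 4KB, assumes that
a miss with no matching guest TLB entry is delivered to the guest, and stops
at the guest physical address; translating that to a host physical address is
left out.

```c
#include <stddef.h>
#include <stdint.h>

#define SHADOW_PAGE_SIZE 4096u

/* Hypothetical guest TLB entry, as the guest wrote it. */
struct guest_tlbe {
	uint32_t eaddr;   /* effective base address, aligned to size */
	uint32_t gpaddr;  /* guest physical base address, aligned to size */
	uint32_t size;    /* page size in bytes, a power of two >= 4KB */
	int valid;
};

/* Hypothetical 4KB shadow entry to be installed in hardware. */
struct shadow_tlbe {
	uint32_t eaddr;
	uint32_t gpaddr;
};

static const struct guest_tlbe *
guest_tlb_lookup(const struct guest_tlbe *tlb, size_t n, uint32_t ea)
{
	for (size_t i = 0; i < n; i++) {
		if (tlb[i].valid && (ea & ~(tlb[i].size - 1)) == tlb[i].eaddr)
			return &tlb[i];
	}
	return NULL;
}

/*
 * Returns 1 and fills *out when KVM can satisfy the miss by itself,
 * or 0 when the guest TLB has no mapping for the address.
 */
static int handle_shadow_miss(const struct guest_tlbe *tlb, size_t n,
			      uint32_t ea, struct shadow_tlbe *out)
{
	const struct guest_tlbe *g = guest_tlb_lookup(tlb, n, ea);
	uint32_t offset;

	if (!g)
		return 0;

	/* Pick the one 4KB page inside the (possibly larger) guest page. */
	offset = (ea - g->eaddr) & ~(SHADOW_PAGE_SIZE - 1);
	out->eaddr = g->eaddr + offset;
	out->gpaddr = g->gpaddr + offset;
	return 1;
}
```

With a single 16MB guest entry, each miss at a new 4KB-aligned address inside
it produces another shadow entry, which is how one large guest page ends up
backed by many 4KB shadow pages.
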
IO: MMIO and DCR accesses are emulated by userspace. We use virtio for network
and block IO, so those drivers must be enabled in the guest. It's possible
that some QEMU device emulation (e.g. e1000 or rtl8139) may also work with
little effort.