Voodoo Bug

This has been fixed: https://lists.cs.columbia.edu/pipermail/kvmarm/2012-November/004172.html

What is the voodoo bug?

The voodoo bug is a bug in KVM/ARM or related software that causes random kernel panics in VMs when the host machine is swapping memory.

How to introduce the voodoo bug?

Assuming you are running on a host with 2GB of physical memory, do the following:

Check out the kvm-arm-master branch from git://github.com/virtualopensystems/linux-kvm-arm.git
Find the line in arch/arm/kvm/mmu.c that says "kvm_release_pfn_dirty(pfn);" and change it to "kvm_release_pfn_clean(pfn);"
Compile this kernel and fire up your favorite KVM/ARM system using that kernel
Enable swapping on the host
$ swapon /dev/sdX # (or create a loopback device to swapon)
Create a VM with 512 MB of RAM.
Start the VM, and inside the VM:
1. $ git clone git://github.com/chazy/mtest.git
2. $ cd mtest
3. $ git checkout guest-test
4. $ make
5. $ ./mtest 450
Back in the host, do:
1. $ mkdir /mnt/ramfs
2. $ mount -t ramfs none /mnt/ramfs
3. $ dd if=/dev/zero of=/mnt/ramfs/foo bs=1M count=1200
4. $ git clone git://github.com/chazy/mtest.git
5. $ cd mtest
6. $ git checkout guest-test
7. $ make
8. $ ./mtest 600

Now wait for a little bit, and the guest kernel will go boom!

What has been tried?

We have tried all of these things with no progress:

Compile host and guest kernel as non-smp kernels
Disable kernel preemption on both host and guest
Disable highmem on the host
Set stage2 translations to make memory non-cacheable
Remove logic that frees stage2 page tables
Flush caches and TLBs on every world switch
Manually invalidate TLBs using an IPI instead of relying on inner-shareable invalidation
Call set_page_dirty_lock instead of kvm_set_pfn_dirty for writable pages
Run without VGIC and arch. timers support

Other interesting observations

The bug only happens when the host begins to swap

Christoffer traced through a number of the guest kernel crashes. In one example, assert_raw_spin_locked(&task_rq(p)->lock) failed in resched_task(p), even though the call stack clearly showed that the code had gone through a function that locks the runqueue lock. This would indicate that a write is somehow lost (a write that coincidentally happened from a strex). Other simple null pointer exceptions are also typically observed, for example during linked list traversals.

The bug only happens in kernel space. The mtest program loops over around 450MB and reads/writes every single word to test for consistency, and we have never seen this fail. The bug is somehow memory related, yet we always see it in the kernel, which accesses only roughly 10MB. So what does the kernel do differently from user space to provoke this bug?
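To make the user-space side of that comparison concrete, here is a minimal sketch of a word-by-word consistency test of the kind described above. It is not the actual mtest source: the function names (expected_word, fill_and_verify, run_memory_test) and the pattern it writes are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pattern: mix the word index with the pass number so that
 * every word holds a distinct, predictable value on each pass. */
uintptr_t expected_word(size_t index, unsigned pass)
{
    return (uintptr_t)index ^ ((uintptr_t)pass * 0x9E3779B9u);
}

/* Write the pattern to every word, then read every word back.
 * Returns the number of words whose contents did not match. */
size_t fill_and_verify(uintptr_t *buf, size_t nwords, unsigned pass)
{
    volatile uintptr_t *p = buf;   /* keep the compiler from eliding accesses */
    size_t i, errors = 0;

    assert(buf != NULL);
    for (i = 0; i < nwords; i++)
        p[i] = expected_word(i, pass);
    for (i = 0; i < nwords; i++)
        if (p[i] != expected_word(i, pass))
            errors++;
    return errors;
}

/* Allocate `mbytes` MB and run `passes` write/verify passes over it.
 * Returns the total number of mismatches, or (size_t)-1 if the
 * allocation fails. */
size_t run_memory_test(size_t mbytes, unsigned passes)
{
    size_t nwords = mbytes * 1024 * 1024 / sizeof(uintptr_t);
    uintptr_t *buf = malloc(nwords * sizeof(*buf));
    size_t errors = 0;
    unsigned pass;

    if (buf == NULL)
        return (size_t)-1;
    for (pass = 0; pass < passes; pass++)
        errors += fill_and_verify(buf, nwords, pass);
    free(buf);
    return errors;
}
```

Under these assumptions, calling run_memory_test(450, n) from a small main() would roughly mimic "./mtest 450". A nonzero return value would mean a lost or corrupted write was caught in user space, which is exactly what has never been observed.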

Example guest kernel crashes

(These do not necessarily indicate that the bug looks different without preemption; it is more random than that.)
