MiscellaneouZ: November 2009

Saturday, November 21, 2009

Why are GDT descriptors so messed up?

Ever wondered why a GDT descriptor had such a fragmented format? Like anybody born in the 80's, I have.

Here is a 64-bit, standard, non-system, generic GDT segment descriptor:

The base address is split into two pieces (low 24 bits, high 8 bits), as is the segment limit (low 16 bits, high 4 bits). Why so?

The answer is, you guessed it, backward compatibility. And the culprit is the 80286. This processor was introduced in 1982 and was the first to support Protected Mode (PM). This PM was not exactly the one we know and use nowadays though; it was simpler, sort of a version 0.1 if you will.

The 286 manual here shows us the encoding of a standard GDT descriptor - check figure 6-3. Its size is 8 bytes, but the upper 2 bytes "must be set to 0 for compatibility with iAPX 386". Interesting; so even then, they were envisaging some PM extensions... The meaningful data is contained in the low 6 bytes, which form a non-fragmented descriptor:
- bytes 0-1: 16-bit limit
- bytes 2-4: 24-bit base
- byte 5: flags
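
To make the layout concrete, here is a small Python sketch that packs such a descriptor into its 8 bytes. The function name and the example values (base 0x123456, flags byte 0x9A) are made up for illustration; only the byte positions come from the list above.

```python
import struct

def pack_286_descriptor(base, limit, access):
    """Pack an 80286-style descriptor into its 8-byte in-memory form."""
    if not 0 <= limit <= 0xFFFF:
        raise ValueError("limit must fit in 16 bits")
    if not 0 <= base <= 0xFFFFFF:
        raise ValueError("base must fit in 24 bits")
    if not 0 <= access <= 0xFF:
        raise ValueError("flags byte must fit in 8 bits")
    return struct.pack(
        "<HHBBH",
        limit,                # bytes 0-1: 16-bit limit
        base & 0xFFFF,        # bytes 2-3: base bits 0-15
        (base >> 16) & 0xFF,  # byte 4:    base bits 16-23
        access,               # byte 5:    flags
        0,                    # bytes 6-7: must be 0 on the 286
    )

# Hypothetical example values, chosen only to make the fields easy to spot.
desc = pack_286_descriptor(base=0x123456, limit=0xFFFF, access=0x9A)
print(desc.hex())  # ffff5634129a0000
```

The fields sit back to back in the dump, with nothing to stitch together; the two trailing zero bytes are the part the 386 later put to use.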

So wait. Does it mean a segment can be at most 64 KB? And a 24-bit base means that only 16 MB of memory is physically addressable, right? - you say. Well, yes and yes. It's old-PM, remember.

The new version introduced by the 386 has its lot of improvements, among which:

- Real 32-bit addressing: an extra byte was added for the base (byte 7, the 8th position), making it 32 bits long
- The possibility to have 4 GB segments (as opposed to 64 KB...): 4 bits were added to the limit field, making it 20 bits. Since 20 bits can only address 1 MB, 4 extra attribute bits were added, including the G bit (pos.23, counting within the upper 32 bits of the descriptor). When the G(ranularity) bit is set, the limit field counts 4 KB blocks instead of bytes, i.e. it holds the index of the last addressable 4 KB block. Therefore, one sets the limit to 0xFFFFF with G=1 to have a 4 GB segment (like all modern OSes do).
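
For the curious, here is a rough Python sketch of how the 386 fields end up scattered across the 64-bit value, and of what the G bit does to the segment size. The function names are hypothetical, and the access byte 0x9A and flags nibble 0xC (G=1, D/B=1) are just the usual values for a flat ring-0 code segment, used here as an example.

```python
def encode_386_descriptor(base, limit, access, flags):
    """Build the 64-bit value of a 386-style segment descriptor."""
    if not (0 <= base <= 0xFFFFFFFF and 0 <= limit <= 0xFFFFF):
        raise ValueError("base is 32 bits, limit is 20 bits")
    if not (0 <= access <= 0xFF and 0 <= flags <= 0xF):
        raise ValueError("access is 8 bits, flags is 4 bits")
    value = limit & 0xFFFF                # bits 0-15:  limit 15..0
    value |= (base & 0xFFFFFF) << 16      # bits 16-39: base 23..0
    value |= access << 40                 # bits 40-47: flags byte of the 286 format
    value |= ((limit >> 16) & 0xF) << 48  # bits 48-51: limit 19..16 (new in the 386)
    value |= flags << 52                  # bits 52-55: AVL, reserved, D/B, G
    value |= ((base >> 24) & 0xFF) << 56  # bits 56-63: base 31..24 (new in the 386)
    return value

def segment_size(limit, g_bit):
    """Size in bytes covered by a 20-bit limit, with or without G set."""
    if g_bit:
        # The limit names the last 4 KB block; its low 12 bits are all ones.
        return ((limit << 12) | 0xFFF) + 1
    return limit + 1

flat = encode_386_descriptor(base=0, limit=0xFFFFF, access=0x9A, flags=0xC)
print(hex(flat))                     # 0xcf9a000000ffff
print(segment_size(0xFFFFF, True))   # 4294967296 (4 GB)
print(segment_size(0xFFFFF, False))  # 1048576 (1 MB)
```

Note how everything the 386 added lands in the two bytes the 286 had reserved, which is exactly where the fragmentation comes from.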

Which explains why GDT descriptors seem so messed up...

Another interesting bit among the 4 extra attribute bits is the D/B bit (pos.22). This bit indicates the default operand size of the segment, and setting it to 1 means it's 32-bit. It was of course set to 0 for the 286 "6-byte" descriptors. Just one more element that shows how the 386 was the real cornerstone, implementing the things the 286 lacked (including the paging unit), and became a standard.
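
Here is a quick Python sketch of that compatibility at work: decoding a 286-style descriptor (upper 2 bytes zero) with the 386 rules yields G=0 and D/B=0, i.e. a byte-granular 16-bit segment. The function name and both sample values are picked for illustration only.

```python
def decode_descriptor(value):
    """Split a 64-bit descriptor value into its 386-style fields."""
    return {
        "limit": (value & 0xFFFF) | (((value >> 48) & 0xF) << 16),
        "base": ((value >> 16) & 0xFFFFFF) | (((value >> 56) & 0xFF) << 24),
        "access": (value >> 40) & 0xFF,
        "g": (value >> 55) & 1,    # bit 23 of the upper dword
        "d_b": (value >> 54) & 1,  # bit 22 of the upper dword
    }

# A 286-style descriptor: the upper 2 bytes are zero.
old = decode_descriptor(0x00009A123456FFFF)
# A flat 386 code segment, as modern OSes set it up.
new = decode_descriptor(0x00CF9A000000FFFF)

print(hex(old["base"]), hex(old["limit"]), old["g"], old["d_b"])  # 0x123456 0xffff 0 0
print(hex(new["base"]), hex(new["limit"]), new["g"], new["d_b"])  # 0x0 0xfffff 1 1
```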

If you want to know more about this, check out the Wikipedia PM history section as well as the 286 manual mentioned earlier. Also, a very interesting piece of trivia about the 286 is how the inability to switch back from protected mode to real mode gave a few guys at Microsoft a very hard (and fun) time!

Friday, November 6, 2009

Detecting simple hypervisors

This was tested only on Intel VT-x, but it might work with AMD-V as well. In between a couple of blue screens, I realized there exist many simple ways to detect primitive HVM rootkits - i.e., the ones that don't implement all the code required to achieve super-stealth.

Let me give you a simple example. CPUID is a peculiar instruction; it's the only instruction allowed outside ring 0 that will unconditionally trigger a VM-exit (Vol.3B:253669, p21-2). If you single-step (trace) over CPUID in your favorite debugger, the VM-exit will happen: the CPU enters VMX-Root mode, and your VMM is called to handle the offending instruction.

The lazy coder would write something like:


// emulate CPUID
if(Reason == EXIT_REASON_CPUID) then
    pushad
    mov     eax, GuestEax
    mov     ecx, GuestEcx
    cpuid
    mov     GuestEax, eax
    mov     GuestEcx, ecx
    mov     GuestEdx, edx
    mov     GuestEbx, ebx
    popad
    ...

// update instruction pointer (CPUID is a 2-byte opcode: 0F A2)
GuestEip += 2

vmresume()

Well, that's not enough. Execution will resume after CPUID, but the single-step trap that should have fired right after CPUID never gets delivered. The trap flag is still set, so the debug exception is raised only after the instruction that follows has executed... and there you have one simple way to detect poorly coded HVMs. One way to prevent this would be to inject a vectored event (INT1) on VM-entry.
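
To spell out the reasoning, here is a toy Python model of the situation. It is not VMM code: the guest-state dictionary, the function names and the addresses are all invented for the sketch, and it only models where the guest sees its single-step trap reported.

```python
TF = 0x100     # trap flag: bit 8 of EFLAGS
CPUID_LEN = 2  # CPUID encodes as 0F A2
DB_VECTOR = 1  # #DB, the debug exception (INT1)

def handle_cpuid_exit(guest, inject_db):
    """Toy VMM handler: skip CPUID and optionally queue #DB for VM-entry."""
    guest = dict(guest)  # work on a copy
    guest["eip"] += CPUID_LEN
    if inject_db and guest["eflags"] & TF:
        guest["pending_vector"] = DB_VECTOR
    return guest

def reported_trap_eip(guest, next_insn_len):
    """EIP that the guest's debug handler sees for the single-step trap."""
    if guest.get("pending_vector") == DB_VECTOR:
        return guest["eip"]              # trap delivered right at VM-entry
    return guest["eip"] + next_insn_len  # trap fires one instruction late

guest = {"eip": 0x1000, "eflags": 0x202 | TF}  # about to single-step CPUID
expected = guest["eip"] + CPUID_LEN            # what bare metal would report
MOV_EAX_IMM32_LEN = 5                          # mov eax, 1 is B8 01 00 00 00

lazy = reported_trap_eip(handle_cpuid_exit(guest, False), MOV_EAX_IMM32_LEN)
careful = reported_trap_eip(handle_cpuid_exit(guest, True), MOV_EAX_IMM32_LEN)
print(hex(lazy), lazy != expected)        # 0x1007 True  (detected)
print(hex(careful), careful != expected)  # 0x1002 False (looks like bare metal)
```

The whole detection boils down to that last comparison: is the reported EIP the address right after CPUID, or somewhere further along?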

This piece of assembly code implements the above trick (if we may call it so):


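    call    GetCurrentProcess
    push    1
    push    eax
    call    SetProcessAffinityMask      ; pin the process to CPU 0

    push    offset seh                  ; install our SEH handler
    push    dword ptr fs:[0]
    mov     fs:[0], esp

    pushfd
    or      dword ptr [esp], 100h       ; set TF (trap flag) in EFLAGS
    popfd

    cpuid                               ; the single-step trap is due right after this

traphere:
    mov     eax, 1                      ; fallback: 1 if the handler never runs
    jmp     done

seh:
    mov     eax, [esp+0Ch]              ; 3rd handler argument: CONTEXT*
    cmp     dword ptr [eax+0B8h], traphere  ; CONTEXT.Eip == traphere?
    setne   al                          ; 1 if the trap came late (HVM suspected)
    movzx   eax, al

done:
    add     eax, 30h                    ; to ASCII: '0' = clean, '1' = detected
    push    eax
    push    esp                         ; the pushed dword doubles as a C string
    call    crt_printf

    push    eax
    call    ExitProcess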

(On SMP systems, change the thread affinity to execute the test on other CPUs.)

There are other similar ways to detect a hypervisor from user-mode - again, assuming its implementation is lacking.
