Saturday, November 21, 2009

Why are GDT descriptors so messed up?

Ever wondered why a GDT descriptor has such a fragmented format? Like anybody born in the '80s, I have.

Here is a 64-bit, standard, non-system, generic GDT segment descriptor:
[figure: layout of the 8-byte 386 segment descriptor, with the base and limit fields split across non-contiguous bytes]

The base address is fragmented into 2 pieces (low 24 bits, high 8 bits), as is the segment limit (low 16 bits, high 8 bits). Why so?
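
To make the fragmentation concrete, here is a minimal sketch in C (mine, not from any reference source) that packs a 32-bit base and a 20-bit limit into the 8-byte descriptor; make_descriptor and its parameter names are hypothetical:

```c
#include <stdint.h>

/* Minimal sketch: scatter a 32-bit base and a 20-bit limit across the
 * 8 bytes of a 386-style descriptor. 'access' is the access byte,
 * 'attrs' the upper 4 attribute bits (AVL, L, D/B, G, low to high). */
static uint64_t make_descriptor(uint32_t base, uint32_t limit,
                                uint8_t access, uint8_t attrs)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);              /* limit bits 0-15  -> bytes 0-1   */
    d |= (uint64_t)(base & 0xFFFFFF)     << 16;   /* base bits 0-23   -> bytes 2-4   */
    d |= (uint64_t)access                << 40;   /* access byte      -> byte 5      */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;   /* limit bits 16-19 -> byte 6, low */
    d |= (uint64_t)(attrs & 0xF)         << 52;   /* attributes       -> byte 6, high*/
    d |= (uint64_t)((base >> 24) & 0xFF) << 56;   /* base bits 24-31  -> byte 7      */
    return d;
}
```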

The answer is, you guessed it, backward compatibility. And the culprit is the 80286. This processor was introduced in 1982 and was the first to support Protected Mode (PM). That PM was not exactly the one we know and use nowadays, though; it was simpler, sort of a version 0.1 if you will.

The 286 manual here shows us the encoding of a standard GDT descriptor - check figure 6-3. Its size is 8 bytes, but the upper 2 bytes "must be set to 0 for compatibility with iAPX 386". Interesting; so even then, they were envisaging some PM extensions... The meaningful data is contained in the low 6 bytes, forming a non-fragmented descriptor:
- bytes 0-1: 16-bit limit
- bytes 2-4: 24-bit base
- byte 5: flags
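
For contrast, the 286 layout is contiguous enough to map onto a plain C struct (illustrative only; the field names are mine, and on typical ABIs this layout happens to have no padding):

```c
#include <stdint.h>

/* The 286 descriptor: every field in one piece, nothing scattered. */
struct desc286 {
    uint16_t limit;     /* bytes 0-1: 16-bit limit                  */
    uint16_t base_low;  /* bytes 2-3: base bits 0-15                */
    uint8_t  base_high; /* byte 4:    base bits 16-23               */
    uint8_t  flags;     /* byte 5:    access byte                   */
    uint16_t reserved;  /* bytes 6-7: "must be set to 0" (iAPX 386) */
};
```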

So wait. Does that mean a segment can be at most 64 KB? And a 24-bit base means that only 16 MB of memory are physically addressable, right? - you say. Well, yes and yes. It's old PM, remember.

The new version introduced by the 386 brought its share of improvements, among which:
  • Real 32-bit addressing: an extra byte for the base (the 8th byte) was added, making the base 32 bits long
  • The possibility to have 4 GB segments (as opposed to 64 KB...): 4 bits were added to the limit field, making it 20 bits. Since 20 bits can only address 1 MB, 4 extra attribute bits were added, including the G bit (pos. 23). Setting the G(ranularity) bit means the limit field counts 4 KB blocks instead of bytes, i.e. it indicates the last addressable 4 KB block. Therefore, one sets the limit to 0xFFFFF with G=1 to get a 4 GB segment, as all modern OSes do (see the quick check below).
Which explains why GDT descriptors seem so messed up...
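
As a quick sanity check of the granularity arithmetic (a standalone snippet, nothing more):

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* With G=1 the limit counts 4 KB blocks, so the segment spans
     * (limit + 1) * 4096 bytes. */
    uint32_t limit = 0xFFFFF;
    uint64_t span = ((uint64_t)limit + 1) * 4096;
    printf("0x%X blocks of 4 KB = %llu bytes (4 GB)\n",
           limit + 1, (unsigned long long)span);
    return 0;
}
```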

Another interesting bit introduced in the extra 4 bits of the attributes field is the D/B bit (pos. 22). This bit indicates the default operand size of the segment; setting it to 1 means 32-bit. It was of course set to 0 in the 286 "6-byte" descriptors. One more element that shows how the 386 was the real cornerstone: it implemented the things the 286 lacked (including the paging unit) and became a standard.
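
In code terms, the D/B bit sits at bit 22 of the descriptor's high dword, i.e. bit 54 of the full 8-byte value; a tiny hypothetical helper (same caveats as above):

```c
#include <stdint.h>

#define DESC_DB (1ULL << 54) /* D/B: bit 22 of the high dword */

/* Mark a descriptor as a 32-bit segment (default operand size = 32-bit). */
static uint64_t set_32bit(uint64_t desc)
{
    return desc | DESC_DB;
}
```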

If you want to know more about this, check out the Wikipedia PM history section as well as the 286 manual mentioned earlier. Also, a very interesting piece of trivia about the 286 is how its inability to switch back from protected mode to real mode gave a few guys at Microsoft a very hard (and fun) time!
