MmGetPhysicalAddress is a kernel-exported API that allows converting a virtual address to a physical one.
Quick reminder: Windows runs in protected mode (or long mode on x64) with paging enabled, with or without Physical Address Extension (PAE). This post focuses on a 32-bit kernel with PAE disabled, for clarity.
The PDE/PTE structures for 4KB pages are (from the Intel manual):

The PDE structure for 4MB pages is:

The conversion to a physical address is done internally by the hardware. No x86 instruction exposes this translation to software, and none performs the reverse (physical-to-virtual) lookup either. When the operating system uses paging, the key to the conversion lies in CR3: this register contains the physical address of the page directory, the table of page-directory entries (PDEs).
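To make the hardware side concrete, here is a minimal sketch of the walk the MMU performs for 32-bit non-PAE paging, following the entry layout in the Intel manual. The names `mmu_walk` and `read_phys32` are made up for this illustration, and the 4MB branch assumes that page-size extensions are enabled. The point is that every lookup uses a physical address, which is exactly what kernel code cannot dereference directly.

```python
PRESENT = 0x01      # bit 0 of a PDE/PTE: the entry is valid
LARGE_PAGE = 0x80   # bit 7 of a PDE: the entry maps a 4MB page directly


def mmu_walk(cr3, va, read_phys32):
    """Translate va the way the MMU would; return None when not present.

    read_phys32(pa) is a hypothetical callback standing in for the CPU's
    direct access to physical memory, which software does not have.
    """
    # CR3 gives the physical base of the page directory; index it with VA[31:22].
    pde = read_phys32((cr3 & 0xFFFFF000) + ((va >> 22) & 0x3FF) * 4)
    if not pde & PRESENT:
        return None
    if pde & LARGE_PAGE:
        # 4MB page: the PDE holds the page base, the low 22 bits are the offset.
        return (pde & 0xFFC00000) | (va & 0x003FFFFF)
    # 4KB page: the PDE holds the physical base of a page table, indexed by VA[21:12].
    pte = read_phys32((pde & 0xFFFFF000) + ((va >> 12) & 0x3FF) * 4)
    if not pte & PRESENT:
        return None
    return (pte & 0xFFFFF000) | (va & 0x00000FFF)
```

Feeding it a dictionary that plays the role of physical memory is enough to try it out, for instance `mmu_walk(0x39000, va, lambda pa: phys.get(pa, 0))`.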
In order to do a virtual-to-physical conversion programmatically, one needs to know where the page translation tables are located in virtual memory. When created, these pages are referenced only by their physical addresses, stored in the page-directory or page-table entries (see pictures above). Windows (and most likely other x86 OSes) solves this chicken-and-egg problem by reserving a range in the kernel address space for all the lowest-level page-description entries (PTEs for 4KB pages, PDEs for 4MB pages).
Let's examine how MmGetPhysicalAddress is implemented in the simplest version of a Windows XP SP3 kernel (32-bit, PAE disabled):
.text:0042E046 ; unsigned __int64 __stdcall MmGetPhysicalAddress(unsigned int BaseAddress)
.text:0042E046
.text:0042E046 BaseAddress     = dword ptr  8
.text:0042E046
.text:0042E046                 mov     edi, edi
.text:0042E048                 push    ebp
.text:0042E049                 mov     ebp, esp
.text:0042E04B                 push    esi
.text:0042E04C                 mov     esi, [ebp+BaseAddress]
.text:0042E04F                 mov     eax, esi
.text:0042E051                 shr     eax, 14h
.text:0042E054                 and     eax, 0FFCh
.text:0042E059                 mov     ecx, [eax-3FD00000h]
.text:0042E05F                 mov     eax, ecx
.text:0042E061                 and     ax, 81h
.text:0042E065                 cmp     al, 81h ;present? page size?
.text:0042E067                 jnz     short 4Kbpage
.text:0042E067
.text:0042E069                 mov     eax, esi ;4Mb page
.text:0042E06B                 shr     eax, 0Ch
.text:0042E06E                 and     eax, 3FFh
.text:0042E073                 shr     ecx, 0Ch
.text:0042E076                 add     eax, ecx
.text:0042E076
.text:0042E078
.text:0042E078 convert:
.text:0042E078                 xor     ecx, ecx
.text:0042E07A                 shld    ecx, eax, 0Ch
.text:0042E07E                 shl     eax, 0Ch
.text:0042E081                 and     esi, 0FFFh
.text:0042E087                 add     eax, esi
.text:0042E089                 mov     edx, ecx ;0
.text:0042E089
.text:0042E08B
.text:0042E08B done:
.text:0042E08B                 pop     esi
.text:0042E08C                 pop     ebp
.text:0042E08D                 retn    4
.text:0042E090
.text:0042E090 4Kbpage:
.text:0042E090                 test    cl, 1 ;present?
.text:0042E093                 jz      short error
.text:0042E093
.text:0042E095                 mov     eax, esi
.text:0042E097                 shr     eax, 0Ah
.text:0042E09A                 and     eax, 3FFFFCh
.text:0042E09F                 sub     eax, 40000000h
.text:0042E0A4                 mov     eax, [eax]
.text:0042E0A6                 test    al, 1 ;PTE present?
.text:0042E0A8                 jz      short error
.text:0042E0A8
.text:0042E0AA                 shr     eax, 0Ch
.text:0042E0AD                 jmp     short convert
.text:0042E0AF
.text:0042E0AF error:
.text:0042E0AF                 xor     eax, eax
.text:0042E0B1                 xor     edx, edx
.text:0042E0B3                 jmp     short done
.text:0042E0B3
.text:0042E0B3 _MmGetPhysicalAddress@4 endp
Remember the VA is decomposed into 3 or 2 parts:
- 3 parts for 4KB pages: [10 bits=PDE index / 10 bits=PTE index / 12 bits=Page offset]
- 2 parts for 4MB pages: [10 bits=PDE index / 22 bits=Page offset]
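The two decompositions can be written down as a small sketch. The helper names `split_4k` and `split_4m` are invented for this example; only the bit widths come from the text above.

```python
def split_4k(va):
    """Return (PDE index, PTE index, page offset) for a 4KB page."""
    return (va >> 22) & 0x3FF, (va >> 12) & 0x3FF, va & 0xFFF


def split_4m(va):
    """Return (PDE index, page offset) for a 4MB page."""
    return (va >> 22) & 0x3FF, va & 0x3FFFFF


# Same wording as the Pdi/Pti line that WinDbg's !vtop prints.
pdi, pti, offset = split_4k(0xC0300ABC)
print("Pdi %x Pti %x offset %x" % (pdi, pti, offset))
```

For C0300ABC both indexes come out as 300, which is a first hint that this address is a special one.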
The code first computes the PDE index * 4, which is the offset of the PDE relative to the page directory base held in CR3, since PDEs are 4 bytes long. This value is added to -3FD00000, which is C0300000 in 32-bit arithmetic. The PDE offset being in [0,FFC], the result will be in [C0300000,C0300FFC]. Now, the first PDE is pointed to by CR3, which means that physical_to_virtual(CR3)=C0300000, for all processes. The range [C0300000,C0301000[ contains the 0x400 PDEs.
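As a sanity check on the arithmetic, the first three instructions that touch the VA can be mirrored almost one to one. This is only a sketch: `pde_virtual_address` is a name chosen here, and the explicit mask emulates the 32-bit wraparound that the CPU gives for free.

```python
def pde_virtual_address(va):
    """Virtual address from which the listing reads the PDE covering va."""
    offset = (va >> 20) & 0xFFC                 # shr eax, 14h / and eax, 0FFCh
    return (offset - 0x3FD00000) & 0xFFFFFFFF   # [eax-3FD00000h], wrapped to 32 bits


for va in (0x00000000, 0x00402345, 0xFFFFFFFF):
    print("%08X -> PDE at %08X" % (va, pde_virtual_address(va)))
```

Shifting by 20 and masking with FFC is just a compact way of computing (VA >> 22) * 4 without a multiplication.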
The PDE is then masked with 81 and compared against 81: bit 0 is the present flag and bit 7 the page-size flag. If both are set, the PDE maps a 4MB page and a different, simpler code branch is executed. Otherwise the 4KB branch tests bit 0 again, and the function returns 0 if the page is not present.
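The three possible outcomes of that test can be spelled out in a few lines. The function name and the returned strings are purely illustrative; the masks are the ones used in the listing.

```python
def classify_pde(pde):
    """Mimic the 'and ax, 81h / cmp al, 81h' test and the later 'test cl, 1'."""
    if (pde & 0x81) == 0x81:
        return "4MB page"        # present and page-size bit set
    if pde & 0x01:
        return "page table"      # present, points to a table of PTEs
    return "not present"         # the function returns 0


for pde in (0x00C00081, 0x0005A067, 0x00000080):
    print("%08X: %s" % (pde, classify_pde(pde)))
```

Note that a PDE with bit 7 set but bit 0 clear falls through to the 4KB branch and is rejected there, so the page-size bit alone never counts.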
For a 4KB page, the top 20 bits of the VA (PDE index and PTE index together) are extracted and multiplied by 4, giving a PTE offset in [0,3FFFFC]. The offset is added to -40000000, which is C0000000 in 32-bit arithmetic. The resulting value is in [C0000000, C03FFFFC]. This means the PTEs are in the range [C0000000, C0400000[. It's important to understand that this range is "reserved"; only a handful of these pages are actually mapped to physical ones, as explained below.
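The same kind of sketch works for the PTE lookup. Again `pte_virtual_address` is a made-up name, and the final mask stands in for 32-bit wraparound.

```python
def pte_virtual_address(va):
    """Virtual address from which the listing reads the PTE covering va."""
    offset = (va >> 10) & 0x3FFFFC              # shr eax, 0Ah / and eax, 3FFFFCh
    return (offset - 0x40000000) & 0xFFFFFFFF   # sub eax, 40000000h


for va in (0x00000000, 0x00402345, 0xFFFFFFFF):
    print("%08X -> PTE at %08X" % (va, pte_virtual_address(va)))
```

The offset is (VA >> 12) * 4: the whole 20-bit page number is used as an index, so the PTEs of all page tables appear as one flat array of 2^20 entries.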
(The physical address is then calculated by extracting the page offset part of the VA (bottom 12 or 22 bits) and adding it to the physical address of the lowest-level page in the translation hierarchy.)
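Here is a hedged sketch of that last step, following the `convert:` label of the listing. Both branches first reduce to a page frame number (PFN), which is then shifted back and combined with the low 12 bits of the VA. The function names are invented for the example.

```python
def pfn_from_pte(pte):
    """4KB branch: shr eax, 0Ch on the PTE."""
    return pte >> 12


def pfn_from_large_pde(pde, va):
    """4MB branch: VA[21:12] selects the 4KB frame inside the large page."""
    return ((va >> 12) & 0x3FF) + (pde >> 12)


def physical_from_pfn(pfn, va):
    """Rebuild the 64-bit result that the routine returns in edx:eax."""
    high = (pfn >> 20) & 0xFFF                 # shld ecx, eax, 0Ch (ecx was 0)
    low = (pfn << 12) & 0xFFFFFFFF             # shl eax, 0Ch
    low = (low + (va & 0xFFF)) & 0xFFFFFFFF    # and esi, 0FFFh / add eax, esi
    return (high << 32) | low


print("%X" % physical_from_pfn(pfn_from_pte(0x00123067), 0x00402345))
```

With a 20-bit PFN the high dword is always 0 here, which matches the `;0` comment in the listing; the shld only matters once PFNs get wider than 20 bits.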
What's interesting in this scheme is the address range used. Let's consider the 4KB-page hierarchy (page directory, page table, page). CR3 "points" to C0300000, i.e. the first PDE is at C0300000. The PTEs go from C0000000 to C0400000: the PDE range overlaps the PTE range! And not just anywhere, but exactly 3/4 of the way into this range, which makes sense since C0000000 is itself 3/4 of the way into the 32-bit address space. This is not random of course: PDE 0x300, the one covering [C0000000, C0400000[, points back to the page directory itself, so the page directory doubles as the page table that lets the processor access the [C0000000, C0400000[ range!
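The overlap falls out of plain arithmetic, as this small sketch tries to show. The two base constants are the ones visible in the listing; `pte_address` and `WINDOW_SIZE` are names introduced for the example.

```python
PTE_BASE = 0xC0000000    # start of the PTE range (from the listing)
PDE_BASE = 0xC0300000    # start of the PDE range (from the listing)
WINDOW_SIZE = 0x400000   # 2**20 PTEs of 4 bytes each


def pte_address(va):
    """Virtual address of the PTE describing the 4KB page that contains va."""
    return PTE_BASE + (va >> 12) * 4


# Where are the PTEs that describe the PTE range itself?
window_ptes = (pte_address(PTE_BASE), pte_address(PTE_BASE + WINDOW_SIZE - 1))
print("PTEs of the PTE range: %X..%X" % window_ptes)

# How far into the PTE range do the PDEs sit?
print("relative position: %s" % ((PDE_BASE - PTE_BASE) / WINDOW_SIZE))

# The PTE of the page holding the PDEs lands on PDE slot 0x300.
print("PTE of the PDE page: %X" % pte_address(PDE_BASE))
```

The PTEs describing [C0000000, C0400000[ occupy exactly the PDE range, and the entry describing the PDE page is entry 0x300 of that same page: one dword is both a PDE and a PTE.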
This may seem a bit obscure, plus my explanations here are pretty poor. It's funny how explaining 30 lines of smart assembly can be so tricky... The thing to remember is that the CPU offers NO facility to do physical-to-virtual conversion. But to let the kernel access the pages that allow the CPU to do the virtual-to-physical conversion internally, those pages must be accessible in virtual memory. And to be accessible in virtual memory, they must be referenced by themselves. This self-reference mechanism allows the implementation of MmGetPhysicalAddress.
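If that is still obscure, a toy model may help. The sketch below is not the kernel's code: it is a Python transcription of the listing, run against a made-up physical memory (a dictionary) in which PDE 0x300 points back to the page directory. All names and sample addresses are assumptions chosen for the demonstration, and the simulated walk only handles 4KB pages.

```python
PDE_BASE = 0xC0300000   # virtual address of the PDEs (from the listing)
PTE_BASE = 0xC0000000   # virtual address of the PTEs (from the listing)
SELF_SLOT = 0x300       # PDE index assumed to reference the page directory


def build_memory(cr3, table_pa, pdi, pti, frame_pa):
    """Toy physical memory: one page directory, one page table, one mapping."""
    return {
        cr3 + SELF_SLOT * 4: cr3 | 1,     # the self-reference
        cr3 + pdi * 4: table_pa | 1,      # PDE -> page table
        table_pa + pti * 4: frame_pa | 1, # PTE -> data page
    }


def mmu_read32(phys, cr3, va):
    """Read a dword at a virtual address through a simulated 4KB page walk."""
    pde = phys.get(cr3 + ((va >> 22) & 0x3FF) * 4, 0)
    if not pde & 1:
        raise MemoryError("page fault at %08X" % va)
    pte = phys.get((pde & 0xFFFFF000) + ((va >> 12) & 0x3FF) * 4, 0)
    if not pte & 1:
        raise MemoryError("page fault at %08X" % va)
    return phys.get((pte & 0xFFFFF000) + (va & 0xFFF), 0)


def get_physical_address(read_va32, va):
    """Transcription of the listing; read_va32 only sees virtual addresses."""
    pde = read_va32(PDE_BASE + ((va >> 20) & 0xFFC))
    if (pde & 0x81) == 0x81:                   # present 4MB page
        pfn = ((va >> 12) & 0x3FF) + (pde >> 12)
    else:
        if not pde & 1:                        # PDE not present
            return 0
        pte = read_va32(PTE_BASE + ((va >> 10) & 0x3FFFFC))
        if not pte & 1:                        # PTE not present
            return 0
        pfn = pte >> 12
    return (pfn << 12) + (va & 0xFFF)


cr3 = 0x39000
phys = build_memory(cr3, 0x5A000, 1, 2, 0x123000)
read = lambda va: mmu_read32(phys, cr3, va)
print("%X" % get_physical_address(read, 0x00402345))
print("%X" % get_physical_address(read, 0xC0300000))
```

Deleting the `SELF_SLOT` entry from the dictionary makes the very first read fault, which is the whole point: without the self-reference, the routine has no virtual address through which to reach the tables.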
Quick lab experiment. Fire up WinDbg, local kernel debugging:
lkd> !process 0 0 System
PROCESS 81bcc830  SessionId: none  Cid: 0004    Peb: 00000000  ParentCid: 0000
    DirBase: 00039000  ObjectTable: e1000cc0  HandleCount: 244.
    Image: System
DirBase is the physical address of the first PDE (loaded into CR3).
We can confirm that by checking the field in the associated EPROCESS structure:
lkd> dt _EPROCESS Pcb.DirectoryTableBase 81bcc830
   +0x000 Pcb                    :
      +0x018 DirectoryTableBase     : [2] 0x39000
Now, let's get the physical address of C0300000. We use !vtop with the PFN of the process's page directory (39000 >> 12):
lkd> !vtop 39 c0300000
Pdi 300 Pti 300
c0300000 00039000 pfn(00039)
The result is 39000, which confirms that C0300000 maps the PDEs.
You can check it for other processes, for instance WinDbg itself:
lkd> !process 0 0 windbg.exe
PROCESS 81ad8410  SessionId: 0  Cid: 049c    Peb: 7ffd7000  ParentCid: 028c
    DirBase: 00ae9000  ObjectTable: e1c446b0  HandleCount: 614.
    Image: windbg.exe
lkd> !vtop ae9 c0300000
Pdi 300 Pti 300
c0300000 00ae9000 pfn(00ae9)