Theory

The Memory Management Unit (MMU) is a hardware unit that performs address translation, such as mapping virtual addresses to physical addresses. Most major operating systems today use the MMU to provide memory protection and process isolation.

This tutorial continues the previous article and describes how to run the flat binary we generated there with the MMU enabled. We based our MMU setup code on the xv6-rpi project. You can find the processor manual that describes the registers that we write to here.

xv6-rpi builds the kernel image as an ELF file whose program headers specify where each part of the image must be loaded. This is not a flat binary, since it contains an ELF header, but QEMU knows how to load an ELF file and will place the image at the locations specified by its program headers.

kernel.elf is the kernel image generated by xv6-rpi when you run make:

$ readelf -e kernel.elf

ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0x10000
  Start of program headers:          52 (bytes into file)
  Start of section headers:          710932 (bytes into file)
  Flags:                             0x5000200, Version5 EABI, soft-float ABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         2
  Size of section headers:           40 (bytes)
  Number of section headers:         16
  Section header string table index: 13

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .start_sec        PROGBITS        00010000 010000 009000 00 WAX  0   0  8
  [ 2] .text             PROGBITS        80020000 020000 008cf4 00  AX  0   0  4
  [ 3] .rodata           PROGBITS        80028cf4 028cf4 00080d 00   A  0   0  4
  [ 4] .data             PROGBITS        8002a000 02a000 0800c0 00  WA  0   0  4
  [ 5] .bss              NOBITS          800aa0c0 0aa0c0 00557c 00  WA  0   0  4
  [ 6] .ARM.attributes   ARM_ATTRIBUTES  00000000 0aa0c0 000027 00      0   0  1
  [ 7] .comment          PROGBITS        00000000 0aa0e7 000038 01  MS  0   0  1
  [ 8] .debug_frame      PROGBITS        00000000 0aa120 000040 00      0   0  4
  [ 9] .debug_line       PROGBITS        00000000 0aa160 000157 00      0   0  1
  [10] .debug_info       PROGBITS        00000000 0aa2b7 00020a 00      0   0  1
  [11] .debug_abbrev     PROGBITS        00000000 0aa4c1 00003c 00      0   0  1
  [12] .debug_aranges    PROGBITS        00000000 0aa500 000060 00      0   0  8
  [13] .shstrtab         STRTAB          00000000 0ad87a 00009a 00      0   0  1
  [14] .symtab           SYMTAB          00000000 0aa560 002880 10     15 434  4
  [15] .strtab           STRTAB          00000000 0acde0 000a9a 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x010000 0x00010000 0x00010000 0x09000 0x09000 RWE 0x10000
  LOAD           0x020000 0x80020000 0x00020000 0x8a0c0 0x8f63c RWE 0x10000

 Section to Segment mapping:
  Segment Sections...
   00     .start_sec 
   01     .text .rodata .data .bss 

According to the PhysAddr column of the LOAD program headers, the image is loaded into physical memory at 0x010000 (.start_sec) and 0x020000 (.text and the sections that follow it) on startup. 0x010000 is the entry point of the board on bootup; the code placed there initializes the MMU and then jumps to the virtual address 0x80020000, which maps to the physical address 0x020000. The startup code also maps low memory at its own physical addresses, which is known as identity mapping. An advantage of identity mapping is that code running in that region does not care whether the MMU is enabled or not.
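
The relationship between those addresses can be modeled on a host machine. The following is a minimal sketch in C, not code from xv6-rpi: the names section_table, map_section, and translate are made up for illustration, and it models only 1 MB sections with none of the permission or cache attributes a real MMU entry carries. It assumes low memory is mapped twice, once at its own address and once at 0x80000000, as the boot code shown later in this article does.

```c
#include <stdint.h>

#define SECTION_SHIFT 20                          /* 1 MB sections */
#define SECTION_MASK  ((1u << SECTION_SHIFT) - 1) /* offset within a section */
#define NSECTIONS     4096                        /* 4 GB / 1 MB */
#define ENTRY_VALID   1u                          /* hypothetical "mapped" flag */

/* hypothetical model of a first-level table:
 * 0 means unmapped, otherwise physical section base | ENTRY_VALID */
uint32_t section_table[NSECTIONS];

/* map the 1 MB section containing virt to the section containing phys */
void map_section(uint32_t virt, uint32_t phys)
{
	section_table[virt >> SECTION_SHIFT] = (phys & ~SECTION_MASK) | ENTRY_VALID;
}

/* store the physical address for va in *pa and return 1,
 * or return 0 if the section is not mapped */
int translate(uint32_t va, uint32_t *pa)
{
	uint32_t entry = section_table[va >> SECTION_SHIFT];

	if (!(entry & ENTRY_VALID))
		return 0;
	*pa = (entry & ~SECTION_MASK) | (va & SECTION_MASK);
	return 1;
}
```

With both mappings installed via map_section(0x00000000, 0x00000000) and map_section(0x80000000, 0x00000000), translate resolves 0x00010000 to itself and 0x80020000 to 0x00020000, while an address in an unmapped section such as 0x40000000 fails.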

The kernel.ld script defines all of these locations; the GNU linker uses it to compute the addresses while linking the ELF image.

This is a problem for us with the Plan 9 toolchain, because the Plan 9 linker cannot link different parts of the code relative to different addresses. It only lets us specify one base address for the code, using the -Taddr flag. If we wanted the simplest approach, without jumping through hoops such as a secondary loader or address fixups, we would link relative to the startup address at 0x010000. But since we want the code to be relative to the virtual address space once we enable the MMU, we need to link it relative to 0x80010000.

Another issue is that we load a flat binary. QEMU simply copies the whole binary contiguously into memory at the startup location 0x010000, so nothing is placed separately at 0x020000 as it was for the xv6-rpi kernel image. We need to adjust the placement of the code and data to work with the restriction that everything is loaded at 0x010000.

With these two limitations, we have to ask ourselves an important question before we continue: how does this even work? If we link the text segment at the virtual address 0x80010000, wouldn’t the generated code use this address and then crash and burn because it is unmapped on bootup? To answer this, we need to look at some assembly.

The source code we are about to disassemble is in entry.s and start.c; review it to see what it is doing.

Here is the partial disassembly of the startup code generated by the Plan 9 toolchain:

$ aout2elf plan9.out
$ arm-none-eabi-objdump -D plan9.elf

plan9.elf:     file format elf32-littlearm


Disassembly of section .text:

80010000 <_start>:
80010000:       e3a01a02        mov     r1, #8192       ; 0x2000
80010004:       e3a02a09        mov     r2, #36864      ; 0x9000
80010008:       e3a03000        mov     r3, #0
8001000c:       e5813000        str     r3, [r1]
80010010:       e2811004        add     r1, r1, #4
80010014:       e1520001        cmp     r2, r1
80010018:       1afffffb        bne     8001000c <_start+0xc>
8001001c:       e3a010d3        mov     r1, #211        ; 0xd3
80010020:       e129f001        msr     CPSR_fc, r1
80010024:       e3a0da02        mov     sp, #8192       ; 0x2000
80010028:       eb00003c        bl      80010120 <start>
8001002c:       eafffffe        b       8001002c <_start+0x2c>

80010030 <load_pgtbl>:
80010030:       e59f34b0        ldr     r3, [pc, #1200] ; 800104e8 <kmain+0x148>
80010034:       ee033f10        mcr     15, 0, r3, cr3, cr0, {0}
80010038:       e3a03004        mov     r3, #4
8001003c:       ee023f50        mcr     15, 0, r3, cr2, cr0, {2}
80010040:       e1a03000        mov     r3, r0
80010044:       ee023f30        mcr     15, 0, r3, cr2, cr0, {1}
80010048:       e59d3008        ldr     r3, [sp, #8]
8001004c:       ee023f10        mcr     15, 0, r3, cr2, cr0, {0}
80010050:       ee113f10        mrc     15, 0, r3, cr1, cr0, {0}
80010054:       e59fb490        ldr     fp, [pc, #1168] ; 800104ec <kmain+0x14c>
80010058:       e183300b        orr     r3, r3, fp
8001005c:       ee013f10        mcr     15, 0, r3, cr1, cr0, {0}
80010060:       e3a03000        mov     r3, #0
80010064:       ee083f17        mcr     15, 0, r3, cr8, cr7, {0}
80010068:       e28ef000        add     pc, lr, #0

80010078 <print0>:
80010078:       e52de008        str     lr, [sp, #-8]!
8001007c:       e59f5474        ldr     r5, [pc, #1140] ; 800104f8 <kmain+0x158>
80010080:       e1a04000        mov     r4, r0
80010084:       e1d430d0        ldrsb   r3, [r4]
80010088:       e3530000        cmp     r3, #0
8001008c:       0a000004        beq     800100a4 <print0+0x2c>
80010090:       e4d42001        ldrb    r2, [r4], #1
80010094:       e5c52000        strb    r2, [r5]
80010098:       e1d430d0        ldrsb   r3, [r4]
8001009c:       e3530000        cmp     r3, #0
800100a0:       1afffffa        bne     80010090 <print0+0x18>
800100a4:       e49df008        ldr     pc, [sp], #8

800100a8 <set_bootpgtbl>:
800100a8:       e52de014        str     lr, [sp, #-20]! ; 0xffffffec
800100ac:       e59d601c        ldr     r6, [sp, #28]
800100b0:       e59d5020        ldr     r5, [sp, #32]
800100b4:       e3a03902        mov     r3, #32768      ; 0x8000
800100b8:       e58d300c        str     r3, [sp, #12]
800100bc:       e3a01901        mov     r1, #16384      ; 0x4000
800100c0:       e58d1008        str     r1, [sp, #8]
800100c4:       e1a04a20        lsr     r4, r0, #20
800100c8:       e1a07a26        lsr     r7, r6, #20
800100cc:       e1a08a25        lsr     r8, r5, #20
800100d0:       e3a06000        mov     r6, #0
800100d4:       e1560008        cmp     r6, r8
800100d8:       2a00000f        bcs     8001011c <set_bootpgtbl+0x74>
800100dc:       e1a03a07        lsl     r3, r7, #20
800100e0:       e59d1024        ldr     r1, [sp, #36]   ; 0x24
800100e4:       e3510000        cmp     r1, #0
800100e8:       059fb40c        ldreq   fp, [pc, #1036] ; 800104fc <kmain+0x15c>
800100ec:       0183500b        orreq   r5, r3, fp
800100f0:       159fb408        ldrne   fp, [pc, #1032] ; 80010500 <kmain+0x160>
800100f4:       1183500b        orrne   r5, r3, fp
800100f8:       e3540c01        cmp     r4, #256        ; 0x100
800100fc:       359d300c        ldrcc   r3, [sp, #12]
80010100:       37835104        strcc   r5, [r3, r4, lsl #2]
80010104:       259d3008        ldrcs   r3, [sp, #8]
80010108:       27835104        strcs   r5, [r3, r4, lsl #2]
8001010c:       e2844001        add     r4, r4, #1
80010110:       e2877001        add     r7, r7, #1
80010114:       e2866001        add     r6, r6, #1
80010118:       eaffffed        b       800100d4 <set_bootpgtbl+0x2c>
8001011c:       e49df014        ldr     pc, [sp], #20

80010120 <start>:
80010120:       e52de040        str     lr, [sp, #-64]! ; 0xffffffc0
80010124:       e3a01048        mov     r1, #72 ; 0x48
80010128:       e5cd1014        strb    r1, [sp, #20]
8001012c:       e3a02065        mov     r2, #101        ; 0x65
80010130:       e5cd2015        strb    r2, [sp, #21]
80010134:       e3a0306c        mov     r3, #108        ; 0x6c
80010138:       e5cd3016        strb    r3, [sp, #22]
8001013c:       e5cd3017        strb    r3, [sp, #23]
80010140:       e3a0206f        mov     r2, #111        ; 0x6f
80010144:       e5cd2018        strb    r2, [sp, #24]
80010148:       e3a03020        mov     r3, #32
8001014c:       e5cd3019        strb    r3, [sp, #25]
80010150:       e3a01066        mov     r1, #102        ; 0x66
80010154:       e5cd101a        strb    r1, [sp, #26]
80010158:       e3a02072        mov     r2, #114        ; 0x72
8001015c:       e5cd201b        strb    r2, [sp, #27]
80010160:       e3a0306f        mov     r3, #111        ; 0x6f
80010164:       e5cd301c        strb    r3, [sp, #28]
80010168:       e3a0106d        mov     r1, #109        ; 0x6d
8001016c:       e5cd101d        strb    r1, [sp, #29]
80010170:       e3a02020        mov     r2, #32
80010174:       e5cd201e        strb    r2, [sp, #30]
80010178:       e3a03073        mov     r3, #115        ; 0x73
8001017c:       e5cd301f        strb    r3, [sp, #31]
80010180:       e3a01074        mov     r1, #116        ; 0x74
80010184:       e5cd1020        strb    r1, [sp, #32]
80010188:       e3a02061        mov     r2, #97 ; 0x61
8001018c:       e5cd2021        strb    r2, [sp, #33]   ; 0x21
80010190:       e3a03072        mov     r3, #114        ; 0x72
80010194:       e5cd3022        strb    r3, [sp, #34]   ; 0x22
80010198:       e5cd1023        strb    r1, [sp, #35]   ; 0x23
8001019c:       e3a02028        mov     r2, #40 ; 0x28
800101a0:       e5cd2024        strb    r2, [sp, #36]   ; 0x24
800101a4:       e3a03029        mov     r3, #41 ; 0x29
800101a8:       e5cd3025        strb    r3, [sp, #37]   ; 0x25
800101ac:       e3a0102c        mov     r1, #44 ; 0x2c
800101b0:       e5cd1026        strb    r1, [sp, #38]   ; 0x26
800101b4:       e3a02020        mov     r2, #32
800101b8:       e5cd2027        strb    r2, [sp, #39]   ; 0x27
800101bc:       e3a03061        mov     r3, #97 ; 0x61
800101c0:       e5cd3028        strb    r3, [sp, #40]   ; 0x28
800101c4:       e3a01062        mov     r1, #98 ; 0x62
800101c8:       e5cd1029        strb    r1, [sp, #41]   ; 0x29
800101cc:       e3a0206f        mov     r2, #111        ; 0x6f
800101d0:       e5cd202a        strb    r2, [sp, #42]   ; 0x2a
800101d4:       e3a03075        mov     r3, #117        ; 0x75
800101d8:       e5cd302b        strb    r3, [sp, #43]   ; 0x2b
800101dc:       e3a01074        mov     r1, #116        ; 0x74
800101e0:       e5cd102c        strb    r1, [sp, #44]   ; 0x2c
800101e4:       e3a02020        mov     r2, #32
800101e8:       e5cd202d        strb    r2, [sp, #45]   ; 0x2d
800101ec:       e5cd102e        strb    r1, [sp, #46]   ; 0x2e
800101f0:       e3a0106f        mov     r1, #111        ; 0x6f
800101f4:       e5cd102f        strb    r1, [sp, #47]   ; 0x2f
800101f8:       e5cd2030        strb    r2, [sp, #48]   ; 0x30
800101fc:       e3a03073        mov     r3, #115        ; 0x73
80010200:       e5cd3031        strb    r3, [sp, #49]   ; 0x31
80010204:       e3a01065        mov     r1, #101        ; 0x65
80010208:       e5cd1032        strb    r1, [sp, #50]   ; 0x32
8001020c:       e3a02074        mov     r2, #116        ; 0x74
80010210:       e5cd2033        strb    r2, [sp, #51]   ; 0x33
80010214:       e3a03075        mov     r3, #117        ; 0x75
80010218:       e5cd3034        strb    r3, [sp, #52]   ; 0x34
8001021c:       e3a01070        mov     r1, #112        ; 0x70
80010220:       e5cd1035        strb    r1, [sp, #53]   ; 0x35
80010224:       e3a02020        mov     r2, #32
80010228:       e5cd2036        strb    r2, [sp, #54]   ; 0x36
8001022c:       e3a03074        mov     r3, #116        ; 0x74
80010230:       e5cd3037        strb    r3, [sp, #55]   ; 0x37
80010234:       e3a01068        mov     r1, #104        ; 0x68
80010238:       e5cd1038        strb    r1, [sp, #56]   ; 0x38
8001023c:       e3a02065        mov     r2, #101        ; 0x65
80010240:       e5cd2039        strb    r2, [sp, #57]   ; 0x39
80010244:       e3a03020        mov     r3, #32
80010248:       e5cd303a        strb    r3, [sp, #58]   ; 0x3a
8001024c:       e3a0104d        mov     r1, #77 ; 0x4d
80010250:       e5cd103b        strb    r1, [sp, #59]   ; 0x3b
80010254:       e5cd103c        strb    r1, [sp, #60]   ; 0x3c
80010258:       e3a03055        mov     r3, #85 ; 0x55
8001025c:       e5cd303d        strb    r3, [sp, #61]   ; 0x3d
80010260:       e3a0100a        mov     r1, #10
80010264:       e5cd103e        strb    r1, [sp, #62]   ; 0x3e
80010268:       e3a02000        mov     r2, #0
8001026c:       e5cd203f        strb    r2, [sp, #63]   ; 0x3f
80010270:       e28d0014        add     r0, sp, #20
80010274:       ebffff7f        bl      80010078 <print0>
80010278:       e3a00000        mov     r0, #0
8001027c:       e58d0008        str     r0, [sp, #8]
80010280:       e3a01601        mov     r1, #1048576    ; 0x100000
80010284:       e58d100c        str     r1, [sp, #12]
80010288:       e58d0010        str     r0, [sp, #16]
8001028c:       ebffff85        bl      800100a8 <set_bootpgtbl>
80010290:       e3a00102        mov     r0, #-2147483648        ; 0x80000000
80010294:       e3a03000        mov     r3, #0
80010298:       e58d3008        str     r3, [sp, #8]
8001029c:       e3a01601        mov     r1, #1048576    ; 0x100000
800102a0:       e58d100c        str     r1, [sp, #12]
800102a4:       e58d3010        str     r3, [sp, #16]
800102a8:       ebffff7e        bl      800100a8 <set_bootpgtbl>
800102ac:       e59f0250        ldr     r0, [pc, #592]  ; 80010504 <kmain+0x164>
800102b0:       e3a03000        mov     r3, #0
800102b4:       e58d3008        str     r3, [sp, #8]
800102b8:       e3a01601        mov     r1, #1048576    ; 0x100000
800102bc:       e58d100c        str     r1, [sp, #12]
800102c0:       e58d3010        str     r3, [sp, #16]
800102c4:       ebffff77        bl      800100a8 <set_bootpgtbl>
800102c8:       e3a00209        mov     r0, #-1879048192        ; 0x90000000
800102cc:       e3a03201        mov     r3, #268435456  ; 0x10000000
800102d0:       e58d3008        str     r3, [sp, #8]
800102d4:       e3a01302        mov     r1, #134217728  ; 0x8000000
800102d8:       e58d100c        str     r1, [sp, #12]
800102dc:       e3a02001        mov     r2, #1
800102e0:       e58d2010        str     r2, [sp, #16]
800102e4:       ebffff6f        bl      800100a8 <set_bootpgtbl>
800102e8:       e3a00901        mov     r0, #16384      ; 0x4000
800102ec:       e3a03902        mov     r3, #32768      ; 0x8000
800102f0:       e58d3008        str     r3, [sp, #8]
800102f4:       ebffff4d        bl      80010030 <load_pgtbl>
800102f8:       ebffff5b        bl      8001006c <jump_stack>
800102fc:       e49df040        ldr     pc, [sp], #64   ; 0x40

It seems that the toolchain is placing the code at 0x80010000 on startup, but take a closer look at the instructions. None of them depend on where they are in memory. The one instruction that stands out is BL, which is used for calling functions: does it not use absolute locations? It turns out that it does not. BL jumps relative to its own location; objdump calculated the target address for us and printed it like an absolute location, but it is not really one (refer to the ARM manual).
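
The relative encoding is easy to check by hand. As a minimal sketch (bl_target is a name made up for this example, and it only handles the A32 B/BL encoding), the following C function computes a branch target the same way objdump does: sign-extend the low 24 bits, multiply by 4, and add the result to the instruction's address plus 8.

```c
#include <stdint.h>

/* Branch target of an A32 B/BL instruction word located at addr.
 * The low 24 bits hold a signed word offset, and the PC reads as
 * addr + 8 when the instruction executes. */
uint32_t bl_target(uint32_t addr, uint32_t insn)
{
	int32_t off = (int32_t)(insn & 0x00ffffffu);

	if (off & 0x00800000)
		off -= 0x01000000;      /* sign-extend from 24 bits */
	return addr + 8u + (uint32_t)off * 4u;
}
```

Feeding it the word 0xeb00003c from address 0x80010028 yields 0x80010120, the address objdump printed for start. Feeding it the same word at the physical address 0x00010028, where the instruction actually sits before the MMU is on, yields 0x00010120, so the call works from either location.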

The Plan 9 toolchain does not generate code that needs the exact address for simple statements, function calls, loops, and local variable access. But it does generate code that depends on the link address for constant string lookups and global variable accesses: it looks up the data address, calculates an offset relative to it, and encodes that offset in the binary. This means that we cannot use global variables or pointers to constant strings until we enable the MMU and work in the virtual address space at 0x80010000.

Even if we can’t use global variables or string pointers from the code, we can certainly use them if we know exactly where they are in memory. So when we need access to a global variable or string constant before we enable the MMU, we just need to refer to it by an absolute physical address.
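
As a small illustration, the address arithmetic can be written out in C. V2P and phys_of are hypothetical helpers, not part of the demo's memlayout.h (which only defines P2V), and the sketch assumes the flat image sits in memory exactly as it was linked, only 0x80000000 lower.

```c
#include <stdint.h>

#define KERNBASE 0x80000000u   /* first kernel address, as in memlayout.h */

#define P2V(a) ((uint32_t)(a) + KERNBASE)
/* V2P is a hypothetical counterpart to P2V, added for this sketch */
#define V2P(a) ((uint32_t)(a) - KERNBASE)

/* physical address at which a symbol linked at link_addr can be
 * reached before the MMU is enabled (assumes a contiguous flat image) */
uint32_t phys_of(uint32_t link_addr)
{
	return V2P(link_addr);
}
```

For example, the QEMU output at the end of this article reports the global set_me at 0x80011000; under the assumption above, phys_of(0x80011000) gives 0x00011000, the address pre-MMU code would have to use to reach it.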

This ends the coverage of the basic theory on how this will work. Let us see how this plays out in the code.

Practice

kernel.ld defines various symbols for use in the code, and we need to define them ourselves in the Plan 9 code. Here are the symbols we are going to need:

These are all in physical memory:

edata_entry:   The start address of the data structures we need to enable the MMU
svc_stktop:    The top of the bootstrap stack
_kernel_pgtbl: The location of the kernel page table
_user_pgtbl:   The location of the user page table

We run nm on kernel.elf to see where it places these symbols in memory. Then we will define them directly in a header file:

$ nm kernel.elf | grep 'edata_entry\|svc_stktop\|_kernel_pgtbl\|_user_pgtbl'

00010548 T edata_entry
00014000 T _kernel_pgtbl
00012000 T svc_stktop
00018000 T _user_pgtbl

edata_entry does not end at a nice address, so we will just round it to a nice address for our code. We also need to adjust these memory locations and place them before the Plan 9 .text, to make sure our binary image does not overwrite them if the binary ever grows big enough. (The GNU linker does not need to do this, because it calculates the addresses on the fly; the addresses we see above are just one instance of that computation.)

// memlayout.h

// first kernel address
#define KERNBASE 0x80000000

// map the first 1 MB low memory containing kernel code.
#define INIT_KERNMAP 0x100000

// start of kernel data structures we need
#define edata_entry 0x2000

// stack for bootstrapping
#define svc_stktop 0x2000

// kernel page table address
#define _kernel_pgtbl 0x4000

// user page table address
#define _user_pgtbl 0x8000

// end of kernel data structures we need
#define end_entry 0x9000

#define P2V(a) ((a) + KERNBASE)

We just took the constants, subtracted 0x10000 from each, and rounded edata_entry up to where svc_stktop is. We don’t care about the other locations, so we don’t need to allocate memory for them as xv6-rpi did. xv6-rpi is striving for a complete kernel, while we are just trying to show a demo of using the MMU.
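
A quick host-side sanity check of this layout can be sketched in C. The constants mirror memlayout.h (renamed to upper case here), while LOAD_ADDR, the table sizes, and the helper names are assumptions made for this example: a full first-level table is taken to be 4096 entries of 4 bytes, and the user table 256 entries, which is the limit set_bootpgtbl compares against in the disassembly.

```c
#include <stdint.h>

/* values from memlayout.h, renamed to upper case for this sketch */
#define EDATA_ENTRY  0x2000u
#define SVC_STKTOP   0x2000u
#define KERNEL_PGTBL 0x4000u
#define USER_PGTBL   0x8000u
#define END_ENTRY    0x9000u

/* assumptions made for this example */
#define LOAD_ADDR    0x10000u       /* where QEMU copies the flat binary */
#define KPGTBL_BYTES (4096u * 4u)   /* 4096 entries, one per 1 MB section */
#define UPGTBL_BYTES (256u * 4u)    /* 256 entries, one per 1 MB section */

struct region { uint32_t start, end; };   /* half-open: [start, end) */

int regions_overlap(struct region a, struct region b)
{
	return a.start < b.end && b.start < a.end;
}

int region_inside(struct region r, uint32_t lo, uint32_t hi)
{
	return lo <= r.start && r.end <= hi;
}

/* returns 1 if the page tables are disjoint, lie in the range that
 * _start zeroes, sit above the stack, and end below the loaded image */
int layout_ok(void)
{
	struct region kpt = { KERNEL_PGTBL, KERNEL_PGTBL + KPGTBL_BYTES };
	struct region upt = { USER_PGTBL, USER_PGTBL + UPGTBL_BYTES };

	if (regions_overlap(kpt, upt))
		return 0;
	if (!region_inside(kpt, EDATA_ENTRY, END_ENTRY) ||
	    !region_inside(upt, EDATA_ENTRY, END_ENTRY))
		return 0;
	if (SVC_STKTOP > kpt.start)     /* the stack grows down from SVC_STKTOP */
		return 0;
	return END_ENTRY <= LOAD_ADDR;
}
```

Under these assumptions layout_ok() returns 1: the kernel table occupies 0x4000 up to 0x8000, the user table starts right after it, and everything ends at 0x9000, below the image at 0x10000.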

We port entry.S, start.c, and main.c over to Plan 9. Because the Plan 9 compiler does not support inline assembly, any xv6-rpi code that uses it, such as the load_pgtbl function, has to be moved into the assembly file.

/* entry.s */

#include "memlayout.h"
#include "arm.h"
#include "mmu.h"

TEXT _start(SB), 1, $-4
	// clear the memory for data structures we need
	MOVW    $(edata_entry), R1
	MOVW    $(end_entry), R2
	MOVW    $0, R3
_zero:
	MOVW    R3, (R1)
	ADD     $4, R1
	CMP     R1, R2
	BNE     _zero

	// set supervisor mode, no interrupts
	MOVW    $(SVC_MODE|NO_INT), R1
	MOVW    R1, CPSR

	// set the stack pointer to jump into C
	MOVW    $(svc_stktop), SP
	BL      start(SB)

	// loop forever
	B       0(PC)

// loads the page tables for kernel and user
// void load_pgtbl(u32 *kernel_pgtbl, u32 *user_pgtbl)
// R0    - kernel_pgtbl
// SP[8] - user_pgtbl
TEXT load_pgtbl(SB), 1, $-4
	// set the domain access control; all domains are checked for permission
	MOVW    $0x55555555, R3
	MCR     15, 0, R3, C(3), C(0), 0

	// set the page table base registers; we use two tables:
	// TTBR0 for user space and TTBR1 for kernel space
	MOVW    $(32-UADDR_BITS), R3
	MCR     15, 0, R3, C(2), C(0), 2

	// load the kernel page table
	MOVW    R0, R3
	MCR     15, 0, R3, C(2), C(0), 1

	// load the user page table
	MOVW    8(SP), R3
	MCR     15, 0, R3, C(2), C(0), 0

	// enable MMU, cache, write buffer, high vector tbl,
	// disable subpage
	MRC     15, 0, R3, C(1), C(0), 0
	ORR     $0x80300D, R3
	MCR     15, 0, R3, C(1), C(0), 0

	// flush the TLB
	MOVW    $0, R3
	MCR     15, 0, R3, C(8), C(7), 0

	RET

// once we get here, we have enabled the MMU and set up the page tables
// when the kernel booted up, it was running in the user address space,
// but now we can switch to the kernel address space
TEXT jump_stack(SB), 1, $-4
	// R12 defined by the linker is relative to the kernel address
	// so we couldn't use it until now
	MOVW    $setR12(SB), R12
	
	// setup stack pointer to be in kernel virtual address now
	ADD     $(KERNBASE), SP

	// jump to the address of main, main is in the kernel address range
	// we couldn't load this before the MMU is setup properly
	MOVW    $kmain(SB), PC

/* start.c */

#include "types.h"
#include "memlayout.h"
#include "arm.h"
#include "mmu.h"

void jump_stack(void);
void load_pgtbl(u32 *, u32 *);

int set_me;

static void
print0(char *s)
{
	volatile u8 *p = (void *)UART0;
	for (; *s; s++)
		*p = *s;
}

// set up the boot page table; dev_mem says whether it is device memory
static void
set_bootpgtbl(u32 virt, u32 phy, uint len, int dev_mem)
{
	u32 pde, *user_pgtbl, *kernel_pgtbl;
	int idx;

	user_pgtbl = (u32 *)_user_pgtbl;
	kernel_pgtbl = (u32 *)_kernel_pgtbl;

	// convert all the parameters to indexes
	virt >>= PDE_SHIFT;
	phy >>= PDE_SHIFT;
	len >>= PDE_SHIFT;

	for (idx = 0; idx < len; idx++) {
		pde = (phy << PDE_SHIFT);

		if (!dev_mem) {
			// normal memory, make it kernel-only, cachable, bufferable
			pde |= (AP_KO << 10) | PE_CACHE | PE_BUF | KPDE_TYPE;
		} else {
			// device memory, make it non-cachable and non-bufferable
			pde |= (AP_KO << 10) | KPDE_TYPE;
		}

		// use different page table for user/kernel space
		if (virt < NUM_UPDE) {
			user_pgtbl[virt] = pde;
		} else {
			kernel_pgtbl[virt] = pde;
		}

		virt++;
		phy++;
	}
}

void
start(void)
{
	// print out a message to let us know that we made it here
	// we can't use a char* to a string constant because it is not
	// accessible yet: R12 has not been set up and we haven't
	// relocated to the kernel address space
	// we put the string on the stack instead so we can print it
	char msg[] = "Hello from start(), about to setup the MMU\n";
	print0(msg);

	// double map the memory required for paging
	// we do not map all the physical memory
	set_bootpgtbl(0, 0, INIT_KERNMAP, 0);
	set_bootpgtbl(KERNBASE, 0, INIT_KERNMAP, 0);

	// map the vector table
	set_bootpgtbl(VEC_TBL, 0, 1 << PDE_SHIFT, 0);

	// map the devices so we can use them once we enable the MMU
	set_bootpgtbl(KERNBASE + DEVBASE, DEVBASE, DEV_MEM_SZ, 1);

	// load the page table
	load_pgtbl((u32 *)_kernel_pgtbl, (u32 *)_user_pgtbl);

	// we have set up the MMU now, jump to the kernel address space
	jump_stack();
}
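
The index arithmetic in set_bootpgtbl can be tried out on a host machine. This is a minimal sketch rather than the demo's code: the helper names are made up, and the constants are the values visible in the disassembly above (a shift of 20, a limit of 256 entries, and a TTBCR value of 4, which implies UADDR_BITS is 28) rather than a copy of mmu.h.

```c
#include <stdint.h>

/* values read off the disassembly, not copied from mmu.h */
#define PDE_SHIFT  20                                /* 1 MB sections */
#define UADDR_BITS 28                                /* TTBCR value is 32 - 28 = 4 */
#define NUM_UPDE   (1u << (UADDR_BITS - PDE_SHIFT))  /* 256 user entries */

/* first-level table index for a virtual address */
uint32_t pde_index(uint32_t virt)
{
	return virt >> PDE_SHIFT;
}

/* 1 if set_bootpgtbl would store the entry in the user table */
int uses_user_table(uint32_t virt)
{
	return pde_index(virt) < NUM_UPDE;
}

/* number of 1 MB entries written for a mapping of len bytes */
uint32_t pde_count(uint32_t len)
{
	return len >> PDE_SHIFT;
}
```

For the calls in start, this gives: virtual address 0 lands in entry 0 of the user table, KERNBASE (0x80000000) lands in entry 0x800 of the kernel table, and the device mapping at 0x90000000 with a length of 0x8000000 fills 128 kernel-table entries starting at 0x900.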

We start in _start on bootup and clear out the memory for the data structures that we will set up for the MMU and stack. Then we jump to start in C, which makes it easier to set up the data structures for the MMU. start sets up the page tables, including the identity mapping, using set_bootpgtbl and then calls load_pgtbl to tell the CPU where the page tables are and to enable the MMU. We then call jump_stack to switch over to the virtual address space by moving the stack pointer and program counter into it. jump_stack jumps to kmain, and inside kmain we test various constructs such as global variable access, string constants, and pointer dereferencing.
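
The two magic numbers that load_pgtbl writes, 0x55555555 and 0x80300D, can be rebuilt from their individual fields. The sketch below is illustrative only: the macro and function names are made up, and the bit positions are taken from the comments in the assembly and the usual layout of the ARM control register, so check them against the processor manual for your CPU.

```c
#include <stdint.h>

/* control register bits, named after the comment in load_pgtbl;
 * the positions are an assumption to verify against the manual */
#define CR_MMU      (1u << 0)    /* enable MMU */
#define CR_DCACHE   (1u << 2)    /* enable data cache */
#define CR_WBUF     (1u << 3)    /* enable write buffer */
#define CR_ICACHE   (1u << 12)   /* enable instruction cache */
#define CR_HIGHVEC  (1u << 13)   /* high vector table */
#define CR_NOSUBPG  (1u << 23)   /* disable subpages */

/* value ORed into the control register by load_pgtbl */
uint32_t boot_control_bits(void)
{
	return CR_MMU | CR_DCACHE | CR_WBUF | CR_ICACHE | CR_HIGHVEC | CR_NOSUBPG;
}

/* domain access control value with all 16 domains set to the same
 * 2-bit mode; mode 1 means accesses are checked against permissions */
uint32_t dacr_all(uint32_t mode)
{
	uint32_t v = 0;

	for (int d = 0; d < 16; d++)
		v |= (mode & 3u) << (2 * d);
	return v;
}
```

Here boot_control_bits() evaluates to 0x80300D and dacr_all(1) to 0x55555555, matching the constants in the assembly.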

/* kmain.c */

#include "types.h"
#include "arm.h"
#include "memlayout.h"

#define nelem(x) (sizeof(x) / sizeof(x[0]))

void
putc(int c)
{
	volatile u8 *p = (void *)P2V(UART0);
	*p = c;
}

void
putln(char *s)
{
	for (; *s; s++)
		putc(*s);
	putc('\n');
}

void
puthex(uint v)
{
	char *hex = "0123456789ABCDEF";
	int i;

	for (i = sizeof(v) * 8 - 4; i >= 0; i -= 4)
		putc(hex[(v >> i) & 0xf]);
	putc('\n');
}

// test if we can access global variables now
static char bar[] = "bar";
char *foo = "foo";
static char *tab[] = {
    "tab1",
    "tab2",
    "tab3"};
int idx;
int *pidx;

extern int set_me;

struct test_struct {
	char *t1;
	char *t2;
	int t3;
} ts[] = {
    {"xxx", "yyy", 0x30},
    {"aaa", "yyy", 0x40},
    {"bbb", "yyy", 0x50},
    {"ccc", "yyy", 0x60},
    {"ddd", "yyy", 0x70},
    {"eee", "yyy", 0x80},
    {"fff", "yyy", 0x90},
    {"ggg", "yyy", 0x100},
    {"hhh", "yyy", 0x110},
};

void
kmain(void)
{
	putln("we are in kmain(), with the MMU enabled!");
	putln("running some tests to see if we can access strings and variables");
	putln(bar);
	putln(foo);
	for (idx = 0; idx < nelem(tab); idx++)
		putln(tab[idx]);
	puthex((uint)&idx);

	pidx = &idx;
	*pidx = 30;
	puthex(*pidx);

	set_me = 0xdeadbeef;
	puthex(set_me);
	puthex((uint)&set_me);

	for (set_me = 0; set_me < nelem(ts); set_me++) {
		putln(ts[set_me].t1);
		putln(ts[set_me].t2);
		puthex(ts[set_me].t3);
		puthex((uint)&ts[set_me].t1);
		puthex((uint)&ts[set_me].t2);
		puthex((uint)&ts[set_me].t3);
		putc('\n');
	}

	// we did not setup a return when we got to here so we don't want
	// to jump into space on return, just loop forever
	for (;;)
		;
}

If all went well, we should now be executing code in the virtual address space. Here is the QEMU output:

$ qemu-system-arm -M versatilepb -m 128M -nographic -kernel plan9
Hello from start(), about to setup the MMU
we are in kmain(), with the MMU enabled!
running some tests to see if we can access strings and variables
bar
foo
tab1
tab2
tab3
80011008
0000001E
DEADBEEF
80011000
xxx
yyy
00000030
80011020
80011024
80011028

aaa
yyy
00000040
8001102C
80011030
80011034

bbb
yyy
00000050
80011038
8001103C
80011040

ccc
yyy
00000060
80011044
80011048
8001104C

ddd
yyy
00000070
80011050
80011054
80011058

eee
yyy
00000080
8001105C
80011060
80011064

fff
yyy
00000090
80011068
8001106C
80011070

ggg
yyy
00000100
80011074
80011078
8001107C

hhh
yyy
00000110
80011080
80011084
80011088

You can get the code for the demo at this URL.