Camkes Arm VMM use of huge pages

I’ve been playing around with the Arm camkes VMM. Recently I noticed that it uses 4K pages when mapping VM memory, despite the RAM being configured for 256MB and aligned to 1GB. I’m wondering why it is not using huge pages. Does this mean I’m misconfiguring or missing some configuration? Or is this the default behavior? If so, is there any way I can make it use 2MB huge pages with some simple hacking? I need some help here because, when digging through the code, I found it a bit overwhelming given the amount of indirection in the memory reservation and mapping paths…

Thanks in advance,

Hey @josecm :wave:
Sorry if this is a late reply! But in case you’re still wondering, I’m happy to try and answer your questions.

I’m wondering why it is not using huge pages. Does this mean I’m misconfiguring or missing some configuration? Or is this the default behavior?

You aren’t missing any configuration. Using 4K pages when mapping the VM memory is indeed the default behavior of the ARM VMM and its underlying libraries (libsel4vm and libsel4vmmplatsupport). Why is it not using huge pages? The ARM VMM could definitely be improved to support large pages; however, this has just never been implemented (to my limited knowledge at least). I think it would be worth officially implementing support, but that is probably out of scope for this discussion (sorry if this is a bit of a non-answer).

If so, is there any way I can make it use 2MB huge pages with some simple hacking?

In the name of hacking (I don’t suggest upstreaming these changes :sweat_smile:), we can definitely rework the ARM VMM implementation to use large pages.

The first place I would look is components/VM_Arm/src/modules/init_ram.c (relative to the camkes-vm repository). This module is loaded by the camkes ARM VMM to initialize the RAM region for the guest VM. Specifically, it calls vm_ram_register_at to register the RAM region. If we take a look inside, vm_ram_register_at iterates over the region, allocating and mapping pages at 4K granularity; that is its default behavior. Thankfully we can override this behavior and supply our own iterator by calling vm_ram_register_at_custom_iterator instead. Hence we could rework the module to do something like the following:

void WEAK init_ram_module(vm_t *vm, void *cookie)
{
    int err;
    ...
    err = vm_ram_register_at_custom_iterator(vm, linux_ram_base, linux_ram_size,
                                             my_large_page_iterator, (void *)vm);
    ...
}

with my_large_page_iterator implemented along the lines of the following:

static vm_frame_t my_large_page_iterator(uintptr_t addr, void *cookie)
{
    int error;
    vm_frame_t frame_result = { seL4_CapNull, seL4_NoRights, 0, 0 };
    vm_t *vm = (vm_t *)cookie;
    if (!vm) {
        return frame_result;
    }

    int page_size = seL4_LargePageBits; /* <--- different page size goes here */

    uintptr_t frame_start = ROUND_DOWN(addr, BIT(page_size));
    cspacepath_t path;
    error = vka_cspace_alloc_path(vm->vka, &path);
    if (error) {
        ZF_LOGE("Failed to allocate path");
        return frame_result;
    }
    seL4_Word vka_cookie;
    error = vka_utspace_alloc_at(vm->vka, &path, kobject_get_type(KOBJECT_FRAME, page_size), page_size, frame_start,
                                 &vka_cookie);
    if (error) {
        ZF_LOGE("Failed to allocate page");
        vka_cspace_free_path(vm->vka, path);
        return frame_result;
    }
    frame_result.cptr = path.capPtr;
    frame_result.rights = seL4_AllRights;
    frame_result.vaddr = frame_start;
    frame_result.size_bits = page_size;
    return frame_result;
}

The above should back the entire RAM region with large pages.

Ideally we could stop there; however, due to some underlying assumptions, there’s still a little more we need to hack. The ARM VMM loads the kernel and initrd images by copying the contents of each image to its corresponding load address within the guest’s address space. The VMM achieves this by mapping the underlying frames of the load addresses into its own address space for copying. This unfortunately assumes pages were allocated and mapped at 4K granularity. Specifically, the VMM calls vm_load_guest_kernel/vm_load_guest_module, which in turn call vm_ram_touch to perform the copy, and vm_ram_touch maps pages in 4K increments. We could very quickly and sneakily hack that function to work on large-page increments like so:

diff --git a/libsel4vm/src/guest_ram.c b/libsel4vm/src/guest_ram.c
index 1e383b4..e1a8f4e 100644
--- a/libsel4vm/src/guest_ram.c
+++ b/libsel4vm/src/guest_ram.c
@@ -161,14 +161,14 @@ int vm_ram_touch(vm_t *vm, uintptr_t addr, size_t size, ram_touch_callback_fn to
     access_cookie.data = cookie;
     access_cookie.vm = vm;
     for (current_addr = addr; current_addr < end_addr; current_addr = next_addr) {
-        uintptr_t current_aligned = PAGE_ALIGN_4K(current_addr);
-        uintptr_t next_page_start = current_aligned + PAGE_SIZE_4K;
+        uintptr_t current_aligned = PAGE_ALIGN(current_addr, BIT(seL4_LargePageBits));
+        uintptr_t next_page_start = current_aligned + BIT(seL4_LargePageBits);
         next_addr = MIN(end_addr, next_page_start);
         access_cookie.size = next_addr - current_addr;
         access_cookie.offset = current_addr - addr;
         access_cookie.current_addr = current_addr;
         int result = vspace_access_page_with_callback(&vm->mem.vm_vspace, &vm->mem.vmm_vspace, (void *)current_aligned,
-                                                      seL4_PageBits, seL4_AllRights, 1, touch_access_callback, &access_cookie);
+                                                      seL4_LargePageBits, seL4_AllRights, 1, touch_access_callback, &access_cookie);
         if (result) {
             return result;
         }

The above change is definitely hacky. You could reasonably tidy it up, however, by updating the interfaces to pass along the page size backing the RAM regions (either bookkept within the guest_ram interface or taken as a parameter to vm_ram_touch). Another approach could be to expand the iterator implementation (my_large_page_iterator) to back the image regions exclusively with 4K pages (untested, however, and I’m not sure how much of an improvement that would be in this case).

An approximate combination of these two changes should hopefully let you proceed with larger page sizes (or at least give you an idea of how the camkes ARM VMM manages the RAM region). I hope that all makes sense! Happy to try to answer further questions :slight_smile:
