Showing posts with label mapping. Show all posts

Saturday, February 8, 2014

Kernel Space and User Space

A detailed understanding of kernel space and user space is very important if you want a strong foundation in the Linux kernel.


  • Here, kernel space and user space refer to regions of the virtual address space.
  • Every process in Linux has its own separate virtual address space.
  • On a Linux system based on a 32-bit architecture, user space typically corresponds to the lower 3 GB of the virtual address space and kernel space to the upper 1 GB (the usual split).
  • The kernel part of the virtual address space is shared between all processes.
  • When a process is active, it is running in either "user mode" or "kernel mode".
  • If a process is running in user mode, the CPU is executing user space code.
  • A process running in user mode has limited privileges; the current privilege level is controlled by a flag in the CPU.
  • Even though the kernel memory is present in the process's memory map, user space code is not allowed to access it directly (it can be reached only in certain special ways).
  • When a process wants to do something other than move data around in its own (user space) virtual memory, such as opening a file, it must make a syscall to communicate with the kernel.
  • Each CPU architecture has its own way of making a system call, but the basics remain the same:
  • A magic instruction is executed; the CPU turns on the "privileged mode" flag and jumps to a special address in kernel space, the "syscall entry point" (read the separate post to understand what a syscall is).
  • Once the syscall has entered kernel space, the process is running in kernel mode and executing instructions from kernel space memory.
  • Taking the same example of the open system call: to find the requested file, the kernel may consult filesystem drivers (to figure out where the file is) and block device drivers (to load the necessary blocks from disk), or network device drivers and protocols (to load the file from a remote source).
  • These drivers can be either built in or loaded as modules, but the key point is that they are part of kernel space.
  • Loading a module is done with a syscall that asks the kernel to copy the module's code and data into kernel space and run its initialization code in kernel mode.
  • If the kernel can't complete the request immediately, it puts the process to sleep; when the request is complete, the syscall returns to user space.
  • Returning to user mode means restoring the CPU registers to what they were before entering kernel mode and changing the CPU privilege level back to non-privileged.
  • Apart from syscalls, some other events also take the CPU into kernel mode, e.g.:

1. Page faults - If the process tries to access a virtual memory address that doesn't have a physical page assigned to it, the CPU enters kernel mode and jumps to the page fault handler. The kernel checks whether the virtual address is valid. If it is, the kernel tries to create a physical page for the given virtual address; if it is not, the kernel sends the process a segmentation fault signal (SIGSEGV).

2. Interrupts - When the CPU receives an interrupt from the hardware, it switches to kernel mode and executes the interrupt handler. When the kernel has finished handling the interrupt, execution returns to the user space code that was running.
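
The syscall round trip described in the list above can be observed from user space. Here is a minimal Python sketch, assuming a Unix-like system: `os.write`, `os.lseek`, `os.read`, `os.close` and `os.unlink` are thin wrappers around the syscalls of the same name, and `os.times()` reports how much CPU time the process has spent in user mode and in kernel mode. The helper name `roundtrip` and the temporary file are illustrative only.

```python
import os
import tempfile

def roundtrip(payload: bytes) -> bytes:
    """Write payload to a temporary file and read it back via raw syscalls."""
    fd, path = tempfile.mkstemp()          # the file is opened with open(2) inside mkstemp
    try:
        os.write(fd, payload)              # write(2): user mode -> kernel mode -> back
        os.lseek(fd, 0, os.SEEK_SET)       # lseek(2): rewind to the start of the file
        return os.read(fd, len(payload))   # read(2): the kernel copies data into our buffer
    finally:
        os.close(fd)                       # close(2)
        os.unlink(path)                    # unlink(2): remove the temporary file

before = os.times()
data = roundtrip(b"hello from user space")
after = os.times()

print(data)
# CPU time accounted to each mode; for a single small call both may print 0.0.
print("user mode:  ", after.user - before.user)
print("kernel mode:", after.system - before.system)
```

Every line marked with a syscall name is a point where the CPU switches to kernel mode, runs kernel code on behalf of the process, and then returns to user mode.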

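Page faults can be counted in a similar way. The sketch below assumes a Unix-like system where the `resource` module is available: it creates an anonymous mapping, touches one byte in every page, and compares the process's minor page fault counter before and after. The mapping size is arbitrary, and the exact fault count depends on the kernel (huge pages, for example, can reduce it).

```python
import mmap
import resource

PAGE = mmap.PAGESIZE
N_PAGES = 4096          # arbitrary size: 16 MB with 4 KB pages

def minor_faults() -> int:
    """Minor page faults taken by this process so far."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt

buf = mmap.mmap(-1, N_PAGES * PAGE)   # anonymous mapping: a virtual address range is reserved
start = minor_faults()
for i in range(N_PAGES):
    buf[i * PAGE] = 1                 # the first write to a page typically traps into the kernel
faults = minor_faults() - start
buf.close()

print(f"touched {N_PAGES} pages, counted {faults} minor page faults")
```

On a typical Linux system the counter grows by roughly one fault per touched page, because the kernel only assigns a physical page when the address is first used.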
 

Tuesday, February 4, 2014

Linux Addressing



  • There are two kinds of addresses used in Linux: 1. virtual addresses (logical addresses) and 2. physical addresses.
  • Physical addresses are the addresses of locations in RAM.
  • Virtual addresses, as the name says, are virtual, i.e. they do not point to any location in RAM directly but must be converted into physical addresses by the MMU at run time.
  • On a 32-bit system there are 2^32 possible virtual addresses, i.e. 4 GB of virtual address space.
  • Each process is given a separate virtual address space, so that each one behaves as if it had access to the full memory and processes remain independent of each other.
  • The virtual address space is divided into two parts: 1. user space virtual addresses and 2. kernel space virtual addresses.




  • In the general layout, user space is given the lower 3 GB (0x00000000 .. 0xbfffffff) and kernel space the upper 1 GB (0xc0000000 .. 0xffffffff). Other splits are also possible, but the diagrams here show the general way of doing it.
  • Virtual memory is not really committed until it is actually used. When you allocate memory, you get an "IOU" or "promise" for memory, but physical memory is only consumed when you actually use it, for example by writing a value to it.
  • A process running in user space may need to request services from the kernel (e.g. via syscalls), so the kernel image is fully mapped into the kernel virtual space.
  • Suppose a user space application makes use of libc; then libc must also be mapped into the virtual address space of the process.
  • This mapping is kept until the process terminates.
  • Since the kernel needs to be able to access all of physical memory, the entire physical memory must be mapped into the kernel virtual space (1 GB).
  • The mapping is one to one.
  • For the mapping to be truly one to one, the RAM would have to fit within that 1 GB, but this is not always the case.
  • We will consider two cases: 1. the RAM is less than 1 GB, say 512MB, and 2. the RAM is more than 1 GB, say 3GB.
     process address space(virtual address space)

    4GB +---------------+
        |     512MB     |
        +---------------+ <------+     physical memory               
        |     512MB     |        | 
    3GB +---------------+ <--+   +---> +------------+ 
        |               |    |         |   512 MB   |
        |     /////     |    +-------> +------------+
        |               |     
    0GB +---------------+     


  • When the RAM is 512MB, the whole of RAM is linearly mapped to the lower 512MB of the kernel virtual space.
  • The kernel's virtual address space is divided into two parts: 1. low memory and 2. high memory.
  • The first 896MB constitutes the low memory region, or LOWMEM.
  • The top 128MB is called the high memory region, or HIGHMEM.
  • The first 16MB of LOWMEM is reserved for DMA usage.
  • Physical memory is divided into zones:
  • ZONE_DMA - Contains page frames of memory below 16 MB
  • ZONE_NORMAL - Contains page frames of memory at and above 16 MB and below 896 MB
  • ZONE_HIGHMEM - Contains page frames of memory at and above 896 MB
  • So, if you have 512 MB of RAM, ZONE_HIGHMEM will be empty, ZONE_DMA will hold the first 16 MB, and ZONE_NORMAL will have the remaining 496 MB of physical memory mapped.
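
The zone boundaries listed above turn into simple arithmetic. The following Python sketch is only an illustration of those numbers (the function names `zone_of` and `zone_sizes_mb` are invented here, not kernel APIs); it uses the 16 MB and 896 MB limits from the list.

```python
MB = 1 << 20
DMA_LIMIT = 16 * MB        # ZONE_DMA: page frames below 16 MB
LOWMEM_LIMIT = 896 * MB    # ZONE_NORMAL ends here, ZONE_HIGHMEM starts here

def zone_of(phys_addr: int) -> str:
    """Name of the zone that contains the given physical address."""
    if phys_addr < DMA_LIMIT:
        return "ZONE_DMA"
    if phys_addr < LOWMEM_LIMIT:
        return "ZONE_NORMAL"
    return "ZONE_HIGHMEM"

def zone_sizes_mb(ram_mb: int) -> dict:
    """Size of each zone, in MB, for a machine with ram_mb of RAM."""
    ram = ram_mb * MB
    dma = min(ram, DMA_LIMIT)
    normal = max(0, min(ram, LOWMEM_LIMIT) - DMA_LIMIT)
    high = max(0, ram - LOWMEM_LIMIT)
    return {"ZONE_DMA": dma // MB, "ZONE_NORMAL": normal // MB, "ZONE_HIGHMEM": high // MB}

print(zone_sizes_mb(512))    # {'ZONE_DMA': 16, 'ZONE_NORMAL': 496, 'ZONE_HIGHMEM': 0}
print(zone_sizes_mb(4096))   # {'ZONE_DMA': 16, 'ZONE_NORMAL': 880, 'ZONE_HIGHMEM': 3200}
print(zone_of(900 * MB))     # ZONE_HIGHMEM
```

The 512 MB case reproduces the 496 MB figure above, and a 4096 MB machine ends up with 3200 MB of physical memory that does not fit in LOWMEM.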




                                       physical memory
       process address space    +------> +------------+
                                |        |  3200 M    |
                                |        |            |
    4GB +---------------+ <-----+        |  HIGH MEM  |
        |     128 MB    |                |            |
        +---------------+ <---------+    |            |
        +---------------+ <------+  |    |            | 
        |     896 MB    |        |  +--> +------------+         
    3GB +---------------+ <--+   +-----> +------------+ 
        |               |    |           |   896 MB   |
        |     /////     |    +---------> +------------+
        |               |     
    0GB +---------------+     

  • As we can see from the diagram, the physical memory above 896MB is reached through the 128MB HIGHMEM region. Since it cannot all fit there at once, pages are mapped into this region temporarily, as needed.
  • When we request a large chunk of memory using vmalloc(), the pages may come from HIGHMEM as well. The memory we get is contiguous in virtual address space, but the underlying physical pages are not necessarily contiguous.
  • For a user space application, if you print the address held in a pointer you defined, it will be a virtual address in the 0-3GB range.

  • What about the kernel? If you print a pointer's address in the kernel, is it always from the kernel address space? The answer is no: since the kernel can also access user space addresses, the pointer can hold either kind, depending on where it came from.
  • Distinguishing between kernel space and user space addresses is easy: if the address falls in the 0-3GB range, it is from user space; otherwise, it is from kernel space. The programmer sees virtual addresses only.
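
The 0-3GB rule can be written as a one-line check. This Python sketch assumes the 32-bit 3 GB / 1 GB split discussed in this post, with kernel space starting at 0xC0000000; the function name `address_space` is made up for illustration.

```python
PAGE_OFFSET = 0xC0000000   # start of kernel space with the 3 GB / 1 GB split

def address_space(vaddr: int) -> str:
    """Classify a 32-bit virtual address as 'user' or 'kernel'."""
    if not 0 <= vaddr <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    return "kernel" if vaddr >= PAGE_OFFSET else "user"

for addr in (0x08048000, 0xBFFFFD0C, 0xC0000000, 0xC1234567):
    print(f"{addr:#010x} -> {address_space(addr)}")
# 0x08048000 -> user
# 0xbffffd0c -> user
# 0xc0000000 -> kernel
# 0xc1234567 -> kernel
```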

  • To make things clear from the user's point of view, let's look at the user space address layout of a compiled program.
    +----------------------+ <--- 0xBFFF FFFF (=3GB)
    | environment variable |
    |----------------------| <--- 0xBFFF FD0C 
    | stacks (down grow)   |
    |          |           |
    |          v           |
    |----------------------|
    |                      |
    |     // free memory   |
    |                      |
    |                      |
    |----------------------| <--------------------
    |  myprogram.o         |                    |
    |----------------------|                    |
    |  mylib.o             |                    |
    |----------------------|              executable image      
    |  myutil.o            |                    |     
    |----------------------|                    |
    |  library code (libc) |                    |
    |----------------------| <--------------------                
    |                      | 0x8000 0000 (=2GB) 
    |                      |
    |  other memory        |
    |                      |
    +----------------------+
  PAGE_OFFSET = 0xC0000000 = 3GB

  For a 512MB system, the kernel's linearly mapped virtual addresses run from PAGE_OFFSET (3GB) to PAGE_OFFSET + 512MB.
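
The linear mapping of low memory amounts to adding or subtracting PAGE_OFFSET, which is what the kernel's `__va()` and `__pa()` macros do for lowmem addresses. The Python sketch below only mirrors that arithmetic; it assumes PAGE_OFFSET = 0xC0000000 and the 896MB LOWMEM limit from above, and the function names are illustrative, not kernel APIs.

```python
PAGE_OFFSET = 0xC0000000
MB = 1 << 20
LOWMEM_MAX = 896 * MB      # only this much RAM is linearly mapped

def phys_to_virt(paddr: int) -> int:
    """Kernel virtual address of a lowmem physical address."""
    if not 0 <= paddr < LOWMEM_MAX:
        raise ValueError("physical address is not in the linearly mapped region")
    return paddr + PAGE_OFFSET

def virt_to_phys(vaddr: int) -> int:
    """Physical address behind a linearly mapped kernel virtual address."""
    if not PAGE_OFFSET <= vaddr < PAGE_OFFSET + LOWMEM_MAX:
        raise ValueError("virtual address is not in the linearly mapped region")
    return vaddr - PAGE_OFFSET

def lowmem_range(ram_mb: int) -> tuple:
    """Start and end (exclusive) of the linear mapping for ram_mb of RAM."""
    mapped = min(ram_mb * MB, LOWMEM_MAX)
    return PAGE_OFFSET, PAGE_OFFSET + mapped

start, end = lowmem_range(512)
print(f"{start:#010x} .. {end:#010x}")      # 0xc0000000 .. 0xe0000000
print(hex(phys_to_virt(0x00100000)))        # 0xc0100000
```

For a 512MB system the linear mapping therefore ends at 0xE0000000, i.e. PAGE_OFFSET + 512MB.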