LINUX KERNEL INTERNALS: mmap() and munmap()

In computing, mmap(2) is a POSIX-compliant Unix system call that maps files or devices into memory.
It is a method of memory-mapped file I/O.
It naturally implements demand paging, because initially file contents are not entirely read from disk and do not use physical RAM at all.
The actual reads from disk are performed lazily, when a specific location is first accessed. Once the mapping is no longer needed, it is important to release it with munmap(2).
A memory-mapped file is a segment of virtual memory which has been assigned a direct byte-for-byte correlation with some portion of a file or file-like resource.
This resource is typically a file that is physically present on-disk, but can also be a device, shared memory object, or other resource that the operating system can reference through a file descriptor.
Once present, this correlation between the file and the memory space permits applications to treat the mapped portion as if it were primary memory.
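
As a minimal sketch of treating a mapped file like ordinary memory, the function below counts how often a byte occurs in a file by scanning a read-only mapping. The name count_byte_in_file is made up for this illustration, and the code assumes a Linux or other POSIX system.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: count how often byte `c` occurs in the file at
 * `path` by scanning a read-only mapping. Returns -1 on error. */
long count_byte_in_file(const char *path, unsigned char c)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {            /* mmap() rejects a zero length */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                        /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return -1;

    long count = 0;
    for (size_t i = 0; i < len; i++)  /* pages are read in on first access */
        if (p[i] == c)
            count++;

    munmap(p, len);                   /* release the mapping when done */
    return count;
}
```

Calling count_byte_in_file("notes.txt", '\n') would return the number of newline bytes in that file; no read(2) call appears anywhere in the loop.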
Memory mapping lets data move between user and kernel space without an explicit copy, which often makes it the fastest way to handle large amounts of data.
There is a major difference between the conventional read(2) and write(2) functions and mmap.
While data is transferred with mmap, no "control" messages are exchanged.
This means that a user space process can put data into the memory, but that the kernel does not know that new data is available. The same holds for the opposite scenario: The kernel puts its data into the shared memory, but the user space process does not get a notification of this event.
This characteristic implies that memory mapping has to be used with some other communication means that transfers control messages, or that the shared memory needs to be checked in regular intervals for new content.
Like the read and write calls, mmap can be used with different file systems and, on Linux, with a few socket types such as packet sockets.
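
The missing notification can be sketched with two processes sharing one anonymous mapping. In this illustration (the helper name pass_value_via_shared_memory is hypothetical), the child stores a value, and the parent uses the child's exit, observed through waitpid(2), as the separate control channel. Real programs would more likely use a semaphore, pipe, or futex.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std modes */
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: a child stores `value` in shared memory; the parent
 * learns that the data is ready only because waitpid() returns.
 * Returns the value read by the parent, or -1 on error. */
int pass_value_via_shared_memory(int value)
{
    int *slot = mmap(NULL, sizeof *slot, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (slot == MAP_FAILED)
        return -1;
    *slot = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(slot, sizeof *slot);
        return -1;
    }
    if (pid == 0) {             /* child: write the data, then exit */
        *slot = value;
        _exit(0);
    }

    /* Parent: the store itself sends no notification, so the child's
     * exit serves as the control message. */
    int status;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = *slot;

    munmap(slot, sizeof *slot);
    return result;
}
```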

From the Linux man page:


#include <sys/mman.h>

void *mmap(void *addr, size_t length, int prot, int flags,
           int fd, off_t offset);
int munmap(void *addr, size_t length);




mmap() creates a new mapping in the virtual address space of the
calling process.  The starting address for the new mapping is
specified in addr.  The length argument specifies the length of the
mapping.

If addr is NULL, then the kernel chooses the address at which to
create the mapping; this is the most portable method of creating a
new mapping.  If addr is not NULL, then the kernel takes it as a hint
about where to place the mapping; on Linux, the mapping will be
created at a nearby page boundary.  The address of the new mapping is
returned as the result of the call.
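
The difference between a NULL address and a hint can be sketched as follows. The helpers map_with_hint and is_page_aligned are invented for this example; the only behavior assumed is what the text above states, namely that the kernel is free to ignore the hint and that the result lies on a page boundary.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std modes */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map `len` bytes of anonymous memory, passing `hint`
 * as the preferred address (NULL lets the kernel choose).
 * Returns the address actually used, or NULL on failure. */
void *map_with_hint(void *hint, size_t len)
{
    void *p = mmap(hint, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Returns 1 if `p` lies on a page boundary, 0 otherwise. */
int is_page_aligned(const void *p)
{
    long page = sysconf(_SC_PAGE_SIZE);
    return (uintptr_t)p % (uintptr_t)page == 0;
}
```

Callers should always use the returned address rather than the hint, since the two may differ.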



The contents of a file mapping are initialized using length bytes starting
at offset offset in the file (or other object) referred to by the
file descriptor fd.  offset must be a multiple of the page size as
returned by sysconf(_SC_PAGE_SIZE).
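
Because offset must be page-aligned, mapping data that starts at an arbitrary file position usually means rounding the offset down and skipping the extra bytes. The sketch below shows one way to do that; struct file_window and map_file_window are hypothetical names, and negative offsets are not handled.

```c
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical type: remembers what must later be passed to munmap(). */
struct file_window {
    void  *base;    /* page-aligned address returned by mmap() */
    size_t maplen;  /* length that was passed to mmap() */
    char  *data;    /* points at the byte for the requested offset */
};

/* Map `len` bytes of `fd` starting at an arbitrary `offset` by rounding
 * the offset down to a page boundary. Returns 0 on success, -1 on error. */
int map_file_window(int fd, off_t offset, size_t len, struct file_window *w)
{
    long page = sysconf(_SC_PAGE_SIZE);
    off_t aligned = offset - (offset % page);   /* multiple of the page size */
    size_t slack = (size_t)(offset - aligned);  /* bytes skipped at the front */

    void *base = mmap(NULL, len + slack, PROT_READ, MAP_PRIVATE, fd, aligned);
    if (base == MAP_FAILED)
        return -1;

    w->base = base;
    w->maplen = len + slack;
    w->data = (char *)base + slack;
    return 0;
}
```

The caller reads through w.data and later calls munmap(w.base, w.maplen), since munmap() needs the aligned address.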



The prot argument describes the desired memory protection of the
mapping (and must not conflict with the open mode of the file).  It
is either PROT_NONE or the bitwise OR of one or more of the following
flags:

PROT_EXEC  Pages may be executed.
PROT_READ  Pages may be read.
PROT_WRITE Pages may be written.
PROT_NONE  Pages may not be accessed.
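
The remark that prot must not conflict with the open mode can be checked with a small experiment: request a shared, writable mapping on a descriptor opened read-only. On Linux the mmap(2) man page documents EACCES for this case. The helper name shared_write_map_on_readonly_fd is made up for this sketch.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: open `path` read-only, then request a shared,
 * writable mapping of its first `len` bytes. Returns 0 if the mapping
 * succeeded, otherwise the errno value from open() or mmap(). */
int shared_write_map_on_readonly_fd(const char *path, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    int err = (p == MAP_FAILED) ? errno : 0;

    if (p != MAP_FAILED)
        munmap(p, len);
    close(fd);
    return err;
}
```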





The flags argument determines whether updates to the mapping are
visible to other processes mapping the same region, and whether
updates are carried through to the underlying file.  This behavior is
determined by including exactly one of the following values in flags:

MAP_SHARED
Share this mapping.  Updates to the mapping are visible to
other processes that map this file, and are carried
through to the underlying file.  The file may not actually
be updated until msync(2) or munmap() is called.

MAP_PRIVATE
Create a private copy-on-write mapping.  Updates to the
mapping are not visible to other processes mapping the
same file, and are not carried through to the underlying
file.  It is unspecified whether changes made to the file
after the mmap() call are visible in the mapped region.
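
The practical difference between the two values can be sketched by writing through a mapping and then reading the file back. The helper below is hypothetical and uses a temporary file under /tmp, which is an assumption of this example.

```c
#define _DEFAULT_SOURCE   /* for mkstemp() and pread() under strict -std modes */
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: create a one-byte temp file containing 'A', map it
 * with `map_flag` (MAP_SHARED or MAP_PRIVATE), store 'B' through the
 * mapping, and return the byte the file holds afterwards (-1 on error). */
int byte_in_file_after_mapped_write(int map_flag)
{
    char path[] = "/tmp/mmap_flags_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                       /* the file lives until fd is closed */

    int result = -1;
    if (write(fd, "A", 1) == 1) {
        char *p = mmap(NULL, 1, PROT_READ | PROT_WRITE, map_flag, fd, 0);
        if (p != MAP_FAILED) {
            p[0] = 'B';
            if (map_flag == MAP_SHARED)
                msync(p, 1, MS_SYNC);   /* make sure the file is updated */
            munmap(p, 1);

            char c;
            if (pread(fd, &c, 1, 0) == 1)
                result = c;
        }
    }
    close(fd);
    return result;
}
```

With MAP_SHARED the file ends up holding 'B'; with MAP_PRIVATE the store lands in a private copy and the file still holds 'A'.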





In addition, zero or more of the following values can be ORed in
flags:

MAP_32BIT (since Linux 2.4.20, 2.6)
Put the mapping into the first 2 Gigabytes of the process
address space.  This flag is supported only on x86-64, for
64-bit programs.

MAP_ANON
Synonym for MAP_ANONYMOUS.  Deprecated.








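MAP_ANONYMOUS
The mapping is not backed by any file; its contents are
initialized to zero.  The fd and offset arguments are ignored;
however, some implementations require fd to be -1 if MAP_ANONYMOUS
(or MAP_ANON) is specified, and portable applications should ensure
this.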
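
A short sketch of an anonymous mapping used as a zero-filled allocation follows. The helper names alloc_anonymous and all_zero are invented for this example; fd is passed as -1 for portability.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std modes */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: obtain `len` bytes of zero-filled memory that is
 * not backed by any file. Returns NULL on failure. */
void *alloc_anonymous(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Returns 1 if all `len` bytes at `p` are zero, 0 otherwise. */
int all_zero(const void *p, size_t len)
{
    const unsigned char *b = p;
    for (size_t i = 0; i < len; i++)
        if (b[i] != 0)
            return 0;
    return 1;
}
```

Memory obtained this way is released with munmap(p, len), not free().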











MAP_DENYWRITE
This flag is ignored.

MAP_EXECUTABLE
This flag is ignored.

MAP_FILE
Compatibility flag.  Ignored.









MAP_FIXED
Don't interpret addr as a hint: place the mapping at exactly
that address.  Because requiring a fixed address for a mapping
is less portable, the use of this option is discouraged.
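
One situation where MAP_FIXED is commonly considered safe is placing a mapping inside address space the program has already reserved for itself. The sketch below illustrates that pattern; the helper name fixed_page_in_reservation is hypothetical, and the pattern is shown as an example rather than as a recommendation from the man page.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std modes */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: reserve two pages of address space with PROT_NONE,
 * then use MAP_FIXED to place a usable page exactly on the second one.
 * On success stores the reservation in *base_out / *total_out and returns
 * the address of the usable page; returns NULL on failure. */
void *fixed_page_in_reservation(void **base_out, size_t *total_out)
{
    size_t page = (size_t)sysconf(_SC_PAGE_SIZE);
    size_t total = 2 * page;

    char *base = mmap(NULL, total, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* The target range belongs to our own reservation, so replacing what
     * is mapped there cannot clobber anything else in the process. */
    void *want = base + page;
    void *got = mmap(want, page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (got == MAP_FAILED) {
        munmap(base, total);
        return NULL;
    }

    *base_out = base;
    *total_out = total;
    return got;
}
```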




MAP_GROWSDOWN
Used for stacks.  Indicates to the kernel virtual memory
system that the mapping should extend downward in memory.

MAP_HUGETLB (since Linux 2.6.32)
Allocate the mapping using "huge pages."  See the Linux kernel
source file Documentation/vm/hugetlbpage.txt for further
information.

MAP_LOCKED (since Linux 2.5.37)
Lock the pages of the mapped region into memory in the manner
of mlock(2).  This flag is ignored in older kernels.

MAP_NONBLOCK (since Linux 2.5.46)
Only meaningful in conjunction with MAP_POPULATE.  Don't
perform read-ahead: create page tables entries only for pages
that are already present in RAM.

MAP_NORESERVE
Do not reserve swap space for this mapping.  When swap space
is reserved, one has the guarantee that it is possible to
modify the mapping.  When swap space is not reserved one might
get SIGSEGV upon a write if no physical memory is available.

MAP_POPULATE (since Linux 2.5.46)
Populate (prefault) page tables for a mapping.  For a file
mapping, this causes read-ahead on the file.  Later accesses
to the mapping will not be blocked by page faults.
MAP_POPULATE is supported for private mappings only since
Linux 2.6.23.
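
As a sketch of how MAP_POPULATE might be used, the function below maps a whole file read-only and asks the kernel to prefault it, trading a slower mmap() call for fewer page faults later. The helper name map_file_prefaulted is hypothetical, and the flag is Linux-specific.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_POPULATE under strict -std modes */
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map the whole file at `path` read-only with
 * MAP_POPULATE. On success returns the mapping and stores its length in
 * *len_out; returns NULL on failure or for an empty file. */
void *map_file_prefaulted(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    size_t len = (size_t)st.st_size;
    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE | MAP_POPULATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return NULL;

    *len_out = len;
    return p;
}
```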




MAP_STACK (since Linux 2.6.27)
Allocate the mapping at an address suitable for a process or
thread stack.  This flag is currently a no-op, but is used in
the glibc threading implementation so that if some
architectures require special treatment for stack allocations,
support can later be transparently implemented for glibc.

MAP_UNINITIALIZED (since Linux 2.6.33)
Don't clear anonymous pages.  This flag is intended to improve
performance on embedded devices.  This flag is honored only if
the kernel was configured with the
CONFIG_MMAP_ALLOW_UNINITIALIZED option.  Because of the
security implications, that option is normally enabled only on
embedded devices (i.e., devices where one has complete control
of the contents of user memory).




Of the above flags, only MAP_FIXED is specified in POSIX.1-2001.
However, most systems also support MAP_ANONYMOUS (or its synonym
MAP_ANON).

Some systems document the additional flags MAP_AUTOGROW,
MAP_AUTORESRV, MAP_COPY, and MAP_LOCAL.






A file is mapped in multiples of the page size.  For a file that is
not a multiple of the page size, the remaining memory is zeroed when
mapped, and writes to that region are not written out to the file.
The effect of changing the size of the underlying file of a mapping
on the pages that correspond to added or removed regions of the file
is unspecified.
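
The zero-filled tail of the last page can be observed directly. The sketch below (hypothetical helper, temporary file under /tmp assumed) maps a file shorter than one page and inspects the bytes that lie past the end of the file but still inside the mapped page.

```c
#define _DEFAULT_SOURCE   /* for mkstemp() under strict -std modes */
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: create a temp file holding `n` bytes of 'x'
 * (0 < n < page size), map it, and return 1 if every byte of the page
 * past the end of the file reads as zero, 0 if not, -1 on error. */
int tail_of_last_page_is_zero(size_t n)
{
    size_t page = (size_t)sysconf(_SC_PAGE_SIZE);
    if (n == 0 || n >= page)
        return -1;

    char path[] = "/tmp/mmap_tail_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                   /* the file lives until fd is closed */

    int ok = 1;
    for (size_t i = 0; i < n && ok; i++)
        ok = (write(fd, "x", 1) == 1);

    int result = -1;
    if (ok) {
        /* Only n bytes are requested, but mappings cover whole pages. */
        unsigned char *p = mmap(NULL, n, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            result = 1;
            for (size_t i = n; i < page; i++)
                if (p[i] != 0)
                    result = 0;
            munmap(p, n);
        }
    }
    close(fd);
    return result;
}
```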





munmap()





The munmap() system call deletes the mappings for the specified
address range, and causes further references to addresses within the
range to generate invalid memory references.  The region is also
automatically unmapped when the process is terminated.  On the other
hand, closing the file descriptor does not unmap the region.

The address addr must be a multiple of the page size.  All pages
containing a part of the indicated range are unmapped, and subsequent
references to these pages will generate SIGSEGV.  It is not an error
if the indicated range does not contain any mapped pages.
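
The munmap() rules above can be exercised on a small anonymous mapping. The function below is a hypothetical self-check written for this post; it assumes Linux behavior, where an unaligned address is rejected with EINVAL.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std modes */
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: try three munmap() cases on a two-page mapping.
 * Returns 0 if all behave as described, otherwise the number of the
 * first step that did not. */
int check_munmap_rules(void)
{
    size_t page = (size_t)sysconf(_SC_PAGE_SIZE);
    char *p = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return 1;

    /* An address that is not a multiple of the page size is rejected. */
    errno = 0;
    if (munmap(p + 1, page) != -1 || errno != EINVAL)
        return 2;

    /* Unmapping only the second page leaves the first one usable. */
    if (munmap(p + page, page) != 0)
        return 3;
    p[0] = 'x';

    /* Unmapping a range that contains no mapped pages is not an error. */
    if (munmap(p + page, page) != 0)
        return 4;

    return munmap(p, page) == 0 ? 0 : 5;
}
```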


Monday, October 20, 2014
