1 Purpose and Scope
Operating system memory allocation functions are typically designed to manipulate many megabytes of data, and carry significant performance overheads, largely due to page faults and page switching. This module is therefore an important means of isolating a process from the worst of the overheads inherent in dealing with large blocks of memory.
This is especially important at the application level, where a reserved memory block may be only eight or sixteen bytes in size. When using many such small blocks, it would be wasteful (and in fact ineffective) to request a marginally larger block size in the hope of reducing the delays inherent in managing large blocks. Alternatively, one might cursorily consider using a significantly larger block of memory (perhaps a megabyte) to store the intended small (8 or 16 byte) block. Approached from this perspective the problem is solved tremendously wastefully: for a given delay one receives more memory, even if no practical benefit is seen.
Approaching the problem in this way leads to the solution which VMM implements. An algorithm is provided which internally allocates arbitrarily sized memory blocks from the system pool; these are then subdivided and passed to the process or thread programmer for efficient use. The arbitrary system memory block size is most logically set to an integer number of pages on the target system, which naturally gives optimal memory performance.
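The subdivision scheme described above can be sketched as a simple fixed-size pool. This is an illustrative sketch only, not VMM's actual interface: the names, the notional 4096-byte block size and the free-list layout are all assumptions made for the example.

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical sketch: one "sysmem" block (a notional 4096-byte page)
 * is obtained from the OS in a single call, then carved into fixed-size
 * "usrmem" cells threaded onto a free list.  Subsequent allocations and
 * frees touch only the list head, avoiding the OS entirely. */

#define SYSMEM_BLOCK_SIZE 4096

typedef struct pool {
    void  *sysmem;       /* one large block obtained from the OS        */
    void  *free_list;    /* head of the free-cell chain                 */
    size_t cell_size;    /* size of each usrmem cell handed to callers  */
} pool_t;

static pool_t pool_create(size_t cell_size)
{
    pool_t p;
    if (cell_size < sizeof(void *))
        cell_size = sizeof(void *);          /* a cell must hold a link */
    p.cell_size = cell_size;
    p.sysmem = malloc(SYSMEM_BLOCK_SIZE);    /* the one OS-level call   */
    p.free_list = NULL;
    /* thread every cell onto the free list, last cell first */
    size_t n = SYSMEM_BLOCK_SIZE / cell_size;
    for (size_t i = n; i-- > 0; ) {
        void *cell = (char *)p.sysmem + i * cell_size;
        *(void **)cell = p.free_list;
        p.free_list = cell;
    }
    return p;
}

static void *pool_alloc(pool_t *p)           /* O(1): pop the list head */
{
    void *cell = p->free_list;
    if (cell)
        p->free_list = *(void **)cell;
    return cell;
}

static void pool_free(pool_t *p, void *cell) /* O(1): push it back */
{
    *(void **)cell = p->free_list;
    p->free_list = cell;
}
```

Note that `pool_free` recycles the cell for immediate reuse by the next `pool_alloc`, which is where the speed advantage over repeated OS-level allocation comes from.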
In practice the operating system overheads are usually great enough that the more pages the manager allocates in a single operation, the better the performance. Unfortunately, the instantaneous allocation requirement of a given process can vary greatly, so most applications will require some degree of tuning to achieve the optimum “gearing” between time and space.
VMM provides complementary facilities and tools to evaluate the dynamic allocation requirement of a process, together with the controls necessary to tailor the “gearing”.
Usrmem is what the process requests; sysmem is what VMM allocates from the operating system to fulfil those requests. Inherent in the practice of allocating small blocks of usrmem from larger blocks of sysmem is the problem that sysmem can become fragmented. To achieve a speed improvement over the operating system capability, the quantity of sysmem allocated at any given instant is likely to be larger than the total usrmem requirement.
Part of the process of improving the “gearing” of usrmem allocation is minimising the difference between sysmem and usrmem allocation. VMM does this well, but a problem typically arises when sysmem is allocated and only partially used as usrmem. Where the total usrmem allocation does not exceed a single sysmem block, memory is recycled very efficiently.
Where the total usrmem allocation exceeds a sysmem block, a condition can arise in which a single small usrmem allocation requires the presence of an entire sysmem block.
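The cost of that condition is easy to quantify. The helper below is a hypothetical illustration (the 4096-byte block size is an assumption, not VMM's fixed value): if each sysmem block is pinned by a single live 16-byte usrmem allocation, the ratio of sysmem held to usrmem in use — the “gearing” — is 256:1.

```c
#include <stddef.h>

/* Gearing ratio, scaled by 100 to stay in integer arithmetic:
 * (total sysmem held) / (live usrmem bytes).  100 means a perfect 1:1. */
static size_t gearing_x100(size_t sysmem_blocks, size_t sysmem_block_size,
                           size_t live_usrmem_bytes)
{
    return (sysmem_blocks * sysmem_block_size * 100) / live_usrmem_bytes;
}
```

For example, eight 4096-byte blocks each pinned by one 16-byte allocation give a gearing of 256:1, whereas a single fully used block gives 1:1.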
External to VMM this is an impossible problem to resolve, and it exists equally in conventional “C style” memory managers. It occurs because the programmer naturally requires indirect addressing of the memory block for speed and convenience. To implement indirect addressing the programmer holds a pointer to the memory requested, but the environment typically provides no native scheme by which VMM can communicate a pointer change back to the process.
Compaction of usrmem allocations across multiple fragmented sysmem blocks is “merely” a scheme whereby usrmem blocks are moved between sysmem blocks. This can dramatically improve the “gearing” by freeing largely unallocated sysmem blocks. Owing to the lack of a communication mechanism described above, however, it is often impossible to pass the benefits of such a compaction scheme on to a typical programmer.
Internally VMM uses and manages compaction where it makes sense to do so. Whilst the standard external interface does not offer a scheme for compaction, such a feature can be offered to potential clients who are prepared to implement the additional capability required of their processes to make it work reliably.
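The “additional capability” a client must implement usually amounts to accessing memory through a handle rather than a raw pointer. The sketch below is hypothetical and not VMM's shipped interface: the client holds an index into a table owned by the manager and dereferences it on every access, so the manager is free to move a block and repoint the table entry without telling the client anything.

```c
#include <string.h>
#include <stddef.h>

#define MAX_HANDLES 16

typedef struct {
    void *slot[MAX_HANDLES];   /* slot[h] holds the block's current address */
} handle_table_t;

static int handle_bind(handle_table_t *t, int h, void *addr)
{
    t->slot[h] = addr;         /* register a block under handle h */
    return h;
}

/* Client-side contract: every access goes through the table. */
static void *handle_deref(handle_table_t *t, int h) { return t->slot[h]; }

/* One compaction step: move the payload, then repoint the slot.
 * The client's handle value is unchanged, so no pointer change
 * ever needs to be communicated to the client process. */
static void compact_move(handle_table_t *t, int h, void *dst, size_t n)
{
    memcpy(dst, t->slot[h], n);
    t->slot[h] = dst;
}
```

The extra dereference on each access is the price the client pays for the manager's freedom to compact.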
In addition to this possibility there are simple schemes implemented to protect against the worst effects of these problems, without the development overhead of compaction.
Where a usrmem allocation request exceeds a given proportion of a standard sysmem block, the size of the sysmem block created for it is locally adjusted down to fit the request exactly. This prevents single big usrmem blocks becoming fragmented, because blocks greater than the threshold size can only ever be fully allocated or entirely free. It does, however, reintroduce the system overhead for such allocations. This is less of a penalty for blocks above the threshold: even if their allocation is slow, their “gearing” is naturally better.
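The sizing rule above can be expressed in a few lines. This is a hedged sketch: the standard block size and the quarter-block threshold are illustrative assumptions, not VMM's configured values.

```c
#include <stddef.h>

#define STD_SYSMEM_SIZE 4096
#define BIG_NUM 1              /* assumed threshold: 1/4 of a standard block */
#define BIG_DEN 4

/* Decide how large a sysmem block to create for a usrmem request.
 * Requests above the threshold get an exact-fit private block, so such
 * a block is either fully allocated or wholly free and cannot be left
 * fragmented by smaller co-tenants. */
static size_t sysmem_size_for(size_t request)
{
    if (request * BIG_DEN > STD_SYSMEM_SIZE * BIG_NUM)
        return request;        /* exact-fit private block */
    return STD_SYSMEM_SIZE;    /* shares a standard block */
}
```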
An alternative scheme for avoiding the large-block compaction problem is simply to use more than one instance of VMM where it is predictable that a given process operation will require a significant memory allocation. Instances of VMM can be created as peers or nested for better-“geared” performance. In this situation an instance of VMM can have a lifespan matching that of a process, a thread or a function.
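A function-lifespan instance can be sketched as a simple arena; the names and the bump-allocation strategy are assumptions for the example, not VMM's internals. Everything the operation allocates comes from its own instance, and destroying the instance releases the lot in one step, leaving no long-lived fragmentation behind.

```c
#include <stdlib.h>
#include <stddef.h>

typedef struct {
    char  *base;               /* the instance's private sysmem */
    size_t used, cap;
} arena_t;

static arena_t arena_create(size_t cap)      /* start of the operation */
{
    arena_t a = { malloc(cap), 0, cap };
    return a;
}

static void *arena_alloc(arena_t *a, size_t n)
{
    n = (n + 7u) & ~(size_t)7u;              /* keep 8-byte alignment */
    if (a->used + n > a->cap)
        return NULL;                         /* instance exhausted */
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

static void arena_destroy(arena_t *a)        /* end of the operation:   */
{                                            /* all blocks freed at once */
    free(a->base);
    a->base = NULL;
    a->used = a->cap = 0;
}
```

A caller would create the arena on entry to the operation, allocate freely from it, and destroy it on exit, giving the instance exactly the lifespan of the function.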