Some design problems never seem to go away. You think that anybody who has been in the embedded software development business for a while must have learned to be wary of malloc() and free() (or their C++ counterparts new and delete). Then you find that many developers actually don’t know why embedded real-time systems are so particularly intolerant of heap problems.
For example, recently an Embedded.com reader attacked my comment to the article “Back to the Basics – Practical Embedded Coding Tips: Part 1 Reentrancy, atomic variables and recursion“, in which I advised against using the heap. Here is this reader’s argumentation:
I have no idea why did you bring up the pledge not to use the heap, on modern 32-bit MCUs (ARMs etc) there is no reason – and no justification – to avoid using the heap. The only reason not to use the heap is to avoid memory fragmentation, but good heap implementation and careful memory allocation planning will overcome that.
As I cannot disagree more with the statements above, I decided that it’s perhaps the time to re-post my “heap of problems” list, which goes as follows:
- Dynamically allocating and freeing memory can fragment the heap over time to the point that the program crashes because of an inability to allocate more RAM. The total remaining heap storage might be more than adequate, but no single piece satisfies a specific malloc() request.
- Heap-based memory management is wasteful. All heap management algorithms must maintain some form of header information for each block allocated. At the very least, this information includes the size of the block. For example, if the header causes a four-byte overhead, then a four-byte allocation requires at least eight bytes, so only 50 percent of the allocated memory is usable to the application. Because of these overheads and the aforementioned fragmentation, determining the minimum size of the heap is difficult. Even if you were to know the worst-case mix of objects simultaneously allocated on the heap (which you typically don’t), the required heap storage is much more than a simple sum of the object sizes. As a result, the only practical way to make the heap more reliable is to massively oversize it.
- Both malloc() and free() can be (and often are) nondeterministic, meaning that they potentially can take a long (hard to quantify) time to execute, which conflicts squarely with real-time constraints. Although many RTOSs have heap management algorithms with bounded, or even deterministic performance, they don’t necessarily handle multiple small allocations efficiently.
Unfortunately, the list of heap problems doesn’t stop there. A new class of problems appears when you use heap in a multithreaded environment. The heap becomes a shared resource and consequently causes all the headaches associated with resource sharing, so the list goes on:
- Both malloc() and free() can be (and often are) non-reentrant; that is, they cannot be safely called simultaneously from multiple threads of execution.
- The reentrancy problem can be remedied by protecting malloc(), free(), realloc(), and so on internally with a mutex, which lets only one thread at a time access the shared heap. However, this scheme could cause excessive blocking of threads (especially if memory management is nondeterministic) and can significantly reduce parallelism. Mutexes can also be subject to priority inversion. Naturally, the heap management functions protected by a mutex are not available to interrupt service routines (ISRs) because ISRs cannot block.
Finally, all the problems listed previously come on top of the usual pitfalls associated with dynamic memory allocation. For completeness, I’ll mention them here as well.
- If you destroy all pointers to an object and fail to free it or you simply leave objects lying about well past their useful lifetimes, you create a memory leak. If you leak enough memory, your storage allocation eventually fails.
- Conversely, if you free a heap object but the rest of the program still believes that pointers to the object remain valid, you have created dangling pointers. If you dereference such a dangling pointer to access the recycled object (which by that time might be already allocated to somebody else), your application can crash.
- Most of the heap-related problems are notoriously difficult to test. For example, a brief bout of testing often fails to uncover a storage leak that kills a program after a few hours, or weeks, of operation. Similarly, exceeding a real-time deadline because of nondeterminism can show up only when the heap reaches a certain fragmentation pattern. These types of problems are extremely difficult to reproduce.