At first glance, dynamic allocation of objects in C++ is simple: new to allocate, delete to deallocate, and you're done. However, under the hood, the issue is much more complex and allows a considerable level of customization. This may not be important for simple applications, but is essential when you need better control of memory in your code, whether by writing a custom allocator, some kind of advanced memory management scheme or a specialized garbage collector.
This article doesn't aim to be a comprehensive manual, but a brief overview of the various ways memory can be allocated in C++. It's not basic, and assumes a good familiarity with the language.
Raw operator new
Let's start with the raw operator new. Consider this code, which allocates space for 5 integers and returns a pointer to it [1]:
int* v = static_cast<int*>(::operator new(5 * sizeof(*v)));
When called like this, operator new acts as a raw memory allocator, similar to malloc. The above line is conceptually equivalent to:
int* v = static_cast<int*>(malloc(5 * sizeof(*v)));
Freeing memory allocated with the raw operator new is done with the raw operator delete:
::operator delete(v);
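To see the whole round trip in one place, here is a minimal sketch that uses the raw functions end to end. The helper name sum_first_n is made up for this illustration; it only shows the allocate, use, release sequence described above.

```cpp
#include <cstddef>
#include <new>

// Hypothetical helper (not part of the article's code): sums 0..n-1 using
// storage obtained from the raw operator new instead of a new-expression.
int sum_first_n(std::size_t n)
{
    // Raw allocation: no constructors run, we only get bytes.
    int* v = static_cast<int*>(::operator new(n * sizeof(int)));

    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        v[i] = static_cast<int>(i);  // int is trivial, so plain assignment suffices
        sum += v[i];
    }

    // Raw deallocation: pairs with ::operator new, never with free().
    ::operator delete(v);
    return sum;
}
```

Calling sum_first_n(5) walks over five ints stored in the raw block and returns their sum.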
Would you ever use the raw new and delete functions? Yes, in some rare cases, as I'll demonstrate later in the article. Why use them instead of the old and trusted malloc and free? One good reason is that you want to keep your code wholly in the C++ domain. Mixing new with free (or malloc with delete) is a big no-no: the behavior is undefined. Another reason is that you can replace the global versions of these functions with your own if you need to. Here's an example:
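#include <cstdlib>
#include <iostream>
#include <new>

// Replaces the global operator new: log the request, then defer to malloc.
// (Dynamic exception specifications such as throw(std::bad_alloc) were
// removed in C++17, so none is used here.)
void* operator new(std::size_t sz)
{
    std::cerr << "allocating " << sz << " bytes\n";
    void* mem = std::malloc(sz);
    if (mem)
        return mem;
    else
        throw std::bad_alloc();
}

// Replaces the global operator delete; it must not throw.
void operator delete(void* ptr) noexcept
{
    std::cerr << "deallocating at " << ptr << std::endl;
    std::free(ptr);
}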
In general, keep in mind that the global operator new function is called when the new operator is used to allocate objects of built-in types and objects of class type that do not contain user-defined operator new functions. Arrays of any type go through the array form, operator new[], whose default global version simply calls operator new. When the new operator is used to allocate objects of a class type where an operator new is defined, that class's operator new is called.
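The scalar/array distinction is easy to check. The sketch below uses a hypothetical class Widget whose class-specific operator new merely counts its calls. Allocating a single Widget goes through it, while allocating an array of Widget does not, because the array form looks for operator new[] and, finding none in the class, falls back to the global one.

```cpp
#include <cstddef>
#include <new>

// Hypothetical class used only for illustration.
struct Widget {
    static int scalar_news;  // how many times Widget::operator new ran

    static void* operator new(std::size_t sz)
    {
        ++scalar_news;
        return ::operator new(sz);
    }
    static void operator delete(void* p) noexcept
    {
        ::operator delete(p);
    }

    int value = 0;
};

int Widget::scalar_news = 0;

// Returns the number of Widget::operator new calls made by one scalar
// allocation followed by one array allocation.
int count_class_new_calls()
{
    Widget::scalar_news = 0;

    Widget* one = new Widget;      // uses Widget::operator new
    delete one;                    // uses Widget::operator delete

    Widget* many = new Widget[3];  // uses the global operator new[]
    delete[] many;                 // uses the global operator delete[]

    return Widget::scalar_news;
}
```

If you want arrays of a class to use custom allocation too, the class has to define operator new[] and operator delete[] as well.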
And this brings us to classes with operator new.
Class-specific operator new
People sometimes wonder what's the difference between "operator new" and the "new operator". The former refers to either an overloaded operator new, global or class-specific, or the raw operator new function presented earlier. The latter refers to the built-in C++ new operator you usually employ to allocate memory, as in:
Car* mycar = new Car;
C++ supports operator overloading, and one of the operators it lets us overload is new. Here's an example:
#include <cstddef>
#include <iostream>
#include <vector>

class Base
{
public:
    // Implicitly static; sz is the size of the object being allocated.
    void* operator new(std::size_t sz)
    {
        std::cerr << "new " << sz << " bytes\n";
        return ::operator new(sz);
    }

    void operator delete(void* p)
    {
        std::cerr << "delete\n";
        ::operator delete(p);
    }
private:
    int m_data;
};

class Derived : public Base
{
private:
    int m_derived_data;
    std::vector<int> z, y, x, w;
};

int main()
{
    Base* b = new Base;
    delete b;

    Derived* d = new Derived;
    delete d;
    return 0;
}
Which prints the following on my machine (the exact sizes depend on the platform and the standard library):
new 4 bytes
delete
new 56 bytes
delete
The overloaded operator new and operator delete in the base class are also inherited by derived classes. As you can see, the operator new method gets the correct size to allocate in both cases. Note also that to actually allocate the memory, it uses ::operator new, the raw operator new described in the previous section. The double-colon in the call is essential in this case to avoid infinite recursion (without it the method would just call itself).
Why would you overload operator new for a class? There are many reasons.
- Performance: the default memory allocator is designed to be general purpose. Sometimes you have very specific objects you want to allocate, and by customizing the way they're allocated you can speed up memory management considerably. A lot of books and articles discuss this issue. Notably, chapter 4 in "Modern C++ Design" presents a very well designed and implemented custom allocator for small objects.
- Debugging & statistics: having full control of the way memory is allocated and released provides great flexibility for debugging, statistics and performance analysis. You can make your allocator insert special guards to detect buffer overruns, keep accounting of allocations vs. deallocations to detect memory leaks, count various metrics for statistics and performance analysis, and much more.
- Customization: for non-standard memory allocation schemes. One good example is pools or arenas for certain objects, which make memory management simpler. Another is a full-fledged garbage collection system for certain objects - this is all made possible by writing your custom operators new and delete for a class or a whole hierarchy.
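As a small taste of the debugging and statistics use case, here is a minimal sketch of a base class that counts live heap objects. The names Counted, Particle and live_while_one_exists are invented for this example, and the counter is not thread-safe; a real leak detector would need more care.

```cpp
#include <cstddef>
#include <new>

// Hypothetical base class: any class deriving from it gets its heap
// allocations counted.
class Counted {
public:
    static void* operator new(std::size_t sz)
    {
        void* mem = ::operator new(sz);  // if this throws, the counter is untouched
        ++live_;
        return mem;
    }
    static void operator delete(void* p) noexcept
    {
        if (p) --live_;
        ::operator delete(p);
    }
    // Number of allocated-but-not-yet-freed objects.
    static int live() { return live_; }

private:
    static int live_;
};

int Counted::live_ = 0;

// Inherits the counting operator new/delete from Counted.
struct Particle : Counted {
    double x = 0, y = 0;
};

// Demo: returns the live count observed while exactly one Particle exists.
int live_while_one_exists()
{
    Particle* p = new Particle;
    int seen = Counted::live();
    delete p;
    return seen;
}
```

A nonzero Counted::live() at program exit would hint that some object was allocated and never deleted.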
It's educational to look at the way the new operator works in C++. Allocation is a two step process:
- First, raw memory is requested from the OS, represented by the global operator new function.
- Once that memory is granted, the new object is constructed in it.
The C++ FAQ presents a really nice code sample I'd like to reproduce here:
When you write this code:
Foo* p = new Foo();
What the compiler generates is functionally similar to:
Foo* p;

// don't catch exceptions thrown by the allocator itself
void* raw = operator new(sizeof(Foo));

// catch any exceptions thrown by the ctor
try {
    p = new(raw) Foo();  // call the ctor with raw as this
}
catch (...) {
    // oops, ctor threw an exception
    operator delete(raw);
    throw;  // rethrow the ctor's exception
}
The funny syntax inside the try statement is called "placement new", and we'll discuss it shortly. For completeness' sake, let's see a similar breakdown for freeing an object with delete, which is also a two-step process:
- First, the destructor of the object that's being deleted is called.
- Then, the memory occupied by the object is returned to the OS, represented by the global operator delete function.
So:
delete p;
Is equivalent to [2]:
if (p != NULL) {
    p->~Foo();
    operator delete(p);
}
This is also a good place to repeat something I've mentioned in the first section of this article - if a class has its own operator new or operator delete, these get invoked instead of the global functions when an object is allocated or deallocated.
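The cleanup path of the two-step allocation can be observed directly. In the sketch below, the hypothetical class Fragile counts calls to its own operator new and operator delete, and its constructor always throws. The memory is handed back without any delete in user code, which matches the try/catch expansion shown earlier.

```cpp
#include <cstddef>
#include <new>
#include <stdexcept>

// Hypothetical class whose constructor always fails.
struct Fragile {
    static int allocations;
    static int deallocations;

    static void* operator new(std::size_t sz)
    {
        ++allocations;
        return ::operator new(sz);
    }
    static void operator delete(void* p) noexcept
    {
        ++deallocations;
        ::operator delete(p);
    }

    Fragile() { throw std::runtime_error("constructor failed"); }
};

int Fragile::allocations = 0;
int Fragile::deallocations = 0;

// Returns true if the failed new-expression released the memory it took.
bool memory_released_after_throw()
{
    Fragile::allocations = Fragile::deallocations = 0;
    try {
        Fragile* f = new Fragile;  // step 1 succeeds, step 2 throws
        delete f;                  // never reached
    } catch (const std::runtime_error&) {
        // The compiler-generated code already called Fragile::operator delete.
    }
    return Fragile::allocations == 1 && Fragile::deallocations == 1;
}
```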
Placement new
Now, back to that "placement new" we saw in the code sample above. It happens to be a real syntax we can use in our C++ code. First, I want to briefly explain how it works. Then, we'll see when it can be useful.
Calling placement new directly skips the first step of object allocation. We don't ask for memory from the OS. Rather, we tell it where there's memory to construct the object in [3]. The following code sample should clarify this:
#include <iostream>
#include <new>  // declares placement new

int main(int argc, const char* argv[])
{
    // A "normal" allocation. Asks the OS for memory, so we
    // don't actually know where this ends up pointing.
    //
    int* iptr = new int;
    std::cerr << "Addr of iptr = " << iptr << std::endl;

    // Create a buffer large enough to hold an integer (and
    // suitably aligned for one), and note its address.
    //
    alignas(int) char mem[sizeof(int)];
    std::cerr << "Addr of mem = " << (void*) mem << std::endl;

    // Construct the new integer inside the buffer 'mem'.
    // The address is going to be mem's.
    //
    int* iptr2 = new (mem) int;
    std::cerr << "Addr of iptr2 = " << iptr2 << std::endl;

    delete iptr;
    return 0;
}
For a particular run on my machine it prints:
Addr of iptr = 0x8679008
Addr of mem = 0xbfdd73d8
Addr of iptr2 = 0xbfdd73d8
As you can see, the mechanics of placement new are quite simple. What's more interesting is the question - why would we need something like this? It turns out placement new is quite useful in a few scenarios:
- Custom non-intrusive memory management. While overloading operator new for a class also allows custom memory management, the key concept here is non-intrusive. Overloading operator new requires you to change the source code of a class. But suppose we have a class the code of which we don't want or can't change. How can we still control its allocation? Placement new is the answer here. A common programming technique that uses placement new for this purpose is memory pools, sometimes also called "arenas" [4].
- In some applications it is necessary to allocate objects in specific memory regions. One example is shared memory. Another is embedded applications or drivers with memory mapped peripherals, which can be controlled conveniently by objects allocated "on top" of them.
- Many container libraries pre-allocate large buffers of memory. When new objects are added they have to be constructed in these buffers, so placement new is used. The prime example is probably the standard vector container.
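To make the first scenario a bit more tangible, here is a minimal sketch of an arena built on placement new. Everything in it (Arena, its 256-byte buffer, Point, arena_demo) is invented for illustration. It is a simple bump allocator, not a production pool: it never frees individual objects and never runs destructors, so as written it only suits trivially destructible types.

```cpp
#include <cstddef>
#include <memory>   // std::align
#include <new>
#include <utility>  // std::forward

// Hypothetical fixed-size arena: hands out aligned slices of one buffer.
class Arena {
public:
    // Constructs a T inside the arena, or returns nullptr when it is full.
    template <typename T, typename... Args>
    T* create(Args&&... args)
    {
        void* p = buffer_ + used_;
        std::size_t space = sizeof(buffer_) - used_;
        // std::align moves p forward to a suitable boundary and shrinks
        // space accordingly; it returns nullptr if T does not fit.
        if (!std::align(alignof(T), sizeof(T), p, space))
            return nullptr;
        used_ = (sizeof(buffer_) - space) + sizeof(T);
        return new (p) T(std::forward<Args>(args)...);  // placement new
    }

    std::size_t used() const { return used_; }

private:
    alignas(std::max_align_t) unsigned char buffer_[256];
    std::size_t used_ = 0;
};

// A class we pretend we cannot modify.
struct Point {
    Point(int x_, int y_) : x(x_), y(y_) {}
    int x, y;
};

// Demo: builds two Points inside the arena and sums their coordinates.
int arena_demo()
{
    Arena arena;
    Point* a = arena.create<Point>(1, 2);
    Point* b = arena.create<Point>(3, 4);
    return a->x + a->y + b->x + b->y;
}
```

Note that Point knows nothing about the arena, which is exactly the non-intrusive property mentioned above.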
Deleting an object allocated with placement new
One of the maxims of C++ is that objects allocated with new should be deallocated with delete. Is this also true for objects allocated with placement new? Not quite:
int main(int argc, const char* argv[])
{
    alignas(int) char mem[sizeof(int)];
    int* iptr2 = new (mem) int;

    delete iptr2;  // Whoops, segmentation fault!

    return 0;
}
To understand why delete iptr2 in the snippet causes a segmentation fault (or some other kind of memory violation, depending on the operating system), let's recall the description of what delete iptr2 actually does:
- First, the destructor of the object that's being deleted is called.
- Then, the memory occupied by the object is returned to the OS, represented by the global operator delete function.
There's no problem with the first step for an object allocated with placement new, but the second one looks suspicious. Attempting to free memory that was not actually allocated by the memory allocator is definitely a bad thing, but it's exactly what the code sample above does. iptr2 points to some location on the stack which was not allocated with global operator new. And yet, delete iptr2 will try to deallocate it with global operator delete. Segmentation fault indeed.
So what do we do? How do we properly delete iptr2? Well, we surely can't expect the compiler to figure out how to deallocate the memory - after all, we just pass a pointer to placement new - that pointer could've been taken from the stack, from some memory pool or somewhere else. So deallocation has to be manual.
As a matter of fact, the placement new used above is just a special case of a generalized placement new syntax allowed by C++ for specifying extra arguments in new. The standard header <new> declares it, and a typical implementation looks like this:
inline void* operator new(std::size_t, void* __p) noexcept
{
    return __p;
}
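For every placement operator new, C++ looks for a matching operator delete that takes the same extra arguments, and calls it automatically if the constructor throws during the placement new expression. A plain delete expression never uses it. This one is also defined in <new>:
inline void operator delete(void*, void*) noexcept
{
}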
Indeed, the C++ runtime just doesn't know how to deallocate such an object, so the delete is a no-op.
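The generalized syntax is not limited to a void* argument. The following sketch defines a placement operator new that takes a hypothetical AllocLog object as its extra argument, together with the matching placement operator delete. The names AllocLog, Exploding and placement_delete_called are mine, chosen for this example. The constructor throws on purpose, so that we can see the matching delete being invoked.

```cpp
#include <cstddef>
#include <new>
#include <stdexcept>

// Hypothetical bookkeeping type passed as the extra placement argument.
struct AllocLog {
    int allocated = 0;
    int released = 0;
};

// Placement form with a user-defined extra argument: new (log) T(...)
void* operator new(std::size_t sz, AllocLog& log)
{
    ++log.allocated;
    return ::operator new(sz);
}

// Matching placement delete: same extra parameters after the void*.
// The compiler calls it only when the constructor throws.
void operator delete(void* p, AllocLog& log) noexcept
{
    ++log.released;
    ::operator delete(p);
}

// Hypothetical class whose constructor always fails.
struct Exploding {
    Exploding() { throw std::runtime_error("constructor failed"); }
};

// Returns true if the failed placement new-expression invoked the
// matching placement delete exactly once.
bool placement_delete_called()
{
    AllocLog log;
    try {
        Exploding* e = new (log) Exploding;
        (void)e;  // never reached
    } catch (const std::runtime_error&) {
        // Nothing to clean up: the placement delete already ran.
    }
    return log.allocated == 1 && log.released == 1;
}
```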
What about destruction? For an int, no destruction is really needed, but suppose the code were:
alignas(Foo) char mem[sizeof(Foo)];
Foo* fooptr = new (mem) Foo;
for some non-trivial class Foo. What do we do to destroy the object fooptr points to once we don't need it anymore? We have to call its destructor explicitly:
fooptr->~Foo();
Yes, calling the destructor explicitly is actually valid in C++, and this is probably one of the few cases where it makes sense to do it [5].
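Putting the pieces together, here is a minimal sketch of the full lifecycle of an object built with placement new. This particular Foo is a stand-in I made up: it owns a std::string and keeps a count of live instances so the effect of the explicit destructor call is visible. The buffer is declared with alignas so that it is suitably aligned for a Foo.

```cpp
#include <new>
#include <string>

// Hypothetical non-trivial class: it owns a std::string, so skipping its
// destructor would leak the string's heap buffer.
struct Foo {
    static int live;  // number of constructed, not yet destroyed Foos

    explicit Foo(const std::string& n) : name(n) { ++live; }
    ~Foo() { --live; }

    std::string name;
};

int Foo::live = 0;

// Returns true if exactly one Foo was alive between construction and the
// explicit destructor call, and none afterwards.
bool placement_lifecycle()
{
    // Storage with the right size and alignment for one Foo.
    alignas(Foo) unsigned char mem[sizeof(Foo)];

    Foo* fooptr = new (mem) Foo("a name long enough to need the heap");
    bool alive_inside = (Foo::live == 1);

    fooptr->~Foo();  // destroy the object; 'mem' is released when it goes out of scope
    return alive_inside && Foo::live == 0;
}
```

There is no delete anywhere: the destructor call ends the object's lifetime, and the stack buffer goes away on its own.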
Conclusion
This is a complex topic, and the article only served as an introduction, giving a "quick taste" of the various methods C++ provides for memory allocation. There are many interesting gotchas and programming tricks once you start going down some specific road (for example, implementing a pool allocator). These are best presented in their own context and not as part of a general introductory article. If you want to go deeper, check the Resources section below.
Resources
- C++ FAQ Lite, especially items 11.14 and 16.9
- "The C++ Programming Language, 3rd edition" by Bjarne Stroustrup - 10.4.11
- "Effective C++, 3rd edition" by Scott Meyers - item 52
- "Modern C++ Design" by Andrei Alexandrescu - chapter 4
- Several StackOverflow discussions. Start with this one and browse as long as your patience lasts.
[1] I'm writing :: before operator new explicitly although it's not strictly required in this case. IMHO this is a good practice, especially when used inside overloaded operator new methods to avoid ambiguity.
[2] Note the check for NULL. It's the reason for delete p being safe even when p is NULL - another C++ FAQ.
[3] It is solely your responsibility that the pointer passed to placement new points to enough memory for the object, and that it's also correctly aligned.
[4] Memory pools are a large and fascinating topic by themselves. I can't cover them in any meaningful depth here, so I encourage you to look up more information online. Wikipedia is a good start, as usual.
[5] In fact, the standard vector container uses it to destruct objects it holds.