The many faces of operator new in C++

February 17th, 2011 at 7:03 am

At first glance, dynamic allocation of objects in C++ is simple: new to allocate, delete to deallocate, and you’re done. However, under the hood, the issue is much more complex and allows a considerable level of customization. This may not be important for simple applications, but is essential when you need better control of memory in your code, whether by writing a custom allocator, some kind of advanced memory management scheme or a specialized garbage collector.

This article doesn’t aim to be a comprehensive manual, but a brief overview of the various ways memory can be allocated in C++. It’s not basic, and assumes a good familiarity with the language.

Raw operator new

Let’s start with the raw operator new. Consider this code, which allocates space for 5 integers and returns a pointer to it [1]:

int* v = static_cast<int*>(::operator new(5 * sizeof(*v)));

When called like this, operator new acts as a raw memory allocator, similar to malloc. The above line is conceptually equivalent to:

int* v = static_cast<int*>(malloc(5 * sizeof(*v)));

Freeing memory allocated with the raw operator new is done with the raw operator delete:

::operator delete(v);
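To make the pairing concrete, here is a minimal sketch of a complete round trip through the raw functions. The helper name sum_in_raw_storage is hypothetical and exists only for illustration; note that no constructors run, so the memory must be written before it is read.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical helper: allocate raw storage for `count` ints, fill it,
// sum the values, and release the storage again.
int sum_in_raw_storage(std::size_t count)
{
    // Raw allocation: no constructors run, the bytes are uninitialized.
    int* v = static_cast<int*>(::operator new(count * sizeof(int)));
    assert(v != nullptr);  // ::operator new throws instead of returning null

    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        v[i] = static_cast<int>(i) + 1;  // write before reading
        total += v[i];
    }

    // Memory obtained from ::operator new goes back via ::operator delete.
    ::operator delete(v);
    return total;
}
```

Calling sum_in_raw_storage(5) fills the buffer with 1 through 5 and returns their sum.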

Would you ever use the raw new and delete functions? Yes, in some rare cases, as I’ll demonstrate later in the article. Why use them instead of the old and trusted malloc and free? One good reason is that you want to keep your code wholly in the C++ domain. Mixing new with free (or malloc with delete) is a big NO NO. Another reason is that you can overload or override these functions if you need to. Here’s an example:

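#include <cstdlib>
#include <iostream>
#include <new>

// Replaces the global allocation function used by plain `new`.
void* operator new(std::size_t sz)
{
    std::cerr << "allocating " << sz << " bytes\n";
    void* mem = std::malloc(sz);
    if (mem)
        return mem;
    else
        throw std::bad_alloc();
}

// Replaces the matching global deallocation function.
void operator delete(void* ptr) noexcept
{
    std::cerr << "deallocating at " << ptr << std::endl;
    std::free(ptr);
}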

In general, keep in mind that the global operator new function is called when the new operator is used to allocate objects of built-in types and objects of class type that do not contain user-defined operator new functions. Arrays of any type go through operator new[] instead, which is the global version unless the class defines its own operator new[] (the default global operator new[] simply calls operator new). When the new operator is used to allocate objects of a class type where an operator new is defined, that class’s operator new is called.

And this brings us to classes with operator new.

Class-specific operator new

People sometimes wonder what’s the difference between "operator new" and the "new operator". The former refers to either an overloaded operator new, global or class-specific, or the raw operator new function presented earlier. The latter refers to the built-in C++ new operator you usually employ to allocate memory, as in:

Car* mycar = new Car;

C++ supports operator overloading, and one of the operators it lets us overload is new. Here’s an example:

#include <cstddef>
#include <iostream>
#include <vector>
using namespace std;

class Base
{
public:
    void* operator new(size_t sz)
    {
        cerr << "new " << sz << " bytes\n";
        return ::operator new(sz);
    }

    void operator delete(void* p)
    {
        cerr << "delete\n";
        ::operator delete(p);
    }
private:
    int m_data;
};


class Derived : public Base
{
private:
    int m_derived_data;
    vector<int> z, y, x, w;
};


int main()
{
    Base* b = new Base;
    delete b;

    Derived* d = new Derived;
    delete d;
    return 0;
}

On my machine, this prints (the exact sizes depend on the platform):

new 4 bytes
delete
new 56 bytes
delete

The overloaded operator new and operator delete in the base class are also inherited by derived classes. As you can see, the operator new method gets the correct size to allocate in both cases. Note also that to actually allocate the memory, it uses ::operator new, the raw operator new described in the previous section. The double-colon in the call is essential in this case to avoid infinite recursion (without it the method would just call itself).
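One detail worth seeing in code: single objects and arrays go through different allocation functions. The sketch below uses a hypothetical class Tracked that defines both forms and counts the calls. The functions are written with an explicit static only for clarity; class-level allocation functions are static whether or not the keyword is spelled out.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class that counts which allocation function was used.
struct Tracked {
    static int single_calls;
    static int array_calls;

    static void* operator new(std::size_t sz)
    {
        ++single_calls;
        return ::operator new(sz);
    }
    static void operator delete(void* p) { ::operator delete(p); }

    static void* operator new[](std::size_t sz)
    {
        ++array_calls;
        return ::operator new[](sz);
    }
    static void operator delete[](void* p) { ::operator delete[](p); }

    int value;
};

int Tracked::single_calls = 0;
int Tracked::array_calls = 0;

void allocate_both()
{
    Tracked::single_calls = 0;
    Tracked::array_calls = 0;

    Tracked* one = new Tracked;      // uses Tracked::operator new
    Tracked* many = new Tracked[3];  // uses Tracked::operator new[]

    assert(Tracked::single_calls == 1);  // only the scalar form ran once
    assert(Tracked::array_calls == 1);   // only the array form ran once

    delete one;
    delete[] many;
}
```

If Tracked declared only operator new, the array expression would fall back to the global operator new[] and bypass the class-specific function entirely.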

Why would you overload operator new for a class? There are many reasons.

  • Performance: the default memory allocator is designed to be general purpose. Sometimes you have very specific objects you want to allocate, and by customizing the way they’re allocated you can speed up memory management considerably. A lot of books and articles discuss this issue. Notably, chapter 4 in "Modern C++ Design" presents a very well designed and implemented custom allocator for small objects.
  • Debugging & statistics: having full control of the way memory is allocated and released provides great flexibility for debugging, statistics and performance analysis. You can make your allocator insert special guards to detect buffer overruns, keep accounting of allocations vs. deallocations to detect memory leaks, count various metrics for statistics and performance analysis, and much more.
  • Customization: for non-standard memory allocation schemes. One good example is pools or arenas for certain objects, which make memory management simpler. Another is a full-fledged garbage collection system for certain objects – this is all made possible by writing your custom operators new and delete for a class or a whole hierarchy.
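As a rough illustration of the performance and customization points, here is a minimal sketch of a class that recycles its own freed blocks through a free list. The class Particle, its members and the function recycle_demo are hypothetical names; a real pool would also need to think about thread safety and about eventually returning memory, which this sketch deliberately ignores.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class that recycles its own freed blocks via a free list.
class Particle {
public:
    double x, y;

    static void* operator new(std::size_t sz)
    {
        // Requests of another size (e.g. from a derived class) and requests
        // made while the list is empty go to the global allocator.
        if (sz != sizeof(Particle) || free_list == nullptr)
            return ::operator new(sz);

        FreeBlock* block = free_list;
        free_list = block->next;
        ++reused_blocks;
        return block;
    }

    // The two-argument form receives the size of the object being deleted.
    static void operator delete(void* p, std::size_t sz)
    {
        static_assert(sizeof(Particle) >= sizeof(FreeBlock),
                      "a freed block must be able to hold a list node");
        if (p == nullptr)
            return;
        if (sz != sizeof(Particle)) {
            ::operator delete(p);
            return;
        }
        // Park the block for reuse; this sketch never hands it back.
        FreeBlock* block = ::new (p) FreeBlock{free_list};
        free_list = block;
    }

    static int reused_blocks;

private:
    struct FreeBlock { FreeBlock* next; };
    static FreeBlock* free_list;
};

int Particle::reused_blocks = 0;
Particle::FreeBlock* Particle::free_list = nullptr;

void recycle_demo()
{
    Particle* a = new Particle;
    delete a;                            // block is parked on the free list
    const int before = Particle::reused_blocks;
    Particle* b = new Particle;          // served from the free list
    assert(Particle::reused_blocks == before + 1);
    delete b;
}
```

The design choice here is the usual one for small, frequently allocated objects: the first allocation pays the full price, and later ones cost a couple of pointer assignments.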

It’s educational to look at the way the new operator works in C++. Allocation is a two step process:

  1. First, raw memory is requested from the OS, represented by the global operator new function.
  2. Once that memory is granted, the new object is constructed in it.

The C++ FAQ presents a really nice code sample I’d like to reproduce here:

When you write this code:

Foo* p = new Foo();

What the compiler generates is functionally similar to:

Foo* p;

 // don't catch exceptions thrown by the allocator itself
 void* raw = operator new(sizeof(Foo));

 // catch any exceptions thrown by the ctor
 try {
   p = new(raw) Foo();  // call the ctor with raw as this
 }
 catch (...) {
   // oops, ctor threw an exception
   operator delete(raw);
   throw;  // rethrow the ctor's exception
 }

The funny syntax inside the try statement is called "placement new", and we’ll discuss it shortly. For completeness’ sake, let’s see a similar breakdown for freeing an object with delete, which is also a two-step process:

  1. First, the destructor of the object that’s being deleted is called.
  2. Then, the memory occupied by the object is returned to the OS, represented by the global operator delete function.

So:

delete p;

Is equivalent to [2]:

if (p != NULL) {
  p->~Foo();
  operator delete(p);
}

This is also a good place to repeat something I’ve mentioned in the first section of this article – if a class has its own operator new or operator delete, these get invoked instead of the global functions when an object is allocated or deallocated.
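The ordering of these steps can be observed directly. The following sketch uses a hypothetical class Traced whose allocation functions, constructor and destructor each append a tag to a log (trace_log and run_trace are likewise illustrative names). It also covers the case where the constructor throws: the destructor is skipped, but the memory is still released.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical event log; each step appends a short tag.
std::vector<std::string> trace_log;

struct Traced {
    explicit Traced(bool fail)
    {
        trace_log.push_back("ctor");
        if (fail)
            throw std::runtime_error("ctor failed");
    }
    ~Traced() { trace_log.push_back("dtor"); }

    static void* operator new(std::size_t sz)
    {
        trace_log.push_back("operator new");
        return ::operator new(sz);
    }
    static void operator delete(void* p)
    {
        trace_log.push_back("operator delete");
        ::operator delete(p);
    }
};

std::vector<std::string> run_trace()
{
    trace_log.clear();

    // Normal lifetime: allocate, construct, destroy, deallocate.
    Traced* ok = new Traced(false);
    delete ok;
    assert(trace_log.size() == 4);

    // Throwing constructor: no destructor call, but the memory is freed
    // before the exception reaches the handler.
    try {
        Traced* bad = new Traced(true);
        delete bad;  // never reached
    } catch (const std::runtime_error&) {
        trace_log.push_back("caught");
    }
    return trace_log;
}
```

The returned log reads: operator new, ctor, dtor, operator delete, then operator new, ctor, operator delete, caught.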

Placement new

Now, back to that "placement new" we saw in the code sample above. It happens to be a real syntax we can use in our C++ code. First, I want to briefly explain how it works. Then, we’ll see when it can be useful.

Calling placement new directly skips the first step of object allocation. We don’t ask for memory from the OS. Rather, we tell it where there’s memory to construct the object in [3]. The following code sample should clarify this:

#include <iostream>
using namespace std;

int main(int argc, const char* argv[])
{
    // A "normal" allocation. Asks the OS for memory, so we
    // don't actually know where this ends up pointing.
    //
    int* iptr = new int;
    cerr << "Addr of iptr = " << iptr << endl;

    // Create a buffer large enough to hold an integer (and aligned
    // for one), and note its address.
    //
    alignas(int) char mem[sizeof(int)];
    cerr << "Addr of mem = " << (void*) mem << endl;

    // Construct the new integer inside the buffer 'mem'.
    // The address is going to be mem's.
    //
    int* iptr2 = new (mem) int;
    cerr << "Addr of iptr2 = " << iptr2 << endl;

    return 0;
}

For a particular run on my machine it prints:

Addr of iptr = 0x8679008
Addr of mem = 0xbfdd73d8
Addr of iptr2 = 0xbfdd73d8

As you can see, the mechanics of placement new are quite simple. What’s more interesting is the question – why would we need something like this? It turns out placement new is quite useful in a few scenarios:

  • Custom non-intrusive memory management. While overloading operator new for a class also allows custom memory management, the key concept here is non-intrusive. Overloading operator new requires you to change the source code of a class. But suppose we have a class the code of which we don’t want or can’t change. How can we still control its allocation? Placement new is the answer here. A common programming technique that uses placement new for this purpose is memory pools, sometimes also called "arenas" [4].
  • In some applications it is necessary to allocate objects in specific memory regions. One example is shared memory. Another is embedded applications or drivers with memory mapped peripherals, which can be controlled conveniently by objects allocated "on top" of them.
  • Many container libraries pre-allocate large buffers of memory. When new objects are added they have to be constructed in these buffers, so placement new is used. The prime example is probably the standard vector container.
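The first scenario can be sketched in a few lines. Below, a hypothetical Arena class hands out aligned slices of a fixed buffer, and placement new constructs a std::string (a class whose source we cannot change) inside one of them. The names Arena, destroy and build_in_arena are illustrative only, and the arena makes no attempt to reuse slots.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical fixed-size arena: hands out aligned slices of one buffer.
class Arena {
public:
    Arena() : used_(0) {}

    // Returns storage for one T, or nullptr when the arena is full.
    template <typename T>
    void* allocate()
    {
        // Round the current offset up to the alignment T requires.
        const std::size_t start =
            (used_ + alignof(T) - 1) / alignof(T) * alignof(T);
        if (start + sizeof(T) > sizeof(buffer_))
            return nullptr;
        used_ = start + sizeof(T);
        return buffer_ + start;
    }

private:
    alignas(std::max_align_t) unsigned char buffer_[256];
    std::size_t used_;
};

// Ends an object's lifetime without releasing its storage.
template <typename T>
void destroy(T* p) { p->~T(); }

std::string build_in_arena()
{
    Arena arena;
    void* slot = arena.allocate<std::string>();
    assert(slot != nullptr);

    // Placement new decides where the object lives; the class itself
    // is untouched.
    std::string* s = new (slot) std::string("hello from the arena");
    std::string copy = *s;

    destroy(s);  // the arena owns the bytes, so only the destructor runs
    return copy;
}
```

The string object itself sits inside the arena's buffer; any character storage it allocates internally still comes from its own allocator, which is why the destructor must run before the arena goes away.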

Deleting an object allocated with placement new

One of the maxims of C++ is that objects allocated with new should be deallocated with delete. Is this also true for objects allocated with placement new? Not quite:

int main(int argc, const char* argv[])
{
    char mem[sizeof(int)];
    int* iptr2 = new (mem) int;

    delete iptr2;       // Whoops, segmentation fault!

    return 0;
}

To understand why delete iptr2 in the snippet causes a segmentation fault (or some other kind of memory violation, depending on the operating system), let’s recall the description of what delete iptr2 actually does:

  1. First, the destructor of the object that’s being deleted is called.
  2. Then, the memory occupied by the object is returned to the OS, represented by the global operator delete function.

There’s no problem with the first step for an object allocated with placement new, but the second one looks suspicious. Attempting to free memory that was not actually allocated by the memory allocator is definitely a bad thing, but it’s exactly what the code sample above does. iptr2 points to some location on the stack which was not allocated with global operator new. And yet, delete iptr2 will try to deallocate it with global operator delete. Segmentation fault indeed.

So what do we do? How do we properly delete iptr2? Well, we surely can’t expect the compiler to figure out how to deallocate the memory – after all, we just pass a pointer to placement new – that pointer could’ve been taken from the stack, from some memory pool or somewhere else. So deallocation has to be manual.

As a matter of fact, the placement new used above is just a special case of a generalized placement new syntax allowed by C++ for specifying extra arguments in new. It’s defined in the standard header <new> as follows:

inline void* operator new(std::size_t, void* __p) throw()
{
    return __p;
}

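C++ dictates that every placement form of operator new should have a matching operator delete that takes the same extra arguments. The compiler calls it automatically only when the constructor throws in the middle of a placement new expression; an ordinary delete expression does not use it. This one is also defined in <new>: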

inline void  operator delete  (void*, void*) throw()
{
}

Indeed, the C++ runtime just doesn’t know how to deallocate such an object, so the delete is a no-op.
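To see when a placement operator delete actually runs, consider this sketch with a user-defined placement form. The extra argument is a hypothetical Counter struct that records calls; Fragile and try_both are illustrative names as well. The matching operator delete is invoked by the compiler only when the constructor throws, and in the successful case the object has to be torn down by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>

// Hypothetical extra argument that records allocation activity.
struct Counter {
    int allocated;
    int released;
};

// Placement form of operator new: takes an extra Counter& argument.
void* operator new(std::size_t sz, Counter& c)
{
    ++c.allocated;
    return ::operator new(sz);
}

// Matching placement operator delete with the same extra argument.
void operator delete(void* p, Counter& c) noexcept
{
    ++c.released;
    ::operator delete(p);
}

struct Fragile {
    explicit Fragile(bool fail)
    {
        if (fail)
            throw std::runtime_error("constructor failed");
    }
};

Counter try_both()
{
    Counter c = {0, 0};

    // Success: nothing is released automatically, so undo both steps
    // by hand (destructor first, then the matching deallocation function).
    Fragile* ok = new (c) Fragile(false);
    ok->~Fragile();
    operator delete(ok, c);
    assert(c.released == 1);

    // Failure: the compiler calls operator delete(void*, Counter&) itself
    // before the exception reaches the handler.
    try {
        Fragile* bad = new (c) Fragile(true);
        (void)bad;  // never reached
    } catch (const std::runtime_error&) {
    }
    return c;
}
```

After try_both() both counters read 2, even though the code calls the placement operator delete explicitly only once; the second release is the automatic one.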

What about destruction? For an int, no destruction is really needed, but suppose the code would be:

alignas(Foo) char mem[sizeof(Foo)];
Foo* fooptr = new (mem) Foo;

for some non-trivial class Foo. What do we do to destroy the object fooptr points to once we don’t need it anymore? We have to call its destructor:

fooptr->~Foo();

Yes, calling the destructor explicitly is actually valid in C++, and this is probably one of the few cases where it makes sense to do it [5].
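Putting placement new and explicit destructor calls together gives the basic mechanism containers rely on. The sketch below is a hypothetical fixed-capacity stack (FixedStack, Noisy and live_after_pushes are illustrative names, not anything from the standard library) that keeps raw storage separate from object lifetime. Strictly conforming C++17 code would additionally pass the pointer through std::launder; that is omitted here for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical fixed-capacity stack: storage lives as long as the stack,
// elements live only between push() and pop().
template <typename T, std::size_t N>
class FixedStack {
public:
    FixedStack() : size_(0) {}
    ~FixedStack() { while (size_ > 0) pop(); }

    FixedStack(const FixedStack&) = delete;
    FixedStack& operator=(const FixedStack&) = delete;

    void push(const T& value)
    {
        assert(size_ < N);
        ::new (static_cast<void*>(slot(size_))) T(value);  // construct in place
        ++size_;
    }

    void pop()
    {
        assert(size_ > 0);
        --size_;
        slot(size_)->~T();  // destroy the object, keep the storage
    }

    std::size_t size() const { return size_; }

private:
    T* slot(std::size_t i)
    {
        return reinterpret_cast<T*>(storage_ + i * sizeof(T));
    }

    alignas(T) unsigned char storage_[N * sizeof(T)];
    std::size_t size_;
};

// Hypothetical element type that counts live instances.
struct Noisy {
    static int live;
    Noisy() { ++live; }
    Noisy(const Noisy&) { ++live; }
    ~Noisy() { --live; }
};
int Noisy::live = 0;

int live_after_pushes()
{
    FixedStack<Noisy, 4> stack;
    Noisy prototype;        // one live object
    stack.push(prototype);  // two
    stack.push(prototype);  // three
    stack.pop();            // back to two
    return Noisy::live;
}
```

Creating the stack constructs no Noisy objects at all, and when the function returns, the stack's destructor destroys the one remaining element, so the live count drops back to zero.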

Conclusion

This is a complex topic, and the article only served as an introduction, giving a "quick taste" of the various methods C++ provides for memory allocation. There are many interesting gotchas and programming tricks once you start going down some specific road (for example, implementing a pool allocator). These are best presented in their own context and not as part of a general introductory article. If you want to go deeper, check the Resources section below.

Resources

  • C++ FAQ Lite, especially items 11.14 and 16.9
  • "The C++ Programming Language, 3rd edition" by Bjarne Stroustrup – 10.4.11
  • "Effective C++, 3rd edition" by Scott Meyers – item 52
  • "Modern C++ Design" by Andrei Alexandrescu – chapter 4
  • Several StackOverflow discussions. Start with this one and browse as long as your patience lasts.
[1] I’m writing :: before operator new explicitly although it’s not strictly required in this case. IMHO this is a good practice, especially when used inside overloaded operator new methods to avoid ambiguity.
[2] Note the check for NULL. It’s the reason for delete p being safe even when p is NULL – another C++ FAQ.
[3] It is solely your responsibility that the pointer passed to placement new points to enough memory for the object, and that it’s also correctly aligned.
[4] Memory pools are a large and fascinating topic by themselves. I can’t cover it in any meaningful depth here, so I encourage you to look up more information online. Wikipedia is a good start, as usual.
[5] In fact, the standard vector container uses it to destruct objects it holds.

20 Responses to “The many faces of operator new in C++”

  1. David Björkevik Says:

    You write that “Mixing new with free (or malloc with delete) is a big NO NO.”, and then you link to an FAQ entry that warns against free()ing a pointer allocated with new(), which is not really the same thing. If you know what you’re doing it’s perfectly valid and legitimate to use malloc and free in a C++-program, and I sometimes do this myself for low-level stuff. The primary reason for this is that C++ lacks the concept of realloc(). And if a C++ developer doesn’t know about the inner workings of realloc(), he/she is in for some bugs that can be very hard to find. Here’s an example:
    std::vector<int> foo;

    ... // populate vector

    int *a = &foo[1];
    foo.push_back(8);
    int *b = &foo[1];
    assert(a==b); // Not guaranteed to be true

  2. eliben Says:

    David,

    I’m not sure where we disagree. I didn’t say that it’s illegitimate to use malloc and free in a C++ program. I just said that mixing new with free (i.e. using free to deallocate a pointer allocated with new) is illegitimate.

    Also, realloc is often recommended against even in C code, and there’s a good chance C++’s containers don’t use it, instead just re-allocating new memory with new and then using uninitialized_copy to copy the data over.

  3. sellibitze Says:

    Thumbs up. Two minor issues I noticed: “…is requested from the OS…”. This is not necessarily true. The C++ runtime provided by the compiler writers might have its own layer of indirection/memory management or there might not even be an OS. And secondly, you don’t say anything about possible alignment issues w.r.t. placement new. malloc and ::operator new give you a pointer to a memory block which is “aligned enough” to store anything the C++ standard covers (possibly excluding SIMD-specific extensions).

  4. eliben Says:

    sellibitze,

    Thanks for the feedback. Regarding request from OS vs. runtime – you’re right, but I didn’t want to complicate the article unnecessarily. For the sake of the presentation of the material it shows, “requested from the OS” is a good-enough abstraction. Regarding alignment of placement new, I believe I did mention it in footnote 3.

  5. Daev Says:

    This is a fantastic article. I’m a low-level C programmer, and when I’m called upon to find bugs in C++ programs, I always wish I could look at the code and “see the assembly language” the way I can with C. I would like there to be full C++ tutorials that taught the language from a purely low-level perspective like you are doing.

    There is one unclear passage I would suggest improving. In the section “Class-specific operator new” you draw a distinction between “operator new” and “the new operator,” which is important. But in the code which immediately follows, are you overloading the former or the latter? The paragraph after the code is confusing, because it starts talking about “the overloaded new and delete operators,” which sounds like the latter. (Do you mean “the overloaded operator new and operator delete functions”?) What would it mean to “overload the new operator” as opposed to “overloading operator new”?

  6. Chris Says:

    One thing that is always glossed over is the question of just *why* it’s a no-no to mix malloc/free and new/delete. Memory is ultimately an operating-system resource, and any application-level (e.g. language or external RTL) memory manager must first and foremost request memory from the operating system, using whatever facilities the OS provides for doing that. (I seem to recall reading, once, about a system function named sbrk() that did the low-level allocation from the OS. Is that still the case?)

    So couldn’t new and delete have been implemented in a manner that WAS compatible with malloc and free? As wrappers AROUND malloc and free? Or couldn’t C++ development packages come with C++-compatible replacement implementations of malloc/free?

    After all, there are circumstances when one simply MUST mix newly-written C++ code and extant code that uses the C RTL (the X windowing system (which underlies Gtk and such), for instance, makes HEAVY use of malloc/free). So are we saying we can’t write X programs in C++? That’s preposterous, not only in theory (it should be possible) but in practice (I know major organizations that have been doing it for decades with no particular problems).

    So what are the issues, and the rules, *REALLY*??? *THAT* would be an interesting article.

  7. eliben Says:

    Daev,

    Thanks for the catch – I changed the phrasing to be clearer as to what is being overloaded.

    Chris,

    I’m not sure we see the “compatibility” of malloc/free with new/delete in the same way. There’s absolutely no problem using both pairs of constructs in the same program, and thus the X windowing system can be freely used from C++ programs (many purely C libraries do). What’s not allowed is the following: taking a pointer which was allocated with malloc and releasing it with delete. Why would you want to do it anyway?

  8. Ken K Says:

    My understanding, and I admit I’m a novice on this subject, was that new and delete were needed for classes because of the v-table and/or constructor/destructor. Since you make no mention of it, I have doubts about that.

  9. SteveD Says:

    Good article. Just a quick clarification – in the discussion of placement new, should the generic matching delete not be:

    inline void operator delete (void*, std::size_t size) throw()

    rather than …(void*, void*)?

    You then have a pointer to the memory and the amount to deallocate as you please. This is how Alexandrescu has it in his book anyway!

    For everyone worried about mixing new and free or malloc and delete it’s pretty simple – free() doesn’t call the destructor and malloc() doesn’t call the constructor. If you new it, delete it; if you malloc() it, free() it. Always. You can use both paradigms in the same code, but not on the same object.

    There is a more complex explanation, in that operator new as discussed here can be user defined or overloaded to allocate memory from a memory pool (say) and if so it will need to track when this is released in order to update the pool state by overloading the operator delete. If you just free() the pointer, the OS might be happy, but the state of the memory pool would not be updated, and you’d clearly have a problem.

  10. eliben Says:

    SteveD,

    I think not. The version I showed is the one actually present in <new> headers if you peek at some stdlib source. It also makes sense, since delete doesn’t explicitly get the allocation size. The first void* is the pointer to release. The second is the placement pointer which is also passed to new, so the signatures match.

  11. alex Says:

    Excellent article, thank you for sharing.

  12. Manjeet Dahiya Says:

    int main(int argc, const char* argv[])
    {
        char mem[sizeof(int)];
        int* iptr2 = new (mem) int;
    
        delete iptr2;       // Whoops, segmentation fault!
    
        return 0;
    }

    So in this case we don’t need to delete iptr2??

  13. eliben Says:

    Manjeet

    The actual memory was allocated on the stack and will be released once main returns, and no destructor has to be called for an int. Therefore, no delete is required here at all, but this is a synthetic example…

  14. Manjeet Dahiya Says:

    Thanks Eliben for the clarification. This is what I was thinking. Very informative article. Thanks!

  15. Lorenzo Says:

    As usual, a very nice article!
    One question: you use the following code:

    int* v = static_cast<int*>(::operator new(5 * sizeof(*v)));
    ::delete(v);

    Nice for comparing with malloc(), but in production code, is classic code like this better?

    int *v = ::new int[5];
    ::delete[](v);

  16. eliben Says:

    Lorenzo,

    Good catch – yes, this is how I’d allocate an array of integers if I were to write real code.

  17. Nishith Says:

    Refer to example in section “Class-specific operator new”
    Shouldn’t the function “void* operator new(size_t sz);” in class “Base” be static ?

  18. eliben Says:

    Nishith,

    No, when overloading operators (and new specifically), they’re not made static.

  19. Scott Says:

    According to the 2003 standard (sections 3.7.3.2, 12.5), it is possible for delete to get the allocation size in a class-specific overload of operator delete. If a class does not declare the normal operator delete:

    class foo {
      ...
      void operator delete(void*);
      ...
    };

    but does instead declare this one:

    class foo {
      ...
      void operator delete(void*, size_t);
      ...
    };

    then the second version will be found when lookup of a delete-expression is done within the scope of class foo. If foo were derived from some base class B, then the following code:

    B * b = new foo();
    delete b;

    would use foo::operator delete if B has a virtual destructor, otherwise it would use the global operator delete (or B’s operator delete, if one were defined for B). But of course, if B does not have a virtual destructor, then a foo should not be deleted via a B* in the first place.

  20. Steve Says:

    Great article! Really clear and to the point.

    I also had Nishith’s question while reading this: “surely operator new/operator delete cannot access members of the class since the instance doesn’t exist yet, so why aren’t they declared as static?”.

    I don’t think Eli’s response is quite right. At least as far as the weird GCC 4.2 / llvm toolchain on OS X 10.7 is concerned, ‘operator new’ and ‘operator delete’ are implicitly static functions. In fact, you can add the ‘static’ modifier and it compiles just fine. If you omit it and happen to have a compilation error in the method, the compiler’s output points you at ‘_static_ operator new/delete’, whether you explicitly declared it that way or not. I have no idea if the actual standard would disallow ‘static’ to be explicitly written (that is, maybe this is a GCCism).

    Assuming it is valid, can anyone explain why C++ lets you just omit the ‘static’? It seems to me that it’s just more confusing this way. Certainly some operators such as operator< are not implicitly static. Is this just another in the large pile of C++ warts, or is there a better reason behind it?
