I wanted to start what hopefully will become a series on C++ productivity that is geared towards .NET developers that have had at least some (maybe painful) experience with C++. The first topic I’m covering is memory management as this area is often cited as the killer reason to use a garbage collected runtime over a native language like C++.
“My code has ninety-nine problems, but memory management should not be one.”
Fair enough. Developing code is hard. Developing code that is stable and doesn’t leak like a sieve is even harder. Even in .NET we fight “leaks”, but as we became masters of the runtime we learned simple rules for avoiding leaks in our managed applications, such as:
- Being mindful of a hooked event, making sure it is unhooked if the object lifetime of the observer is shorter than the subject.
- Static objects are rooted. Objects that static objects refer to are also rooted.
- Dispose an IDisposable or make use of the using statement.
- Be aware of framework deficiencies (like this one from WPF)
Being masters of the .NET runtime, rules like these are second nature to us. What about C++ where we directly allocate/deallocate memory? How much more difficult is it than .NET? Depending on your point of view, some say it’s easier. Just like .NET, you just have to remember some simple rules. I’ll explain as I go along, but first some background…
C++, Dynamic Memory and Automatic Variables
Apologies if this next part is a rehash from computer science, but bear with me as it contains some important concepts. I won’t go too far in depth as there are thousands of smarter folks that have covered this far better than I. In C++ you have dynamically allocated variables and automatic variables. The distinction is very simple. Dynamic means you created something using the new keyword, and what you get back is a pointer. Automatic means the new keyword was not used.
Here is an example of a class that uses dynamically allocated memory:
CMyClass* pClass = new CMyClass();
pClass->DoSomething();
Notice here that pClass is a pointer type. When we instantiate it with the new keyword, memory is allocated on the heap and that memory location is assigned to pClass. This memory will not be freed unless we execute delete pClass.
Here is an example of an automatic variable:
CMyClass myClass;
myClass.DoSomething();
Compare this to the previous dynamically allocated example. There is no new keyword involved. This might seem foreign to a .NET developer, because in C# a class-typed variable that was never assigned an instance holds no object, and calling DoSomething through a null reference throws a NullReferenceException. In C++, myClass is automatically allocated, and the constructor is called. You also do not call delete myClass to free it. It gets freed and the destructor gets called automatically. The question here should be “When does this happen?” And the answer is, “It gets freed at the only safe time to release something automatic…when it goes out of scope.”
Consider the following example:
void MyMethod()
{
    // myClassA ctor called
    CMyClass myClassA;

} // myClassA dtor called
In this snippet myClassA is allocated on the stack because it is local to the method. When the method exits, myClassA is automatically deallocated.
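To see exactly when the constructor and destructor fire, here is a minimal sketch. The Tracer type and the TraceScope function are hypothetical names made up for illustration; they just record events in a string so the order can be inspected.

```cpp
#include <string>

// Hypothetical helper type: appends to a log on construction and destruction.
struct Tracer {
    std::string& log;
    explicit Tracer(std::string& l) : log(l) { log += "ctor;"; }
    ~Tracer() { log += "dtor;"; }
};

// Returns the sequence of events seen while a nested scope opens and closes.
std::string TraceScope() {
    std::string log;
    {
        Tracer t(log);   // constructor runs here
        log += "body;";
    }                    // destructor runs here, at the closing brace
    log += "after;";
    return log;
}
```

Calling TraceScope() should yield the events in the order ctor, body, dtor, after: the destructor runs at the closing brace of the inner scope, not at some later collection point.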
Here is another example, but this time we have an automatic variable, m_myClass, that is scoped inside a class as a member.
class MyClassA
{
public:
    MyClassA()
    {
    }

    ~MyClassA()
    {
        /* Do nothing as m_myClass
           will automatically be freed */
    }
private:
    MyClassX m_myClass;
};
So hopefully now you have a rough understanding of automatic vs dynamic. Just to make sure, I’ll reiterate. Dynamic allocations are pointers created with the new keyword and freed with the delete keyword (malloc and free too, if you wanna be picky). Automatic variables are automatically instantiated and freed when they go out of scope.
“I need dynamic allocations! I can’t just use auto variables everywhere! You better not be wasting my time.”
Truth! Nothing but a hello world application can really get by without dynamically allocating some of that sweet, sweet memory. How else are you going to hold thousands of strings for your latest Twitter client? Or where else are you going to keep that huge buffer of the image you decoded? In C++ it’s just a fact of life that for every use of the new keyword, you better have a delete to free it at some point, unless you consider leaking a feature. Managing dynamically allocated memory is where almost everyone has had an issue with native code. Here are some common scenarios:
- Ownership problem: Say the Joe class hands over a jpeg image to Randy. Sooner or later, Joe gets destroyed. Who frees the jpeg image? If it is Joe, then what if Randy is still using the jpeg? In that case Randy will just get pissed and crash the entire app when he finds out!
- Exception problem: Say we just allocated a string in a ProcessSomething method, then immediately run the DoStuff method. DoStuff ends up throwing an exception. The stack then unwinds and our ProcessSomething method never runs its cleanup to free the string.
- Double delete/free: You run delete on some memory, but accidentally run delete on it again. Usually related to the ownership problem. This is undefined behavior: some runtimes will crash, some appear to do nothing, and some quietly corrupt the heap.
- Use after delete/free: You try to use your memory after it was already deleted. Also usually related to the ownership problem. This is undefined behavior too. If you are lucky it crashes right away; if not, it silently reads or overwrites someone else’s data.
- Human error: You just plain don’t remember to free something you allocated.
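The exception problem from the list above can be reproduced in a few lines. This is a minimal sketch, not production code: the Tracked type is a hypothetical instrumented class that counts live instances, and ProcessSomething/DoStuff are the names used in the bullet.

```cpp
#include <stdexcept>

// Hypothetical instrumented type: counts how many instances are alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

void DoStuff() { throw std::runtime_error("boom"); }

void ProcessSomething() {
    Tracked* p = new Tracked();
    DoStuff();   // throws, so the next line never runs
    delete p;    // skipped: the object is leaked
}

// Returns the number of Tracked objects still alive after the failure.
int LeakedAfterException() {
    try {
        ProcessSomething();
    } catch (const std::runtime_error&) {
        // the exception is handled, but the pointer is already lost
    }
    return Tracked::alive;
}
```

After one call, one Tracked object is still alive and nothing holds a pointer to it anymore, which is exactly the leak described above.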
With all these things to worry about, no wonder native code has such a bad name. It just looks like a minefield of problems where one has to worry more about infrastructure than writing code to solve a real world problem! Rest assured there is a solution in a little idiom known as RAII.
“Resource Acquisition Is Initialization” or RAII: Another acronym to make things sound more complicated than they really are.
RAII has been around for quite a while and Wikipedia has a good article on it, so I’ll just give it to you in layman’s terms. RAII is a pattern for binding the lifetime of an allocated resource to automatic variables. Remember that part! RAII is the basis of making robust, leak proof applications. Though RAII is at the root of the solution, keep in mind that it is not the solution in its entirety.
Here is a simple C++ class that can use RAII:
#include <stdexcept>

class MyClass
{
public:
    MyClass()
    {
        /* Allocate some memory */
        m_myDynamicString = new wchar_t[50];
    }

    ~MyClass()
    {
        /* Free memory on dtor; arrays need delete[] */
        delete[] m_myDynamicString;
    }

    void DoSomething()
    {
        throw std::runtime_error("file write failure");
    }

private:
    wchar_t* m_myDynamicString;
};
In this class, there is nothing out of the ordinary. It allocates memory in the constructor and nicely deallocates it in the destructor. When we use MyClass as an automatic variable, RAII starts to make sense. Consider this usage of the above “MyClass”.
void CrazyMethod()
{
    MyClass myClass;        /* ctor executed */
    myClass.DoSomething();  /* throws exception,
                               dtor automatic
                               when stack unwinds */
}
Because myClass is an automatic variable, the destructor is guaranteed to be executed when an exception unwinds the stack (as long as the exception is caught somewhere up the call stack). It is also guaranteed to be executed when CrazyMethod returns normally. Because the destructor always runs, delete is always called on our allocated memory. Alternatively, MyClass can be a member of another class as an automatic variable, and it will still be released when the class it’s contained in goes out of scope (or is deleted). This is all great stuff, but we still have problems…
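Here is a small sketch that checks the claim rather than taking it on faith. Resource and Owner are hypothetical names for illustration: Resource counts live instances, and Owner is a bare-bones RAII wrapper that allocates in its constructor and frees in its destructor.

```cpp
#include <stdexcept>

// Hypothetical instrumented type: counts how many instances are alive.
struct Resource {
    static int alive;
    Resource() { ++alive; }
    ~Resource() { --alive; }
};
int Resource::alive = 0;

// Hypothetical minimal RAII owner: allocate in the ctor, free in the dtor.
class Owner {
public:
    Owner() : m_p(new Resource()) {}
    ~Owner() { delete m_p; }
    Owner(const Owner&) = delete;             // a copy would lead to a double delete
    Owner& operator=(const Owner&) = delete;
private:
    Resource* m_p;
};

void CrazyMethod() {
    Owner owner;                              // Resource allocated here
    throw std::runtime_error("file write failure");
}                                             // ~Owner runs during stack unwinding

// Returns the number of Resource objects still alive after the exception.
int AliveAfterException() {
    try {
        CrazyMethod();
    } catch (const std::runtime_error&) {
    }
    return Resource::alive;
}
```

The copy operations are deleted on purpose: a class that owns a raw pointer and allows copying ends up with two owners calling delete on the same memory, which is the double delete problem all over again.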
“Your example still shows having to call delete xyz! I thought you were going to show me how to avoid having to do that! This solves only a few issues with memory management!”
RAII is a great way to handle some problems, but this basic pattern doesn’t go far enough. As mentioned before, we still have to explicitly delete the memory we allocated with the new keyword. It also does not solve the “ownership problem” of resources. If only we could bind all dynamically allocated memory to a specific scope and also have a way to track ownership, we’d be sitting pretty. Luckily there are some smart folks that have come up with a very smart solution.
Enter the Smart Pointers – Giving Intelligence to a Memory Address
So I finally get to the root of this post. Smart pointers, simply said, are wrappers around pointers. They are not a new concept either. The smart pointers I’ve been exposed to all use RAII to make them work. They are also all based on templates (generics for you .NET folks), so they are generally very flexible and add little code smell.
For the sake of brevity, I only want to cover some smart pointers, at least the ones I find the most useful in modern C++ development.
The idea behind all the smart pointers I cover here is that they take ownership of everything you allocate with new. They take care of managing the lifetime of your dynamic memory and making sure you have no leaks.
unique_ptr
Nothing explains it better than a good old example. So let’s start there!
#include <memory>
using std::unique_ptr;

void DoSomething()
{
    /* auto keyword is like var in c# */
    auto myWideString = unique_ptr<wchar_t[]>(new wchar_t[10]);
}
Here we initialize a new unique_ptr, passing it some wchar_t array we have dynamically allocated. Keep in mind that we have NOT done “myWideString = new unique_ptr…”. In fact, I can’t think of a reason to ever new a smart pointer as it defeats the purpose. If you recall the previous section on RAII, you will realize that the lifetime of our “new wchar_t[10]” memory is controlled by the lifetime of the unique_ptr. In this case, that is the duration of the DoSomething method. Once DoSomething exits, the unique_ptr destructor is automatically called. Because it was declared with an array type, the unique_ptr is smart enough to run “delete[]” on the buffer we have assigned to it.
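To convince yourself that the array form really destroys every element, here is a minimal sketch. Sample and PeakAndRemaining are hypothetical names for illustration; Sample simply counts how many instances are alive.

```cpp
#include <memory>

// Hypothetical instrumented element type: counts live instances.
struct Sample {
    static int alive;
    Sample() { ++alive; }
    ~Sample() { --alive; }
};
int Sample::alive = 0;

// Returns how many Sample objects were alive inside the scope and
// stores how many remain afterwards in `remaining`.
int PeakAndRemaining(int& remaining) {
    int peak = 0;
    {
        std::unique_ptr<Sample[]> items(new Sample[3]);
        peak = Sample::alive;    // all three elements constructed
    }                            // delete[] runs here, destroying each element
    remaining = Sample::alive;
    return peak;
}
```

The key detail is the `[]` in the template argument. A unique_ptr declared with a plain element type would call delete instead of delete[] on the same buffer, which is undefined behavior.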
Here is another, more complex example:
class MyCoolClass
{
public:
    MyCoolClass()
        : m_myAllocedInt(new int(123)) /* handed to the unique_ptr */
    {
    }

    void Process()
    {
        /* Process something */
    }
private:
    /* Auto destroyed */
    unique_ptr<int> m_myAllocedInt;
};

void DoSomethingElse()
{
    auto myCoolInstance = unique_ptr<MyCoolClass>(new MyCoolClass());
    myCoolInstance->Process();
}
First notice that MyCoolClass has a unique_ptr (m_myAllocedInt) as a class member. It’s also an automatic variable that is tied to the scope of the class. In the constructor a new int is allocated on the heap and handed over to the unique_ptr. This memory will be freed when the class’s destructor is called.
Next look at the DoSomethingElse method. We dynamically create the MyCoolClass on the heap and assign it to the myCoolInstance unique_ptr. This ties that class instance to the lifetime of the unique_ptr that owns it, which lasts until DoSomethingElse exits. The next line shows a call to myCoolInstance->Process. Notice that even though we are dealing with an automatic variable, we still can use it as if it was a pointer of the template type! Neat!
Performance – The overhead of unique_ptr is essentially nothing. With the default deleter it is typically the size of a raw pointer, and an optimizing compiler will inline everything, so it’s just as if you called delete yourself.
Caveats – The unique_ptr directly replaces the older smart pointer called auto_ptr. Only one pointer can own the dynamically allocated memory at one time. So you cannot copy one unique_ptr into another; ownership has to be handed over explicitly with std::move.
When to use – Use unique_ptr in situations where you need to transfer ownership of an instance. Personally, I mostly use this internal to a class implementation. If you wish to have multiple classes share an instance, you would use this next smart pointer…
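A short sketch of what transferring ownership looks like in practice. Consume and MoveLeavesSourceEmpty are hypothetical names for illustration; the only library pieces used are std::unique_ptr and std::move.

```cpp
#include <memory>
#include <utility>

// Takes ownership: the unique_ptr is passed by value, so the caller must move it in.
int Consume(std::unique_ptr<int> p) {
    return *p;   // the int is freed when p goes out of scope at the end of this call
}

// Returns true if both source pointers are empty after their moves.
bool MoveLeavesSourceEmpty() {
    std::unique_ptr<int> a(new int(42));
    // std::unique_ptr<int> b = a;            // does not compile: copying is disabled
    std::unique_ptr<int> b = std::move(a);    // ownership moves from a to b
    bool sourceEmpty = (a == nullptr);        // a moved-from unique_ptr is null
    int value = Consume(std::move(b));        // ownership moves into Consume
    return sourceEmpty && b == nullptr && value == 42;
}
```

At any point exactly one unique_ptr owns the int, and the compiler enforces it: the commented-out copy is a compile error rather than a runtime surprise.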
shared_ptr
shared_ptr is by far my favorite of all smart pointers. It’s very similar to unique_ptr in usage and solves the same issue as unique_ptr. In addition, it solves the ownership problem. shared_ptr achieves this by using a pattern known as (automatic) reference counting. If you were to create a shared_ptr, it would have an initial reference count of 1. If you were to copy your shared_ptr so it is referenced somewhere else, it has a count of 2. As soon as a shared_ptr instance falls out of scope (recall RAII), the reference count is decremented. Once the reference count hits 0, it will run delete on the memory.
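The counting can be watched directly through the use_count member function. A minimal single-threaded sketch, with UseCountSequence as a hypothetical name for illustration:

```cpp
#include <memory>

// Returns true if the observed counts follow the 1 -> 2 -> 1 sequence.
bool UseCountSequence() {
    std::shared_ptr<int> first(new int(7));
    long afterCreate = first.use_count();       // one owner
    long afterCopy = 0;
    {
        std::shared_ptr<int> second = first;    // copying adds an owner
        afterCopy = first.use_count();
    }                                           // second destroyed: one owner again
    long afterScope = first.use_count();
    return afterCreate == 1 && afterCopy == 2 && afterScope == 1;
}
```

use_count is handy for demos and debugging like this. It is not something to build program logic on, since in multithreaded code the value can be stale by the time you read it.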
Let’s look at this common example of needing to share instances between classes:
class MyCommonInstance
{
public:
    void DoMagic();
};

class MyCoolClass
{
public:
    MyCoolClass()
    {
    }

    void SetCommonData(const shared_ptr<MyCommonInstance>& someInstance)
    {
        /* the copy auto increments ref count */
        m_localInstance = someInstance;
    }
private:
    shared_ptr<MyCommonInstance> m_localInstance;
};

void DoSomethingElse()
{
    auto instanceA = unique_ptr<MyCoolClass>(new MyCoolClass());
    auto instanceB = unique_ptr<MyCoolClass>(new MyCoolClass());

    /* 1 ref count */
    auto sharedInstance = shared_ptr<MyCommonInstance>(new MyCommonInstance);

    /* 2 ref count */
    instanceA->SetCommonData(sharedInstance);

    /* 3 ref count */
    instanceB->SetCommonData(sharedInstance);
} /* smart ptrs destroyed on exit, in reverse order of declaration -
     sharedInstance dtor - ref count 2
     instanceB dtor called - ref count 1
     instanceA dtor called - ref count 0, delete called automatically */
- Here two instances of MyCoolClass have been created. I am wrapping the instances in a unique_ptr as you should already be familiar with it if you haven’t totally skimmed this post.
- Next we create a shared_ptr of type MyCommonInstance. At the point of instantiation, the sharedInstance variable has a reference count of 1. When we pass sharedInstance to instanceA->SetCommonData, instanceA keeps a reference to it.
- So by the time SetCommonData completes, sharedInstance will have a reference count of 2. Because we send the shared_ptr to instanceB, sharedInstance will have a total reference count of 3 before the method exits.
- The method exits, freeing all smart pointers and subsequently the memory that was assigned to them. As each shared_ptr gets released, the reference count gets lowered until it hits 0 and the deleter gets called.
Performance – shared_ptr is very fast, but there are some things to be aware of. When a shared_ptr is first created from a raw pointer, it has to do an allocation on the heap for the control block that holds the reference count. So you are doing two allocations by default, one for the control block and one for your dynamically allocated memory. There is a workaround though. If you use the make_shared helper function, it will typically allocate your instance AND the control block in one allocation. Neat! Support for custom allocators also makes it extremely flexible, but that’s a 200 level topic. You can say that reference counting eats up a blip of CPU time (the count is typically updated atomically so that copies can be handed to other threads), but the reality is that if you are going to be sharing ownership, you would most likely be doing reference counting by hand anyway. shared_ptr simply automates it. shared_ptr itself has no virtual functions, but typical implementations store the deleter behind a type-erased control block, so releasing the last reference goes through an indirect call. Usually not a problem, but in high perf scenarios you should be aware of this.
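For reference, this is roughly what the make_shared form looks like. Account and MakeAccount are hypothetical names for illustration; std::make_shared forwards its arguments to the constructor of the type you ask for.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical payload type, only here to have constructor arguments to forward.
struct Account {
    std::string name;
    int balance;
    Account(std::string n, int b) : name(std::move(n)), balance(b) {}
};

// Creates the Account and the reference-counting control block together.
// Implementations usually do this in a single heap allocation.
std::shared_ptr<Account> MakeAccount() {
    return std::make_shared<Account>("demo", 100);
}
```

A side benefit is that the new keyword disappears from the calling code entirely, so there is never a moment where a raw pointer exists without an owner.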
Caveats – The Achilles’ heel of reference counting mechanisms is the dreaded circular reference. The .NET GC handles this scenario automatically. But here in native land, we aren’t using a big complicated memory management system. Consider a class Parent that holds a reference to Child, while Child keeps a reference back to Parent. If both are shared_ptr, each keeps the other’s count above 0 and neither will ever be released. This is easily solvable with the use of another smart pointer, known as weak_ptr, which I’m not covering in depth here.
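Just so the idea isn’t left hanging, here is a minimal sketch of the Parent/Child case with the back reference made weak. The alive counters and the AliveAfterScope function are hypothetical additions for illustration.

```cpp
#include <memory>

struct Child;

struct Parent {
    static int alive;
    std::shared_ptr<Child> child;    // strong: the parent owns the child
    Parent() { ++alive; }
    ~Parent() { --alive; }
};

struct Child {
    static int alive;
    std::weak_ptr<Parent> parent;    // weak: does not keep the parent alive
    Child() { ++alive; }
    ~Child() { --alive; }
};

int Parent::alive = 0;
int Child::alive = 0;

// Returns how many Parent/Child objects survive once the owning scope ends.
int AliveAfterScope() {
    {
        auto p = std::make_shared<Parent>();
        p->child = std::make_shared<Child>();
        p->child->parent = p;        // back reference, count on Parent stays at 1

        // To use the parent from the child, lock() yields a temporary shared_ptr
        // (or an empty one if the parent is already gone).
        if (auto strong = p->child->parent.lock()) {
            (void)strong;
        }
    }
    return Parent::alive + Child::alive;
}
```

If Child::parent were a shared_ptr instead, both objects would still be alive after the scope ends, because each would be holding the other’s count at 1.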
The other caveat is with arrays. Consider this:
auto myWideString = shared_ptr<wchar_t>(new wchar_t[10]);
By default, this shared_ptr will run “delete pData”. This isn’t correct in C++, for arrays must be freed with “delete[] pData”. A unique_ptr declared with an array type handles this situation by default, but a shared_ptr declared like the one above does not (since C++17 you can also declare a shared_ptr<wchar_t[]>, which does use delete[]). To rectify the situation we just supply a custom deleter to our shared_ptr. Here is an example:
struct array_deleter
{
    /* templated on the element type: delete[] on a void* is not valid */
    template <typename T>
    void operator()(T* p) const
    {
        delete[] p;
    }
};

void DoSomething()
{
    /* auto keyword is like var in c# */
    auto myWideString = shared_ptr<wchar_t>(new wchar_t[10], array_deleter());
}
This will ensure the memory is properly released.
When to use – (Almost) all the time! Seriously. One of the few times not to use this smart pointer is when you need extreme performance in allocating/deallocating hundreds of thousands of instances. In this case, just go for the unique_ptr.
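If you would rather not write the deleter yourself, the standard library already ships one: std::default_delete specialized for arrays. A minimal sketch, with Item and AliveAfterArrayRelease as hypothetical names for illustration:

```cpp
#include <memory>

// Hypothetical instrumented element type: counts live instances.
struct Item {
    static int alive;
    Item() { ++alive; }
    ~Item() { --alive; }
};
int Item::alive = 0;

// Returns how many Item objects remain after the shared_ptr is released.
int AliveAfterArrayRelease() {
    {
        // std::default_delete<Item[]> calls delete[] on the pointer it is given.
        std::shared_ptr<Item> items(new Item[4], std::default_delete<Item[]>());
    }
    return Item::alive;
}
```

All four elements are destroyed when the last owner goes away, the same result as the hand-written array_deleter.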
CComPtr, CComQIPtr and _com_ptr_t
When working with COM objects, we realize that they implement the IUnknown interface. This interface already supports reference counting (like shared_ptr). The difference is COM uses something called intrusive reference counting. This means the reference counting is built into the class itself.
Now that you know the basics of smart pointers, I won’t go into the grit of these three pointer types. Do know that they will call AddRef/Release on your COM object automatically. AddRef is called when the smart pointer takes a reference, Release is called when the smart pointer falls out of scope. The COM class itself will keep track of its reference count and delete itself when it gets to 0. If using COM smart pointers, the reference count will hit 0 when all smart pointers referencing it fall out of scope.
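To make “intrusive” concrete without pulling in any Windows headers, here is a portable sketch of the idea. To be clear, RefCounted and IntrusivePtr are hypothetical stand-ins written for this post, not the real IUnknown or CComPtr, and they leave out things the real ones handle (QueryInterface, thread-safe counting, assignment).

```cpp
// Hypothetical stand-in for a COM-style object: the count lives inside the object.
class RefCounted {
public:
    static int alive;
    RefCounted() : m_refs(0) { ++alive; }
    unsigned long AddRef() { return ++m_refs; }
    unsigned long Release() {
        unsigned long remaining = --m_refs;
        if (remaining == 0) {
            delete this;             // the object deletes itself at 0
        }
        return remaining;
    }
private:
    ~RefCounted() { --alive; }       // private: only Release may destroy it
    unsigned long m_refs;            // simplification: starts at 0, not thread-safe
};
int RefCounted::alive = 0;

// Hypothetical minimal smart pointer in the spirit of the COM smart pointers.
template <typename T>
class IntrusivePtr {
public:
    explicit IntrusivePtr(T* p) : m_p(p) { if (m_p) m_p->AddRef(); }
    IntrusivePtr(const IntrusivePtr& other) : m_p(other.m_p) { if (m_p) m_p->AddRef(); }
    IntrusivePtr& operator=(const IntrusivePtr&) = delete;  // left out for brevity
    ~IntrusivePtr() { if (m_p) m_p->Release(); }
    T* operator->() const { return m_p; }
private:
    T* m_p;
};

// Returns how many RefCounted objects remain once both pointers are gone.
int AliveAfterRelease() {
    {
        IntrusivePtr<RefCounted> a(new RefCounted());  // count 1
        IntrusivePtr<RefCounted> b(a);                 // count 2
    }                                                  // two Release calls, then self-delete
    return RefCounted::alive;
}
```

Compare this to shared_ptr: there is no separate control block to allocate, because the object carries its own count.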
Performance – Just as much performance as calling AddRef/Release yourself!
Conclusion
“So what were those simple rules for not leaking memory in my C++ application?”
- Use smart pointers with every new keyword.
- Always pass your smart pointers by reference: void DoSomething(smartPtr& data);
- Never call delete on a smart pointer.
- Be mindful of shared_ptr and arrays. Use a custom deleter as shown previously
- Circular references. Know how to use weak_ptr!
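The “pass by reference” rule is easy to see with shared_ptr: passing by value makes a copy, and the copy bumps the reference count for the duration of the call. A minimal sketch, with CountByValue and CountByRef as hypothetical names for illustration:

```cpp
#include <memory>

// By value: the parameter is a copy, so the count goes up during the call.
long CountByValue(std::shared_ptr<int> p) {
    return p.use_count();
}

// By const reference: no copy is made, so the count is untouched.
long CountByRef(const std::shared_ptr<int>& p) {
    return p.use_count();
}
```

With a single owner in the caller, CountByRef reports 1 and CountByValue reports 2. Passing by value is still the right call when the callee is meant to keep its own reference or, with unique_ptr, to take ownership.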
We all know that no application is free of memory leaks, no matter how managed it is. Hopefully I have shown how memory management in C++ is not as difficult as it used to be and even mortals like us can make applications that are robust and free of leaks and tedious cleanup routines. Just #include <memory>.
-Jer
Probably the best primer on memory management for C++ from a .NET point of view I have ever read. Thanks and let’s hope you continue to write more. My guess is C++ will become “fashionable” again as we realise that .NET is not the right tool for all the jobs.
[...] C++ Productivity: Memory Management 101 for the .NET Guy (Jeremiah Morrill) [...]
Great article and refresher!
Trying to remember all of the little nuances of creating and building a C++ project in VS 2010 is like trying to type with your hands tied behind your back. #include, std::, using, .h files, etc.
But the more interesting question to me is, “Why the move into c++?”
Work? Fun? Or are you trying to get a jump on September?
Perspiring minds want to know!
This was an awesome overview. Thanks!
Waiting for the series to move on.