Call it overselling, but we'll tell you up front: we have good material for this article. This is only because I convinced my good friend Petru Marginean to co-author the article. Petru has developed a library facility that is of great help with exceptions. We streamlined the implementation together until we obtained a lean, mean library that in specific cases can help you write exception-safe code much more easily.
Let's face it, writing correct code in the presence of exceptions is not an easy task. Exceptions establish a separate control flow that has little to do with the main control flow of the application. Figuring out the exception flow requires a different way of thinking, and, well, new tools.
Writing Exception-Safe Code Is Hard: An Example
Let's say you are developing one of those trendy instant messaging servers. Users log on and off the system and can send messages to each other. You hold a server-side database of users, plus in-memory information for users who are logged on. Each user can have friends. The list of friends is also kept both in the database and in memory.
When a user adds or removes a friend, you need to do two things: update the database, and update the in-memory cache that you keep for that user. It's that simple.
Assuming you model per-user information in a class called User, and that you model the user database with a UserDatabase class, the code for adding a friend could look like below:
    class User
    {
        ...
        string GetName();
        void AddFriend(User& newFriend);
    private:
        typedef vector<User*> UserCont;
        UserCont friends_;
        UserDatabase* pDB_;
    };

    void User::AddFriend(User& newFriend)
    {
        // Add the new friend to the database
        pDB_->AddFriend(GetName(), newFriend.GetName());
        // Add the new friend to the vector of friends
        friends_.push_back(&newFriend);
    }
Surprisingly, the two-liner User::AddFriend hides a pernicious bug. In an out-of-memory condition, vector::push_back can fail by throwing an exception. In that case, you will end up having the friend added to the database, but not to the in-memory information.
Now we've got a problem, haven't we? In any circumstance, this lack of consistency of information is dangerous. It is likely that many parts of your application are based on the assumption that the database is in sync with the in-memory information.
A simple approach to the problem is to think of switching the two lines of code:
    void User::AddFriend(User& newFriend)
    {
        // Add the new friend to the vector of friends
        // If this throws, the friend is not added to
        // the vector, nor the database
        friends_.push_back(&newFriend);
        // Add the new friend to the database
        pDB_->AddFriend(GetName(), newFriend.GetName());
    }
This definitely saves consistency in the case of vector::push_back failing. Unfortunately, as you consult UserDatabase::AddFriend's documentation, you find out with annoyance that it can throw an exception, too! Now you might end up putting the friend into the vector, but not in the database!
It's time you interrogate the database folks: "Why don't you guys return an error code instead of throwing an exception?" "Well," they say, "we're using a highly reliable cluster of XYZ database servers on a TZN network, so failure is extremely rare. Being this rare, we thought it's best to model failure with an exception, because exceptions appear only in exceptional conditions, right?"
It makes sense, but you still need to address failure. You don't want a database failure to drag the whole system toward chaos. This way you can fix the database without having to shut down the whole server.
In essence, you must do two operations, any of which can fail. If any of them fails, you must undo the whole thing. Let's see how this can be done.
Solution 1: Brute Force
A simple solution is to throw in (sic!) a try-catch block:
    void User::AddFriend(User& newFriend)
    {
        friends_.push_back(&newFriend);
        try
        {
            pDB_->AddFriend(GetName(), newFriend.GetName());
        }
        catch (...)
        {
            friends_.pop_back();
            throw;
        }
    }
If vector::push_back fails, that's okay because UserDatabase::AddFriend is never reached. If UserDatabase::AddFriend fails, you catch the exception (no matter what it is), you undo the push_back operation with a call to vector::pop_back, and nicely re-throw the exact same exception.
The code works, but at the cost of increased size and clumsiness. The two-liner just became a six-liner. The technique isn't quite appealing; imagine littering all of your code with such try-catch statements.
Moreover, the technique doesn't scale well. Imagine you have a third operation to do. In that case, things suddenly become much clumsier. You can choose between equally awkward solutions: nested try statements or a more complicated control flow featuring additional flags. These solutions raise code bloating issues, efficiency issues, and most important, severe understandability and maintenance issues.
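To make the scaling problem concrete, here is a minimal sketch of the nested-try variant with three operations. The names AddThreeWays, cache, index, and failLast are hypothetical stand-ins, not part of the messaging example; the third operation simply throws on demand so the rollback path can be exercised.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical three-step operation; each completed step needs its own undo.
void AddThreeWays(std::vector<int>& cache, std::vector<int>& index, bool failLast)
{
    cache.push_back(42);                     // operation 1
    try {
        index.push_back(42);                 // operation 2
        try {
            if (failLast)                    // operation 3 (simulated failure)
                throw std::runtime_error("third operation failed");
        } catch (...) {
            index.pop_back();                // undo operation 2
            throw;
        }
    } catch (...) {
        cache.pop_back();                    // undo operation 1
        throw;
    }
}

// Returns true if a failure in the last step leaves both containers untouched.
bool NestedTryRollsBack()
{
    std::vector<int> cache, index;
    try {
        AddThreeWays(cache, index, true);
    } catch (const std::runtime_error&) {
        return cache.empty() && index.empty();
    }
    return false;
}
```

Each additional operation adds another nesting level and another catch block, which is the clumsiness described above.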
Solution 2: The Politically Correct Approach
Show the above to any C++ expert, and you're likely to hear: "Nah, that's no good. You must use the resource acquisition is initialization idiom [1] and leverage destructors for automatic resource deallocation in case of failure."
Okay, let's go down that path. For each operation that you must undo, there's a corresponding class. The constructor of that class "does" the operation and the destructor rolls that operation back, unless you call a "commit" function in which case the destructor does nothing.
Some code will make all this crystal-clear. For the push_back operation, let's put together a VectorInserter class like so:
    class VectorInserter
    {
    public:
        VectorInserter(std::vector<User*>& v, User& u)
        : container_(v), user_(u), commit_(false)
        {
            container_.push_back(&u);
        }
        void Commit() throw()
        {
            commit_ = true;
        }
        ~VectorInserter()
        {
            if (!commit_) container_.pop_back();
        }
    private:
        std::vector<User*>& container_;
        User& user_;
        bool commit_;
    };
Maybe the most important thing in the code above is the throw() specification next to Commit. It documents the reality that Commit always succeeds, because you already did the work; Commit just tells VectorInserter: "Everything's fine, don't roll back anything."
You use the whole machinery like this:
    void User::AddFriend(User& newFriend)
    {
        // The constructor takes a User&, so pass the object, not its address
        VectorInserter ins(friends_, newFriend);
        pDB_->AddFriend(GetName(), newFriend.GetName());
        // Everything went fine, commit the vector insertion
        ins.Commit();
    }
AddFriend now has two distinct parts: the activity phase, in which the operations occur, and the commitment phase, which doesn't throw; it only stops the undo from happening.
The way AddFriend works is simple: if any operation fails, the commit point is not reached and the whole operation is called off. The VectorInserter pop_backs the data entered, so the program remains in the state it was in before calling AddFriend.
The idiom works nicely in all cases. If, for example, the vector insertion fails, the destructor of ins is not called, because ins isn't constructed.
This approach works just fine, but in the real world, it turns out not to be that neat. You must write a bunch of little classes to support this idiom. Extra classes mean extra code to write, intellectual overhead, and additional entries to your class browser. Moreover, it turns out there are lots of spots in which you must deal with exception safety. Let's face it, adding a new class every so often just for undoing an arbitrary operation in its destructor is not the brightest idea of productivity.
Oh, and by the way, VectorInserter has a bug. Did you notice it? The implicitly generated copy constructor can easily lead to errors: if the copied-from object is not yet committed, the destructors will later probably do too many pop_backs. Defining classes is hard; that's another reason for avoiding writing lots of them.
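The copy problem is easy to reproduce. The sketch below uses a simplified, hypothetical IntInserter (the same shape as VectorInserter, but for a vector of int) and copies it before committing; both the original and the copy then roll back on destruction.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for VectorInserter (hypothetical name, same shape).
class IntInserter
{
public:
    IntInserter(std::vector<int>& v, int x) : container_(v), commit_(false)
    {
        container_.push_back(x);
    }
    void Commit() { commit_ = true; }
    ~IntInserter() { if (!commit_) container_.pop_back(); }
private:
    std::vector<int>& container_;
    bool commit_;
};

// Starts with two elements, inserts a third, then copies the uncommitted inserter.
std::size_t SizeAfterAccidentalCopy()
{
    std::vector<int> v;
    v.push_back(1);
    v.push_back(2);
    {
        IntInserter ins(v, 3);   // v now holds three elements
        IntInserter copy(ins);   // implicit copy constructor: no push_back happens
    }                            // both destructors call pop_back
    return v.size();             // one pop_back too many
}
```

One element was inserted, but two were removed, so the vector ends up shorter than it started. Declaring the copy operations private (or otherwise disabling them) would be one way to close the hole.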
Solution 3: The Real Approach
In the real world, when the programmer sits down to write AddFriend, either he's reviewed all the options above, or he didn't have time to care about them. At the end of the day, you know what the real result usually is? Of course you do:
    void User::AddFriend(User& newFriend)
    {
        friends_.push_back(&newFriend);
        pDB_->AddFriend(GetName(), newFriend.GetName());
    }
It's a solution based upon not-too-scientific arguments.
"Who said memory's going to be exhausted? There's half a gig in this box!"
"Even if memory does become exhausted, the paging system will slow the program down to a crawl way before the program crashes."
"The database folks said AddFriend cannot possibly fail. They're using XYZ and TZN!"
"It's not worth the trouble. We'll think of it at a later review."
Solutions that require a lot of discipline and grunt work are not very attractive. Under schedule pressure, a good but clumsy solution loses its utility. Everybody knows how things must be done by the book, but will consistently take the shortcut. The one true way is to provide reusable solutions that are correct and easy to use.
When you do take the shortcut, you check in the code with an unpleasant feeling of imperfection, but the feeling gradually peters out as all tests run just fine. As time goes on, the spots that can cause problems "in theory" start to crop up in practice.
You know you have a problem, however, and a big one: you have given up controlling the correctness of your application. Now when the server crashes, you don't have much of a clue about where to start: is it a hardware failure, a genuine bug, or an amok state due to an exception? Not only are you exposed to involuntary bugs, you have deliberately introduced them!
Even if things work okay for a while, life is change. The number of users can grow, stressing memory to its limits. Your network administrator might disable paging for the sake of performance. Your database might not be so infallible. And you are unprepared for any of these.
Solution 4: Petru's Approach
Using the ScopeGuard tool (which we'll detail in a minute), you can easily write code that's simple, correct, and efficient:
    void User::AddFriend(User& newFriend)
    {
        friends_.push_back(&newFriend);
        ScopeGuard guard = MakeObjGuard(
            friends_, &UserCont::pop_back);
        pDB_->AddFriend(GetName(), newFriend.GetName());
        guard.Dismiss();
    }
The only job of guard above is to call friends_.pop_back when it exits its scope. That is, unless you Dismiss it. If you do that, guard doesn't do anything anymore.
ScopeGuard implements automatic calls to functions or member functions in its destructor. It can be helpful when you want to implement automatic undoing of atomic operations in the presence of exceptions.
You use ScopeGuard like so: if you need to do several operations in an "all-or-nothing" fashion, you put a ScopeGuard after each operation. The execution of that ScopeGuard nullifies the effect of the operation above it:
    friends_.push_back(&newFriend);
    ScopeGuard guard = MakeObjGuard(
        friends_, &UserCont::pop_back);
ScopeGuard works with regular functions, too:
    void* buffer = std::malloc(1024);
    ScopeGuard freeIt = MakeGuard(std::free, buffer);
    // std::fopen needs a mode argument; "r" is assumed here
    FILE* topSecret = std::fopen("cia.txt", "r");
    ScopeGuard closeIt = MakeGuard(std::fclose, topSecret);
If all atomic operations succeed, you Dismiss all guards. Otherwise, each constructed ScopeGuard will diligently call the function with which you initialized it.
With ScopeGuard you can easily arrange to undo various operations without having to write special classes for removing the last element of a vector, freeing some memory, and closing a file. This makes ScopeGuard a very useful reusable solution for writing exception-safe code, easily.
Implementing ScopeGuard
ScopeGuard is a generalization of a typical implementation of the "resource acquisition is initialization" C++ idiom. The difference is that ScopeGuard focuses only on the cleanup part: you do the resource acquisition, and ScopeGuard takes care of relinquishing the resource. (In fact, cleaning up is arguably the most important part of the idiom.)
There are different ways of cleaning up resources, such as calling a function, calling a functor, and calling a member function of an object. Each of these can require zero, one, or more arguments.
Naturally, we model these variations by building a class hierarchy. The destructors of the objects in the hierarchies do the actual job. The base of the hierarchy is the ScopeGuardImplBase class, shown below:
    class ScopeGuardImplBase
    {
    public:
        void Dismiss() const throw()
        {
            dismissed_ = true;
        }
    protected:
        ScopeGuardImplBase() : dismissed_(false)
        {}
        ScopeGuardImplBase(const ScopeGuardImplBase& other)
        : dismissed_(other.dismissed_)
        {
            other.Dismiss();
        }
        ~ScopeGuardImplBase() {} // nonvirtual (see below why)
        mutable bool dismissed_;
    private:
        // Disable assignment
        ScopeGuardImplBase& operator=(
            const ScopeGuardImplBase&);
    };
ScopeGuardImplBase gathers the management of the dismissed_ flag, which controls whether derived classes perform cleanup or not. If dismissed_ is true, then derived classes will not do anything during their destruction.
This brings us to the missing virtual in the definition of ScopeGuardImplBase's destructor. How would one expect polymorphic behavior of the destructor if it's not virtual? Well, just hold your curiosity for a second; we have an ace up our sleeve, with which we obtain polymorphic behavior without the overhead of virtual functions.
For now, let's see how to implement an object that calls a function or functor taking one argument in its destructor. If, however, you call Dismiss, the function/functor is not invoked anymore.
    template <typename Fun, typename Parm>
    class ScopeGuardImpl1 : public ScopeGuardImplBase
    {
    public:
        ScopeGuardImpl1(const Fun& fun, const Parm& parm)
        : fun_(fun), parm_(parm)
        {}
        ~ScopeGuardImpl1()
        {
            if (!dismissed_) fun_(parm_);
        }
    private:
        Fun fun_;
        const Parm parm_;
    };
To make it easy to use ScopeGuardImpl1, let's write a helper function.
    template <typename Fun, typename Parm>
    ScopeGuardImpl1<Fun, Parm>
    MakeGuard(const Fun& fun, const Parm& parm)
    {
        return ScopeGuardImpl1<Fun, Parm>(fun, parm);
    }
MakeGuard relies on the compiler's ability to deduce template arguments for template functions. This way you don't need to specify the template arguments to ScopeGuardImpl1; in fact, you don't need to explicitly create ScopeGuardImpl1 objects at all. This trick is used by standard library functions such as make_pair and bind1st.
Still curious about how to achieve polymorphic behavior of the destructor, without a virtual destructor? Here is the definition of ScopeGuard, which, surprisingly, is a mere typedef:
    typedef const ScopeGuardImplBase& ScopeGuard;
It's time for us to disclose the whole machinery. According to the C++ standard, a const reference initialized with a temporary value makes that temporary value live for the lifetime of the reference itself. Let's explain this with an example. If you write:
    // std::fopen needs a mode argument; "r" is assumed here
    FILE* topSecret = std::fopen("cia.txt", "r");
    ScopeGuard closeIt = MakeGuard(std::fclose, topSecret);
then MakeGuard creates a temporary variable of type (deep breath here):
    ScopeGuardImpl1<int (&)(FILE*), FILE*>
This is because the type of std::fclose is a function taking a FILE* and returning an int. The temporary variable of the type above is bound to the const reference closeIt. The language rule mentioned above is that the temporary lives at least as long as the reference, and when it is destroyed, the correct destructor is called. In turn, the destructor closes the file.
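The lifetime rule can be observed in isolation. The following sketch uses hypothetical TracerBase and Tracer classes (not part of ScopeGuard) whose derived destructor writes to a log; the base destructor is nonvirtual and protected, as in ScopeGuardImplBase. It assumes the copy elision that current compilers perform on the returned temporary (guaranteed since C++17).

```cpp
#include <cassert>
#include <string>

class TracerBase
{
protected:
    ~TracerBase() {}                 // nonvirtual, like ScopeGuardImplBase
};

class Tracer : public TracerBase
{
public:
    explicit Tracer(std::string* log) : log_(log) {}
    ~Tracer() { *log_ += "destroyed;"; }
private:
    std::string* log_;
};

Tracer MakeTracer(std::string* log) { return Tracer(log); }

// Records the order of events around a temporary bound to a const base reference.
std::string TraceLifetime()
{
    std::string log;
    {
        const TracerBase& ref = MakeTracer(&log);   // temporary's life is extended
        (void)ref;
        log += "inside;";
    }                                               // ~Tracer runs here
    log += "after;";
    return log;
}
```

The derived destructor runs when the reference goes out of scope, not at the end of the initializing statement, and it is the derived destructor that runs even though the reference is to the base class.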
ScopeGuardImpl1 supports functions (or functors) taking one parameter. It is very simple to build classes that accept zero, two, or more parameters (ScopeGuardImpl0, ScopeGuardImpl2...). Once you have these, you can overload MakeGuard to achieve a nice, unified syntax:
    template <typename Fun>
    ScopeGuardImpl0<Fun>
    MakeGuard(const Fun& fun)
    {
        return ScopeGuardImpl0<Fun>(fun);
    }
    ...
By now, we already have a powerful means of expressing automatic calls to functions. MakeGuard is an excellent tool, especially when it comes to interfacing with C APIs without having to write lots of wrapper classes.
What's even better is the preservation of efficiency, as there's no virtual call involved.
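Putting the pieces of this section together gives a small self-contained sketch that can be compiled and stepped through. Two things here are our assumptions rather than the listings above: the throw() specifications are left out, and MakeGuard takes the function by value so that a plain function name decays to a function pointer that can be stored in fun_. Increment and CleanupsRun are hypothetical helpers for the demonstration.

```cpp
#include <cassert>

class ScopeGuardImplBase
{
public:
    void Dismiss() const { dismissed_ = true; }
protected:
    ScopeGuardImplBase() : dismissed_(false) {}
    ScopeGuardImplBase(const ScopeGuardImplBase& other)
    : dismissed_(other.dismissed_)
    {
        other.Dismiss();             // the copy takes over responsibility
    }
    ~ScopeGuardImplBase() {}         // nonvirtual
    mutable bool dismissed_;
private:
    ScopeGuardImplBase& operator=(const ScopeGuardImplBase&);
};

template <typename Fun, typename Parm>
class ScopeGuardImpl1 : public ScopeGuardImplBase
{
public:
    ScopeGuardImpl1(Fun fun, const Parm& parm) : fun_(fun), parm_(parm) {}
    ~ScopeGuardImpl1() { if (!dismissed_) fun_(parm_); }
private:
    Fun fun_;
    const Parm parm_;
};

// Assumption: Fun is passed by value so function names decay to pointers.
template <typename Fun, typename Parm>
ScopeGuardImpl1<Fun, Parm> MakeGuard(Fun fun, const Parm& parm)
{
    return ScopeGuardImpl1<Fun, Parm>(fun, parm);
}

typedef const ScopeGuardImplBase& ScopeGuard;

// Hypothetical cleanup function: counts how often it is called.
void Increment(int* counter) { ++*counter; }

// Returns the number of cleanup calls made when the guard leaves its scope.
int CleanupsRun(bool dismiss)
{
    int calls = 0;
    {
        ScopeGuard guard = MakeGuard(Increment, &calls);
        if (dismiss) guard.Dismiss();
    }
    return calls;
}
```

CleanupsRun(false) reports one cleanup call and CleanupsRun(true) reports none, with no virtual function anywhere in the hierarchy.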
ScopeGuard for Objects and Member Functions
So far, so good, but what about invoking member functions for objects? It's not hard at all. Let's implement ObjScopeGuardImpl0, a class template that can invoke a parameterless member function for an object.
    template <class Obj, typename MemFun>
    class ObjScopeGuardImpl0 : public ScopeGuardImplBase
    {
    public:
        ObjScopeGuardImpl0(Obj& obj, MemFun memFun)
        : obj_(obj), memFun_(memFun)
        {}
        ~ObjScopeGuardImpl0()
        {
            // Invoke the stored member function on the stored object
            if (!dismissed_) (obj_.*memFun_)();
        }
    private:
        Obj& obj_;
        MemFun memFun_;
    };
ObjScopeGuardImpl0 is a bit more exotic because it uses the lesser-known pointers to member functions and operator.*(). To understand how it works, let's take a look at MakeObjGuard's implementation. (We availed ourselves of MakeObjGuard in the opening section.)
    template <class Obj, typename MemFun>
    ObjScopeGuardImpl0<Obj, MemFun>
    MakeObjGuard(Obj& obj, MemFun memFun)
    {
        return ObjScopeGuardImpl0<Obj, MemFun>(obj, memFun);
    }
Now if you call:
    ScopeGuard guard = MakeObjGuard(
        friends_, &UserCont::pop_back);
then an object of the following type is created:
    ObjScopeGuardImpl0<UserCont, void (UserCont::*)()>
Fortunately, MakeObjGuard shelters you from having to write types that look like uninspired emoticons. The mechanism is the same: when guard leaves its scope, the destructor of the temporary object is called. The destructor invokes the member function via pointer to member. To achieve that, we use operator.*.
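As a rough end-to-end illustration, the sketch below wires the object guard into a scaled-down AddFriend. The free function AddFriend, the FriendIds typedef, and the SaveToDatabase stub (which throws on request) are hypothetical simplifications of the User and UserDatabase classes, and the throw() specifications are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

class ScopeGuardImplBase
{
public:
    void Dismiss() const { dismissed_ = true; }
protected:
    ScopeGuardImplBase() : dismissed_(false) {}
    ScopeGuardImplBase(const ScopeGuardImplBase& other)
    : dismissed_(other.dismissed_) { other.Dismiss(); }
    ~ScopeGuardImplBase() {}
    mutable bool dismissed_;
private:
    ScopeGuardImplBase& operator=(const ScopeGuardImplBase&);
};

template <class Obj, typename MemFun>
class ObjScopeGuardImpl0 : public ScopeGuardImplBase
{
public:
    ObjScopeGuardImpl0(Obj& obj, MemFun memFun) : obj_(obj), memFun_(memFun) {}
    ~ObjScopeGuardImpl0() { if (!dismissed_) (obj_.*memFun_)(); }
private:
    Obj& obj_;
    MemFun memFun_;
};

template <class Obj, typename MemFun>
ObjScopeGuardImpl0<Obj, MemFun> MakeObjGuard(Obj& obj, MemFun memFun)
{
    return ObjScopeGuardImpl0<Obj, MemFun>(obj, memFun);
}

typedef const ScopeGuardImplBase& ScopeGuard;
typedef std::vector<int> FriendIds;          // hypothetical: ids instead of User*

// Hypothetical database stub that fails on demand.
void SaveToDatabase(int friendId, bool fail)
{
    (void)friendId;
    if (fail) throw std::runtime_error("database unavailable");
}

void AddFriend(FriendIds& friends, int id, bool dbFails)
{
    friends.push_back(id);
    ScopeGuard guard = MakeObjGuard(friends, &FriendIds::pop_back);
    SaveToDatabase(id, dbFails);             // may throw
    guard.Dismiss();                         // commit: keep the new element
}

// Returns how many friends are in memory after one attempted insertion.
std::size_t FriendsAfterAdd(bool dbFails)
{
    FriendIds friends;
    try { AddFriend(friends, 7, dbFails); }
    catch (const std::runtime_error&) {}
    return friends.size();
}
```

When the database call throws, the guard pops the element again and the vector is back to its previous size; when it succeeds, Dismiss keeps the insertion.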
Error Handling
If you have read Herb Sutter's work on exceptions [2], you know that it is essential that destructors not throw exceptions. A throwing destructor makes it impossible to write correct code, and can shut down your application without any warning. In C++, once an exception has been thrown, if a destructor called during stack unwinding emits another exception, the application terminates immediately.
The destructors of ScopeGuardImplX and ObjScopeGuardImplX call an unknown function or member function respectively, and that other function might throw. This would halt the program, because the guards' destructors are deliberately designed to call the unknown function precisely during stack unwinding when an exception is active! In theory, you should never pass functions that throw to MakeGuard or MakeObjGuard. In practice (as you can see in the downloadable code), the destructor is shielded from any exceptions:
    template <class Obj, typename MemFun>
    class ObjScopeGuardImpl0 : public ScopeGuardImplBase
    {
        ...
    public:
        ~ObjScopeGuardImpl0()
        {
            if (!dismissed_)
                try { (obj_.*memFun_)(); }
                catch(...) {}
        }
    };
Yes, the catch(...) block does not do anything at all. This is not a hack. It is fundamental that in the realm of exceptions, if your "undo/recover" action fails, there is pretty much nothing you can do. You attempt undoing, and you move on regardless of whether the undo operation succeeds or not.
A possible sequence of actions in our instant messaging example is: you insert a friend in the database, you try to insert it in the friends_ vector and fail, and consequently you try to delete the user from the database. There is a narrow chance that somehow the deletion from the database fails, too, which leads to a very unpleasant state of affairs.
In general, you should put guards on operations that you are sure you can undo successfully.
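The effect of the shielded destructor can be checked with a deliberately failing undo action. ShieldedGuard and FailingUndo below are hypothetical, stripped-down stand-ins (no Dismiss, no helper function), written only to show that the original exception still reaches the caller when the undo itself throws.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical undo action that always fails.
struct FailingUndo
{
    void operator()() const { throw std::runtime_error("undo failed"); }
};

// Minimal guard whose destructor swallows anything the undo action throws.
template <typename Fun>
class ShieldedGuard
{
public:
    explicit ShieldedGuard(const Fun& fun) : fun_(fun) {}
    ~ShieldedGuard()
    {
        try { fun_(); }
        catch (...) {}               // nothing sensible left to do
    }
private:
    Fun fun_;
    ShieldedGuard(const ShieldedGuard&);             // not copyable
    ShieldedGuard& operator=(const ShieldedGuard&);
};

// Returns the message of the exception that actually reaches the caller.
std::string ExceptionSeenByCaller()
{
    try {
        FailingUndo undo;
        ShieldedGuard<FailingUndo> guard(undo);
        throw std::logic_error("original failure");  // triggers stack unwinding
    } catch (const std::exception& e) {
        return e.what();
    }
    return "no exception";
}
```

The caller sees "original failure"; the failed undo is silently dropped. Note that since C++11 destructors are noexcept by default, so without the try-catch an exception leaving the destructor would end in std::terminate even outside stack unwinding.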
Supporting Parameters by Reference
Petru and I were happily using ScopeGuard for a while, until we stumbled upon a problem. Consider the code below:
    void Decrement(int& x) { --x; }

    void UseResource(int refCount)
    {
        ++refCount;
        ScopeGuard guard = MakeGuard(Decrement, refCount);
        ...
    }
The guard object above ensures that the value of refCount is preserved upon exiting UseResource. (This idiom is useful in some resource sharing cases.)
In spite of its usefulness, the code above does not work. The problem is, ScopeGuard stores a copy of refCount (see the definition of ScopeGuardImpl1, member variable parm_) and not a reference to it. In this case, we need, however, to store a reference to refCount, so Decrement can operate on it.
One solution would be to implement additional classes such as ScopeGuardImplRef and MakeGuardRef. This is a lot of duplication and it gets nasty as you implement classes for multiple parameters.
The solution we settled on consists of a little helper class that transforms a reference into a value:
    template <class T>
    class RefHolder
    {
        T& ref_;
    public:
        RefHolder(T& ref) : ref_(ref) {}
        operator T& () const
        {
            return ref_;
        }
    };

    template <class T>
    inline RefHolder<T> ByRef(T& t)
    {
        return RefHolder<T>(t);
    }
RefHolder and its companion helper function ByRef seamlessly adapt a reference to a value, and allow ScopeGuardImpl1 to work with references without any modification. All you have to do is to wrap your references in calls to ByRef, like so:
    void Decrement(int& x) { --x; }

    void UseResource(int refCount)
    {
        ++refCount;
        ScopeGuard guard = MakeGuard(Decrement, ByRef(refCount));
        ...
    }
We find this solution to be pretty expressive and suggestive.
The nicest part of reference support is the const modifier used in ScopeGuardImpl1. Here's the relevant excerpt:
    template <typename Fun, typename Parm>
    class ScopeGuardImpl1 : public ScopeGuardImplBase
    {
        ...
    private:
        Fun fun_;
        const Parm parm_;
    };
This little const is very important. It prevents code that uses non-const references from compiling and running incorrectly. In other words, if you forget to use ByRef with a function, the compiler will not allow incorrect code to compile.
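Here is a compact sketch of ByRef in action. As an assumption of this sketch, MakeGuard takes the function by value (so the function name decays to a pointer) and the throw() specifications are dropped; CountAfterScope is a hypothetical test helper.

```cpp
#include <cassert>

class ScopeGuardImplBase
{
public:
    void Dismiss() const { dismissed_ = true; }
protected:
    ScopeGuardImplBase() : dismissed_(false) {}
    ScopeGuardImplBase(const ScopeGuardImplBase& other)
    : dismissed_(other.dismissed_) { other.Dismiss(); }
    ~ScopeGuardImplBase() {}
    mutable bool dismissed_;
private:
    ScopeGuardImplBase& operator=(const ScopeGuardImplBase&);
};

template <typename Fun, typename Parm>
class ScopeGuardImpl1 : public ScopeGuardImplBase
{
public:
    ScopeGuardImpl1(Fun fun, const Parm& parm) : fun_(fun), parm_(parm) {}
    ~ScopeGuardImpl1() { if (!dismissed_) fun_(parm_); }
private:
    Fun fun_;
    const Parm parm_;                // const: a plain int cannot bind to int&
};

template <typename Fun, typename Parm>
ScopeGuardImpl1<Fun, Parm> MakeGuard(Fun fun, const Parm& parm)
{
    return ScopeGuardImpl1<Fun, Parm>(fun, parm);
}

typedef const ScopeGuardImplBase& ScopeGuard;

template <class T>
class RefHolder
{
    T& ref_;
public:
    RefHolder(T& ref) : ref_(ref) {}
    operator T&() const { return ref_; }   // converts back to the reference
};

template <class T>
inline RefHolder<T> ByRef(T& t) { return RefHolder<T>(t); }

void Decrement(int& x) { --x; }

// Increments a counter, guards it with Decrement, and reports the final value.
int CountAfterScope()
{
    int refCount = 0;
    {
        ++refCount;
        ScopeGuard guard = MakeGuard(Decrement, ByRef(refCount));
        (void)guard;
        // MakeGuard(Decrement, refCount) is expected not to compile here:
        // the stored const int cannot be passed as int&.
    }
    return refCount;
}
```

The guard decrements the original variable, so the counter is back at its starting value once the block exits.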
But Wait, There's More
By now, you have a good tool that helps you write correct code without having to agonize about it. Sometimes, however, you want the ScopeGuard to always execute when you exit the block. In this case, declaring a dummy variable of type ScopeGuard is awkward: you only need a temporary, you don't need a named temporary.
The macro ON_BLOCK_EXIT does exactly what you want, and lets you write expressive code like below:
    {
        // std::fopen needs a mode argument; "r" is assumed here
        FILE* topSecret = std::fopen("cia.txt", "r");
        ON_BLOCK_EXIT(std::fclose, topSecret);
        ... use topSecret ...
    } // topSecret automagically closed
ON_BLOCK_EXIT says: "I want this action to be performed when the current block exits." Similarly, ON_BLOCK_EXIT_OBJ implements the same feature for a member function call.
These macros use non-orthodox (albeit legal) macro wizardry, which shall go undisclosed. The curious can look it up in the code. (Due to a compiler bug, Microsoft VC++ users will have to disable the "Program Database for Edit and Continue" project setting for ON_BLOCK_EXIT to work.)
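For readers who want a feel for the idea without opening the download, one common way to obtain an unnamed guard is to paste __LINE__ into the variable name. The sketch below is our guess at such a scheme, not the macro from the downloadable code; all SKETCH_ names are hypothetical, MakeGuard takes the function by value, and throw() specifications are omitted.

```cpp
#include <cassert>

class ScopeGuardImplBase
{
public:
    void Dismiss() const { dismissed_ = true; }
protected:
    ScopeGuardImplBase() : dismissed_(false) {}
    ScopeGuardImplBase(const ScopeGuardImplBase& other)
    : dismissed_(other.dismissed_) { other.Dismiss(); }
    ~ScopeGuardImplBase() {}
    mutable bool dismissed_;
private:
    ScopeGuardImplBase& operator=(const ScopeGuardImplBase&);
};

template <typename Fun, typename Parm>
class ScopeGuardImpl1 : public ScopeGuardImplBase
{
public:
    ScopeGuardImpl1(Fun fun, const Parm& parm) : fun_(fun), parm_(parm) {}
    ~ScopeGuardImpl1() { if (!dismissed_) fun_(parm_); }
private:
    Fun fun_;
    const Parm parm_;
};

template <typename Fun, typename Parm>
ScopeGuardImpl1<Fun, Parm> MakeGuard(Fun fun, const Parm& parm)
{
    return ScopeGuardImpl1<Fun, Parm>(fun, parm);
}

typedef const ScopeGuardImplBase& ScopeGuard;

// Two-level concatenation so that __LINE__ is expanded before pasting.
#define SKETCH_CONCAT_DIRECT(a, b) a##b
#define SKETCH_CONCAT(a, b) SKETCH_CONCAT_DIRECT(a, b)
// Expands to: ScopeGuard scopeGuard<line> = MakeGuard(<arguments>)
#define SKETCH_ON_BLOCK_EXIT \
    ScopeGuard SKETCH_CONCAT(scopeGuard, __LINE__) = MakeGuard

// Hypothetical cleanup function: counts how often it is called.
void Increment(int* counter) { ++*counter; }

// Returns the number of cleanup calls made by the time the block has exited.
int CallsAfterBlock()
{
    int calls = 0;
    {
        SKETCH_ON_BLOCK_EXIT(Increment, &calls);
        // ... work that may return early or throw ...
    }
    return calls;
}
```

A limitation of this particular scheme is that two such guards on the same source line would get the same variable name.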
ScopeGuard in the Real World
What we like about ScopeGuard is its ease of use and conceptual simplicity. This article detailed the whole implementation, but explaining ScopeGuard's usage only takes a couple of minutes. Among our colleagues, ScopeGuard spreads like wildfire. Everybody takes ScopeGuard for granted as a valuable tool that helps in various situations, from premature returns to exceptions. With ScopeGuard, you can write exception-safe code with reasonable ease, and understand and maintain it just as easily.
Every tool comes with a use recommendation, and ScopeGuard is no exception. You should use ScopeGuard as it was intended: as an automatic variable in functions. You should not hold ScopeGuard objects as member variables or allocate them on the heap. For those purposes, the downloadable code contains a Janitor class, which does exactly what ScopeGuard does, but in a more general way at the expense of some efficiency. Borland 5.5 users would need to use Janitor instead of ScopeGuard due to a compiler bug.
Conclusion
We have presented some issues that appear in certain cases of writing exception-safe code. After discussing a couple of ways of achieving exception safety in such cases, we introduced a solution applicable when failure-proof (and nonthrowing) undo operations are easily available. ScopeGuard uses several generic programming techniques to let you prescribe function and member function calls to be performed when a ScopeGuard variable exits a scope. Optionally, you can dismiss the ScopeGuard object.
ScopeGuard is useful when you need to perform automatic cleanup of resources and can rely on failure-proof undo operations. This idiom is important when you want to assemble an operation out of several atomic operations, each of which could fail but can also be undone. There are cases for which this approach is not applicable.
Acknowledgements
Herb Sutter provided an exceptional technical review of this article. The authors would also like to thank Mihai Antonescu and Dan Pravat for making useful corrections and suggestions.
References
[1] Bjarne Stroustrup. The C++ Programming Language, 3rd Edition (Addison-Wesley, 1997), page 366.
[2] Herb Sutter. Exceptional C++: 47 Engineering Puzzles, Programming Problems, and Solutions (Addison-Wesley, 2000).
About the Authors
Andrei Alexandrescu is a Development Manager at RealNetworks Inc. (www.realnetworks.com), based in Seattle, WA, and author of the acclaimed book Modern C++ Design. He may be contacted at andrei@metalanguage.com. Andrei is also one of the featured instructors of The C++ Seminar (www.gotw.ca/cpp_seminar).
Petru Marginean is senior C++ developer for Plural, New York. He can be reached at petrum@hotmail.com.