C++11 Completed RAII, Making Composition Easier

The addition of move semantics in C++11 is not just a performance and safety improvement. It’s also the feature that completed RAII. And as of C++11 I believe that RAII is absolutely necessary to make object composition easy in the language.

To illustrate, let’s look at how objects were composed before C++11, what problems we ran into, and how everything just works automatically since C++11. Let’s build an example out of three types:

    #include <vector>

    struct Expensive
    {
        std::vector<float> vec;
    };

    struct Group
    {
        Group();
        Group(const Group &);
        Group & operator=(const Group &);
        ~Group();

        int i;
        float f;
        std::vector<Expensive *> e; // owning pointers, freed in ~Group
    };

    struct World
    {
        World();
        World(const World &);
        World & operator=(const World &);
        ~World();

        std::vector<Group *> c; // owning pointers, freed in ~World
    };

Before C++11 composition looked something like this. It was OK to have a vector of floats, but you’d never have a vector of more expensive objects, because any time that vector re-allocated you’d have a very expensive operation on your hands. So instead you’d write a vector of pointers. Let’s implement all those functions:

    Group::Group()
        : i()
        , f()
    {
    }
    Group::Group(const Group & other)
        : i(other.i)
        , f(other.f)
    {
        e.reserve(other.e.size());
        for (std::vector<Expensive *>::const_iterator it = other.e.begin(), end = other.e.end(); it != end; ++it)
        {
            e.push_back(new Expensive(**it));
        }
    }
    Group & Group::operator=(const Group & other)
    {
        // Guard against self-assignment: without it we would delete
        // the very elements we are about to copy from.
        if (this == &other)
            return *this;
        i = other.i;
        f = other.f;
        for (std::vector<Expensive *>::iterator it = e.begin(), end = e.end(); it != end; ++it)
        {
            delete *it;
        }
        e.clear();
        e.reserve(other.e.size());
        for (std::vector<Expensive *>::const_iterator it = other.e.begin(), end = other.e.end(); it != end; ++it)
        {
            e.push_back(new Expensive(**it));
        }
        return *this;
    }
    Group::~Group()
    {
        for (std::vector<Expensive *>::iterator it = e.begin(), end = e.end(); it != end; ++it)
        {
            delete *it;
        }
    }

    World::World()
    {
    }
    World::World(const World & other)
    {
        c.reserve(other.c.size());
        for (std::vector<Group *>::const_iterator it = other.c.begin(), end = other.c.end(); it != end; ++it)
        {
            c.push_back(new Group(**it));
        }
    }
    World & World::operator=(const World & other)
    {
        // Same self-assignment guard as in Group.
        if (this == &other)
            return *this;
        for (std::vector<Group *>::iterator it = c.begin(), end = c.end(); it != end; ++it)
        {
            delete *it;
        }
        c.clear();
        c.reserve(other.c.size());
        for (std::vector<Group *>::const_iterator it = other.c.begin(), end = other.c.end(); it != end; ++it)
        {
            c.push_back(new Group(**it));
        }
        return *this;
    }
    World::~World()
    {
        for (std::vector<Group *>::iterator it = c.begin(), end = c.end(); it != end; ++it)
        {
            delete *it;
        }
    }

Oh god this is painful to write now, but it illustrates how people used to do composition. Or rather, most of the time people just made their type non-copyable. Nobody wanted to maintain all this code. (I have personally introduced several bugs in manually maintained copy constructors and assignment operators; it is too easy to make typos in such mindless code.) So the easiest thing to do was to make the type non-copyable.

In fact oftentimes it looked like types were non-copyable even if they could have been copyable. It’s just difficult to tell with all these pointers flying around.

Nowadays I would write the above classes like this:

    struct Expensive
    {
        std::vector<float> vec;
    };

    struct Group
    {
        int i = 0;
        float f = 0.0f;
        std::vector<Expensive> e;
    };

    struct World
    {
        std::vector<Group> c;
    };

This does everything that the above code did, and it does it faster and with fewer heap allocations. The main feature in C++11 that made this possible was the addition of move semantics. Why isn’t this possible without move semantics? After all, that last chunk of code would have compiled fine and run fine before C++11. But before C++11 people would have changed this code to look like the code further up. To see why, let’s look at what happens when we add a new Group to the World:

Let’s say that the Expensive class often has hundreds of floats in its vector and the Group class often has dozens of Expensive objects around. Now our World has two Groups in its vector and is about to push_back a third Group. That will cause the container to re-allocate its internal storage which will create two new Groups to replace the two old Groups. Those two new Groups have to allocate a new vector each to store new Expensives. Then each of those dozens of Expensives will again allocate a new vector each to store the data that they get from the old Expensives.

We end up with a lot of copies and a lot of heap allocations to do an operation that’s completely internal to the vector. It’s terrible that we can randomly get slowdowns like this from harmless operations like a push_back.

The first time that somebody catches this in a profiler they will take a look at the codebase and ask how often we copy Groups, really? The answer is probably that we’ll copy them very rarely. So why don’t we just replace the internals with a pointer? That will make the copy more expensive (one heap allocation per element in the vector) but it will make growing and shrinking the vector practically free. We get a huge performance improvement and everyone is happy. And with that we’re back at the initial code.

Move semantics solve that problem. With move semantics a lot of objects can re-organize their internals without having to copy everything that they own. That’s obviously very useful for std::vector, but it turns out to be useful in a lot of classes.
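The difference can be made visible by counting constructor calls while a vector grows. The following is only a sketch: `CopyOnly` and `Movable` are hypothetical stand-ins for a pre-C++11 and a C++11 version of a class like Expensive, and the counters exist purely for the experiment. It fills a vector to capacity and then adds one more element, which forces a re-allocation.

```cpp
#include <cassert>
#include <utility>
#include <vector>

static int copies = 0;
static int moves = 0;

// Behaves like a C++98 class: declaring a copy constructor suppresses
// the implicit move constructor, so every relocation is a deep copy.
struct CopyOnly
{
    CopyOnly() : data(100, 1.0f) {}
    CopyOnly(const CopyOnly & other) : data(other.data) { ++copies; }
    std::vector<float> data;
};

// Same payload, but with a noexcept move constructor that steals the buffer.
struct Movable
{
    Movable() : data(100, 1.0f) {}
    Movable(const Movable & other) : data(other.data) { ++copies; }
    Movable(Movable && other) noexcept : data(std::move(other.data)) { ++moves; }
    Movable & operator=(const Movable &) = default;
    Movable & operator=(Movable &&) = default;
    std::vector<float> data;
};

struct Counts
{
    int copies;
    int moves;
};

// Fills a vector to capacity, then counts what one more element costs.
template <typename T>
Counts regrow()
{
    std::vector<T> v;
    v.reserve(2);
    while (v.size() < v.capacity())
        v.emplace_back();   // constructed in place, nothing is relocated
    assert(v.size() == v.capacity());
    copies = 0;
    moves = 0;
    v.emplace_back();       // the vector is full, so it has to re-allocate
    Counts result = { copies, moves };
    return result;
}
```

With `CopyOnly` every existing element is deep-copied into the new storage. With `Movable` the same re-allocation performs no copies at all; each element only hands over its internal buffer.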

Move semantics also give composition to types that aren’t copyable. Before C++11 you could use RAII for non-copyable types, but then you couldn’t compose them as well as other classes. To illustrate, let’s add some kind of OS handle to the Expensive struct. And let’s say that this OS handle requires manual clean-up:

    struct Expensive
    {
        Expensive()
            : h(GetOsHandle())
        {
        }
        ~Expensive()
        {
            FreeOsHandle(h);
        }

        HANDLE h;
        std::vector<float> vec;
    };

And just with that, everything is ruined. Expensive now can’t be moved, and it can’t safely be copied either: the compiler still generates a copy constructor, but that copy duplicates the raw handle and both objects free it. That immediately breaks Group, which immediately breaks World. To fix this we could change Group to use a pointer to Expensive instead of using Expensive by value. But then Group has to be non-copyable, too, and World is still broken. So now we also have to change World to store Group by pointer, and we propagate our ugliness all the way through the codebase. A single type that requires manual clean-up makes us add the C++98 composition boilerplate to all other classes that use it directly or indirectly. It’s a mess.
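To see concretely what goes wrong, here is a small sketch with a counting stand-in for the OS API. The `HANDLE` typedef and the bodies of `GetOsHandle` and `FreeOsHandle` are assumptions made up for this experiment; they only count calls, so nothing is really freed twice. Strictly speaking the compiler still generates a copy constructor for this Expensive. It just does the wrong thing.

```cpp
#include <cassert>
#include <vector>

// Stand-ins for the OS API (assumptions for this sketch only).
typedef int HANDLE;
static int acquired = 0;
static int released = 0;
HANDLE GetOsHandle() { return ++acquired; }
void FreeOsHandle(HANDLE) { ++released; }

struct Expensive
{
    Expensive()
        : h(GetOsHandle())
    {
    }
    ~Expensive()
    {
        FreeOsHandle(h);
    }

    HANDLE h;
    std::vector<float> vec;
};

// Returns how many more times handles were freed than acquired.
int count_extra_frees()
{
    acquired = 0;
    released = 0;
    {
        Expensive a;
        Expensive b(a);     // implicit copy: b holds the same handle as a
        assert(a.h == b.h);
    }                       // both destructors run and free that one handle
    return released - acquired;
}
```

One handle is acquired and two are released. With a real OS handle the second release is a bug, and a re-allocating vector of Expensive would trigger it, because without a move constructor the vector falls back to copying.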

Of course you know the solution already: Move semantics. If we just wrap the OS handle in a type that supports move semantics, everything continues to work:

    #include <utility> // std::swap

    struct WrappedOsHandle
    {
        WrappedOsHandle()
            : h()
        {
        }
        WrappedOsHandle(HANDLE h)
            : h(h)
        {
        }
        // Move construction takes over the handle and leaves the source empty.
        WrappedOsHandle(WrappedOsHandle && other) noexcept
            : h(other.h)
        {
            other.h = HANDLE();
        }
        // Move assignment swaps; our old handle is freed when other is destroyed.
        WrappedOsHandle & operator=(WrappedOsHandle && other) noexcept
        {
            std::swap(h, other.h);
            return *this;
        }
        ~WrappedOsHandle()
        {
            if (h)
                FreeOsHandle(h);
        }
        operator HANDLE() const
        {
            return h;
        }

    private:
        HANDLE h;
    };

    struct Expensive
    {
        Expensive()
            : h(GetOsHandle())
        {
        }

        WrappedOsHandle h;
        std::vector<float> vec;
    };

It’s a bit of boilerplate, but there are ways of avoiding it (for example, use a unique_ptr with a custom deleter). Now whenever we use this handle type, our class stays composable. Group keeps working and the World keeps working and everyone is happy.
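A minimal sketch of the unique_ptr variant could look like this. The OS API is again a made-up stand-in (here `HANDLE` is assumed to be a `void *`, and `live_handles` only exists to check the bookkeeping), and `OsHandleDeleter` and `UniqueOsHandle` are hypothetical names. The `pointer` typedef in the deleter tells unique_ptr which type to store.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Stand-ins for the OS API (assumptions for this sketch only).
typedef void * HANDLE;
static int live_handles = 0;
HANDLE GetOsHandle() { ++live_handles; return new int(0); }
void FreeOsHandle(HANDLE h) { --live_handles; delete static_cast<int *>(h); }

// Custom deleter: unique_ptr stores a HANDLE and frees it through the OS call.
struct OsHandleDeleter
{
    typedef HANDLE pointer;
    void operator()(HANDLE h) const { FreeOsHandle(h); }
};
typedef std::unique_ptr<void, OsHandleDeleter> UniqueOsHandle;

struct Expensive
{
    Expensive() : h(GetOsHandle()) {}

    UniqueOsHandle h;
    std::vector<float> vec;
};

struct Group
{
    int i = 0;
    float f = 0.0f;
    std::vector<Expensive> e;
};

struct World
{
    std::vector<Group> c;
};

static_assert(!std::is_copy_constructible<Expensive>::value, "copying is disabled");
static_assert(std::is_move_constructible<World>::value, "the whole World can still be moved");

// Builds a World, moves it, and returns the number of handles still open afterwards.
int handles_left_after_use()
{
    {
        World w;
        w.c.emplace_back();             // one Group
        w.c.back().e.emplace_back();    // acquires a handle
        w.c.back().e.emplace_back();    // acquires another; may relocate the first
        assert(live_handles == 2);
        World moved = std::move(w);     // no handle is duplicated or freed
        assert(live_handles == 2);
        assert(moved.c.size() == 1 && moved.c[0].e.size() == 2);
    }
    return live_handles;
}
```

No class in this sketch declares a destructor, a copy operation or a move operation. The handle wrapper is the only place that knows about clean-up, and every type built on top of it gets its special member functions generated.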

There is a more fundamental reason why this works and why RAII is important for this: Composing objects is a lot easier if certain operations are standardized. If my object A consists of two objects B and C, it’s a lot easier to write the clean-up code for A if the clean-up code for all types is standardized. Otherwise B and C might have custom clean-up code and now A has to also have custom clean-up code. If everyone standardizes on one way to clean up objects, composition is easier.

The list of functions that make composition easier is long. It includes construction, copying, moving, assignment, swapping, destruction, reflection, comparison, hashing, checking for validity, pattern matching, interfacing with scripting languages, serialization in all its many forms and more. For example it’s a lot easier to write a hash function for my type if there is a standard way to hash my components. Or it’s a lot easier to copy my type if there is a standard way to copy my components. Not all types need all operations from this list, but if your type does need it, you’ll want a standard interface for your components. In fact once there is a standard way, you might as well automate this.
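Hashing makes a good small illustration. The sketch below assumes two made-up types, `Name` and `Employee`, and a hypothetical `hash_combine` helper (it is not a standard function; the mixing formula is one commonly used choice). Because std::hash is the standard interface for hashing components, the hash for each composed type is a one-liner built from the hashes of its parts.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper, not part of the standard library:
// mixes one hash value into another.
inline std::size_t hash_combine(std::size_t seed, std::size_t value)
{
    return seed ^ (value + 0x9e3779b9u + (seed << 6) + (seed >> 2));
}

struct Name
{
    std::string first;
    std::string last;
};

struct Employee
{
    Name name;
    int id;
};

namespace std
{
// Name hashes its components through the standard interface ...
template <>
struct hash<Name>
{
    size_t operator()(const Name & n) const
    {
        return ::hash_combine(hash<string>()(n.first), hash<string>()(n.last));
    }
};
// ... and Employee does the same, without knowing how Name is hashed.
template <>
struct hash<Employee>
{
    size_t operator()(const Employee & e) const
    {
        return ::hash_combine(hash<Name>()(e.name), hash<int>()(e.id));
    }
};
}

// Equal values have to produce equal hashes.
bool equal_employees_hash_equal()
{
    Employee a = { { "Ada", "Lovelace" }, 1 };
    Employee b = { { "Ada", "Lovelace" }, 1 };
    return std::hash<Employee>()(a) == std::hash<Employee>()(b);
}
```

The two specializations are still written by hand, which is exactly the part that could be automated if the language knew the members of a type.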

C++ has decided to automate the bare necessities out of that list: Construction, copying, moving, assignment and destruction. And it did this in the set of rules that we call RAII. If you use RAII, composition will be a lot easier for you. You’ll find that you’ll have a lot more types that just slot together and just work together. It’ll improve your code.
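As a quick check of what the automation buys, the value-based World from earlier can be copied and moved without a single hand-written special member function. The helper functions below are hypothetical and only exist for this sketch.

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct Expensive
{
    std::vector<float> vec;
};

struct Group
{
    int i = 0;
    float f = 0.0f;
    std::vector<Expensive> e;
};

struct World
{
    std::vector<Group> c;
};

// Builds a small World: two Groups, the first one holding three Expensives.
World make_world()
{
    World w;
    w.c.resize(2);
    w.c[0].e.resize(3);
    w.c[0].e[0].vec.assign(100, 1.0f);
    return w;
}

// The generated copy constructor copies every level of the composition.
bool copies_are_independent()
{
    World a = make_world();
    World b = a;
    b.c[0].e[0].vec[0] = 42.0f;
    return a.c[0].e[0].vec[0] == 1.0f && b.c[0].e[0].vec[0] == 42.0f;
}

// The generated move constructor hands over the storage instead of copying it.
bool move_keeps_storage()
{
    World a = make_world();
    const float * before = a.c[0].e[0].vec.data();
    World b = std::move(a);
    return b.c[0].e[0].vec.data() == before;
}
```

Changing the copy leaves the original untouched, and after the move the floats still live at the same address, so nothing was re-allocated along the way.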

Oh and this is also another good reason to standardize reflection: With reflection, I can automate a lot of other elements in that list. Last I heard people were working on that but progress was slow.
