Tuesday, June 24, 2014

The Drawbacks of Implementing Move Assignment in Terms of Swap

More and more, I bump into people who, by default, want to implement move assignment in terms of swap. This disturbs me, because (1) it's often a pessimization in a context where optimization is important, and (2) it has some unpleasant behavioral implications as regards resource management.

Let's consider a simplified case of a container that contains a pointer to its contents, which are stored on the heap. I'm using a raw pointer, because I don't want to abstract anything away through the use of smart pointers.
class Container {     
public:
  Container& operator=(Container&& rhs);

private:
  int *pData;           // assume points to an array
};
Implementing the move assignment operator using std::swap, the code looks like this:
Container& Container::operator=(Container&& rhs)
{
  std::swap(pData, rhs.pData);
  return *this;
}
Swapping two pointers calls for three pointer assignments,
template<typename T>
void swap(T& lhs, T& rhs)
{
  auto temp(lhs);
  lhs = std::move(rhs);
  rhs = std::move(temp);
}
so the cost of implementing move assignment using swap is three pointer assignments.

However, if we assume that an empty container has a null pData pointer, move assignment can be implemented using only 2 pointer assignments:
Container& Container::operator=(Container&& rhs)
{
  delete [] pData;
  
  pData = rhs.pData;
  rhs.pData = nullptr;

  return *this;
}
I'm ignoring the delete for now, but we'll get back to it later. At this point, I want to observe that non-swap move assignment performs only 2/3rds the assignments of its swap-based cousin. That's important, because move operations should typically be as efficient as possible. Remember that they're optimizations of copy operations, and if you're not concerned about their efficiency, why not just omit them and let rvalues be copied? My feeling is that the fact that a class author went to the trouble of adding support for move operations is a sign that the author perceives the class to be one where speed is important. If that's the case, it seems unreasonable to pay a premium to put the object being moved from into the state of the target of the assignment when it's cheaper to put the object into a different, but equally valid (typically default-constructed), state. After all, the semantics of move assignment typically don't specify the state of the source object of the move after the assignment has been performed, so if callers can't rely on it having the state of the target object before the assignment, why pay for it?

But back to the delete. Regardless of which move assignment implementation is used, the delete will eventually be performed. If it doesn't take place in the move assignment operator for the object that's the target of the move assignment, it'll probably occur in the destructor of the object that's the source of the move assignment. The cost therefore doesn't vary between the implementations, but when you incur that cost does, and that can be important.

Suppose that Container objects typically use a lot of memory--enough that you have to worry about it. Now consider the following scenario:
{
  Container c1, c2;
  ...
  c1 = std::move(c2);
  ...
}
In the non-swap implementation of move assignment, c1's memory is released at the point of the assignment, but in the swap version, that memory becomes associated with c2, and the memory may not be released until the end of the scope. That might well surprise the caller, who could hardly be blamed for thinking that when an assignment was made to c1, c1's old resources would be released. And it could certainly increase the maximum amount of memory used by the application at any given time.

In my experience, the impact of move-assignment-by-swap on the timing of resource release isn't as well known as it should be, even though the issue was well described many years ago (e.g., Thomas Becker  here and David Abrahams here).

Now, bear in mind that I said at the outset that I was disturbed by people who, by default, want to implement move assignment in terms of swap. I have no issue with developers who, consciously aware of the performance and behavioral implications of move-assignment-via-swap, choose to use it anyway. For some types, it may be a perfectly valid implementation choice. My concern is that it's gaining a reputation as the way to implement move assignment, and I don't think that's good for C++.

Am I mis-analyzing the situation?

Scott

61 comments:

Xeo said...

A copy-and-swap-based move-assignment operator will, with the default std::swap, result in endless recursion.

Scott Meyers said...

@Xeo: I don't see endless recursion in the implementation I posted. If you just swap *this and rhs, then, yes, but that's not normally what I see people doing.

Xeo said...

Oh, dang! Didn't catch that even when reading it twice.

Now OTOH, I don't see how much "copy" there is anymore - it's just "swap pointer".

DeadMG said...

if you're not concerned about their efficiency, why not just omit them and let rvalues be copied?

unique_ptr, unique_ptr, unique_ptr, and unique_ptr. Did I mention unique_ptr? This is far more important than efficiency gains.

Scott Meyers said...

@DeadMG: Using unique_ptr buys you nothing (in terms of the issues I discuss) if you then implement move assignment by swapping them. On the other hand, if you implement things as I've shown (where it's applicable), you can use unique_ptr and get the same results as with raw pointers.

Scott Meyers said...

@Xeo: "Swap pointer" costs three assignments, which is one more than is needed to assign to each of two pointers once.

Xeo said...

Actually, just nevermind me. My brain seems to have automatically inserted "copy-and-" before all mentions of "swap" here. Either that, or it was edited. In any case, I'm no confused and silent. :(

ark said...

What if you want to remove element n from a sequence by moving element n+1 to element n, then n+2 to n+1, and so on, finally destroying the last element? If move destroys the source, you destroy many elements; if it's implemented as swap, you destroy only one.

Callionica said...

Much C++ code is inline-able (even before link time) and in general we expect optimizing compilers to inline and optimize heavily.

Considering the move constructor - if you delegate to the default constructor and then swap the constructing object with the argument, is it unreasonable to expect the compiler to inline and optimize this code?

unique_token(unique_token&& other) : unique_token() {
using std::swap;
swap(*this, other);
}

Considering the move assignment operator - if you move-construct a temporary from the argument then swap the temporary with the current object, is it unreasonable to expect the compiler to inline and optimize this code?

unique_token& operator=(unique_token&& other) {
using std::swap;
auto tmp = unique_token(std::move(other));
swap(tmp, *this);
return *this;
}

A few things to note:
1. The valid-invalid state of the moved-from object (which must remain destructible) is defined in terms of the default constructor - this is code that we already have and don't need to write again.

2. There are no issues with resource lifetimes being extended when using this code. The moved-from object ends up in an empty and safely destructible state, while the moved-to object's original state is destroyed within the assignment operator.

3. By using a consistent pattern for all types instead of special case code for moving objects of this specific type, we've reduced cognitive overhead.

Of course, it's better to let the compiler generate move code wherever you can and if you find that you have a lot of move operators, try refactoring the code in to classes that *manage resource lifetimes* (like unique_ptr) and classes that *own resources* (like most classes). " = default" is my new best friend.

DeadMG said...

@Scott: Part of your article suggests that copy-and-swap is less efficient and this makes little sense if you implement move semantics for efficiency. But this is incorrect.

Firstly, the difference between two assignments or three is completely negligible, and the difference between copying a one million element vector of thousand character long strings and not is massive. These two optimizations are not even remotely comparable.

Secondly, you implement move semantics to support move-only classes even if you don't care about efficiency in the slightest.

Therefore, I suggest that the portion of the article in which you suggest that this cost is problematic is, well, problematic.

The delayed destruction is an interesting point, but the additional assignments are a non-entity as far as the performance benefits of move semantics are concerned, and support for move-only classes merits using them even if you don't care about performance in the slightest.

I love you, Scott, but your feeling about moves and performance is way off the mark.

Anonymous said...

I'd argue that in C++11 you typically don't need a custom swap operation when you implement move operations.

Vince said...

What about this:
Container c;
c = move(c);

This works with the swap implementation and is undefined behavior with your implementation.

Is it not reasonable to require self move assignment to work? Doesn't remove_if do such assignments sometimes?

Sumant said...

@Andrew Koenig: I think in both implementations there is exactly one deletion. The only difference is when it happens. In case of the eager-delete strategy, the Nth object is deleted right away and later on a default constructed object bubbles its way up to the end of the chain. In case of the deferred-delete strategy (just swap), the to-be-deleted pointer bubbles up to the end and then gets deleted.

Unknown said...

The first move constructor with only one swap is wrong. But it's wrong for a different reason. It potentially does not free the vector's elements but swaps them out into a container that is not necessarily short-lived. Dave Abrahams already identified this in his hold cpp-next blog to be the wrong kind of move assignment operator for that reason (in the context of a vector of opened streams IIRC). Copy-and-Swap in move-style would look like this:

Container& Container::operator=(Container&& rhs)
{
Container tmp (std::move(rhs));
std::swap(pData, tmp.pData);
return *this;
}

And this can be unified with the copy assignment operator into a single one that does both:

Container& Container::operator=(Container tmp)
{
std::swap(pData, tmp.pData);
return *this;
}

Note that tmp is passed by value letting the compiler decide how to create it.

I could not care less about whether this involves 2, 3, 4 or 5 pointer assignments. It's simple. It works. And compared to allocating an array and copying around this is likely insignificant.

etam said...

The generated assembler for move assignment with swap is much shorter than the other one. "swap(pData, rhs.pData);" is just "load two values into registers, store them back into memory". "delete [] pData;" is in assembler "if (pData) delete [] pData".

Details: http://susepaste.org/1515913

Scott Meyers said...

@Andrew Koenig: The move assignment operator cannot destroy the source. Only the destructor can destroy the source. The issue is what state to leave the source in after a move. Doing a swap leaves it in the state of the pre-move target, though clients can't rely on that unless the class explicitly documents it. (The standard says only that a moved-from object is in a valid state.)

If by "destroy" you're referring to the call to delete, see Sumant Tambe's comment.

Scott Meyers said...

@Callionica: I agree with the features you tout for your approach. It's simple and regular, and it avoid the resource lifetime extension issue. If compilers reliably generate code for this approach that's as good as not doing a swap, I'd be a happy guy. Whether compilers currently do that across a range of types (not just raw pointers), I don't know. Perhaps others do.

Scott Meyers said...

@DeadMG: I had hoped it would be clear that the example of a single pointer was just that, an example. If you don't think the cost of a single pointer assignment is worth worrying about (I know from experience that some people do), just replace the pointer data member with something that will get your attention--perhaps a std::array<double, 100'000'000>.

Scott Meyers said...

@Unknown: Please see my replies to Callonica (regarding the behavior of your approach) and DeadMG (regarding performance). I'll note that the lifetime issue you refer to, including the link to Dave Abrahams' discussion, is present in the original post (though it's not clear whether the link is still live).

Zenju said...

The eager-cleanup version saves one pointer assignment, but OTOH has one additional "delete [] nullptr" call. Isn't this just a mere tradeoff?

Anyway, shouldn't C++ allow to design for maximum efficiency and support a move-assignment like:

Container& Container::operator=(Container&& rhs)
{
delete [] pData;
pData = rhs.pData;

return *this;
}

with the catch that *no* destructor on rhs will be run? It seems to me that something is lacking here. Probably the C++ language should allow for a way to tell the compiler: "don't destruct rhs, don't generate any code, there's nothing left to do".
Maybe a "std::move" should even have this implicit (but intuitive IMHO) behavior of ending the moved-from instances's lifetime.

Scott Meyers said...

@Vince: My understanding is that per 17.6.4.9 of both C++11 and C++14, the policy of the Standard Library is that self-move-assignment generally leads to undefined behavior: "If a function argument binds to an rvalue reference parameter, the implementation may assume that this parameter is a unique reference to this argument. ... [ Note: If a program casts an lvalue to an
xvalue while passing that lvalue to a library function (e.g. by calling the function with the argument
move(x)), the program is effectively asking that function to treat that lvalue as a temporary. The
implementation is free to optimize away aliasing checks which might be needed if the argument was
an lvalue. —end note ]

The motivation for that is this defect report, which notes that "this clarifies that move assignment operators need not perform the traditional if (this != &rhs) test commonly found (and needed) in copy assignment operators."

My advice has been to follow the policy of the Standard Library and not worry about self-move-assignment. It's been noted elsewhere (in the comments below David Abrahams' post, I believe) that this eliminates a branch at the top of a function that's supposed to run as quickly as possible.

Of course, people are free to check for self-move-assignment, if they want to, and they're also free to use implementations (such as based on swap) that tolerate self-move-assignment, but I'd hope that before they do, they understand that there may be a cost associated with it and that there's no reason to expect Standard Library implementations to do it.

Unknown said...

The advantage of the swap implementation is it can be declared noexcept and some libraries can optimize for a noexcept move and if not available have to use copy instead to maintain exception guarantees. I'm thinking of std::vector as one example.

Scott Meyers said...

@Rick Wildes: There are often ways to achieve strong exception safety that don't involve swap. The non-swap-based implementation I showed is one of them. (The delete call isn't problematic, because common convention (and the policy of the Standard Library) is that if a destructor or memory deallocation function throw, results are undefined.)

Scott Meyers said...

@Zenju: Sean Parent has explored the idea of a destructive move with, I believe, the semantics you describe, but that's not C++, at least not currently.

Scott Meyers said...

@etam: The code to compare is assignment to two pointers versus swapping two pointers. I'd expect the swap to cost one more pointer assignment.

The delete call is a different issue, in my view. Others have pointed out (and I agree) that to avoid extending the lifetime of resources associated with the target of the move assignment, you need to swap with a default-constructed local object (or do the moral equivalent), and if you do that, you'll pay for the nullness test when that local object is destroyed. So the choice seems to be to pay for a nullness test on every call to move-operator= (regardless of whether swap is used) or to avoid that cost and run the risk of holding on to resources too long.

Vince said...

On self move assignment:
Wow, that's interesting (and scary). I suddenly realize that a bunch of things are broken:

- swap(x, x) results in x being in the "destroyed" state at the end (whereas it was guaranteed to be a noop in C++98, didnt it?)

- I've done things like (v being a vector and it an iterator to an element that I want to remove): "*it = move(v.back()); v.pop_back();" which is broken if it points to the last element

Makes me appreciate how dangerous "move" is :)

Joel Ek said...

This is how I implement all my classes in C++11:

#define self (*this)

template <typename A>
none swap(A& lhs, A& rhs)
{
A xhs = A((A&&)(lhs));
(&lhs)->~A(); new (&lhs) A((A&&)(rhs));
(&rhs)->~A(); new (&rhs) A((A&&)(xhs));
}

struct thing_t
{
thing_t()
{
// lean default constructor
}
thing_t(other_t other) // let compiler decide on copy or move
{
swap<other_t>(self.other, other);
}
thing_t(const thing_t& that)
{
self.other = that.other;
}
thing_t(thing_t&& that) : thing_t() // note the construction delegation
{
swap<other_t>(self.other, that.other);
}
thing_t& operator =(thing_t that) // let compiler decide on copy or move
{
swap<thing_t>(self, that);
return self;
}

private:

other_t other;
};

Needless to say, I'm no fan of std::swap nor of std::move.

Vitali said...

Doesn't the move-constructor still have to protect itself from assign-to-self as you would with a copy constructor?

So really, without swap, the proper move constructor is:

Container& Container::operator=(Container&& rhs) {
if (this != &rhs) {
delete [] pData;
pData = rhs.pData;
rhs.pData = nullptr;
}
return *this;
}

So in the nominal case, if you only have 1 member variable to move, your trading 1 extra conditional & 1 extra delete for an extra assignment operator, which is a win; I could be mistaken, but CPUs perform better with unconditional assignments than with branches.

__vic said...

delete is always a special case. In general "release"-function _can_throw_! Simple example - wrapper for UNIX/POSIX file descriptor.

class unix_fd
{
int fd;
public:
explicit unix_fd(int f = -1) : fd(f) {}
~unix_fd()
{
if(fd == -1) return;
if(::close(fd)) /* !!! call is failed! But we can't throw from destructor so just silently ignore....*/;
}
void close() // Our release-function
{
if(::close(fd)) throw system_error_with_errno_code;
}

};

Now let's compare two implementaions of move-assignment

#1
void unix_fd::operator=(unix_fd &&o) // Can't be noexcept
{
if(&o != this)
{
close(); // !!! Can throw here
fd = o.fd;
o.fd = -1;
}
return *this;
}

#2
void unix_fd::operator=(unix_fd &&o) noexcept
{
std::swap(fd, o.fd);
return *this;
}

#2 is perfectly noexcept!

Yes, close() call can be "delayed" in case #2. But! If we want strict error checking we must use explicit close() call, not destructor. Destructor releases resource only in "emergency" situations, where exeption can't be thrown anyway.

Vince said...

swap(x, y) is implemented as
auto tmp = move(x);
x = move(y);
y = move(x);

if x and y are references to the same object, the second line is a self assignment. Actually, the second line is a self assignment where the object is in the destroyed state...

Unknown said...

@__vic That was my point but you did a much better job of expressing it. But Scott is correct in that in his example the delete wont throw and noexcept move is probably still something that can be achieved without using swap. Maybe the rule should be: "don't use swap to implement move operations unless it is the only way to get noexcept and your object really needs noexcept move operations". A bit wordy but I'm not a very good word smith.

Scott Meyers said...

@Vince: I think you should be okay with swap(x, x), because in this case, you're swapping lvalues. Changing that behavior could break C++98 code. In theory, you could get in trouble with swap(x, move(x)), but that won't compile, because swap accepts only lvalues.

Scott Meyers said...

@Vince: Yes, if you do swap(x, x), the second assignment inside swap is a self-assignment. However, it's not an assignment from an object in a destroyed state, it's an assignment from an object in a moved-from state. They're not the same. Objects in a moved-from state are required to be valid objects that, unless expressly documented otherwise, may have any operation performed on them. Calling swap(x, x) should do the following:

- Move x's resources into temp.
- Do a self-assignment within x, which presumably has no effect on its moved-from state. (In my example code, it would simply copy pData to pData.)
- Move temp's resources back into x.

The end result would thus be that x is unchanged.

Scott Meyers said...

@Vitali: See my reply to Vince, where I explain that the Standard Library's policy is not to check for self-assignment in move assignment operators.

__vic said...

@Rick Wildes: You nearly always cannot use "release"-function in a move-assignment because _every_ "release"-function can fail. Yes, not every such function reports errors via exceptions (like delete) and not every error can be handled in your application (like heap corruption), but still every can fail. In such situations you have 2 options:

1) "Move" "release"-function call to destructor by moving resource to other object,
2) Manually release resource from assignee before move-assignment. Like this:

file1.close(); // all errors will be reported by exceptions
file1 = std::move(file2); // Can't fail if swap used

delete-operator is a very special case. RAM isn't the only resource need to be managed.

Scott Meyers said...

@__vic: You have to decide what's most important to you. Do you want strong exception safety? Do you want timely resource release? Do you want maximum speed in contexts where strong exception safety is not required? Your swap-based implementation achieves strong exception safety, but it sacrifices timely resource release, and I'd expect it to generally run more slowly. (Note that I'd omit the test for self-assignment, as I've explained in other comments in this thread.) There's nothing wrong with making those choices, but they're not the only choices that can be made.

Please recall the conclusion of my original post: "I have no issue with developers who, consciously aware of the performance and behavioral implications of move-assignment-via-swap, choose to use it anyway. For some types, it may be a perfectly valid implementation choice. My concern is that it's gaining a reputation as the way to implement move assignment, and I don't think that's good for C++."

__vic said...

@Scott Meyers:
> You have to decide what's most important to you
Indeed! There are no best-for-all solution as usually in programming ;-) But still. What would be your solution for my unix_fd class if you wrote a library for a general purposes? Just for this specific case.

Scott Meyers said...

@__vic: Off the top of my head, I'd prefer to let the move assignment operator throw. This means that move_if_noexcept would perform copies instead of moves (assuming the class supports copying), and I'm not sure what it would mean to copy a file descriptor.

__vic said...

@Scott Meyers:
> move_if_noexcept would perform copies instead of moves
It can't! Class controls resource so can't be copyable (if we keep straight unique semantics).

About performance and code generation for both cases (both inline):

#1:
call unix_fd::close()
movl 92(%esp), %eax
movl $-1, 92(%esp)
movl %eax, 88(%esp)

#2:
movl 92(%esp), %edx
movl %eax, 92(%esp)
movl %edx, 88(%esp)

Version with swap() definitly isn't longer & isn't slower

Joe said...

Sorry, but I have to strongly disagree with almost everything you wrote in this article.

First, on the performance side, the swap version is actually very likely faster, especially if you release resources in the move constructor without using the regular destructor, as this will likely increase the amount of code generated, and therefore put a larger burden on the icache. In fact, in your example for the late release, the late release would cause both objects to call their destructors at the same time, further lowering the burden on the icache. And even if icache isn't an issue, a swap isn't any less efficient as __vic shows in his comment.

Second, the late release is an absolute non-issue, as rvalues are supposed to be temporary. So they would be destroyed soon anyway. And if the programmer uses move to pretend it's a temporary he still should only assign, clear (as in vector::clear, which can actually very much benefit from *not* releasing memory if you plan to add new items) or destroy the object.

Third, and only indirectly related, move assign by swap is trivially self move assign safe. To elaborate a little: although the assumption (it is not definitely stated in the standard) is often that self move assign need not be safe. However, I have come to the conclusion that this is a bad idea. Why? The typical implementation of swap is a move construction and two move assigns. All well and good until you self-swap, which will cause a self move assign. At the same time, the definition of swap in the standard clearly states that any combination of lvalues, and that - by definition - includes both being the same. So either self move assign must be safe, or swap needs two move constructors and two move assigns, causing a general pessimization of swap (which is used a lot, just consider the standard algorithms). And disallowing self-swap is not an option, as code which includes them is widespread.

Sorry for the rant, but this issue has been quite irksome for me for quite a while.

Scott Meyers said...

@__vic: For move-only types, if resource release in the move assignment operator fails and if the way that's reported is to throw, then I'd have the move assignment operator throw. This means that operations employing move_if_noexcept would offer the basic instead of the strong guarantee, but that's how move_if_noexcept is designed to work.

As an aside, I looked up the specification for the move assignment operator for std::basic_filebuf. It calls std::basic_filebuf::close, which may throw an exception. Here's the spec (from 27.9.1.4/6 of draft C++14 (N3936)):

Effects: If is_open() == false, returns a null pointer. If a put area exists, calls overflow(traits:: eof()) to flush characters. If the last virtual member function called on *this (between underflow, overflow, seekoff, and seekpos) was overflow then calls a_codecvt.unshift (possibly several times) to determine a termination sequence, inserts those characters and calls overflow(traits:: eof()) again. Finally, regardless of whether any of the preceding calls fails or throws an exception, the function closes the file (as if by calling std::fclose(file)). If any of the calls made by the function, including std::fclose, fails, close fails by returning a null pointer. If one of these calls throws an exception, the exception is caught and rethrown after closing the file.

Matt Calabrese said...

Just to throw some more ideas into the mix -- one solution to an efficient, simple to implement move-assign that I've explored recently is to explicitly destroy the left-hand operand via an explicit destructor call, then use placement-new to move-construct from the right-hand operand into the left-hand operand's storage. The requirement for this is that you need to have a nothrow move-constructor and destructor. There is another subtle problem to this as well -- if someone derives from your class, the move-assign from the parent would, in effect, destroy the base-part of the object while the child part still lives (even though we reconstruct the base part before the function exits). This, I believe, is UB, though I haven't verified in the standard yet. Things are further complicated if the type is polymorphic, especially if it has virtual bases. Ultimately, I agree that a swap-based implementation is bad news, but there may be other alternatives to a properly-defined and easy-to-implement move-assign that do not suffer from these problems. Swap-based idioms were prominent prior to C++11 precisely because we didn't have actual move semantics. Implementing move in terms of swap seems inherently backwards, especially given that swap itself has been updated to be implemented in terms of move. We should be able to do better.

Scott Meyers said...

@Joel Ek: I suggest you read Matt Calabrese's comments on the destroy-in-place-then-construct-in-place approach.

I'd be interested to know why you prefer naked casting--C style, no less--over std::move.

__vic said...

@Matt Calabrese:
> Implementing move in terms of swap seems inherently backwards, especially given that swap itself has been updated to be implemented in terms of move.
I see nothing bad in implementing vector's move-assignent in terms of move of integers and raw pointers.

Scott Meyers said...

@Zenju: I now see that there is a formal proposal for destructive move. You might want to check it out.

Matt Calabrese said...

@__vic there's nothing wrong with manually doing moves if that's appropriate. The issue is mostly that when you want to overload the move-constructor because the default-move isn't appropriate, you will also want to overload the move-assign in a similar manner and for the same reason. Manually creating these similar definitions is redundant, error-prone, and can get out of sync. Ideally, we would be able to somehow get move-assign implemented in terms of move-construct in a simple, automatic way to reduce the amount of effort that goes into making an efficiently movable type. Destroy/move-construct is very appealing in that manner if it weren't for the drawbacks already enumerated. Swap has the other problems already mentioned. In short, the "best" solution is often to manually implement both, which is... unfortunate.

Hypothetically, imagine an alternate history of C++ standardization that had move-semantics from the start, didn't allow move-assign overloads but instead always automatically implemented it as destroy/move-construct for all types (requiring nothrow move and destruct), and by default defined copy-assign using the copy/move-assign idiom instead of member-wise assign (this is the C++11 equivalent of the copy/swap assign implementation). If this were the case, you'd only ever have to worry about implementing copy construct and move construct, overriding copy-assign only if you could make it more efficient (such as assignment of vector-to-vector, reusing storage, etc.). I'm not saying that this would be better or worse than what we have today, but it would certainly reduce redundancy by making the default definitions somewhat hierarchical in nature.

With that in mind, it would be nice if in actual C++ we had a similarly simple, idiomatic way to implement move-assign that didn't have the problems that implementing it in terms of swap or destroy/move-construct has. As it is, though, I'm not sure that such a method exists.

Stephan said...

I don't understand what's so bad about implementing move assignment in terms of an explicit destructor call followed by a placement new move constructor call when both are noexcept and the class is not polymorphic. Has anyone established that this is actually UB? I'd be surprised if this lead to any funny behaviour with existing compilers.

Matt Calabrese said...

After looking it up in the standard, destroy/move-construct is technically UB here even if your type has no children.

_____________________
[basic.stc.auto]

If a variable with automatic storage duration has initialization or a destructor with side effects, it shall not be destroyed before the end of its block, nor shall it be eliminated as an optimization even if it appears to be unused, except that class object or its copy/move may be eliminated as specified in 12.8.

Matt Calabrese said...

I take back what I said with respect to it being UB when there is no inheritance involved, since [basic.life] seems to clarify that it is well-defined by the example code snipped which implements copy-assign as destroy/copy-construct-in-place and a further clarification later on.

Unknown said...

In support of this article:

http://accu.org/content/conf2014/Howard_Hinnant_Accu_2014.pdf

see slides 46-53.

Scott Meyers said...

@Howart Hinnant: Very interesting stuff, Howard. Thanks for the link (and for taking the time to put together the information in your talk).

Matt Calabrese said...

Thanks for linking your slides, Howard. I'm not entirely sure that I agree with the conclusion, however. Specifically, while copy/swap (or really copy/move-assign) is generally suboptimal for copy-assign, it is at least correct in more cases by default than the actual default definition, which I think is a more important property -- either way it can be overridden when necessary, but at least it's applicable in more cases.

As well, while copy/move-assign is usually sub-optimal, destroy/move-construct is often much closer to an actual optimal move-assign implementation. Because of that, the argument that it is sub-optimal isn't quite so strong for me. The reason is that, unlike the copy/swap idiom for copy-assign, destroy/move-construct for move-assign does not trade optimal functionality for the strong exception guarantee, but rather, it is only applicable when the move and destroy are already noexcept, which they usually are anyway, and more closely mimics the overall behavior of a manually-implemented move-assign than the copy/assign idiom ever did. Again, I still don't advocate doing it for the other reasons mentioned, but I don't think we should sell it short and do think that implementations like this at least warrant further investigation. I'm not convinced of the generalization that one special member function should not be implemented in terms of another, and the implications of that generalization (greater redundancy and risk of error in a manual definition) make it something I don't think we should take too lightly.

Just so I can be sure I'm not missing something obvious here, with respect to a destroy/move-construct implementation of move-assign, can you give an example of where destroy-move would produce a "significantly" sub-optimal solution over a manually implemented move-assign? There probably are cases that come up, but in practice I don't tend to encounter them. The way I look at things right now, it just seems that the reasons why destroy/move-construct for move-assign are "bad" are not because such a definition would lead to an improper or sub-optimal solution, but rather because of the issues mentioned related to inheritance. In this respect, it is very different from the problems of copy/swap and I wouldn't discourage its use in places where the concern is performance as long as the other pitfalls are understood.

Unknown said...

Hi Matt, I agree with a lot of your remarks.

However I would never implement vector move-assign (for example) as destroy/move-construct in all cases. For the common case, vector move-assign is very nearly a destroy/move-construct. However in the case where the allocator's propagate_on_move_assign is false, and the two allocators are not equal, one should not throw away capacity on the lhs if it is sufficient to hold the contents of the rhs.

I like to look at the 6 special members as my opportunity to make my class just as good as it can be. If it turns out that for a given type, I can re-use another special member with no loss in efficiency at all, then fine. But before I do that, I will have coded up the most efficient design possible, and tested it, just to be sure.

Give each of the 6 special members the tender loving care it deserves. Don't fall into the trap of: I don't have to think about a certain special member because I can always implement it using technique X.

Kamil Rojewski said...

How about combining copy and move operators into a single one:

widget &operator =(widget w)
{
using std::swap;

swap(mData, w.mData);
return *this;
}

You get:

1. Copy-and-swap in case of copy.
2. Move-assignment to w in case of move, followed by safe swap (assuming noexcept) and resource destruction at the end.

Unknown said...

http://accu.org/content/conf2014/Howard_Hinnant_Accu_2014.pdf

see slides 46-53.

If widget contains a vector or string, or contains objects that contain a vector or string, or anything that is vector or string-like in terms of their ability to reuse resources such as memory in their copy assignment operator, then you might as well sprinkle your code with calls to this_thread::sleep_for().

Anonymous said...

C++ is getting harder to learn. Why do we have operations that can mean so many different things depending on the context or scope? Why isn't the language more expressive so that I don't have to wonder what the compiler is going to do?

I just want to pass objects around and get my work done. It really shouldn't be this hard.

You can ask an experienced C++ programmer how to do something and they'll say, "It's really simple."

And then they proceed with their explanation. Soon the caveats start coming out. They're throwing parts of their explanation onto the stack so they can dive into a contextual part of the explanation.

And at the end of it, you're drooling into cup as the explanation graph floating above your head dissolves into thin air. The experienced C++ programmer laughs and says, "It is what it is."

Sorry for the rant. You're a great speaker. Oh, and you look like a young Mick Jagger. :)

Unknown said...

Here is an idea that I came up with for implementing move in terms of swap. It is even less efficient than your original swap-based implementation, but it still not too bad, if we are talking about large objects with small handles. The advantage is that it solves the delayed destruction problem *and* it avoids code duplication (all destructor logic stays in the destructor).

Here it is. I assume that Container has a swap method
that just swaps some pointers or other small objects.

Container& Container::operator=(Container&& rhs)
{
Container tmp;
tmp.swap(rhs);
tmp.swap(*this);
return *this;
}

One can verify that if upon entry, we have *this == X and rhs == Y, then upon exit *this == Y, rhs == NULL, and tmp == X. So the destructor for Container will clean up object X at this point (and not later).

This solution also has other "features": (1) if *this and rhs are the same object, then the operation is a NOOP; (2) if the destructor for container throws, this will happen after the operation is complete.

I realize both of these "features" are rather esoteric, as neither case is likely (or even possible) to arise...

At any rate, I think this approach has some merits.

Scott Meyers said...

@Victor Shoup: I saw your comment, and I'll respond to it when I have time, but I'm under lots of deadline pressure right now, so it will probably take a while. I apologize for that. Please be patient.

Thanks.

Anonymous said...

Hi Scott,

You need to distinguish two situations:

Cases like std::vector which are completely visible to the compiler. In that case, I agree with your article.

But I disagree with you when talking about pimpl'ed classes.

Pimpl'ed classes are slow, since by design everything is out-of-line, and the compiler can't inline (much less reason about) anything of the class, in particular, it cannot inline the ctors, the dtor, and the copy-assignment operator.

But a move-assignment operator that uses swap() can be inlined without breaking encapsulation of the class, and without invoking another non-inline function.

And afaik, that's the only way to have a move-assignment operator that is fully inline for a pimpl'ed class.

Scott Meyers said...

@Victor Shoup: First, apologies for the long delay. But at least it's still 2014! Second, it's not clear to me how your design is preferable to what seems to be a more conventional approach: move construction followed by a swap (as suggested by Callionica and Unknown on June 24 above):

Container& Container::operator=(Container&& rhs)
{
  Container tmp(std::move(rhs));
  tmp.swap(*this);
  return *this;
}

From what I can tell, this approach has all the advantages that yours does, and it's a bit shorter. Am I overlooking something?

Scott Meyers said...

@Mmutz: If inlining the move assignment operator is more important to you than timely resource release (e.g., of the object pointed to by the pImpl pointer), then your approach is reasonable. I personally would tend to assign greater weight to timely resource release.

Since it's been a while since I wrote the article, and because there are a lot of comments now between the article itself and this comment, I'll reiterate my bottom line:

I have no issue with developers who, consciously aware of the performance and behavioral implications of move-assignment-via-swap, choose to use it anyway. For some types, it may be a perfectly valid implementation choice. My concern is that it's gaining a reputation as the way to implement move assignment, and I don't think that's good for C++.