G+: Huh

David Coles 05 Jun 2017

Huh. I always knew that `list + tuple` was a TypeError in Python, but didn't know `list += tuple` (like `list.extend(tuple)`) would work.

Weirdly `tuple += list` is a TypeError, but `tuple += tuple2` is OK.
(however it creates a new tuple like `tuple = tuple + tuple2`)
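
For anyone who wants to poke at this, here is a minimal sketch of the four cases. The helper and function names are made up for illustration; only the operators themselves come from the post.

```python
def try_op(fn):
    """Run fn() and return its result, or the exception name if it raises TypeError."""
    try:
        return fn()
    except TypeError as exc:
        return type(exc).__name__


def list_plus_tuple():
    return [1, 2] + (3, 4)      # list + tuple is not supported


def list_iadd_tuple():
    items = [1, 2]
    items += (3, 4)             # accepts any iterable, like items.extend((3, 4))
    return items


def tuple_iadd_list():
    pair = (1, 2)
    pair += [3, 4]              # tuple has no in-place add, so this is tuple + list
    return pair


def tuple_iadd_tuple():
    pair = (1, 2)
    before = pair               # keep a second name for the original tuple
    pair += (3, 4)              # builds a new tuple and rebinds pair
    return pair, before


if __name__ == "__main__":
    for case in (list_plus_tuple, list_iadd_tuple, tuple_iadd_list, tuple_iadd_tuple):
        print(case.__name__, "->", try_op(case))
```

The last case returns both names to show that the original tuple is left untouched.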

(Cross-posted from https://twitter.com/davidcoles/status/871735636058923009)


Matt Giuca 05 Jun 2017

Gross. Python's list += is completely broken anyway and should be avoided: it mutates the list.

That's very counter-intuitive given that + creates a copy of the list. x += y should always have the same semantics as x = x + y. (It could be implemented with a fast-path but it should have the same semantics.)

So tuple += tuple2 is fine, list += list_or_tuple is broken.
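
A small sketch of the difference, assuming a second name (`alias`, hypothetical) is bound to the same list before the operation:

```python
def concat_then_assign():
    x = [1, 2]
    alias = x
    x = x + [3]     # builds a new list and rebinds x; alias keeps the old list
    return x, alias


def augmented_assign():
    x = [1, 2]
    alias = x
    x += [3]        # extends the existing list in place; alias sees the change
    return x, alias


if __name__ == "__main__":
    print(concat_then_assign())
    print(augmented_assign())
```

In the first function the two names end up pointing at different lists; in the second they still share one.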

David Coles 05 Jun 2017

I'm a little on the fence about it. On one hand, having nice syntax for in-place operations is sometimes quite useful (the rationale in PEP 203 was matrix operations) and I believe std::string overloads += for append. On the other hand it's surprising, given that most people reasonably assume x += y is short for x = x + y (where x is evaluated only once), and having an operator that may operate in place depending on some deep voodoo like the presence or absence of `__iadd__` is a recipe for disaster.
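
To make the `__iadd__` point concrete, here is a rough sketch with two hypothetical classes (`AddOnly` and `InPlace` are invented for this example, not taken from any library). They differ only in whether `__iadd__` is defined, yet `+=` rebinds the name for one and mutates the object for the other.

```python
class AddOnly:
    """Hypothetical container that defines only __add__."""

    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        # Returns a fresh object; self is left unchanged.
        return type(self)(self.items + list(other))


class InPlace(AddOnly):
    """Same container, plus __iadd__, so += mutates the existing object."""

    def __iadd__(self, other):
        self.items.extend(other)
        return self     # += rebinds the name to whatever __iadd__ returns


def demo(cls):
    obj = cls([1])
    alias = obj
    obj += [2]          # uses __iadd__ if present, otherwise falls back to __add__
    return obj, alias


if __name__ == "__main__":
    for cls in (AddOnly, InPlace):
        obj, alias = demo(cls)
        print(cls.__name__, obj.items, alias.items, obj is alias)
```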

PEP 203 -- Augmented Assignments

Matt Giuca 05 Jun 2017

> I believe std::string overloads += for append

Yes but C++ is totally different. In C++ (like C) everything is by value unless explicitly dereferenced.

In Python, if you have a list variable x and you write "x += y", there is an implicit dereference causing the list object pointed to by x to be modified. That's completely different to "x = x + y" which assigns x to point at a new object.

In C++, if you have a variable std::string or std::vector x and you write "x += y", there is no dereference, it modifies the variable x by value. This has the exact same semantics as "x = x + y" (but probably better performance, unless the compiler optimizes the latter somehow). Now it's true that there could be any number of pointers or references pointing at &x, but those are pointers to the variable x, so they get updated when x changes no matter which operator is used. Even if x was an int, and you used "x += y" or "x = x + y", pointers pointing at x would get updated all the same. Thus C++ has totally consistent semantics between the short and long form of += as well as for all different types.

David Coles 06 Jun 2017

Because of operator overloading, operators aren't always by-value like they are in C. Specifically you can also overload compound assignment operators, so += can differ from + followed by assignment. You can see the difference in the assembly code: the copy version creates a temporary string, copies the value to x and then frees the temporary, while the in-place version calls ::append.

That said, it's much harder to do accidentally, since the type system makes it pretty clear when something is being done in-place and when a new copy is being made. And it's possible that, if the compiler is smart enough and can see the full lifetime of the object, it might be able to use an in-place implementation as if it was doing a copy (as an optimization).

Compiler Explorer - C++

Matt Giuca 06 Jun 2017

> Because of operator overloading, operators aren't always by-value like they are in C.

I'm talking conceptually, not literally. Of course with operator overloading I can make += perform subtraction while + performs multiplication. But the semantics of those operators are consistent in the standard library, which is as good as you can expect in a language with operator overloading. That the standard library is consistent strongly encourages other code to follow that rule as well.

Unlike Python where there is no well-understood semantics for += due to list and str/tuple disagreeing.
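
One place the disagreement bites is a function that uses += on its argument. A minimal sketch (the function name `add_suffix` is hypothetical):

```python
def add_suffix(seq, suffix):
    """Append suffix to seq using += and return the result."""
    seq += suffix   # in place for list; a new object for tuple and str
    return seq


def run_demo():
    nums = [1, 2]
    point = (1, 2)
    word = "ab"
    add_suffix(nums, [3])       # the caller's list is modified
    add_suffix(point, (3,))     # the caller's tuple is untouched
    add_suffix(word, "c")       # the caller's string is untouched
    return nums, point, word


if __name__ == "__main__":
    print(run_demo())
```

The same function body either modifies the caller's object or leaves it alone, depending only on the type passed in.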

As for looking at the assembly, well sure it generates different code in C++ but it has the same semantics. I don't think there is any case for std::string or std::vector where the behaviour of += is distinguishable from +, other than if you use .data() and compare the internal pointers (but that's exposing implementation details by design).