So here's a challenge. Given a bitarray represented by a Python integer (e.g 0x12345678) work what it's two's complement value is (assuming a 32-bit bitarray).
Despite all of Python's integers being based on 2s complement form, it turns out to be surprisingly tricky.
One hack is just to use ctypes.c_int32(0x12345678).value, but this feels a little dishonest.
I worked out a way using `value |= (-1 ^ 0xffffffff)` to sign-extend when the high-bit is set, but it's still pretty long and clunky and relies on a conditional statement.
(+1's) 1
Matt Giuca
Using ctypes is technically system dependent (can't guarantee two's complement), so I like the direction of your final solution more. Note that in the other branch of your condition, you need to & it with 0x7fffffff.
Here's a decent solution which doesn't have a condition:
value = -(value & 0x80000000) | (value & 0x7fffffff)
David Coles
Yup. That's what really urked me about it. My "safe solution" ended up using the struct module to convert to a bytestring with ">I" and then back again with ">i".
I think I can get away without the mask since we've already tested that the high-bit is 1, but you do need something to strip off any sign bit.
Your solution definitely feels much cleaner, I guess "the trick" being it uses the identity -0x80000000 = 0x80000000 (assuming 32-bit bitarrays). Then it's just nice bit-wise operations.
I'm actually surprised how sensible Python's bitwise operations are, even though they're done using signed integers. Sure there's a few gotchas (e.g. needing to mask off the sign-bit), but on the whole "infinite bits" integer representation is quite nice.
"I guess "the trick" being it uses the identity -0x80000000 = 0x80000000 (assuming 32-bit bitarrays)."
That's not really it.
The trick is that for any value x with a single 1 bit, -x is a sign extension of x. For example -0x00008000 = 0xffff8000.
David Coles
That's a much better way of putting it (I didn't mean trick in the negative sense, just "the tricky part" I couldn't get my head around). At a low level the "-" operator just flips all the bits and then adds 1, so for one-bit set values the set bit becomes the only unset bit, but the add 1 step bubbles up to set this bit again, while also clearing all the lower bits, right?
Usually when really stuck, I go and take a look at https://graphics.stanford.edu/~seander/bithacks.html, but I can't find this version here. I guess it's because it takes advantage of Python's "bit operations are treated as 2s complement representation" behavior (technically would be undefined behavior in C).
For arbitrary number of bits:
# b: Number of bits lambda value, b: -(value & (1<<(b-1))) | (value & ((1<<(b-1)) - 1))
If you're wondering why I'm trying to do this, it's to deal with horrible Microsoft HRESULT style error codes (e.g. ones that look like 0x80000001 but stored in a signed int).
All sorts of platform specific assumptions (platform uses 32-bit ints, platform uses 2s complement representation, etc.). To Microsoft's credit, they do try and hide this with lots of typedefs and macros (e.g. FAILED, IS_ERROR), but I'm sure many people expect to be able to test errors by checking for a negative value.
Matt Giuca
Maybe it would be best to just treat as a signed int and say >=0x80000000 is an error?
David Coles
That was my original plan, but it breaks if people try and use the -12345 form and felt it's nicer to match what I'd get if I called said C API directly via ctypes (currently get error as a hexstring from another component on another host).