The length of vectors, maps, sets, etc are serialized using
Write/ReadCompactSize -- which, unfortunately, do not use a
unique encoding.
So deserializing and then re-serializing a transaction (for example)
can give you different bits than you started with. That doesn't
cause any problems that we are aware of, but it is exactly the type
of subtle mismatch that can lead to exploits.
With this pull, reading a non-canonical CompactSize throws an
exception, which means nodes will ignore 'tx' or 'block' or
other messages that are not properly encoded.
Please check my logic... but this change is safe with respect to
causing a network split. Old clients that receive
non-canonically-encoded transactions or blocks deserialize
them into CTransaction/CBlock structures in memory, and then
re-serialize them before relaying them to peers.
And please check my logic with respect to causing a blockchain
split: there are no CompactSize fields in the block header, so
the block hash is always canonical. The merkle root in the block
header is computed on a vector<CTransaction>, so
any non-canonical encoding of the transactions in 'tx' or 'block'
messages is erased as they are read into memory by old clients,
and does not affect the block hash. And, as noted above, old
clients re-serialize (with canonical encoding) 'tx' and 'block'
messages before relaying to peers.
Variable-length integers: bytes are a MSB base-128 encoding of the number.
The high bit in each byte signifies whether another digit follows. To make
the encoding is one-to-one, one is subtracted from all but the last digit.
Thus, the byte sequence a[] with length len, where all but the last byte
has bit 128 set, encodes the number:
(a[len-1] & 0x7F) + sum(i=1..len-1, 128^i*((a[len-i-1] & 0x7F)+1))
Properties:
* Very small (0-127: 1 byte, 128-16511: 2 bytes, 16512-2113663: 3 bytes)
* Every integer has exactly one encoding
* Encoding does not depend on size of original integer type
We're in a wholly different world now, C++-compiler-wise.
Current std::stringstream implementations don't have the stated problem anymore,
and are just as fast as CDataStream.
The #ifdef'd block does not even compile anymore; CDataStream constructor changed,
and missing some std::. Also timing in whole seconds is also way too granular
to say anything sensible in such microbenchmarks. Just remove it,
it can always be found again in git history.
This commit removes the dependency of serialize.h on PROTOCOL_VERSION,
and makes this parameter required instead of implicit. This is much saner,
as it makes the places where changing a version number can have an
influence obvious.
* move PROTOCOL_VERSION to version.h
* move CLIENT_VERSION* to version.h, make available past cpp stage
* clearly separate client, network version portions of version.h
This turns on most gcc warnings, and removes some unused variables and other code that triggers warnings.
Exceptions are:
-Wno-sign-compare : triggered by lots of comparisons of signed integer to foo.size(), which is unsigned.
-Wno-char-subscripts : triggered by the convert-to-hex functions (I may fix this in a future commit).
Replaced all occurrences of #if* __WXMSW__ with WIN32,
and all occurrences of __WXMAC_OSX__ with MAC_OSX, and made
sure those are defined appropriately in the makefile and bitcoin-qt.pro.
util.h doesn't use REF, serialize.h does, creating a dependency of
serialize.h on util.h, but util.h already depends on serialize.h. To
resolve this circular dependency the function 'REF' has now been moved
closer to one of its two points of use.
Signed-off-by: Giel van Schijndel <me@mortis.eu>