86b47fa741408b061ab0bda784b8678bfd7dfa88 speed up Unserialize_impl for prevector (Akio Nakamura)
Pull request description:
The unserializer for prevector uses `resize()` for reserve the area, but it's prefer to use `reserve()` because `resize()` have overhead to call its constructor many times.
However, `reserve()` does not change the value of `_size` (a private member of prevector).
This PR make the logic of read from stream to callback function, and prevector handles initilizing new values with that call-back and ajust the value of `_size`.
The changes are as follows:
1. prevector.h
Add a public member function named 'append'.
This function has 2 params, number of elemenst to append and call-back function that initilizing new appended values.
2. serialize.h
In the following two function:
- `Unserialize_impl(Stream& is, prevector<N, T>& v, const unsigned char&)`
- `Unserialize_impl(Stream& is, prevector<N, T>& v, const V&)`
Make a callback function from each original logic of reading values from stream, and call prevector's `append()`.
3. test/prevector_tests.cpp
Add a test for `append()`.
## A benchmark result is following:
[Machine]
MacBook Pro (macOS 10.13.3/i7 2.2GHz/mem 16GB/SSD)
[result]
DeserializeAndCheckBlockTest => 22% faster
DeserializeBlockTest => 29% faster
[before PR]
# Benchmark, evals, iterations, total, min, max, median
DeserializeAndCheckBlockTest, 60, 160, 94.4901, 0.0094644, 0.0104715, 0.0098339
DeserializeBlockTest, 60, 130, 65.0964, 0.00800362, 0.00895134, 0.00824187
[After PR]
# Benchmark, evals, iterations, total, min, max, median
DeserializeAndCheckBlockTest, 60, 160, 77.1597, 0.00767013, 0.00858959, 0.00805757
DeserializeBlockTest, 60, 130, 49.9443, 0.00613926, 0.00691187, 0.00635527
ACKs for top commit:
laanwj:
utACK 86b47fa741408b061ab0bda784b8678bfd7dfa88
Tree-SHA512: 62ea121ccd45a306fefc67485a1b03a853435af762607dae2426a87b15a3033d802c8556e1923727ddd1023a1837d0e5f6720c2c77b38196907e750e15fbb902
```
./serialize.h:762:49: warning: static_assert with no message is a C++17 extension [-Wc++17-extensions]
static_assert(is_serializable_enum<T>::value);
^
, ""
```
* Merge #13176: Improve CRollingBloomFilter performance: replace modulus with FastMod
9aac9f90d5e56752cc6cbfac48063ad29a01143c replace modulus with FastMod (Martin Ankerl)
Pull request description:
Not sure if this is optimization is necessary, but anyway I have some spare time so here it is. This replaces the slow modulo operation with a much faster 64bit multiplication & shift. This works when the hash is uniformly distributed between 0 and 2^32-1. This speeds up the benchmark by a factor of about 1.3:
```
RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before
RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod
```
Be aware that this changes the internal data of the filter, so this should probably
not be used for CBloomFilter because of interoperability problems.
Tree-SHA512: 04104f3fb09f56c9d14458a6aad919aeb0a5af944e8ee6a31f00e93c753e22004648c1cd65bf36752b6addec528d19fb665c27b955ce1666a85a928e17afa47a
* Use unordered_map in CSporkManager
In one of my profiling sessions with many InstantSend transactions
happening, calls into CSporkManager added up to about 1% of total CPU time.
This is easily avoidable by using unordered maps.
* Use std::unordered_map instead of std::map in limitedmap
* Use unordered_set for CNode::setAskFor
* Add serialization support for unordered maps and sets
* Use unordered_map for mapArgs and mapMultiArgs
* Let limitedmap prune in batches and use unordered_multimap
Due to the batched pruning, there is no need to maintain an ordered map
of values anymore. Only when nPruneAfterSize, there is a need to create
a temporary ordered vector of values to figure out what can be removed.
* Instead of using a multimap for mapAskFor, use a vector which we sort on demand
CNode::AskFor will now push entries into an initially unordered vector
instead of an ordered multimap. Only when we later want to use vecAskFor in
SendMessages, we sort the vector.
The vector will actually be mostly sorted in most cases as insertion order
usually mimics the desired ordering. Only the last few entries might need
some shuffling around. Doing the sort on-demand should be less wasteful
then trying to maintain correct order all the time.
* Fix compilation of tests
* Fix limitedmap tests
* Rename limitedmap to unordered_limitedmap to ensure backports conflict
This ensures that future backports that depends on limitedmap's ordering
conflict so that we are made aware of needed action.
* Fix compilation error on Travis
* Sort evo/* source files in Makefile.am
* Keep track of proRegTxHash in CConnman::masternodeQuorumNodes map
We will later need the proRegTxHash
* Fix serialization of std::tuple with const rvalue elements
Having serialization and deserialization in the same specialized template
results in compilation failures due to the "if(for_read)" branch.
* Implement MNAUTH message
This allows masternodes to authenticate themself.
* Protect fresh incoming connections for a second from eviction
Give fresh connections some time to do the VERSION/VERACK handshake and
an optional MNAUTH when it's a masternode. When an MNAUTH happened, the
incoming connection is then forever protected against eviction.
If a timeout of 1 second occurs or the first message after VERACK is not
MNAUTH, the node is not protected anymore and becomes eligable for
eviction.
* Avoid connecting to masternodes if an incoming connection is from the same one
Now that incoming connections from MNs authenticate them self, we can avoid
connecting to the same MNs through intra-quorum connections.
* Apply review suggestions
* Require only 3 out of 5 signatures for old InstantSend in regtest mode
* Use LLMQs of size 5 with threshold of 3 for regtest
* Fix wrong check for out-of-range bits in CFixedBitSet
* Reduce number of masternodes in masternode/LLMQ tests
* Add missing \n to LogPrintf call
* Use correct indexes for isolated/receiver/sender nodes
The way it was before resulted in nodes 1-3 being unused and 6-8 being used
for these 3 special nodes even though these are masternodes.
* Avoid stopping/starting isolated node in p2p-instantsend.py
It's enough to disable networking for this node.
* Split up remaining logic from CMasternodeMan into CMasternodeMetaMan and CMasternodeUtils
Also get rid of CMastermode and store remaining meta info
in CMasternodeMetaInfo
* Also allow non-const T in Serialize/Unserialize for shared_ptr
* Rename CActiveDeterministicMasternodeManager to CActiveMasternodeManager
* Fix nowallet compile in masternode-utils.cpp
b4e4ba4 Introduce convenience type CTransactionRef (Pieter Wuille)
1662b43 Make CBlock::vtx a vector of shared_ptr<CTransaction> (Pieter Wuille)
da60506 Add deserializing constructors to CTransaction and CMutableTransaction (Pieter Wuille)
0e85204 Add serialization for unique_ptr and shared_ptr (Pieter Wuille)
d59a518 Use fixed preallocation instead of costly GetSerializeSize (Pieter Wuille)
25a211a Add optimized CSizeComputer serializers (Pieter Wuille)
a2929a2 Make CSerAction's ForRead() constexpr (Pieter Wuille)
a603925 Avoid -Wshadow errors (Pieter Wuille)
5284721 Get rid of nType and nVersion (Pieter Wuille)
657e05a Make GetSerializeSize a wrapper on top of CSizeComputer (Pieter Wuille)
fad9b66 Make nType and nVersion private and sometimes const (Pieter Wuille)
c2c5d42 Make streams' read and write return void (Pieter Wuille)
50e8a9c Remove unused ReadVersion and WriteVersion (Pieter Wuille)
f00705a serialize: Deprecate `begin_ptr` / `end_ptr` (Wladimir J. van der Laan)
47314e6 prevector: add C++11-like data() method (Wladimir J. van der Laan)
* serialization: teach serializers variadics
Also add a variadic CDataStream ctor for ease-of-use.
* connman is in charge of pushing messages
The changes here are dense and subtle, but hopefully all is more explicit
than before.
- CConnman is now in charge of sending data rather than the nodes themselves.
This is necessary because many decisions need to be made with all nodes in
mind, and a model that requires the nodes calling up to their manager quickly
turns to spaghetti.
- The per-node-serializer (ssSend) has been replaced with a (quasi-)const
send-version. Since the send version for serialization can only change once
per connection, we now explicitly tag messages with INIT_PROTO_VERSION if
they are sent before the handshake. With this done, there's no need to lock
for access to nSendVersion.
Also, a new stream is used for each message, so there's no need to lock
during the serialization process.
- This takes care of accounting for optimistic sends, so the
nOptimisticBytesWritten hack can be removed.
- -dropmessagestest and -fuzzmessagestest have not been preserved, as I suspect
they haven't been used in years.
* net: switch all callers to connman for pushing messages
Drop all of the old stuff.
* drop the optimistic write counter hack
This is now handled properly in realtime.
* net: remove now-unused ssSend and Fuzz
* net: construct CNodeStates in place
* net: handle version push in InitializeNode
There's only one case where a vector containing a fundamental type is
serialized all-at-once, unsigned char. Anything else would lead to
strange results.
Use a dummy argument to overload in that case.
- it now takes over the passed file descriptor and closes it in the
destructor
- this fixes a leak in LoadExternalBlockFile(), where an exception could
cause the file to not getting closed
- disallow copies (like recently added for CAutoFile)
- make nType and nVersion private
One might assume that CAutoFile would be ref-counted so that a copied object
would delay closing the underlying file until all copies have gone out of
scope. Since that's not the case with CAutoFile, explicitly disable copying.
Thanks to Pieter Wuille for most of the work on this commit.
I did not fixup the overhaul commit, because a rebase conflicted
with "remove fields of ser_streamplaceholder".
I prefer not to risk making a mistake while resolving it.
The nType and nVersion fields of stream objects are never accessed
from outside the class (or perhaps from the inside too, I haven't checked).
Thus no need to have them in a placeholder, whose only purpose is to
fill the "Stream" template parameter in serialization implementation.
The implementation of each class' serialization/deserialization is no longer
passed within a macro. The implementation now lies within a template of form:
template <typename T, typename Stream, typename Operation>
inline static size_t SerializationOp(T thisPtr, Stream& s, Operation ser_action, int nType, int nVersion) {
size_t nSerSize = 0;
/* CODE */
return nSerSize;
}
In cases when codepath should depend on whether or not we are just deserializing
(old fGetSize, fWrite, fRead flags) an additional clause can be used:
bool fRead = boost::is_same<Operation, CSerActionUnserialize>();
The IMPLEMENT_SERIALIZE macro will now be a freestanding clause added within
class' body (similiar to Qt's Q_OBJECT) to implement GetSerializeSize,
Serialize and Unserialize. These are now wrappers around
the "SerializationOp" template.
- ensures a consistent usage in header files
- also add a blank line after the copyright header where missing
- also remove orphan new-lines at the end of some files
Thus the read(...) and write(...) methods of all stream classes now have identical parameter lists.
This will bring these classes one step closer to a common interface.
Remove the 'state' and 'exceptmask' from serialize.h's stream implementations,
as well as related methods.
As exceptmask always included 'failbit', and setstate was always called with
bits = failbit, all it did was immediately raise an exception. Get rid of
those variables, and replace the setstate with direct exception throwing
(which also removes some dead code).
As a result, good() is never reached after a failure (there are only 2
calls, one of which is in tests), and can just be replaced by !eof().
fail(), clear(n) and exceptions() are just never called. Delete them.