dash/src/limitedmap.h

119 lines
3.9 KiB
C
Raw Normal View History

// Copyright (c) 2012-2015 The Bitcoin Core developers
// Distributed under the MIT software license, see the accompanying
// file COPYING or http://www.opensource.org/licenses/mit-license.php.
#ifndef BITCOIN_LIMITEDMAP_H
#define BITCOIN_LIMITEDMAP_H
#include <assert.h>
Collection of minor performance optimizations (#2855) * Merge #13176: Improve CRollingBloomFilter performance: replace modulus with FastMod 9aac9f90d5e56752cc6cbfac48063ad29a01143c replace modulus with FastMod (Martin Ankerl) Pull request description: Not sure if this is optimization is necessary, but anyway I have some spare time so here it is. This replaces the slow modulo operation with a much faster 64bit multiplication & shift. This works when the hash is uniformly distributed between 0 and 2^32-1. This speeds up the benchmark by a factor of about 1.3: ``` RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod ``` Be aware that this changes the internal data of the filter, so this should probably not be used for CBloomFilter because of interoperability problems. Tree-SHA512: 04104f3fb09f56c9d14458a6aad919aeb0a5af944e8ee6a31f00e93c753e22004648c1cd65bf36752b6addec528d19fb665c27b955ce1666a85a928e17afa47a * Use unordered_map in CSporkManager In one of my profiling sessions with many InstantSend transactions happening, calls into CSporkManager added up to about 1% of total CPU time. This is easily avoidable by using unordered maps. * Use std::unordered_map instead of std::map in limitedmap * Use unordered_set for CNode::setAskFor * Add serialization support for unordered maps and sets * Use unordered_map for mapArgs and mapMultiArgs * Let limitedmap prune in batches and use unordered_multimap Due to the batched pruning, there is no need to maintain an ordered map of values anymore. Only when nPruneAfterSize, there is a need to create a temporary ordered vector of values to figure out what can be removed. * Instead of using a multimap for mapAskFor, use a vector which we sort on demand CNode::AskFor will now push entries into an initially unordered vector instead of an ordered multimap. Only when we later want to use vecAskFor in SendMessages, we sort the vector. The vector will actually be mostly sorted in most cases as insertion order usually mimics the desired ordering. Only the last few entries might need some shuffling around. Doing the sort on-demand should be less wasteful then trying to maintain correct order all the time. * Fix compilation of tests * Fix limitedmap tests * Rename limitedmap to unordered_limitedmap to ensure backports conflict This ensures that future backports that depends on limitedmap's ordering conflict so that we are made aware of needed action. * Fix compilation error on Travis
2019-04-11 14:42:14 +02:00
#include <algorithm>
#include <unordered_map>
#include <vector>
/** STL-like map container that only keeps the N elements with the highest value. */
Collection of minor performance optimizations (#2855) * Merge #13176: Improve CRollingBloomFilter performance: replace modulus with FastMod 9aac9f90d5e56752cc6cbfac48063ad29a01143c replace modulus with FastMod (Martin Ankerl) Pull request description: Not sure if this is optimization is necessary, but anyway I have some spare time so here it is. This replaces the slow modulo operation with a much faster 64bit multiplication & shift. This works when the hash is uniformly distributed between 0 and 2^32-1. This speeds up the benchmark by a factor of about 1.3: ``` RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod ``` Be aware that this changes the internal data of the filter, so this should probably not be used for CBloomFilter because of interoperability problems. Tree-SHA512: 04104f3fb09f56c9d14458a6aad919aeb0a5af944e8ee6a31f00e93c753e22004648c1cd65bf36752b6addec528d19fb665c27b955ce1666a85a928e17afa47a * Use unordered_map in CSporkManager In one of my profiling sessions with many InstantSend transactions happening, calls into CSporkManager added up to about 1% of total CPU time. This is easily avoidable by using unordered maps. * Use std::unordered_map instead of std::map in limitedmap * Use unordered_set for CNode::setAskFor * Add serialization support for unordered maps and sets * Use unordered_map for mapArgs and mapMultiArgs * Let limitedmap prune in batches and use unordered_multimap Due to the batched pruning, there is no need to maintain an ordered map of values anymore. Only when nPruneAfterSize, there is a need to create a temporary ordered vector of values to figure out what can be removed. * Instead of using a multimap for mapAskFor, use a vector which we sort on demand CNode::AskFor will now push entries into an initially unordered vector instead of an ordered multimap. Only when we later want to use vecAskFor in SendMessages, we sort the vector. The vector will actually be mostly sorted in most cases as insertion order usually mimics the desired ordering. Only the last few entries might need some shuffling around. Doing the sort on-demand should be less wasteful then trying to maintain correct order all the time. * Fix compilation of tests * Fix limitedmap tests * Rename limitedmap to unordered_limitedmap to ensure backports conflict This ensures that future backports that depends on limitedmap's ordering conflict so that we are made aware of needed action. * Fix compilation error on Travis
2019-04-11 14:42:14 +02:00
// WARNING, this was initially the "limitedmap" class from Bitcoin, but now does not maintain ordering. If any backports
// ever start using this map in a way that requires ordering, do NOT use this as it is but instead reintroduce the original
// limitedmap
template <typename K, typename V, typename Hash = std::hash<K>>
class unordered_limitedmap
{
public:
typedef K key_type;
typedef V mapped_type;
typedef std::pair<const key_type, mapped_type> value_type;
Collection of minor performance optimizations (#2855) * Merge #13176: Improve CRollingBloomFilter performance: replace modulus with FastMod 9aac9f90d5e56752cc6cbfac48063ad29a01143c replace modulus with FastMod (Martin Ankerl) Pull request description: Not sure if this is optimization is necessary, but anyway I have some spare time so here it is. This replaces the slow modulo operation with a much faster 64bit multiplication & shift. This works when the hash is uniformly distributed between 0 and 2^32-1. This speeds up the benchmark by a factor of about 1.3: ``` RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod ``` Be aware that this changes the internal data of the filter, so this should probably not be used for CBloomFilter because of interoperability problems. Tree-SHA512: 04104f3fb09f56c9d14458a6aad919aeb0a5af944e8ee6a31f00e93c753e22004648c1cd65bf36752b6addec528d19fb665c27b955ce1666a85a928e17afa47a * Use unordered_map in CSporkManager In one of my profiling sessions with many InstantSend transactions happening, calls into CSporkManager added up to about 1% of total CPU time. This is easily avoidable by using unordered maps. * Use std::unordered_map instead of std::map in limitedmap * Use unordered_set for CNode::setAskFor * Add serialization support for unordered maps and sets * Use unordered_map for mapArgs and mapMultiArgs * Let limitedmap prune in batches and use unordered_multimap Due to the batched pruning, there is no need to maintain an ordered map of values anymore. Only when nPruneAfterSize, there is a need to create a temporary ordered vector of values to figure out what can be removed. * Instead of using a multimap for mapAskFor, use a vector which we sort on demand CNode::AskFor will now push entries into an initially unordered vector instead of an ordered multimap. Only when we later want to use vecAskFor in SendMessages, we sort the vector. The vector will actually be mostly sorted in most cases as insertion order usually mimics the desired ordering. Only the last few entries might need some shuffling around. Doing the sort on-demand should be less wasteful then trying to maintain correct order all the time. * Fix compilation of tests * Fix limitedmap tests * Rename limitedmap to unordered_limitedmap to ensure backports conflict This ensures that future backports that depends on limitedmap's ordering conflict so that we are made aware of needed action. * Fix compilation error on Travis
2019-04-11 14:42:14 +02:00
typedef typename std::unordered_map<K, V, Hash>::const_iterator const_iterator;
typedef typename std::unordered_map<K, V, Hash>::size_type size_type;
protected:
Collection of minor performance optimizations (#2855) * Merge #13176: Improve CRollingBloomFilter performance: replace modulus with FastMod 9aac9f90d5e56752cc6cbfac48063ad29a01143c replace modulus with FastMod (Martin Ankerl) Pull request description: Not sure if this is optimization is necessary, but anyway I have some spare time so here it is. This replaces the slow modulo operation with a much faster 64bit multiplication & shift. This works when the hash is uniformly distributed between 0 and 2^32-1. This speeds up the benchmark by a factor of about 1.3: ``` RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod ``` Be aware that this changes the internal data of the filter, so this should probably not be used for CBloomFilter because of interoperability problems. Tree-SHA512: 04104f3fb09f56c9d14458a6aad919aeb0a5af944e8ee6a31f00e93c753e22004648c1cd65bf36752b6addec528d19fb665c27b955ce1666a85a928e17afa47a * Use unordered_map in CSporkManager In one of my profiling sessions with many InstantSend transactions happening, calls into CSporkManager added up to about 1% of total CPU time. This is easily avoidable by using unordered maps. * Use std::unordered_map instead of std::map in limitedmap * Use unordered_set for CNode::setAskFor * Add serialization support for unordered maps and sets * Use unordered_map for mapArgs and mapMultiArgs * Let limitedmap prune in batches and use unordered_multimap Due to the batched pruning, there is no need to maintain an ordered map of values anymore. Only when nPruneAfterSize, there is a need to create a temporary ordered vector of values to figure out what can be removed. * Instead of using a multimap for mapAskFor, use a vector which we sort on demand CNode::AskFor will now push entries into an initially unordered vector instead of an ordered multimap. Only when we later want to use vecAskFor in SendMessages, we sort the vector. The vector will actually be mostly sorted in most cases as insertion order usually mimics the desired ordering. Only the last few entries might need some shuffling around. Doing the sort on-demand should be less wasteful then trying to maintain correct order all the time. * Fix compilation of tests * Fix limitedmap tests * Rename limitedmap to unordered_limitedmap to ensure backports conflict This ensures that future backports that depends on limitedmap's ordering conflict so that we are made aware of needed action. * Fix compilation error on Travis
2019-04-11 14:42:14 +02:00
std::unordered_map<K, V, Hash> map;
typedef typename std::unordered_map<K, V, Hash>::iterator iterator;
size_type nMaxSize;
Collection of minor performance optimizations (#2855) * Merge #13176: Improve CRollingBloomFilter performance: replace modulus with FastMod 9aac9f90d5e56752cc6cbfac48063ad29a01143c replace modulus with FastMod (Martin Ankerl) Pull request description: Not sure if this is optimization is necessary, but anyway I have some spare time so here it is. This replaces the slow modulo operation with a much faster 64bit multiplication & shift. This works when the hash is uniformly distributed between 0 and 2^32-1. This speeds up the benchmark by a factor of about 1.3: ``` RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod ``` Be aware that this changes the internal data of the filter, so this should probably not be used for CBloomFilter because of interoperability problems. Tree-SHA512: 04104f3fb09f56c9d14458a6aad919aeb0a5af944e8ee6a31f00e93c753e22004648c1cd65bf36752b6addec528d19fb665c27b955ce1666a85a928e17afa47a * Use unordered_map in CSporkManager In one of my profiling sessions with many InstantSend transactions happening, calls into CSporkManager added up to about 1% of total CPU time. This is easily avoidable by using unordered maps. * Use std::unordered_map instead of std::map in limitedmap * Use unordered_set for CNode::setAskFor * Add serialization support for unordered maps and sets * Use unordered_map for mapArgs and mapMultiArgs * Let limitedmap prune in batches and use unordered_multimap Due to the batched pruning, there is no need to maintain an ordered map of values anymore. Only when nPruneAfterSize, there is a need to create a temporary ordered vector of values to figure out what can be removed. * Instead of using a multimap for mapAskFor, use a vector which we sort on demand CNode::AskFor will now push entries into an initially unordered vector instead of an ordered multimap. Only when we later want to use vecAskFor in SendMessages, we sort the vector. The vector will actually be mostly sorted in most cases as insertion order usually mimics the desired ordering. Only the last few entries might need some shuffling around. Doing the sort on-demand should be less wasteful then trying to maintain correct order all the time. * Fix compilation of tests * Fix limitedmap tests * Rename limitedmap to unordered_limitedmap to ensure backports conflict This ensures that future backports that depends on limitedmap's ordering conflict so that we are made aware of needed action. * Fix compilation error on Travis
2019-04-11 14:42:14 +02:00
size_type nPruneAfterSize;
public:
explicit unordered_limitedmap(size_type nMaxSizeIn, size_type nPruneAfterSizeIn = 0)
2015-08-17 18:06:45 +02:00
{
assert(nMaxSizeIn > 0);
nMaxSize = nMaxSizeIn;
Collection of minor performance optimizations (#2855) * Merge #13176: Improve CRollingBloomFilter performance: replace modulus with FastMod 9aac9f90d5e56752cc6cbfac48063ad29a01143c replace modulus with FastMod (Martin Ankerl) Pull request description: Not sure if this is optimization is necessary, but anyway I have some spare time so here it is. This replaces the slow modulo operation with a much faster 64bit multiplication & shift. This works when the hash is uniformly distributed between 0 and 2^32-1. This speeds up the benchmark by a factor of about 1.3: ``` RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod ``` Be aware that this changes the internal data of the filter, so this should probably not be used for CBloomFilter because of interoperability problems. Tree-SHA512: 04104f3fb09f56c9d14458a6aad919aeb0a5af944e8ee6a31f00e93c753e22004648c1cd65bf36752b6addec528d19fb665c27b955ce1666a85a928e17afa47a * Use unordered_map in CSporkManager In one of my profiling sessions with many InstantSend transactions happening, calls into CSporkManager added up to about 1% of total CPU time. This is easily avoidable by using unordered maps. * Use std::unordered_map instead of std::map in limitedmap * Use unordered_set for CNode::setAskFor * Add serialization support for unordered maps and sets * Use unordered_map for mapArgs and mapMultiArgs * Let limitedmap prune in batches and use unordered_multimap Due to the batched pruning, there is no need to maintain an ordered map of values anymore. Only when nPruneAfterSize, there is a need to create a temporary ordered vector of values to figure out what can be removed. * Instead of using a multimap for mapAskFor, use a vector which we sort on demand CNode::AskFor will now push entries into an initially unordered vector instead of an ordered multimap. Only when we later want to use vecAskFor in SendMessages, we sort the vector. The vector will actually be mostly sorted in most cases as insertion order usually mimics the desired ordering. Only the last few entries might need some shuffling around. Doing the sort on-demand should be less wasteful then trying to maintain correct order all the time. * Fix compilation of tests * Fix limitedmap tests * Rename limitedmap to unordered_limitedmap to ensure backports conflict This ensures that future backports that depends on limitedmap's ordering conflict so that we are made aware of needed action. * Fix compilation error on Travis
2019-04-11 14:42:14 +02:00
if (nPruneAfterSizeIn == 0) {
nPruneAfterSize = nMaxSize;
} else {
nPruneAfterSize = nPruneAfterSizeIn;
}
assert(nPruneAfterSize >= nMaxSize);
2015-08-17 18:06:45 +02:00
}
const_iterator begin() const { return map.begin(); }
const_iterator end() const { return map.end(); }
size_type size() const { return map.size(); }
bool empty() const { return map.empty(); }
const_iterator find(const key_type& k) const { return map.find(k); }
size_type count(const key_type& k) const { return map.count(k); }
void insert(const value_type& x)
{
std::pair<iterator, bool> ret = map.insert(x);
if (ret.second)
Collection of minor performance optimizations (#2855) * Merge #13176: Improve CRollingBloomFilter performance: replace modulus with FastMod 9aac9f90d5e56752cc6cbfac48063ad29a01143c replace modulus with FastMod (Martin Ankerl) Pull request description: Not sure if this is optimization is necessary, but anyway I have some spare time so here it is. This replaces the slow modulo operation with a much faster 64bit multiplication & shift. This works when the hash is uniformly distributed between 0 and 2^32-1. This speeds up the benchmark by a factor of about 1.3: ``` RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod ``` Be aware that this changes the internal data of the filter, so this should probably not be used for CBloomFilter because of interoperability problems. Tree-SHA512: 04104f3fb09f56c9d14458a6aad919aeb0a5af944e8ee6a31f00e93c753e22004648c1cd65bf36752b6addec528d19fb665c27b955ce1666a85a928e17afa47a * Use unordered_map in CSporkManager In one of my profiling sessions with many InstantSend transactions happening, calls into CSporkManager added up to about 1% of total CPU time. This is easily avoidable by using unordered maps. * Use std::unordered_map instead of std::map in limitedmap * Use unordered_set for CNode::setAskFor * Add serialization support for unordered maps and sets * Use unordered_map for mapArgs and mapMultiArgs * Let limitedmap prune in batches and use unordered_multimap Due to the batched pruning, there is no need to maintain an ordered map of values anymore. Only when nPruneAfterSize, there is a need to create a temporary ordered vector of values to figure out what can be removed. * Instead of using a multimap for mapAskFor, use a vector which we sort on demand CNode::AskFor will now push entries into an initially unordered vector instead of an ordered multimap. Only when we later want to use vecAskFor in SendMessages, we sort the vector. The vector will actually be mostly sorted in most cases as insertion order usually mimics the desired ordering. Only the last few entries might need some shuffling around. Doing the sort on-demand should be less wasteful then trying to maintain correct order all the time. * Fix compilation of tests * Fix limitedmap tests * Rename limitedmap to unordered_limitedmap to ensure backports conflict This ensures that future backports that depends on limitedmap's ordering conflict so that we are made aware of needed action. * Fix compilation error on Travis
2019-04-11 14:42:14 +02:00
prune();
}
void insert_or_update(const value_type& x)
{
std::pair<iterator, bool> ret = map.insert(x);
if (ret.second)
prune();
else
ret.first->second = x.second;
}
void erase(const key_type& k)
{
map.erase(k);
}
void update(const_iterator itIn, const mapped_type& v)
{
// Using map::erase() with empty range instead of map::find() to get a non-const iterator,
// since it is a constant time operation in C++11. For more details, see
// https://stackoverflow.com/questions/765148/how-to-remove-constness-of-const-iterator
iterator itTarget = map.erase(itIn, itIn);
if (itTarget == map.end())
return;
itTarget->second = v;
}
size_type max_size() const { return nMaxSize; }
Collection of minor performance optimizations (#2855) * Merge #13176: Improve CRollingBloomFilter performance: replace modulus with FastMod 9aac9f90d5e56752cc6cbfac48063ad29a01143c replace modulus with FastMod (Martin Ankerl) Pull request description: Not sure if this is optimization is necessary, but anyway I have some spare time so here it is. This replaces the slow modulo operation with a much faster 64bit multiplication & shift. This works when the hash is uniformly distributed between 0 and 2^32-1. This speeds up the benchmark by a factor of about 1.3: ``` RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod ``` Be aware that this changes the internal data of the filter, so this should probably not be used for CBloomFilter because of interoperability problems. Tree-SHA512: 04104f3fb09f56c9d14458a6aad919aeb0a5af944e8ee6a31f00e93c753e22004648c1cd65bf36752b6addec528d19fb665c27b955ce1666a85a928e17afa47a * Use unordered_map in CSporkManager In one of my profiling sessions with many InstantSend transactions happening, calls into CSporkManager added up to about 1% of total CPU time. This is easily avoidable by using unordered maps. * Use std::unordered_map instead of std::map in limitedmap * Use unordered_set for CNode::setAskFor * Add serialization support for unordered maps and sets * Use unordered_map for mapArgs and mapMultiArgs * Let limitedmap prune in batches and use unordered_multimap Due to the batched pruning, there is no need to maintain an ordered map of values anymore. Only when nPruneAfterSize, there is a need to create a temporary ordered vector of values to figure out what can be removed. * Instead of using a multimap for mapAskFor, use a vector which we sort on demand CNode::AskFor will now push entries into an initially unordered vector instead of an ordered multimap. Only when we later want to use vecAskFor in SendMessages, we sort the vector. The vector will actually be mostly sorted in most cases as insertion order usually mimics the desired ordering. Only the last few entries might need some shuffling around. Doing the sort on-demand should be less wasteful then trying to maintain correct order all the time. * Fix compilation of tests * Fix limitedmap tests * Rename limitedmap to unordered_limitedmap to ensure backports conflict This ensures that future backports that depends on limitedmap's ordering conflict so that we are made aware of needed action. * Fix compilation error on Travis
2019-04-11 14:42:14 +02:00
size_type max_size(size_type nMaxSizeIn, size_type nPruneAfterSizeIn = 0)
{
Collection of minor performance optimizations (#2855) * Merge #13176: Improve CRollingBloomFilter performance: replace modulus with FastMod 9aac9f90d5e56752cc6cbfac48063ad29a01143c replace modulus with FastMod (Martin Ankerl) Pull request description: Not sure if this is optimization is necessary, but anyway I have some spare time so here it is. This replaces the slow modulo operation with a much faster 64bit multiplication & shift. This works when the hash is uniformly distributed between 0 and 2^32-1. This speeds up the benchmark by a factor of about 1.3: ``` RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod ``` Be aware that this changes the internal data of the filter, so this should probably not be used for CBloomFilter because of interoperability problems. Tree-SHA512: 04104f3fb09f56c9d14458a6aad919aeb0a5af944e8ee6a31f00e93c753e22004648c1cd65bf36752b6addec528d19fb665c27b955ce1666a85a928e17afa47a * Use unordered_map in CSporkManager In one of my profiling sessions with many InstantSend transactions happening, calls into CSporkManager added up to about 1% of total CPU time. This is easily avoidable by using unordered maps. * Use std::unordered_map instead of std::map in limitedmap * Use unordered_set for CNode::setAskFor * Add serialization support for unordered maps and sets * Use unordered_map for mapArgs and mapMultiArgs * Let limitedmap prune in batches and use unordered_multimap Due to the batched pruning, there is no need to maintain an ordered map of values anymore. Only when nPruneAfterSize, there is a need to create a temporary ordered vector of values to figure out what can be removed. * Instead of using a multimap for mapAskFor, use a vector which we sort on demand CNode::AskFor will now push entries into an initially unordered vector instead of an ordered multimap. Only when we later want to use vecAskFor in SendMessages, we sort the vector. The vector will actually be mostly sorted in most cases as insertion order usually mimics the desired ordering. Only the last few entries might need some shuffling around. Doing the sort on-demand should be less wasteful then trying to maintain correct order all the time. * Fix compilation of tests * Fix limitedmap tests * Rename limitedmap to unordered_limitedmap to ensure backports conflict This ensures that future backports that depends on limitedmap's ordering conflict so that we are made aware of needed action. * Fix compilation error on Travis
2019-04-11 14:42:14 +02:00
assert(nMaxSizeIn > 0);
nMaxSize = nMaxSizeIn;
if (nPruneAfterSizeIn == 0) {
nPruneAfterSize = nMaxSize;
} else {
nPruneAfterSize = nPruneAfterSizeIn;
2015-08-17 18:06:45 +02:00
}
Collection of minor performance optimizations (#2855) * Merge #13176: Improve CRollingBloomFilter performance: replace modulus with FastMod 9aac9f90d5e56752cc6cbfac48063ad29a01143c replace modulus with FastMod (Martin Ankerl) Pull request description: Not sure if this is optimization is necessary, but anyway I have some spare time so here it is. This replaces the slow modulo operation with a much faster 64bit multiplication & shift. This works when the hash is uniformly distributed between 0 and 2^32-1. This speeds up the benchmark by a factor of about 1.3: ``` RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod ``` Be aware that this changes the internal data of the filter, so this should probably not be used for CBloomFilter because of interoperability problems. Tree-SHA512: 04104f3fb09f56c9d14458a6aad919aeb0a5af944e8ee6a31f00e93c753e22004648c1cd65bf36752b6addec528d19fb665c27b955ce1666a85a928e17afa47a * Use unordered_map in CSporkManager In one of my profiling sessions with many InstantSend transactions happening, calls into CSporkManager added up to about 1% of total CPU time. This is easily avoidable by using unordered maps. * Use std::unordered_map instead of std::map in limitedmap * Use unordered_set for CNode::setAskFor * Add serialization support for unordered maps and sets * Use unordered_map for mapArgs and mapMultiArgs * Let limitedmap prune in batches and use unordered_multimap Due to the batched pruning, there is no need to maintain an ordered map of values anymore. Only when nPruneAfterSize, there is a need to create a temporary ordered vector of values to figure out what can be removed. * Instead of using a multimap for mapAskFor, use a vector which we sort on demand CNode::AskFor will now push entries into an initially unordered vector instead of an ordered multimap. Only when we later want to use vecAskFor in SendMessages, we sort the vector. The vector will actually be mostly sorted in most cases as insertion order usually mimics the desired ordering. Only the last few entries might need some shuffling around. Doing the sort on-demand should be less wasteful then trying to maintain correct order all the time. * Fix compilation of tests * Fix limitedmap tests * Rename limitedmap to unordered_limitedmap to ensure backports conflict This ensures that future backports that depends on limitedmap's ordering conflict so that we are made aware of needed action. * Fix compilation error on Travis
2019-04-11 14:42:14 +02:00
assert(nPruneAfterSize >= nMaxSize);
prune();
return nMaxSize;
}
Collection of minor performance optimizations (#2855) * Merge #13176: Improve CRollingBloomFilter performance: replace modulus with FastMod 9aac9f90d5e56752cc6cbfac48063ad29a01143c replace modulus with FastMod (Martin Ankerl) Pull request description: Not sure if this is optimization is necessary, but anyway I have some spare time so here it is. This replaces the slow modulo operation with a much faster 64bit multiplication & shift. This works when the hash is uniformly distributed between 0 and 2^32-1. This speeds up the benchmark by a factor of about 1.3: ``` RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod ``` Be aware that this changes the internal data of the filter, so this should probably not be used for CBloomFilter because of interoperability problems. Tree-SHA512: 04104f3fb09f56c9d14458a6aad919aeb0a5af944e8ee6a31f00e93c753e22004648c1cd65bf36752b6addec528d19fb665c27b955ce1666a85a928e17afa47a * Use unordered_map in CSporkManager In one of my profiling sessions with many InstantSend transactions happening, calls into CSporkManager added up to about 1% of total CPU time. This is easily avoidable by using unordered maps. * Use std::unordered_map instead of std::map in limitedmap * Use unordered_set for CNode::setAskFor * Add serialization support for unordered maps and sets * Use unordered_map for mapArgs and mapMultiArgs * Let limitedmap prune in batches and use unordered_multimap Due to the batched pruning, there is no need to maintain an ordered map of values anymore. Only when nPruneAfterSize, there is a need to create a temporary ordered vector of values to figure out what can be removed. * Instead of using a multimap for mapAskFor, use a vector which we sort on demand CNode::AskFor will now push entries into an initially unordered vector instead of an ordered multimap. Only when we later want to use vecAskFor in SendMessages, we sort the vector. The vector will actually be mostly sorted in most cases as insertion order usually mimics the desired ordering. Only the last few entries might need some shuffling around. Doing the sort on-demand should be less wasteful then trying to maintain correct order all the time. * Fix compilation of tests * Fix limitedmap tests * Rename limitedmap to unordered_limitedmap to ensure backports conflict This ensures that future backports that depends on limitedmap's ordering conflict so that we are made aware of needed action. * Fix compilation error on Travis
2019-04-11 14:42:14 +02:00
void prune()
{
if (map.size() <= nPruneAfterSize) {
return;
}
std::vector<iterator> sortedIterators;
Collection of minor performance optimizations (#2855) * Merge #13176: Improve CRollingBloomFilter performance: replace modulus with FastMod 9aac9f90d5e56752cc6cbfac48063ad29a01143c replace modulus with FastMod (Martin Ankerl) Pull request description: Not sure if this is optimization is necessary, but anyway I have some spare time so here it is. This replaces the slow modulo operation with a much faster 64bit multiplication & shift. This works when the hash is uniformly distributed between 0 and 2^32-1. This speeds up the benchmark by a factor of about 1.3: ``` RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod ``` Be aware that this changes the internal data of the filter, so this should probably not be used for CBloomFilter because of interoperability problems. Tree-SHA512: 04104f3fb09f56c9d14458a6aad919aeb0a5af944e8ee6a31f00e93c753e22004648c1cd65bf36752b6addec528d19fb665c27b955ce1666a85a928e17afa47a * Use unordered_map in CSporkManager In one of my profiling sessions with many InstantSend transactions happening, calls into CSporkManager added up to about 1% of total CPU time. This is easily avoidable by using unordered maps. * Use std::unordered_map instead of std::map in limitedmap * Use unordered_set for CNode::setAskFor * Add serialization support for unordered maps and sets * Use unordered_map for mapArgs and mapMultiArgs * Let limitedmap prune in batches and use unordered_multimap Due to the batched pruning, there is no need to maintain an ordered map of values anymore. Only when nPruneAfterSize, there is a need to create a temporary ordered vector of values to figure out what can be removed. * Instead of using a multimap for mapAskFor, use a vector which we sort on demand CNode::AskFor will now push entries into an initially unordered vector instead of an ordered multimap. Only when we later want to use vecAskFor in SendMessages, we sort the vector. The vector will actually be mostly sorted in most cases as insertion order usually mimics the desired ordering. Only the last few entries might need some shuffling around. Doing the sort on-demand should be less wasteful then trying to maintain correct order all the time. * Fix compilation of tests * Fix limitedmap tests * Rename limitedmap to unordered_limitedmap to ensure backports conflict This ensures that future backports that depends on limitedmap's ordering conflict so that we are made aware of needed action. * Fix compilation error on Travis
2019-04-11 14:42:14 +02:00
sortedIterators.reserve(map.size());
for (auto it = map.begin(); it != map.end(); ++it) {
Collection of minor performance optimizations (#2855) * Merge #13176: Improve CRollingBloomFilter performance: replace modulus with FastMod 9aac9f90d5e56752cc6cbfac48063ad29a01143c replace modulus with FastMod (Martin Ankerl) Pull request description: Not sure if this is optimization is necessary, but anyway I have some spare time so here it is. This replaces the slow modulo operation with a much faster 64bit multiplication & shift. This works when the hash is uniformly distributed between 0 and 2^32-1. This speeds up the benchmark by a factor of about 1.3: ``` RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod ``` Be aware that this changes the internal data of the filter, so this should probably not be used for CBloomFilter because of interoperability problems. Tree-SHA512: 04104f3fb09f56c9d14458a6aad919aeb0a5af944e8ee6a31f00e93c753e22004648c1cd65bf36752b6addec528d19fb665c27b955ce1666a85a928e17afa47a * Use unordered_map in CSporkManager In one of my profiling sessions with many InstantSend transactions happening, calls into CSporkManager added up to about 1% of total CPU time. This is easily avoidable by using unordered maps. * Use std::unordered_map instead of std::map in limitedmap * Use unordered_set for CNode::setAskFor * Add serialization support for unordered maps and sets * Use unordered_map for mapArgs and mapMultiArgs * Let limitedmap prune in batches and use unordered_multimap Due to the batched pruning, there is no need to maintain an ordered map of values anymore. Only when nPruneAfterSize, there is a need to create a temporary ordered vector of values to figure out what can be removed. * Instead of using a multimap for mapAskFor, use a vector which we sort on demand CNode::AskFor will now push entries into an initially unordered vector instead of an ordered multimap. Only when we later want to use vecAskFor in SendMessages, we sort the vector. The vector will actually be mostly sorted in most cases as insertion order usually mimics the desired ordering. Only the last few entries might need some shuffling around. Doing the sort on-demand should be less wasteful then trying to maintain correct order all the time. * Fix compilation of tests * Fix limitedmap tests * Rename limitedmap to unordered_limitedmap to ensure backports conflict This ensures that future backports that depends on limitedmap's ordering conflict so that we are made aware of needed action. * Fix compilation error on Travis
2019-04-11 14:42:14 +02:00
sortedIterators.emplace_back(it);
}
std::sort(sortedIterators.begin(), sortedIterators.end(), [](const iterator& it1, const iterator& it2) {
return it1->second < it2->second;
Collection of minor performance optimizations (#2855) * Merge #13176: Improve CRollingBloomFilter performance: replace modulus with FastMod 9aac9f90d5e56752cc6cbfac48063ad29a01143c replace modulus with FastMod (Martin Ankerl) Pull request description: Not sure if this is optimization is necessary, but anyway I have some spare time so here it is. This replaces the slow modulo operation with a much faster 64bit multiplication & shift. This works when the hash is uniformly distributed between 0 and 2^32-1. This speeds up the benchmark by a factor of about 1.3: ``` RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod ``` Be aware that this changes the internal data of the filter, so this should probably not be used for CBloomFilter because of interoperability problems. Tree-SHA512: 04104f3fb09f56c9d14458a6aad919aeb0a5af944e8ee6a31f00e93c753e22004648c1cd65bf36752b6addec528d19fb665c27b955ce1666a85a928e17afa47a * Use unordered_map in CSporkManager In one of my profiling sessions with many InstantSend transactions happening, calls into CSporkManager added up to about 1% of total CPU time. This is easily avoidable by using unordered maps. * Use std::unordered_map instead of std::map in limitedmap * Use unordered_set for CNode::setAskFor * Add serialization support for unordered maps and sets * Use unordered_map for mapArgs and mapMultiArgs * Let limitedmap prune in batches and use unordered_multimap Due to the batched pruning, there is no need to maintain an ordered map of values anymore. Only when nPruneAfterSize, there is a need to create a temporary ordered vector of values to figure out what can be removed. * Instead of using a multimap for mapAskFor, use a vector which we sort on demand CNode::AskFor will now push entries into an initially unordered vector instead of an ordered multimap. Only when we later want to use vecAskFor in SendMessages, we sort the vector. The vector will actually be mostly sorted in most cases as insertion order usually mimics the desired ordering. Only the last few entries might need some shuffling around. Doing the sort on-demand should be less wasteful then trying to maintain correct order all the time. * Fix compilation of tests * Fix limitedmap tests * Rename limitedmap to unordered_limitedmap to ensure backports conflict This ensures that future backports that depends on limitedmap's ordering conflict so that we are made aware of needed action. * Fix compilation error on Travis
2019-04-11 14:42:14 +02:00
});
size_type tooMuch = map.size() - nMaxSize;
assert(tooMuch > 0);
sortedIterators.resize(tooMuch);
for (auto& it : sortedIterators) {
map.erase(it);
Collection of minor performance optimizations (#2855) * Merge #13176: Improve CRollingBloomFilter performance: replace modulus with FastMod 9aac9f90d5e56752cc6cbfac48063ad29a01143c replace modulus with FastMod (Martin Ankerl) Pull request description: Not sure if this is optimization is necessary, but anyway I have some spare time so here it is. This replaces the slow modulo operation with a much faster 64bit multiplication & shift. This works when the hash is uniformly distributed between 0 and 2^32-1. This speeds up the benchmark by a factor of about 1.3: ``` RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod ``` Be aware that this changes the internal data of the filter, so this should probably not be used for CBloomFilter because of interoperability problems. Tree-SHA512: 04104f3fb09f56c9d14458a6aad919aeb0a5af944e8ee6a31f00e93c753e22004648c1cd65bf36752b6addec528d19fb665c27b955ce1666a85a928e17afa47a * Use unordered_map in CSporkManager In one of my profiling sessions with many InstantSend transactions happening, calls into CSporkManager added up to about 1% of total CPU time. This is easily avoidable by using unordered maps. * Use std::unordered_map instead of std::map in limitedmap * Use unordered_set for CNode::setAskFor * Add serialization support for unordered maps and sets * Use unordered_map for mapArgs and mapMultiArgs * Let limitedmap prune in batches and use unordered_multimap Due to the batched pruning, there is no need to maintain an ordered map of values anymore. Only when nPruneAfterSize, there is a need to create a temporary ordered vector of values to figure out what can be removed. * Instead of using a multimap for mapAskFor, use a vector which we sort on demand CNode::AskFor will now push entries into an initially unordered vector instead of an ordered multimap. Only when we later want to use vecAskFor in SendMessages, we sort the vector. The vector will actually be mostly sorted in most cases as insertion order usually mimics the desired ordering. Only the last few entries might need some shuffling around. Doing the sort on-demand should be less wasteful then trying to maintain correct order all the time. * Fix compilation of tests * Fix limitedmap tests * Rename limitedmap to unordered_limitedmap to ensure backports conflict This ensures that future backports that depends on limitedmap's ordering conflict so that we are made aware of needed action. * Fix compilation error on Travis
2019-04-11 14:42:14 +02:00
}
}
};
#endif // BITCOIN_LIMITEDMAP_H