4e46e6101c
296be8f58e02b39a58f017c52294aceed22c3ffd Get rid of unused functions CTxMemPool::GetMemPoolChildren, CTxMemPool::GetMemPoolParents (Jeremy Rubin) 46d955d196043cc297834baeebce31ff778dff80 Remove mapLinks in favor of entry inlined structs with iterator type erasure (Jeremy Rubin) Pull request description: Currently we have a peculiar data structure in the mempool called maplinks. Maplinks job is to track the in-pool children and parents of each transaction. This PR can be primarily understood and reviewed as a simple refactoring to remove this extra data structure, although it comes with a nice memory and performance improvement for free. Maplinks is particularly peculiar because removing it is not as simple as just moving it's inner structure to the owning CTxMempoolEntry. Because TxLinks (the class storing the setEntries for parents and children) store txiters to each entry in the mempool corresponding to the parent or child, it means that the TxLinks type is "aware" of the boost multiindex (mapTx) it's coming from, which is in turn, aware of the entry type stored in mapTx. Thus we used maplinks to store this entry associated data we in an entirely separate data structure just to avoid a circular type reference caused by storing a txiter inside a CTxMempoolEntry. It turns out, we can kill this circular reference by making use of iterator_to multiindex function and std::reference_wrapper. This allows us to get rid of the maplinks data structure and move the ownership of the parents/child sets to the entries themselves. The benefit of this good all around, for any of the reasons given below the change would be acceptable, and it doesn't make the code harder to reason about or worse in any respect (as far as I can tell, there's no tradeoff). ### Simpler ownership model No longer having to consistency check that mapLinks did have records for our CTxMempoolEntry, impossible to have a mapLinks entry outlive or incorrectly die before a CTxMempoolEntry. ### Memory Usage We get rid of a O(Transactions) sized map in the mempool, which is a long lived data structure. ### Performance If you have a CTxMemPoolEntry, you immediately know the address of it's children/parents, rather than having to do a O(log(Transactions)) lookup via maplinks (which we do very often). We do it in *so many* places that a true benchmark has to look at a full running node, but it is easy enough to show an improvement in this case. The ComplexMemPool shows a good coherence check that we see the expected result of it being 12.5% faster / 1.14x faster. ``` Before: # Benchmark, evals, iterations, total, min, max, median ComplexMemPool, 5, 1, 1.40462, 0.277222, 0.285339, 0.279793 After: # Benchmark, evals, iterations, total, min, max, median ComplexMemPool, 5, 1, 1.22586, 0.243831, 0.247076, 0.244596 ``` The ComplexMemPool benchmark only checks doing addUnchecked and TrimToSize for 800 transactions. While this bench does a good job of hammering the relevant types of function, it doesn't test everything. Subbing in 5000 transactions shows a that the advantage isn't completely wiped out by other asymptotic factors (this isn't the only bottleneck in growing the mempool), but it's only a bit proportionally slower (10.8%, 1.12x), which adds evidence that this will be a good change for performance minded users. ``` # Benchmark, evals, iterations, total, min, max, median ComplexMemPool, 5, 1, 59.1321, 11.5919, 12.235, 11.7068 # Benchmark, evals, iterations, total, min, max, median ComplexMemPool, 5, 1, 52.1307, 10.2641, 10.5206, 10.4306 ``` I don't think it's possible to come up with an example of where a maplinks based design would have better performance, but it's something for reviewers to consider. # Discussion ## Why maplinks in the first place? I spoke with the author of mapLinks (sdaftuar) a while back, and my recollection from our conversation was that it was implemented because he did not know how to resolve the circular dependency at the time, and there was no other reason for making it a separate map. ## Is iterator_to weird? iterator_to is expressly for this purpose, see https://www.boost.org/doc/libs/1_51_0/libs/multi_index/doc/tutorial/indices.html#iterator_to > iterator_to provides a way to retrieve an iterator to an element from a pointer to the element, thus making iterators and pointers interchangeable for the purposes of element pointing (not so for traversal) in many situations. This notwithstanding, it is not the aim of iterator_to to promote the usage of pointers as substitutes for real iterators: the latter are specifically designed for handling the elements of a container, and not only benefit from the iterator orientation of container interfaces, but are also capable of exposing many more programming bugs than raw pointers, both at compile and run time. iterator_to is thus meant to be used in scenarios where access via iterators is not suitable or desireable: > > - Interoperability with preexisting APIs based on pointers or references. > - Publication of pointer-based interfaces (for instance, when designing a C-compatible library). > - The exposure of pointers in place of iterators can act as a type erasure barrier effectively decoupling the user of the code from the implementation detail of which particular container is being used. Similar techniques, like the famous Pimpl idiom, are used in large projects to reduce dependencies and build times. > - Self-referencing contexts where an element acts upon its owner container and no iterator to itself is available. In other words, iterator_to is the perfect tool for the job by the last reason given. Under the hood it should just be a simple pointer cast and have no major runtime overhead (depending on if the function call is inlined). Edit by laanwj: removed at sign from the description ACKs for top commit: jonatack: re-ACK 296be8f per `git range-diff ab338a19 3ba1665 296be8f`, sanity check gcc 10.2 debug build is clean. hebasto: re-ACK 296be8f58e02b39a58f017c52294aceed22c3ffd, only rebased since my [previous](https://github.com/bitcoin/bitcoin/pull/19478#pullrequestreview-482400727) review (verified with `git range-diff`). Tree-SHA512: f5c30a4936fcde6ae32a02823c303b3568a747c2681d11f87df88a149f984a6d3b4c81f391859afbeb68864ef7f6a3d8779f74a58e3de701b3d51f78e498682e |
||
---|---|---|
.github | ||
.tx | ||
build-aux/m4 | ||
ci | ||
contrib | ||
depends | ||
doc | ||
share | ||
src | ||
test | ||
.cirrus.yml | ||
.dockerignore | ||
.editorconfig | ||
.fuzzbuzz.yml | ||
.gitattributes | ||
.gitignore | ||
.gitlab-ci.yml | ||
.python-version | ||
.style.yapf | ||
.travis.yml | ||
autogen.sh | ||
CMakeLists.txt | ||
configure.ac | ||
CONTRIBUTING.md | ||
COPYING | ||
INSTALL.md | ||
libdashconsensus.pc.in | ||
Makefile.am | ||
README.md | ||
SECURITY.md |
Dash Core staging tree
CI | master | develop |
---|---|---|
Gitlab |
For an immediately usable, binary version of the Dash Core software, see https://www.dash.org/downloads/.
Further information about Dash Core is available in the doc folder.
What is Dash?
Dash is an experimental digital currency that enables instant, private payments to anyone, anywhere in the world. Dash uses peer-to-peer technology to operate with no central authority: managing transactions and issuing money are carried out collectively by the network. Dash Core is the name of the open source software which enables the use of this currency.
For more information read the original Dash whitepaper.
License
Dash Core is released under the terms of the MIT license. See COPYING for more information or see https://opensource.org/licenses/MIT.
Development Process
The master
branch is meant to be stable. Development is normally done in separate branches.
Tags are created to indicate new official,
stable release versions of Dash Core.
The develop
branch is regularly built (see doc/build-*.md for instructions) and tested, but is not guaranteed to be
completely stable.
The contribution workflow is described in CONTRIBUTING.md and useful hints for developers can be found in doc/developer-notes.md.
Testing
Testing and code review is the bottleneck for development; we get more pull requests than we can review and test on short notice. Please be patient and help out by testing other people's pull requests, and remember this is a security-critical project where any mistake might cost people lots of money.
Automated Testing
Developers are strongly encouraged to write unit tests for new code, and to
submit new unit tests for old code. Unit tests can be compiled and run
(assuming they weren't disabled in configure) with: make check
. Further details on running
and extending unit tests can be found in /src/test/README.md.
There are also regression and integration tests, written
in Python.
These tests can be run (if the test dependencies are installed) with: test/functional/test_runner.py
The Travis CI system makes sure that every pull request is built for Windows, Linux, and macOS, and that unit/sanity tests are run automatically.
Manual Quality Assurance (QA) Testing
Changes should be tested by somebody other than the developer who wrote the code. This is especially important for large or high-risk changes. It is useful to add a test plan to the pull request description if testing the changes is not straightforward.
Translations
Changes to translations as well as new translations can be submitted to Dash Core's Transifex page.
Translations are periodically pulled from Transifex and merged into the git repository. See the translation process for details on how this works.
Important: We do not accept translation changes as GitHub pull requests because the next pull from Transifex would automatically overwrite them again.