Dash - Reinventing Cryptocurrency
Go to file
MarcoFalke 4e46e6101c
Merge #19478: Remove CTxMempool::mapLinks data structure member
296be8f58e02b39a58f017c52294aceed22c3ffd Get rid of unused functions CTxMemPool::GetMemPoolChildren, CTxMemPool::GetMemPoolParents (Jeremy Rubin)
46d955d196043cc297834baeebce31ff778dff80 Remove mapLinks in favor of entry inlined structs with iterator type erasure (Jeremy Rubin)

Pull request description:

  Currently we have a peculiar data structure in the mempool called maplinks. Maplinks job is to track the in-pool children and parents of each transaction. This PR can be primarily understood and reviewed as a simple refactoring to remove this extra data structure, although it comes with a nice memory and performance improvement for free.

  Maplinks is particularly peculiar because removing it is not as simple as just moving it's inner structure to the owning CTxMempoolEntry. Because TxLinks (the class storing the setEntries for parents and children) store txiters to each entry in the mempool corresponding to the parent or child, it means that the TxLinks type is "aware" of the boost multiindex (mapTx) it's coming from, which is in turn, aware of the entry type stored in mapTx. Thus we used maplinks to store this entry associated data we in an entirely separate data structure just to avoid a circular type reference caused by storing a txiter inside a CTxMempoolEntry.

  It turns out, we can kill this circular reference by making use of iterator_to multiindex function and std::reference_wrapper. This allows us to get rid of the maplinks data structure and move the ownership of the parents/child sets to the entries themselves.

  The benefit of this good all around, for any of the reasons given below the change would be acceptable, and it doesn't make the code harder to reason about or worse in any respect (as far as I can tell, there's no tradeoff).

  ### Simpler ownership model
  No longer having to consistency check that mapLinks did have records for our CTxMempoolEntry, impossible to have a mapLinks entry outlive or incorrectly die before a CTxMempoolEntry.

  ### Memory Usage
  We get rid of a O(Transactions) sized map in the mempool, which is a long lived data structure.

  ### Performance
  If you have a CTxMemPoolEntry, you immediately know the address of it's children/parents, rather than having to do a O(log(Transactions)) lookup via maplinks (which we do very often). We do it in *so many* places that a true benchmark has to look at a full running node, but it is easy enough to show an improvement in this case.

  The ComplexMemPool shows a good coherence check that we see the expected result of it being 12.5% faster / 1.14x faster.
  ```
  Before:
  # Benchmark, evals, iterations, total, min, max, median
  ComplexMemPool, 5, 1, 1.40462, 0.277222, 0.285339, 0.279793

  After:
  # Benchmark, evals, iterations, total, min, max, median
  ComplexMemPool, 5, 1, 1.22586, 0.243831, 0.247076, 0.244596
  ```
  The ComplexMemPool benchmark only checks doing addUnchecked and TrimToSize for 800 transactions. While this bench does a good job of hammering the relevant types of function, it doesn't test everything.

  Subbing in 5000 transactions shows a that the advantage isn't completely wiped out by other asymptotic factors (this isn't the only bottleneck in growing the mempool), but it's only a bit proportionally slower (10.8%, 1.12x), which adds evidence that this will be a good change for performance minded users.

  ```
  # Benchmark, evals, iterations, total, min, max, median
  ComplexMemPool, 5, 1, 59.1321, 11.5919, 12.235, 11.7068

  # Benchmark, evals, iterations, total, min, max, median
  ComplexMemPool, 5, 1, 52.1307, 10.2641, 10.5206, 10.4306
  ```
  I don't think it's possible to come up with an example of where a maplinks based design would have better performance, but it's something for reviewers to consider.

  # Discussion
  ## Why maplinks in the first place?

  I spoke with the author of mapLinks (sdaftuar) a while back, and my recollection from our conversation was that it was implemented because he did not know how to resolve the circular dependency at the time, and there was no other reason for making it a separate map.

  ## Is iterator_to weird?

  iterator_to is expressly for this purpose, see https://www.boost.org/doc/libs/1_51_0/libs/multi_index/doc/tutorial/indices.html#iterator_to

  >  iterator_to provides a way to retrieve an iterator to an element from a pointer to the element, thus making iterators and pointers interchangeable for the purposes of element pointing (not so for traversal) in many situations. This notwithstanding, it is not the aim of iterator_to to promote the usage of pointers as substitutes for real iterators: the latter are specifically designed for handling the elements of a container, and not only benefit from the iterator orientation of container interfaces, but are also capable of exposing many more programming bugs than raw pointers, both at compile and run time. iterator_to is thus meant to be used in scenarios where access via iterators is not suitable or desireable:
  >
  >     - Interoperability with preexisting APIs based on pointers or references.
  >     - Publication of pointer-based interfaces (for instance, when designing a C-compatible library).
  >     - The exposure of pointers in place of iterators can act as a type erasure barrier effectively decoupling the user of the code from the implementation detail of which particular container is being used. Similar techniques, like the famous Pimpl idiom, are used in large projects to reduce dependencies and build times.
  >     - Self-referencing contexts where an element acts upon its owner container and no iterator to itself is available.

  In other words, iterator_to is the perfect tool for the job by the last reason given. Under the hood it should just be a simple pointer cast and have no major runtime overhead (depending on if the function call is inlined).

  Edit by laanwj: removed at sign from the description

ACKs for top commit:
  jonatack:
    re-ACK 296be8f per `git range-diff ab338a19 3ba1665 296be8f`, sanity check gcc 10.2 debug build is clean.
  hebasto:
    re-ACK 296be8f58e02b39a58f017c52294aceed22c3ffd, only rebased since my [previous](https://github.com/bitcoin/bitcoin/pull/19478#pullrequestreview-482400727) review (verified with `git range-diff`).

Tree-SHA512: f5c30a4936fcde6ae32a02823c303b3568a747c2681d11f87df88a149f984a6d3b4c81f391859afbeb68864ef7f6a3d8779f74a58e3de701b3d51f78e498682e
2024-03-06 02:00:40 +07:00
.github ci: exclude translation files from clang-diff analysis 2024-03-04 08:54:53 -06:00
.tx fix: follow-up #5393 - should be used [dash.dash_ents] (#5472) 2023-07-01 14:16:50 +03:00
build-aux/m4 Merge bitcoin/bitcoin#26002: build: sync ax_boost_base from upstream 2024-02-29 12:33:46 -06:00
ci non-scripted-diff: bump copyright year to 2023 2024-02-24 11:05:37 -06:00
contrib fix: uninitialized variable onions in makeseeds script 2024-03-03 23:34:34 -06:00
depends Merge bitcoin/bitcoin#25322: build: Fix capnp package build for Android 2024-03-05 10:40:36 -06:00
doc Merge #19473: net: Add -networkactive option 2024-03-06 02:00:39 +07:00
share feat: Set client version for non-release binaries and version in guix based on git tags (#5653) 2024-01-11 21:43:42 -06:00
src Merge #19478: Remove CTxMempool::mapLinks data structure member 2024-03-06 02:00:40 +07:00
test Merge #19473: net: Add -networkactive option 2024-03-06 02:00:39 +07:00
.cirrus.yml Merge bitcoin/bitcoin#24574: test: Actually print TSan tracebacks 2024-02-22 20:58:43 -06:00
.dockerignore
.editorconfig
.fuzzbuzz.yml Merge #21064: refactor: use std::shared_mutex & remove Boost Thread 2024-01-16 09:29:52 -06:00
.gitattributes
.gitignore merge bitcoin#21336: Make .gitignore ignore src/test/fuzz/fuzz.exe 2024-02-06 08:39:51 -06:00
.gitlab-ci.yml build: disable multiprocess build in gitlab CI 2024-01-16 09:34:28 -06:00
.python-version partial bitcoin#27483: Bump python minimum version to 3.8 2023-05-11 09:18:48 -05:00
.style.yapf
.travis.yml Merge #20339: ci: Run more ci configs on cirrus 2024-02-05 10:20:31 -06:00
autogen.sh Merge #17829: scripted-diff: Bump copyright of files changed in 2019 2023-12-06 11:40:14 -06:00
CMakeLists.txt chore: Added missing sources files in CMake (#5503) 2023-07-25 12:23:56 -05:00
configure.ac chore: bump version to v21.0.0; allow breaking changes from here out 2024-03-05 12:09:15 -06:00
CONTRIBUTING.md Merge bitcoin/bitcoin#25165: doc: Explain squashing with merge commits 2024-02-22 20:58:44 -06:00
COPYING docs: update license year range to 2024 (#5890) 2024-02-22 20:56:43 -06:00
INSTALL.md
libdashconsensus.pc.in revert dash#1432: Rename consensus source library and API 2022-08-09 14:16:28 +05:30
Makefile.am Merge #20549: Support make src/bitcoin-node and src/bitcoin-gui 2024-01-16 09:34:27 -06:00
README.md chore: drop version from README.md which is not really useful (#5811) 2024-01-10 12:12:41 -06:00
SECURITY.md Merge bitcoin/bitcoin#23466: doc: Suggest keys.openpgp.org as keyserver in SECURITY.md 2022-04-03 18:46:47 -05:00

Dash Core staging tree

CI master develop
Gitlab Build Status Build Status

https://www.dash.org

For an immediately usable, binary version of the Dash Core software, see https://www.dash.org/downloads/.

Further information about Dash Core is available in the doc folder.

What is Dash?

Dash is an experimental digital currency that enables instant, private payments to anyone, anywhere in the world. Dash uses peer-to-peer technology to operate with no central authority: managing transactions and issuing money are carried out collectively by the network. Dash Core is the name of the open source software which enables the use of this currency.

For more information read the original Dash whitepaper.

License

Dash Core is released under the terms of the MIT license. See COPYING for more information or see https://opensource.org/licenses/MIT.

Development Process

The master branch is meant to be stable. Development is normally done in separate branches. Tags are created to indicate new official, stable release versions of Dash Core.

The develop branch is regularly built (see doc/build-*.md for instructions) and tested, but is not guaranteed to be completely stable.

The contribution workflow is described in CONTRIBUTING.md and useful hints for developers can be found in doc/developer-notes.md.

Testing

Testing and code review is the bottleneck for development; we get more pull requests than we can review and test on short notice. Please be patient and help out by testing other people's pull requests, and remember this is a security-critical project where any mistake might cost people lots of money.

Automated Testing

Developers are strongly encouraged to write unit tests for new code, and to submit new unit tests for old code. Unit tests can be compiled and run (assuming they weren't disabled in configure) with: make check. Further details on running and extending unit tests can be found in /src/test/README.md.

There are also regression and integration tests, written in Python. These tests can be run (if the test dependencies are installed) with: test/functional/test_runner.py

The Travis CI system makes sure that every pull request is built for Windows, Linux, and macOS, and that unit/sanity tests are run automatically.

Manual Quality Assurance (QA) Testing

Changes should be tested by somebody other than the developer who wrote the code. This is especially important for large or high-risk changes. It is useful to add a test plan to the pull request description if testing the changes is not straightforward.

Translations

Changes to translations as well as new translations can be submitted to Dash Core's Transifex page.

Translations are periodically pulled from Transifex and merged into the git repository. See the translation process for details on how this works.

Important: We do not accept translation changes as GitHub pull requests because the next pull from Transifex would automatically overwrite them again.