5f26855f109af53a336d5f98ed0ae584e7a31f84 test: Remove ubsan alignment suppressions (Wladimir J. van der Laan)
9d933ef9191417b4b7d29eaa3c3a571f814acc8e prevector: avoid misaligned member accesses (Anthony Towns)
Pull request description:
Ensure prevector data is appropriately aligned. Earlier discussion in #17530.
**Edit laanwj**: In contrast to #17530, it does this without increase in size of any of the coin cache data structures (x86_64, clang)
| Struct | (size,align) before | (size,align) after |
| ------------- | ------------- | ------- |
| Coin | 48, 8 | 48, 8 |
| CCoinsCacheEntry | 56, 8 | 56, 8 |
| CScript | 32, 1 | 32, 8 |
ACKs for top commit:
laanwj:
ACK 5f26855f109af53a336d5f98ed0ae584e7a31f84
practicalswift:
ACK 5f26855f109af53a336d5f98ed0ae584e7a31f84
jonatack:
ACK 5f26855f109af53a336d5f98ed0ae584e7a31f84
Tree-SHA512: 98d112d6856f683d5b212410b73f3071d2994f1efb046a2418a35890aa1cf1aa7c96a960fc2e963fa15241e861093c1ea41951cf5b4b5431f88345eb1dd0a98a
d2eee87928 Lift prevector default vals to the member declaration (Ben Woosley)
Pull request description:
I overlooked this possibility in #14028
ACKs for commit d2eee8:
promag:
utACK d2eee87, change looks good because members are always initialized.
251Labs:
utACK d2eee87 nice one.
ken2812221:
utACK d2eee87928781ab3082d4279aa6f19641a45e801
practicalswift:
utACK d2eee87928781ab3082d4279aa6f19641a45e801
scravy:
utACK d2eee87928781ab3082d4279aa6f19641a45e801
Tree-SHA512: f2726bae1cf892fd680cf8571027bcdc2e42ba567eaa901fb5fb5423b4d11b29e745e0163d82cb513d8c81399cc85933a16ed66d4a30829382d4721ffc41dc97
76e13b586ff690dd3312f719f305c0d1021cd505 warnings: Compiler warning on memset usage for non-trivial type (Lenny Maiorani)
Pull request description:
Fixing warnings reported by GCC: memset of non-trivial type
Tree-SHA512: 357aeac60acfb922851daaf0bd8d4b81e377da7c9b31c2942b54cfdd4129dae61e577fc0a6aa430348cb07abd16ae32f986a64dbb2c1d90ec148f53e7451a229
497e90c02b96e8739e8faf3d43e41ba1ff0627b7 Remove default argument to prevector constructor to remove ambiguity (Ben Woosley)
Pull request description:
The call with this default argument is redundant with `prevector(size_type)` on line 251.
Tree-SHA512: 4d22e6f4cd56e4b700596d7f5afc945ec6684636a94690fa16a1bbb34e4f53b6340f53a6c314fea213359426474125228ba7193388789f8a13308506358e92db
* Implement assign_to in prevector
* Implement optimized fill() methods for trivially constructible types in prevector
No need to invoke the "new" operator on every element when the elements
are trivially constructible (e.g. unsigned char)
* Benchmark prevector<...>::const_iterator vs vector.assign()
* Manually invoke ::memmove instead of relying on the stl
Some compilers do not automatically switch to memmove internally, so lets
do this manually.
* Use prevector::assign_to in benchmark
* Rename prevector benchmarks
* Use larger copy ranges in benchmarks
Co-authored-by: UdjinM6 <UdjinM6@users.noreply.github.com>
bc70ab5 Fix header guards using reserved identifiers (Dan Raviv)
Pull request description:
Identifiers beginning with an underscore followed immediately by an uppercase letter are reserved.
Tree-SHA512: 32b45e0aef6f6325bc3cbdea399532437490b753621149374df27e1c1eed6739ad1a09ae368e888cab8d01fb757f1b190c45a0854d2861de39a9296f17e29d9e
86b47fa741408b061ab0bda784b8678bfd7dfa88 speed up Unserialize_impl for prevector (Akio Nakamura)
Pull request description:
The unserializer for prevector uses `resize()` for reserve the area, but it's prefer to use `reserve()` because `resize()` have overhead to call its constructor many times.
However, `reserve()` does not change the value of `_size` (a private member of prevector).
This PR make the logic of read from stream to callback function, and prevector handles initilizing new values with that call-back and ajust the value of `_size`.
The changes are as follows:
1. prevector.h
Add a public member function named 'append'.
This function has 2 params, number of elemenst to append and call-back function that initilizing new appended values.
2. serialize.h
In the following two function:
- `Unserialize_impl(Stream& is, prevector<N, T>& v, const unsigned char&)`
- `Unserialize_impl(Stream& is, prevector<N, T>& v, const V&)`
Make a callback function from each original logic of reading values from stream, and call prevector's `append()`.
3. test/prevector_tests.cpp
Add a test for `append()`.
## A benchmark result is following:
[Machine]
MacBook Pro (macOS 10.13.3/i7 2.2GHz/mem 16GB/SSD)
[result]
DeserializeAndCheckBlockTest => 22% faster
DeserializeBlockTest => 29% faster
[before PR]
# Benchmark, evals, iterations, total, min, max, median
DeserializeAndCheckBlockTest, 60, 160, 94.4901, 0.0094644, 0.0104715, 0.0098339
DeserializeBlockTest, 60, 130, 65.0964, 0.00800362, 0.00895134, 0.00824187
[After PR]
# Benchmark, evals, iterations, total, min, max, median
DeserializeAndCheckBlockTest, 60, 160, 77.1597, 0.00767013, 0.00858959, 0.00805757
DeserializeBlockTest, 60, 130, 49.9443, 0.00613926, 0.00691187, 0.00635527
ACKs for top commit:
laanwj:
utACK 86b47fa741408b061ab0bda784b8678bfd7dfa88
Tree-SHA512: 62ea121ccd45a306fefc67485a1b03a853435af762607dae2426a87b15a3033d802c8556e1923727ddd1023a1837d0e5f6720c2c77b38196907e750e15fbb902
5aad635 Use memset() to optimize prevector::resize() (Evan Klitzke)
e46be25 Reduce redundant code of prevector and speed it up (Akio Nakamura)
f0e7aa7 Add new prevector benchmarks. (Evan Klitzke)
Pull request description:
This branch optimizes various `prevector` operations, especially resizing vectors. While profiling the `loadblk` thread I noticed that a lot of time was being spent in `prevector::resize()` which led to this work. I have some data here indicating that it takes up **37%** of the time in `ReadBlockFromDisk()`: https://monad.io/readblockfromdisk.svg
This branch improves things significantly. For trivial types, the new results for the prevector benchmark are:
* `PrevectorClearTrivial` which tests `prevector::clear()` becomes 24.6x faster
* `PrevectorDestructorTrivial` which tests `prevector::~prevector()` becomes 20.5x faster
* `PrevectorResizeTrivial` which tests `prevector::resize()` becomes 20.3x faster
Note that in practice it looks like the prevector is only used to contain `unsigned char` types, which is a trivial type. The benchmarks are testing a bit of an extreme case, but the changes here are motivated by the profiling data for `ReadBlockFromDisk()` I linked to above.
The pull request here consists of a series of three commits:
* The first adds new benchmarks but does not change the prevector code.
* The second is from @AkioNak , and merges some prevector optimizations he submitted in #11988
* The third optimizes `prevector::resize()` to use `memset()` when the prevector contains trivially constructible types
Tree-SHA512: 28f7cbb91a19f9f43b6a5942781d7eb2e3197389186b666f086b69df12bee37773140f765426d715bfb8ebff79cb27a5f1206d0325b54b4aa65598b50fb18368
45a5aaf Only call clear on prevector if it isn't trivially destructible and don't loop in clear (Jeremy Rubin)
aaa02e7 Add prevector destructor benchmark (Jeremy Rubin)
Tree-SHA512: 52bc8163b65b71310252f2d578349d0ddc364a6c23795c5e06e101f5449f04c96cbdca41c0cffb1974b984b8e33006471137d92b8dd4a81a98e922610a94132a
f00705a serialize: Deprecate `begin_ptr` / `end_ptr` (Wladimir J. van der Laan)
47314e6 prevector: add C++11-like data() method (Wladimir J. van der Laan)
swap was using an incorrect condition to determine when to apply an optimization
(not swapping the full direct[] when swapping two indirect prevectors).
Rather than correct the optimization I'm removing it for simplicity. Removing
this optimization minutely improves performance in the typical (currently only)
usage of member swap(), which is swapping with a freshly value-initialized
object.
Fixes a bug in which pop_back did not call the deleted item's destructor.
Using the most general erase() implementation to implement all the others
prevents similar bugs because the coupling between deallocation and destructor
invocation only needs to be maintained in one place.
Also reduces duplication of complex memmove logic.