* More/better logging for InstantSend
* Implement CRecoveredSigsDb::TruncateRecoveredSig
* Truncate recovered sigs for ISLOCKs instead of completely removing them
This makes AlreadyHave() return true even when the recovered sig is deleted
locally. This avoids re-requesting and re-processing of old recovered sigs.
* Also truncate recovered sigs for freshly received ISLOCKs
* Fix comment
* Add .gitlab-ci.yml
* Use | instead of > for multiline commands
This honor new-lines and makes ; unnecessary
* Use ubuntu:bionic as base image
* Move cache initialization before apt-get installs
* Cache apt packages
* Move installation of wget and unzip up as we need it for the cache
* Prevent apt from deleting caches
* Collect test logs into artifact
* Make combine_logs.py always look for the template in the correct dir
* Move final cache stuff into after_script
* Reintroduce PYTHON_DEBUG=1, but only for .travis.yml
* Install jinja2 in Travis builder image
* Enable ChainLocks after quorums have been created
Creating 4 quorums causes a lot of blocks to be created and signed by
ChainLocks, which then causes timeouts later.
* Increase timeout in wallet-dump.py test
The first dumpwallet is quite slow sometimes, which then makes the
later called dumpwallet throw a wallet locked exception.
28479f926f21f2a91bec5a06671c60e5b0c55532 qa: Test bitcond shutdown (João Barbosa)
8d3f46ec3938e2ba17654fecacd1d2629f9915fd http: Remove timeout to exit event loop (João Barbosa)
e98a9eede2fb48ff33a020acc888cbcd83e24bbf http: Remove unnecessary event_base_loopexit call (João Barbosa)
6b13580f4e3842c11abd9b8bee7255fb2472b6fe http: Unlisten sockets after all workers quit (João Barbosa)
18e968581697078c36a3c3818f8906cf134ccadd http: Send "Connection: close" header if shutdown is requested (João Barbosa)
02e1e4eff6cda0bfc24b455a7c1583394cbff6eb rpc: Add wait argument to stop (João Barbosa)
Pull request description:
Fixes#11777. Reverts #11006. Replaces #13501.
With this change the HTTP server will exit gracefully, meaning that all requests will finish processing and sending the response, even if this means to wait more than 2 seconds (current time allowed to exit the event loop).
Another small change is that connections are accepted even when the server is stopping, but HTTP requests are rejected. This can be improved later, especially if chunked replies are implemented.
Briefly, before this PR, this is the order or events when a request arrives (RPC `stop`):
1. `bufferevent_disable(..., EV_READ)`
2. `StartShutdown()`
3. `evhttp_del_accept_socket(...)`
4. `ThreadHTTP` terminates (event loop exits) because there are no active or pending events thanks to 1. and 3.
5. client doesn't get the response thanks to 4.
This can be verified by applying
```diff
// Event loop will exit after current HTTP requests have been handled, so
// this reply will get back to the client.
StartShutdown();
+ MilliSleep(2000);
return "Bitcoin server stopping";
}
```
and checking the log output:
```
Received a POST request for / from 127.0.0.1:62443
ThreadRPCServer method=stop user=__cookie__
Interrupting HTTP server
** Exited http event loop
Interrupting HTTP RPC server
Interrupting RPC
tor: Thread interrupt
Shutdown: In progress...
torcontrol thread exit
Stopping HTTP RPC server
addcon thread exit
opencon thread exit
Unregistering HTTP handler for / (exactmatch 1)
Unregistering HTTP handler for /wallet/ (exactmatch 0)
Stopping RPC
RPC stopped.
Stopping HTTP server
Waiting for HTTP worker threads to exit
msghand thread exit
net thread exit
... sleep 2 seconds ...
Waiting for HTTP event thread to exit
Stopped HTTP server
```
For this reason point 3. is moved right after all HTTP workers quit. In that moment HTTP replies are queued in the event loop which keeps spinning util all connections are closed. In order to trigger the server side close with keep alive connections (implicit in HTTP/1.1) the header `Connection: close` is sent if shutdown was requested. This can be tested by
```
bitcoind -regtest
nc localhost 18443
POST / HTTP/1.1
Authorization: Basic ...
Content-Type: application/json
Content-Length: 44
{"jsonrpc": "2.0","method":"stop","id":123}
```
Summing up, this PR:
- removes explicit event loop exit — event loop exits once there are no active or pending events
- changes the moment the listening sockets are removed — explained above
- sends header `Connection: close` on active requests when shutdown was requested which is relevant when it's a persistent connection (default in HTTP 1.1) — libevent is aware of this header and closes the connection gracefully
- removes event loop explicit break after 2 seconds timeout
Tree-SHA512: 4dac1e86abe388697c1e2dedbf31fb36a394cfafe5e64eadbf6ed01d829542785a8c3b91d1ab680d3f03f912d14fc87176428041141441d25dcb6c98a1e069d8
11e0151 http: Remove numThreads and ThreadCounter (Wladimir J. van der Laan)
f946654 http: Remove WaitExit from WorkQueue (Wladimir J. van der Laan)
b1c2370 http: Join worker threads before deleting work queue (Wladimir J. van der Laan)
Pull request description:
This prevents a potential race condition if control flow ends up in
`ShutdownHTTPServer` before the thread gets to `queue->Run()`,
deleting the work queue while workers are still going to use it.
Meant to fix#12362.
Tree-SHA512: 8108514aeee5b2067a3736ed028014b580d1cbf8530ac7682b8a23070133dfa1ca21db4358c9158ea57e8811e0551395b6cb769887876b9cfce067ee968d0642
* Remove crownium files
* Remove trad theme files
* Remove drkblue theme files
* Remove light-retro theme files
* Remove old themes from optimize-pngs script
* Remove refs to old themes in Makefile.qt
* Remove more old theme file references
* Remove old themes from options dialog
* No need to care about themes for images and icons anymore
* Bring `trad` back
* Drop remaining `drkblue` references
Rename files that are actually used and drop no longer needed ones
* Use `wait_until` in sporks.py
* Drop unused spork functions in `dip3-deterministicmns.py`
* Use `wait_until` in multikeysporks.py
* scripted-diff: Rename `*_test_spork_state` to `*_test_spork_value` in multikeysporks.py to better match actual functionality
-BEGIN VERIFY SCRIPT-
sed -i 's/_test_spork_state/_test_spork_value/g' test/functional/multikeysporks.py
-END VERIFY SCRIPT-
* Introduce getprivatesendinfo and deprecate getpoolinfo
* s/ToJson/GetJsonInfo/
* s/queues/queue_size/
* Add TODO to explain `denomination` string convertion
* s/entries/entries_count/
* Use `CURRENCY_UNIT`
* Who needs safety belts after all
* Always check for expired queues on masternodes
* Check if a queue is too old or too far into the future
Instead of only checking that it's to old
* Check that no masternode can spam us with dsqs regardless of dsq readiness
* Drop `get_mnsync_status`, `wait_to_sync` and `sync_masternodes` and introduce `force_finish_mnsync` for MNs only
* Use `force_finish_mnsync` from util.py in dip3-deterministicmns.py and drop local unused functions
Also move the call, `force_finish_mnsync` should be called before `connect_nodes_bi`
This makes orphan processing work like handling getdata messages:
After every actual transaction validation attempt, interrupt
processing to deal with messages arriving from other peers.
4e955c5 Near-Bugfix: Reestablish consensus check removed in 8d7849b (Jorge Timón)
3e8c916 Introduce CheckInputsAndUpdateCoins static wrapper in txmempool.cpp (Jorge Timón)
832e074 Optimization: Minimize the number of times it is checked that no money is created (Jorge Timón)
3f0ee3e Proper indentation for CheckTxInputs and other minor fixes (Jorge Timón)
Pull request description:
...is created by individual transactions to 2 places (but call only once in each):
- ConnectBlock ( before calculated fees per txs twice )
- AcceptToMemoryPoolWorker ( before called CheckTxInputs 4 times and calculated
fees per tx one extra time )
Also call tx.GetValueOut() only once per call of CheckTxInputs (instead of 2)
For more motivation:
~~https://github.com/bitcoin/bitcoin/blob/master/src/main.cpp#L1493~~https://github.com/jtimon/bitcoin/compare/0.13-consensus-inputs...jtimon:0.13-consensus-inputs-comments
EDIT: partially replaces #6445
Near-Bugfix as pointed out in https://github.com/bitcoin/bitcoin/pull/8498#discussion_r124346132
Tree-SHA512: c71188e7c7c2425c9170ed7b803896755a92fd22f43b136eedaa6e554106696f0b10271d0ef0d0127c1eaafbc31d12eb19143df4f1b6882feecedf6ef05ea346
e97b113b04 [tests] Change invalidblockrequest to use BitcoinTestFramework (John Newbery)
2b7064eda7 [tests] Fix flake8 warnings in invalidblockrequest (John Newbery)
54b8c580b7 [test] Fix nits leftover from 11771 (Conor Scott)
Pull request description:
Builds on #11771. Please review that PR first
Next step in #10603.
- first commit tidies up invalidblockrequest.py
- second commit removes usage of ComparisonTestFramework
Tree-SHA512: 14b10c09c8c0ebef4a9176eb5b883a275d04c096785ee31b84ef594eed346ec6344d7ed32184c5fb397e744725df3911f45cdfadd0810e5a52eaa256084e3456
95e2e9a [tests] Change invalidtxrequest to use BitcoinTestFramework (John Newbery)
359d067 [tests] Fix flake8 warnings in invalidtxrequest (John Newbery)
c32cf9f [tests] Add P2PDataStore class (John Newbery)
cc046f6 [tests] Reduce NodeConn connection logging from info to debug (John Newbery)
Pull request description:
Next step in #10603
- first commit changes log level for an internal log from INFO to DEBUG. (Not really related, but I started finding the INFO level logging annoying when debuging test failures)
- second commit introduces a `P2PStub` class - a subclass of `NodeConnCB` which has its own block and tx store and responds appropriately to getdata requests. Not all the functionality is used in `invalidtxrequest.py`, but will be used in `invalidblockrequest.py` and `p2p-fullblocktest` when those are changed to use `BitcoinTestFramework`
- third commit tidies up `invalidtxrequest.py`
- fourth commit removes usage of `ComparisonTestFramework`
Tree-SHA512: f3085c73c15d6ce894e401490bce8a7fa7cf52b0c9d135ff7e351f1f6f517c99accab8588fcdc443f39ea8315329aaabd66b2baa32499df5a774737882030373
5c8ff26 [tests] Add NetworkThread assertions (John Newbery)
34e08b3 [tests] Fix network threading in functional tests (John Newbery)
74e64f2 [tests] Use network_thread_start() in tests. (John Newbery)
5fc6e71 [tests] Add network_thread_ utility functions. (John Newbery)
Pull request description:
Add assert that only one NetworkThread exists at any time in functional tests, and fix cases where that wasn't true.
fixes#11776
Tree-SHA512: fe5d1c59005f94bf66e11bb23ccf274b1cd9913741b56ea11dbcd21db4cc0b53b4413c0c4c16dbcd6ac611adad5e5cc2baaa39720598ce7b6393889945d06298
faaa7db qa: Only allow disconnecting all NodeConns (MarcoFalke)
Pull request description:
Disconnecting the connection with `index=0` makes no sense when there are more than one connections, as the list "rotates around" and populates index 0 after `del`.
Just disconnect all NodeConns in any case.
Tree-SHA512: e5cf540823fccb31634b5a11501f54222be89862e80ccafc28bc06726480f8d2153b8c1b6f859fa6a6d087876251d48a6c6035bccdaaf16831e300bc17ff613d
32ae82f5c [tests] use TestNode p2p connection in tests (John Newbery)
5e5725cc2 [tests] Add p2p connection to TestNode (John Newbery)
b86c1cd20 [tests] fix TestNode.__getattr__() method (John Newbery)
Pull request description:
Final two steps of #10082 : Adding the "mininode" P2P interface to `TestNode`
This PR adds the mininode P2P interface to `TestNode`. It simplifies the process for opening a P2P connection to the node-under-test from this:
```python
node0 = NodeConnCB()
connections = []
connections.append(NodeConn('127.0.0.1', p2p_port(0), self.nodes[0], node0))
node0.add_connection(connections[0])
```
to this:
```python
self.nodes[0].add_p2p_connection(p2p_conn_type=NodeConnCB)
```
The first commit adds the infrastructure to `test_node.py`. The second updates the individual test cases to use it. Can be separated if this is too much review for one PR.
Tree-SHA512: 44f1a6320f44eefc70489ae8350c6a16ad1a6035e4b9b7bafbdf19f5905ed0e2db85beaaf4758eec3059dd89a375a47a45352a029f39f57a86ab38a9ae66650e
86b47fa741408b061ab0bda784b8678bfd7dfa88 speed up Unserialize_impl for prevector (Akio Nakamura)
Pull request description:
The unserializer for prevector uses `resize()` for reserve the area, but it's prefer to use `reserve()` because `resize()` have overhead to call its constructor many times.
However, `reserve()` does not change the value of `_size` (a private member of prevector).
This PR make the logic of read from stream to callback function, and prevector handles initilizing new values with that call-back and ajust the value of `_size`.
The changes are as follows:
1. prevector.h
Add a public member function named 'append'.
This function has 2 params, number of elemenst to append and call-back function that initilizing new appended values.
2. serialize.h
In the following two function:
- `Unserialize_impl(Stream& is, prevector<N, T>& v, const unsigned char&)`
- `Unserialize_impl(Stream& is, prevector<N, T>& v, const V&)`
Make a callback function from each original logic of reading values from stream, and call prevector's `append()`.
3. test/prevector_tests.cpp
Add a test for `append()`.
## A benchmark result is following:
[Machine]
MacBook Pro (macOS 10.13.3/i7 2.2GHz/mem 16GB/SSD)
[result]
DeserializeAndCheckBlockTest => 22% faster
DeserializeBlockTest => 29% faster
[before PR]
# Benchmark, evals, iterations, total, min, max, median
DeserializeAndCheckBlockTest, 60, 160, 94.4901, 0.0094644, 0.0104715, 0.0098339
DeserializeBlockTest, 60, 130, 65.0964, 0.00800362, 0.00895134, 0.00824187
[After PR]
# Benchmark, evals, iterations, total, min, max, median
DeserializeAndCheckBlockTest, 60, 160, 77.1597, 0.00767013, 0.00858959, 0.00805757
DeserializeBlockTest, 60, 130, 49.9443, 0.00613926, 0.00691187, 0.00635527
ACKs for top commit:
laanwj:
utACK 86b47fa741408b061ab0bda784b8678bfd7dfa88
Tree-SHA512: 62ea121ccd45a306fefc67485a1b03a853435af762607dae2426a87b15a3033d802c8556e1923727ddd1023a1837d0e5f6720c2c77b38196907e750e15fbb902
5aad635 Use memset() to optimize prevector::resize() (Evan Klitzke)
e46be25 Reduce redundant code of prevector and speed it up (Akio Nakamura)
f0e7aa7 Add new prevector benchmarks. (Evan Klitzke)
Pull request description:
This branch optimizes various `prevector` operations, especially resizing vectors. While profiling the `loadblk` thread I noticed that a lot of time was being spent in `prevector::resize()` which led to this work. I have some data here indicating that it takes up **37%** of the time in `ReadBlockFromDisk()`: https://monad.io/readblockfromdisk.svg
This branch improves things significantly. For trivial types, the new results for the prevector benchmark are:
* `PrevectorClearTrivial` which tests `prevector::clear()` becomes 24.6x faster
* `PrevectorDestructorTrivial` which tests `prevector::~prevector()` becomes 20.5x faster
* `PrevectorResizeTrivial` which tests `prevector::resize()` becomes 20.3x faster
Note that in practice it looks like the prevector is only used to contain `unsigned char` types, which is a trivial type. The benchmarks are testing a bit of an extreme case, but the changes here are motivated by the profiling data for `ReadBlockFromDisk()` I linked to above.
The pull request here consists of a series of three commits:
* The first adds new benchmarks but does not change the prevector code.
* The second is from @AkioNak , and merges some prevector optimizations he submitted in #11988
* The third optimizes `prevector::resize()` to use `memset()` when the prevector contains trivially constructible types
Tree-SHA512: 28f7cbb91a19f9f43b6a5942781d7eb2e3197389186b666f086b69df12bee37773140f765426d715bfb8ebff79cb27a5f1206d0325b54b4aa65598b50fb18368
* Add timeout params to wait_for*_chainlock methods
* Give chainlocks more time in specific case
* Add logs to llmq-chainlock.py
* Replace wait_for_chainlocked_tip_all_nodes with wait_for_chainlocked_block_all_nodes
wait_for_chainlocked_tip_all_nodes did wait for the tip of each individual
node, which would not necessarily be the same. We should only allow to
explicitly specify which block to wait for.
* Get rid of wait_for_chainlocked_tip
Same as with wait_for_chainlocked_tip_all_nodes
This speeds up assumevalid.py from 22s to 7s on my machine. On travis, this
should be an improvement of a few minutes. Without this, Travis actually
fails due to block download timeouts.
* scripted-diff: Rename `wait_for_chainlock*` test functions
-BEGIN VERIFY SCRIPT-
sed -i 's/wait_for_chainlock_tip_all_nodes(/wait_for_chainlocked_tip_all_nodes(/g' test/functional/*.py
sed -i 's/wait_for_chainlock_tip(/wait_for_chainlocked_tip(/g' test/functional/*.py
sed -i 's/wait_for_chainlock(/wait_for_chainlocked_block(/g' test/functional/*.py
sed -i 's/wait_for_chainlock /wait_for_chainlocked_block /g' test/functional/*.py
-END VERIFY SCRIPT-
* Move `wait_for_*chainlock*` functions from individual tests to DashTestFramework
* Use `wait_until` in most Dash-specific `wait_for*` functions instead of custom timers
63c16ed50770bc3d4f0ecd2ffa971fcfa0688494 Use __cpuid_count for gnu C to avoid gitian build fail. (Chun Kuan Lee)
Pull request description:
Fixes#13538
Tree-SHA512: 161ae4db022288ae8631a166eaea2d08cf2c90bcd27218a094a754276de30b92ca9cfb5a79aa899c5a9d0534c5d7261037e7e915e1b92bc7067ab1539dc2b51e
4207c1b35c configure: Initialise assembly enable_* variables (Luke Dashjr)
afe0875577 configure: Skip assembly support checks, when assembly is disabled (Luke Dashjr)
d8ab8dc12d configure: Invert --enable-asm help string since default is now enabled (Luke Dashjr)
Pull request description:
Fixes#13759
Also inverts the help (so it shows `--disable-asm` like other enabled-by-default options, and initialises the flag variables.
ACKs for commit 4207c1:
laanwj:
makes sense, utACK 4207c1b35c2e2ee1c9217cc7db3290a24c3b4b52
achow101:
utACK 4207c1b35c2e2ee1c9217cc7db3290a24c3b4b52
ken2812221:
ACK 4207c1b35c2e2ee1c9217cc7db3290a24c3b4b52
practicalswift:
tACK 4207c1b35c2e2ee1c9217cc7db3290a24c3b4b52
Tree-SHA512: a30be1008fd8f019db34073f78e90a3c4ad3767d88d7c20ebb83e99c7abc23552f7da3ac8bd20f727405799aff1ecb6044cf869653f8db70478a074d0b877e0a
66b2cf1ccfad545a8ec3f2a854e23f647322bf30 Use immintrin.h everywhere for intrinsics (Pieter Wuille)
4c935e2eee456ff66cdfb908b0edffdd1e8a6c04 Add SHA256 implementation using using Intel SHA intrinsics (Pieter Wuille)
268400d3188200c9e3dcd3482c4853354388a721 [Refactor] CPU feature detection logic for SHA256 (Pieter Wuille)
Pull request description:
Based on #13191.
This adds SHA256 implementations that use Intel's SHA Extension instructions (using intrinsics). This needs GCC 4.9 or Clang 3.4.
In addition to #13191, two extra implementations are provided:
* (a) A variable-length SHA256 implementation using SHA extensions.
* (b) A 2-way 64-byte input double-SHA256 implementation using SHA extensions.
Benchmarks for 9001-element Merkle tree root computation on an AMD Ryzen 1800X system:
* Using generic C++ code (pre-#10821): 6.1ms
* Using SSE4 (master, #10821): 4.6ms
* Using 4-way SSE4 specialized for 64-byte inputs (#13191): 2.8ms
* Using 8-way AVX2 specialized for 64-byte inputs (#13191): 2.1ms
* Using 2-way SHA-NI specialized for 64-byte inputs (this PR): 0.56ms
Benchmarks for 32-byte SHA256 on the same system:
* Using SSE4 (master, #10821): 190ns
* Using SHA-NI (this PR): 53ns
Benchmarks for 1000000-byte SHA256 on the same system:
* Using SSE4 (master, #10821): 2.5ms
* Using SHA-NI (this PR): 0.51ms
Tree-SHA512: 2b319e33b22579f815d91f9daf7994a5e1e799c4f73c13e15070dd54ba71f3f6438ccf77ae9cbd1ce76f972d9cbeb5f0edfea3d86f101bbc1055db70e42743b7
57ba401abcfe564a2c4d259e0f758401ed74616d Enable double-SHA256-for-64-byte code on 32-bit x86 (Pieter Wuille)
Pull request description:
The SSE4 and AVX2 double-SHA256-for-64-byte input code from #13191 compiles fine on 32-bit x86 systems, but the autodetection logic in sha256.cpp doesn't enable it. Fix this.
Note that these instruction sets are only available on CPUs that support 64-bit mode as well, so it is only beneficial in the (perhaps unlikely) scenario where a 64-bit CPU is running a 32-bit Bitcoin Core binary.
Tree-SHA512: 39d5963c1ba8c33932549d5fe98bd184932689a40aeba95043eca31dd6824f566197c546b60905555eccaf407408a5f0f200247bb0907450d309b0a70b245102
32d153fa360f73b4999701b97d55b12318fd2659 For AVX2 code, also check for AVX, XSAVE, and OS support (Pieter Wuille)
Pull request description:
Fixes#12903.
Tree-SHA512: 01e71efb5d3a43c49a145a5b1dc4fe7d0a491e1e78479e7df830a2aaac57c3dcfc316e28984c695206c76f93b68e4350fc037ca36756ca579b7070e39c835da2
1e1eb6367f67dcf968bb62993b98b5873b926fc0 Improve coverage of SHA256 SelfTest code (Pieter Wuille)
Pull request description:
The existing SelfTest code does not cover the specialized double-SHA256-for-64-byte-inputs transforms added in #13191. Fix this.
Tree-SHA512: 593c7ee5dc9e77fc4c89e0a7753a63529b0d3d32ddbc015ae3895b52be77bee8a80bf16b754b30a22c01625a68db83fb77fa945a543143542bebb5b0f017ec5b
f68049dd879c216d1e98b6635eec488f8e936ed4 crypto: cleanup sha256 build (Cory Fields)
Pull request description:
Requested by @sipa in #13386.
Rather than appending all possible cpu variants to all targets, create a convenience variable that encompasses all.
Tree-SHA512: 8e9ab2185515672b79bb7925afa4f3fbfe921bfcbe61456833d15457de4feba95290de17514344ce42ee81cc38b252476cd0c29432ac48c737c2225ed515a4bd
4defdfab94504018f822dc34a313ad26cedc8255 [MOVEONLY] Move unused Merkle branch code to tests (Pieter Wuille)
4437d6e1f3107a20a8c7b66be8b4b972a82e3b28 8-way AVX2 implementation for double SHA256 on 64-byte inputs (Pieter Wuille)
230294bf5fdeba7213471cd0b795fb7aa36e5717 4-way SSE4.1 implementation for double SHA256 on 64-byte inputs (Pieter Wuille)
1f0e7ca09c9d7c5787c218156fa5096a1bdf2ea8 Use SHA256D64 in Merkle root computation (Pieter Wuille)
d0c96328833127284574bfef26f96aa2e4afc91a Specialized double sha256 for 64 byte inputs (Pieter Wuille)
57f34630fb6c3e218bd19535ac607008cb894173 Refactor SHA256 code (Pieter Wuille)
0df017889b4f61860092e1d54e271092cce55f62 Benchmark Merkle root computation (Pieter Wuille)
Pull request description:
This introduces a framework for specialized double-SHA256 with 64 byte inputs. 4 different implementations are provided:
* Generic C++ (reusing the normal SHA256 code)
* Specialized C++ for 64-byte inputs, but no special instructions
* 4-way using SSE4.1 intrinsics
* 8-way using AVX2 intrinsics
On my own system (AVX2 capable), I get these benchmarks for computing the Merkle root of 9001 leaves (supported lengths / special instructions / parallellism):
* 7.2 ms with varsize/naive/1way (master, non-SSE4 hardware)
* 5.8 ms with size64/naive/1way (this PR, non-SSE4 capable systems)
* 4.8 ms with varsize/SSE4/1way (master, SSE4 hardware)
* 2.9 ms with size64/SSE4/4way (this PR, SSE4 hardware)
* 1.1 ms with size64/AVX2/8way (this PR, AVX2 hardware)
Tree-SHA512: efa32d48b32820d9ce788ead4eb583949265be8c2e5f538c94bc914e92d131a57f8c1ee26c6f998e81fb0e30675d4e2eddc3360bcf632676249036018cff343e
538cc0ca8 build: Mention use of asm in summary (Wladimir J. van der Laan)
ce5381e7f build: Rename --enable-experimental-asm to --enable-asm and enable by default (Wladimir J. van der Laan)
Pull request description:
Now that 0.15 is branched off, enable assembler SHA256 optimizations by default, but still allow disabling them, for example if something goes wrong with auto-detection on a platform.
Also add mention of the use of asm in the configure summary.
Tree-SHA512: cd20c497f65edd6b1e8b2cc3dfe82be11fcf4777543c830ccdec6c10f25eab4576b0f2953f3957736d7e04deaa4efca777aa84b12bb1cecb40c258e86c120ec8