Merge #14687: zmq: enable tcp keepalive

c276df775914e4e42993c76e172ef159e3b830d4 zmq: enable tcp keepalive (mruddy)

Pull request description:

  This addresses https://github.com/bitcoin/bitcoin/issues/12754.

  These changes enable node operators to address the silent dropping (by network middle boxes) of long-lived low-activity ZMQ TCP connections via further operating system level TCP keepalive configuration. For example, ZMQ sockets that publish block hashes can be affected in this way due to the length of time it sometimes takes between finding blocks (e.g.- sometimes more than an hour).

  Prior to this patch, operating system level TCP keepalive configurations would not take effect since the SO_KEEPALIVE option was not enabled on the underlying socket.

  There are additional ZMQ socket options related to TCP keepalive that can be set. However, I decided not to implement those options in this changeset because doing so would require adding additional bitcoin node configuration options, and would not yield a better outcome. I preferred a small, easily reviewable patch that doesn't add a bunch of new config options, with the tradeoff that the fine tuning would have to be done via well-documented operating system specific configurations.

  I tested this patch by running a node with:
  `./src/qt/bitcoin-qt -regtest -txindex -datadir=/tmp/node -zmqpubhashblock=tcp://127.0.0.1:28332 &`
  and connecting to it with:
  `python3 ./contrib/zmq/zmq_sub.py`

  Without these changes, `ss -panto | grep 28332 | grep ESTAB | grep bitcoin` will report no keepalive timer information. With these changes, the output from the prior command will show keepalive timer information consistent with the configuration at the time of connection establishment, e.g.-: `timer:(keepalive,119min,0)`.

  I also tested with a non-TCP transport and did not witness any adverse effects:
  `./src/qt/bitcoin-qt -regtest -txindex -datadir=/tmp/node -zmqpubhashblock=ipc:///tmp/bitcoin.block &`

ACKs for top commit:
  adamjonas:
    Just to summarize for those looking to review - as of c276df775914e4e42993c76e172ef159e3b830d4 there are 3 tACKs (n-thumann, Haaroon, and dlogemann), 1 "looks good to me" (laanwj) with no NACKs or any show-stopping concerns raised.
  jonasschnelli:
    utACK c276df775914e4e42993c76e172ef159e3b830d4

Tree-SHA512: b884c2c9814e97e666546a7188c48f9de9541499a11a934bd48dd16169a900c900fa519feb3b1cb7e9915fc7539aac2829c7806b5937b4e1409b4805f3ef6cd1
This commit is contained in:
Jonas Schnelli 2020-09-02 09:09:08 +02:00 committed by Pasta
parent 29132a1b16
commit 516cfb54c5
No known key found for this signature in database
GPG Key ID: 52527BEDABE87984
2 changed files with 22 additions and 0 deletions

View File

@ -127,6 +127,20 @@ ZMQ_SUBSCRIBE option set to one or either of these prefixes (for
instance, just `hash`); without doing so will result in no messages
arriving. Please see [`contrib/zmq/zmq_sub.py`](/contrib/zmq/zmq_sub.py) for a working example.
The ZMQ_PUB socket's ZMQ_TCP_KEEPALIVE option is enabled. This means that
the underlying SO_KEEPALIVE option is enabled when using a TCP transport.
The effective TCP keepalive values are managed through the underlying
operating system configuration and must be configured prior to connection establishment.
For example, when running on GNU/Linux, one might use the following
to lower the keepalive setting to 10 minutes:
sudo sysctl -w net.ipv4.tcp_keepalive_time=600
Setting the keepalive values appropriately for your operating environment may
improve connectivity in situations where long-lived connections are silently
dropped by network middle boxes.
## Remarks
From the perspective of dashd, the ZeroMQ socket is write-only; PUB

View File

@ -117,6 +117,14 @@ bool CZMQAbstractPublishNotifier::Initialize(void *pcontext)
return false;
}
const int so_keepalive_option {1};
rc = zmq_setsockopt(psocket, ZMQ_TCP_KEEPALIVE, &so_keepalive_option, sizeof(so_keepalive_option));
if (rc != 0) {
zmqError("Failed to set SO_KEEPALIVE");
zmq_close(psocket);
return false;
}
rc = zmq_bind(psocket, address.c_str());
if (rc != 0)
{