From c04cc39a03d34dea75c4024e72e93d6d8ad640df Mon Sep 17 00:00:00 2001
From: Henry de Valence
Date: Wed, 2 Dec 2020 12:08:47 -0800
Subject: [PATCH] state: dodge a bug in zcashd

Zcashd will blindly request more block headers as long as it got 160
block headers in response to a previous query, EVEN IF THOSE HEADERS
ARE ALREADY KNOWN. To dodge this behavior, return slightly fewer than
the maximum, to get it to go away.

https://github.com/zcash/zcash/blob/0ccc885371e01d844ebeced7babe45826623d9c2/src/main.cpp#L6274-L6280

Without this change, communication between a partially-synced `zebrad`
and fully-synced `zcashd` looked like this:

1. `zebrad` connects to `zcashd`, which sends an initial `getheaders`
   request;
2. `zebrad` correctly computes the intersection of the provided block
   locator with the node's current chain and returns 160 following
   headers;
3. `zcashd` does not check whether it already has those headers and
   assumes that any provided headers are new and re-validates them;
4. `zcashd` assumes that because `zebrad` responded with 160 headers,
   the `zebrad` node is ahead of it, and requests the next 160 headers.
5. Because block locators are sparse, the intersection between the
   `zcashd` and `zebrad` chains is likely well behind the `zebrad` tip,
   so this process continues for thousands of blocks.

To avoid this problem, we return slightly fewer than the protocol
maximum (158 rather than 160, to guard against off-by-one errors in
zcashd). This does not interfere with use of the returned headers by
peers that check the headers, but does prevent `zcashd` from trying to
download thousands of block headers it already has.

This problem does not occur in the `zcashd<->zcashd` case only because
`zcashd` does not respond to `getheaders` messages while it is syncing.
However, implementing this behavior in Zebra would be more complicated,
because we don't have a distinct "initial block sync" state (we do
poll-based syncing continuously) and we don't have shared global
variables to modify to set that state.

Relevant links (thanks @str4d):

- The PR that introduced this behavior:
  https://github.com/bitcoin/bitcoin/pull/4468/files#r17026905
- https://github.com/bitcoin/bitcoin/issues/6861
- https://github.com/bitcoin/bitcoin/issues/6755
- https://github.com/bitcoin/bitcoin/pull/8306#issuecomment-614916454
---
 zebra-state/src/service.rs | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/zebra-state/src/service.rs b/zebra-state/src/service.rs
index be9de897..eb57e7d7 100644
--- a/zebra-state/src/service.rs
+++ b/zebra-state/src/service.rs
@@ -639,8 +639,14 @@ impl Service for StateService {
             }
             Request::FindBlockHeaders { known_blocks, stop } => {
                 const MAX_FIND_BLOCK_HEADERS_RESULTS: usize = 160;
-                let res =
-                    self.find_chain_hashes(known_blocks, stop, MAX_FIND_BLOCK_HEADERS_RESULTS);
+                // Zcashd will blindly request more block headers as long as it
+                // got 160 block headers in response to a previous query, EVEN
+                // IF THOSE HEADERS ARE ALREADY KNOWN. To dodge this behavior,
+                // return slightly fewer than the maximum, to get it to go away.
+                //
+                // https://github.com/bitcoin/bitcoin/pull/4468/files#r17026905
+                let count = MAX_FIND_BLOCK_HEADERS_RESULTS - 2;
+                let res = self.find_chain_hashes(known_blocks, stop, count);
                 let res: Vec<_> = res
                     .iter()
                     .map(|&hash| {
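
As an illustration of why capping the reply works, here is a minimal Rust
sketch of the requester loop described in the commit message. It is not
zcashd or Zebra code: `reply_from_peer` and `rounds_until_stop` are
hypothetical names, and the 1000-round cap exists only so the demo
terminates.

    // Minimal sketch of the behavior described above, assuming a requester
    // that keeps asking for headers while the previous reply was a full batch.
    const MAX_FIND_BLOCK_HEADERS_RESULTS: usize = 160;

    /// Hypothetical stand-in for a `getheaders` round trip: returns how many
    /// headers the peer sent back.
    fn reply_from_peer(capped: bool) -> usize {
        if capped {
            // With this patch, Zebra replies with 160 - 2 = 158 headers.
            MAX_FIND_BLOCK_HEADERS_RESULTS - 2
        } else {
            // Without it, Zebra replies with a full batch of 160.
            MAX_FIND_BLOCK_HEADERS_RESULTS
        }
    }

    /// Counts how many requests the zcashd-style loop makes before stopping,
    /// with a safety limit so this demo terminates.
    fn rounds_until_stop(capped: bool) -> usize {
        let mut rounds = 0;
        loop {
            rounds += 1;
            // zcashd's condition: a full batch means "the peer may have more",
            // so it asks again, even if every header was already known.
            if reply_from_peer(capped) < MAX_FIND_BLOCK_HEADERS_RESULTS || rounds >= 1000 {
                break;
            }
        }
        rounds
    }

    fn main() {
        // Full batches never fail the check, so only the demo limit stops it;
        // a real zcashd keeps going for thousands of blocks.
        println!("full batches of 160: {} requests", rounds_until_stop(false));
        // A 158-header reply fails the check immediately.
        println!("capped at 158:       {} request(s)", rounds_until_stop(true));
    }

Running it prints 1000 requests for the full-batch case (the demo limit) and
1 for the capped case, matching the effect of
`MAX_FIND_BLOCK_HEADERS_RESULTS - 2` in the patch.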