* Updates ADDR_RESPONSE_LIMIT_DENOMINATOR to 4
* Moves the logic that gets a fraction of Zebra's peers to a method in the address book
* Adds and uses CachedPeerAddrs struct in inbound service
* moves and documents constant
* fixes test
* Apply suggestions from code review
Co-authored-by: teor <teor@riseup.net>
* updates docs
* renames sanitized_window method
* renames CachedPeerAddrs to CachedPeerAddrResponse
* updates test
* moves try_refresh to per request
* Make unused sanitization method pub(crate)
* Apply suggestions from code review
Co-authored-by: teor <teor@riseup.net>
* moves CachedPeerAddrResponse to a module
* updates unit test
* fixes unit test
* removes unnecessary condition
* clears cached getaddr response if it can't refresh for over a minute after the refresh time
* tests that inbound service gives out the same addresses for every Peers request before the refresh interval
* Applies suggestion from code review
* fixes doc link
* renames constant
* Fix docs on new constant
* applies suggestion from code review
* uses longer cache expiry time
* Adds code comments
---------
Co-authored-by: teor <teor@riseup.net>
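A minimal sketch of the caching behaviour described by the commits above; the struct name matches the commits, but the fields, constants, and method shape here are assumptions:
```rust
use std::time::{Duration, Instant};

// Assumed values; the real constants live in zebra-network.
const REFRESH_INTERVAL: Duration = Duration::from_secs(10 * 60);
const EXPIRY_GRACE: Duration = Duration::from_secs(60);

struct CachedPeerAddrResponse {
    cached_addrs: Vec<PeerAddr>,
    refresh_time: Instant,
}

impl CachedPeerAddrResponse {
    /// Runs once per `Peers` request: every request inside the refresh
    /// interval gets the same cached addresses.
    fn try_refresh(&mut self, now: Instant, fresh_addrs: Option<Vec<PeerAddr>>) {
        if now < self.refresh_time {
            return;
        }
        match fresh_addrs {
            Some(addrs) => {
                self.cached_addrs = addrs;
                self.refresh_time = now + REFRESH_INTERVAL;
            }
            // Clear the cached response if we still can't refresh it
            // more than a minute after the refresh time.
            None if now > self.refresh_time + EXPIRY_GRACE => self.cached_addrs.clear(),
            None => {}
        }
    }
}

/// Placeholder for the real sanitized peer address type.
struct PeerAddr;
```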
* Add a PeerSocketAddr type which hides its IP address, but shows the port
* Manually replace SocketAddr with PeerSocketAddr where needed
```sh
fastmod SocketAddr PeerSocketAddr zebra-network
```
* Add missing imports
* Make converting into PeerSocketAddr easier
* Fix some unused imports
* Add a canonical_peer_addr() function
* Fix connection handling for PeerSocketAddr
* Fix serialization for PeerSocketAddr
* Fix tests for PeerSocketAddr
* Remove some unused imports
* Fix address book listener handling
* Remove redundant imports and conversions
* Update outdated IPv4-mapped IPv6 address code
* Make addresses canonical when deserializing
* Stop logging peer addresses in RPC code
* Update zebrad tests with new PeerSocketAddr type
* Update zebra-rpc tests with new PeerSocketAddr type
---------
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
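A sketch of the `PeerSocketAddr` idea from the commits above: a newtype around `SocketAddr` whose formatting redacts the IP address but keeps the port (the exact redaction format is an assumption):
```rust
use std::{fmt, net::SocketAddr, ops::Deref};

#[derive(Copy, Clone, PartialEq, Eq, Hash)]
struct PeerSocketAddr(SocketAddr);

impl fmt::Debug for PeerSocketAddr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Hide the peer's IP address, but show the port.
        write!(f, "PeerSocketAddr(x.x.x.x:{})", self.0.port())
    }
}

// Deref keeps the conversion from SocketAddr call sites mostly painless.
impl Deref for PeerSocketAddr {
    type Target = SocketAddr;
    fn deref(&self) -> &SocketAddr {
        &self.0
    }
}

fn main() {
    let addr: SocketAddr = "192.0.2.7:8233".parse().unwrap();
    // Logs and error messages can't leak the peer's IP by accident:
    println!("{:?}", PeerSocketAddr(addr)); // PeerSocketAddr(x.x.x.x:8233)
}
```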
* fix(network): split synthetic NotFoundRegistry from message NotFoundResponse
* docs(network): Improve `notfound` message documentation
* refactor(network): Rename MustUseOneshotSender to MustUseClientResponseSender
```sh
fastmod MustUseOneshotSender MustUseClientResponseSender zebra*
```
* docs(network): fix a comment typo
* refactor(network): remove generics from MustUseClientResponseSender
* refactor(network): add an inventory collector to Client, but don't use it yet
* feat(network): register missing peer responses as missing inventory
We register this missing inventory based on peer responses, connection errors, or timeouts.
Inbound message inventory tracking requires peers to send `notfound` messages.
But `zcashd` skips `notfound` for blocks, so we can't rely on peer messages.
This missing inventory tracking works regardless of peer `notfound` messages.
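A rough sketch of that registry, with assumed names and shape; the point is that the evidence can be a `notfound` message, a partial response, a connection error, or a timeout:
```rust
use std::collections::HashMap;
use std::time::Instant;

struct MissingInventoryRegistry {
    /// When each (inventory, peer) pair was last recorded as missing.
    missing: HashMap<(InventoryHash, PeerId), Instant>,
}

impl MissingInventoryRegistry {
    /// Register `inv` as missing on `peer`, regardless of how we found out.
    fn register_missing(&mut self, inv: InventoryHash, peer: PeerId) {
        self.missing.insert((inv, peer), Instant::now());
    }

    /// Request routing can then avoid peers that recently lacked `inv`.
    fn is_missing(&self, inv: &InventoryHash, peer: &PeerId) -> bool {
        self.missing.contains_key(&(inv.clone(), peer.clone()))
    }
}

#[derive(Clone, PartialEq, Eq, Hash)]
struct InventoryHash([u8; 32]);
#[derive(Clone, PartialEq, Eq, Hash)]
struct PeerId(u64);
```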
* refactor(network): rename ResponseStatus to InventoryResponse
```sh
fastmod ResponseStatus InventoryResponse zebra*
```
* refactor(network): rename InventoryStatus::inner() to to_inner()
* fix(network): remove a redundant runtime.enter() in a test
* doc(network): the exact time used to filter outbound peers doesn't matter
* fix(network): handle block requests slightly more efficiently
* doc(network): fix a typo
* fmt(network): `cargo fmt` after rename ResponseStatus to InventoryResponse
* doc(test): clarify some test comments
* test(network): test synthetic notfound from connection errors and peer inventory routing
* test(network): improve inbound test diagnostics
* feat(network): add a proptest-impl feature to zebra-network
* feat(network): add a test-only connect_isolated_with_inbound function
* test(network): allow a response on the isolated peer test connection
* test(network): fix failures in test synthetic notfound
* test(network): Simplify SharedPeerError test assertions
* test(network): test synthetic notfound from partially successful requests
* test(network): MissingInventoryCollector ignores local NotFoundRegistry errors
* fix(network): decrease the inventory rotation interval
This stops us waiting 3-4 sync resets (4 minutes) before we retry a missing block.
Now we wait 1-2 sync resets (2 minutes), which is still a reasonable rate limit.
This should speed up syncing near the tip, and on testnet.
* fmt(network): cargo fmt --all
* cleanup(network): remove unnecessary allow(dead_code)
* cleanup(network): stop importing the whole sync module into tests
* doc(network): clarify syncer inventory retry constraint
* doc(network): add a TODO for a fix to ensure API behaviour remains consistent
* doc(network): fix a function doc typo
* doc(network): clarify how we handle peers that don't send `notfound`
* docs(network): clarify a test comment
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* refactor(network): rename Advertised to Available
```sh
fastmod Advertised Available zebra*
fastmod advertised available zebra*
```
* refactor(network): allow different available and missing types inside an InventoryStatus
And rename it to ResponseStatus.
Split the methods between ResponseStatus and an InventoryStatus alias.
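The resulting shape is roughly the following sketch (method split omitted):
```rust
/// Sketch: the two variants can now carry different payload types.
enum ResponseStatus<A, M> {
    Available(A),
    Missing(M),
}

/// The original symmetric case becomes an alias where both types match.
type InventoryStatus<T> = ResponseStatus<T, T>;
```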
* refactor(network): add a block_hash convenience method to InventoryHash
* test(network): improve failure logs for connection tests
* fix(inbound): move address sanitization into the response future
* feat(network): send notfound when Zebra doesn't have a block or transaction
* doc(network): move module docs to the top of each module
This makes them more likely to get updated when the module changes.
* fix(network): stop sending unsupported missing inventory types to the registry
* test(network): inbound messages are forwarded to the registry
* test(inbound): test Peers requests to the inbound service, directly and via TCP
* test(network): notfound block responses are sent by the inbound service
* test(network): notfound tx responses are sent by the inbound service
* test(network): increase sync test mock service timeout
The code that these tests use hasn't actually changed much,
and they are only failing on some platforms (coverage, macOS).
So it seems like the extra concurrent inbound tests have pushed them
past their time limit.
(Perhaps due to TCP system calls, or extra serialization work.)
* doc(network): fix typo
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* test(network): remove unnecessary multi-threaded runtime from tests
This prevents `MockService<zebra_state>` timeouts
in the `sync_block_too_high_extend_tips` test,
at the cost of reducing coverage of different execution orders.
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Refactor setup of `Connection` test vectors
Add a `new_test_connection` helper function to create a `Connection`
instance that's ready for testing.
* Check that no inbound requests are sent
Return the mock inbound service from `new_test_connection` and assert
that no requests were sent to it in any test.
* Replace `&mut Vec<u8>` with an `mpsc` channel
Make it easier to run the connection task in the background, i.e.,
remove any lifetime constraints from the borrowed buffer so that
`Connection` is `'static`.
It's now also easier to assert on individual messages sent from the
`Connection` instance.
* Make `MockServiceBuilder::finish` public
Allow test functions to be generic when creating a `MockService`, so
that caller functions actually determine the type of the `MockService`
assertions.
* Move `new_test_connection` to parent module
Make it more generic so that it can be used later in property tests as
well.
* Derive `Eq` and `PartialEq` for network `Response`
Allow intercepted `Response` instances to be easily compared in tests.
* Test block request cancel causes an error cascade
This is the scenario that caused the block synchronizer to reset every
few minutes, which made it considerably slower.
* Ignore unexpected block responses
It's likely just a response to a previously cancelled block request.
* Stop holding completed messages until the next inbound message
* Add more info to network message block download debug logs
* Simplify address metrics logs
* Try handling inbound messages as responses, then try as a new request
* Improve address book logging
* Fix a race between the first heartbeat and getaddr requests
* Temporarily reduce the getaddr fanout to 1
* Update metrics when exiting the Connection run loop
* Downgrade some debug logs to trace
* Use a named CancelHeartbeatTask unit struct for the channel type
* Prefer cancel handles in selects, if both are ready
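A minimal sketch of the "prefer cancel handles" idea: `futures::future::select` polls its first argument first, so passing the cancel handle first makes it win when both futures are ready (the function here is illustrative, not Zebra's actual code):
```rust
use futures::future::{self, Either};
use std::future::Future;

async fn run_with_cancel<C, F, T>(cancel: C, request: F) -> Option<T>
where
    C: Future<Output = ()> + Unpin,
    F: Future<Output = T> + Unpin,
{
    // The cancel handle is the first argument, so it is polled first
    // and preferred when both futures are ready at the same time.
    match future::select(cancel, request).await {
        Either::Left(((), _request)) => None,
        Either::Right((value, _cancel)) => Some(value),
    }
}
```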
* Fix message metrics to just show the command name
* Add metrics for internal requests and responses
* Add internal requests and responses to the messages dashboard
* Add a canceled metric, and peer addresses to request and response metrics
* Add a canceled messages graph
* Add connection state metrics for currently open connections
* Fix the connection state graph with new metrics
* Always send an error before dropping pending responses
* Move error detail logging into `fail_with`
* Delete an unused timer future
* Make error strings in metrics less verbose
* Downgrade some error logs to info
* Remove a redundant expect
* Avoid unnecessary allocations for connection state metrics
* Fix missed updates to mempool and block gossip metrics
* Rename internal network requests for wide transaction IDs
```sh
fastmod TransactionsByHash TransactionsById zebra*
fastmod AdvertiseTransactions AdvertiseTransactionIds zebra*
fastmod MempoolTransactions MempoolTransactionIds zebra*
fastmod TransactionHashes TransactionIds zebra*
```
* Update network transaction request/response comments
* Rename a transaction hash method for wide transaction IDs
```sh
fastmod transaction_hashes transaction_ids zebra-network
```
* Add UnminedTxId methods and conversions for InventoryHash
* Map WtxIds to unmined transaction network messages
Also, use UnminedTxId and UnminedTx in:
* Zebra's internal request and response format, and
* external Zcash network protocol messages.
* Enable WtxId mempool inventory tracking for peers
* Further clarify transaction IDs
* Use Witnessed rather than Wide for transaction IDs
And rename narrow to legacy when it only applies to v1-v4 transactions.
Otherwise, rename it to mined ID.
* Rename a missed binding
* Remove an incorrectly named binding
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
We modeled a Bitcoin `headers` message as being a list of block headers.
However, the actual data structure is slightly different: it's a list of (block
header, transaction count) pairs. This caused zcashd to reject our headers
messages.
To fix this, introduce a new `CountedHeader` struct with a `block::Header` and
transaction count `usize`, then thread it through the inbound service and the
state.
I tested this locally by running Zebra with these changes and inspecting a
trace-level log of the span of a peer connection that requested a nontrivial
headers packet from us, and verified that it did not reject our message.
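A sketch of the struct described above; the commit fixes the serialized layout, and the field names here are assumptions:
```rust
/// One entry in a `headers` message: a block header followed by its
/// transaction count, which is always zero because `headers` messages
/// carry no transaction data.
struct CountedHeader {
    header: BlockHeader,
    transaction_count: usize,
}

/// Placeholder standing in for `zebra_chain::block::Header`.
struct BlockHeader {
    // version, previous block hash, merkle root, time, Equihash solution, ...
}
```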
This commit makes several related changes to the network code:
- adds a `TransactionsByHash(HashSet<transaction::Hash>)` request and
`Transactions(Vec<Arc<Transaction>>)` response pair that allows
fetching transactions from a remote peer;
- adds a `PushTransaction(Arc<Transaction>)` request that pushes an
unsolicited transaction to a remote peer;
- adds an `AdvertiseTransactions(HashSet<transaction::Hash>)` request
that advertises transactions by hash to a remote peer;
- adds an `AdvertiseBlock(block::Hash)` request that advertises a block
by hash to a remote peer;
Then, it modifies the connection state machine so that outbound
requests to remote peers are handled properly:
- `TransactionsByHash` generates a `getdata` message and collects the
results, like the existing `BlocksByHash` request.
- `PushTransaction` generates a `tx` message, and returns `Nil` immediately.
- `AdvertiseTransactions` and `AdvertiseBlock` generate an `inv`
message, and return `Nil` immediately.
Next, it modifies the connection state machine so that messages
from remote peers generate requests to the inbound service:
- `getdata` messages generate `BlocksByHash` or `TransactionsByHash`
requests, depending on the content of the message;
- `tx` messages generate `PushTransaction` requests;
- `inv` messages generate `AdvertiseBlock` or `AdvertiseTransactions`
requests.
Finally, it refactors the request routing logic for the peer set to
handle advertisement messages, providing three routing methods:
- `route_p2c`, which uses p2c as normal (default);
- `route_inv`, which uses the inventory registry and falls back to p2c
(used for `BlocksByHash` or `TransactionsByHash`);
- `route_all`, which broadcasts a request to all ready peers (used for
`AdvertiseBlock` and `AdvertiseTransactions`).
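Roughly, the new variants described above look like this simplified sketch (the hash and transaction types are placeholders for the real `zebra-chain` types):
```rust
use std::{collections::HashSet, sync::Arc};

enum Request {
    /// Sends `getdata` and collects the resulting `tx` messages,
    /// like the existing `BlocksByHash` request.
    TransactionsByHash(HashSet<TransactionHash>),
    /// Sends an unsolicited `tx` message, then returns `Nil` immediately.
    PushTransaction(Arc<Transaction>),
    /// Sends an `inv` message, then returns `Nil` immediately.
    AdvertiseTransactions(HashSet<TransactionHash>),
    /// Sends an `inv` message, then returns `Nil` immediately.
    AdvertiseBlock(BlockHash),
}

enum Response {
    Nil,
    Transactions(Vec<Arc<Transaction>>),
}

// Placeholder types for this sketch:
#[derive(PartialEq, Eq, Hash)]
struct TransactionHash([u8; 32]);
struct BlockHash([u8; 32]);
struct Transaction;
```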
This is the first in a sequence of changes that rename the block:: items
so they no longer include Block as a prefix in their names, in accordance
with the Rust API guidelines.
Bitcoin does this either with `getblocks` (returns up to 500 following block
hashes) or `getheaders` (returns up to 2000 following block headers, not
just hashes). However, Bitcoin headers are much smaller than Zcash
headers, which contain a giant Equihash solution block, and many Zcash
blocks don't have many transactions in them, so the block header is
often similarly sized to the block itself. Because we're
aiming to have a highly parallel network layer, it seems better to use
`getblocks` to implement `FindBlocks` (which is necessarily sequential)
and parallelize the processing of the block downloads.
PushPeers is more complicated to thread into the rest of our
architecture (we would need to establish a data path connecting our
service handling inbound requests to the network layer's auto-crawler),
and since we crawl the network automatically anyways, we don't actually
need to accept them in order to get updated address information.
The only possible problem with this approach is that zcashd refuses to
answer multiple address requests from the same connection, ostensibly
for fingerprinting prevention (although it's totally happy to give
exactly the same information, as long as you hang up and reconnect
first, lol). It's unclear how this will interact with our design -- on
the one hand, it could mean that we don't get new addr information when
we ask, but on the other hand, we may have enough churn in our
connection pool that this isn't a problem anyways.
Attempting to implement requests for block data revealed a problem with
the previous connection logic. Block data is requested by sending a
`getdata` message with hashes of the requested blocks; the peer responds
with a sequence of `block` messages with the blocks themselves.
However, this wasn't possible to handle with the previous connection
logic, which could only convert a single Bitcoin message into a
Response. Instead, we factor out the message handling logic into a
Handler, which can statefully accumulate arbitrary data into a Response
and signal completion. This is still pretty ugly but it does work.
As a side effect, the HeartbeatNonceMismatch error is removed; because
the Handler now tries to process messages until it comes to a Response,
it just ignores mismatched nonces (and will eventually time out).
The previous Mempool and Transaction requests were removed but could be
re-added in a different form later. Also, the `Get` prefixes are
removed from `Request` to tidy the name.
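A sketch of the `Handler` shape described above, using placeholder types; the real enum has many more request states:
```rust
use std::collections::HashSet;

enum Handler {
    /// Still waiting for `block` messages for these remaining hashes.
    BlocksByHash {
        pending: HashSet<BlockHash>,
        blocks: Vec<Block>,
    },
    /// The accumulated `Response` is complete.
    Finished(Response),
}

impl Handler {
    /// Feed one inbound Bitcoin message to the handler, accumulating
    /// state until a full `Response` is ready.
    fn process_message(self, msg: Message) -> Handler {
        match (self, msg) {
            (Handler::BlocksByHash { mut pending, mut blocks }, Message::Block(block)) => {
                if pending.remove(&block.hash) {
                    blocks.push(block);
                }
                if pending.is_empty() {
                    Handler::Finished(Response::Blocks(blocks))
                } else {
                    Handler::BlocksByHash { pending, blocks }
                }
            }
            // Anything unrelated, including a heartbeat with a mismatched
            // nonce, is simply ignored; an unanswered request eventually
            // times out instead of erroring.
            (state, _) => state,
        }
    }
}

// Placeholder types for this sketch:
#[derive(PartialEq, Eq, Hash)]
struct BlockHash([u8; 32]);
struct Block {
    hash: BlockHash,
}
enum Message {
    Block(Block),
    // ping, inv, addr, ...
}
enum Response {
    Blocks(Vec<Block>),
}
```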