* Return the maximum checkpoint height from the chain verifier
* Return the verified block height from the sync downloader
* Track the verified height in the syncer
* Use a lower concurrency limit during full verification
* Get the tip from the state before the first verified block
* Limit the number of submitted download and verify blocks in a batch
* Adjust lookahead limits when transitioning to full verification
* Keep unused extra hashes and submit them to the downloader later
* Remove redundant verified_height and state_tip()
* Split the checkpoint and full verify concurrency configs
* Decrease full verification concurrency to 5 blocks
10 concurrent blocks causes 3 minute stalls on some blocks on my machine.
(And it has about 4x as many cores as a standard machine.)
* cargo +stable fmt --all
* Remove a log that's verbose with smaller lookahead limits
* Apply the full verify concurrency limit to the inbound service
* Add a summary of the config changes to the CHANGELOG
* Increase the default full verify concurrency limit to 30
* Create new empty `zebra-node-services` crate
The goal is to store the mempool `Request` and `Response` types so that
the `zebra-rpc` crate can interface with the mempool service without
having to import `zebrad`.
* Move `Gossip` mempool type into new crate
It is required by the `Request` type, which will be moved next.
* Add documentation to `Gossip` variants
Avoid having to add an exception to allow undocumented code.
* Move `mempool::Request` type to new crate
The first part of the service interface definition. Usages have been
changed to refer to the new crate directly, and since this refactor is
still incomplete, some `mp` aliases are used in a few places to refer to
the old module.
* Create an `UnboxMempoolError` helper trait
Centralize some common code to extract and downcast boxed mempool
errors. The `mempool::Response` will need to contain a `BoxError`
instead of a `MempoolError` when it is moved to the
`zebra-node-services` crate, so this prepares the tests to be updated
with less changes.
* Use `UnboxMempoolError` in tests
Make the necessary changes so that the tests are ready to support a
`BoxError` in the `mempool::Response` type.
* Use `BoxError` in `mempool::Response`
Prepare it to be moved to the `zebra-node-services` crate.
* Move `mempool::Response` to `zebra-node-services`
Update usages to import from the new crate directly.
* Remove `mp` aliases for mempool component module
Use any internal types directly instead.
* Replace `tower::BoxService` with custom alias
Remove the dependency of `zebra-node-services` on `tower`.
* Move `Gossip` into a separate `sub-module`
Keep only the main `Request` and `Response` types in the `mempool`
module.
* Use `crate::BoxError` instead of `tower::BoxError`
Follow the existing convention.
* Add missing `gossip.rs` module file
It was missing from a previous refactor commit.
* fix(network): split synthetic NotFoundRegistry from message NotFoundResponse
* docs(network): Improve `notfound` message documentation
* refactor(network): Rename MustUseOneshotSender to MustUseClientResponseSender
```
fastmod MustUseOneshotSender MustUseClientResponseSender zebra*
```
* docs(network): fix a comment typo
* refactor(network): remove generics from MustUseClientResponseSender
* refactor(network): add an inventory collector to Client, but don't use it yet
* feat(network): register missing peer responses as missing inventory
We register this missing inventory based on peer responses,
or connection errors or timeouts.
Inbound message inventory tracking requires peers to send `notfound` messages.
But `zcashd` skips `notfound` for blocks, so we can't rely on peer messages.
This missing inventory tracking works regardless of peer `notfound` messages.
* refactor(network): rename ResponseStatus to InventoryResponse
```sh
fastmod ResponseStatus InventoryResponse zebra*
```
* refactor(network): rename InventoryStatus::inner() to to_inner()
* fix(network): remove a redundant runtime.enter() in a test
* doc(network): the exact time used to filter outbound peers doesn't matter
* fix(network): handle block requests slightly more efficiently
* doc(network): fix a typo
* fmt(network): `cargo fmt` after rename ResponseStatus to InventoryResponse
* doc(test): clarify some test comments
* test(network): test synthetic notfound from connection errors and peer inventory routing
* test(network): improve inbound test diagnostics
* feat(network): add a proptest-impl feature to zebra-network
* feat(network): add a test-only connect_isolated_with_inbound function
* test(network): allow a response on the isolated peer test connection
* test(network): fix failures in test synthetic notfound
* test(network): Simplify SharedPeerError test assertions
* test(network): test synthetic notfound from partially successful requests
* test(network): MissingInventoryCollector ignores local NotFoundRegistry errors
* fix(network): decrease the inventory rotation interval
This stops us waiting 3-4 sync resets (4 minutes) before we retry a missing block.
Now we wait 1-2 sync resets (2 minutes), which is still a reasonable rate limit.
This should speed up syncing near the tip, and on testnet.
* fmt(network): cargo fmt --all
* cleanup(network): remove unnecessary allow(dead_code)
* cleanup(network): stop importing the whole sync module into tests
* doc(network): clarify syncer inventory retry constraint
* doc(network): add a TODO for a fix to ensure API behaviour remains consistent
* doc(network): fix a function doc typo
* doc(network): clarify how we handle peers that don't send `notfound`
* docs(network): clarify a test comment
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* refactor(network): rename Advertised to Available
```sh
fastmod Advertised Available zebra*
fastmod advertised available zebra*
```
* refactor(network): allow different available and missing types inside an InventoryStatus
And rename it to ResponseStatus.
Split the methods between ResponseStatus and an InventoryStatus alias.
* refactor(network): add a block_hash convenience method to InventoryHash
* test(network): improve failure logs for connection tests
* fix(inbound): move address sanitization into the response future
* feat(network): send notfound when Zebra doesn't have a block or transaction
* doc(network): move module docs to the top of each module
This makes them more likely to get updated when the module changes.
* fix(network): stop sending unsupported missing inventory types to the registry
* test(network): inbound messages are forwarded to the registry
* test(inbound): test Peers requests to the inbound service, directly and via TCP
* test(network): notfound block responses are sent by the inbound service
* test(network): notfound tx responses are sent by the inbound service
* test(network): increase sync test mock service timeout
The code that these tests use hasn't actually changed much,
and they are only failing on some platforms (coverage, macOS).
So it seems like the extra concurrent inbound tests have pushed them
past their time limit.
(Perhaps due to TCP system calls, or extra serialization work.)
* doc(network): fix typo
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* test(network): remove unnecessary multi-threaded runtime from tests
This prevents `MockService<zebra_state>` timeouts
in the `sync_block_too_high_extend_tips` test,
at the cost of reducing coverage of different execution orders.
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* fix(state): set state concurrency based on other services' concurrency
* fix(sync): increase the sync downloader lookahead limit
It seems like the recent tokio upgrade made this code even more efficient,
so on testnet we can have around 6000 blocks in flight.
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* Document the chain verifier
* Drop gossiped blocks that are too far ahead of the tip
* Add extra gossiped block metrics
* Allow extra gossiped blocks, now we have a stricter limit
* Fix a comment
* Check the exact number of blocks in a downloaded block response
* Drop synced blocks that are too far ahead of the tip
* Add extra synced block metrics
* Test dropping gossiped blocks that are too far ahead of the tip
* Allow an extra checkpoint's worth of blocks in the verifier queues
* Actually let's try two extra checkpoints
* Scale extra height limit with lookahead limit
* Also drop blocks that are behind the finalized tip
* Downgrade a noisy log
* Use a debug log for already verified gossiped blocks
* Use debug logs for already verified synced blocks
* Refactor the address response limit
* Limit the number of peers in the address book
* Allow changing the address book limit in tests
* Add tests for the address book length limit
* rustfmt
* Stop checking the entire AddressBook for each connection attempt
* Stop redundant peer time checks within the address book
* Stop calling `Instant::now` 3 times for each address book update
* Only get the time once each time an address book method is called
* Update outdated comment
* Use an OrderedMap to efficiently store address book peers
* Add address book order tests
* Start network before verifiers
This makes the Groth16 download task start as late as possible.
* Explain why the Groth16 download must happen first
* Speed up Zebra shutdown: skip waiting for the tokio runtime
* Implement graceful shutdown for the peer set
* Use the minimum lookahead limit in acceptance tests
* Enable a doctest that compiles with newly public modules
* Revert "Remove commented-out code"
This reverts commit 9e69777925f103ee11e5940bba95b896c828839b.
* Implement deserialization for `addrv2` messages
* Limit addr and addrv2 messages to MAX_ADDRS_IN_MESSAGE
* Clarify address version comments
* Minor cleanups and fixes
* Add preallocation tests for AddrV2
* Add serialization tests for AddrV2
* Use prop_assert in AddrV2 proptests
* Use a generic utility method for deserializing IP addresses in `addrv2`
* Document the purpose of a conversion to MetaAddr
* Fix a comment typo, and clarify that comment
* Clarify the unsupported AddrV2 network ID error and enum variant names
```sh
fastmod AddrV2UnimplementedError UnsupportedAddrV2NetworkIdError zebra-network
fastmod Unimplemented Unsupported zebra-network
```
* Fix and clarify unsupported AddrV2 comments
* Replace `panic!` with `unreachable!`
* Clarify a comment about skipping a length check in a test
* Remove a redundant test
* Basic addr (v1) and addrv2 deserialization tests
* Test deserialized IPv4 and IPv6 values in addr messages
* Remove redundant io::Cursor
* Add comments with expected values of address test vectors
There are a lot of these messages when Zebra starts up.
They might be slowing down CI and causing timeouts.
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Guarantee unique IDs in mempool service responses
* Guarantee unique IDs in crawler task mempool Queue requests
Also update the tests to use unique IDs.
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Cancel download and verify tasks when the mempool is deactivated
* Refactor enable/disable logic to use a state enum
* Add helper test functions to enable/disable the mempool
* Add documentation about errors on service calls
* Improvements from review
* Improve documentation
* Fix bug in test
* Apply suggestions from code review
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: teor <teor@riseup.net>
* Send `Response::Nil` instead of sending empty `Message`s
This matches `zcashd`'s behaviour more closely.
In most cases, the network layer filters these out already.
But this change makes the the inbound service code clearer.
* revert changes made to `AdvertiseTransactionIds` and `PushTransaction`
* remove newline
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
* Check if tx already exists in mempool or state before downloading
* Reorder checks
* Add rejected test; refactor into separate function
* Wrap mempool in buffered service
* Rename RejectedTransactionsById -> RejectedTransactionsIds
* Add RejectedTransactionIds response; fix request name
* Organize imports
* add a test for Storage::rejected_transactions
* add test for mempool `Request::RejectedTransactionIds`
* change buffer size to 1 in the test
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
* reply to `Request::MempoolTransactionIds`
* remove boilerplate
* get storage from mempool with a method
* change panic message
* try fix for mac
* use normal init instead of init_tests for state service
* newline
* rustfmt
* fix test build
* Rename internal network requests for wide transaction IDs
fastmod TransactionsByHash TransactionsById zebra*
fastmod AdvertiseTransactions AdvertiseTransactionIds zebra*
fastmod MempoolTransactions MempoolTransactionIds zebra*
fastmod TransactionHashes TransactionIds zebra*
* Update network transaction request/response comments
* Rename a transaction hash method for wide transaction IDs
fastmod transaction_hashes transaction_ids zebra-network
* Add UnminedTxId methods and conversions for InventoryHash
* Map WtxIds to unmined transaction network messages
Also, use UnminedTxId and UnminedTx in:
* Zebra's internal request and response format, and
* external Zcash network protocol messages.
* Enable WtxId mempool inventory tracking for peers
* Further clarify transaction IDs
* Use Witnessed rather than Wide for transaction IDs
And rename narrow to legacy when it only applies to v1-v4 transactions.
Otherwise, rename it to mined ID.
* Rename a missed binding
* Remove an incorrectly named binding
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Always send our local listener with the latest time
Previously, whenever there was an inbound request for peers, we would
clone the address book and update it with the local listener.
This had two impacts:
- the listener could conflict with an existing entry,
rather than unconditionally replacing it, and
- the listener was briefly included in the address book metrics.
As a side-effect, this change also makes sanitization slightly faster,
because it avoids some useless peer filtering and sorting.
* Skip listeners that are not valid for outbound connections
* Filter sanitized addresses Zebra based on address state
This fix correctly prevents Zebra gossiping client addresses to peers,
but still keeps the client in the address book to avoid reconnections.
* Add a full set of DateTime32 and Duration32 calculation methods
* Refactor sanitize to use the new DateTime32/Duration32 methods
* Security: Use canonical SocketAddrs to avoid duplicate connections
If we allow multiple variants for each peer address, we can make multiple
connections to that peer.
Also make sure sanitized MetaAddrs are valid for outbound connections.
* Test that address books contain the local listener address
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
When peers ask for peer addresses, add our local listener address to the
set of addresses, sanitize, then truncate. Sanitize shuffles addresses,
so if there are lots of addresses in the address book, our address will
only be sent to some peers.
This change encodes a bunch of invariants in the type system,
and adds explicit failure states for:
* a closed oneshot,
* bugs in the initialization code.
Use `ServiceExt::oneshot` to perform state requests.
Explain that `ServiceExt::call_all` calls `poll_ready` internally.
Document a state service invariant imposed by `ServiceExt::call_all`.
* Rewrite GetData handling to match the zcashd implementation
`zcashd` silently ignores missing blocks, but sends found transactions
followed by a `NotFound` message:
e7b425298f/src/main.cpp (L5497)
This is significantly different to the behaviour expected by the old
Zebra connection state machine, which expected `NotFound` for blocks.
Also change Zebra's GetData responses to peer request so they ignore
missing blocks.
* Stop hanging on incomplete transaction or block responses
Instead, if the peer sends an unexpected block, unexpected transaction,
or NotFound message:
1. end the request, and return a partial response containing any items
that were successfully received
2. if none of the expected blocks or transactions were received, return
an error, and close the connection
* implement inbound `FindBlocks`
* Handle inbound peer FindHeaders requests
* handle request before having any chain tip
* Split `find_chain_hashes` into smaller functions
Add a `max_len` argument to support `FindHeaders` requests.
Rewrite the hash collection code to use heights, so we can handle the
`stop` hash and "no intersection" cases correctly.
* Split state height functions into "any chain" and "best chain"
* Rename the best chain block method to `best_block`
* Move fmt utilities to zebra_chain::fmt
* Summarise Debug for some Message variants
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jlusby42@gmail.com>