Commit Graph

61 Commits

Author SHA1 Message Date
teor 87f4308caf
fix(sync): Temporarily set full verification concurrency to 30 blocks (#4726)
* Return the maximum checkpoint height from the chain verifier

* Return the verified block height from the sync downloader

* Track the verified height in the syncer

* Use a lower concurrency limit during full verification

* Get the tip from the state before the first verified block

* Limit the number of submitted download and verify blocks in a batch

* Adjust lookahead limits when transitioning to full verification

* Keep unused extra hashes and submit them to the downloader later

* Remove redundant verified_height and state_tip()

* Split the checkpoint and full verify concurrency configs

* Decrease full verification concurrency to 5 blocks

10 concurrent blocks causes 3 minute stalls on some blocks on my machine.
(And it has about 4x as many cores as a standard machine.)

* cargo +stable fmt --all

* Remove a log that's verbose with smaller lookahead limits

* Apply the full verify concurrency limit to the inbound service

* Add a summary of the config changes to the CHANGELOG

* Increase the default full verify concurrency limit to 30
2022-07-06 10:13:57 -04:00
Janito Vaqueiro Ferreira Filho c24ea1fc3f
Refactor to create a new `zebra-node-services` crate (#3648)
* Create new empty `zebra-node-services` crate

The goal is to store the mempool `Request` and `Response` types so that
the `zebra-rpc` crate can interface with the mempool service without
having to import `zebrad`.

* Move `Gossip` mempool type into new crate

It is required by the `Request` type, which will be moved next.

* Add documentation to `Gossip` variants

Avoid having to add an exception to allow undocumented code.

* Move `mempool::Request` type to new crate

The first part of the service interface definition. Usages have been
changed to refer to the new crate directly, and since this refactor is
still incomplete, some `mp` aliases are used in a few places to refer to
the old module.

* Create an `UnboxMempoolError` helper trait

Centralize some common code to extract and downcast boxed mempool
errors. The `mempool::Response` will need to contain a `BoxError`
instead of a `MempoolError` when it is moved to the
`zebra-node-services` crate, so this prepares the tests to be updated
with less changes.

* Use `UnboxMempoolError` in tests

Make the necessary changes so that the tests are ready to support a
`BoxError` in the `mempool::Response` type.

* Use `BoxError` in `mempool::Response`

Prepare it to be moved to the `zebra-node-services` crate.

* Move `mempool::Response` to `zebra-node-services`

Update usages to import from the new crate directly.

* Remove `mp` aliases for mempool component module

Use any internal types directly instead.

* Replace `tower::BoxService` with custom alias

Remove the dependency of `zebra-node-services` on `tower`.

* Move `Gossip` into a separate `sub-module`

Keep only the main `Request` and `Response` types in the `mempool`
module.

* Use `crate::BoxError` instead of `tower::BoxError`

Follow the existing convention.

* Add missing `gossip.rs` module file

It was missing from a previous refactor commit.
2022-02-25 21:43:21 +00:00
teor a4dd3b7396
4. Avoid repeated requests to peers after partial responses or errors (#3505)
* fix(network): split synthetic NotFoundRegistry from message NotFoundResponse

* docs(network): Improve `notfound` message documentation

* refactor(network): Rename MustUseOneshotSender to MustUseClientResponseSender

```
fastmod MustUseOneshotSender MustUseClientResponseSender zebra*
```

* docs(network): fix a comment typo

* refactor(network): remove generics from MustUseClientResponseSender

* refactor(network): add an inventory collector to Client, but don't use it yet

* feat(network): register missing peer responses as missing inventory

We register this missing inventory based on peer responses,
or connection errors or timeouts.

Inbound message inventory tracking requires peers to send `notfound` messages.
But `zcashd` skips `notfound` for blocks, so we can't rely on peer messages.
This missing inventory tracking works regardless of peer `notfound` messages.

* refactor(network): rename ResponseStatus to InventoryResponse

```sh
fastmod ResponseStatus InventoryResponse zebra*
```

* refactor(network): rename InventoryStatus::inner() to to_inner()

* fix(network): remove a redundant runtime.enter() in a test

* doc(network): the exact time used to filter outbound peers doesn't matter

* fix(network): handle block requests slightly more efficiently

* doc(network): fix a typo

* fmt(network): `cargo fmt` after rename ResponseStatus to InventoryResponse

* doc(test): clarify some test comments

* test(network): test synthetic notfound from connection errors and peer inventory routing

* test(network): improve inbound test diagnostics

* feat(network): add a proptest-impl feature to zebra-network

* feat(network): add a test-only connect_isolated_with_inbound function

* test(network): allow a response on the isolated peer test connection

* test(network): fix failures in test synthetic notfound

* test(network): Simplify SharedPeerError test assertions

* test(network): test synthetic notfound from partially successful requests

* test(network): MissingInventoryCollector ignores local NotFoundRegistry errors

* fix(network): decrease the inventory rotation interval

This stops us waiting 3-4 sync resets (4 minutes) before we retry a missing block.

Now we wait 1-2 sync resets (2 minutes), which is still a reasonable rate limit.
This should speed up syncing near the tip, and on testnet.

* fmt(network): cargo fmt --all

* cleanup(network): remove unnecessary allow(dead_code)

* cleanup(network): stop importing the whole sync module into tests

* doc(network): clarify syncer inventory retry constraint

* doc(network): add a TODO for a fix to ensure API behaviour remains consistent

* doc(network): fix a function doc typo

* doc(network): clarify how we handle peers that don't send `notfound`

* docs(network): clarify a test comment

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2022-02-15 01:44:33 +00:00
teor 9f2028feff
3. Send notfound when Zebra doesn't have a block or transaction (#3466)
* refactor(network): rename Advertised to Available

```sh
fastmod Advertised Available zebra*
fastmod advertised available zebra*
```

* refactor(network): allow different available and missing types inside an InventoryStatus

And rename it to ResponseStatus.

Split the methods between ResponseStatus and an InventoryStatus alias.

* refactor(network): add a block_hash convenience method to InventoryHash

* test(network): improve failure logs for connection tests

* fix(inbound): move address sanitization into the response future

* feat(network): send notfound when Zebra doesn't have a block or transaction

* doc(network): move module docs to the top of each module

This makes them more likely to get updated when the module changes.

* fix(network): stop sending unsupported missing inventory types to the registry

* test(network): inbound messages are forwarded to the registry

* test(inbound): test Peers requests to the inbound service, directly and via TCP

* test(network): notfound block responses are sent by the inbound service

* test(network): notfound tx responses are sent by the inbound service

* test(network): increase sync test mock service timeout

The code that these tests use hasn't actually changed much,
and they are only failing on some platforms (coverage, macOS).

So it seems like the extra concurrent inbound tests have pushed them
past their time limit.
(Perhaps due to TCP system calls, or extra serialization work.)

* doc(network): fix typo

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>

* test(network): remove unnecessary multi-threaded runtime from tests

This prevents `MockService<zebra_state>` timeouts
in the `sync_block_too_high_extend_tips` test,
at the cost of reducing coverage of different execution orders.

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2022-02-14 01:51:34 +00:00
teor fa071562fd
fix(network): increase state concurrency and syncer lookahead (#3455)
* fix(state): set state concurrency based on other services' concurrency

* fix(sync): increase the sync downloader lookahead limit

It seems like the recent tokio upgrade made this code even more efficient,
so on testnet we can have around 6000 blocks in flight.

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2022-02-02 22:44:15 +00:00
Alfredo Garcia 3c1ba59001
Reduce log level of components (#3418)
* reduce log level of components

* revert some log downgrades

* dedupe log
2022-01-28 14:24:53 -03:00
teor a4d1a1801c
Security: Drop blocks that are a long way ahead of the tip (#3167)
* Document the chain verifier

* Drop gossiped blocks that are too far ahead of the tip

* Add extra gossiped block metrics

* Allow extra gossiped blocks, now we have a stricter limit

* Fix a comment

* Check the exact number of blocks in a downloaded block response

* Drop synced blocks that are too far ahead of the tip

* Add extra synced block metrics

* Test dropping gossiped blocks that are too far ahead of the tip

* Allow an extra checkpoint's worth of blocks in the verifier queues

* Actually let's try two extra checkpoints

* Scale extra height limit with lookahead limit

* Also drop blocks that are behind the finalized tip

* Downgrade a noisy log

* Use a debug log for already verified gossiped blocks

* Use debug logs for already verified synced blocks
2021-12-17 13:31:51 -03:00
teor 332afc17d5
Security: Limit address book size to limit memory usage (#3162)
* Refactor the address response limit

* Limit the number of peers in the address book

* Allow changing the address book limit in tests

* Add tests for the address book length limit

* rustfmt
2021-12-06 16:09:10 -03:00
teor 4d608d3224
Stop doing thousands of time checks each time we connect to a peer (#3106)
* Stop checking the entire AddressBook for each connection attempt

* Stop redundant peer time checks within the address book

* Stop calling `Instant::now` 3 times for each address book update

* Only get the time once each time an address book method is called

* Update outdated comment

* Use an OrderedMap to efficiently store address book peers

* Add address book order tests
2021-12-03 15:09:43 -03:00
teor 68d7198e9f
Re-order Zebra startup, so slow services are launched last (#3091)
* Start network before verifiers

This makes the Groth16 download task start as late as possible.

* Explain why the Groth16 download must happen first

* Speed up Zebra shutdown: skip waiting for the tokio runtime
2021-11-23 17:42:44 +00:00
teor 375a997d2f
Stop downloading unnecessary blocks in Zebra acceptance tests (#3072)
* Implement graceful shutdown for the peer set

* Use the minimum lookahead limit in acceptance tests

* Enable a doctest that compiles with newly public modules
2021-11-19 01:55:38 +00:00
teor d6f3b3dc9a
Parse received addrv2 messages (#3022)
* Revert "Remove commented-out code"

This reverts commit 9e69777925f103ee11e5940bba95b896c828839b.

* Implement deserialization for `addrv2` messages

* Limit addr and addrv2 messages to MAX_ADDRS_IN_MESSAGE

* Clarify address version comments

* Minor cleanups and fixes

* Add preallocation tests for AddrV2

* Add serialization tests for AddrV2

* Use prop_assert in AddrV2 proptests

* Use a generic utility method for deserializing IP addresses in `addrv2`

* Document the purpose of a conversion to MetaAddr

* Fix a comment typo, and clarify that comment

* Clarify the unsupported AddrV2 network ID error and enum variant names

```sh
fastmod AddrV2UnimplementedError UnsupportedAddrV2NetworkIdError zebra-network
fastmod Unimplemented Unsupported zebra-network
```

* Fix and clarify unsupported AddrV2 comments

* Replace `panic!` with `unreachable!`

* Clarify a comment about skipping a length check in a test

* Remove a redundant test

* Basic addr (v1) and addrv2 deserialization tests

* Test deserialized IPv4 and IPv6 values in addr messages

* Remove redundant io::Cursor

* Add comments with expected values of address test vectors
2021-11-12 00:25:23 +00:00
Alfredo Garcia 4d600a0fc1
truncate `Peers` response further (#3007) 2021-11-02 22:21:54 +00:00
teor 67327ac462
Downgrade some less interesting info-level logs to debug (#2938)
There are a lot of these messages when Zebra starts up.
They might be slowing down CI and causing timeouts.

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-10-22 02:11:09 +00:00
teor 40c907dd09
Remove duplicate IDs in mempool requests and responses (#2887)
* Guarantee unique IDs in mempool service responses

* Guarantee unique IDs in crawler task mempool Queue requests

Also update the tests to use unique IDs.

Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-10-18 15:31:11 +00:00
Conrado Gouvea c6878d9b63
Cancel download and verify tasks when the mempool is deactivated (#2764)
* Cancel download and verify tasks when the mempool is deactivated

* Refactor enable/disable logic to use a state enum

* Add helper test functions to enable/disable the mempool

* Add documentation about errors on service calls

* Improvements from review

* Improve documentation

* Fix bug in test

* Apply suggestions from code review

Co-authored-by: teor <teor@riseup.net>

Co-authored-by: teor <teor@riseup.net>
2021-09-29 09:06:40 +10:00
teor 07e8926fd5
Send `Response::Nil` instead of sending empty `Message`s (#2791)
* Send `Response::Nil` instead of sending empty `Message`s

This matches `zcashd`'s behaviour more closely.

In most cases, the network layer filters these out already.
But this change makes the the inbound service code clearer.

* revert changes made to `AdvertiseTransactionIds` and `PushTransaction`

* remove newline

Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-09-23 19:58:00 +00:00
Conrado Gouvea 8825a52bb8
Move transaction download and verify stream into the mempool service (#2741)
* Move transaction dowloader and verifier into the mempool service

* add test for `Storage::contains_rejected()`

* Rename DownloadAndVerify->Queue; move should_download_or_verify() to previous impl

* GossipedTx -> Gossip

* Revamp error handling

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2021-09-13 16:28:07 -04:00
Conrado Gouvea f3ee76f202
Verify inbound PushTransactions (#2727)
* Verify inbound PushTransactions

* Add GossipedTx and refactor downloader to use it

* remove grafana changes

* remove TODOs

* Tidy the transaction fetching in mempool downloader

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
2021-09-09 10:04:44 -03:00
Conrado Gouvea a2993e8df0
Skip download and verification if the transaction is already in the mempool or state (#2718)
* Check if tx already exists in mempool or state before downloading

* Reorder checks

* Add rejected test; refactor into separate function

* Wrap mempool in buffered service

* Rename RejectedTransactionsById -> RejectedTransactionsIds

* Add RejectedTransactionIds response; fix request name

* Organize imports

* add a test for Storage::rejected_transactions

* add test for mempool `Request::RejectedTransactionIds`

* change buffer size to 1 in the test

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2021-09-08 18:51:17 +00:00
Alfredo Garcia 65e308d2e1
Respond to inbound `TransactionsById` with mempool content (#2725)
* reply to inbound `TransactionsById` requests

* apply style/redability suggestions and fix typo

Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>

Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2021-09-06 22:55:17 +00:00
Alfredo Garcia 9c220afdc8
Reply to `Request::MempoolTransactionIds` with mempool content (#2720)
* reply to `Request::MempoolTransactionIds`

* remove boilerplate

* get storage from mempool with a method

* change panic message

* try fix for mac

* use normal init instead of init_tests for state service

* newline

* rustfmt

* fix test build
2021-09-02 13:42:31 +00:00
Conrado Gouvea 1ccb2de7c7
Add transaction downloader and verifier (#2679)
* Add transaction downloader

* Changed mempool downloader to be like inbound

* Verifier working (logs result)

* Apply suggestions from code review

Co-authored-by: teor <teor@riseup.net>

* Apply suggestions from code review

Co-authored-by: teor <teor@riseup.net>

* Fix coinbase check for mempool, improve is_coinbase() docs

* Change other downloads.rs docs to reflect the mempool downloads.rs changes

* Change TIMEOUTs to downloads.rs; add docs

* Renamed is_coinbase() to has_valid_coinbase_transaction_inputs() and contains_coinbase_input() to has_any_coinbase_inputs(); reorder checks

* Validate network upgrade for V4 transactions; check before computing sighash (for V5 too)

* Add block_ prefix to downloads and verifier

* Update zebra-consensus/src/transaction.rs

Co-authored-by: teor <teor@riseup.net>

* Add consensus doc; add more Block prefixes

Co-authored-by: teor <teor@riseup.net>
2021-09-02 00:06:20 +00:00
teor c608260256
Support witnessed transaction IDs in zebra-network requests and responses (#2638)
* Rename internal network requests for wide transaction IDs

fastmod TransactionsByHash TransactionsById zebra*
fastmod AdvertiseTransactions AdvertiseTransactionIds zebra*
fastmod MempoolTransactions MempoolTransactionIds zebra*
fastmod TransactionHashes TransactionIds zebra*

* Update network transaction request/response comments

* Rename a transaction hash method for wide transaction IDs

fastmod transaction_hashes transaction_ids zebra-network

* Add UnminedTxId methods and conversions for InventoryHash

* Map WtxIds to unmined transaction network messages

Also, use UnminedTxId and UnminedTx in:
* Zebra's internal request and response format, and
* external Zcash network protocol messages.

* Enable WtxId mempool inventory tracking for peers

* Further clarify transaction IDs

* Use Witnessed rather than Wide for transaction IDs

And rename narrow to legacy when it only applies to v1-v4 transactions.
Otherwise, rename it to mined ID.

* Rename a missed binding
* Remove an incorrectly named binding

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-08-18 22:55:24 +00:00
teor 1a57023eac
Security: Use canonical SocketAddrs to avoid duplicate peer connections, Feature: Send local listener to peers (#2276)
* Always send our local listener with the latest time

Previously, whenever there was an inbound request for peers, we would
clone the address book and update it with the local listener.

This had two impacts:
- the listener could conflict with an existing entry,
  rather than unconditionally replacing it, and
- the listener was briefly included in the address book metrics.

As a side-effect, this change also makes sanitization slightly faster,
because it avoids some useless peer filtering and sorting.

* Skip listeners that are not valid for outbound connections

* Filter sanitized addresses Zebra based on address state

This fix correctly prevents Zebra gossiping client addresses to peers,
but still keeps the client in the address book to avoid reconnections.

* Add a full set of DateTime32 and Duration32 calculation methods

* Refactor sanitize to use the new DateTime32/Duration32 methods

* Security: Use canonical SocketAddrs to avoid duplicate connections

If we allow multiple variants for each peer address, we can make multiple
connections to that peer.

Also make sure sanitized MetaAddrs are valid for outbound connections.

* Test that address books contain the local listener address

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-06-22 02:16:59 +00:00
teor 92828bbb29 Reliability: send local listener address to peers
When peers ask for peer addresses, add our local listener address to the
set of addresses, sanitize, then truncate. Sanitize shuffles addresses,
so if there are lots of addresses in the address book, our address will
only be sent to some peers.
2021-05-18 14:02:19 +10:00
teor 74e155ff9f
Spelling: gossipped -> gossiped (#2119) 2021-05-07 13:01:11 +02:00
teor 0203d1475a Refactor and document correctness for std::sync::Mutex<AddressBook> 2021-04-21 17:14:47 -04:00
teor cc7d5bd2ad
Update comments for the inbound service (#1740) 2021-02-16 06:14:40 +10:00
teor 372a432179
Update the call_all comment in Inbound (#1737) 2021-02-16 06:14:16 +10:00
teor 6679a124e3 Require Inbound setup handlers to provide a result
Rather than having them default to `Ok(())`, which is incorrect
for some error handlers.
2021-02-03 08:32:10 +10:00
teor 09c8c89462 Make sure FailedInit never escapes Inbound::poll_ready 2021-02-03 08:32:10 +10:00
teor 134a5e78bd Consistently use `network_setup` for the Inbound Setup 2021-02-03 08:32:10 +10:00
teor 1c8362fe01 Remove unused imports 2021-02-03 08:32:10 +10:00
Jane Lusby 4cf331562c combine network setup into an exhaustive match 2021-02-03 08:32:10 +10:00
Jane Lusby 4d6ef89248 avoid using async blocks to avoid lifetime bug with generators 2021-02-03 08:32:10 +10:00
Jane Lusby 685a592399 Add clonable wrapper around TryRecvError 2021-02-03 08:32:10 +10:00
teor 6ffeb670ed Log the failed response in an unreachable panic 2021-02-03 08:32:10 +10:00
teor eac4fd181a Add a Setup enum to manage Inbound network setup internal state
This change encodes a bunch of invariants in the type system,
and adds explicit failure states for:
* a closed oneshot,
* bugs in the initialization code.
2021-02-03 08:32:10 +10:00
teor 32b032204a Consistently return Response::Nil during setup
And log an info-level message as a diagnostic, in case setup takes a
long time.
2021-02-03 08:32:10 +10:00
teor 94eb91305b Stop using ServiceExt::call_all due to buffer bugs
ServiceExt::call_all leaks Tower::Buffer reservations, so we can't use
it in Zebra.

Instead, use a loop in the returned future.

See #1593 for details.
2021-02-03 08:32:10 +10:00
teor 64bc45cd2e Fix state readiness hangs for Inbound
Use `ServiceExt::oneshot` to perform state requests.

Explain that `ServiceExt::call_all` calls `poll_ready` internally.
Document a state service invariant imposed by `ServiceExt::call_all`.
2021-02-03 08:32:10 +10:00
teor 4d1a2fd02e Make the Inbound invariant clearer 2021-02-03 08:32:10 +10:00
teor 2a25b9ee72 Remove services that are never `call`ed from Inbound
Uses the `ServiceExt::oneshot` design pattern from #1593.
2021-02-03 08:32:10 +10:00
teor 92d95d4be5 Refactor inbound members into a consistent order
And add download comments
2021-01-13 20:46:25 -05:00
teor fb76eb2e6b Add download and verify timeouts to the inbound service 2021-01-13 20:46:25 -05:00
teor b1f14f47c6
Rewrite GetData handling to match the zcashd implementation (#1518)
* Rewrite GetData handling to match the zcashd implementation

`zcashd` silently ignores missing blocks, but sends found transactions
followed by a `NotFound` message:
e7b425298f/src/main.cpp (L5497)

This is significantly different to the behaviour expected by the old
Zebra connection state machine, which expected `NotFound` for blocks.

Also change Zebra's GetData responses to peer request so they ignore
missing blocks.

* Stop hanging on incomplete transaction or block responses

Instead, if the peer sends an unexpected block, unexpected transaction,
or NotFound message:
1. end the request, and return a partial response containing any items
   that were successfully received
2. if none of the expected blocks or transactions were received, return
   an error, and close the connection
2021-01-04 13:25:35 +10:00
Alfredo Garcia 4544463059
Inbound `FindBlocks` and `FindHeaders` (#1347)
* implement inbound `FindBlocks`
* Handle inbound peer FindHeaders requests
* handle request before having any chain tip
* Split `find_chain_hashes` into smaller functions

Add a `max_len` argument to support `FindHeaders` requests.

Rewrite the hash collection code to use heights, so we can handle the
`stop` hash and "no intersection" cases correctly.

* Split state height functions into "any chain" and "best chain"
* Rename the best chain block method to `best_block`
* Move fmt utilities to zebra_chain::fmt
* Summarise Debug for some Message variants

Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-12-01 07:30:37 +10:00
Henry de Valence de8415dcb1 tidy spans 2020-11-25 10:55:44 -08:00
Henry de Valence 05837797b1 tidy imports 2020-11-25 10:55:44 -08:00