* Use a named CancelHeartbeatTask unit struct for the channel type
* Prefer cancel handles in selects, if both are ready
* Fix message metrics to just show the command name
* Add metrics for internal requests and responses
* Add internal requests and responses to the messages dashboard
* Add a canceled metric, and peer addresses to request and response metrics
* Add a canceled messages graph
* Add connection state metrics for currently open connections
* Fix the connection state graph with new metrics
* Always send an error before dropping pending responses
* Move error detail logging into `fail_with`
* Delete an unused timer future
* Make error strings in metrics less verbose
* Downgrade some error logs to info
* Remove a redundant expect
* Avoid unnecessary allocations for connection state metrics
* Fix missed updates to mempool and block gossip metrics
ZIP-201 describes how a Zcash node should behave when it reaches a
network upgrade activation height. Zebra doesn't implement all the
details specified there, so we need to document what it does implement
and what it doesn't and why.
* Stop useless crawler attempts when there are no peers and no crawl responses
* Disable GitHub bug report URLs when the disk is full
* Add help text for the `zebrad start` tracing filter option
* Replace usage of `discover::Change` with a tuple
Replace the assumption that a `Remove` variant would never be created
with type changes that allow the compiler to guarantee it.
* Add a `version` field to the `Client` type
Keep track of the peer's reported protocol version.
* Create `LoadTrackedClient` type
A `peer::Client` type wrapper that implements `Load`. This helps with
the creation of a client service that has extra peer information to be
accessed without having to send requests.
* Use `LoadTrackedClient` in `initialize`
Ensure that `PeerSet` receives `LoadTrackedClient`s so that it will be
able to query the peer's protocol version later on.
* Require `LoadTrackedClient` in `PeerSet`
Replace the generic type with a concrete `LoadTrackedClient` so that we
can query its version.
* Create `MinimumPeerVersion` helper type
A type to track the current minimum protocol version for connected
peers based on the current block height.
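A minimal sketch of how such a tracker could work, assuming a hypothetical activation table and simplified integer types. The `UPGRADES` values, the `update` and `changed` method names, and the `u32` versions are illustrative assumptions, not Zebra's actual API or network parameters.

```rust
/// Hypothetical table of (activation height, minimum protocol version).
/// These numbers are illustrative, not real network parameters.
const UPGRADES: [(u32, u32); 3] = [(0, 100), (1_000, 200), (2_000, 300)];

/// Returns the minimum version for the latest upgrade active at `height`.
fn min_version_for_height(height: u32) -> u32 {
    UPGRADES
        .iter()
        .rev()
        .find(|(activation, _)| height >= *activation)
        .map(|(_, version)| *version)
        .expect("the table starts at height 0")
}

/// Tracks the current minimum peer version, and whether it changed
/// since the last time the caller asked.
struct MinimumPeerVersion {
    current: u32,
    has_changed: bool,
}

impl MinimumPeerVersion {
    fn new(tip_height: u32) -> Self {
        MinimumPeerVersion {
            current: min_version_for_height(tip_height),
            has_changed: true,
        }
    }

    /// Recalculates the minimum version for a new chain tip height.
    fn update(&mut self, tip_height: u32) {
        let new = min_version_for_height(tip_height);
        if new != self.current {
            self.current = new;
            self.has_changed = true;
        }
    }

    /// Reports each change only once.
    fn changed(&mut self) -> Option<u32> {
        if self.has_changed {
            self.has_changed = false;
            Some(self.current)
        } else {
            None
        }
    }

    fn current(&self) -> u32 {
        self.current
    }
}

fn main() {
    let mut minimum = MinimumPeerVersion::new(500);
    println!("initial report: {:?}", minimum.changed());
    minimum.update(1_000);
    println!("after upgrade: {:?}, current: {}", minimum.changed(), minimum.current());
}
```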
* Use `MinimumPeerVersion` in handshakes
Keep the code to obtain the current minimum peer protocol version in a
central place.
* Add a `MinimumPeerVersion` instance to `PeerSet`
Prepare it to be able to disconnect from outdated peers based on the
current minimum supported peer protocol version.
* Disconnect from ready services for outdated peers
When the minimum peer protocol version is detected to have changed
(because of a network upgrade), remove all ready services of peers that
became outdated.
* Cancel added unready services of outdated peers
Only add an unready service if it's for a peer that has a supported
protocol version. Otherwise, add it but drop the cancel handle so that
the `UnreadyService` can execute and detect that it was cancelled.
* Avoid adding ready services for outdated peers
If a service becomes ready but it's for a connection to an outdated
peer, drop it.
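The outdated-peer handling in the commits above can be sketched roughly as follows, assuming ready services are stored in a map keyed by peer address. The `disconnect_outdated` function and the simplified `LoadTrackedClient` with a plain `u32` version are hypothetical stand-ins for illustration.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

/// Simplified stand-in for a client that remembers the peer's version.
struct LoadTrackedClient {
    version: u32,
}

/// Drops every ready service whose peer version is below `minimum`.
/// Returns the number of services that were removed.
fn disconnect_outdated(
    ready: &mut HashMap<SocketAddr, LoadTrackedClient>,
    minimum: u32,
) -> usize {
    let before = ready.len();
    ready.retain(|_, client| client.version >= minimum);
    before - ready.len()
}

fn main() {
    let mut ready = HashMap::new();
    let old_peer: SocketAddr = "127.0.0.1:8233".parse().unwrap();
    let new_peer: SocketAddr = "127.0.0.2:8233".parse().unwrap();
    ready.insert(old_peer, LoadTrackedClient { version: 100 });
    ready.insert(new_peer, LoadTrackedClient { version: 200 });

    let removed = disconnect_outdated(&mut ready, 200);
    println!("removed {} outdated peer(s), {} left", removed, ready.len());
}
```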
* Improve comment inside `crawl_and_dial`
Describe an edge case that is also handled but was not explicit.
Co-authored-by: teor <teor@riseup.net>
* Test if calculated minimum peer version is correct
Given an arbitrary best chain tip height, check that the calculated
minimum peer protocol version is the expected value.
* Test if minimum version changes with chain tip
Apply an arbitrary list of chain tip height updates and check that for
each update the minimum peer version is calculated correctly.
* Test minimum peer version changed reports
Simulate a series of best chain tip height updates, and check for
minimum peer version updates at least once between them. Changes should
only be reported once.
* Create a `MockedClientHandle` helper type
Used to create and then track a mock `Client` instance.
* Add `MinimumPeerVersion::with_mock_chain_tip`
An extension method useful for tests, that contains some shared
boilerplate code.
* Bias arbitrary `Version`s to be in valid range
Give an arbitrary `Version` a 50% chance of being in the range of
values previously used on the Zcash network.
* Create a `PeerVersions` helper type
Helps with the creation of mocked client services with arbitrary
protocol versions.
* Create a `PeerSetGuard` helper type
An auxiliary type to a `PeerSet` instance created for testing. It keeps
track of any dummy endpoints of channels created and passed to the
`PeerSet` instance.
* Create a `PeerSetBuilder` helper type
Helps to reduce the code when preparing a `PeerSet` test instance.
* Test if outdated peers are rejected by `PeerSet`
Simulate a set of discovered peers being sent to the `PeerSet`. Ensure
that only up-to-date peers are kept by the `PeerSet` and that outdated
peers are dropped.
* Create `BlockHeightPairAcrossNetworkUpgrades` type
A helper type that allows the creation of arbitrary block height pairs,
where one value is before and the other is at or after the activation
height of an arbitrary network upgrade.
* Test if peers are dropped as they become outdated
Simulate a network upgrade, and check that peers that become outdated
are dropped by the `PeerSet`.
* Remove dbg! macros
Co-authored-by: teor <teor@riseup.net>
* Use a single-thread shared Tokio runtime
This allows tests to pause time, and more closely resembles the
environment that's set by default for asynchronous tests.
* Add a `zebra_test::init_async` helper function
Calls `zebra_test::init` but also constructs a single-thread Tokio
runtime and returns it. This makes it simpler to initialize asynchronous
tests that can't use the `#[tokio::test]` attribute.
* Replace usages of `Runtime::new` in tests
Use the new `zebra_test::init_async()` helper function instead.
* Replace `runtime::Builder::new_current_thread()`
Use the new `zebra_test::init_async()` helper function instead.
* Replace `runtime::Builder::new_multi_thread()`
Use the new `zebra_test::init_async()` helper function instead. The test
with the change doesn't necessarily have to use a multi-thread runtime.
* Refactor the address response limit
* Limit the number of peers in the address book
* Allow changing the address book limit in tests
* Add tests for the address book length limit
* rustfmt
* Stop checking the entire AddressBook for each connection attempt
* Stop redundant peer time checks within the address book
* Stop calling `Instant::now` 3 times for each address book update
* Only get the time once each time an address book method is called
* Update outdated comment
* Use an OrderedMap to efficiently store address book peers
* Add address book order tests
Zebra's latest beta continues implementing zero-knowledge proof and note commitment tree validation. In this release, we have finished implementing transaction header, transaction amount, and Zebra-specific NU5 validation. (NU5 mainnet validation is waiting on an `orchard` crate update, and some consensus parameter updates.)
We also fix a number of security issues that could pose a local denial of service risk, or make it easier for an attacker to make a node follow a false chain.
As of this release, Zebra will automatically download and cache the Sprout and Sapling Groth16 circuit parameters. The cache uses around 1 GB of disk space. These cached parameters are shared across all Zebra and `zcashd` instances run by the same user.
See CHANGELOG.md for the full list of changes in this release.
* Tweak a log message
* Only retry failed DNS once, then use the other DNS responses
* Limit broadcasts to half the peers
* Use a longer minimum interval for GetAddr requests
* Reduce the syncer and mempool crawler fanouts
* Stop resetting the mempool twice when it starts up
This spawns two crawlers, which send two fanouts,
so it can use up a lot of peers.
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
* Rewrite PeerSet comments to split long sentences
* Replace peer set integer indexes with address-based indexes
Also improve documentation and logging.
* Security: Stop using peer addresses to choose inventory routing order
* Minor doc and code cleanups
* Stop re-using a drained HashSet
* Replace used `_cancel` with `cancel`
* Reword a comment
* Replace cloned with copied
* Shut down channels and tasks on PeerSet Drop
* Document all the PeerSet fields
* Close the peer set background task handle on shutdown
* Receive background tasks during shutdown
Also, split receiving and polling background tasks into separate methods.
* Revert "Remove commented-out code"
This reverts commit 9e69777925f103ee11e5940bba95b896c828839b.
* Implement deserialization for `addrv2` messages
* Limit addr and addrv2 messages to MAX_ADDRS_IN_MESSAGE
* Clarify address version comments
* Minor cleanups and fixes
* Add preallocation tests for AddrV2
* Add serialization tests for AddrV2
* Use prop_assert in AddrV2 proptests
* Use a generic utility method for deserializing IP addresses in `addrv2`
* Document the purpose of a conversion to MetaAddr
* Fix a comment typo, and clarify that comment
* Clarify the unsupported AddrV2 network ID error and enum variant names
```sh
fastmod AddrV2UnimplementedError UnsupportedAddrV2NetworkIdError zebra-network
fastmod Unimplemented Unsupported zebra-network
```
* Fix and clarify unsupported AddrV2 comments
* Replace `panic!` with `unreachable!`
* Clarify a comment about skipping a length check in a test
* Remove a redundant test
* Basic addr (v1) and addrv2 deserialization tests
* Test deserialized IPv4 and IPv6 values in addr messages
* Remove redundant io::Cursor
* Add comments with expected values of address test vectors
* Add a `Duration32::from_days` constructor
Make it simpler to construct a `Duration32` representing a certain
number of days.
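A sketch of what such a constructor might look like, assuming `Duration32` wraps a `u32` number of seconds. The field name, the `u32` parameter type, and the saturating multiplication are assumptions made for this example.

```rust
/// Simplified stand-in for a duration stored as 32-bit seconds.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Duration32 {
    seconds: u32,
}

impl Duration32 {
    const SECONDS_PER_DAY: u32 = 24 * 60 * 60;

    const fn from_seconds(seconds: u32) -> Self {
        Duration32 { seconds }
    }

    /// Builds a duration from a number of days.
    /// Saturates instead of overflowing (an assumption of this sketch).
    const fn from_days(days: u32) -> Self {
        Duration32::from_seconds(days.saturating_mul(Self::SECONDS_PER_DAY))
    }

    const fn seconds(&self) -> u32 {
        self.seconds
    }
}

fn main() {
    let three_days = Duration32::from_days(3);
    println!("3 days is {} seconds", three_days.seconds());
}
```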
* Add `MetaAddr::was_not_recently_seen` method
A helper method to check if a peer was never seen before or if it was
last seen a long time ago. This will be one of the conditions to
consider a peer as unreachable.
* Add `MetaAddr::is_probably_unreachable` method
A helper method to check if a peer should be considered unreachable. It
is considered unreachable if recent connection attempts have failed and
it was not recently seen.
If a peer is considered unreachable, Zebra shouldn't attempt to connect
to it again.
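The two checks described above could be sketched like this, assuming times are plain Unix seconds. The simplified `MetaAddr` fields, the `MAX_PEER_AGE_SECS` threshold of three days, and the `reconnection_candidates` helper are all assumptions for illustration.

```rust
/// Assumed threshold for "recently seen": three days, in seconds.
const MAX_PEER_AGE_SECS: u32 = 3 * 24 * 60 * 60;

/// Simplified stand-in for an address book entry.
struct MetaAddr {
    /// Unix time when the peer was last seen, if it was ever seen.
    last_seen: Option<u32>,
    /// Whether the most recent connection attempt failed.
    last_attempt_failed: bool,
}

impl MetaAddr {
    /// True if the peer was never seen, or was last seen too long ago.
    fn was_not_recently_seen(&self, now: u32) -> bool {
        match self.last_seen {
            Some(last_seen) => now.saturating_sub(last_seen) > MAX_PEER_AGE_SECS,
            None => true,
        }
    }

    /// True if the last attempt failed and the peer was not recently seen.
    fn is_probably_unreachable(&self, now: u32) -> bool {
        self.last_attempt_failed && self.was_not_recently_seen(now)
    }
}

/// Keeps only the peers that are still worth a connection attempt.
fn reconnection_candidates(peers: &[MetaAddr], now: u32) -> Vec<&MetaAddr> {
    peers
        .iter()
        .filter(|peer| !peer.is_probably_unreachable(now))
        .collect()
}

fn main() {
    let now = 1_000_000;
    let peers = vec![
        MetaAddr { last_seen: Some(now - 10), last_attempt_failed: true },
        MetaAddr { last_seen: None, last_attempt_failed: true },
    ];
    println!("{} candidate(s)", reconnection_candidates(&peers, now).len());
}
```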
* Do not keep trying to connect to unreachable peer
A peer is probably unreachable if it was last seen a long time ago and
if its last connection attempt failed.
* Test `was_not_recently_seen`
Redo the calculation on arbitrary `MetaAddr`s.
* Test `is_probably_unreachable`
Redo the calculation on arbitrary `MetaAddr`s.
* Test if probably unreachable peers are ignored
Given an `AddressBook` with a list of arbitrary `MetaAddr`s, check that
none of the peers listed for a reconnection is probably unreachable.
* Rename unit test to improve clarity
Remove the double negative from the name.
Co-authored-by: teor <teor@riseup.net>
* Rename constant to `MAX_RECENT_PEER_AGE`
Make the purpose of the constant clearer.
Co-authored-by: teor <teor@riseup.net>
* Rename method to `last_seen_is_recent`
Remove the double negative from the name.
* Rename method to `is_probably_reachable`
Avoid having to negate the result of the method in a security-critical
filter.
* Move check into `is_ready_for_connection_attempt`
Make sure the check is used in any place that requires a peer that's
ready for a connection attempt.
* Improve test documentation
Describe the goal of the test better.
Co-authored-by: teor <teor@riseup.net>
* Improve `is_probably_reachable` documentation
List the conditions as bullet points.
Co-authored-by: teor <teor@riseup.net>
* Document what happens when peers have no last seen time
Co-authored-by: teor <teor@riseup.net>
* Implement addr v1 serialization using a separate AddrV1 type
* Remove commented-out code
* Split the address serialization code into modules
* Reorder v1 and in_version fields in serialization order
* Fix a missed search-and-replace
* Explain conversion to MetaAddr
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Add unused seed peers to the AddressBook
* Document a new `await`
We added an extra await on the AddressBook thread mutex.
Co-authored-by: teor <teor@riseup.net>
* Fix a typo
* Refactor names
* Return early from `limit_initial_peers`
* Add `proptest`s regressions
* Return `MetaAddr` instead of `None`
* Test if `zebra_network::init()` deadlocks
* Remove unneeded regressions
* Rename `TimestampCollector` to `AddressBookUpdater` (#2992)
* Rename `TimestampCollector` to `AddressBookUpdater`
* Update comments
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Move `all_peers` instead of copying them
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Make `Duration` a const
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Use a timeout instead of measuring the elapsed time
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Copy `initial_peers` instead of moving them
* Refactor the position of `NewInitial` and `new_initial`
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Update `tower` to version `0.4.9`
Update to latest version to add support for Tokio version 1.
* Replace usage of `ServiceExt::ready_and`
It was deprecated in favor of `ServiceExt::ready`.
* Update Tokio dependency to version `1.13.0`
This will break the build because the code isn't ready for the update,
but future commits will fix the issues.
* Replace import of `tokio::stream::StreamExt`
Use `futures::stream::StreamExt` instead, because newer versions of
Tokio don't have the `stream` feature.
* Use `IntervalStream` in `zebra-network`
In newer versions of Tokio `Interval` doesn't implement `Stream`, so the
wrapper types from `tokio-stream` have to be used instead.
* Use `IntervalStream` in `inventory_registry`
In newer versions of Tokio the `Interval` type doesn't implement
`Stream`, so `tokio_stream::wrappers::IntervalStream` has to be used
instead.
* Use `BroadcastStream` in `inventory_registry`
In newer versions of Tokio `broadcast::Receiver` doesn't implement
`Stream`, so `tokio_stream::wrappers::BroadcastStream` has to be used
instead. This also requires changing the error type that is used.
* Handle `Semaphore::acquire` error in `tower-batch`
Newer versions of Tokio can return an error if the semaphore is closed.
This shouldn't happen in `tower-batch` because the semaphore is never
closed.
* Handle `Semaphore::acquire` error in `zebrad` test
On newer versions of Tokio `Semaphore::acquire` can return an error if
the semaphore is closed. This shouldn't happen in the test because the
semaphore is never closed.
* Update some `zebra-network` dependencies
Use versions compatible with Tokio version 1.
* Upgrade Hyper to version 0.14
Use a version that supports Tokio version 1.
* Update `metrics` dependency to version 0.17
And also update the `metrics-exporter-prometheus` to version 0.6.1.
These updates are to make sure Tokio 1 is supported.
* Use `f64` as the histogram data type
`u64` isn't supported as the histogram data type in newer versions of
`metrics`.
* Update the initialization of the metrics component
Make it compatible with the new version of `metrics`.
* Simplify build version counter
Remove all constants and use the new `metrics::increment_counter!` macro.
* Change metrics output line to match on
The snapshot string isn't included in the newer version of
`metrics-exporter-prometheus`.
* Update `sentry` to version 0.23.0
Use a version compatible with Tokio version 1.
* Remove usage of `TracingIntegration`
This seems to not be available from `sentry-tracing` anymore, so it
needs to be replaced.
* Add sentry layer to tracing initialization
This seems like the replacement for `TracingIntegration`.
* Remove unnecessary conversion
Suggested by a Clippy lint.
* Update Cargo lock file
Apply all of the updates to dependencies.
* Ban duplicate tokio dependencies
Also ban git sources for tokio dependencies.
* Stop allowing sentry-tracing git repository in `deny.toml`
* Allow remaining duplicates after the tokio upgrade
* Use C: drive for CI build output on Windows
GitHub Actions uses a Windows image with two disk drives, and the
default D: drive is smaller than the C: drive. Zebra currently uses a
lot of space to build, so it has to use the C: drive to avoid CI build
failures because of insufficient space.
Co-authored-by: teor <teor@riseup.net>
* Use `prop_assert` instead of `assert`
Otherwise the test input isn't minimized.
* Split long string into a multi-line string
And add some newlines to try to improve readability.
* Fix referenced issue number
The issue number had a typo.
* Make peer services optional
Peer services are unknown for initial peers.
* Fix `preserve_initial_untrusted_values` test
Now that it's optional, the services field can be written to if it was
previously empty.
* Fix formatting of property tests
Run rustfmt on them.
* Restore `TODO` comment
Make it easy to find planned improvements in the code.
Co-authored-by: teor <teor@riseup.net>
* Comment on how ordering is affected
Make it clear that missing services causes the peer to be chosen last.
Co-authored-by: teor <teor@riseup.net>
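One way to express "peers with missing services are chosen last" is a stable sort on whether the services field is empty. This is only a sketch: the `Peer` type, its `u64` services bitfield, and the `connection_order` function are hypothetical names introduced for illustration.

```rust
/// Simplified stand-in for a peer entry with optional services.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Peer {
    name: &'static str,
    /// `None` for initial seed peers, whose services are unknown.
    services: Option<u64>,
}

/// Orders peers so that those with unknown services come last.
/// `false` sorts before `true`, and the sort is stable, so peers
/// with known services keep their relative order.
fn connection_order(mut peers: Vec<Peer>) -> Vec<Peer> {
    peers.sort_by_key(|peer| peer.services.is_none());
    peers
}

fn main() {
    let peers = vec![
        Peer { name: "seed", services: None },
        Peer { name: "gossiped", services: Some(1) },
    ];
    for peer in connection_order(peers) {
        println!("{} {:?}", peer.name, peer.services);
    }
}
```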
* Don't expect `services` to be available
Avoid a panic by using the compiler to help enforce the handling of the
case correctly.
* Panic if received gossiped address has no services
All received gossiped addresses have services. The only addresses that
don't have services configured are the initial seed addresses.
Co-authored-by: teor <teor@riseup.net>
* Limit open inbound connections based on the config
* Log inbound connection errors at debug level
* Test inbound connection limits
* Use clone directly in function call argument lists
* Remove an outdated comment
* Update tests to use an unbounded channel rather than mem::forget
And rename some variables.
* Use a lower limit in a slow test and require that it is exceeded
* Rate-limit initial seed peer connections
* Revert "Rate-limit initial seed peer connections"
This reverts commit f779a1eb9e1ffd5497e96dfe8ae1be4340d5d24d.
* Simplify logic
* Avoid cooperative async task starvation in the peer crawler and listener
If we don't yield in these loops, they can run for a long time before
tokio forces them to yield.
* Add test
* Check for task panics in initial peers test
* Remove duplicate code in rebase
Co-authored-by: teor <teor@riseup.net>
* Limit the number of outbound connections in the crawler
* Make zebra-network channel bounds depend on config.peerset_initial_target_size
* Bias Zebra towards outbound connections
And turn connection limits into `Config` methods.
* Downgrade some connection logs to debug
* Remove verbose or outdated fields in tracing logs
* Clarify connection limits
Includes:
- `fastmod OUTBOUND_PEER_BIAS_FRACTION OUTBOUND_PEER_BIAS_DENOMINATOR zebra*`
- clarify connection limit documentation
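As a rough illustration of connection limits expressed as `Config` methods with an outbound bias, consider the sketch below. The method names, the formulas, and the denominator value of 2 are assumptions for this example; only `OUTBOUND_PEER_BIAS_DENOMINATOR` and `peerset_initial_target_size` are names taken from the commits above.

```rust
/// Assumed value: outbound connections get an extra 1/2 of the target size.
const OUTBOUND_PEER_BIAS_DENOMINATOR: usize = 2;

/// Simplified stand-in for the network configuration.
struct Config {
    peerset_initial_target_size: usize,
}

impl Config {
    /// Hypothetical outbound limit: the target size plus a bias fraction.
    fn peerset_outbound_connection_limit(&self) -> usize {
        self.peerset_initial_target_size
            + self.peerset_initial_target_size / OUTBOUND_PEER_BIAS_DENOMINATOR
    }

    /// Hypothetical inbound limit: the target size, with no bias.
    fn peerset_inbound_connection_limit(&self) -> usize {
        self.peerset_initial_target_size
    }

    /// Hypothetical total limit: the sum of both directions.
    fn peerset_total_connection_limit(&self) -> usize {
        self.peerset_outbound_connection_limit() + self.peerset_inbound_connection_limit()
    }
}

fn main() {
    let config = Config { peerset_initial_target_size: 50 };
    println!(
        "outbound: {}, inbound: {}, total: {}",
        config.peerset_outbound_connection_limit(),
        config.peerset_inbound_connection_limit(),
        config.peerset_total_connection_limit(),
    );
}
```

Deriving every limit from one configured value keeps the limits consistent when the target size changes.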
* Clarify inventory channel capacity
* Add zebra_network::initialize tests with limited numbers of peers
* Avoid cooperative async task starvation in the peer crawler and listener
If we don't yield in these loops, they can run for a long time before
tokio forces them to yield.
* Test the crawler with small connection limits
And use the multi-threaded runtime to avoid long hangs.
* Stop using the multi-threaded executor in tests where it's not needed
* Avoid starvation for every connection
Adds yields after inbound successes and initial peer connections.
* Add a crawler peer connection success test
* Add outbound connection limit tests
* Improve outbound tests
* Wrap `Sleep` timer in a `Pin<Box<_>>`
The `Sleep` type doesn't implement `Unpin` in newer versions of Tokio.
* Wrap `Sleep` type in a `Pin<Box<_>>`
In newer Tokio versions the `Sleep` type doesn't implement `Unpin`, so
it needs to be manually pinned.
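The pinning pattern can be shown without Tokio by using a hypothetical `!Unpin` future in place of `Sleep`. In this sketch, `FakeSleep`, `Heartbeat`, and the hand-written no-op waker are illustrative stand-ins, not Tokio or Zebra types.

```rust
use std::future::Future;
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// Hypothetical stand-in for a timer future that is `!Unpin`.
struct FakeSleep {
    polls_left: u32,
    _not_unpin: PhantomPinned,
}

impl Future for FakeSleep {
    type Output = ();

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        // SAFETY: the value is only mutated in place, never moved.
        let this = unsafe { self.get_unchecked_mut() };
        if this.polls_left == 0 {
            Poll::Ready(())
        } else {
            this.polls_left -= 1;
            Poll::Pending
        }
    }
}

/// Stores the timer behind `Pin<Box<_>>`, so the struct itself can be
/// moved freely while the timer stays pinned on the heap.
struct Heartbeat {
    timer: Pin<Box<FakeSleep>>,
}

impl Heartbeat {
    fn poll_timer(&mut self, cx: &mut Context<'_>) -> Poll<()> {
        self.timer.as_mut().poll(cx)
    }
}

/// Builds a waker that does nothing, which is enough for manual polling.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

/// Polls the pinned timer until it is ready, and counts the polls.
fn polls_until_ready(polls_left: u32) -> u32 {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let timer = Box::pin(FakeSleep { polls_left, _not_unpin: PhantomPinned });
    let mut heartbeat = Heartbeat { timer };
    let mut count = 1;
    while heartbeat.poll_timer(&mut cx).is_pending() {
        count += 1;
    }
    count
}

fn main() {
    println!("ready after {} polls", polls_until_ready(2));
}
```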
There are a lot of these messages when Zebra starts up.
They might be slowing down CI and causing timeouts.
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Replace some unit tuples with named unit structs
This helps distinguish generic channels and make them type-safe.
Also tidy imports and documentation in `peer_set::set`.
* Link to the tower balance crate from docs
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
* Limit the number of initial peers
* Move more code out of zebra_network::initialize
* Always limit the number of initial peers in the Config
This way, we can never get the unused peers out.
* Revert "Always limit the number of initial peers in the Config"
This reverts commit 81ede597c885de7289ce19faa3c21a3439dd293c.
Actually, this doesn't work, because we want those extra peers.
* Minor tweaks
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
Co-authored-by: teor <teor@riseup.net>
* Count the number of active inbound and outbound peer connections
And reduce the count when each connection fails.
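One possible shape for such a counter is a guard value that notifies the counter when it is dropped, so the count goes down whether a connection closes cleanly or fails. The type names and the use of a standard library channel below are assumptions for this sketch, not a description of Zebra's code.

```rust
use std::sync::mpsc;

/// Message sent when a tracked connection is closed or fails.
struct ConnectionClosed;

/// Counts connections that are currently open.
struct ActiveConnectionCounter {
    count: usize,
    close_tx: mpsc::Sender<ConnectionClosed>,
    close_rx: mpsc::Receiver<ConnectionClosed>,
}

/// Guard held by each connection; dropping it reduces the count.
struct ConnectionTracker {
    close_tx: mpsc::Sender<ConnectionClosed>,
}

impl Drop for ConnectionTracker {
    fn drop(&mut self) {
        // Ignore the error: the counter may already have been dropped.
        let _ = self.close_tx.send(ConnectionClosed);
    }
}

impl ActiveConnectionCounter {
    fn new() -> Self {
        let (close_tx, close_rx) = mpsc::channel();
        ActiveConnectionCounter { count: 0, close_tx, close_rx }
    }

    /// Registers a new connection and returns its guard.
    fn track_connection(&mut self) -> ConnectionTracker {
        self.count += 1;
        ConnectionTracker { close_tx: self.close_tx.clone() }
    }

    /// Applies pending close notifications and returns the current count.
    fn update_count(&mut self) -> usize {
        while self.close_rx.try_recv().is_ok() {
            self.count -= 1;
        }
        self.count
    }
}

fn main() {
    let mut counter = ActiveConnectionCounter::new();
    let inbound = counter.track_connection();
    let _outbound = counter.track_connection();
    drop(inbound); // simulate a failed connection
    println!("open connections: {}", counter.update_count());
}
```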
* Fix a comment typo
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Newer versions of Tokio panic if `tokio::time::pause()` is called from a
multi-thread executor, and `#[tokio::test]` defaults to a single-thread
runtime, so it makes sense to always use a single-thread runtime in all
tests.