* Move peer address validation into its own module
* Add a network parameter to AddressBook and some MetaAddr methods
* Reject invalid initial peers, and connect to them in preferred order
* Reject Flux/ZelCash and misconfigured Zcash peers
* Prefer canonical Zcash ports
* Make peer_preference into a struct method
* Prefer peer addresses with canonical ports for outbound connections
* Also ignore the Zcash regtest port
* Document where field and variant order is required for correctness
* Use the correct peer count
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* Tweak crawler timings so peers are more likely to be available
* Tweak min peer connection interval so we try all peers
* Let other tasks run between fanouts, so we're more likely to choose different peers
* Let other tasks run between retries, so we're more likely to choose different peers
* Let other tasks run after peer crawler DemandDrop
This makes it more likely that peers will become ready.
* Spawn the address book updater on a blocking thread
* Spawn CandidateSet address book operations on blocking threads
* Replace the PeerSet address book with a metrics watch channel
* Fix comment
* Await spawned address book tasks
* Run the address book update tasks concurrently (except for the mutex)
* Explain an internal-only method better
* Fix a typo
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
* Use a single-thread shared Tokio runtime
This allows it to pause the time and more closely resembles the
environment that's set by default for asynchronous tests.
* Add a `zebra_test::init_async` helper function
Calls `zebra_test::init` but also constructs a single-thread Tokio
runtime and returns it. This makes it simpler to initialize asynchronous
tests that can't use the `#[tokio::test]` attribute.
* Replace usages of `Runtime::new` in tests
Use the new `zebra_test::init_async()` helper function instead.
* Replace `runtime::Builder::new_current_thread()`
Use the new `zebra_test::init_async()` helper function instead.
* Replace `runtime::Builder::new_multi_thread()`
Use the new `zebra_test::init_async()` helper function instead. The test
with the change doesn't necessarily have to use a multi-thread runtime.
Newer versions of Tokio panic if `tokio::time::pause()` is called from a
multi-thread executor, and `#[tokio::test]` defaults to a single thread
runtime, so it makes sense to always use a single thread runtime in all
tests.
* Implement initial service mocking helpers
Adds a [`MockService`] type, which can be configured and built for usage
in unit tests or proptests. The mocked service can then be used to
intercept requests and respond indivdiually to them.
* Use `MockService in the `mempool::Crawler` test
Refactor it to remove the helper mock function, and use the new
`MockService` helper type.
* Use `MockService` in `CandidateSet` test vectors
Refactor to remove the manual mocking of the peer set service.
* Panic if a response is not sent by `MockService`
Change the current semantics to require all `MockService` usages to
respond to every intercepted request.
A `must_use` attribute was added to the `ResponseSender` so that the
compiler can warn when this doesn't happen.
* Allow generic error types in `MockService`
Replace the hard-coded `BoxError` as the `Service`'s error type with a
generic type parameter. This allows mocking services in locations that
require specific error types.
* Add a `ResponseSender::request` getter
Allow inspecting the request again before responding, and using
information from the request in the response.
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Gossip dynamically allocated listener ports to peers
Previously, Zebra would either gossip port `0`, which is invalid, or skip
gossiping its own dynamically allocated listener port.
* Improve "no configured peers" warning
And downgrade from error to warning, because inbound-only nodes are a
valid use case.
* Move random_known_port to zebra-test
* Add tests for dynamic local listener ports and the AddressBook
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Security: stop gossiping failure and attempt times as last_seen times
Previously, Zebra had a single time field for peer addresses, which was
updated every time a peer was attempted, sent a message, or failed.
This is a security issue, because the `last_seen` time should be
"the last time [a peer] connected to that node", so that
"nodes can use the time field to avoid relaying old 'addr' messages".
So Zebra was sending incorrect peer information to other nodes.
As part of this change, we split the `last_seen` time into the
following fields:
- untrusted_last_seen: gossiped from other peers
- last_response: time we got a response from a directly connected peer
- last_attempt: time we attempted to connect to a peer
- last_failure: time a connection with a peer failed
* Implement Arbitrary and strategies for MetaAddrChange
Also replace the MetaAddr Arbitrary impl with a derive.
* Write proptests for MetaAddr and MetaAddrChange
MetaAddr:
- the only times that get included in serialized MetaAddrs are
the untrusted last seen and responded times
MetaAddrChange:
- the untrusted last seen time is never updated
- the services are only updated if there has been a handshake
* Only advance the outbound connection timer when it returns an address
Previously, we were advancing the timer even when we returned `None`.
This created large wait times when there were no eligible peers.
* Refactor to avoid overlapping sleep timers
* Add a maximum next peer delay test
Also refactor peer numbers into constants.
* Make the number of proptests overridable by the standard env var
Also cleanup the test constants.
* Test that skipping peer connections also skips their rate limits
* Allow an extra second after each sleep on loaded machines
macOS VMs seem to need this extra time to pass their tests.
* Restart test time bounds from the current time
This change avoids test failures due to cumulative errors.
Also use a single call to `Instant::now` for each test round.
And print the times when the tests fail.
* Stop generating invalid outbound peers in proptests
The candidate set proptests will fail if enough generated peers are
invalid for outbound connections.
Rust atomics have an API that's very easy to use incorrectly, leading to
hard to find bugs. For that reason, it's best to avoid it unless there's
a good reason not to.
* Rename field to `wait_next_handshake`
Make the name a bit more clear regarding to the field's purpose.
* Move `MIN_PEER_CONNECTION_INTERVAL` to `constants`
Move it to the `constants` module so that it is placed closer to other
constants for consistency and to make it easier to see any relationships
when changing them.
* Rate limit calls to `CandidateSet::update()`
This effectively rate limits requests asking for more peer addresses
sent to the same peer. A new `min_next_crawl` field was added to
`CandidateSet`, and `update` only sends requests for more peer addresses
if the call happens after the instant specified by that field. After
sending the requests, the field value is updated so that there is a
`MIN_PEER_GET_ADDR_INTERVAL` wait time until the next `update` call
sends requests again.
* Include `update_initial` in rate limiting
Move the rate limiting code from `update` to `update_timeout`, so that
both `update` and `update_initial` get rate limited.
* Test `CandidateSet::update` rate limiting
Create a `CandidateSet` that uses a mocked `PeerService`. The mocked
service always returns an empty list of peers, but it also checks that
the requests only happen after expected instants, determined by the
fanout amount and the rate limiting interval.
* Refactor to create a `mock_peer_service` helper
Move the code from the test to a utility function so that another test
will be able to use it as well.
* Check number of times service was called
Use an `AtomicUsize` shared between the service and the test body that
the service increments on every call. The test can then verify if the
service was called the number of times it expected.
* Test calling `update` after `update_initial`
The call to `update` should be skipped because the call to
`update_initial` should also be considered in the rate limiting.
* Mention that call to `update` may be skipped
Make it clearer that in this case the rate limiting causes calls to be
skipped, and not that there's an internal sleep that happens.
Also remove "to the same peers", because it's more general than that.
Co-authored-by: teor <teor@riseup.net>
* Rate-limit new outbound peer connections
Set the rate-limiting sleep timer to use a delay added to the maximum
between the next peer connection instant and now. This ensures that the
timer always sleeps at least the time used for the delay.
This change fixes rate-limiting new outbound peer connections, since
before there could be a burst of attempts until the deadline progressed
to the current instant.
Fixes#2216
* Create `MetaAddr::alternate_node_strategy` helper
Creates arbitrary `MetaAddr`s as if they were network nodes that sent
their listening address.
* Test outbound peer connection rate limiting
Tests if connections are rate limited to 10 per second, and also tests
that sleeping before continuing with the attempts still respets the rate
limit and does not result in a burst of reconnection attempts.
Given a generated list of gossiped peers, ensure that after running the
`validate_addrs` function none of the resulting peers have a `last_seen`
time that's after the specified limit.
If the calculation to apply the compensation offset overflows or
underflows, the reported times are too distant apart, and could be sent
on purpose by a malicious peer, so all addresses from that peer should
be rejected.
Use some mock gossiped peers where some have `last_seen` times in the
past and some have times in the future. Check that all the peers have
an offset applied to them by the `validate_addrs` function.
This tests if the offset is applied to all peers that a malicious peer
gossiped to us.
Use some mock gossiped peers that all have `last_seen` times in the
past and check that they don't have any changes to the `last_seen` times
applied by the `validate_addrs` function.