Commit Graph

626 Commits

Author SHA1 Message Date
teor 1835ec2c8d
Add diagnostics for peer set hangs (#3203)
* Use a named CancelHeartbeatTask unit struct for the channel type

* Prefer cancel handles in selects, if both are ready

* Fix message metrics to just show the command name

* Add metrics for internal requests and responses

* Add internal requests and responses to the messages dashboard

* Add a canceled metric, and peer addresses to request and response metrics

* Add a canceled messages graph

* Add connection state metrics for currently open connections

* Fix the connection state graph with new metrics

* Always send an error before dropping pending responses

* Move error detail logging into `fail_with`

* Delete an unused timer future

* Make error strings in metrics less verbose

* Downgrade some error logs to info

* Remove a redundant expect

* Avoid unnecessary allocations for connection state metrics

* Fix missed updates to mempool and block gossip metrics
2021-12-14 21:11:03 +00:00
Janito Vaqueiro Ferreira Filho 7bc2f0ac27
Describe `PeerSet`s behavior at a network upgrade (#3181)
ZIP-201 describes how a Zcash node should behave when it reaches a
network upgrade activation height. Zebra doesn't implement all the
details specified there, so we need to document what it does implement
and what it doesn't and why.
2021-12-13 11:17:20 +01:00
teor ba42d59f12
Stop ignoring panics in inbound handshakes (#3192) 2021-12-10 18:32:42 +00:00
Alfredo Garcia f750535961
Spawn initial handshakes in separated task (#3189)
* spawn connector

* expand comment

Co-authored-by: teor <teor@riseup.net>

* fix error handling

Co-authored-by: teor <teor@riseup.net>
2021-12-10 01:18:43 +00:00
teor 37808eaadb
Security: When there are no new peers, stop crawler using CPU and writing logs (#3177)
* Stop useless crawler attempts when there are no peers and no crawl responses

* Disable GitHub bug report URLs when the disk is full

* Add help text for the `zebrad start` tracing filter option
2021-12-10 00:19:52 +00:00
teor 4ce6fbccc4
Fix new clippy lints in clippy nightly (#3176) 2021-12-09 14:19:14 +00:00
Janito Vaqueiro Ferreira Filho 0ad89f2f41
Disconnect from outdated peers on network upgrade (#3108)
* Replace usage of `discover::Change` with a tuple

Remove the assumption that a `Remove` variant would never be created
with type changes that allow the compiler to guarantee that assumption.

* Add a `version` field to the `Client` type

Keep track of the peer's reported protocol version.

* Create `LoadTrackedClient` type

A `peer::Client` type wrapper that implements `Load`. This helps with
the creation of a client service that has extra peer information to be
accessed without having to send requests.

* Use `LoadTrackedClient` in `initialize`

Ensure that `PeerSet` receives `LoadTrackedClient`s so that it will be
able to query the peer's protocol version later on.

* Require `LoadTrackedClient` in `PeerSet`

Replace the generic type with a concrete `LoadTrackedClient` so that we
can query its version.

* Create `MinimumPeerVersion` helper type

A type to track the current minimum protocol version for connected
peers based on the current block height.

* Use `MinimumPeerVersion` in handshakes

Keep the code to obtain the current minimum peer protocol version in a
central place.

* Add a `MinimumPeerVersion` instance to `PeerSet`

Prepare it to be able to disconnect from outdated peers based on the
current minimum supported peer protocol version.

* Disconnect from ready services for outdated peers

When the minimum peer protocol version is detected to have changed
(because of a network upgrade), remove all ready services of peers that
became outdated.

* Cancel added unready services of outdated peers

Only add an unready service if it's for a peer that has a supported
protocol version. Otherwise, add it but drop the cancel handle so that
the `UnreadyService` can execute and detect that it was cancelled.

* Avoid adding ready services for outdated peers

If a service becomes ready but it's for a connection to an outdated
peer, drop it.

* Improve comment inside `crawl_and_dial`

Describe an edge case that is also handled but was not explicit.

Co-authored-by: teor <teor@riseup.net>

* Test if calculated minimum peer version is correct

Given an arbitrary best chain tip height, check that the calculated
minimum peer protocol version is the expected value.

* Test if minimum version changes with chain tip

Apply an arbitrary list of chain tip height updates and check that for
each update the minimum peer version is calculated correctly.

* Test minimum peer version changed reports

Simulate a series of best chain tip height updates, and check for
minimum peer version updates at least once between them. Changes should
only be reported once.

* Create a `MockedClientHandle` helper type

Used to create and then track a mock `Client` instance.

* Add `MinimumPeerVersion::with_mock_chain_tip`

An extension method useful for tests, that contains some shared
boilerplate code.

* Bias arbitrary `Version`s to be in valid range

Give a 50% chance for an arbitrary `Version` to be in the range of
previously used values the Zcash network.

* Create a `PeerVersions` helper type

Helps with the creation of mocked client services with arbitrary
protocol versions.

* Create a `PeerSetGuard` helper type

An auxiliary type to a `PeerSet` instance created for testing. It keeps
track of any dummy endpoints of channels created and passed to the
`PeerSet` instance.

* Create a `PeerSetBuilder` helper type

Helps to reduce the code when preparing a `PeerSet` test instance.

* Test if outdated peers are rejected by `PeerSet`

Simulate a set of discovered peers being sent to the `PeerSet`. Ensure
that only up-to-date peers are kept by the `PeerSet` and that outdated
peers are dropped.

* Create `BlockHeightPairAcrossNetworkUpgrades` type

A helper type that allows the creation of arbitrary block height pairs,
where one value is before and the other is at or after the activation
height of an arbitrary network upgrade.

* Test if peers are dropped as they become outdated

Simulate a network upgrade, and check that peers that become outdated
are dropped by the `PeerSet`.

* Remove dbg! macros

Co-authored-by: teor <teor@riseup.net>
2021-12-09 02:54:29 +00:00
teor c55753d5bd
Add debug-level Zebra network message tracing (#3170)
* Add debug-level Zebra network message tracing

* Delete redundant spaces

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-12-09 01:09:23 +00:00
Janito Vaqueiro Ferreira Filho 1f756fcc81
Add `zebra_test::init_async` helper function (#3169)
* Use a single-thread shared Tokio runtime

This allows it to pause the time and more closely resembles the
environment that's set by default for asynchronous tests.

* Add a `zebra_test::init_async` helper function

Calls `zebra_test::init` but also constructs a single-thread Tokio
runtime and returns it. This makes it simpler to initialize asynchronous
tests that can't use the `#[tokio::test]` attribute.

* Replace usages of `Runtime::new` in tests

Use the new `zebra_test::init_async()` helper function instead.

* Replace `runtime::Builder::new_current_thread()`

Use the new `zebra_test::init_async()` helper function instead.

* Replace `runtime::Builder::new_multi_thread()`

Use the new `zebra_test::init_async()` helper function instead. The test
with the change doesn't necessarily have to use a multi-thread runtime.
2021-12-09 00:18:17 +00:00
teor 332afc17d5
Security: Limit address book size to limit memory usage (#3162)
* Refactor the address response limit

* Limit the number of peers in the address book

* Allow changing the address book limit in tests

* Add tests for the address book length limit

* rustfmt
2021-12-06 16:09:10 -03:00
teor 4d608d3224
Stop doing thousands of time checks each time we connect to a peer (#3106)
* Stop checking the entire AddressBook for each connection attempt

* Stop redundant peer time checks within the address book

* Stop calling `Instant::now` 3 times for each address book update

* Only get the time once each time an address book method is called

* Update outdated comment

* Use an OrderedMap to efficiently store address book peers

* Add address book order tests
2021-12-03 15:09:43 -03:00
teor 022808d028
Release Zebra v1.0.0-beta.2 (#3132)
Zebra's latest beta continues implementing zero-knowledge proof and note commitment tree validation. In this release, we have finished implementing transaction header, transaction amount, and Zebra-specific NU5 validation. (NU5 mainnet validation is waiting on an `orchard` crate update, and some consensus parameter updates.)

We also fix a number of security issues that could pose a local denial of service risk, or make it easier for an attacker to make a node follow a false chain.

As of this release, Zebra will automatically download and cache the Sprout and Sapling Groth16 circuit parameters. The cache uses around 1 GB of disk space. These cached parameters are shared across all Zebra and `zcashd` instances run by the same user.

See CHANGELOG.md for the full list of changes in this release.
2021-12-03 06:54:14 +10:00
teor a92c431c03
Ignore NotFound errors in the syncer (#3131) 2021-12-02 11:28:20 -03:00
teor ab471b0db0
Revert "Stop returning NotFound errors, use the response instead" (#3124)
* Revert "Stop returning NotFound errors, use the response instead"

This reverts commit 45871f6915c0b294502bf04917c42fdcd3b1075c.

* Fix clippy warnings

* Downgrade a frequent log to debug level
2021-12-01 05:09:54 +00:00
teor c85ea18b43
Fix slow Zebra startup times, to reduce CI failures (#3104)
* Tweak a log message

* Only retry failed DNS once, then use the other DNS responses

* Limit broadcasts to half the peers

* Use a longer minimum interval for GetAddr requests

* Reduce the syncer and mempool crawler fanouts

* Stop resetting the mempool twice when it starts up

This spawns two crawlers, which send two fanouts,
so it can use up a lot of peers.

Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-11-30 21:04:32 +00:00
teor a358c410f5
Stop closing connections on unexpected messages, Credit: Equilibrium (#3120)
* Ignore unsupported messages from peers

* Ignore unknown message commands from peers

* Implement Display for Request, Response, Handler, connection::State

* Stop ignoring some completed `Response`s

* Stop returning NotFound errors, use the response instead

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-11-30 19:26:17 +00:00
teor f6abb15778
Security: Stop routing inventory requests by peer address (#3090)
* Rewrite PeerSet comments to split long sentences

* Replace peer set integer indexes with address-based indexes

Also improve documentation and logging.

* Security: Stop using peer addresses to choose inventory routing order

* Minor doc and code cleanups

* Stop re-using a drained HashSet

* Replace used `_cancel` with `cancel`

* Reword a comment

* Replace cloned with copied
2021-11-24 10:31:42 +10:00
teor b39f4ca5aa
Shut down channels and tasks on PeerSet Drop (#3078)
* Shut down channels and tasks on PeerSet Drop

* Document all the PeerSet fields

* Close the peer set background task handle on shutdown

* Receive background tasks during shutdown

Also, split receiving and polling background tasks into separate methods.
2021-11-22 22:29:34 -03:00
dependabot[bot] 1d14032b10
Bump tower from 0.4.10 to 0.4.11 (#3081)
Bumps [tower](https://github.com/tower-rs/tower) from 0.4.10 to 0.4.11.
- [Release notes](https://github.com/tower-rs/tower/releases)
- [Commits](https://github.com/tower-rs/tower/compare/tower-0.4.10...tower-0.4.11)

---
updated-dependencies:
- dependency-name: tower
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-22 06:26:39 +10:00
Pili Guerra 26b3a50e01
Updates for zebra v1.0.0-beta.1 release (#3073)
* Update versions for zebra v1.0.0-beta.1 release

* Adding original PR list for comparison and tracking as PRs merge

* First pass at categorising changes

* Merge and clarify description of related changes

* Remove or merge trivial changes

* Improve change descriptions

* Add new PRs merged

* CHANGELOG: Improve release summary

* CHANGELOG: categorise changes further

* README: Remove resolved issues and items

* Update CHANGELOG.md

Co-authored-by: teor <teor@riseup.net>

* CHANGELOG: Add new PRs merged

* CHANGELOG: Move change category

* CHANGELOG: Update release date ready for tagging

Co-authored-by: teor <teor@riseup.net>
2021-11-19 13:05:11 +01:00
teor 3fc049e2eb
Implement graceful shutdown for the peer set (#3071)
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-11-18 13:28:25 +00:00
teor c4118dcc2c
Check for panics in the address book updater task (#3064)
* Check for panics in the address book updater task

* Fix the return type and tests

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-11-18 12:34:51 +00:00
dependabot[bot] b33ffc9df8
Bump tokio from 1.13.0 to 1.14.0 (#3062)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.13.0 to 1.14.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/commits)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-17 09:27:01 +10:00
teor 7457edcb86
Stop asking users to report peer errors, fix a common peer error (#3054)
* Stop treating inv with mixed item types as a connection error

* Remove unused connection errors

* Stop asking users to create bug reports for peer errors
2021-11-15 11:32:18 -03:00
Dimitris Apostolou afb8b3d477
Fix typos (#3055)
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-11-12 19:30:22 +00:00
teor d6f3b3dc9a
Parse received addrv2 messages (#3022)
* Revert "Remove commented-out code"

This reverts commit 9e69777925f103ee11e5940bba95b896c828839b.

* Implement deserialization for `addrv2` messages

* Limit addr and addrv2 messages to MAX_ADDRS_IN_MESSAGE

* Clarify address version comments

* Minor cleanups and fixes

* Add preallocation tests for AddrV2

* Add serialization tests for AddrV2

* Use prop_assert in AddrV2 proptests

* Use a generic utility method for deserializing IP addresses in `addrv2`

* Document the purpose of a conversion to MetaAddr

* Fix a comment typo, and clarify that comment

* Clarify the unsupported AddrV2 network ID error and enum variant names

```sh
fastmod AddrV2UnimplementedError UnsupportedAddrV2NetworkIdError zebra-network
fastmod Unimplemented Unsupported zebra-network
```

* Fix and clarify unsupported AddrV2 comments

* Replace `panic!` with `unreachable!`

* Clarify a comment about skipping a length check in a test

* Remove a redundant test

* Basic addr (v1) and addrv2 deserialization tests

* Test deserialized IPv4 and IPv6 values in addr messages

* Remove redundant io::Cursor

* Add comments with expected values of address test vectors
2021-11-12 00:25:23 +00:00
Janito Vaqueiro Ferreira Filho 11b5a33651
Security: Avoid reconnecting to peers that are likely unreachable (#3030)
* Add a `Duration32::from_days` constructor

Make it simpler to construct a `Duration32` representing a certain
number of days.

* Add `MetaAddr::was_not_recently_seen` method

A helper method to check if a peer was never seen before or if it was
last seen a long time ago. This will be one of the conditions to
consider a peer as unreachable.

* Add `MetaAddr::is_probably_unreachable` method

A helper method to check if a peer should be considered unreachable. It
is considered unreachable if recent connection attempts have failed and
it was not recently seen.

If a peer is considered unreachable, Zebra shouldn't attempt to connect
to it again.

* Do not keep trying to connect to unreachable peer

A peer is probably unreachable if it was last seen a long time ago and
if it's last connection attempt failed.

* Test `was_not_recently_seen`

Redo the calculation on arbitrary `MetaAddr`s.

* Test `is_probably_unreachable`

Redo the calculation on arbitrary `MetaAddr`s.

* Test if probably unreachable peers are ignored

Given an `AddressBook` with a list of arbitrary `MetaAddr`s, check that
none of the peers listed for a reconnection is probably unreachable.

* Rename unit test to improve clarity

Remove the double negative from the name.

Co-authored-by: teor <teor@riseup.net>

* Rename constant to `MAX_RECENT_PEER_AGE`

Make the purpose of the constant clearer.

Co-authored-by: teor <teor@riseup.net>

* Rename method to `last_seen_is_recent`

Remove the double negative from the name.

* Rename method to `is_probably_reachable`

Avoid having to negate the result of the method in security critical
filter.

* Move check into `is_ready_for_connection_attempt`

Make sure the check is used in any place that requires a peer that's
ready for a connection attempt.

* Improve test documention

Describe the goal of the test better.

Co-authored-by: teor <teor@riseup.net>

* Improve `is_probably_reachable` documentation

List the conditions as bullet points.

Co-authored-by: teor <teor@riseup.net>

* Document what happens when peers have no last seen time

Co-authored-by: teor <teor@riseup.net>
2021-11-10 23:51:22 +00:00
teor c0c00b3f0d
Simplify preallocate tests (#3032)
* Simplify preallocation tests using a test function

* Use prop_assert in proptests
2021-11-11 07:53:21 +10:00
teor 85b016756d
Refactor addr v1 serialization using a separate AddrV1 type (#3021)
* Implement addr v1 serialization using a separate AddrV1 type

* Remove commented-out code

* Split the address serialization code into modules

* Reorder v1 and in_version fields in serialization order

* Fix a missed search-and-replace

* Explain conversion to MetaAddr

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-11-10 06:47:50 +10:00
teor af2baa0a5e
Avoid listener address conflicts in network tests (#3031)
These conflicts can make some tests fail if they run in parallel.
Failures are more likely on machines with lots of cores.
2021-11-08 11:20:13 -03:00
teor 7d8240fac3
Fix verbose add_initial_peers logs (#3019)
And update some function docs.
2021-11-07 22:21:51 +00:00
teor 88cdf0f832
Move MetaAddr serialization into zebra_network::protocol::external (#3020)
Preparation for `addrv2` serialization.
2021-11-07 21:37:38 +00:00
Marek d03161c63f
Add unused seed peers to the AddressBook (#2974)
* Add unused seed peers to the AddressBook

* Document a new `await`

We added an extra await on the AddressBook thread mutex.

Co-authored-by: teor <teor@riseup.net>

* Fix a typo

* Refactor names

* Return early from `limit_initial_peers`

* Add `proptest`s regressions

* Return `MetaAddr` instead of `None`

* Test if `zebra_network::init()` deadlocks

* Remove unneeded regressions

* Rename `TimestampCollector` to `AddressBookUpdater` (#2992)

* Rename `TimestampCollector` to `AddressBookUpdater`

* Update comments

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>

* Move `all_peers` instead of copying them

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>

* Make `Duration` a const

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>

* Use a timeout instead of measuring the elapsed time

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>

* Copy `initial_peers` instead of moving them

* Refactor the position of `NewInitial` and `new_initial`

Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-11-04 08:34:00 -03:00
Janito Vaqueiro Ferreira Filho 0960e4fb0b
Update to Tokio 1.13.0 (#2994)
* Update `tower` to version `0.4.9`

Update to latest version to add support for Tokio version 1.

* Replace usage of `ServiceExt::ready_and`

It was deprecated in favor of `ServiceExt::ready`.

* Update Tokio dependency to version `1.13.0`

This will break the build because the code isn't ready for the update,
but future commits will fix the issues.

* Replace import of `tokio::stream::StreamExt`

Use `futures::stream::StreamExt` instead, because newer versions of
Tokio don't have the `stream` feature.

* Use `IntervalStream` in `zebra-network`

In newer versions of Tokio `Interval` doesn't implement `Stream`, so the
wrapper types from `tokio-stream` have to be used instead.

* Use `IntervalStream` in `inventory_registry`

In newer versions of Tokio the `Interval` type doesn't implement
`Stream`, so `tokio_stream::wrappers::IntervalStream` has to be used
instead.

* Use `BroadcastStream` in `inventory_registry`

In newer versions of Tokio `broadcast::Receiver` doesn't implement
`Stream`, so `tokio_stream::wrappers::BroadcastStream` instead. This
also requires changing the error type that is used.

* Handle `Semaphore::acquire` error in `tower-batch`

Newer versions of Tokio can return an error if the semaphore is closed.
This shouldn't happen in `tower-batch` because the semaphore is never
closed.

* Handle `Semaphore::acquire` error in `zebrad` test

On newer versions of Tokio `Semaphore::acquire` can return an error if
the semaphore is closed. This shouldn't happen in the test because the
semaphore is never closed.

* Update some `zebra-network` dependencies

Use versions compatible with Tokio version 1.

* Upgrade Hyper to version 0.14

Use a version that supports Tokio version 1.

* Update `metrics` dependency to version 0.17

And also update the `metrics-exporter-prometheus` to version 0.6.1.
These updates are to make sure Tokio 1 is supported.

* Use `f64` as the histogram data type

`u64` isn't supported as the histogram data type in newer versions of
`metrics`.

* Update the initialization of the metrics component

Make it compatible with the new version of `metrics`.

* Simplify build version counter

Remove all constants and use the new `metrics::incement_counter!` macro.

* Change metrics output line to match on

The snapshot string isn't included in the newer version of
`metrics-exporter-prometheus`.

* Update `sentry` to version 0.23.0

Use a version compatible with Tokio version 1.

* Remove usage of `TracingIntegration`

This seems to not be available from `sentry-tracing` anymore, so it
needs to be replaced.

* Add sentry layer to tracing initialization

This seems like the replacement for `TracingIntegration`.

* Remove unnecessary conversion

Suggested by a Clippy lint.

* Update Cargo lock file

Apply all of the updates to dependencies.

* Ban duplicate tokio dependencies

Also ban git sources for tokio dependencies.

* Stop allowing sentry-tracing git repository in `deny.toml`

* Allow remaining duplicates after the tokio upgrade

* Use C: drive for CI build output on Windows

GitHub Actions uses a Windows image with two disk drives, and the
default D: drive is smaller than the C: drive. Zebra currently uses a
lot of space to build, so it has to use the C: drive to avoid CI build
failures because of insufficient space.

Co-authored-by: teor <teor@riseup.net>
2021-11-02 18:46:57 +00:00
teor 938376f11f
Disable an unreliable test on macOS (#2997)
We keep the test active on other platforms,
because it passes on them.

Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-11-02 14:11:39 +00:00
Janito Vaqueiro Ferreira Filho a9f1c189d9
Make `services` field in `MetaAddr` optional (#2976)
* Use `prop_assert` instead of `assert`

Otherwise the test input isn't minimized.

* Split long string into a multi-line string

And add some newlines to try to improve readability.

* Fix referenced issue number

They had a typo in their number.

* Make peer services optional

It is unknown for initial peers.

* Fix `preserve_initial_untrusted_values` test

Now that it's optional, the services field can be written to if it was
previously empty.

* Fix formatting of property tests

Run rustfmt on them.

* Restore `TODO` comment

Make it easy to find planned improvements in the code.

Co-authored-by: teor <teor@riseup.net>

* Comment on how ordering is affected

Make it clear that missing services causes the peer to be chosen last.

Co-authored-by: teor <teor@riseup.net>

* Don't expect `services` to be available

Avoid a panic by using the compiler to help enforce the handling of the
case correctly.

* Panic if received gossiped address has no services

All received gossiped addresses have services. The only addresses that
don't have services configured are the initial seed addresses.

Co-authored-by: teor <teor@riseup.net>
2021-11-02 02:45:35 +00:00
Conrado Gouvea e54917ae7c
V1.0.0-beta.0 (#2973)
* V1.0.0-beta.0

* Bump version in install.md
2021-10-29 20:21:26 +00:00
Alfredo Garcia 07610feef3
Reduce outgoing peers demand (#2969)
* reduce demand

* use `saturating_sub`
2021-10-29 16:29:52 +00:00
Alfredo Garcia 3402c1d8a2
Add user agent metrics (#2957)
* add remote peer user agent metrics

* add user agent to obsolete peers
2021-10-28 19:23:09 +00:00
teor f26a60b801
Limit the number of inbound peer connections (#2961)
* Limit open inbound connections based on the config

* Log inbound connection errors at debug level

* Test inbound connection limits

* Use clone directly in function call argument lists

* Remove an outdated comment

* Update tests to use an unbounded channel rather than mem::forget

And rename some variables.

* Use a lower limit in a slow test and require that it is exceeded
2021-10-28 01:49:31 +00:00
Conrado Gouvea 8d01750459
Rate-limit initial seed peer connections (#2943)
* Rate-limit initial seed peer connections

* Revert "Rate-limit initial seed peer connections"

This reverts commit f779a1eb9e1ffd5497e96dfe8ae1be4340d5d24d.

* Simplify logic

* Avoid cooperative async task starvation in the peer crawler and listener

If we don't yield in these loops, they can run for a long time before
tokio forces them to yield.

* Add test

* Check for task panics in initial peers test

* Remove duplicate code in rebase

Co-authored-by: teor <teor@riseup.net>
2021-10-27 23:46:43 +00:00
teor 3e03d48799
Limit the number of outbound peer connections (#2944)
* Limit the number of outbound connections in the crawler

* Make zebra-network channel bounds depend on config.peerset_initial_target_size

* Bias Zebra towards outbound connections

And turn connection limits into `Config` methods.

* Downgrade some connection logs to debug

* Remove verbose or outdated fields in tracing logs

* Clarify connection limits

Includes:
- `fastmod OUTBOUND_PEER_BIAS_FRACTION OUTBOUND_PEER_BIAS_DENOMINATOR zebra*`
- clarify connection limit documentation

* Clarify inventory channel capacity

* Add zebra_network::initialize tests with limited numbers of peers

* Avoid cooperative async task starvation in the peer crawler and listener

If we don't yield in these loops, they can run for a long time before
tokio forces them to yield.

* Test the crawler with small connection limits

And use the multi-threaded runtime to avoid long hangs.

* Stop using the multi-threaded executor in tests where it's not needed

* Avoid starvation for every connection

Adds yields after inbound successes and initial peer connections.

* Add a crawler peer connection success test

* Add outbound connection limit tests

* Improve outbound tests
2021-10-27 21:28:51 +00:00
teor c2734f5661
Simplify calling `add_initial_peers` (#2945)
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-10-25 20:16:35 +00:00
Janito Vaqueiro Ferreira Filho 2a1d4281c5
Manually pin `Sleep` futures (#2914)
* Wrap `Sleep` timer in a `Pin<Box<_>>`

The `Sleep` type doesn't implement `Unpin` in newer versions of Tokio.

* Wrap `Sleep` type in a `Pin<Box<_>>`

In newer Tokio versions the `Sleep` type doesn't implement `Unpin`, so
it needs to be manually pinned.
2021-10-22 16:06:03 -03:00
teor 67327ac462
Downgrade some less interesting info-level logs to debug (#2938)
There are a lot of these messages when Zebra starts up.
They might be slowing down CI and causing timeouts.

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-10-22 02:11:09 +00:00
teor 424edfa4d9
Improve documentation and types in the PeerSet (#2925)
* Replace some unit tuples with named unit structs

This helps distinguish generic channels and make them type-safe.

Also tidy imports and documentation in `peer_set::set`.

* Link to the tower balance crate from docs

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-10-22 01:26:04 +00:00
Alfredo Garcia ad5f5ff24a
Rate limit the amount of inbound connections (#2928)
* add sleep to `accept_inbound_connections()`

* Expand docs

* Expand comments again

Co-authored-by: teor <teor@riseup.net>
2021-10-22 00:35:34 +00:00
Alfredo Garcia 2de93bba8e
Limit the number of initial peers (#2913)
* limit the number of initial peers

* Move more code out of zebra_network::initialize

* Always limit the number of initial peers in the Config

This way, we can never get the unused peers out.

* Revert "Always limit the number of initial peers in the Config"

This reverts commit 81ede597c885de7289ce19faa3c21a3439dd293c.

Actually, this doesn't work, because we want those extra peers.

* Minor tweaks

Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
Co-authored-by: teor <teor@riseup.net>
2021-10-21 23:04:46 +00:00
teor 4cdd12e2c4
Track the number of active inbound and outbound peer connections (#2912)
* Count the number of active inbound and outbound peer connections

And reduce the count when each connection fails.

* Fix a comment typo

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-10-21 21:36:42 +00:00
Janito Vaqueiro Ferreira Filho 39ed7d70d3
Use single thread Tokio runtime for tests (#2916)
Newer versions of Tokio panic if `tokio::time::pause()` is called from a
multi-thread executor, and `#[tokio::test]` defaults to a single thread
runtime, so it makes sense to always use a single thread runtime in all
tests.
2021-10-21 16:22:12 +00:00