Zebra

Commit Graph

Author	SHA1	Message	Date
Henry de Valence	299afe13df	zebra-network tweaks. (#877) * network: move gossiped peer selection logic into address book. * network: return BoxService from init. * zebrad: add note on why we truncate the gossiped peer list Co-authored-by: Jane Lusby <jlusby42@gmail.com> * Remove unused .rustfmt.toml Many of these options are never actually loaded by our CI because of a channel mismatch, where they're not applied on stable but only on nightly (see the logs from a rustfmt job). This means that we can get different settings when running `cargo fmt` on the nightly and stable channels, which was causing a CI failure on this PR. Reverting back to the default rustfmt settings avoids this problem and keeps us in line with upstream rustfmt. There's no loss to us since we were using the defaults anyways. Co-authored-by: Jane Lusby <jlusby42@gmail.com>	2020-08-11 13:07:44 -07:00
Alfredo Garcia	9c387521bd	Print endpoint addresses at startup (#867) * print tracing and metrics endpoints in startup * print network address in startup	2020-08-10 12:47:26 -07:00
Henry de Valence	3d46ab746a	Clean up options in network config section. (#839) Closes #536. This removes: - the user-agent (we can add a mechanism to specify extra BIP14 components later, if any users ask us for that feature); - the EWMA parameters (these were put in the config just to avoid making a choice); - the peer connection timeout (we can change the default value if anyone ever has a problem with it); - the peer set request buffer size (setting this too low can make the application deadlock). The new peer interval is left in.	2020-08-06 11:29:00 -07:00
teor	6be0f8ed2f	fix: Warn if the listener port is for the wrong network We'll fix the underlying defaults in #660, with the rest of the listeners.	2020-07-29 16:03:52 +10:00
Henry de Valence	217c25ef07	network: propagate tracing Spans through peer connection	2020-07-09 11:15:06 -07:00
Dimitris Apostolou	ba81d7d4c0	Fix typos	2020-07-07 11:13:49 -07:00
Jane Lusby	685bdaf2df	don't require absence of cancel handles Prior to this change, we required that services that are canceled do not have a cancel handle in the `cancel_handles` list, based on the assumption that the handle must have been removed in the process of canceling this service. This doesn't hold up though, because it is currently possible for us to have the same peer connect to us multiple times: the second connect removes the cancel handle of the original connect and inserts its own cancel handle in its place. In this scenario, when the first service is polled for readiness it will see that it has been canceled and go to clean itself up, but when it asserts that it doesn't have a cancel handle it will see the cancel handle of the second connect event, which uses the same key as the first connect, and fail its debug assertion. This change removes that debug assert on the assumption that it is okay for a peer to connect multiple times consecutively, and that the correct behavior in that case is to just cancel the first connection and continue as normal.	2020-06-16 13:42:31 -07:00
Jane Lusby	431f194c0f	propagate errors out of zebra_network::init (#435) Prior to this change, the service returned by `zebra_network::init` would spawn background tasks that could silently fail, causing unexpected errors in the zebra_network service. This change modifies the `PeerSet` that backs `zebra_network::init` to store all of the `JoinHandle`s for each background task it depends on. The `PeerSet` then checks this set of futures to see if any of them have exited with an error or a panic, and if they have it returns the error as part of `poll_ready`.	2020-06-09 12:24:28 -07:00
Jane Lusby	8c178c3ee4	fix panic in seed subcommand (#401) Co-authored-by: Jane Lusby <jane@zfnd.org> Prior to this change, the seed subcommand would consistently encounter a panic in one of the background tasks, but would continue running after the panic. This is indicative of two bugs. First, zebrad was not configured to treat panics as non-recoverable and instead defaulted to the tokio defaults, which are to catch panics in tasks and return them via the join handle if available, or to print them if the join handle has been discarded. This is likely a poor fit for zebrad as an application: we do not need to maximize uptime or minimize the extent of an outage should one of our tasks / services start encountering panics. Ignoring a panic increases our risk of observing invalid state, causing all sorts of wild and bad bugs. To deal with this we've switched the default panic behavior from `unwind` to `abort`. This makes panics fail immediately and take down the entire application, regardless of where they occur, which is consistent with our treatment of misbehaving connections. The second bug is the panic itself. This was triggered by a duplicate entry in the initial_peers set. To fix this we've switched the storage for the peers from a `Vec` to a `HashSet`, which has similar properties but guarantees uniqueness of its keys.	2020-05-27 17:40:12 -07:00
Jane Lusby	b6b35364f3	cleanup warnings throughout codebase	2020-05-27 15:42:29 -04:00
Henry de Valence	3ed75cb626	Tweak peer set metrics. - Add a total peers metric to prevent races between measurements of ready/unready peers (which can cause the sum to be wrong). - Add an outbound request counter.	2020-02-21 06:48:25 -05:00
Henry de Valence	43b2d35dda	Crawl for more peers when we exhaust candidates.	2020-02-21 06:48:25 -05:00
Henry de Valence	afa2c2347f	fmt	2020-02-21 06:48:25 -05:00
Henry de Valence	00edcae0c2	Add metrics for the crawler and candidate set.	2020-02-14 20:14:05 -05:00
Henry de Valence	75d3d44fb3	Metrics MVP: add two metrics and export them to Prometheus. Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>	2020-02-14 20:14:05 -05:00
Henry de Valence	8000f888fd	Connect to multiple peers concurrently. The previous outbound peer connection logic got requests to connect to new peers and processed them one at a time, making single connection attempts and retrying if the connection attempt failed. This was quite slow, because many connections fail, and we have to wait for timeouts. Instead, this logic connects to new peers concurrently (up to 50 at a time).	2020-02-14 18:23:41 -05:00
Henry de Valence	2c0f48b587	Refactor connection logic and try a block request. Attempting to implement requests for block data revealed a problem with the previous connection logic. Block data is requested by sending a `getdata` message with hashes of the requested blocks; the peer responds with a sequence of `block` messages with the blocks themselves. However, this wasn't possible to handle with the previous connection logic, which could only convert a single Bitcoin message into a Response. Instead, we factor out the message handling logic into a Handler, which can statefully accumulate arbitrary data into a Response and signal completion. This is still pretty ugly but it does work. As a side effect, the HeartbeatNonceMismatch error is removed; because the Handler now tries to process messages until it comes to a Response, it just ignores mismatched nonces (and will eventually time out). The previous Mempool and Transaction requests were removed but could be re-added in a different form later. Also, the `Get` prefixes are removed from `Request` to tidy the name.	2020-02-10 09:03:56 -08:00
Henry de Valence	8d58dd804f	Note that tracing causes clippy false positives Thanks @hawkw for pointing this out.	2020-02-05 12:42:32 -08:00
Henry de Valence	f04f4f0b98	Apply clippy fixes	2020-02-05 12:42:32 -08:00
Henry de Valence	2965187b91	Upgrade tokio, futures, hyper to released versions.	2019-12-13 17:42:15 -05:00
Henry de Valence	36cd6d6e06	cargo fmt	2019-11-27 23:53:36 -05:00
Henry de Valence	ad6525574b	Rename PeerConnector -> peer::Connector	2019-11-27 23:53:36 -05:00
Henry de Valence	778e49b127	Rename PeerHandshake -> peer::Handshake	2019-11-27 23:53:36 -05:00
Henry de Valence	77191e62f6	Remove outdated fixup note.	2019-11-27 23:53:36 -05:00
Henry de Valence	da78603d3a	Rename `PeerClient` to `peer::Client`.	2019-11-27 23:53:36 -05:00
Henry de Valence	4fbc8270a2	Move PeerSet initialization into a submodule.	2019-11-27 05:06:01 -05:00
Henry de Valence	47ec2e2689	Remove stub discover module.	2019-10-22 19:06:08 -07:00
Henry de Valence	c3ec235a5b	Suppress unused import warnings.	2019-10-22 19:06:08 -07:00
Henry de Valence	ed2ee9d42f	Add a PeerConnector wrapper around PeerHandshake	2019-10-22 19:06:08 -07:00
Henry de Valence	e1a35490af	Move the CandidateSet to its own file. Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>	2019-10-22 19:06:08 -07:00
Henry de Valence	b1832ce593	Initial work to add a crawl-and-dial task. This responds to peerset demand by connecting to additional peers. Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>	2019-10-22 19:06:08 -07:00
Henry de Valence	5847b490da	Move PeerSet setup logic into a peer_set::init()	2019-10-18 16:11:01 -07:00
Henry de Valence	f6e62b0f5e	Remove failure from zebra-chain, zebra-network. Failure uses a distinct Fail trait rather than the standard library's Error trait, which causes a lot of interoperability problems with tower and other Error-using crates. Since failure was created, the standard library's Error trait was improved, and its conveniences are now available without the custom Fail trait using `thiserror` (for easy error derives) and `anyhow` (for a better boxed Error).	2019-10-16 13:16:52 -04:00
Henry de Valence	ae1a164ff8	Beginning of peerset implementation. (#62) * Don't expose submodules of zebra_network::peer. * PeerSet, PeerDiscover stubs. Co-authored-by: Deirdre Connolly <deirdre@zfnd.org> * Initial work on PeerSet. This is adapted from the MIT-licensed tower-balance implementation. * Use PeerSet in the connect stub.	2019-10-10 18:15:24 -07:00
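
Note on 8c178c3ee4: the message says a duplicate entry in the initial peers triggered the panic, and that storing the peers in a `HashSet` instead of a `Vec` guarantees uniqueness. The sketch below illustrates only that property, using the standard library. It is not the code from the commit: the helper name `dedup_initial_peers` and the loopback addresses are hypothetical, chosen for illustration.

```rust
use std::collections::HashSet;
use std::net::SocketAddr;

/// Hypothetical helper (not part of zebra-network): parse a list of
/// address strings and collect them into a set, so that a repeated
/// address is stored only once.
fn dedup_initial_peers(raw: &[&str]) -> HashSet<SocketAddr> {
    raw.iter()
        // Entries that do not parse as socket addresses are skipped here;
        // a real application would probably report them instead.
        .filter_map(|s| s.parse::<SocketAddr>().ok())
        .collect()
}

fn main() {
    // Placeholder addresses; the first two are deliberate duplicates.
    let raw = ["127.0.0.1:8233", "127.0.0.1:8233", "127.0.0.1:18233"];
    let peers = dedup_initial_peers(&raw);
    println!("{} unique peers from {} entries", peers.len(), raw.len());
}
```

With a `Vec`, both copies of the repeated address would be kept and dialed separately. Collecting into a `HashSet` drops the repeat at insertion time, at the cost of losing the original ordering.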

84 Commits