Zebra

Commit Graph

Author	SHA1	Message	Date
teor	e06705ed81	Only reject pending client requests when the peer has errored - Add an `ExitClient` transition, used when the internal client channel is closed or dropped, and there are no more pending requests - Ignore pending requests after an `ExitClient` transition - Reject pending requests when the peer has caused an error (the `Exit` and `ExitRequest` transitions) - Remove `PeerError::ConnectionDropped`, because it is now handled by `ExitClient`. (Which is an internal error, not a peer error.)	2021-02-19 14:11:35 -08:00
teor	9d9734ea81	rustfmt	2021-02-19 14:11:35 -08:00
Jane Lusby	5ec8d09e0d	accidental drop on mustusesender	2021-02-19 14:11:35 -08:00
Jane Lusby	6906f87ead	introduce Transition enum	2021-02-19 14:11:35 -08:00
Jane Lusby	e6cb20e13f	leverage return value for propagating errors	2021-02-19 14:11:35 -08:00
teor	983e94f9e4	Add a TODO for inbound error handling cleanup	2021-02-03 08:32:10 +10:00
teor	b551d81f8d	Explain why we stay connected on Inbound errors We might be syncing using this peer, so it's ok to just ignore any internal errors in their Inbound requests, and drop the request.	2021-01-27 12:08:49 -08:00
teor	05fff8e6f7	Revert "Stop panicking when fail_with is called twice on a connection" But keep the extra error information.	2021-01-18 00:23:36 -05:00
teor	4fe81da953	Improve logging for connection state errors	2021-01-18 00:23:36 -05:00
teor	a6c1cd3c35	Stop panicking when fail_with is called twice on a connection We can't rule out the connection state changing between the state checks and any eventual failures, particularly in the presence of async code. So we turn this panic into a warning.	2021-01-18 00:23:36 -05:00
teor	44c8fafc29	Stop processing the request after failing an overloaded connection zebra-network's Connection expects that `fail_with` is only called once per connection, but the overload handling code continues to process the current request after an overload error, potentially leading to further failures. Closes #1599	2021-01-18 00:23:36 -05:00
teor	b7d0a40ee1	Revert unused instrument macros Reverts most of "Instrument some functions to try to locate the panic"	2021-01-06 13:07:23 -08:00
teor	6d3aa0002c	Ensure received client request oneshots are used via the type system The `peer::Client` translates `Request`s into `ClientRequest`s, which it sends to a background task. If the send is `Ok(())`, it will assume that it is safe to unconditionally poll the `Receiver` tied to the `Sender` used to create the `ClientRequest`. We enforce this invariant via the type system, by converting `ClientRequest`s to `InProgressClientRequest`s when they are received by the background task. These conversions are implemented by `ClientRequestReceiver`. Changes: * Revert `ClientRequest` so it uses a `oneshot::Sender` * Add `InProgressClientRequest`, which is the same as `ClientRequest`, but has a `MustUseOneshotSender` * `impl From<ClientRequest> for InProgressClientRequest` * Add a new `ClientRequestReceiver` type that wraps a `mpsc::Receiver<ClientRequest>` * `impl Stream<InProgressClientRequest> for ClientRequestReceiver`, converting the successful result of `inner.poll_next_unpin` into an `InProgressClientRequest` * Replace `client_rx: mpsc::Receiver<ClientRequest>` in `Connection` with the new `ClientRequestReceiver` type * `impl From<mpsc::Receiver<ClientRequest>> for ClientRequestReceiver`	2021-01-06 13:07:23 -08:00
teor	3e711ccc8a	Instrument some functions to try to locate the panic	2021-01-06 13:07:23 -08:00
teor	fa29fca917	Panic when must-use senders are dropped before use Add a MustUseOneshotSender, which panics if its inner sender is unused. Callers must call `send()` on the MustUseOneshotSender, or ensure that the sender is canceled. Replaces an unreliable panic in `Client::call()` with a reliable panic when a must-use sender is dropped.	2021-01-06 13:07:23 -08:00
teor	b03809ebe3	Add the invalid state to an unreachable panic message	2021-01-06 13:07:23 -08:00
teor	86136c7b5c	Stop ignoring errors when the new state is AwaitingRequest The previous code would send a Nil message on the Sender, even if the result was actually an error.	2021-01-06 13:07:23 -08:00
teor	da5084a10a	Split the 3-level match using a temporary	2021-01-06 13:07:23 -08:00
teor	fd23c46726	Remove a redundant fmt::Display bound	2021-01-06 13:07:23 -08:00
teor	3892894ffa	Call ClientRequest.tx.send() even if there is an error Previously, tx would be dropped before send if: - the success case would have used tx to wait for further messages, - but the response was actually an error. Instead, send the error on `tx` and call `fail_with()` using the same error. To support this change, allow `fail_with()` to take a `PeerError` or a `SharedPeerError`.	2021-01-06 13:07:23 -08:00
teor	28f3186182	Mark ClientRequest and State::AwaitingResponse as must_use	2021-01-06 13:07:23 -08:00
teor	b1f14f47c6	Rewrite GetData handling to match the zcashd implementation (#1518) * Rewrite GetData handling to match the zcashd implementation `zcashd` silently ignores missing blocks, but sends found transactions followed by a `NotFound` message: `e7b425298f/src/main.cpp (L5497)` This is significantly different to the behaviour expected by the old Zebra connection state machine, which expected `NotFound` for blocks. Also change Zebra's GetData responses to peer request so they ignore missing blocks. * Stop hanging on incomplete transaction or block responses Instead, if the peer sends an unexpected block, unexpected transaction, or NotFound message: 1. end the request, and return a partial response containing any items that were successfully received 2. if none of the expected blocks or transactions were received, return an error, and close the connection	2021-01-04 13:25:35 +10:00
Henry de Valence	18cf5e0249	network: use short Display for Message in spans This makes the span data more compact (e.g., `msg_as_req{msg=block}`) and restores the Debug impl for Message to show all of the data contained in the message. The full message is added as a single event at trace level in the span to preserve the previous full-inspectability.	2020-12-01 19:16:41 -08:00
Henry de Valence	add94c1c45	deps: move to tokio 0.3, tower 0.4 This change is mostly mechanical, with the exception of the changes to the `tower-batch` middleware. This middleware was adapted from `tower::buffer`, and the `tower::buffer` code was changed to implement its own bounded queue, because Tokio 0.3 removed the `mpsc::Sender::poll_send` method. See `ddc64e8d4d` for more context on the Tower changes. To match Tower as closely as possible in order to be able to upstream `tower-batch`, those changes are copied from `tower::Buffer` to `tower-batch`.	2020-11-20 10:08:16 -08:00
Henry de Valence	8e709bfa88	network: don't fail on unsolicited messages These messages might be unsolicited, or they might be a response to a request we already canceled. So don't fail the whole connection, just drop the message and move on.	2020-10-26 12:05:35 -07:00
Henry de Valence	13daefa729	network: handle request cancellation in Connection We handle request cancellation in two places: before we transition into the AwaitingResponse state, and while we are in AwaitingResponse. We need both places, or else if we started processing a request, we wouldn't process the cancellation until the timeout elapsed. The first is a check that the oneshot is not already canceled. For the second, we wait on a cancellation, either from a timeout or from the tx channel closing.	2020-10-26 12:05:35 -07:00
teor	1e97691fc8	Fix some "needless lifetime" clippy lints These lints seem to be new in clippy nightly.	2020-10-12 08:54:23 +10:00
Henry de Valence	b72c249b96	network: add a metric+warning when shedding load	2020-09-21 09:26:39 -07:00
Henry de Valence	4df5632752	network: handle Message::NotFound as a response This cleans up the response processing logic a little bit along the way, but the overall division of responsibility should be better documented in a future commit.	2020-09-20 10:21:18 -07:00
Henry de Valence	64905563d1	network: remove glob import in message-handling This clarifies which parts are the handler state and which parts are the incoming message.	2020-09-20 10:21:18 -07:00
Henry de Valence	9c021025a7	network: fill in remaining request/response pairs	2020-09-20 10:21:18 -07:00
Henry de Valence	b289cb9164	network: clean up GetHeaders, GetBlocks modeling	2020-09-20 10:21:18 -07:00
Henry de Valence	3c993f33b1	network: add PeerError::WrongMessage This lets us distinguish between cases where the message was unsupported (e.g., BIP11 messages), and cases where the message was uninterpretable in context (e.g., unsolicited messages).	2020-09-20 10:21:18 -07:00
Henry de Valence	430176dd0d	network: clean up message-as-request translation	2020-09-20 10:21:18 -07:00
Henry de Valence	1d3892e1dc	network: rename alias to BoxError This is shorter and consistent with Tower (which is why we use it in the first place).	2020-09-18 18:34:25 -07:00
Henry de Valence	3f150eb16e	network: implement transaction request handling. (#1016) This commit makes several related changes to the network code: - adds a `TransactionsByHash(HashSet<transaction::Hash>)` request and `Transactions(Vec<Arc<Transaction>>)` response pair that allows fetching transactions from a remote peer; - adds a `PushTransaction(Arc<Transaction>)` request that pushes an unsolicited transaction to a remote peer; - adds an `AdvertiseTransactions(HashSet<transaction::Hash>)` request that advertises transactions by hash to a remote peer; - adds an `AdvertiseBlock(block::Hash)` request that advertises a block by hash to a remote peer; Then, it modifies the connection state machine so that outbound requests to remote peers are handled properly: - `TransactionsByHash` generates a `getdata` message and collects the results, like the existing `BlocksByHash` request. - `PushTransaction` generates a `tx` message, and returns `Nil` immediately. - `AdvertiseTransactions` and `AdvertiseBlock` generate an `inv` message, and return `Nil` immediately. Next, it modifies the connection state machine so that messages from remote peers generate requests to the inbound service: - `getdata` messages generate `BlocksByHash` or `TransactionsByHash` requests, depending on the content of the message; - `tx` messages generate `PushTransaction` requests; - `inv` messages generate `AdvertiseBlock` or `AdvertiseTransactions` requests. Finally, it refactors the request routing logic for the peer set to handle advertisement messages, providing three routing methods: - `route_p2c`, which uses p2c as normal (default); - `route_inv`, which uses the inventory registry and falls back to p2c (used for `BlocksByHash` or `TransactionsByHash`); - `route_all`, which broadcasts a request to all ready peers (used for `AdvertiseBlock` and `AdvertiseTransactions`).	2020-09-08 10:16:29 -07:00
Henry de Valence	103b663c40	chain: rename BlockHeight to block::Height	2020-08-17 11:46:34 -07:00
Henry de Valence	61dea90e2f	chain: rename BlockHeaderHash to block::Hash This is the first in a sequence of changes that change the block:: items to not include Block as a prefix in their name, in accordance with the Rust API guidelines.	2020-08-17 11:46:34 -07:00
Henry de Valence	4a41c9254d	network: avoid panic when shutting down cleanly. When the connection sees the client_rx channel close it knows it will never get any more requests, and it should terminate. But instead of terminating, it errored itself, and the method to error itself tries to pull all the outstanding client requests from the channel in order to fail them before it shuts down. This results in reading from a closed channel, causing a panic. Instead we return cleanly rather than failing (since we know there are no outstanding requests, as the channel is closed).	2020-07-22 18:04:45 +10:00
Henry de Valence	fcd2f43f39	network: add warning to connection handling code.	2020-07-09 11:15:06 -07:00
Henry de Valence	217c25ef07	network: propagate tracing Spans through peer connection	2020-07-09 11:15:06 -07:00
Jane Lusby	4b9e4520ce	cleanup API for arc based error type (#469) Co-authored-by: Jane Lusby <jane@zfnd.org>	2020-06-12 11:29:42 -07:00
Jane Lusby	9bcda0f9c7	Wrap Blocks in Arc throughout codebase	2020-06-05 00:36:55 -04:00
Jane Lusby	b6b35364f3	cleanup warnings throughout codebase	2020-05-27 15:42:29 -04:00
Henry de Valence	7049f9d891	Add a FindBlocks request to get initial block hashes. Bitcoin does this either with `getblocks` (returns up to 500 following block hashes) or `getheaders` (returns up to 2000 following block headers, not just hashes). However, Bitcoin headers are much smaller than Zcash headers, which contain a giant Equihash solution block, and many Zcash blocks don't have many transactions in them, so the block header is often similarly sized to the block itself. Because we're aiming to have a highly parallel network layer, it seems better to use `getblocks` to implement `FindBlocks` (which is necessarily sequential) and parallelize the processing of the block downloads.	2020-02-14 18:23:41 -05:00
Henry de Valence	2082672b3c	Remove Response::Error. Error handling is already handled by Result; we don't need an "inner" error variant duplicating the outer one.	2020-02-10 09:03:56 -08:00
Henry de Valence	29f901add3	Rename Response::Ok to Response::Nil. This is a better name because it signals "no data in response" rather than "Ok", which is semantically mixed with `Ok/Err` of `Result`.	2020-02-10 09:03:56 -08:00
Henry de Valence	5929e05e52	Remove `PushPeers` and ignore unsolicited `addr` messages. PushPeers is more complicated to thread into the rest of our architecture (we would need to establish a data path connecting our service handling inbound requests to the network layer's auto-crawler), and since we crawl the network automatically anyways, we don't actually need to accept them in order to get updated address information. The only possible problem with this approach is that zcashd refuses to answer multiple address requests from the same connection, ostensibly for fingerprinting prevention (although it's totally happy to give exactly the same information, as long as you hang up and reconnect first, lol). It's unclear how this will interact with our design -- on the one hand, it could mean that we don't get new addr information when we ask, but on the other hand, we may have enough churn in our connection pool that this isn't a problem anyways.	2020-02-10 09:03:56 -08:00
Henry de Valence	2c0f48b587	Refactor connection logic and try a block request. Attempting to implement requests for block data revealed a problem with the previous connection logic. Block data is requested by sending a `getdata` message with hashes of the requested blocks; the peer responds with a sequence of `block` messages with the blocks themselves. However, this wasn't possible to handle with the previous connection logic, which could only convert a single Bitcoin message into a Response. Instead, we factor out the message handling logic into a Handler, which can statefully accumulate arbitrary data into a Response and signal completion. This is still pretty ugly but it does work. As a side effect, the HeartbeatNonceMismatch error is removed; because the Handler now tries to process messages until it comes to a Response, it just ignores mismatched nonces (and will eventually time out). The previous Mempool and Transaction requests were removed but could be re-added in a different form later. Also, the `Get` prefixes are removed from `Request` to tidy the name.	2020-02-10 09:03:56 -08:00
Henry de Valence	f04f4f0b98	Apply clippy fixes	2020-02-05 12:42:32 -08:00
Henry de Valence	7cc44f4fa9	Move server.rs to connection.rs and change imports.	2020-01-16 13:20:03 -05:00
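
Commit fa29fca917 above describes a `MustUseOneshotSender` that panics if it is dropped before `send()` is called. The sketch below illustrates that idea only; it is not Zebra's implementation. It assumes a standard-library `mpsc` channel in place of the oneshot channel the commit refers to, and the names `MustUseSender` and `must_use_channel` are hypothetical. It also omits the "sender is canceled" escape hatch mentioned in the commit message, because a std `Sender` cannot observe cancellation without sending.

```rust
use std::sync::mpsc::{channel, Receiver, SendError, Sender};

/// Hypothetical wrapper: panics on drop unless `send` was called first.
pub struct MustUseSender<T> {
    // `None` once the inner sender has been consumed by `send`.
    inner: Option<Sender<T>>,
}

impl<T> MustUseSender<T> {
    pub fn new(tx: Sender<T>) -> Self {
        Self { inner: Some(tx) }
    }

    /// Consumes the wrapper and forwards `value` to the receiver.
    pub fn send(mut self, value: T) -> Result<(), SendError<T>> {
        let tx = self
            .inner
            .take()
            .expect("inner sender is present until `send` is called");
        tx.send(value)
        // `self` is dropped here with `inner == None`, so `drop` stays quiet.
    }
}

impl<T> Drop for MustUseSender<T> {
    fn drop(&mut self) {
        // Skip the check while already unwinding: a second panic would abort.
        if !std::thread::panicking() && self.inner.is_some() {
            panic!("MustUseSender dropped without calling send()");
        }
    }
}

/// Hypothetical helper that creates a channel whose sender must be used.
pub fn must_use_channel<T>() -> (MustUseSender<T>, Receiver<T>) {
    let (tx, rx) = channel();
    (MustUseSender::new(tx), rx)
}
```

With this sketch, `must_use_channel::<u32>()` followed by `tx.send(7)` delivers the value to the receiver as usual, while letting `tx` go out of scope unused triggers the panic in `drop`. The trade-off is that a forgotten response becomes a loud failure at the point of the drop, rather than a silent hang on the receiving side.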


101 Commits