Commit Graph

47 Commits

Author SHA1 Message Date
teor 6f8f4d8987
Provide recent syncer response lengths as a watch channel (#2602)
* Minimal recent sync lengths implementation

Also includes metrics and logging, to make diagnosing bugs easier.

* Add logging to check what happens when Zebra reaches the chain tip

* Add tests for recent sync lengths

- initially empty
- pruned to correct length
- newest entries go first

* Drop a redundant `/` from a Cargo.lock URL

This seems to be a nightly or beta Rust change,
but hopefully stable just accepts it.

* Use metrics histograms to avoid overwriting values

* Add detailed syncer monitoring dashboard

* Increase the recent sync length to 4

This length makes it easier to distinguish between temporary and
sustained errors/syncs.

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-08-19 23:16:16 +00:00
Janito Vaqueiro Ferreira Filho 4c4dbfe7cd
Reject connections from outdated peers (#2519)
* Simplify state service initialization in test

Use the test helper function to remove redundant code.

* Create `BestTipHeight` helper type

This type abstracts away the calculation of the best tip height based on
the finalized block height and the best non-finalized chain's tip.

* Add `best_tip_height` field to `StateService`

The receiver endpoint is currently ignored.

* Return receiver endpoint from service constructor

Make it available so that the best tip height can be watched.

* Update finalized height after finalizing blocks

After blocks from the queue are finalized and committed to disk, update
the finalized block height.

* Update best non-finalized height after validation

Update the value of the best non-finalized chain tip block height after
a new block is committed to the non-finalized state.

* Update finalized height after loading from disk

When `FinalizedState` is first created, it loads the state from
persistent storage, and the finalized tip height is updated. Therefore,
the `best_tip_height` must be notified of the initial value.

* Update the finalized height on checkpoint commit

When a checkpointed block is commited, it bypasses the non-finalized
state, so there's an extra place where the finalized height has to be
updated.

* Add `best_tip_height` to `Handshake` service

It can be configured using the `Builder::with_best_tip_height`. It's
currently not used, but it will be used to determine if a connection to
a remote peer should be rejected or not based on that peer's protocol
version.

* Require best tip height to init. `zebra_network`

Without it the handshake service can't properly enforce the minimum
network protocol version from peers. Zebrad obtains the best tip height
endpoint from `zebra_state`, and the test vectors simply use a dummy
endpoint that's fixed at the genesis height.

* Pass `best_tip_height` to proto. ver. negotiation

The protocol version negotiation code will reject connections to peers
if they are using an old protocol version. An old version is determined
based on the current known best chain tip height.

* Handle an optional height in `Version`

Fallback to the genesis height in `None` is specified.

* Reject connections to peers on old proto. versions

Avoid connecting to peers that are on protocol versions that don't
recognize a network update.

* Document why peers on old versions are rejected

Describe why it's a security issue above the check.

* Test if `BestTipHeight` starts with `None`

Check if initially there is no best tip height.

* Test if best tip height is max. of latest values

After applying a list of random updates where each one either sets the
finalized height or the non-finalized height, check that the best tip
height is the maximum of the most recently set finalized height and the
most recently set non-finalized height.

* Add `queue_and_commit_finalized` method

A small refactor to make testing easier. The handling of requests for
committing non-finalized and finalized blocks is now more consistent.

* Add `assert_block_can_be_validated` helper

Refactor to move into a separate method some assertions that are done
before a block is validated. This is to allow moving these assertions
more easily to simplify testing.

* Remove redundant PoW block assertion

It's also checked in
`zebra_state::service::check::block_is_contextually_valid`, and it was
getting in the way of tests that received a gossiped block before
finalizing enough blocks.

* Create a test strategy for test vector chain

Splits a chain loaded from the test vectors in two parts, containing the
blocks to finalize and the blocks to keep in the non-finalized state.

* Test committing blocks update best tip height

Create a mock blockchain state, with a chain of finalized blocks and a
chain of non-finalized blocks. Commit all the blocks appropriately, and
verify that the best tip height is updated.

Co-authored-by: teor <teor@riseup.net>
2021-08-08 23:52:52 +00:00
teor 74e155ff9f
Spelling: gossipped -> gossiped (#2119) 2021-05-07 13:01:11 +02:00
teor 7e2c3a2fc7 Clarify a duplicate log message 2021-04-21 23:59:29 -04:00
teor 24f1b9bad1
Document the Inbound service in the start module (#1653) 2021-01-29 22:19:06 +10:00
Alfredo Garcia 486e55104a create Downloads for Inbound 2020-11-25 10:55:44 -08:00
Henry de Valence e9c847bbd7 zebrad: avoid a borrow in the ChainSync future 2020-11-17 14:56:27 -08:00
Henry de Valence 253bab042e sync: add a concurrency limit for block downloads 2020-10-26 12:05:35 -07:00
Henry de Valence 65e0c22fbe state: don't pre-buffer the service
There's no reason to return a pre-Buffer'd service (there's no need for
internal access to the state service, as in zebra-network), but wrapping
it internally removes control of the buffer size from the caller.
2020-10-26 12:05:35 -07:00
Henry de Valence 55f46967b2 zebrad: serve blocks from Inbound service
The original version of this commit ran into

https://github.com/rust-lang/rust/issues/64552

again.  Thanks to @yaahc for suggesting a workaround (using futures combinators
to avoid writing an async block).
2020-09-18 18:34:25 -07:00
Henry de Valence 170f588ffb network: document load-shedding behavior
This was part of the original design and is described in the Connection
internals, but we never documented it externally.
2020-09-18 18:34:25 -07:00
Henry de Valence 1d0ebf89c6 zebrad: move seed command into inbound component
Remove the seed command entirely, and make the behavior it provided
(responding to `Request::Peers`) part of the ordinary functioning of the
start command.

The new `Inbound` service should be expanded to handle all request
types.
2020-09-18 18:34:25 -07:00
teor b1e1291f45 Log inbound peer requests at debug
Logging at info was a bit too verbose.

Also add a short log message.
2020-09-10 09:46:53 -07:00
Henry de Valence 9b6e66c1b9 zebrad: rename Syncer to ChainSync
This name clarifies what is being synced and avoids an agent-noun
construction.
2020-09-10 09:45:52 -07:00
Henry de Valence 0bc79686b8 zebrad: move sync into components module.
Part of #1030.
2020-09-10 09:45:52 -07:00
Jane Lusby ffdec0cb23
Remove in-memory state service (#974)
* Remove in-memory state service

* make the config compatible with toml again

* checkpoint commit to see how much I still have to revert

* back to the starting point...

* remove unused dependency

* reorganize error handling a bit

* need to make a new color-eyre release now

* reorder again because I have problems

* remove unnecessary helpers

* revert changes to config loading

* add back missing space

* Switch to released color-eyre version

* add back missing newline again...

* improve error message on unix when terminated by signal

* add context to last few asserts in acceptance tests

* instrument some of the helpers

* remove accidental extra space

* try to make this compile on windows

* reorg platform specific code

* hide on_disk module and fix broken link
2020-09-01 12:39:04 -07:00
teor 78201b456d feature: Implement checkpoint_sync for checkpoint verification
* add CheckpointList::new_up_to(limit: NetworkUpgrade)
* if checkpoint_sync is false, limit checkpoints to Sapling
* update tests for CheckpointList and chain::init
2020-08-24 15:34:46 +10:00
Jane Lusby 867dd0b475
Setup tracing-flame for use profiling zebrad (#436)
* Setup tracing-flame for use profiling zebrad

* start work on conditional flamegraph generation

* review time!

* update comments

* Update Cargo.toml

* disable default features for inferno

* reorganize

* missing one trait

* Apply suggestions from code review

* graceful shutdown!

* remove special case handling on ctrlc for cleanup

* rename signal fn to better represent its responsibility

* remove unused global hook for flushing flamegraph

* move tracing logic to the right file

* just copy linkerd's signal handling logic

* update book

* make zebrad app drop on shutdown normally

* Update zebrad/src/components/tokio.rs

Co-authored-by: teor <teor@riseup.net>

* Update zebrad/src/application.rs

Co-authored-by: teor <teor@riseup.net>

* Apply suggestions from code review

Co-authored-by: teor <teor@riseup.net>

* cleanup a little

* ooh yea there's an API for that

* setup env-filter for backup subscriber

* document env filter

* document return codes

* forgot to save

* Update book/src/applications/zebrad.md

Co-authored-by: teor <teor@riseup.net>

Co-authored-by: teor <teor@riseup.net>
2020-08-05 16:35:56 -07:00
Alfredo Garcia f2d7bb3177
Command execution tests (#690)
* add zebrad acceptance tests
* add custom command test helpers that work with kill
* add and use info event for start and seed commands
* combine conflicting tests into one test case

Co-authored-by: Jane Lusby <jane@zfnd.org>
2020-08-01 16:15:26 +10:00
teor 11090dbf91 feature: Separate Mainnet and Testnet state 2020-07-29 01:45:19 -04:00
teor 2acfcf3a90
Make the CheckpointVerifier handle partial restarts (#736)
Also put generic bounds on the BlockVerifier struct,
so we get better compilation errors.
2020-07-24 11:47:48 +10:00
teor c95c825707 fix: Lookup the genesis hash based on the network 2020-07-23 03:46:24 -04:00
teor 9b97ebbd61 feature: Choose checkpoints based on the config 2020-07-23 10:26:25 +10:00
teor 3d721a96a5 feature: Add the state config to the config file 2020-07-23 10:26:25 +10:00
teor 89ac2793d6 feature: Use ChainVerifier in the sync service 2020-07-23 10:26:25 +10:00
teor 8b5ec155f0
Consensus refactor (#629)
* Flatten consensus::verify::* to consensus::*
* Move consensus::*::tests into their own files
* Move CheckpointList into its own file
* Move Progress and Target into a types module

QueuedBlock and QueuedBlockList can stay in checkpoint.rs, because
they are tightly coupled to CheckpointVerifier.
2020-07-10 16:51:01 +10:00
Henry de Valence ff4e722cd7 sync: touch up tracing output. 2020-07-09 11:15:06 -07:00
Dimitris Apostolou ba81d7d4c0 Fix typos 2020-07-07 11:13:49 -07:00
Jane Lusby 51f6ce86ff
Implement retry policy for syncer (#551) 2020-07-01 13:35:01 -07:00
Jane Lusby 7245d91fe9
fix block downloading to be parallelized and commited via the verifier (#540) 2020-06-30 09:42:09 -07:00
Henry de Valence 21bf913b48 Revert "correctly trim and download tips (#531)"
This reverts commit e102bd5e34.
2020-06-24 12:24:37 -07:00
Jane Lusby e102bd5e34
correctly trim and download tips (#531)
* also download tips and filter tips

* dispatch all block downloads together

* tweek to match henry's changes

* switch to more intuitive match

Co-authored-by: Jane Lusby <jane@zfnd.org>
2020-06-24 15:19:34 -04:00
Jane Lusby 1c42b66a4f
Implement sync component for start subcommand (#506) 2020-06-22 19:24:53 -07:00
Jane Lusby 246e7cd2a9
Start testing out new version of `eyre` and `color-eyre` in zebra (#526)
* port to new version of eyre without generics

* correctly setup color_eyre hooks

Co-authored-by: Jane Lusby <jane@zfnd.org>
2020-06-22 15:36:23 -07:00
Jane Lusby 7f8a336b69 switch to on_disk state service for start cmd 2020-06-17 23:30:50 -07:00
Jane Lusby df18ac72c5 fix sharedpeererror to propagate tracing context 2020-06-17 14:38:26 -07:00
Jane Lusby 06fd3b2503 be more explicit with pattern in drain_requests 2020-06-16 12:04:45 -07:00
Jane Lusby b0ecd019b6 apply comments from code review 2020-06-16 12:04:45 -07:00
Jane Lusby d09c339dc5 little more cleaning 2020-06-16 12:04:45 -07:00
Jane Lusby 528fd2b5b1 add an outline of the structure of the node 2020-06-16 12:04:45 -07:00
Jane Lusby fc96a41b18 copy connect command into start command 2020-06-16 12:04:45 -07:00
Jane Lusby 431f194c0f
propagate errors out of zebra_network::init (#435)
Prior to this change, the service returned by `zebra_network::init` would spawn background tasks that could silently fail, causing unexpected errors in the zebra_network service.

This change modifies the `PeerSet` that backs `zebra_network::init` to store all of the `JoinHandle`s for each background task it depends on. The `PeerSet` then checks this set of futures to see if any of them have exited with an error or a panic, and if they have it returns the error as part of `poll_ready`.
2020-06-09 12:24:28 -07:00
Jane Lusby 18b4dbc16c
fix tracing configuration issues (#432) 2020-06-04 19:34:06 -07:00
Henry de Valence 4fcb550aa6 Fix a deadlock in TokioComponent.
The components are accessed by a lock on application state.  When some command
calls block_on to enter an async context, it obtained a write lock on the
entire application state.  This meant that if the application state were
accessed later in an async context, a deadlock would occur.  Instead the
TokioComponent holds an Option<Runtime> now, so that before calling block_on,
the caller can .take() the runtime and release the lock.  Since we only ever
enter an async context once, it's not a problem that the component is then
missing its runtime, as once we are inside of a task we can access the runtime.
2020-01-15 12:06:31 -08:00
Henry de Valence 2965187b91 Upgrade tokio, futures, hyper to released versions. 2019-12-13 17:42:15 -05:00
Deirdre Connolly 162b37fe8d Tracing endpoint (#3)
* Add a TracingConfig and some components

Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>

* Restructure, use dependency injection, initialize tracing

* Start a placeholder loop in start command

* Add hyper alpha.1, bump tokio to alpha.4

* Hello world endpoint using async/await from hyper 0.13 alpha

Also cleaned up some linter messages.

Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca>

* Update to tracing_subscriber 0.1

* fmt

* add rust-toolchain

* Remove hyper::Version import

* wip: start filter_handler impl

* Add .rustfmt.toml

* rustfmt

* Tidy up .rustfmt.toml

* Add filter reloading handling.

* bump toolchain

* Remove generated hello world acceptance tests.

These test the behaviour of the autogenerated binary and work as examples of
how to test the behaviour of abscissa binaries.  Since we don't print "Hello
World" any more, they fail, but we don't yet have replacement behaviour to add
tests for, so they're removed for now.

* Clean up config file handling with Option::and_then.
2019-09-09 13:05:42 -07:00
Henry de Valence ec363d2d41 Create workspace skeleton based on design.md 2019-08-29 14:46:54 -07:00