Henry de Valence a79ce97957
Fix sync algorithm. (#887)
* checkpoint: reject older of duplicate verification requests.

If we get a duplicate block verification request, we should drop the older one
in favor of the newer one, because the older request is likely to have been
canceled.  Previously, this code would accept up to four duplicate verification
requests, then fail all subsequent ones.
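A minimal sketch of the "newer request wins" rule described above, using only the standard library. The names `PendingRequests`, `VerifyResult`, `queue`, and `complete` are hypothetical and are not taken from the checkpoint verifier; the sketch assumes one pending request per block height.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical outcome reported back to whoever asked for verification.
#[derive(Debug, PartialEq)]
enum VerifyResult {
    Verified,
    /// A newer request for the same height replaced this one.
    Replaced,
}

/// Hypothetical queue holding at most one pending request per block height.
#[derive(Default)]
struct PendingRequests {
    by_height: HashMap<u32, Sender<VerifyResult>>,
}

impl PendingRequests {
    /// Queue a request. If one is already pending at this height, the older
    /// one is rejected and the newer one takes its place.
    fn queue(&mut self, height: u32) -> Receiver<VerifyResult> {
        let (tx, rx) = channel();
        if let Some(older) = self.by_height.insert(height, tx) {
            // The older requester has probably been canceled already,
            // so a failed send is ignored.
            let _ = older.send(VerifyResult::Replaced);
        }
        rx
    }

    /// Report success to the request currently pending at `height`, if any.
    fn complete(&mut self, height: u32) {
        if let Some(tx) = self.by_height.remove(&height) {
            let _ = tx.send(VerifyResult::Verified);
        }
    }
}
```

In this sketch the map never holds more than one entry per height, so there is no duplicate count to track and no cap after which later requests fail.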

* sync: add a timeout layer to block requests.

Note that if this timeout is too short, we'll bring down the peer set in a
retry storm.

* sync: restart syncing on error

Restart the syncing process when an error occurs, rather than ignoring it.
Restarting means we discard all tips and start over with a new block locator,
so we can have another chance to "unstuck" ourselves.
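The restart behavior can be sketched as a loop that rebuilds its tip set from scratch after any error. This is an illustration only: `sync_with_restart`, the two closures standing in for the ObtainTips and ExtendTips steps, the `u64` hash type, and the `max_restarts` bound are all assumptions made so the example terminates.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a block hash.
type Hash = u64;

/// Runs sync passes until one finishes without error, restarting from
/// scratch after each failure. Returns the number of restarts needed, or
/// `None` if `max_restarts` was exhausted.
fn sync_with_restart<O, E>(
    max_restarts: u32,
    mut obtain_tips: O,
    mut extend_tips: E,
) -> Option<u32>
where
    O: FnMut() -> HashSet<Hash>,
    E: FnMut(&mut HashSet<Hash>) -> Result<(), String>,
{
    for restarts in 0..=max_restarts {
        // Starting over: tips from the failed pass were dropped, and a
        // fresh set is requested (standing in for a new block locator).
        let mut tips = obtain_tips();
        let mut failed = false;
        while !tips.is_empty() {
            if extend_tips(&mut tips).is_err() {
                // Do not ignore the error: abandon this pass entirely.
                failed = true;
                break;
            }
        }
        if !failed {
            return Some(restarts);
        }
    }
    None
}
```

A caller supplies the two steps as closures, and any error from the extend step discards the whole tip set rather than just the failing tip.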

* sync: additional debug info

* sync: handle lookahead limit correctly.

Instead of extracting all the completed task results, the previous code pulled
results out until there were fewer tasks than the lookahead limit, then
stopped.  This meant that completed tasks could be left until the limit was
exceeded again.  Instead, extract all completed results, and use the number of
pending tasks to decide whether to extend the tip or wait for blocks to finish.
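A rough sketch of the corrected lookahead handling: first drain every completed task, then decide from the number still pending. The `Task` and `Next` types, the value of `LOOKAHEAD_LIMIT`, and the strict `<` comparison are assumptions for illustration, not the syncer's actual definitions.

```rust
/// Hypothetical limit on in-flight block tasks.
const LOOKAHEAD_LIMIT: usize = 4;

/// Hypothetical task state: still running, or finished at some height.
enum Task {
    Pending,
    Done(u32),
}

#[derive(Debug, PartialEq)]
enum Next {
    ExtendTips,
    WaitForBlocks,
}

/// Removes every completed task and returns its result, regardless of how
/// many tasks remain afterwards.
fn drain_completed(tasks: &mut Vec<Task>) -> Vec<u32> {
    let mut done = Vec::new();
    tasks.retain(|task| match task {
        Task::Done(height) => {
            done.push(*height);
            false
        }
        Task::Pending => true,
    });
    done
}

/// Decides the next step from the number of tasks still pending.
fn next_step(pending: usize) -> Next {
    if pending < LOOKAHEAD_LIMIT {
        Next::ExtendTips
    } else {
        Next::WaitForBlocks
    }
}
```

Draining is unconditional here, so a finished task is never left in the queue waiting for the limit to be exceeded again.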

* network: add debug instrumentation to retry policy

* sync: instrument the spawned task

* sync: streamline ObtainTips/ExtendTips logic & tracing

This change does three things:

1.  It aligns the implementation of ObtainTips and ExtendTips so that they use
the same deduplication method.  This means that when debugging we only have one
deduplication algorithm to focus on.

2.  It streamlines the tracing output to not include information already
included in spans. Both obtain_tips and extend_tips have their own spans
attached to the events, so it's not necessary to add Scope: prefixes in
messages.

3.  It changes the messages to be focused on reporting the actual
events rather than the interpretation of the events (e.g., "got genesis hash in
response" rather than "peer could not extend tip").  The motivation for this
change is that when debugging, the interpretation of events is already known to
be incorrect, in the sense that the mental model of the code (no bug) does not
match its behavior (has bug), so presenting minimally-interpreted events forces
interpretation relative to the actual code.

* sync: hack to work around zcashd behavior

* sync: localize debug statement in extend_tips

* sync: change algorithm to define tips as pairs of hashes.

This is different enough from the existing description that its comments no
longer apply, so I removed them.  A further chunk of work is to change the sync
RFC to document this algorithm.

* sync: reduce block timeout

* state: add resource limits for sled

Closes #888

* sync: add a restart timeout constant

* sync: de-pub constants
2020-08-12 16:48:01 -07:00
.github s/infrastructure/A-infrastructure (#883) 2020-08-11 16:41:29 -07:00
book rfc: Update the RFC template to talk about testing and maintenance (#875) 2020-08-11 13:26:35 -07:00
design Update data-flow.md 2020-08-04 22:44:39 -07:00
tower-batch build(deps): bump tracing from 0.1.18 to 0.1.19 (#872) 2020-08-11 10:18:54 -07:00
tower-fallback build(deps): bump pin-project from 0.4.22 to 0.4.23 2020-07-28 17:27:25 -04:00
zebra-chain Tweak light client root hash definition. 2020-08-11 19:13:50 -04:00
zebra-client Align crate versions and user-agent with NU numbers. 2020-07-24 11:46:37 -07:00
zebra-consensus Fix sync algorithm. (#887) 2020-08-12 16:48:01 -07:00
zebra-network Fix sync algorithm. (#887) 2020-08-12 16:48:01 -07:00
zebra-rpc Align crate versions and user-agent with NU numbers. 2020-07-24 11:46:37 -07:00
zebra-script Align crate versions and user-agent with NU numbers. 2020-07-24 11:46:37 -07:00
zebra-state Fix sync algorithm. (#887) 2020-08-12 16:48:01 -07:00
zebra-test Fix sync algorithm. (#887) 2020-08-12 16:48:01 -07:00
zebra-utils build(deps): bump tracing-subscriber from 0.2.10 to 0.2.11 (#873) 2020-08-11 10:30:50 -07:00
zebrad Fix sync algorithm. (#887) 2020-08-12 16:48:01 -07:00
.firebaserc Try building internal docs. 2020-02-10 18:12:43 -08:00
.gitignore Implement sync component for start subcommand (#506) 2020-06-22 19:24:53 -07:00
CONTRIBUTING.md Reorganize the book. (#843) 2020-08-06 15:39:54 -07:00
Cargo.lock build(deps): bump tracing-subscriber from 0.2.10 to 0.2.11 (#873) 2020-08-11 10:30:50 -07:00
Cargo.toml Refactor tracing components (#834) 2020-08-06 10:29:31 -07:00
Dockerfile Put all ports EXPOSE'd on the same line 2020-06-19 03:46:09 -04:00
LICENSE-APACHE Add copyright marks on each license 2019-11-14 11:50:49 -08:00
LICENSE-MIT Add copyright marks on each license 2019-11-14 11:50:49 -08:00
README.md Reorganize the book. (#843) 2020-08-06 15:39:54 -07:00
clippy.toml Apply clippy fixes 2020-02-05 12:42:32 -08:00
cloudbuild.yaml Fix CD workflow using cloudbuild.yaml (#637) 2020-07-10 07:37:54 -04:00
codecov.yml Turn off the CodeCov PR comment, metrics available in the checks results which link to codecov.io 2020-07-24 18:18:46 -04:00
firebase.json Configure redirect for firebase hosting 2020-01-16 18:38:16 -05:00
katex-header.html Add KaTeX to rendered docs. (#832) 2020-08-05 17:34:30 -07:00
prometheus.yaml Tell Prometheus to scrape more aggressively 2020-02-14 20:14:05 -05:00

README.md


Hello! I am Zebra, an ongoing Rust implementation of a Zcash node.

Zebra is a work in progress. It is developed as a collection of zebra-* libraries implementing the different components of a Zcash node (networking, chain structures, consensus rules, etc.), and a zebrad binary which uses them.

Most of our work so far has gone into four crates: zebra-network, building a new networking stack for Zcash; zebra-chain, building foundational data structures; zebra-consensus, implementing consensus rules; and zebra-state, providing chain state.

Zebra Website.

Rendered docs from the main branch.

Join us on Discord.

License

Zebra is distributed under the terms of both the MIT license and the Apache License (Version 2.0).

See LICENSE-APACHE and LICENSE-MIT.