As a side effect of computing Merkle roots, we build a list of
transaction hashes. Instead of discarding these, add them to
PreparedBlock and FinalizedBlock so that they can be reused rather than
recomputed.
This commit adds Merkle root validation to:
1. the block verifier;
2. the checkpoint verifier.
In the first case, Bitcoin Merkle tree malleability has no effect,
because only a single Merkle tree in each malleablity set is valid (the
others have duplicate transactions).
In the second case, we need to check that the Merkle tree does not contain any
duplicate transactions.
Closes#1385Closes#906
* implement inbound `FindBlocks`
* Handle inbound peer FindHeaders requests
* handle request before having any chain tip
* Split `find_chain_hashes` into smaller functions
Add a `max_len` argument to support `FindHeaders` requests.
Rewrite the hash collection code to use heights, so we can handle the
`stop` hash and "no intersection" cases correctly.
* Split state height functions into "any chain" and "best chain"
* Rename the best chain block method to `best_block`
* Move fmt utilities to zebra_chain::fmt
* Summarise Debug for some Message variants
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
Some checks use the same blocks, so we take a copy of the block borrows
before using them. That way, we don't have to manage the position of the
iterator between checks.
Temporary fix so that Zebra's default logs support a typical workflow:
1. Developer or user runs Zebra with the default config
2. They send the logs to a terminal
3. When they see a bug, they copy-paste the last few log lines into a
bug report
This is the same change that was merged in #1373 and reverted in #1375.
We'll create a consistent logging design for Zebra in ticket #1381.
* Make debug_stop_at_height and ephemeral work together
* if `debug_stop_at_height` and `ephemeral` are set, delete the database
files after reaching the stop height
* drop or flush the database before `debug_stop_at_height` exits Zebra
This commit changes the state system and database format to track the
provenance of UTXOs, in addition to the outputs themselves.
Specifically, it tracks the following additional metadata:
- the height at which the UTXO was created;
- whether or not the UTXO was created from a coinbase transaction or
not.
This metadata will allow us to:
- check the coinbase maturity consensus rule;
- check the coinbase inputs => no transparent outputs rule;
- implement lookup of transactions by utxo (using the height to find the
block and then scanning the block) for a future RPC mechanism.
Closes#1342
This provides useful and not too noisy output at INFO level. We do an
info-level message on every block commit instead of trying to do one
message every N blocks, because this is useful both for initial block
sync as well as continuous state updates on new blocks.
This change introduces two new types:
- `PreparedBlock`, representing a block which has undergone semantic
validation and has been prepared for contextual validation;
- `FinalizedBlock`, representing a block which is ready to be finalized
immediately;
and changes the `Request::CommitBlock`,`Request::CommitFinalizedBlock`
variants to use these types instead of their previous fields.
This change solves the problem of passing data between semantic
validation and contextual validation, and cleans up the state code by
allowing it to pass around a bundle of data. Previously, the state code
just passed around an `Arc<Block>`, which forced it to needlessly
recompute block hashes and other data, and was incompatible with the
already-known but not-yet-implemented data transfer requirements, namely
passing in the Sprout and Sapling anchors computed during contextual
validation.
This commit propagates the `PreparedBlock` and `FinalizedBlock` types
through the state code but only uses their data opportunistically, e.g.,
changing .hash() computations to use the precomputed hash. In the
future, these structures can be extended to pass data through the
verification pipeline for reuse as appropriate. For instance, these
changes allow the sprout and sapling anchors to be propagated through
the state.
The behavior of a request for a UTXO from a previous block depends on
whether that block has already been submitted to the state, or not:
* if it has, the state should be able to find it and answer immediately.
* if it has not, the state should see it in a later request.
However, the previous code only checked committed blocks, not queued
blocks, so if the block containing the UTXO had already arrived but had
not been committed, it would never be scanned.
This patch fixes the problem but is a bad solution, duplicating
computation between the block verifier and the state. A better fix
follows in the next commit.
Make tracing messages more concise by omitting information already
contained in a parent span and by shortening messages. This makes them
easier to read.
## Motivation
Prior to this PR we've been using `sled` as our database for storing persistent chain data on the disk between boots. We picked sled over rocksdb to minimize our c++ dependencies despite it being a less mature codebase. The theory was if it worked well enough we'd prefer to have a pure rust codebase, but if we ever ran into problems we knew we could easily swap it out with rocksdb.
Well, we ran into problems. Sled's memory usage was particularly high, and it seemed to be leaking memory. On top of all that, the performance for writes was pretty poor, causing us to become bottle-necked on sled instead of the network.
## Solution
This PR replaces `sled` with `rocksdb`. We've seen a 10x improvement in memory usage out of the box, no more leaking, and much better write performance. With this change writing chain data to disk is no longer a limiting factor in how quickly we can sync the chain.
The code in this pull request has:
- [x] Documentation Comments
- [x] Unit Tests and Property Tests
## Review
@hdevalence
This change has two benefits:
* reduces conflicts with the sled refactor and any replacement
* allows the function to be called independently for testing
* Add internal iterator API for accessing relevant chain blocks
* get blocks from all chains in non_finalized state
* Impl FusedIterator for service::Iter
* impl ExactSizedIterator for service::Iter
* let size_hint find heights in side chains
Co-authored-by: teor <teor@riseup.net>
* Add transcript test for requests while state is empty
* Add happy path test for each query once the state is populated
* let populate logic handle out of order blocks
Prior to this PR we realized that the RFC had been drafted with the assumption that chains would be ordered from best to worst in `NonFinalizedState`. This assumption was incorrect, since `BTreeSet` only ever orders values in ascending order. This discrepancy was noticed and fixed in the code, but there were still some inconsistencies that needed to be cleaned up.
This PR updates all the incorrect or confusing comments about chain ordering in the RFC and code.
Prior to this PR `memory_state` defined and implemented functionality for three different types, `Chain`, `NonFinalizedState`, and `QueuedBlocks`. Each of these components will need a fair number of unit tests, and I realized that as its currently organized it would be difficult to organize the tests or at a glance figure out which tests are testing which components.
This PR changes the organization of `memory_state` such that each component it exports is defined in its own module. In follow up PRs each module will get its own test module, which will focus exclusively on unit tests for the item defined there-in.
- [Tracking Issue](https://github.com/ZcashFoundation/zebra/issues/1250)
* make service use both finalized and non-finalized state
* Document new functions
* add documentation to sled fns
* cleanup tip fn now that errors are gone
* rename height unwrap fn
## Motivation
The zebra-state service needs to be able to handle duplicate blocks.
## Solution
This implements changes already outlined by [The State
RFC](https://zebra.zfnd.org/dev/rfcs/0005-state-updates.html). We check for
successfully committed blocks first, since interacting with the queued blocks
struct at this point just complicates the implimentation. If the block has not
already been committed we then check if the block has already been queued, if
not we handle the block normally (normally here being the bit we already had
implemented).
## Documentation Changes
- [x] Update the state RFC to match the ways this fix departs from the design
- the main thing is that I switched the order of checking for duplicates
- [x] ~~Add newly added functions to the state rfc~~ Decided not to do this because they're minor getters that don't influence the rest of the design and aren't exposed as part of the API
- [x] Document newly added functions inline
## Testing
## Related Issues
- fixes https://github.com/ZcashFoundation/zebra/issues/1182
- tracking issue https://github.com/ZcashFoundation/zebra/issues/1049
Co-authored-by: teor <teor@riseup.net>
This commit begins the process of integrating `zcash_script` with the rest of the system for verifying scripts while syncing the block chain. It does so by adding the necessary support for looking up UTXOs from the state service and implements the first parts of the `script::Verifier` for looking up the necessary UTXOs in the state and then generating the necessary call to `zcash_script` to verify the script itself.
Co-authored-by: teor <teor@riseup.net>
* implement most of the chain functions
* implement fork
* fix outpoint handling in Chain struct
* update expect for work
* split utxo into two sets
* update the Chain definition
* remove allow attribute in zebra-state/lib.rs
* merge ChainSet type into MemoryState
* Add error messages to asserts
* export proptest impls for use in downstream crates
* add testjob for disabled feature in zebra-chain
* try to fix github actions syntax
* add module doc comment
* update RFC for utxos
* add missing header
* working proptest for Chain
* propagate back results over channel
* Start updating RFC to match changes
* implement queued block pruning
* and now it syncs wooo!
* remove empty modules
* setup config for proptests
* re-enable missing_docs lint
* update RFC to match changes in impl
* add documentation
* use more explicit variable names