Commit Graph

55 Commits

Author SHA1 Message Date
teor 64f777274c
fix(security): fix concurrency issues in tree key formats, and CPU usage in genesis tree roots (#7392)
* Add tree key format and cached root upgrades

* Document the changes in the upgrades

* Remove unnecessary clippy::unwrap_in_result

* Fix database type

* Bump state version

* Skip some checks if the database is empty

* Fix tests for a short state upgrade

* Disable format checks in some tests

* Document state performance issues

* Clarify upgrade behaviour

* Clarify panic messages

* Delete incorrect genesis trees write code

* Fix metrics handling for genesis

* Remove an unused import

* Explain why genesis anchors are ok

* Update snapshots

* Debug a failing test

* Fix some tests

* Fix missing imports

* Move the state check in a test

* Fix comment and doc typos

Co-authored-by: Marek <mail@marek.onl>
Co-authored-by: Arya <aryasolhi@gmail.com>

* Clarify what a long upgrade is

* Rename unused function arguments

Co-authored-by: Marek <mail@marek.onl>

* Add all_unordered log regex matching methods

* Fix timing issues with version upgrades and other logs

* Fix argument name in docs

Co-authored-by: Marek <mail@marek.onl>

* Explain match until first for all regexes behaviour better

---------

Co-authored-by: Marek <mail@marek.onl>
Co-authored-by: Arya <aryasolhi@gmail.com>
2023-10-19 14:50:46 +00:00
teor ae52e3d23d
change(ui): Enable the progress bar feature by default, but only show progress bars when the config is enabled (#7615)
* Add a progress bar config that is disabled unless the feature is on

* Simplify the default config

* Enable the progress bar feature by default, but require the config

* Rename progress bars config to avoid merge conflicts

* Use a log file when the progress bar is activated

* Document how to configure progress bars

* Handle log files in config_tests and check config path

* Fix doc link

* Fix path check

* Fix config log matching

* Fix clippy warning

* Add tracing to config tests

* It's zebrad not zebra

* cargo fmt --all

* Update release for config file changes

* Fix config test failures

* Allow printing to stdout in a method
2023-10-12 00:25:37 +00:00
teor be5cfad07f
change(state): Prepare for in-place database format upgrades, but don't make any format changes yet (#7031)
* Move format upgrades to their own module and enum

* Launch a format change thread if needed, and shut it down during shutdown

* Add some TODOs and remove a redundant timer

* Regularly check for panics in the state upgrade task

* Only run example upgrade once, change version field names

* Increment database format to 25.0.2: add format change task

* Log the running and initial disk database format versions on startup

* Add initial disk and running state versions to cached state images in CI

* Fix missing imports

* Fix typo in logs workflow command

* Add a force_save_to_disk argument to the CI workflow

* Move use_internet_connection into zebrad_config()

* fastmod can_spawn_zebrad_for_rpc can_spawn_zebrad_for_test_type zebra*

* Add a spawn_zebrad_without_rpc() function

* Remove unused copy_state() test code

* Assert that upgrades and downgrades happen with the correct versions

* Add a kill_and_return_output() method for tests

* Add a test for new_state_format() versions (no upgrades or downgrades)

* Add use_internet_connection to can_spawn_zebrad_for_test_type()

* Fix workflow parameter passing

* Check that reopening a new database doesn't upgrade (or downgrade) the format

* Allow ephemeral to be set to false even if we don't have a cached state

* Add a test type that will accept any kind of state

* When re-using a directory, configure the state test config with that path

* Actually mark newly created databases with their format versions

* Wait for the state to be opened before testing the format

* Run state format tests on mainnet and testnet configs (no network access)

* run multiple reopens in tests

* Test upgrades run correctly

* Test that version downgrades work as expected (best effort)

* Add a TODO for testing partial updates

* Fix missing test arguments

* clippy if chain

* Fix typo

* another typo

* Pass a database instance to the format upgrade task

* Fix a timing issue in the tests

* Fix version matching in CI

* Use correct env var reference

* Use correct github env file

* Wait for the database to be written before killing Zebra

* Use correct workflow syntax

* Version changes aren't always upgrades

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2023-07-13 21:36:15 +00:00
teor 1461c912f9
change(ci): Generate mainnet checkpoints in CI (#6550)
* Add extra test type modes to support zebra-checkpoints

* Add Mainnet and Testnet zebra-checkpoints test harnesses

* Add zebra-checkpoints to test docker images

* Add zebra-checkpoints test entrypoints

* Add Mainnet CI workflow for zebra-checkpoints

* Enable zebra-checkpoints feature in the test image

* Use the same features for (almost) all the docker tests

* Make workflow features match Docker features

* Add a feature note

* Add a zebra-checkpoints test feature to zebrad

* Remove the "no cached state" testnet code

* Log a startup message to standard error when launching zebra-checkpoints

* Rename tests to avoid partial name conflicts

* Fix log formatting

* Add sentry feature to experimental docker image build

* Explain what ENTRYPOINT_FEATURES is used for

* Use the correct zebra-checkpoints path

* Silence zebrad logs while generating checkpoints

* Fix zebra-checkpoints log handling

* Re-enable waiting for zebrad to fully sync

* Add documentation for how to run these tests individually

* Start generating checkpoints from the last compiled-in checkpoint

* Fix clippy lints

* Revert changes to TestType

* Wait for all the checkpoints before finishing

* Add more stderr debugging to zebra-checkpoints

* Fix an outdated module comment

* Add a workaround for zebra-checkpoints launch/run issues

* Use temp dir and log what it is

* Log extra metadata about the zebra-checkpoints binary

* Add note about unstable feature -Z bindeps

* Temporarily make the test run faster and with debug info

* Log the original test command name when showing stdout and stderr

* Try zebra-checkpoints in the system path first, then the cargo path

* Fix slow thread close bug in dual process test harness

* If the logs are shown, don't say they are hidden

* Run `zebra-checkpoints --help` to work out what's going on in CI

* Build `zebra-utils` binaries for `zebrad` integration tests

* Revert temporary debugging changes

* Revert changes that were moved to another PR
2023-04-27 04:39:43 +00:00
teor 47073ab30a
Fix a test child process panic bug (#5842) 2022-12-12 23:19:30 +00:00
teor 09836d2800
fix(clippy): Put Rust format variables inline (#5783)
* cargo clippy --fix --all-features --all-targets

With rustc 1.67.0-nightly (234151769 2022-12-03)

* cargo fmt --all
2022-12-08 01:05:57 +00:00
teor c812f880cf
cleanup(clippy): Use inline format strings (#5489)
* Inline format strings using an automated clippy fix

```sh
cargo clippy --fix --all-features --all-targets -- -A clippy::all -W clippy::uninlined_format_args
cargo fmt --all
```

* Remove unused & and &mut using an automated clippy fix

```sh
cargo clippy --fix --all-features --all-targets -- -A clippy::all -W clippy::uninlined_format_args
```
2022-10-27 13:25:18 +00:00
teor 3d8f4e6064
fix(ci): improve test output and test reliability (#5014)
* Rename a function to prepare_block_header_and_transaction_data_batch()

* Fix formatting of test command timeouts and child process output

* Put some #[cfg()]s in the standard Rust location

* Update some test timings

* Allow code timers to be ignored
2022-09-02 08:54:40 +00:00
teor f46d0115e5
fix(test): Show full Zebra test panic details in CI logs (#4942)
* Handle test failure regexes using Result::Err, rather than panicking

* Add output logs to test context, and add tests for that

* Let empty test child logs be read again (and produce empty output)

* Ignore missing test children when killing with ignore_exited

* Fix a clippy lint

* Rename `line` to `line_result` for clarity

* Revert a redundant context_from() on kill()

* Only ignore "no such process" kill() errors in sync_until() tests

* Log the command timeout when an acceptance test fails

* fix clippy

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2022-08-28 23:52:19 +00:00
teor 89a0410e23
fix(ci): fix hangs in lightwalletd tests by checking concurrent process output in different threads (#4828)
* Make code execution time logs shorter

* Do ZK parameter preloads in the lightwalletd tests that need them

* Try to re-launch `lightwalletd` when it hangs during sync tests

* Increase full sync timeout

* Clear the `zebrad` logs during `lightwalletd` tests, to avoid logging deadlocks

* Actually clear more than one line of logs

* Check zebrad and lightwalletd output in parallel threads, while waiting for zebrad

* Check zebrad and lightwalletd output in parallel threads, while waiting for lightwalletd

* Improve test logging

* Fix a log typo

* Only wait for lightwalletd once, because its logs stop after the initial sync

* Look for cached state disks for this commit and branch first

* Only copy the state once in the send transactions test

* Wait longer for lightwalletd gRPC server startup

* Add some function docs

* cargo fmt --all
2022-07-29 07:06:18 +10:00
Alfredo Garcia 97fb85dca9
lint(clippy): add `unwrap_in_result` lint (#4667)
* `unwrap_in_result` in zebra-chain crate

* `unwrap_in_result` in zebra-script crate

* `unwrap_in_result` in zebra-state crate

* `unwrap_in_result` in zebra-consensus crate

* `unwrap_in_result` in zebra-test crate

* `unwrap_in_result` in zebra-network crate

* `unwrap_in_result` in zebra-rpc crate

* `unwrap_in_result` in zebrad crate

* rustfmt

* revert `?` and add exceptions

* explain some panics better

* move some lint positions

* replace a panic with error

* Fix rustfmt?

Co-authored-by: teor <teor@riseup.net>
2022-06-28 06:22:07 +00:00
Marek 2e50ccc8f3
fix(doc): Fix various doc warnings, part 2 (#4561)
* Fix the syntax of links in comments

* Fix a mistake in the docs

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>

* Remove unnecessary angle brackets from a link

* Revert the changes for links that serve as references

* Revert "Revert the changes for links that serve as references"

This reverts commit 8b091aa9fab453e7d3559a5d474e0879183b9bfb.

* Remove `<` `>` from links that serve as references

This reverts commit 046ef25620ae1a2140760ae7ea379deecb4b583c.

* Don't use `<` `>` in normal comments

* Don't use `<` `>` for normal comments

* Revert changes for comments starting with `//`

* Fix some warnings produced by `cargo doc`

* Fix some rustdoc warnings

* Fix some warnings

* Refactor some changes

* Fix some rustdoc warnings

* Fix some rustdoc warnings

* Resolve various TODOs

Co-authored-by: teor <teor@riseup.net>

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2022-06-14 01:22:16 +00:00
Marek b8b35f8da9
fix(doc): Fix various doc warnings, part 1 (#4514)
* Fix the syntax of links in comments

* Fix a mistake in the docs

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>

* Remove unnecessary angle brackets from a link

* Revert the changes for links that serve as references

* Revert "Revert the changes for links that serve as references"

This reverts commit 8b091aa9fab453e7d3559a5d474e0879183b9bfb.

* Remove `<` `>` from links that serve as references

This reverts commit 046ef25620ae1a2140760ae7ea379deecb4b583c.

* Don't use `<` `>` in normal comments

* Don't use `<` `>` for normal comments

* Revert changes for comments starting with `//`

* Fix some warnings produced by `cargo doc`

* Fix some rustdoc warnings

* Fix some warnings

* Refactor some changes

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2022-06-02 15:07:35 +00:00
Janito Vaqueiro Ferreira Filho c47dac8d5f
change(test): Refactor how extra arguments are handled when spawing lightwalled (#4067)
* Join similar imports

Avoid the confusion that might cause one to think that they come from
different modules or crates.

* Create an `Arguments` helper type

A type to keep track of a list of arguments for a sub-process. It makes
it easier for overriding parameters with new values.

* Create an `args!` helper macro

Make it simpler to create `Arguments` instances with known values.

* Require `Arguments` for `spawn_child` method

Change the method to have an `Arguments` parameter, and merge it with
some default values before passing them forward.

* Use `Arguments` in `spawn_lightwalletd_child`

Change the method to use an `Arguments` instance, and merge it with some
default options.

* Use `Arguments` in `spawn_child_with_command`

Require an `Arguments` instance in the `spawn_child_with_command`
extension method. Makes it simpler to call from `spawn_child` and
`spawn_lightwalletd_child` extension methods.

* Test if argument order is preserved

Check that when building an `Arguments` instance, the order that the
arguments are set is preserved in the generated list of strings.

* Refactor test to improve readability

Also separates some common code to be reused by later tests.

* Test overriding arguments

Check to see if overriding arguments behaves as expected, by keeping the
argument order when overriding and not introducing duplicates.

* Refactor test to improve readability

Move out a chunk of code so that the test itself is easier to read and
to make that code reusable by a later test.

* Test that `Arguments` instances can be merged

Merge two `Arguments` instances built from two lists of arguments, and
check that the expanded strings preserve order and override rules.

* Add Eq derives on Arguments

Co-authored-by: teor <teor@riseup.net>
2022-04-19 10:28:52 +00:00
teor ce51ad060f
fix(rpc): use the correct RPC error code for missing blocks (#3977)
* Return error code -8 for missing blocks, like zcashd does

* Test that getblock returns error code -8 for missing blocks

* Make RegexSet failures easier to read in logs

* Update lightwalletd acceptance tests for log message changes

* Add extra failure strings for lightwalletd logs

* Improve link formatting

Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>

Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
2022-03-30 23:34:52 +00:00
Alfredo Garcia f687ab947f
feat(rpc): Implement `getblockchaininfo` RPC method (#3891)
* Implement `getblockchaininfo` RPC method

* add a test for `get_blockchain_info`

* fix tohex/fromhex

* move comment

* Update lightwalletd acceptance test for getblockchaininfo RPC (#3914)

* change(rpc): Return getblockchaininfo network upgrades in height order (#3915)

* Update lightwalletd acceptance test for getblockchaininfo RPC

* Update some doc comments for network upgrades

* List network upgrades in order in the getblockchaininfo RPC

Also:
- Use a constant for the "missing consensus branch ID" RPC value
- Simplify fetching consensus branch IDs
- Make RPC type derives consistent
- Update RPC type documentation

* Make RPC type derives consistent

* Fix a confusing test comment

* get hashand height at the same time

* fix estimated_height

* fix lint

* add extra check

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>

* fix typo

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>

* split test

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>

* fix(rpc): ignore an expected error in the RPC acceptance tests (#3961)

* Add ignored regexes to test command failure regex methods

* Ignore empty chain error in getblockchaininfo

We expect this error when zebrad starts up with an empty state.

Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2022-03-25 12:25:31 +00:00
teor a5d7b9c1e0
T2. add(test): add test API that checks process logs for failures (#3899)
* Revert "Revert Option<Child> process handling"

This reverts commit 2af30086858d104dcb0ec87383996c36bcaa7371.

* Add a set of failure regexes to test command output

* Allow debug-printing TestChild again

* When the child is dropped, check any remaining output

* Document a wait_with_output edge case

* Improve failure regex panic output

* Improve builder ergonomics

* Add internal tests for failure regex panics

It would be easy to disable these panics, and never realise.

* Add some module structure TODOs

* Stop panicking if the child process has already been taken

* Add test APIs for consuming child output lines

* Fix a hang on child process drop

* Handle output being already taken in wait_with_output

And document some edge cases we don't handle yet

* Use bash's read command in the TestChild stderr test

And check the actual command we're using to see if it errors.

* Pretty print full failure regex list

* Add the test child command line to the failure regex logs
2022-03-22 23:53:24 +00:00
teor 16872f3ba6
T1. add(test): add test API that checks logs for multiple regexes (#3892)
* Make command test matching code accept generic regexes

And add generic conversions to regexes.

* Document test command structs

* Support matching multiple regexes internally in the test command

* Make it easier to call the generic regex methods

* Add a missing API usage comment

* Fix a potential hang in test child error reports

* Revert Option<Child> process handling

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2022-03-22 15:58:27 +00:00
teor cef146edbd
lint(clippy): warn on manual printing to stdout or stderr (#3767)
Most logging should use `tracing::trace!()` or `tracing::debug!()` instead.
2022-03-08 09:14:15 +00:00
teor 729535cf25
fix(test): check for zebrad test output in the correct order (#3643)
The mempool is only activated once, so we must check for that log first.
After mempool activation, the stop regex is logged at least once.
(It might be logged before as well, but we can't rely on that.)

When checking that the mempool didn't activate,
wait for the `zebrad` command to exit,
then check the entire log.
2022-02-25 23:36:20 +00:00
Dimitris Apostolou afb8b3d477
Fix typos (#3055)
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-11-12 19:30:22 +00:00
teor 3db5ecb80e
Keep reading buffered test process output after it has exited (#2923) 2021-10-20 21:40:39 -04:00
teor 0d5e5bec3c
Improve docs and panic messages for zebra_test::command (#2406) 2021-06-28 10:35:56 -03:00
teor 56ef08e385 Rewrite acceptance test matching
- Add a custom semver match for `zebrad` versions
- Prefer "line contains string" matches, so tests ignore minor changes
- Escape regex meta-characters when a literal string match is intended
- Rename test functions so they are more precise
- Rewrite match internals to remove duplicate code and enable custom matches
- Document match functions
2021-06-10 22:46:33 -04:00
teor 2f0f379a9e
Standardise clippy lints and require docs (#2238)
* Standardise lints across Zebra crates, and add missing docs

The only remaining module with missing docs is `zebra_test::command`

* Todo -> TODO

* Clarify what a transcript ErrorChecker does

Also change `Error` -> `BoxError`

* TransError -> ExpectedTranscriptError

* Output Descriptions -> Output descriptions
2021-06-04 08:48:40 +10:00
teor 895bb43ead Clippy: Fix inconsistent struct member orders lint 2021-03-01 23:31:18 -05:00
teor d4915c18e7
Fix inverted was_killed logic (#1779)
Also improve the error messages and code structure.
2021-02-20 06:26:00 +10:00
teor b0bc4a79c9 Disable conflict failure cleanup on macOS 2021-02-20 06:22:11 +10:00
teor c51fd688ee Skip node2.is_running() on Windows
`node2.is_running()` can return `true` on Windows, even though `node2`
has logged a panic. This cleanup code only runs if `node2` fails to panic
and exit as expected. So it's ok for us to skip it.

See #1781 for details.
2021-02-19 18:03:07 +10:00
teor 631fe22422 Fix conflict test node termination
On Windows, if a process is killed after it is dead, it returns `true`
for `was_killed`. Instead, check if the process is running before killing
it.

Also make the section where processes are running as short as possible,
and include context for both processes in every error.
2021-02-19 18:03:07 +10:00
teor bcb9cead88
Kill running acceptance test nodes on error (#1770)
And show the output from those nodes.

These changes help us diagnose errors that happen while one or more
acceptance test nodes are running.
2021-02-19 06:46:27 +10:00
Alfredo Garcia 4b34482264
Add hints to port conflict and lock file panics (#1535)
* add hint for port error
* add issue filter for port panic
* add lock file hint
* add metrics endpoint port conflict hint
* add hint for tracing endpoint port conflict
* add acceptance test for resource conflics
* Split out common conflict test code into a function
* Add state, metrics, and tracing conflict tests

* Add a full set of stderr acceptance test functions

This change makes the stdout and stderr acceptance test interfaces
identical.

* move Zcash listener opening
* add todo about hint for disk full
* add constant for lock file
* match path in state cache
* don't match windows cache path

* Use Display for state path logs

Avoids weird escaping on Windows when using Debug

* Add Windows conflict error messages

* Turn PORT_IN_USE_ERROR into a regex

And add another alternative Windows-specific port error

Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jane@zfnd.org>
2021-01-29 22:36:33 +10:00
Jane Lusby 99c5acc94f rename test fn 2020-11-24 11:04:30 -05:00
Jane Lusby 17fdbe941b fix stdout issue with test framework for cached data tests 2020-11-24 11:04:30 -05:00
Jane Lusby 9fd38f29b4 further make use statement consistent 2020-11-24 11:04:30 -05:00
Jane Lusby a21ddfeefc fix use cfg 2020-11-24 11:04:30 -05:00
Jane Lusby 4bfe747f34 update acceptance tests 2020-11-24 11:04:30 -05:00
teor ea510b7d41
Run a block sync in CI with 2 large checkpoints (#1193)
* Run large checkpoint sync tests in CI
* Improve test child output match error context
* Add a debug_stop_at_height config
* Use stop at height in acceptance tests

And add some restart acceptance tests, to make sure the stop at
height feature works correctly.
2020-10-27 19:25:29 +10:00
teor 17a3612b36 Remove a redundant condition in expect_stdout
When the loop exits, either the process has stopped running,
or the deadline has passed.

If the process is still running, we want to kill it.
2020-10-21 00:58:08 -04:00
teor 3e897722a6 Improve TestDirExt docs 2020-10-15 19:54:00 -04:00
teor 04ce907dbf Remove duplicate code in zebra_test::command 2020-10-15 19:54:00 -04:00
teor 32bbc19c6b Fix a timeout bug in zebra_test::command
And add tests for the command functionality.

Also document some remaining bugs (see #1140).
2020-10-15 19:54:00 -04:00
Jane Lusby a7b418bfe5
Add test for first checkpoint verification (#1018)
* add test for first checkpoint sync

Prior this this change we've not had any tests that verify our sync /
network logic is well behaved. This PR cleans up the test helper code to
make error reports more consistent and uses this cleaned up API to
implement a checkpoint sync test which runs zebrad until it reads the
first checkpoint event from stdout.

Co-authored-by: teor <teor@riseup.net>

* move include out of unix cfg

Co-authored-by: teor <teor@riseup.net>
2020-09-11 13:39:39 -07:00
Jane Lusby ffdec0cb23
Remove in-memory state service (#974)
* Remove in-memory state service

* make the config compatible with toml again

* checkpoint commit to see how much I still have to revert

* back to the starting point...

* remove unused dependency

* reorganize error handling a bit

* need to make a new color-eyre release now

* reorder again because I have problems

* remove unnecessary helpers

* revert changes to config loading

* add back missing space

* Switch to released color-eyre version

* add back missing newline again...

* improve error message on unix when terminated by signal

* add context to last few asserts in acceptance tests

* instrument some of the helpers

* remove accidental extra space

* try to make this compile on windows

* reorg platform specific code

* hide on_disk module and fix broken link
2020-09-01 12:39:04 -07:00
Ramana Venkata 991c70723a zebrad: Create zebrad.toml in acceptance tests from ZebradConfig
Fixes #929
2020-08-23 21:24:19 -04:00
Alfredo Garcia d349f2bbc2
Refactor acceptance serialized_tests (#920)
* add network listening address to default config
2020-08-20 07:48:22 +10:00
Alfredo Garcia e73f976194
Valid generated config acceptance test (#859)
* add valid generated config test

* change to pathbuf

* use -c to make sure we are using the generated file

* add and use a ZebraTestDir type

* change approach to generate tempdir in top of each test

* pass tempdir to test_cmd and set current dir to it

* add and use a `generated_config_path` variable in tests
2020-08-13 13:31:13 -07:00
Henry de Valence a79ce97957
Fix sync algorithm. (#887)
* checkpoint: reject older of duplicate verification requests.

If we get a duplicate block verification request, we should drop the older one
in favor of the newer one, because the older request is likely to have been
canceled.  Previously, this code would accept up to four duplicate verification
requests, then fail all subsequent ones.

* sync: add a timeout layer to block requests.

Note that if this timeout is too short, we'll bring down the peer set in a
retry storm.

* sync: restart syncing on error

Restart the syncing process when an error occurs, rather than ignoring it.
Restarting means we discard all tips and start over with a new block locator,
so we can have another chance to "unstuck" ourselves.

* sync: additional debug info

* sync: handle lookahead limit correctly.

Instead of extracting all the completed task results, the previous code pulled
results out until there were fewer tasks than the lookahead limit, then
stopped.  This meant that completed tasks could be left until the limit was
exceeded again.  Instead, extract all completed results, and use the number of
pending tasks to decide whether to extend the tip or wait for blocks to finish.

* network: add debug instrumentation to retry policy

* sync: instrument the spawned task

* sync: streamline ObtainTips/ExtendTips logic & tracing

This change does three things:

1.  It aligns the implementation of ObtainTips and ExtendTips so that they use
the same deduplication method.  This means that when debugging we only have one
deduplication algorithm to focus on.

2.  It streamlines the tracing output to not include information already
included in spans. Both obtain_tips and extend_tips have their own spans
attached to the events, so it's not necessary to add Scope: prefixes in
messages.

3.  It changes the messages to be focused on reporting the actual
events rather than the interpretation of the events (e.g., "got genesis hash in
response" rather than "peer could not extend tip").  The motivation for this
change is that when debugging, the interpretation of events is already known to
be incorrect, in the sense that the mental model of the code (no bug) does not
match its behavior (has bug), so presenting minimally-interpreted events forces
interpretation relative to the actual code.

* sync: hack to work around zcashd behavior

* sync: localize debug statement in extend_tips

* sync: change algorithm to define tips as pairs of hashes.

This is different enough from the existing description that its comments no
longer apply, so I removed them.  A further chunk of work is to change the sync
RFC to document this algorithm.

* sync: reduce block timeout

* state: add resource limits for sled

Closes #888

* sync: add a restart timeout constant

* sync: de-pub constants
2020-08-12 16:48:01 -07:00
Alfredo Garcia c9093e4d59
Make more checks in non server acceptance tests (#860)
* make sure no info is printed in non server tests

* check exact full output for validity instead of log msgs

* add end of output character to version regex

* use coercions, use equality operator

Co-authored-by: Jane Lusby <jlusby42@gmail.com>

Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-08-10 12:50:48 -07:00
Jane Lusby 867dd0b475
Setup tracing-flame for use profiling zebrad (#436)
* Setup tracing-flame for use profiling zebrad

* start work on conditional flamegraph generation

* review time!

* update comments

* Update Cargo.toml

* disable default features for inferno

* reorganize

* missing one trait

* Apply suggestions from code review

* graceful shutdown!

* remove special case handling on ctrlc for cleanup

* rename signal fn to better represent its responsibility

* remove unused global hook for flushing flamegraph

* move tracing logic to the right file

* just copy linkerd's signal handling logic

* update book

* make zebrad app drop on shutdown normally

* Update zebrad/src/components/tokio.rs

Co-authored-by: teor <teor@riseup.net>

* Update zebrad/src/application.rs

Co-authored-by: teor <teor@riseup.net>

* Apply suggestions from code review

Co-authored-by: teor <teor@riseup.net>

* cleanup a little

* ooh yea there's an API for that

* setup env-filter for backup subscriber

* document env filter

* document return codes

* forgot to save

* Update book/src/applications/zebrad.md

Co-authored-by: teor <teor@riseup.net>

Co-authored-by: teor <teor@riseup.net>
2020-08-05 16:35:56 -07:00