* Update `tower` to version `0.4.9`
Update to latest version to add support for Tokio version 1.
* Replace usage of `ServiceExt::ready_and`
It was deprecated in favor of `ServiceExt::ready`.
* Update Tokio dependency to version `1.13.0`
This will break the build because the code isn't ready for the update,
but future commits will fix the issues.
* Replace import of `tokio::stream::StreamExt`
Use `futures::stream::StreamExt` instead, because newer versions of
Tokio don't have the `stream` feature.
* Use `IntervalStream` in `zebra-network`
In newer versions of Tokio `Interval` doesn't implement `Stream`, so the
wrapper types from `tokio-stream` have to be used instead.
* Use `IntervalStream` in `inventory_registry`
In newer versions of Tokio the `Interval` type doesn't implement
`Stream`, so `tokio_stream::wrappers::IntervalStream` has to be used
instead.
* Use `BroadcastStream` in `inventory_registry`
In newer versions of Tokio `broadcast::Receiver` doesn't implement
`Stream`, so `tokio_stream::wrappers::BroadcastStream` instead. This
also requires changing the error type that is used.
* Handle `Semaphore::acquire` error in `tower-batch`
Newer versions of Tokio can return an error if the semaphore is closed.
This shouldn't happen in `tower-batch` because the semaphore is never
closed.
* Handle `Semaphore::acquire` error in `zebrad` test
On newer versions of Tokio `Semaphore::acquire` can return an error if
the semaphore is closed. This shouldn't happen in the test because the
semaphore is never closed.
* Update some `zebra-network` dependencies
Use versions compatible with Tokio version 1.
* Upgrade Hyper to version 0.14
Use a version that supports Tokio version 1.
* Update `metrics` dependency to version 0.17
And also update the `metrics-exporter-prometheus` to version 0.6.1.
These updates are to make sure Tokio 1 is supported.
* Use `f64` as the histogram data type
`u64` isn't supported as the histogram data type in newer versions of
`metrics`.
* Update the initialization of the metrics component
Make it compatible with the new version of `metrics`.
* Simplify build version counter
Remove all constants and use the new `metrics::incement_counter!` macro.
* Change metrics output line to match on
The snapshot string isn't included in the newer version of
`metrics-exporter-prometheus`.
* Update `sentry` to version 0.23.0
Use a version compatible with Tokio version 1.
* Remove usage of `TracingIntegration`
This seems to not be available from `sentry-tracing` anymore, so it
needs to be replaced.
* Add sentry layer to tracing initialization
This seems like the replacement for `TracingIntegration`.
* Remove unnecessary conversion
Suggested by a Clippy lint.
* Update Cargo lock file
Apply all of the updates to dependencies.
* Ban duplicate tokio dependencies
Also ban git sources for tokio dependencies.
* Stop allowing sentry-tracing git repository in `deny.toml`
* Allow remaining duplicates after the tokio upgrade
* Use C: drive for CI build output on Windows
GitHub Actions uses a Windows image with two disk drives, and the
default D: drive is smaller than the C: drive. Zebra currently uses a
lot of space to build, so it has to use the C: drive to avoid CI build
failures because of insufficient space.
Co-authored-by: teor <teor@riseup.net>
* Allow deliberate instances of the new nightly clippy::derivable_impls lint
We want our config defaults to be explicit.
Not so sure about the application defaults, but they also contain a config.
* Also allow unknown lint names
Stable doesn't know about this lint, but nightly does.
* Use the git version + new commit count + hash for the app version
This helps diagnose bugs in versions of Zebra built from git branches,
rather than git version tags.
* Fill in assert
* Also log semver string
* Fix syntax
* Handle vergen using the cargo package version or raw git tag
* s/Semver/SemVer/
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
Enable builds where:
* there is no google cloud git commit env var, and
* there is no `.git` directory.
By making all `vergen` env vars optional, and skipping any env vars that
don't exist.
* build(deps): bump vergen from 3.2.0 to 5.1.1
* fix hardcoded version for Tracing struct
* add additional metadata
* remove extra allocations for metadata
* Remove zebrad code version from release checklist
The zebrad code automatically uses the crate version now.
* Sort panic metadata into rough categories
Co-authored-by: teor <teor@riseup.net>
* add hint for port error
* add issue filter for port panic
* add lock file hint
* add metrics endpoint port conflict hint
* add hint for tracing endpoint port conflict
* add acceptance test for resource conflics
* Split out common conflict test code into a function
* Add state, metrics, and tracing conflict tests
* Add a full set of stderr acceptance test functions
This change makes the stdout and stderr acceptance test interfaces
identical.
* move Zcash listener opening
* add todo about hint for disk full
* add constant for lock file
* match path in state cache
* don't match windows cache path
* Use Display for state path logs
Avoids weird escaping on Windows when using Debug
* Add Windows conflict error messages
* Turn PORT_IN_USE_ERROR into a regex
And add another alternative Windows-specific port error
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jane@zfnd.org>
* Add the configured network to error reports
* Log the configured network at error level
* Create the global span immediately after activating tracing
And leak the span guard, so the span is always active.
* Include panic metadata in the report and URL
* Use `Main` and `Test` in the global span
`net=Mainnet` is a bit redundant
In our README, we tell users to ignore these errors, so we should also
disable the issue URL.
Also include the hash in the error. (We don't want the span active for
all messages, we just want the hash in the error.)
As we approach our alpha release we've decided we want to plan ahead for the user bug reports we will eventually receive. One of the bigger issues we foresee is determining exactly what version of the software users are running, and particularly how easy it may or may not be for users to accidentally discard this information when reporting bugs.
To defend against this, we've decided to include the exact git sha for any given build in the compiled artifact. This information will then be re-exported as a span early in the application startup process, so that all logs and error messages should include the sha as their very first span. We've also added this sha as issue metadata for `color-eyre`'s github issue url auto generation feature, which should make sure that the sha is easily available in bug reports we receive, even in the absence of logs.
Co-authored-by: teor <teor@riseup.net>
* Split tracing component code into modules.
* Repatriate Tracing and simplify config handling.
We upstreamed our Tracing component, expecting not to have to exert fine
control over the tracing settings. But this turned out not to be the case, and
now that we want to do other things (flamegraphs, journalctl, opentelemetry,
etc), we end up with really awkward code (as in the current flamegraph
handling).
This also makes use of the changes to `init()` to load the config early to pass
configuration data into the components, which avoids the need for the
refactoring in #775.
Finally, we restore support for the `-v` flag when the filter is unset. Closes#831.
* Disable tracing and metrics endpoints by default.
Closes#660.
* Switch back to upstream Abscissa.
* Integrate flamegraph support into the new Tracing component.
* Pass -v in acceptance tests to get info-level output.
* Clean up acceptance test code.
* Setup tracing-flame for use profiling zebrad
* start work on conditional flamegraph generation
* review time!
* update comments
* Update Cargo.toml
* disable default features for inferno
* reorganize
* missing one trait
* Apply suggestions from code review
* graceful shutdown!
* remove special case handling on ctrlc for cleanup
* rename signal fn to better represent its responsibility
* remove unused global hook for flushing flamegraph
* move tracing logic to the right file
* just copy linkerd's signal handling logic
* update book
* make zebrad app drop on shutdown normally
* Update zebrad/src/components/tokio.rs
Co-authored-by: teor <teor@riseup.net>
* Update zebrad/src/application.rs
Co-authored-by: teor <teor@riseup.net>
* Apply suggestions from code review
Co-authored-by: teor <teor@riseup.net>
* cleanup a little
* ooh yea there's an API for that
* setup env-filter for backup subscriber
* document env filter
* document return codes
* forgot to save
* Update book/src/applications/zebrad.md
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: teor <teor@riseup.net>
* Load tracing filter only from config and simplify logic.
* Configure the state storage in the config, not an environment variable.
This also changes the config so that the path is always set rather than being
optional, because Zebra always needs a place to store its config.
* add zebrad acceptance tests
* add custom command test helpers that work with kill
* add and use info event for start and seed commands
* combine conflicting tests into one test case
Co-authored-by: Jane Lusby <jane@zfnd.org>
We get the injected TokioComponent dependency before the config is
loaded, so we can't use it to open the endpoints.
And we can't define after_config, because we use derive(Component).
So we work around these issues by opening the endpoints manually,
from the application's after_config.
* Log at warn level for commands that use stdout
* Let zebrad revhex read from stdin
Most unix tools support reading from stdin, so they can be used in
pipelines.
Part of #564.
Co-authored-by: Jane Lusby <jane@zfnd.org>
Prior to this change, the seed subcommand would consistently encounter a panic in one of the background tasks, but would continue running after the panic. This is indicative of two bugs.
First, zebrad was not configured to treat panics as non recoverable and instead defaulted to the tokio defaults, which are to catch panics in tasks and return them via the join handle if available, or to print them if the join handle has been discarded. This is likely a poor fit for zebrad as an application, we do not need to maximize uptime or minimize the extent of an outage should one of our tasks / services start encountering panics. Ignoring a panic increases our risk of observing invalid state, causing all sorts of wild and bad bugs. To deal with this we've switched the default panic behavior from `unwind` to `abort`. This makes panics fail immediately and take down the entire application, regardless of where they occur, which is consistent with our treatment of misbehaving connections.
The second bug is the panic itself. This was triggered by a duplicate entry in the initial_peers set. To fix this we've switched the storage for the peers from a `Vec` to a `HashSet`, which has similar properties but guarantees uniqueness of its keys.
The components are accessed by a lock on application state. When some command
calls block_on to enter an async context, it obtained a write lock on the
entire application state. This meant that if the application state were
accessed later in an async context, a deadlock would occur. Instead the
TokioComponent holds an Option<Runtime> now, so that before calling block_on,
the caller can .take() the runtime and release the lock. Since we only ever
enter an async context once, it's not a problem that the component is then
missing its runtime, as once we are inside of a task we can access the runtime.
* Add a TracingConfig and some components
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
* Restructure, use dependency injection, initialize tracing
* Start a placeholder loop in start command
* Add hyper alpha.1, bump tokio to alpha.4
* Hello world endpoint using async/await from hyper 0.13 alpha
Also cleaned up some linter messages.
Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca>
* Update to tracing_subscriber 0.1
* fmt
* add rust-toolchain
* Remove hyper::Version import
* wip: start filter_handler impl
* Add .rustfmt.toml
* rustfmt
* Tidy up .rustfmt.toml
* Add filter reloading handling.
* bump toolchain
* Remove generated hello world acceptance tests.
These test the behaviour of the autogenerated binary and work as examples of
how to test the behaviour of abscissa binaries. Since we don't print "Hello
World" any more, they fail, but we don't yet have replacement behaviour to add
tests for, so they're removed for now.
* Clean up config file handling with Option::and_then.