Commit Graph

47 Commits

Author SHA1 Message Date
Janito Vaqueiro Ferreira Filho 0960e4fb0b
Update to Tokio 1.13.0 (#2994)
* Update `tower` to version `0.4.9`

Update to latest version to add support for Tokio version 1.

* Replace usage of `ServiceExt::ready_and`

It was deprecated in favor of `ServiceExt::ready`.

* Update Tokio dependency to version `1.13.0`

This will break the build because the code isn't ready for the update,
but future commits will fix the issues.

* Replace import of `tokio::stream::StreamExt`

Use `futures::stream::StreamExt` instead, because newer versions of
Tokio don't have the `stream` feature.

* Use `IntervalStream` in `zebra-network`

In newer versions of Tokio `Interval` doesn't implement `Stream`, so the
wrapper types from `tokio-stream` have to be used instead.

* Use `IntervalStream` in `inventory_registry`

In newer versions of Tokio the `Interval` type doesn't implement
`Stream`, so `tokio_stream::wrappers::IntervalStream` has to be used
instead.

* Use `BroadcastStream` in `inventory_registry`

In newer versions of Tokio `broadcast::Receiver` doesn't implement
`Stream`, so `tokio_stream::wrappers::BroadcastStream` has to be used
instead. This also requires changing the error type that is used.

* Handle `Semaphore::acquire` error in `tower-batch`

Newer versions of Tokio can return an error if the semaphore is closed.
This shouldn't happen in `tower-batch` because the semaphore is never
closed.

* Handle `Semaphore::acquire` error in `zebrad` test

On newer versions of Tokio `Semaphore::acquire` can return an error if
the semaphore is closed. This shouldn't happen in the test because the
semaphore is never closed.

* Update some `zebra-network` dependencies

Use versions compatible with Tokio version 1.

* Upgrade Hyper to version 0.14

Use a version that supports Tokio version 1.

* Update `metrics` dependency to version 0.17

And also update the `metrics-exporter-prometheus` to version 0.6.1.
These updates are to make sure Tokio 1 is supported.

* Use `f64` as the histogram data type

`u64` isn't supported as the histogram data type in newer versions of
`metrics`.

* Update the initialization of the metrics component

Make it compatible with the new version of `metrics`.

* Simplify build version counter

Remove all constants and use the new `metrics::increment_counter!` macro.

* Change metrics output line to match on

The snapshot string isn't included in the newer version of
`metrics-exporter-prometheus`.

* Update `sentry` to version 0.23.0

Use a version compatible with Tokio version 1.

* Remove usage of `TracingIntegration`

This seems to not be available from `sentry-tracing` anymore, so it
needs to be replaced.

* Add sentry layer to tracing initialization

This seems like the replacement for `TracingIntegration`.

* Remove unnecessary conversion

Suggested by a Clippy lint.

* Update Cargo lock file

Apply all of the updates to dependencies.

* Ban duplicate tokio dependencies

Also ban git sources for tokio dependencies.

* Stop allowing sentry-tracing git repository in `deny.toml`

* Allow remaining duplicates after the tokio upgrade

* Use C: drive for CI build output on Windows

GitHub Actions uses a Windows image with two disk drives, and the
default D: drive is smaller than the C: drive. Zebra currently uses a
lot of space to build, so it has to use the C: drive to avoid CI build
failures because of insufficient space.

Co-authored-by: teor <teor@riseup.net>
2021-11-02 18:46:57 +00:00
teor e277975d85
Try flushing streams before exiting Zebra (#2911) 2021-10-20 13:57:09 +00:00
teor 776432978c
Allow deliberate instances of the clippy::derivable_impls lint (#2788)
* Allow deliberate instances of the new nightly clippy::derivable_impls lint

We want our config defaults to be explicit.

Not so sure about the application defaults, but they also contain a config.

* Also allow unknown lint names

Stable doesn't know about this lint, but nightly does.
2021-09-22 10:43:27 -03:00
teor b18c32f30f
Add the database format to the panic metadata (#2249)
Seems like it might be useful as we add more stuff to the state.
2021-06-04 14:42:15 +10:00
teor 52dcaa2544 Stop ignoring lightweight git tags in panic metadata
Unfortunately, Zebra's first alpha release is an annotated tag, but
GitHub defaults to lightweight tags. (At least for pre-releases.)
2021-05-20 09:00:56 +10:00
teor bcc59d11c3 Refactor metadata so git vars must be optional
We don't test non-git builds, but we can use the type system to make sure
they are always optional.
2021-05-20 09:00:56 +10:00
teor b6c5ef8041 Add VERGEN_CARGO_PROFILE to the panic env vars
Some panics should only happen on debug profiles.
2021-05-20 09:00:56 +10:00
teor 62f053de9e Enable cargo env vars when there is no .git
But still disable git env vars.

This change requires vergen 5.1.4 or later.
2021-05-20 09:00:56 +10:00
teor 96b3c94dbc
Add the new commit count and git hash to the version (#2038)
* Use the git version + new commit count + hash for the app version

This helps diagnose bugs in versions of Zebra built from git branches,
rather than git version tags.

* Fill in assert

* Also log semver string

* Fix syntax

* Handle vergen using the cargo package version or raw git tag

* s/Semver/SemVer/

Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
2021-04-21 22:14:36 +00:00
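A minimal sketch of the idea in 96b3c94dbc: combine the tagged version, the number of commits since that tag, and the commit hash into one version string. The function name, parameter names, and the `+count.hash` layout are assumptions made for illustration, not the format zebrad actually emits.

```rust
/// Combine a tag version, the number of commits since that tag, and a short
/// commit hash into one version string.
///
/// The `+count.hash` layout is an assumed format, chosen only to illustrate
/// the three parts named in the commit message.
pub fn app_version(tag_version: &str, commits_since_tag: u32, short_hash: &str) -> String {
    if commits_since_tag == 0 {
        // Built exactly at the tag: the tag version alone identifies the build.
        tag_version.to_string()
    } else {
        // Built from a branch: add the commit count and hash to tell builds apart.
        format!("{}+{}.{}", tag_version, commits_since_tag, short_hash)
    }
}
```

With this layout, a build made 12 commits after a tag can be told apart from the tagged release itself, which is the diagnostic benefit the commit message describes.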
teor 79c0c4ec57 Stop assuming there will always be a git commit
Enable builds where:
* there is no google cloud git commit env var, and
* there is no `.git` directory.

By making all `vergen` env vars optional, and skipping any env vars that
don't exist.
2021-04-20 13:48:31 -04:00
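The approach in 79c0c4ec57 (treat every build-time variable as optional and skip the ones that are missing) can be sketched with the standard `option_env!` macro, which yields `None` at compile time when a variable is unset. The function names, labels, and the specific variable names other than `VERGEN_CARGO_PROFILE` are assumptions for illustration.

```rust
/// Keep only the metadata entries whose value is present.
pub fn skip_missing(
    candidates: &[(&'static str, Option<&'static str>)],
) -> Vec<(&'static str, &'static str)> {
    candidates
        .iter()
        .filter_map(|&(label, value)| value.map(|v| (label, v)))
        .collect()
}

/// Build-time metadata for panic reports, as (label, value) pairs.
///
/// `option_env!` is evaluated at compile time and returns `None` when the
/// variable is not set, so a build without a `.git` directory still compiles.
/// The variable names here are assumed for illustration.
pub fn build_metadata() -> Vec<(&'static str, &'static str)> {
    let candidates = [
        ("git commit", option_env!("VERGEN_GIT_SHA")),
        ("git branch", option_env!("VERGEN_GIT_BRANCH")),
        ("cargo profile", option_env!("VERGEN_CARGO_PROFILE")),
    ];
    skip_missing(&candidates)
}
```

Using `Option` for every entry means a missing variable becomes a skipped line in the report rather than a compile error, unlike `env!`, which fails the build when the variable is absent.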
Kirill Fomichev 43e792b9a4
Update to vergen 5, add branch, commit time, and build target to the panic metadata, automatically update app version from crate version (#2029)
* build(deps): bump vergen from 3.2.0 to 5.1.1

* fix hardcoded version for Tracing struct

* add additional metadata

* remove extra allocations for metadata

* Remove zebrad code version from release checklist

The zebrad code automatically uses the crate version now.

* Sort panic metadata into rough categories

Co-authored-by: teor <teor@riseup.net>
2021-04-20 06:48:14 +10:00
teor 1e4e5924ca clippy: factor common code out of an if-else block 2021-04-14 23:16:45 -04:00
Alfredo Garcia 4b34482264
Add hints to port conflict and lock file panics (#1535)
* add hint for port error
* add issue filter for port panic
* add lock file hint
* add metrics endpoint port conflict hint
* add hint for tracing endpoint port conflict
* add acceptance test for resource conflicts
* Split out common conflict test code into a function
* Add state, metrics, and tracing conflict tests

* Add a full set of stderr acceptance test functions

This change makes the stdout and stderr acceptance test interfaces
identical.

* move Zcash listener opening
* add todo about hint for disk full
* add constant for lock file
* match path in state cache
* don't match windows cache path

* Use Display for state path logs

Avoids weird escaping on Windows when using Debug

* Add Windows conflict error messages

* Turn PORT_IN_USE_ERROR into a regex

And add another alternative Windows-specific port error

Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jane@zfnd.org>
2021-01-29 22:36:33 +10:00
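The "Use Display for state path logs" item in 4b34482264 can be illustrated with the standard library alone. `Debug` formatting of a path quotes it and escapes backslashes, while `Path::display` prints it as-is. The function names and the log text are hypothetical.

```rust
use std::path::Path;

/// Format a state path for a log line using `Display`, which prints the
/// path without quoting or escaping (lossily, if it is not valid Unicode).
pub fn log_path_display(path: &Path) -> String {
    format!("opening state at {}", path.display())
}

/// The same log line using `Debug`, which quotes the path and escapes
/// backslashes, so Windows path separators are doubled.
pub fn log_path_debug(path: &Path) -> String {
    format!("opening state at {:?}", path)
}
```

A test that matches log output against a path is easier to write against the `Display` form, because the expected text is the path itself rather than its escaped form.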
teor c75cbdea79
Log configured network in every log message (#1568)
* Add the configured network to error reports
* Log the configured network at error level
* Create the global span immediately after activating tracing
And leak the span guard, so the span is always active.

* Include panic metadata in the report and URL
* Use `Main` and `Test` in the global span
`net=Mainnet` is a bit redundant
2021-01-12 07:46:56 +10:00
teor 69fcf64d6c
Disable issue URLs for "duplicate hash" errors (#1517)
In our README, we tell users to ignore these errors, so we should also
disable the issue URL.

Also include the hash in the error. (We don't want the span active for
all messages, we just want the hash in the error.)
2020-12-16 08:14:42 +10:00
Deirdre Connolly cff28f7ac8 Use the commit sha as the sentry release 2020-12-09 13:06:18 -05:00
Jane Lusby 400213e2b3 integrate sentry with our existing panic reporting logic 2020-12-09 13:06:18 -05:00
teor 16ffb1dbbf
Disable issue URLs on all timeouts (#1470)
This change helps prevent spurious bug reports.
2020-12-08 07:47:01 +10:00
Jane Lusby ef7e91c3c7
disable color-eyre colors if not connected to a tty (#1443)
* disable color-eyre colors if not connected to a tty
* check if color is disabled
2020-12-04 11:05:25 +10:00
Jane Lusby 90f944709b
fix git commit logic to work on gcloud (#1442) 2020-12-03 15:18:55 +10:00
teor 0e42d8b6c1 Always enable color_eyre, even when color is disabled
We want to automatically disable colors upstream in color_eyre,
and add a config that allows users to always turn off color.
2020-12-02 10:25:44 -08:00
teor bed34168c1 Automatically disable abscissa colors and color_eyre when writing to a file 2020-12-02 10:25:44 -08:00
Jane Lusby a91d0f0bb6
Include short sha in log messages and error urls (#1410)
As we approach our alpha release we've decided we want to plan ahead for the user bug reports we will eventually receive. One of the bigger issues we foresee is determining exactly what version of the software users are running, and particularly how easy it may or may not be for users to accidentally discard this information when reporting bugs.

To defend against this, we've decided to include the exact git sha for any given build in the compiled artifact. This information will then be re-exported as a span early in the application startup process, so that all logs and error messages should include the sha as their very first span. We've also added this sha as issue metadata for `color-eyre`'s github issue url auto generation feature, which should make sure that the sha is easily available in bug reports we receive, even in the absence of logs.

Co-authored-by: teor <teor@riseup.net>
2020-12-01 12:13:20 -08:00
Henry de Valence e8c16b172f zebrad: pass TracingSection to Tracing component 2020-11-30 15:25:50 -08:00
Jane Lusby 40e22808c7
disable reporting url for timeout errors (#1087)
* disable reporting url for timeout errors

* revert newline removal

* switch to released color-eyre version
2020-09-21 16:15:09 -07:00
Jane Lusby ca648ff27c
Enable issue-url feature in color-eyre (#1072)
* Enable issue-url feature in color-eyre

* get version automatically

* and the url!
2020-09-17 15:09:18 -07:00
Henry de Valence 6d1a4b2218
Load config after initializing the Terminal (#848) 2020-08-06 17:22:40 -07:00
Alfredo Garcia c52481c041 fix logs 2020-08-07 09:21:57 +10:00
Jane Lusby 3e9c6f054b
fix log level default for server commands (#840)
* fix log level default for server commands

* remove dbg
2020-08-06 11:23:00 -07:00
Henry de Valence a77328ad7c
Refactor tracing components (#834)
* Split tracing component code into modules.

* Repatriate Tracing and simplify config handling.

We upstreamed our Tracing component, expecting not to have to exert fine
control over the tracing settings.  But this turned out not to be the case, and
now that we want to do other things (flamegraphs, journalctl, opentelemetry,
etc), we end up with really awkward code (as in the current flamegraph
handling).

This also makes use of the changes to `init()` to load the config early to pass
configuration data into the components, which avoids the need for the
refactoring in #775.

Finally, we restore support for the `-v` flag when the filter is unset.  Closes #831.

* Disable tracing and metrics endpoints by default.

Closes #660.

* Switch back to upstream Abscissa.

* Integrate flamegraph support into the new Tracing component.

* Pass -v in acceptance tests to get info-level output.

* Clean up acceptance test code.
2020-08-06 10:29:31 -07:00
Jane Lusby 867dd0b475
Setup tracing-flame for use profiling zebrad (#436)
* Setup tracing-flame for use profiling zebrad

* start work on conditional flamegraph generation

* review time!

* update comments

* Update Cargo.toml

* disable default features for inferno

* reorganize

* missing one trait

* Apply suggestions from code review

* graceful shutdown!

* remove special case handling on ctrlc for cleanup

* rename signal fn to better represent its responsibility

* remove unused global hook for flushing flamegraph

* move tracing logic to the right file

* just copy linkerd's signal handling logic

* update book

* make zebrad app drop on shutdown normally

* Update zebrad/src/components/tokio.rs

Co-authored-by: teor <teor@riseup.net>

* Update zebrad/src/application.rs

Co-authored-by: teor <teor@riseup.net>

* Apply suggestions from code review

Co-authored-by: teor <teor@riseup.net>

* cleanup a little

* ooh yea there's an API for that

* setup env-filter for backup subscriber

* document env filter

* document return codes

* forgot to save

* Update book/src/applications/zebrad.md

Co-authored-by: teor <teor@riseup.net>

Co-authored-by: teor <teor@riseup.net>
2020-08-05 16:35:56 -07:00
Henry de Valence 4a03d76a41
Remove environment variables in favor of documented config options. (#827)
* Load tracing filter only from config and simplify logic.

* Configure the state storage in the config, not an environment variable.

This also changes the config so that the path is always set rather than being
optional, because Zebra always needs a place to store its config.
2020-08-05 11:48:08 -07:00
Alfredo Garcia f2d7bb3177
Command execution tests (#690)
* add zebrad acceptance tests
* add custom command test helpers that work with kill
* add and use info event for start and seed commands
* combine conflicting tests into one test case

Co-authored-by: Jane Lusby <jane@zfnd.org>
2020-08-01 16:15:26 +10:00
teor 050c46388f fix: Open the endpoints after the config is loaded
We get the injected TokioComponent dependency before the config is
loaded, so we can't use it to open the endpoints.

And we can't define after_config, because we use derive(Component).

So we work around these issues by opening the endpoints manually,
from the application's after_config.
2020-07-29 16:03:52 +10:00
teor e7437cc551 feature: Get endpoint addresses from config 2020-07-29 16:03:52 +10:00
teor 71de6de701 fix: Only enable tokio components for servers
Only enable the tokio and tracing components for server commands.
2020-07-17 10:12:51 +10:00
teor 49a3a7d6d1 fix: Only launch network endpoints for server commands
Fixes #669.
2020-07-16 10:40:03 -07:00
teor 12b9fa8ae2
Let zebrad revhex read from stdin (#648)
* Log at warn level for commands that use stdout
* Let zebrad revhex read from stdin

Most unix tools support reading from stdin, so they can be used in
pipelines.

Part of #564.
2020-07-15 16:16:07 +10:00
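A rough sketch of the stdin support described in 12b9fa8ae2, assuming that `revhex` reverses the byte order of a hex string. The function names and the error handling (returning `None` for invalid input) are illustrative choices, not the zebrad implementation.

```rust
use std::io::BufRead;

/// Reverse the byte order of a hex string by reversing its two-character pairs.
/// Returns `None` for odd-length or non-hex input.
pub fn revhex(input: &str) -> Option<String> {
    let input = input.trim();
    if input.len() % 2 != 0 || !input.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    // Every character is ASCII, so each two-byte chunk is one hex pair.
    let pairs: Vec<&str> = input
        .as_bytes()
        .chunks(2)
        .map(|pair| std::str::from_utf8(pair).expect("ASCII hex is valid UTF-8"))
        .collect();
    Some(pairs.into_iter().rev().collect())
}

/// Apply `revhex` to every line of a reader, stopping at the first read error.
pub fn revhex_lines<R: BufRead>(reader: R) -> Vec<Option<String>> {
    reader
        .lines()
        .map_while(Result::ok)
        .map(|line| revhex(&line))
        .collect()
}
```

Because `revhex_lines` accepts any `BufRead`, a binary can pass `std::io::stdin().lock()` to work in a pipeline, and tests can pass an in-memory `std::io::Cursor`.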
Jane Lusby 246e7cd2a9
Start testing out new version of `eyre` and `color-eyre` in zebra (#526)
* port to new version of eyre without generics

* correctly setup color_eyre hooks

Co-authored-by: Jane Lusby <jane@zfnd.org>
2020-06-22 15:36:23 -07:00
Jane Lusby 18b4dbc16c
fix tracing configuration issues (#432) 2020-06-04 19:34:06 -07:00
Jane Lusby 8c178c3ee4
fix panic in seed subcommand (#401)
Co-authored-by: Jane Lusby <jane@zfnd.org>

Prior to this change, the seed subcommand would consistently encounter a panic in one of the background tasks, but would continue running after the panic. This is indicative of two bugs. 

First, zebrad was not configured to treat panics as non-recoverable and instead used the tokio defaults, which are to catch panics in tasks and return them via the join handle if available, or to print them if the join handle has been discarded. This is likely a poor fit for zebrad as an application: we do not need to maximize uptime or minimize the extent of an outage should one of our tasks / services start encountering panics. Ignoring a panic increases our risk of observing invalid state, causing all sorts of wild and bad bugs. To deal with this we've switched the default panic behavior from `unwind` to `abort`. This makes panics fail immediately and take down the entire application, regardless of where they occur, which is consistent with our treatment of misbehaving connections.

The second bug is the panic itself. This was triggered by a duplicate entry in the initial_peers set. To fix this we've switched the storage for the peers from a `Vec` to a `HashSet`, which has similar properties but guarantees uniqueness of its keys.
2020-05-27 17:40:12 -07:00
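The `Vec`-to-`HashSet` change described in 8c178c3ee4 can be sketched as follows. The function name and the use of string entries are assumptions for illustration. The point is only that collecting into a `HashSet` collapses duplicate peers.

```rust
use std::collections::HashSet;
use std::net::SocketAddr;

/// Parse peer addresses into a set, so a duplicate entry collapses to a
/// single peer instead of being stored twice. Unparseable entries are skipped.
pub fn initial_peers(entries: &[&str]) -> HashSet<SocketAddr> {
    entries
        .iter()
        .filter_map(|entry| entry.parse::<SocketAddr>().ok())
        .collect()
}
```

Code that later iterates over the set sees each address once, so any logic that assumes unique peers no longer depends on the configuration being free of repeats.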
Henry de Valence 75d3d44fb3 Metrics MVP: add two metrics and export them to Prometheus.
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2020-02-14 20:14:05 -05:00
Henry de Valence 4fcb550aa6 Fix a deadlock in TokioComponent.
The components are accessed by a lock on application state.  When some command
calls block_on to enter an async context, it obtained a write lock on the
entire application state.  This meant that if the application state were
accessed later in an async context, a deadlock would occur.  Instead the
TokioComponent holds an Option<Runtime> now, so that before calling block_on,
the caller can .take() the runtime and release the lock.  Since we only ever
enter an async context once, it's not a problem that the component is then
missing its runtime, as once we are inside of a task we can access the runtime.
2020-01-15 12:06:31 -08:00
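The deadlock fix in 4fcb550aa6 can be sketched with stand-in types. `Runtime`, `AppState`, and `run` below are simplified placeholders rather than the real Tokio or Abscissa types. The sketch only shows the pattern: `.take()` the runtime out of an `Option` inside a short scope, so the lock is released before `block_on` runs.

```rust
use std::sync::RwLock;

/// Stand-in for an async runtime; `block_on` just runs a closure here.
pub struct Runtime;

impl Runtime {
    pub fn block_on<T>(&self, task: impl FnOnce() -> T) -> T {
        task()
    }
}

/// Stand-in for the component: the runtime is optional so it can be moved out.
pub struct TokioComponent {
    pub rt: Option<Runtime>,
}

/// Stand-in for the application state, which is guarded by a lock.
pub struct AppState {
    pub tokio: TokioComponent,
    pub network: &'static str,
}

pub fn run(state: &RwLock<AppState>) -> String {
    // Take the runtime inside a short scope, so the write guard is dropped
    // before `block_on` is called.
    let rt = {
        let mut guard = state.write().expect("lock is not poisoned");
        let taken = guard.tokio.rt.take();
        taken.expect("runtime is only taken once")
    };
    // The task can now lock the state again without deadlocking.
    rt.block_on(|| {
        let guard = state.read().expect("lock is not poisoned");
        format!("running on {}", guard.network)
    })
}
```

If the write guard were still held while the task ran, the `read()` inside the task would block forever on the same thread, which is the deadlock the commit message describes.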
Henry de Valence ab3db201ee Change TracingEndpoint to forward to the Abscissa Tracing component. 2020-01-15 12:06:31 -08:00
Tony Arcieri 45eb81a204 Upgrade to Abscissa v0.5 2020-01-15 12:06:31 -08:00
Deirdre Connolly 162b37fe8d Tracing endpoint (#3)
* Add a TracingConfig and some components

Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>

* Restructure, use dependency injection, initialize tracing

* Start a placeholder loop in start command

* Add hyper alpha.1, bump tokio to alpha.4

* Hello world endpoint using async/await from hyper 0.13 alpha

Also cleaned up some linter messages.

Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca>

* Update to tracing_subscriber 0.1

* fmt

* add rust-toolchain

* Remove hyper::Version import

* wip: start filter_handler impl

* Add .rustfmt.toml

* rustfmt

* Tidy up .rustfmt.toml

* Add filter reloading handling.

* bump toolchain

* Remove generated hello world acceptance tests.

These test the behaviour of the autogenerated binary and work as examples of
how to test the behaviour of abscissa binaries.  Since we don't print "Hello
World" any more, they fail, but we don't yet have replacement behaviour to add
tests for, so they're removed for now.

* Clean up config file handling with Option::and_then.
2019-09-09 13:05:42 -07:00
Henry de Valence ec363d2d41 Create workspace skeleton based on design.md 2019-08-29 14:46:54 -07:00