| 1 | +# Continue resolving `cargo-semver-checks` blockers for merging into cargo |
| 2 | + |
| 3 | +| Metadata | | |
| 4 | +| --- | --- | |
| 5 | +| Owner(s) | @obi1kenobi | |
| 6 | +| Teams | [cargo] | |
| 7 | +| Status | Proposed | |
| 8 | + |
| 9 | + |
| 10 | +## Summary |
| 11 | + |
| 12 | +Design and implement `cargo-semver-checks` functionality that lies on the critical path for merging the tool into cargo itself. This goal continues the work of [the 2024h2 goal][2024h2-goal]. |
| 13 | + |
| 14 | +[2024h2-goal]: https://rust-lang.github.io/rust-project-goals/2024h2/cargo-semver-checks.html |
| 15 | + |
| 16 | +## Motivation |
| 17 | + |
| 18 | +Cargo assumes that all packages adhere to semantic versioning (SemVer). |
| 19 | +However, SemVer adherence is quite hard in practice: [research shows][semver-study] that accidental SemVer violations are relatively common (lower-bound: in 3% of releases) and happen to Rustaceans of all skill levels. |
| 20 | +Given the significant complexity of the Rust SemVer rules, improvements here require better tooling. |
| 21 | + |
| 22 | +`cargo-semver-checks` is a linter for semantic versioning (SemVer) in Rust. |
| 23 | +It is broadly adopted by the Rust community, and the [cargo] team has expressed interest in merging it into cargo itself as part of the existing `cargo publish` workflow. |
| 24 | +By default, `cargo publish` would require SemVer compliance, but offer a flag (analogous to the `--allow-dirty` flag for uncommitted changes) to override the SemVer check and proceed with publishing anyway. |
| 25 | + |
| 26 | +The [cargo] team has identified [a set of milestones and blockers][merge-blockers] that must be resolved before `cargo-semver-checks` can be integrated into the `cargo publish` workflow. |
| 27 | +Our goal here is to make steady progress toward resolving them. |
| 28 | + |
| 29 | +[semver-study]: https://predr.ag/blog/semver-violations-are-common-better-tooling-is-the-answer/ |
| 30 | +[merge-blockers]: https://github.com/obi1kenobi/cargo-semver-checks/issues/61 |
| 31 | + |
| 32 | +### The status quo after the 2024h2 goal |
| 33 | + |
| 34 | +As part of [the 2024h2 goal work][2024h2-tracking], support for cargo manifest linting was merged into `cargo-semver-checks`. |
| 35 | +This lifted [one of the blockers][merge-blockers] for SemVer-checking as part of `cargo publish`. |
| 36 | + |
| 37 | +Work is still required in two major areas: |
| 38 | +- Checking of cross-crate items |
| 39 | +- SemVer linting of type information |
| 40 | + |
| 41 | +Some work in each of these areas [already happened in the 2024h2 goal][2024h2-tracking]: |
| 42 | +- The manifest linting work [required a significant refactor][major-refactor] of the tool's data-handling infrastructure. As part of that major refactor, we were able to also create "API space" for a future addition of cross-crate information. |
| 43 | +- The [compiler team MCP][compiler-mcp] required to expose cross-crate information to rustdoc was merged, and together with T-rustdoc, we now have a plan for exposing that information to `cargo-semver-checks`. |
| 44 | +- We have implemented a partial schema that makes available a limited subset of type information around generic parameters and trait bounds. It's sufficient to power a set of new lints, though it isn't comprehensive yet. |
| 45 | + |
| 46 | +Fully resolving the [blockers][merge-blockers] is likely a 12-24 month undertaking, and beyond the scope of this goal on its own. |
| 47 | +Instead, this goal proposes to accomplish intermediate steps that create immediate value for users and derisk the overall endeavor, while requiring only "moral support" from the [cargo] team. |
| 48 | + |
| 49 | +[2024h2-tracking]: https://github.com/rust-lang/rust-project-goals/issues/104 |
| 50 | +[major-refactor]: https://github.com/obi1kenobi/cargo-semver-checks/pull/1001 |
| 51 | +[compiler-mcp]: https://github.com/rust-lang/compiler-team/issues/635 |
| 52 | + |
| 53 | +#### Checking of cross-crate items |
| 54 | + |
| 55 | +_This section is background information and is unchanged from [the 2024h2 goal][2024h2-goal]._ |
| 56 | + |
| 57 | +Currently, `cargo-semver-checks` performs linting by only using the rustdoc JSON of the target package being checked. |
| 58 | +However, the public API of a package may expose items from other crates. |
| 59 | +Since rustdoc no longer inlines the definitions of such foreign items into the JSON of the crate whose public API relies on them, `cargo-semver-checks` [cannot see or analyze them][cross-crate-items]. |
| 60 | + |
| 61 | +This causes a massive number of false-positives ("breakage reported incorrectly") and false-negatives ("lint for issue X fails to spot an instance of issue X"). |
| 62 | +In excess of 90% of real-world false-positives are traceable back to a cross-crate item, as measured by our [SemVer study][semver-study]! |
| 63 | + |
| 64 | +For example, the following change is not breaking but `cargo-semver-checks` will incorrectly report it as breaking: |
| 65 | +```rust |
| 66 | +// previous release: |
| 67 | +pub fn example() {} |
| 68 | + |
| 69 | +// in the new release, imagine this function moved to `another_crate`: |
| 70 | +pub use another_crate::example; |
| 71 | +``` |
| 72 | +This is because the rustdoc JSON that `cargo-semver-checks` sees indeed *does not contain* a function named `example`. |
| 73 | +Currently, `cargo-semver-checks` is incapable of following the cross-crate connection to `another_crate`, generating its rustdoc JSON, and continuing its analysis there. |
| 74 | + |
| 75 | +Resolving this limitation will require changes to how `cargo-semver-checks` generates and handles rustdoc JSON, since the set of required rustdoc JSON files will no longer be fully known ahead of time. |
| 76 | +It will also require CLI changes in the same area as the changes required to support manifest linting. |
| 77 | + |
| 78 | +While there may be other challenges on rustc and rustdoc's side before this feature could be fully implemented, we consider those out of scope here since there are [parallel efforts to resolve them][parallel-efforts]. |
| 79 | +The goal here is for `cargo-semver-checks` to have its own story straight and do the best it can. |
| 80 | + |
| 81 | +[cross-crate-items]: https://github.com/obi1kenobi/cargo-semver-checks/issues/638 |
| 82 | +[parallel-efforts]: https://github.com/rust-lang/compiler-team/issues/635 |
| 83 | + |
| 84 | +#### SemVer linting of type information |
| 85 | + |
| 86 | +_This section is background information and is unchanged from [the 2024h2 goal][2024h2-goal]._ |
| 87 | + |
| 88 | +At the moment, `cargo-semver-checks` lints generally cannot represent or examine type information. |
| 89 | +For example, the following change is breaking but `cargo-semver-checks` will not detect or report it: |
| 90 | +```rust |
| 91 | +// previous release: |
| 92 | +pub fn example(value: String) {} |
| 93 | + |
| 94 | +// new release: |
| 95 | +pub fn example(value: i64) {} |
| 96 | +``` |
| 97 | +Analogous breaking changes to function return values, struct fields, and associated types would also be missed by `cargo-semver-checks` today. |
| 98 | + |
| 99 | +The main difficulty here lies with the expressiveness of the Rust type system. For example, none of the following changes are breaking: |
| 100 | +```rust |
| 101 | +// previous release: |
| 102 | +pub fn example(value: String) {} |
| 103 | + |
| 104 | +// new release: |
| 105 | +pub fn example(value: impl Into<String>) {} |
| 106 | + |
| 107 | +// subsequent release: |
| 108 | +pub fn example<S: Into<String>>(value: S) {} |
| 109 | +``` |
| 110 | +Similar challenges exist with lifetimes, variance, trait solving, `async fn` versus `fn() -> impl Future`, etc. |
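To make the `async fn` versus `fn() -> impl Future` case concrete, here is a minimal sketch of two signatures that a type-aware lint would need to treat as equivalent. The modules `v1` and `v2` stand in for two releases of the same crate, and `block_on_ready` is a hypothetical helper used only so the example runs without an async runtime; none of these names come from `cargo-semver-checks`. The sketch assumes a function with no borrowed arguments, where the two forms are typically interchangeable for callers.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical "previous release": written as an `async fn`.
pub mod v1 {
    pub async fn example(value: u32) -> u32 {
        value + 1
    }
}

// Hypothetical "new release": the same function written as a plain `fn`
// returning `impl Future`. The textual signature differs, but callers
// that `.await` the result are unaffected.
pub mod v2 {
    use std::future::Future;

    pub fn example(value: u32) -> impl Future<Output = u32> {
        async move { value + 1 }
    }
}

// A waker that does nothing when woken.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future in a loop with a no-op waker until it completes.
/// Illustration only: suitable for futures that finish without waiting
/// on external events, such as the two above.
pub fn block_on_ready<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(output) = fut.as_mut().poll(&mut cx) {
            return output;
        }
    }
}
```

Calling `block_on_ready(v1::example(1))` and `block_on_ready(v2::example(1))` goes through identical caller code, which is why a lint that compares signatures textually would misjudge this change.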
| 111 | + |
| 112 | +While some promising preliminary work has been done toward resolving this challenge, more in-depth design work is necessary to determine the best path forward. |
| 113 | + |
| 114 | +### The next 6 months |
| 115 | + |
| 116 | +- Prototype cross-crate linting using manual workarounds for the current rustc and rustdoc blockers. This will allow us to roll out a full solution relatively quickly after the rustc and rustdoc blockers are resolved. |
| 117 | +- Expose data on generic types, lifetimes, functions, methods, and bounds in sufficient granularity for linting. |
| 118 | +- Determine how to handle special cases, such as changes to impls or bounds involving `'static`, `?Sized`, `dyn Trait`, etc. |
| 119 | +- Improve sealed trait analysis to account for `#[doc(hidden)]` items, resolving many false-positives. |
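As an illustration of the last item above, the following sketch shows one common way a trait can be sealed through a `#[doc(hidden)]` item. The names `private`, `Sealed`, `Shape`, and `Square` are hypothetical, and the reading that `#[doc(hidden)]` items fall outside the public API is an assumption of this sketch rather than a statement of how `cargo-semver-checks` is implemented.

```rust
// The module is `pub`, so downstream crates could technically name
// `private::Sealed`, but `#[doc(hidden)]` signals that it is not
// part of the crate's public API.
#[doc(hidden)]
pub mod private {
    pub trait Sealed {}
}

// `Shape` requires `private::Sealed`, so implementing it downstream
// would mean relying on a hidden, non-public item. Under that reading
// the trait is effectively sealed, and adding a new required method in
// a later release should not be reported as a breaking change.
pub trait Shape: private::Sealed {
    fn area(&self) -> f64;
}

pub struct Square(pub f64);

impl private::Sealed for Square {}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}
```

An analysis that ignores `#[doc(hidden)]` would treat `Shape` as implementable by anyone and flag a newly added method as breaking, which is the kind of false-positive this work item targets.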
| 120 | + |
| 121 | +### The "shiny future" we are working towards |
| 122 | + |
| 123 | +_This section is unchanged from [the 2024h2 goal][2024h2-goal]._ |
| 124 | + |
| 125 | +Accidentally publishing SemVer violations that break the ecosystem is never fun for anyone involved. |
| 126 | + |
| 127 | +From a user perspective, we want a fearless `cargo update`: one's project should never be broken by updating dependencies without changing major versions. |
| 128 | + |
| 129 | +From a maintainer perspective, we want a fearless `cargo publish`: we want to prevent breakage, not to find out about it when a frustrated user opens a GitHub issue. Just like cargo flags uncommitted changes in the publish flow, it should also _quickly_ and _accurately_ flag breaking changes in non-major releases. Then the maintainer may choose to release a major version instead, or acknowledge and explicitly override the check to proceed with publishing as-is. |
| 130 | + |
| 131 | +To accomplish this, `cargo-semver-checks` needs the ability to express more kinds of lints (including manifest and type-based ones), eliminate false-positives, and stabilize its public interfaces (e.g. the CLI). At that point, we'll have lifted the [main merge-blockers][merge-blockers] and we can consider making it a first-party component of cargo itself. |
| 132 | + |
| 133 | +## Ownership and team asks |
| 134 | + |
| 135 | +**Owner:** @obi1kenobi, as maintainer of `cargo-semver-checks` |
| 136 | + |
| 137 | +I (@obi1kenobi) will be working on this effort. The only other resource request would be occasional discussions and moral support from the [cargo] and [rustdoc] teams, of which I already have the privilege as maintainer of a popular cargo plugin that makes extensive use of rustdoc JSON. |
| 138 | + |
| 139 | +| Subgoal | Owner(s) or team(s) | Notes | |
| 140 | +| -------------------------------------------------- | --------------------------- | ----- | |
| 141 | +| Prototype cross-crate linting using workarounds | @obi1kenobi | | |
| 142 | +| Allow linting generic types, lifetimes, bounds | @obi1kenobi | | |
| 143 | +| Handle "special cases" like `'static` and `?Sized` | @obi1kenobi | | |
| 144 | +| Handle `#[doc(hidden)]` in sealed trait analysis | @obi1kenobi | | |
| 145 | +| Discussion and moral support | ![Team][] [cargo] [rustdoc] | | |
| 146 | + |
| 147 | +## Frequently asked questions |
| 148 | + |
| 149 | +_This section is unchanged from [the 2024h2 goal][2024h2-goal]._ |
| 150 | + |
| 151 | +### Why not use semverver instead? |
| 152 | + |
| 153 | +[Semverver][semverver] is a prior attempt at enforcing SemVer compliance, but has been deprecated and is no longer developed or maintained. |
| 154 | +It relied on compiler-internal APIs, which are much more unstable than rustdoc JSON and required much more maintenance to "keep the lights on." |
| 155 | +This also meant that semverver required users to install a specific nightly version that was known to be compatible with their version of semverver. |
| 156 | + |
| 157 | +While `cargo-semver-checks` relies on rustdoc JSON, which is also an unstable nightly-only interface, changes to that format are much less frequent and less severe. |
| 158 | +By using the [Trustfall query engine][trustfall], `cargo-semver-checks` can simultaneously support a range of rustdoc JSON formats (and therefore Rust versions) within the same tool. |
| 159 | +On the maintenance side, `cargo-semver-checks` lints are written in a declarative manner that is oblivious to the details of the underlying data format, and do not need to be updated when the rustdoc JSON format changes. |
| 160 | +This makes maintenance much easier: updating to a new rustdoc JSON format usually requires just a few lines of code, instead of "a few lines of code apiece in each of hundreds of lints." |
| 161 | + |
| 162 | +[semverver]: https://github.com/rust-lang/rust-semverver |
| 163 | +[trustfall]: https://github.com/obi1kenobi/trustfall |