Skip to content

Change proc_macro::Span::byte_range -> byte_offset. #139901

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

m-ou-se
Copy link
Member

@m-ou-se m-ou-se commented Apr 16, 2025

This is about #54725's open question on byte offsets:

Next to span.line() and span.column(), we also want a way to get the byte offset into the file.

We have basically two competing options to expose this:

Option 1 (merge this PR)

impl Span {
    pub fn byte_offset(&self) -> usize;
}

This is consistent with the line and column methods on Span.
To get the offset of the end of the span, you use span.end().byte_offset(), just like how you use span.end().column() for the end column.

Option 2 (close this PR)

impl Span {
    pub fn byte_range(&self) -> Range<usize>;
}

This gives you both ends of the range at once, and uses the Range type (which is usable to index a slice), but is arguably less consistent with the line and column methods.


Curious to hear what y'all think.

👍 for option 1 (merge this PR)
👎 for option 2 (close this PR)

(The decision is still up to the libs-api team. Votes are just useful as input, not as the final decision.)

This is more consistent with Span::line and Span::column.
@m-ou-se m-ou-se added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. A-proc-macros Area: Procedural macros labels Apr 16, 2025
@m-ou-se m-ou-se self-assigned this Apr 16, 2025
@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 16, 2025
@rustbot

This comment was marked as off-topic.

@m-ou-se
Copy link
Member Author

m-ou-se commented Apr 16, 2025

I slightly prefer option 1, as it fits well in our earlier design choice to get rid of LineColumn (or Location) type and make Span itself usable as a way to store a location. E.g.

let location = range.end();

let _ = location.line();
let _ = location.column();
let _ = location.byte_offset();

If one is using an (empty) Span to store a specific location, it might be awkward to have to write .byte_range().start to get its offset.

@m-ou-se
Copy link
Member Author

m-ou-se commented Apr 16, 2025

@rust-lang/libs-api If you have an opinion on this, please speak up!

@BurntSushi
Copy link
Member

I'm fine with option (1) personally.

I don't think option (1) precludes option (2) though, and I could see it being a nice convenience routine to have. But I don't feel strongly about it. Following the API pattern created with line and column numbers SGTM.

@dtolnay
Copy link
Member

dtolnay commented Apr 16, 2025

I prefer to keep byte_range and not adding byte_offset.

The use case for byte offset is significantly different than line/column so the consistency objective is weak. For line/column, the overwhelming majority use cases is rendering a line/column in an error message or log message or similar, where you would practically never also want the upper end. Exposing directly on Span the line/column you almost definitely want, and putting the end line/column behind span.end(), prioritizes the majority use cases being obvious and concise (span.line()) and adding the minimal API that facilitates niche other use cases.

A Range-based API for line/column would not fit. For a span that goes from line 2 column 18 to line 4 column 5 (e.g. the braces of a for) returning Range { start: 18, end: 5 } for the column is bizarre and not coherent with a typical meaning of that type. A more coherent API would be something like Range { start: LineColumn { line: 2, column: 18 }, end: LineColumn { line: 4, column: 5 } } but span.line_column_range().start.line is just a worse API for all use cases than span.line().

Byte range is different than this. In my experience the majority use case for byte offset is slicing. Code like https://github.com/dtolnay/cxx/blob/f9d547b60324bc02d9983622159973a75d06ea10/gen/src/error.rs#L115-L142. The PR summary writes off Range for being an iterator: "This gives you both ends of the range at once, but uses the Range type (which is an Iterator)", but misses that Range is more importantly a RangeBounds. People will not be using iteration on a byte range but they will be using it as RangeBounds for slicing.

I don't find that pushing this to be consistent with line/column improves the API. Unlike line/column, a single byte offset (span.byte_offset()) is generally not going to be useful. In the unusual case that someone needs a single one, span.byte_range().start is not worse for them than span.byte_offset(). Meanwhile span.byte_range() is better than span.byte_offset()..span.end().byte_offset().

@m-ou-se
Copy link
Member Author

m-ou-se commented Apr 16, 2025

The PR summary writes off Range for being an iterator

I've reworded it a bit, instead focussing on how it can be used to index a slice. Thanks for pointing that out.

@m-ou-se
Copy link
Member Author

m-ou-se commented Apr 16, 2025

Your argument sounds convincing to me.

If anyone still prefers option 1 over 2, please speak up.

@Amanieu
Copy link
Member

Amanieu commented Apr 20, 2025

I also prefer option 2.

However I have a different concern about stabilization: how will this interact with the new range types tracked in #123741?

@m-ou-se m-ou-se added the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Apr 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-proc-macros Area: Procedural macros I-libs-api-nominated Nominated for discussion during a libs-api team meeting. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants