Skip to content

Wip: Add new lint "char_slices" #6529

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

brightly-salty
Copy link
Contributor

changelog:
changelog: adds new lint "char_slices" which lints on things of type &[char] and suggests changing to &[&str]
fixes #5598

Currently a WIP: does not work

@rust-highfive
Copy link

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @llogiq (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Dec 31, 2020
/// character. Usually users of `&[char]` want O(1) indexing on characters,
/// not codepoints.
///
/// **Known problems:** None.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When dealing with FFI to other high-level languages where strings are represented by sequences of chars, the single correct way is to use &[char] or Vec<char>. This should be listed under "Known problems".

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as far as we understand it, char should not be used at FFI boundaries. consider u32/i32 instead.

(from what we understand & and &mut shouldn't be used at FFI boundaries. basically any type that is rust ABI shouldn't be used at FFI boundaries?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found another counter-example (though nightly only): std::str::pattern::Pattern is implemented on &[char] (which basically means any char in the slice matches the pattern). There may be other use cases where having chars equidistantly distributed in memory may be of the essence. I feel the Problems section should reflect this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vec<char> and &[char] is also a completely reasonable way to represent UTF32 encoded strings.

/// **What it does:** Checks for use of `&[char]` anywhere in the code.
///
/// **Why is this bad?** `char` represents a Unicode codepoint, not a
/// character. Usually users of `&[char]` want O(1) indexing on characters,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs should also say something about how to index on characters (e.g. unicode_segmentation?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the issue, the correct way to index on characters is to use &[&str]. I can try to make that more clear in the documentation if that is correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not talking about the type but the code.

@@ -581,6 +609,24 @@ impl Types {
"a `VecDeque` might work",
);
return; // don't recurse into the type
} else if let TyKind::Slice(s) = hir_ty.kind {
if let TyKind::Path(ref qpath) = s.kind {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would probably be more readable using an if_chain!. See elsewhere in the code for how to use it.

@@ -714,6 +760,25 @@ impl Types {
};
self.check_ty(cx, &mut_ty.ty, is_local);
},
TyKind::Slice(ref s) => {
if let TyKind::Path(ref qpath) = s.kind {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be the same as L614..629. Perhaps factor out a method?

@@ -0,0 +1,15 @@
#![warn(clippy::char_slices)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is missing the corresponding char_slices.stderr file. If you run the tests, you should get a message telling you how to "bless" the result of the test, and then you can commit the file. This will likely also make CI pass.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because it currently does not produce any output in stderr. I think something is wrong but I have no idea what it is. If you have any ideas they would be very welcome. (This is only my second PR to clippy, and my first was relatively easy because it was almost exactly the same as three other lints).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... What I usually do in this case is to comment out lint conditions or move the lint out of other checks until output appears. Then I may reorder the checks until I've found the one that precludes linting and then I try to understand why it doesn't match.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@llogiq I think I found the problem. res.opt_def_id() is never equal to cx.tcx.lang_items().char_impl(). What's another or better way to see if the inner type of a slice is char?

Copy link
Contributor

@giraffate giraffate Feb 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO how about just using rustc_middle::ty::Char to check a type in slice? I guess the hir_ty_to_ty method can convert from rustc_hir::Ty to ty::Ty.

@brightly-salty brightly-salty marked this pull request as draft January 6, 2021 01:33
@giraffate
Copy link
Contributor

ping from triage @brightly-salty. There are some fixes to be done. Can you have questions to proceed this?

@brightly-salty
Copy link
Contributor Author

@giraffate Sorry, I almost forgot about this PR. I'll get on it.

@giraffate
Copy link
Contributor

ping from triage @brightly-salty. Can you have any updates on this?

@giraffate giraffate added S-waiting-on-author Status: This is awaiting some action from the author. (Use `@rustbot ready` to update this status) and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties labels Feb 22, 2021
@brightly-salty
Copy link
Contributor Author

@giraffate Yes, I'm still working on it on my end. I'm just finding it very difficult to get it to actually work and lint any of the test code. If anyone else wants to look at it and has an idea that they think might work, help would be much appreciated.

@brightly-salty
Copy link
Contributor Author

Closed because of complications with issue

@brightly-salty brightly-salty deleted the char-slice-to-str-slice branch March 4, 2021 19:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action from the author. (Use `@rustbot ready` to update this status)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add lint on &[char]
6 participants