-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Wip: Add new lint "char_slices" #6529
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wip: Add new lint "char_slices" #6529
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @llogiq (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
/// character. Usually users of `&[char]` want O(1) indexing on characters, | ||
/// not codepoints. | ||
/// | ||
/// **Known problems:** None. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When dealing with FFI to other high-level languages where strings are represented by sequences of char
s, the single correct way is to use &[char]
or Vec<char>
. This should be listed under "Known problems".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as far as we understand it, char
should not be used at FFI boundaries. consider u32
/i32
instead.
(from what we understand &
and &mut
shouldn't be used at FFI boundaries. basically any type that is rust ABI shouldn't be used at FFI boundaries?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found another counter-example (though nightly only): std::str::pattern::Pattern is implemented on &[char]
(which basically means any char in the slice matches the pattern). There may be other use cases where having chars equidistantly distributed in memory may be of the essence. I feel the Problems
section should reflect this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vec<char>
and &[char]
is also a completely reasonable way to represent UTF32 encoded strings.
/// **What it does:** Checks for use of `&[char]` anywhere in the code. | ||
/// | ||
/// **Why is this bad?** `char` represents a Unicode codepoint, not a | ||
/// character. Usually users of `&[char]` want O(1) indexing on characters, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docs should also say something about how to index on characters (e.g. unicode_segmentation
?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the issue, the correct way to index on characters is to use &[&str]
. I can try to make that more clear in the documentation if that is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was not talking about the type but the code.
@@ -581,6 +609,24 @@ impl Types { | |||
"a `VecDeque` might work", | |||
); | |||
return; // don't recurse into the type | |||
} else if let TyKind::Slice(s) = hir_ty.kind { | |||
if let TyKind::Path(ref qpath) = s.kind { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would probably be more readable using an if_chain!
. See elsewhere in the code for how to use it.
@@ -714,6 +760,25 @@ impl Types { | |||
}; | |||
self.check_ty(cx, &mut_ty.ty, is_local); | |||
}, | |||
TyKind::Slice(ref s) => { | |||
if let TyKind::Path(ref qpath) = s.kind { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be the same as L614..629. Perhaps factor out a method?
@@ -0,0 +1,15 @@ | |||
#![warn(clippy::char_slices)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is missing the corresponding char_slices.stderr
file. If you run the tests, you should get a message telling you how to "bless" the result of the test, and then you can commit the file. This will likely also make CI pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because it currently does not produce any output in stderr. I think something is wrong but I have no idea what it is. If you have any ideas they would be very welcome. (This is only my second PR to clippy, and my first was relatively easy because it was almost exactly the same as three other lints).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm... What I usually do in this case is to comment out lint conditions or move the lint out of other checks until output appears. Then I may reorder the checks until I've found the one that precludes linting and then I try to understand why it doesn't match.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@llogiq I think I found the problem. res.opt_def_id()
is never equal to cx.tcx.lang_items().char_impl()
. What's another or better way to see if the inner type of a slice is char?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO how about just using rustc_middle::ty::Char
to check a type in slice? I guess the hir_ty_to_ty
method can convert from rustc_hir::Ty
to ty::Ty
.
ping from triage @brightly-salty. There are some fixes to be done. Can you have questions to proceed this? |
@giraffate Sorry, I almost forgot about this PR. I'll get on it. |
ping from triage @brightly-salty. Can you have any updates on this? |
@giraffate Yes, I'm still working on it on my end. I'm just finding it very difficult to get it to actually work and lint any of the test code. If anyone else wants to look at it and has an idea that they think might work, help would be much appreciated. |
Closed because of complications with issue |
changelog:
changelog: adds new lint "char_slices" which lints on things of type
&[char]
and suggests changing to&[&str]
fixes #5598
Currently a WIP: does not work