Suggestion: allow `slice (length s) (length s)` on Strings #143

Quelklef · 2021-02-13T07:40:50Z

Currently, Data.String.CodeUnits.slice returns Nothing on slices from the final index to the final index, e.g.

slice 4 4 "test" == Nothing

I find this behavior to be undesirable. It:

Means that "out of bounds" has different meanings for the two bounds, which I personally find to be unintuitive.
Nullifies the invarant that for each i from 0 through length s, we have s == slice 0 i s <> slice i (length s). Instead, this only holds up to length s - 1.

Motivating example: I was dealing with string containing two parts separated by a delimiter, like prefix:::suffix. I had a function to separate the two parts, getParts :: String -> Maybe (String /\ String), which was implemented via indexOf and split. With the current split implementation, this function has that getParts ":::suffix" == Just ("" /\ "suffix") but getParts "prefix:::" == Nothing, which is surprising.

Happy to make an MR if it's decided that this is reasonable. I suppose it's technically a reverse-incompatible change, though.

The text was updated successfully, but these errors were encountered:

JordanMartinez · 2021-02-18T01:58:13Z

One downside to accepting this PR is that the changes would be silent. What returned Nothing would now return Just a.

However, I don't think this design was a mistake. This is what I get when I run this in my browser console:

> "1".slice(1, 1);
""

Quelklef · 2021-02-18T03:34:42Z

I don't quite understand your comment.

By silent you mean that it could silently break existing code? That's a fair point; it would be a reverse-incompatible change, and I understand if it's rejected for that reason.

Yeah, "1".slice(1, 1) is "". But I don't understand the implication of that. Is your thinking that "" should become Nothing? In that case, slice 1 1 "abc" should be Nothing, but it's Just "".

JordanMartinez · 2021-02-18T14:37:22Z

Silent in the sense that the compiler doesn't issue an error or otherwise notify you that this function no longer works the same as it did before.

Ah, I didn't know that slice 1 1 "abc" returns back Just "". I guess it only returns Nothing when an error would otherwise be thrown. Then, I do think there is weight behind the point you're making.

Quelklef · 2021-02-18T16:25:56Z

Yeah, that's totally a valid concern. A fair option would be to provided a modified slice' function and a big bold note in the docs about the (subtle) difference.

JS slice actually never throws (except for on type errors). Try, for instance, "".slice(20, 25). The PS slice function returns Nothing when the indices are considered invalid, such as being "out of bounds"; this ticket is me arguing that the definition of "out of bounds" is bad on a particular edge case.

garyb · 2021-02-18T16:44:05Z

I don't really have a good answer to deal with the potential breakage here, but I'd agree that this being "out of bounds" is a mistake. An unfortunate legacy I introduced from the early days of this library... I think there may be some older issues discussing it previously where I argued against it, but I think I was wrong to do so now. Especially since the advent of NonEmptyString, since the situations I had in mind would be better expressed in terms of that.

One way to make it "obviously breaking" would be to remove the Maybe and to no longer track bounds, just have empty strings come back for all those situations, like it does in JS. Similarly, the somewhat related take and drop don't do bounds checking. I don't know... there probably is an occasional use case for having an empty string result allowable but catching out of bounds slices, but I don't know how common that would be.

MonoidMusician · 2021-02-18T17:05:17Z

I had a similar complaint with a different function back in #78

garyb · 2021-02-18T17:10:06Z

Thanks for the link! I guess I ducked that conversation 😉

JordanMartinez · 2021-02-18T20:54:26Z

I think the easiest fix is to add slice' with the correct implementation and then do a deprecation cycle in future breaking changes.

Thoughts?

hdgarrood · 2021-02-18T21:11:52Z

Very 👎 for adding a slice' alongside the current one. As I said in #78:

The way I see it, the most important thing is that the API exported by this package is coherent and consistent

This is probably one of the most important libraries in core, and so I think it's vital that we take a lot of care over the API surface we expose. We definitely shouldn't just do what's easiest; having a slice and a slice' alongside each other would offer a poor experience as you'd have to decide which one you want to use each time - that's a decision we should be making rather than offloading to users. As a user of this library, you'd have to understand what both variants do if you work in different projects where some prefer one variant and some prefer the other.

I think the conclusion we reached in #78 was the correct one (i.e. that not returning Nothing for out of bounds is best) but I have one regret in how we made that change, in that we didn't go over the whole API to ensure that our approach to out-of-bounds cases for functions that do slicing is consistent throughout. If we had, we probably would have made this change then as well. If/when we do this I think it's crucial that we go over the whole API to make sure that everything is consistent in how it handles out-of-bounds cases, or at least as consistent as we can reasonably make it.

Quelklef · 2021-02-18T21:16:18Z

How about:

Creating a new ticket for homogenizing all string-and-index operations to behave consistently, allowing out-of-bounds indices as per splitAt without Maybe? #78; and then
Creating a new ticket to re-add sliceMaybe, takeMaybe, dropMaybe, so we're still supporting these use-cases?

hdgarrood · 2021-02-18T21:20:03Z

I think we should consider this ticket as your point 1, and I think we shouldn't do your point 2; it's not compatible with what I was saying:

having a slice and a slice' alongside each other would offer a poor experience as you'd have to decide which one you want to use each time

The situation is marginally improved by having better names but it's still very much not ideal in my mind.

Quelklef · 2021-02-18T21:31:13Z

If you're intent on having one true slice, then I think it actually makes more sense to use the strict version. The non-strict version can trivially be recovered with >>> fromMaybe "", but going the other way around is more difficult.

hdgarrood · 2021-02-18T22:02:20Z

The reason I'm unconvinced by Maybes is that we've had a few discussions about out-of-bounds slicing over the years and, as far as I know, not once has anyone been able to give a persuasive concrete example of where the Maybe behaviour is useful.

garyb · 2021-02-18T22:19:31Z

The reason I'm unconvinced by Maybes is that we've had a few discussions about out-of-bounds slicing over the years and, as far as I know, not once has anyone been able to give a persuasive concrete example of where the Maybe behaviour is useful.

Yeah, that's why I changed my mind on it too. I'm pretty sure any case where it would have been useful will be done under NonEmptyString instead.

Quelklef · 2021-02-19T02:20:44Z

Makes sense. (I feel like I have in the past had use-cases for Maybe, but I cannot think of anything concrete)

On the topic of consistency and cohesion, should the normalization of negative indices even be performed? Similar functions (splitAt, charAt) don't seem to do such normalization. Besides, at least to my intuition, it's only useful for syntactic convenience and otherwise may cause subtle bugs.

Regardless, I've created a WIP PR for this ticket.

kl0tl mentioned this issue Feb 18, 2021

Prepare v5.0.0 release #144

Merged

6 tasks

Quelklef mentioned this issue Feb 19, 2021

[WIP] relax Data.String.CodeUnits.slice + related changes as per #143 #145

Merged

4 tasks

JordanMartinez added purs-0.15 A reminder to address this issue or merge this PR before we release PureScript v0.15.0 type: breaking change A change that requires a major version bump. labels Dec 1, 2021

thomashoneyman closed this as completed in #145 Mar 25, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Suggestion: allow `slice (length s) (length s)` on Strings #143

Suggestion: allow `slice (length s) (length s)` on Strings #143

Quelklef commented Feb 13, 2021 •

edited

Loading

JordanMartinez commented Feb 18, 2021

Quelklef commented Feb 18, 2021 •

edited

Loading

JordanMartinez commented Feb 18, 2021

Quelklef commented Feb 18, 2021 •

edited

Loading

garyb commented Feb 18, 2021

MonoidMusician commented Feb 18, 2021

garyb commented Feb 18, 2021

JordanMartinez commented Feb 18, 2021

hdgarrood commented Feb 18, 2021

Quelklef commented Feb 18, 2021

hdgarrood commented Feb 18, 2021

Quelklef commented Feb 18, 2021

hdgarrood commented Feb 18, 2021

garyb commented Feb 18, 2021

Quelklef commented Feb 19, 2021 •

edited

Loading

Suggestion: allow slice (length s) (length s) on Strings #143

Suggestion: allow slice (length s) (length s) on Strings #143

Comments

Quelklef commented Feb 13, 2021 • edited Loading

JordanMartinez commented Feb 18, 2021

Quelklef commented Feb 18, 2021 • edited Loading

JordanMartinez commented Feb 18, 2021

Quelklef commented Feb 18, 2021 • edited Loading

garyb commented Feb 18, 2021

MonoidMusician commented Feb 18, 2021

garyb commented Feb 18, 2021

JordanMartinez commented Feb 18, 2021

hdgarrood commented Feb 18, 2021

Quelklef commented Feb 18, 2021

hdgarrood commented Feb 18, 2021

Quelklef commented Feb 18, 2021

hdgarrood commented Feb 18, 2021

garyb commented Feb 18, 2021

Quelklef commented Feb 19, 2021 • edited Loading

Suggestion: allow `slice (length s) (length s)` on Strings #143

Suggestion: allow `slice (length s) (length s)` on Strings #143

Quelklef commented Feb 13, 2021 •

edited

Loading

Quelklef commented Feb 18, 2021 •

edited

Loading

Quelklef commented Feb 18, 2021 •

edited

Loading

Quelklef commented Feb 19, 2021 •

edited

Loading