Use coerce instead of unsafeCoerce where appropriate #130

JordanMartinez · 2020-11-15T18:29:41Z

Implements the suggestions made here: replace unsafeCoerce with coerce where appropriate.

JordanMartinez · 2020-11-15T19:05:36Z

I now understand why this is tricky to implement...

rhendric · 2020-11-16T09:03:42Z

Just curious, with safe coercion, what's the advantage of keeping the specialized toNonEmptyString etc. instead of just using coerce inline? (Not that there's much disadvantage, other than code size.)

MonoidMusician · 2020-11-16T13:03:54Z

I’m not sure Safe.Coerce should be used here, since String -> NonEmptyString is not a safe coercion. (In particular, coerce is assumed to be safe in both directions, which isn’t the case here – see one of the threads on the compiler repo.)

Also I thought that it required the constructor to be imported, or did I misunderstand?

JordanMartinez · 2020-11-16T15:03:21Z

Let me explain the nuances here.

First, you'll notice that the three functions toNonEmptyString, fromNonEmptyString, and liftS are all functions that are not exported from those two modules. These functions exist only to make implementations of other functions a bit easier.

Second, Ci doesn't build. If you look at the error message, it's because the Newtype constructor for NonEmptyString isn't imported into the module. And that's why I said above "I now understand why this is tricky to implement."

The NonEmptyString constructor can't be exported because that would ruin the whole point of that type. Thus,we also can't use the safe version of coerce within these two modules. Since these three functions are used in the implementations of the two modules' functions, the only way to switch these three functions to coerce would be to move both modules' code into Data.NonEmptyString.Internal, which defeats the purpose. We have both a CodeUnits and CodePoints module to distinguish between the two. If we move the code into the Internal module, we would have to prefix all function calls with cu or cp to distinguish between which its ultimately operating on.

So, I'm not sure this is really worth it at the end of the day.

hdgarrood · 2020-11-16T15:36:56Z

These functions aren’t affected by whether or not we happen to be thinking about code points or code units at the point they’re used, so I don’t think they need to be duplicated. Why not move them into Internal and re-export them as necessary so that they’re visible from all the places they’re currently visible now? There’s no need to add prefixes because they’re the same function.

hdgarrood · 2020-11-16T15:41:54Z

To clarify: not every string function needs to choose whether to behave in terms of code units or code points, so not everything needs to be duplicated. For example, all of the functions in Data.String.Common fit into this category and so they are all re-exported from both Data.String.CodePoints and Data.String.CodeUnits.

JordanMartinez · 2020-11-17T01:46:09Z

Why not move them into Internal and re-export them as necessary so that they’re visible from all the places they’re currently visible now? There’s no need to add prefixes because they’re the same function.

Which functions are you referring to? I'm assuming it's not toNonEmptyString, fromNonEmptyString, and liftS because moving those to Internal and reexporting them would break the guarantees of NonEmptyString.

If you are talking about functions like toCharArray, charAt, and uncons, then which can be deduplicated because they ultimately do the same thing whether one uses CodeUnits or CodePoints, and which cannot?

For example, uncons is implemented differently in both modules:

The CodeUnits version of uncons is:

uncons :: NonEmptyString -> { head :: Char, tail :: Maybe NonEmptyString }
uncons nes = 
  let s = fromNonEmptyString nes 
  in { head: U.charAt 0 s
     , tail: fromString (CU.drop 1 s) 
     }

The CodePoints version of uncons is:

uncons :: NonEmptyString -> { head :: Char, tail :: Maybe NonEmptyString }
uncons nes = 
  let s = fromNonEmptyString nes 
  in { head: unsafePartial fromJust (CP.codePointAt 0 s)
     , tail: fromString (CP.drop 1 s) 
     }

Will they produce the same content and so this function can be deduplicated? Or do they produce something different? I assume the latter. If it's moved into Internal because it uses fromNonEmptyString, then we'd need to distinguish one from the other somehow.

hdgarrood · 2020-11-17T02:14:52Z

I am just talking about toNonEmptyString, fromNonEmptyString, and liftS. Exporting them from Data.String.NonEmpty.Internal doesn’t violate any guarantees, because importing the Internal module already gives you access to the constructor. The idea is that you can import the Internal module if you want but it’s now your responsibility to ensure that the invariants are upheld. You only get guarantees if you avoid importing any Internal modules. Note that the Internal module already exports the NonEmptyString constructor.

hdgarrood · 2020-11-17T02:16:26Z

You’re right that we definitely don’t want to re-export these functions - their only use is in defining the other functions in Data.String.NonEmpty.{CodePoints,CodeUnits}, right?

hdgarrood · 2020-11-17T02:25:05Z

Oh, looking back at what I said before I don’t know why I said to re-export them. We can just move these functions into the Internal module and only import them directly from that module where they’re needed.

JordanMartinez · 2020-11-17T14:45:40Z

Note that the Internal module already exports the NonEmptyString constructor.

Oh! It does...

The idea is that you can import the Internal module if you want but it’s now your responsibility to ensure that the invariants are upheld. You only get guarantees if you avoid importing any Internal modules.

This seems like one of those Haskell "idioms" that never made sense to me.

If we refactored this code, so that the NonEmptyString constructor wasn't exported, would you be in agreement with that? Or do you think this escape hatch should continue to exist?

hdgarrood · 2020-11-17T15:11:36Z

I’m not sure it makes a huge difference really, since if you’re determined you can always use unsafeCoerce. I would be tempted to leave things as they are though - using the actual constructor is definitely safer than using unsafeCoerce if people do want an escape hatch, and personally I don’t think the change is beneficial enough to justify rearranging things (or to justify holding 0.14 up any more).

I think the benefit of providing an Internal module is clear once you’ve been in a situation where you can’t implement something you need without access to it. We use Internal modules from the Haskell Esqueleto package all the time in production at Lumi, because Esqueleto’s safe public API isn’t yet flexible enough to cover all our needs.

JordanMartinez · 2020-11-17T15:17:59Z

I've updated the code to use the constructor directly.

I think the benefit of providing an Internal module is clear once you’ve been in a situation where you can’t implement something you need without access to it. We use Internal modules from the Haskell Esqueleto package all the time in production at Lumi, because Esqueleto’s safe public API isn’t yet flexible enough to cover all our needs.

That makes sense in that situation. Is this repo's code mature enough to not fall into that category though? If so, then I think a breaking change fixing that should be done in the future, but not right now due to the v0.14.0 delay.

hdgarrood · 2020-11-17T15:58:32Z

I don't think it's really a question of maturity: just because a library has been in use in production for a few years, it doesn't mean that there aren't still situations where it doesn't quite fill a gap that it should - it could just be that nobody has tried to use it in a particular context yet. I'm against removing the Internal module later: I don't think it presents that much of a risk (since you have to opt in to using it) and I think we need the Internal module here because both Data.String.NonEmpty.CodeUnits and Data.String.NonEmpty.CodePoints need to be able to "unsafely" construct NonEmptyString values.

If you're worried about things like IDE tooling importing things from Internal modules when they could have been imported from non-Internal modules, then I think that's something we should address in the tooling. Perhaps we can modify the relevant tooling/plugins so that they don't list Internal modules by default, or so that they give you a warning when they add an import from an Internal module, or something like that.

thomashoneyman · 2020-11-17T17:07:06Z

(Not to bring the discussion off topic, but I would like tooling to demote internal modules. I import them by accident much more than I’d like.)

JordanMartinez · 2020-11-17T21:15:34Z

think we need the Internal module here because both Data.String.NonEmpty.CodeUnits and Data.String.NonEmpty.CodePoints need to be able to "unsafely" construct NonEmptyString values.

Yeah, I get that. However, the Internal module seems like a workaround to a different problem: that one module cannot access "private" parts of another module whereas all other modules cannot access those same "private" parts.

I get that people want the strong guarantees of a newtype. I also get that such a newtype's guarantees can be broken via unsafeCoerce and that there are times where one should be able to do that. I think breaking the guarantees should be more explicit via unsafeCoerce than using the NonEmptyString constructor, something that can be overlooked or (as Thomas pointed out) unknowingly imported.

Here's my order of preference for solving this problem:

Implement a better module system (hard and not an easy fix, especially when circular dependencies cause issues)
Stop exporting the NonEmptyString constructor, inline both CodeUnits and CodePoints into Internal with prefixed names, and update CodeUnits and CodePoints to use those prefixed names' functions as their implementations to reduce breakage. (tedious but possible; possible unneeded code bloat)
In tooling, demote all modules that end in Internal. When importing functions, these modules appear at the bottom of the list to reduce the chance of them being imported accidentally.
Update documentation on the Internal module and NonEmptyString type.

I can be happy with the fourth option since v0.14.0 is more important than this and I don't think this is a big deal. I also realize that the second option isn't ideal.

hdgarrood · 2020-11-17T21:35:23Z

FWIW I'm unlikely to be on board with changes to the module system which would enable us to expose parts of a module but only to certain other modules; I think it would complicate the module system unacceptably. I also expect it would have a pretty poor effort to payoff ratio. I think option 4 is the way to go for now.

JordanMartinez · 2020-11-18T01:32:17Z

I think option 4 is the way to go for now.

Works for me. Due to previous experience, I'm not going to immediately push what I think the docs should say. Rather, I will present my rough draft with the understanding that we'll iterate on it here. Once we get a final version, I'll push that as a commit to this PR.

I don't think Internal modules should be explained in these docs. I've opened an issue in my learning repo to cover this in more detail as I think that's a better location for that (JordanMartinez/purescript-jordans-reference#527).

Data.String.NonEmpty.Internal Module docs:

While most of the code in this module is safe, this module does export a few partial functions and the NonEmptyString constructor. While the partial functions are obvious from the Partial constraint in their type signature, the NonEmptyString constructor can be overlooked when searching for issues in one's code. See the constructor's documentation for more information.

NonEmptyString constructor docs:

You can use this constructor to create a NonEmptyString that isn't non-empty, breaking the guarantee behind this newtype. It is provided as an escape hatch mainly for the Data.NonEmpty.CodeUnits and Data.NonEmpty.CodePoints modules. Usage of this constructor in your own code provides the same safety guarantees of unsafeCoerce, that is, none at all. Use this at your own risk when you know what you are doing.

hdgarrood · 2020-11-18T08:21:50Z

That all sounds great to me 👍

JordanMartinez · 2020-11-18T15:16:04Z

I added docs to the constructor, but it's not showing up on the generated docs. Did I do it wrong? Or is this a bug in the compiler?

hdgarrood · 2020-11-18T15:33:47Z

src/Data/String/NonEmpty/Internal.purs

+  -- | non-empty, breaking the guarantee behind this newtype. It is
+  -- | provided as an escape hatch mainly for the `Data.NonEmpty.CodeUnits`
+  -- | and `Data.NonEmpty.CodePoints` modules. Usage of this constructor
+  -- | in your own code provides the same safety guarantees of


This isn't quite true; the NonEmptyString constructor at least requires that you provide a string, whereas unsafeCoerce doesn't. That is, you can do unsafeCoerce 3 :: NonEmptyString, whereas you can't do NonEmptyString 3 :: NonEmptyString.

Sorry, I missed this before.

Lol. No worries. What are you suggestions for how to fix it?

Can’t we just remove this sentence? The first one already mention that usage of this constructor breaks the newtype guarantees.

Yeah, that sounds good to me.

hdgarrood · 2020-11-18T15:36:07Z

I guess documenting individual constructors just hasn't been implemented yet.

JordanMartinez · 2020-11-18T20:11:13Z

Should I move the docs to the type then? Not quite ideal as it seems to bloat the type's docs, but better than nothing...?

kl0tl · 2020-11-19T22:54:05Z

I did the following in purescript/purescript-arrays#184:

kl0tl · 2020-11-19T22:54:45Z

src/Data/String/NonEmpty/Internal.purs

@@ -13,7 +19,15 @@ import Prim.TypeError as TE
 import Unsafe.Coerce (unsafeCoerce)

 -- | A string that is known not to be empty.
-newtype NonEmptyString = NonEmptyString String
+newtype NonEmptyString =
+  -- | You can use this constructor to create a NonEmptyString that isn't


*NonEmptyString (with backticks)?

Constructor docs don't yet appear in the generated documentation

JordanMartinez · 2020-11-26T18:12:35Z

Thanks for the feedback. I've pushed all changes mentioned.

src/Data/String/NonEmpty/CodePoints.purs

rhendric mentioned this pull request Nov 16, 2020

Core library updates for 0.14 purescript/purescript#3942

Closed

57 tasks

Replace unsafeCoerce with safer alternative where possible

91c8aa3

Remove unused import

a0423b3

JordanMartinez mentioned this pull request Nov 18, 2020

Document Internal modules JordanMartinez/purescript-jordans-reference#527

Closed

Add docs to the Internal module and NonEmptyString constructor

3fdf060

hdgarrood reviewed Nov 18, 2020

View reviewed changes

kl0tl mentioned this pull request Nov 19, 2020

Remove some usages of unsafeCoerce purescript/purescript-arrays#184

Merged

kl0tl reviewed Nov 19, 2020

View reviewed changes

JordanMartinez added 3 commits November 26, 2020 10:10

Remove 'usage of this constructor...' sentence

0157801

Move constructor docs to type docs

c0be958

Constructor docs don't yet appear in the generated documentation

Add backticks around NonEmptyString in docs

9dcca40

thomashoneyman reviewed Nov 26, 2020

View reviewed changes

src/Data/String/NonEmpty/CodePoints.purs Outdated Show resolved Hide resolved

thomashoneyman approved these changes Nov 26, 2020

View reviewed changes

Add "For internal use only. Do not export" above to/fromNEString & listS

036e260

JordanMartinez merged commit e897bb7 into purescript:master Nov 26, 2020

JordanMartinez deleted the addKl0tliSuggestions branch November 26, 2020 20:30

Use coerce instead of unsafeCoerce where appropriate #130

Use coerce instead of unsafeCoerce where appropriate #130

Uh oh!

Conversation

JordanMartinez commented Nov 15, 2020

Uh oh!

JordanMartinez commented Nov 15, 2020

Uh oh!

rhendric commented Nov 16, 2020

Uh oh!

MonoidMusician commented Nov 16, 2020

Uh oh!

JordanMartinez commented Nov 16, 2020

Uh oh!

hdgarrood commented Nov 16, 2020

Uh oh!

hdgarrood commented Nov 16, 2020

Uh oh!

JordanMartinez commented Nov 17, 2020

Uh oh!

hdgarrood commented Nov 17, 2020

Uh oh!

hdgarrood commented Nov 17, 2020

Uh oh!

hdgarrood commented Nov 17, 2020

Uh oh!

JordanMartinez commented Nov 17, 2020

Uh oh!

hdgarrood commented Nov 17, 2020

Uh oh!

JordanMartinez commented Nov 17, 2020

Uh oh!

hdgarrood commented Nov 17, 2020

Uh oh!

thomashoneyman commented Nov 17, 2020

Uh oh!

JordanMartinez commented Nov 17, 2020

Uh oh!

hdgarrood commented Nov 17, 2020

Uh oh!

JordanMartinez commented Nov 18, 2020

Uh oh!

hdgarrood commented Nov 18, 2020

Uh oh!

JordanMartinez commented Nov 18, 2020

Uh oh!

hdgarrood Nov 18, 2020

Choose a reason for hiding this comment

Uh oh!

hdgarrood Nov 18, 2020

Choose a reason for hiding this comment

Uh oh!

JordanMartinez Nov 18, 2020

Choose a reason for hiding this comment

Uh oh!

kl0tl Nov 19, 2020

Choose a reason for hiding this comment

Uh oh!

hdgarrood Nov 19, 2020

Choose a reason for hiding this comment

Uh oh!

hdgarrood commented Nov 18, 2020

Uh oh!

JordanMartinez commented Nov 18, 2020

Uh oh!

kl0tl commented Nov 19, 2020

Uh oh!

kl0tl Nov 19, 2020

Choose a reason for hiding this comment

Uh oh!

JordanMartinez commented Nov 26, 2020

Uh oh!

Uh oh!

Uh oh!