[css-text-3] Clarify Segment Break Transformation Rules when multiple segment breaks are involved #836

Closed
upsuper opened this issue Dec 26, 2016 · 5 comments
@upsuper
Member

upsuper commented Dec 26, 2016

The first rule for collapsing segment breaks is:

If the character immediately before or immediately after the segment break is the zero-width space character (U+200B), then the break is removed, leaving behind the zero-width space.

It is not clear to me what should happen when multiple segment breaks are involved. For example, if I have ZWSP LF LF LF x, would this rule produce:

  1. ZWSP LF LF x (with only the first LF removed), or
  2. ZWSP x (with all LF removed because of recursively applying this rule)?

(In the first case, the remaining LFs would be converted to spaces by the last rule there, and the second space would be removed by step 4 of Phase I, so the final result would be ZWSP WS x.)
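To make the two readings concrete, here is a minimal Python sketch. The function names `collapse_per_break` and `collapse_per_run` are hypothetical, and the sketch only models the ZWSP rule plus the fallback "convert to a space, then collapse consecutive spaces"; it ignores the East Asian Width rule entirely.

```python
import re

ZWSP = "\u200b"  # zero-width space


def collapse_per_break(text: str) -> str:
    """Reading 1: test each segment break on its own against its
    immediate neighbours in the original text."""
    out = []
    for i, ch in enumerate(text):
        if ch != "\n":
            out.append(ch)
            continue
        before = text[i - 1] if i > 0 else ""
        after = text[i + 1] if i + 1 < len(text) else ""
        if before == ZWSP or after == ZWSP:
            continue  # break removed, ZWSP stays
        out.append(" ")  # fallback: break becomes a space
    # Stand-in for step 4 of Phase I: collapse consecutive spaces.
    return re.sub(r" {2,}", " ", "".join(out))


def collapse_per_run(text: str) -> str:
    """Reading 2: treat a run of consecutive segment breaks as one unit
    and test the characters on either side of the whole run."""
    def repl(m: re.Match) -> str:
        before = text[m.start() - 1] if m.start() > 0 else ""
        after = text[m.end()] if m.end() < len(text) else ""
        return "" if before == ZWSP or after == ZWSP else " "
    return re.sub(r"\n+", repl, text)


sample = ZWSP + "\n\n\n" + "x"
print(repr(collapse_per_break(sample)))  # '\u200b x'
print(repr(collapse_per_run(sample)))    # '\u200bx'
```

The two functions agree whenever a run contains a single segment break; they only diverge on runs of two or more, which is exactly the case the current wording leaves open.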

This may also affect the second rule:

Otherwise, if the East Asian Width property of both the character before and after the line feed is F, W, or H (not A), and neither side is Hangul, then the segment break is removed.

If I have W LF LF W, should the two LFs be removed by this rule?

It seems to me that removing all segment breaks together would be easier to implement, so I propose defining the rules that way unless there are other concerns.
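As a rough illustration of the "remove the whole run" behaviour for the second rule, here is a Python sketch. The helper names are hypothetical, and the Hangul test (looking for "HANGUL" in the Unicode character name) is only a heuristic I am assuming for the example, not something the spec defines.

```python
import re
import unicodedata


def is_wide_non_hangul(ch: str) -> bool:
    """True if ch has East Asian Width F, W, or H and is not Hangul.
    The name-based Hangul check is an assumption made for this sketch."""
    if not ch:
        return False
    if unicodedata.east_asian_width(ch) not in ("F", "W", "H"):
        return False
    return "HANGUL" not in unicodedata.name(ch, "")


def transform_break_runs(text: str) -> str:
    """Apply the East Asian Width rule to each run of consecutive
    segment breaks as a unit; otherwise the run becomes one space."""
    def repl(m: re.Match) -> str:
        before = text[m.start() - 1] if m.start() > 0 else ""
        after = text[m.end()] if m.end() < len(text) else ""
        if is_wide_non_hangul(before) and is_wide_non_hangul(after):
            return ""  # whole run removed
        return " "
    return re.sub(r"\n+", repl, text)


print(transform_break_runs("中\n\n文"))  # 中文
print(transform_break_runs("a\n\nb"))    # a b
print(transform_break_runs("한\n\n글"))  # 한 글
```

Under the per-break reading, W LF LF W would instead end up as W WS W, because neither LF has a wide character on both sides.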

@upsuper upsuper added the css-text-3 Current Work label Dec 26, 2016
@upsuper
Member Author

upsuper commented Dec 26, 2016

cc @chenpighead

@fantasai
Collaborator

Amending this to be consecutive segment breaks makes sense to me. Seems like limiting it to only one is an error. The only thing that shouldn't change is that, if there's a space or tab somewhere in that sequence, the sequence becomes a space and not nothing.

@upsuper
Member Author

upsuper commented Dec 26, 2016

For those rules, I don't think there can be any space or tab in their input, because any spaces or tabs next to a segment break should already have been removed by step 1 of Phase I.
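A small Python sketch of that point, assuming step 1 of Phase I strips every space and tab immediately before or after a segment break (the function name `strip_around_breaks` is hypothetical):

```python
import re


def strip_around_breaks(text: str) -> str:
    """Stand-in for step 1 of Phase I: drop spaces and tabs that
    immediately precede or follow a segment break."""
    return re.sub(r"[ \t]*\n[ \t]*", "\n", text)


before = "\u200b\n \t\n\nx"
after = strip_around_breaks(before)
print(repr(after))  # '\u200b\n\n\nx'
```

Spaces or tabs sitting between two segment breaks are adjacent to a break on both sides, so after this step a sequence of breaks interrupted only by spaces or tabs reaches the transformation rules as an unbroken run of segment breaks.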

@fantasai
Collaborator

Oh, right. Yeah, that's probably more correct. :)

@fantasai
Collaborator

Agenda+ to confirm the fix.

@astearns astearns removed the Agenda+ label Mar 8, 2017
@frivoal frivoal added the Tested Memory aid - issue has WPT tests label Apr 25, 2019