Drop UNICODE_VERSION ? #104

Open
@jquast

From the work on ucs-detect and its results, https://ucs-detect.readthedocs.io/results.html, I have discovered that terminals do not support a single version of Unicode. At this time, very few support any one version of the specification completely.

  • For specific types, like wide characters, support may vary from version to version, e.g. GNOME Terminal, https://ucs-detect.readthedocs.io/sw_results/GNOMETerminal.html#gnometerminal, supports 93% of the characters unique to version 15.0, and 90% of the characters unique to version 14.
  • It may not be immediately obvious, because "Language Support" is a bit of a proxy for "Zero-Width support": combining characters are best tested together with the characters they are expected to combine with. But a terminal's support for combining characters, or the tables used in its code, does not necessarily match its latest wide table. In fact, most terminals only update their wide tables, to meet the popular demand for emoji support.
  • And of course, though ZWJ and VS-16 came out at roughly Unicode versions 8 and 9, very few of the terminals whose wide tables support Unicode 9 or higher also support ZWJ and VS-16, see this specific part of the table:

[image: excerpt of the ucs-detect results table]
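
To illustrate why a wide table alone cannot describe ZWJ and VS-16 support, here is a minimal sketch using only Python's standard `unicodedata` module. The `naive_width` helper is hypothetical, written just for this illustration, and is not how wcwidth or any terminal is implemented; it only sums per-codepoint table lookups.

```python
import unicodedata

ZWJ = "\u200d"   # ZERO WIDTH JOINER
VS16 = "\ufe0f"  # VARIATION SELECTOR-16


def naive_width(text):
    """Sum per-codepoint widths from table lookups only.

    Hypothetical helper for illustration: it knows nothing about
    ZWJ sequences or VS-16, it only skips zero-width codepoints.
    """
    total = 0
    for ch in text:
        # The joiner and non-spacing/enclosing marks occupy no cell.
        if ch == ZWJ or unicodedata.category(ch) in ("Mn", "Me"):
            continue
        # East Asian Wide/Fullwidth codepoints occupy two cells.
        total += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return total


# MAN + ZWJ + WOMAN + ZWJ + GIRL: a single "family" emoji when ZWJ is supported.
family = "\U0001F468" + ZWJ + "\U0001F469" + ZWJ + "\U0001F467"
# HEAVY BLACK HEART + VS-16: requests emoji (wide) presentation.
heart = "\u2764" + VS16

print(naive_width(family))  # table-only result: three wide emoji
print(naive_width(heart))   # table-only result: one narrow character
```

With a recent Python, the table-only answer should be 6 cells for the family sequence and 1 cell for the heart, while a terminal that honors ZWJ and VS-16 would be expected to draw each in 2 cells. A terminal can therefore have an up-to-date wide table and still disagree with the expected width of these sequences.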

Because of those results, I think it's perfectly fine to drop support for UNICODE_VERSION. I very much doubt it is used, or useful to anyone when it is, because it cannot correctly describe the terminal's support to wcwidth.

Is it a useful idea?

I was interested in whether terminal emulator authors would have feedback about UNICODE_VERSION, and whether they would consider exporting it. I have not received any feedback.

However, with tools like 'ucs-detect', we can determine programmatically, through black-box testing, which wide and zero-width characters are supported, and whether ZWJ and VS-16 are supported, right down to exactly which ones. By expressing this as a delta from the expected terminal support, and using ranges of codepoints, maybe it is possible to describe it with a more complex environment variable.

Just spitballing an idea of what it might look like:

UNICODE_SUPPORT="zero[8.0:!category:Mc,Mn,!1001-1002,!1003],wide[15.1:!zwj,!vs16,!9009-9010]"
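
As a rough sketch of how such a value could be read, assuming a grammar of `name[version:item,item,...]` sections where a leading `!` marks something that is not supported. Nothing here is an existing wcwidth API: `parse_unicode_support` and the result keys are hypothetical names, and the tokens are kept as plain strings because the proposal does not say whether ranges are hexadecimal or how `category:Mc,Mn` should group.

```python
import re

# Hypothetical grammar: name[version:item,item,...]
SECTION_RE = re.compile(r"(\w+)\[([0-9.]+):([^\]]*)\]")


def parse_unicode_support(value):
    """Split a UNICODE_SUPPORT-style string into per-table sections.

    Tokens are returned verbatim; a leading '!' is read as
    "not supported" (an assumption about the proposed format).
    """
    result = {}
    for name, version, body in SECTION_RE.findall(value):
        tokens = [t for t in body.split(",") if t]
        result[name] = {
            "version": version,
            "exclude": [t[1:] for t in tokens if t.startswith("!")],
            "other": [t for t in tokens if not t.startswith("!")],
        }
    return result


example = (
    "zero[8.0:!category:Mc,Mn,!1001-1002,!1003],"
    "wide[15.1:!zwj,!vs16,!9009-9010]"
)
for table, spec in parse_unicode_support(example).items():
    print(table, spec)
```

Note that a plain comma split leaves `Mn` as a token of its own, separate from `!category:Mc`, which shows that the list syntax would need a less ambiguous separator before anything like this could be specified.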
