Explain UTF-8 BOM rule in readme #1640

mk-pmb · 2017-11-18T18:43:05Z

Most style decisions are explained in the readme, but I couldn't find the reasoning on why a BOM is considered bad. Where I've looked:

the patch that enabled it, found no commit message body.

the current code of the rules file,

javascript/packages/eslint-config-airbnb-base/rules/style.js

Lines 472 to 474 in 8cf2c70

    
    // require or disallow the Unicode Byte Order Mark
    // https://eslint.org/docs/rules/unicode-bom
    'unicode-bom': ['error', 'never'],

the eslint rule page mentioned in the patch code comment, doesn't claim it's bad.
searched the readme for "BOM", "unicode" and "byte order"
searched issue tracker for "BOM", "unicode" and "byte order"

Could someone explain it, or add search keywords to make the explanation easier to find?

Update: Also, is there a recommendation on how to declare the file encoding instead? I searched the readme for "charset", "encod" and "character set" but no matches.

ljharb · 2017-11-18T21:28:44Z

The rule page says "UTF-8 does not require a BOM because byte ordering does not matter when characters are a single byte. Since UTF-8 is the dominant encoding of the web, we make "never" the default option." - the reasoning for choosing "never" is because files should always only be in UTF-8.

Are you running into issues with this rule?
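
For reference, the BOM this rule is about is the code point U+FEFF, which UTF-8 encodes as the three bytes EF BB BF at the start of a file. Below is a minimal sketch of a check in Node.js; `hasUtf8Bom` is a made-up helper name for illustration and not part of ESLint or this config.

```javascript
// The UTF-8 BOM is U+FEFF encoded as the bytes EF BB BF.
const UTF8_BOM = Buffer.from([0xef, 0xbb, 0xbf]);

// Hypothetical helper: true if the buffer starts with the UTF-8 BOM.
function hasUtf8Bom(buf) {
  return buf.length >= 3 && buf.subarray(0, 3).equals(UTF8_BOM);
}

const plain = Buffer.from('const a = 1;\n', 'utf8');
const withBom = Buffer.concat([UTF8_BOM, plain]);

console.log(hasUtf8Bom(plain));   // false
console.log(hasUtf8Bom(withBom)); // true
```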

mk-pmb · 2017-11-19T11:49:37Z

> Since UTF-8 is the dominant encoding of the web, we make "never" the default option.

Yes, that's what I found there as well. It's ok as a default for eslint. I had assumed the patch restated it because there was a stronger reason for "never".
In my projects I like BOMs because Firefox will trust the BOM more than it trusts the Content-Type header, thus it protects my files' encoding when used on webspace that announces another charset (e.g. legacy website). It's also helpful on webspace that doesn't send any charset info (e.g. python -m SimpleHTTPServer), or not even any Content-Type (file:// access).

To reproduce, run Python 2.7.6 and Firefox 57 on Ubuntu trusty with system locale en_US.UTF-8. Python's SimpleHTTPServer sends just "text/plain" without a charset, probably because it doesn't consider itself authoritative to guess more details. Both this way and when loading via file://, without a BOM, Firefox guesses "windows-1252" and consequently garbles umlauts and emoji.
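
The garbling can be imitated outside the browser. This is only a sketch: it does not reproduce Firefox's detection logic, and it uses Node's `latin1` decoding as a stand-in for windows-1252, which is an assumption that holds for the two bytes of "ü" used here but not for every byte.

```javascript
// "ü" is stored in UTF-8 as the two bytes C3 BC.
const text = 'für';
const bytes = Buffer.from(text, 'utf8');

// Decoding those bytes with a single-byte charset turns each byte
// into its own character, which is what a wrong charset guess does.
const misread = bytes.toString('latin1');

console.log(bytes.length); // 4 (one byte each for "f" and "r", two for "ü")
console.log(misread);      // fÃ¼r
```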

ljharb · 2017-11-19T14:26:10Z

In general, this config repeats all the defaults explicitly.

I'd say that your solution should be to use a build process to auto-insert the BOM for you, rather than encoding that directly in the file.

Separately, lots of web features are broken on file://, so it shouldn't be used for any reason ever anyways.
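
A build step along those lines could look roughly like the sketch below. `ensureUtf8Bom` is a hypothetical helper, and how it is wired into a real build (which files it reads, where it writes) is left open because that depends on the project.

```javascript
const UTF8_BOM = Buffer.from([0xef, 0xbb, 0xbf]);

// Hypothetical build-step helper: returns the content with exactly one
// leading UTF-8 BOM, so running it twice does not add a second one.
function ensureUtf8Bom(buf) {
  const alreadyHasBom = buf.length >= 3 && buf.subarray(0, 3).equals(UTF8_BOM);
  return alreadyHasBom ? buf : Buffer.concat([UTF8_BOM, buf]);
}

// The source stays BOM-free (and passes the lint rule);
// only the build output carries the BOM.
const source = Buffer.from('console.log("ü");\n', 'utf8');
const output = ensureUtf8Bom(source);

console.log(output.length - source.length); // 3
```

In an actual build the buffers would come from reading the source files and the result would be written to the output directory.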

mk-pmb · 2017-11-19T14:39:16Z

> your solution should be to use a build process

I could imagine an argument for separation of concerns: The transport and compatibility issues should be solved by some other mechanism because the code files should only be concerned with behavior.
Are there other reasons to suggest a build process in general, independent of project details?

ljharb · 2017-11-19T14:40:36Z

Modern web dev requires a build process anyways (for minification, babel, etc) - it has for years, and it will for the foreseeable future.

mk-pmb · 2017-11-19T14:56:55Z

I think those reasons are worth mentioning in the style guide. How about this?
"Your projects should use a build process so you can easily plug in a linter, transpiler, minification etc. Dealing with encoding issues in the source files (e.g. UTF-8 BOM to indicate Unicode) thus is a code smell for a lack of tooling."

Update: Changed the "ing"s to "er"s to match the search keywords.

ljharb · 2017-11-19T14:58:11Z

I guess that's fine; almost nobody runs into this, because almost everyone uses tools that assume UTF-8. Want to send a PR?

mk-pmb · 2017-11-19T15:00:02Z

ok, PR coming up later.

galvarez421 · 2019-09-19T16:09:39Z

For whatever it's worth, this rule does make it harder to work with Visual Studio default file saving behavior. See:

ljharb · 2019-09-19T16:48:35Z

Your SO link contains a link to a vscode extension that fixes the vscode bug.

galvarez421 · 2019-09-19T17:48:07Z

In my case, it has been easier to disable the rule so that other developers working on the projects using the config don't have to install an extension or otherwise configure things specifically to satisfy the rule (granted, there may be reasons to keep the rule enabled, but I haven't run into them). I mostly mentioned the Visual Studio case in response to your question to the OP ("Are you running into issues with this rule?") and in case it's considered worth consideration, given the popularity of Visual Studio.

I agree with @mk-pmb's suggestion that the documentation should explain why exactly BOM is disallowed. Given that the only options for the rule are "always" or "never", it's clear why the default is "never", given your explanation and the explanation in the rule page. However, I don't think it's clear why the Airbnb config enables the rule as an error. The rule page says that "UTF-8 does not require a BOM" but it's not clear why that should translate to the BOM being disallowed.
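
For anyone in the same situation, a project-level override might look like the sketch below. It assumes the project extends the config under the name `airbnb-base` from an `.eslintrc.js` file; adjust it to however your project loads the config.

```javascript
// .eslintrc.js (sketch): extend the shared config, then override one rule.
module.exports = {
  extends: ['airbnb-base'],
  rules: {
    // 'off' disables the check entirely;
    // ['error', 'always'] would require a BOM instead of forbidding it.
    'unicode-bom': 'off',
  },
};
```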

mk-pmb · 2019-09-19T19:13:52Z

See my PR #1643 for potential reasons.

ljharb added the question label Nov 18, 2017

mk-pmb mentioned this issue Nov 22, 2017

[guide] Non-code issues: keep them out of your code files. #1643

Open

j0pgrm mentioned this issue Aug 26, 2022

Utf 8 #2639

Draft
