Evaluate hex encoding for embedding files? #7211

buu700 · 2018-10-02T03:44:48Z

It was just pointed out to me in a Hacker News thread that hex might be more efficient than base64 after gzip/brotli compression for SINGLE_FILE's embedding.

His test showed that a hex-encoded wasm file was 65% of the size of a base64-encoded wasm file after compression. My own test with a random 4.7 MB png I had sitting around showed that it didn't make a difference: with brotli they were all about the same (hex having the edge over base64 by 215 B and over the compressed original file by ~2.5 KB) and gzip was similar (in this case hex was ~600 KB bigger). So it's unclear what the impact would be.
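A comparison like the one described above can be sketched as follows. This is only a rough sketch under stated assumptions: it uses Python's `zlib` (DEFLATE, the algorithm behind gzip) rather than brotli, and both inputs are synthetic stand-ins (`structured` and `incompressible` are made-up names, not real wasm or PNG data).

```python
import base64
import os
import random
import zlib


def compressed_sizes(payload: bytes) -> dict:
    """Return the zlib (DEFLATE, as used by gzip) size of each encoding."""
    encodings = {
        "raw": payload,
        "base64": base64.b64encode(payload),
        "hex": payload.hex().encode("ascii"),
    }
    return {name: len(zlib.compress(data, 9)) for name, data in encodings.items()}


# Assumption: synthetic stand-in for code-like data, not a real wasm file.
# 64 short byte sequences are repeated in random order, so the same
# sequence shows up at varying byte offsets.
rng = random.Random(0)
snippets = [bytes(rng.randrange(256) for _ in range(rng.randrange(3, 9)))
            for _ in range(64)]
structured = b"".join(rng.choice(snippets) for _ in range(20000))

# Stand-in for an already-compressed file such as a PNG: random bytes.
incompressible = os.urandom(len(structured))

structured_sizes = compressed_sizes(structured)
random_sizes = compressed_sizes(incompressible)
print("structured:", structured_sizes)
print("random:    ", random_sizes)
```

Swapping in the bytes of a real .wasm file would be the meaningful test; the synthetic inputs only show how to run the measurement.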

Just calling attention to this on the off chance that it would actually be an improvement, as I don't recall hex being considered when originally writing the SINGLE_FILE PR, and the previous discussion on encoding efficiencies (#3326 (comment)) has no mention of hex.


curiousdannii · 2018-10-03T03:50:33Z

Well, a PNG file isn't a good example as it's already highly compressed. As @jedisct1 said in #7213, the base64 encoding means that repeated opcode sequences are being obscured, but with a PNG image there shouldn't be any repetitions left.
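To illustrate the point about obscured repetitions, here is a small sketch. The 6-byte `pattern` is a hypothetical byte sequence chosen for illustration, not taken from real Emscripten output. Base64 maps 3 input bytes to 4 characters, so the text produced for the same bytes depends on their offset modulo 3, whereas hex maps each byte to 2 characters regardless of position.

```python
import base64

# Hypothetical 6-byte sequence standing in for a repeated opcode sequence.
pattern = bytes([0x20, 0x00, 0x41, 0x01, 0x6A, 0x21])

hex_needle = pattern.hex()
b64_needle = base64.b64encode(pattern).decode("ascii")  # 6 bytes -> 8 chars, no padding

# For each byte offset, record whether the pattern's own encoding is still
# visible as a substring of the encoded data.
found = {}
for offset in range(3):
    data = b"\xff" * offset + pattern + b"\xff" * 3
    hex_text = data.hex()
    b64_text = base64.b64encode(data).decode("ascii")
    found[offset] = (hex_needle in hex_text, b64_needle in b64_text)
    print(offset, hex_text, b64_text, found[offset])
```

With hex the pattern is found at every offset; with base64 it is only found when the pattern starts on a 3-byte boundary, which is what makes repeated sequences harder for an LZ77-style compressor to match.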

I hadn't thought of this before, but it makes sense. Patterns in the original are preserved, Huffman encoding means using only 16/256 symbols won't be a big problem, and decoding should be simple as you can pre-allocate a buffer half the string length. Use lowercase a-f to get the most benefit from Huffman encoding. Or be extra crazy and use "etnris" for optimal Huffman benefit. (This is from an old analysis we did of the most used characters in jQuery for UglifyJS. If you wanted to try such an approach, we could do a fresh analysis of Emscripten output.)
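The decoding step could look roughly like the sketch below. It is written in Python purely for illustration (the actual loader would be JavaScript), and the function names, the `alphabet` parameter, and the 16-letter `CUSTOM` alphabet are all hypothetical. In particular, `CUSTOM` is an arbitrary placeholder, not a measured frequency ranking.

```python
HEX_DIGITS = "0123456789abcdef"


def encode_nibbles(data: bytes, alphabet: str = HEX_DIGITS) -> str:
    """Encode each byte as two characters taken from a 16-symbol alphabet."""
    return "".join(alphabet[b >> 4] + alphabet[b & 0x0F] for b in data)


def decode_nibbles(text: str, alphabet: str = HEX_DIGITS) -> bytearray:
    """Decode into a buffer pre-allocated at half the string length."""
    if len(alphabet) != 16 or len(set(alphabet)) != 16:
        raise ValueError("alphabet must contain 16 distinct symbols")
    if len(text) % 2:
        raise ValueError("encoded text must have even length")
    value_of = {ch: i for i, ch in enumerate(alphabet)}
    out = bytearray(len(text) // 2)  # size is known up front
    for i in range(len(out)):
        out[i] = (value_of[text[2 * i]] << 4) | value_of[text[2 * i + 1]]
    return out


# Placeholder alphabet of 16 distinct letters (an assumption for illustration).
CUSTOM = "etnrisoaludcmphf"

payload = bytes(range(256))
assert decode_nibbles(encode_nibbles(payload)) == payload
assert decode_nibbles(encode_nibbles(payload, CUSTOM), CUSTOM) == payload
```

With the default alphabet, `encode_nibbles` produces the same text as `bytes.hex()`, so the decoder also accepts ordinary lowercase hex.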

stale · 2019-10-03T04:01:21Z

This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 7 days. Feel free to re-open at any time if this issue is still relevant.

jedisct1 · 2019-10-09T16:41:45Z

Booh, don't close me!

stale · 2020-10-09T12:44:10Z

This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant.

jedisct1 · 2020-10-09T13:49:15Z

🥕

stale · 2022-04-17T21:21:51Z

This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant.

jedisct1 · 2022-04-17T22:26:07Z

🐰

vadimkantorov · 2023-11-21T13:31:58Z

If this can easily reduce SINGLE_FILE sizes after gzip, it would be a nice thing to test!

sbc100 · 2023-11-21T16:12:06Z

Out of interest, why do folks here want to be using SINGLE_FILE in the first place? It breaks compiled module caching as well as streaming compilation, both significant downsides.

vadimkantorov · 2023-11-21T16:20:29Z

I am using this to distribute relatively small self-contained apps that a lay user can open in the browser simply by clicking on app.html, without requiring a web server. The file is already on the user's machine, so this avoids forcing the user to set up a web server and configure MIME types, bypasses the issues with file:///, and simplifies distribution at the same time.

Another use case is personal-use apps, accessible via some my.github.io/repo/app.html (possibly deployed for their private use by the users themselves). A real-world example is tiddlywiki, which is distributed as a single-file html. I'm also planning to support such a mode at some point in my busytex project (an extremely simple client-side latex editor for either local files or working with github repos).

Regarding caching/streaming compilation: can this limitation be bypassed by auto-storing the compiled wasm in the browser's local storage?

buu700 · 2023-11-21T16:32:41Z

I use it in pqcrypto.js and libsodium.js to provide a single JS file that works in any JS environment and opportunistically takes advantage of WebAssembly where available (with asm.js fallback).

This is helpful for some contexts where relying on an external module would be problematic, and simplifies subresource integrity verification.

sbc100 · 2023-11-21T16:43:34Z

Thanks for the feedback!

suzukieng · 2023-11-21T18:47:07Z

@sbc100 I use it to distribute a barcode scanning library (https://strich.io, it's commercial). Basically for similar reasons to @vadimkantorov: it's easier for apps to consume that way. They usually just bundle the .js file with their app, and don't need to think about deploying an extra .wasm file in their assets. The decoupling of the .js and .wasm files has also caused some headaches for me in the past due to caching (version mismatch between the .js and its companion .wasm).
But I will eventually have to move away from SINGLE_FILE once I support SIMD. Bundling both the SIMD and non-SIMD WASM builds into the .js and switching between the two based on the browser's SIMD support would waste too much space.

vadimkantorov · 2023-11-23T16:33:37Z

@sbc100 Here is a toy example of this kind of distribution: https://github.com/vadimkantorov/wasm-iconv

It might be nice if there were ways of pre-transforming the wasm bytes to make them more compressible by gzip (or maybe just making the file self-decompressing, to allow using something better than gzip).

sbc100 · 2023-11-27T18:19:41Z

@vadimkantorov wouldn't it make sense to use a separate wasm file for https://vadimkantorov.github.io/wasm-iconv/? The downsides of SINGLE_FILE are significant enough that I feel like we should be warning folks against it unless they really need it for a given use case.

jedisct1 · 2023-11-27T18:44:30Z

libsodium.js originally used a separate wasm file, but there were significant integration issues [1]. SINGLE_FILE was very helpful in solving them.

vadimkantorov · 2023-11-28T10:26:19Z

@sbc100 For this particular wasm-iconv case, I'd like to be able to open iconv.html from file:/// as a simple single-file webapp. Single-file also helps with versioning the wrapper and wasm together and with distribution: no need to unpack an archive or always remember to move the two files together. If there is a better-suited, widely working format for distributing local apps (not requiring internet / network), e.g. some modern analogue of https://en.wikipedia.org/wiki/HTML_Application or a web analogue of snap/appx, I'd be happy to use it.

Maybe this means that there is a need for a new such web standard.

sbc100 · 2023-11-28T19:03:48Z

> @sbc100 For this particular wasm-iconv case, I'd like to be able to open iconv.html from file:/// as a simple single-file webapp. Single-file also helps with versioning the wrapper and wasm together and with distribution: no need to unpack an archive or always remember to move the two files together. If there is a better-suited, widely working format for distributing local apps (not requiring internet / network), e.g. some modern analogue of https://en.wikipedia.org/wiki/HTML_Application or a web analogue of snap/appx, I'd be happy to use it.
>
> Maybe this means that there is a need for a new such web standard.

Indeed, that sounds like a case where we should look into addressing the reasons you are being pushed into using SINGLE_FILE. I'm not aware of the best way to package local web apps like that, but I imagine such a thing could exist.
Honestly, this is the first time I'm hearing this reason for choosing SINGLE_FILE. Most users seem happy to run a server and deal with multiple assets, but it would be good to find a solution for your requirements that doesn't depend on SINGLE_FILE.

buu700 mentioned this issue Oct 2, 2018

SINGLE_FILE could use hex instead of base64 #7213

Closed

stale bot added the wontfix label Oct 3, 2019

stale bot removed the wontfix label Oct 9, 2019

sbc100 added the help wanted label Oct 9, 2019

stale bot added the wontfix label Oct 9, 2020

stale bot removed the wontfix label Oct 9, 2020

stale bot added the wontfix label Apr 17, 2022

stale bot removed the wontfix label Apr 17, 2022

curiousdannii mentioned this issue Feb 21, 2024

Using gzip for -sSINGLE_FILE #21383

Closed
