Evaluate hex encoding for embedding files? #7211
Comments
Well, a PNG file isn't a good example as it's already highly compressed. As @jedisct1 said in #7213, the base64 encoding means that repeated opcode sequences are being obscured, but with a PNG image there shouldn't be any repetitions left. I hadn't thought of this before, but hex makes sense. Patterns in the original are preserved, Huffman encoding means using only 16/256 symbols won't be a big problem, and decoding should be simple as you can pre-allocate a buffer half the string length. Use lowercase a-f to get the most benefit from Huffman encoding. Or be extra crazy and use "etnris" for optimal Huffman benefit. (This is from an old analysis we did of the most used characters in jQuery for UglifyJS. If you wanted to try such an approach, we could do a fresh analysis of Emscripten output.) |
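The decoding step described above can be sketched as follows. This is only an illustration in Python, not Emscripten's loader code; `decode_hex`, `HEX_ALPHABET`, and `NIBBLE` are hypothetical names introduced here. It assumes lowercase hex input and relies on the output being exactly half the string length, so the buffer is allocated once up front.

```python
# Hypothetical 16-symbol alphabet; lowercase hex as suggested in the comment.
HEX_ALPHABET = "0123456789abcdef"

# Lookup table from symbol to its 4-bit value.
NIBBLE = {ch: value for value, ch in enumerate(HEX_ALPHABET)}


def decode_hex(text: str) -> bytes:
    """Decode a string of 4-bit symbols into bytes (two symbols per byte)."""
    if len(text) % 2 != 0:
        raise ValueError("encoded string must have an even length")
    # The output size is known in advance: half the string length.
    out = bytearray(len(text) // 2)
    for i in range(len(out)):
        high = NIBBLE[text[2 * i]]      # KeyError on a symbol outside the alphabet
        low = NIBBLE[text[2 * i + 1]]
        out[i] = (high << 4) | low
    return bytes(out)


if __name__ == "__main__":
    # "0061736d" is the hex form of the 4-byte wasm magic number.
    print(decode_hex("0061736d"))
```

In this sketch, trying a different 16-character alphabet would only mean changing `HEX_ALPHABET` (with 16 distinct characters); the decoding loop stays the same.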
This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 7 days. Feel free to re-open at any time if this issue is still relevant. |
Booh, don't close me! |
This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant. |
🥕 |
This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant. |
🐰 |
If this can easily reduce SINGLE_FILE sizes after gzip, it would be a nice thing to test! |
Out of interest, why do folks here want to use SINGLE_FILE in the first place? It breaks compiled module caching as well as streaming compilation, both significant downsides. |
I am using this to distribute relatively small self-contained apps that can be opened in a browser without requiring a web server, and which can simply be opened by a lay user by clicking on Another use case is personal-use apps, accessible via some my.github.io/repo/app.html (possibly deployed for their private use by the user themselves). A real-world example is Regarding caching/streaming compilation: can this limitation be bypassed by auto-storing a compiled wasm into the browser's local storage? |
I use it in pqcrypto.js and libsodium.js to provide a single JS file that works in any JS environment and opportunistically takes advantage of WebAssembly where available (with asm.js fallback). This is helpful for some contexts where relying on an external module would be problematic, and simplifies subresource integrity verification. |
Thanks for the feedback! |
@sbc100 I use it to distribute a barcode scanning library (https://strich.io, it's commercial). Basically for reasons similar to @vadimkantorov's, it's easier for apps to consume that way: they usually just bundle the .js file with their app, and don't need to think about deploying an extra .wasm file in their assets. The decoupling of the .js and .wasm file has also caused some headaches for me in the past due to caching (version mismatch between .js and companion .wasm). |
@sbc100 Here is a toy example of such distribution: https://github.com/vadimkantorov/wasm-iconv It might be nice if there were ways of pre-transforming the wasm bytes to make them more compressible by gzip (or maybe just making the file self-decompressing, to allow using something better than gzip) |
@vadimkantorov wouldn't it make sense to use a separate wasm file for https://vadimkantorov.github.io/wasm-iconv/? The downsides of SINGLE_FILE are significant enough that I feel like we should be warning folks against it unless they really need it for a given use case. |
@sbc100 For this particular wasm-iconv case, I'd like to be able to open iconv.html from file:/// as a simple single-file webapp (and single-file also helps with versioning and distributing the wrapper+wasm together: no need to unpack an archive or always remember to move the two files together). If there is some better-suited, widely supported format for distributing local apps (not requiring internet / network), e.g. some modern analogue of https://en.wikipedia.org/wiki/HTML_Application or a web analogue of snap/appx, I'd be happy to use it. Maybe this means that there is a need for a new such web standard. |
Indeed, that sounds like a case where we should look into addressing the reasons you are being pushed into using SINGLE_FILE. I'm not aware of the best way to package local web apps like that, but I imagine such a thing could exist. |
It was just pointed out to me in a Hacker News thread that hex might be more efficient than base64 after gzip/brotli compression for SINGLE_FILE's embedding. His test showed that a hex-encoded wasm file was 65% the size of a base64-encoded wasm file after compression. My own test with a random 4.7 MB PNG I had sitting around showed that it didn't make a difference: with brotli they were all about the same (hex having the edge over base64 by 215 B and over the compressed original file by ~2.5 KB) and gzip was similar (in this case hex was ~600 KB bigger), so it's unclear what the impact would be.
Just calling attention to this on the off chance that it would actually be an improvement, as I don't recall hex being considered when originally writing the SINGLE_FILE PR, and the previous discussion on encoding efficiencies (#3326 (comment)) has no mention of hex.
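For anyone who wants to repeat this kind of comparison, here is a rough sketch in Python. It is not the test from the Hacker News thread or the one described above: `make_payload` and `encoded_sizes` are hypothetical helpers, the payload is synthetic (a random sequence of short repeated byte patterns, loosely standing in for repeated opcode sequences), and `zlib` (DEFLATE, the algorithm gzip uses) is assumed as the compressor. Results on a real wasm file may differ, so a real file should be substituted for the synthetic payload before drawing conclusions.

```python
import base64
import random
import zlib


def make_payload(n_chunks: int = 4000, seed: int = 0) -> bytes:
    """Build synthetic bytes out of a small pool of repeated short patterns."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    pool = [
        bytes(rng.randrange(256) for _ in range(rng.choice((4, 5, 7))))
        for _ in range(64)
    ]
    return b"".join(rng.choice(pool) for _ in range(n_chunks))


def encoded_sizes(payload: bytes, level: int = 9) -> dict:
    """Return {encoding: (encoded_size, deflate_compressed_size)} in bytes."""
    encodings = {
        "raw": payload,
        "hex": payload.hex().encode("ascii"),
        "base64": base64.b64encode(payload),
    }
    return {
        name: (len(data), len(zlib.compress(data, level)))
        for name, data in encodings.items()
    }


if __name__ == "__main__":
    for name, (plain, packed) in encoded_sizes(make_payload()).items():
        print(f"{name:>6}: {plain:>7} bytes encoded, {packed:>7} bytes after DEFLATE")
```

To test an actual build, replace `make_payload()` with the bytes of the wasm file, for example `open(path, "rb").read()` with a path of your choosing.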