Using gzip for -sSINGLE_FILE #21383

Closed
msqr1 opened this issue Feb 21, 2024 · 5 comments

@msqr1

msqr1 commented Feb 21, 2024

JavaScript's DecompressionStream is supported in all major browsers for decompressing gzip. We could compress the wasm before turning it into base64 (gzip works poorly after base64 encoding; I'm aware that servers often compress automatically). On the client we decode the base64 and then decompress it. I think this will make the file much smaller, though it may increase startup time.
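A minimal sketch of the client-side half of this idea, assuming the gzipped Wasm arrives as a base64 string (the function name is illustrative, not Emscripten's actual code):

```javascript
// Sketch of the proposed client-side decode step: base64 string ->
// compressed bytes -> gzip inflate via DecompressionStream, which is
// available in all major browsers and in Node 18+.
async function decodeGzippedBase64(b64) {
  // Base64 string -> Uint8Array of gzip-compressed bytes.
  const compressed = Uint8Array.from(atob(b64), c => c.charCodeAt(0));
  // Pipe the bytes through a gzip DecompressionStream.
  const inflated = new Blob([compressed]).stream()
    .pipeThrough(new DecompressionStream('gzip'));
  return new Uint8Array(await new Response(inflated).arrayBuffer());
}
```

The resulting `Uint8Array` could then be handed to `WebAssembly.instantiate` as usual.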

@sbc100
Collaborator

sbc100 commented Feb 21, 2024

Interesting, thanks for the suggestion.

I wasn't aware that gzip performed that much worse after base64 encoding. Do you have some numbers for comparison? Doing gzip twice (once before base64 and then again when it goes over the wire) seems like it would have diminishing returns, but let's see what the numbers say.

@curiousdannii
Contributor

curiousdannii commented Feb 21, 2024

Older related discussion: #7211

AFAIK, the main issue with base64 is that it obscures the natural patterns within WASM. Hex encoding instead of base64 would reveal those patterns, at the cost of making the unzipped file 50% larger (100% vs 33% overhead). Compressing before base64 would make the unzipped file smaller. HTTP compression should then only do Huffman encoding for that part of the file, as there wouldn't be any repeated segments.
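The overhead figures above can be seen concretely on a 3-byte example (plain JS, nothing Emscripten-specific):

```javascript
// base64 emits 4 chars per 3 bytes (~33% overhead);
// hex emits 2 chars per byte (100% overhead).
const bytes = new Uint8Array([0xde, 0xad, 0xbe]);
const b64 = btoa(String.fromCharCode(...bytes));                            // "3q2+"
const hex = [...bytes].map(b => b.toString(16).padStart(2, '0')).join('');  // "deadbe"
console.log(b64.length / bytes.length);  // ~1.33 (base64)
console.log(hex.length / bytes.length);  // 2 (hex)
```

Hex's larger output is the price for preserving byte boundaries: each byte maps to a fixed 2-char pattern, so repeats in the Wasm stay visible to the HTTP compressor, whereas base64 smears each 3-byte group across 4 characters.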

@msqr1
Author

msqr1 commented Feb 22, 2024

@sbc100 See https://stackoverflow.com/questions/38124361/why-does-base64-encoded-data-compress-so-poorly. I think the compression can be implemented via CompressionStream in tools/link.py, and the decompression in src/base64Utils.js.
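The build-time half of the round trip might look like the following. The comment points at tools/link.py (Python), so this is only a sketch of the same gzip-then-base64 step, written with the Web CompressionStream API for symmetry with the decode side:

```javascript
// Sketch of the build-time step: gzip the Wasm bytes, then base64-encode
// the compressed result for embedding into the -sSINGLE_FILE output.
// (In Emscripten itself this would live in tools/link.py.)
async function gzipToBase64(bytes) {
  const deflated = new Blob([bytes]).stream()
    .pipeThrough(new CompressionStream('gzip'));
  const compressed = new Uint8Array(await new Response(deflated).arrayBuffer());
  let s = '';
  for (const b of compressed) s += String.fromCharCode(b); // bytes -> binary string
  return btoa(s);
}
```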

@msqr1
Author

msqr1 commented Feb 23, 2024

Well, I'm working on it. This is going to change things a lot, because DecompressionStream uses ReadableStream, which is async. So tryParseAsDataURI has to be async, and every function that calls tryParseAsDataURI becomes async as well. This propagates all the way up to the top-level module code, where await is OK, unless -sWASM_ASYNC_COMPILATION=0 and -sMODULARIZE=1. Maybe add another option like -sGZIP_WASM and make it incompatible with that combination?
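A sketch of the propagation being described, with illustrative names and an assumed data-URI prefix (this is not Emscripten's actual code): once the lowest-level decode helper returns a Promise, every caller up the chain must itself become async.

```javascript
// The lowest-level helper is forced to be async, because reading a
// DecompressionStream's output is inherently asynchronous.
async function tryParseAsDataURI(uri) {
  const prefix = 'data:application/octet-stream;base64,'; // assumed prefix
  if (!uri.startsWith(prefix)) return undefined;
  const compressed = Uint8Array.from(atob(uri.slice(prefix.length)),
                                     c => c.charCodeAt(0));
  const inflated = new Blob([compressed]).stream()
    .pipeThrough(new DecompressionStream('gzip'));
  return new Uint8Array(await new Response(inflated).arrayBuffer());
}

// ...so its callers become async too, all the way up to where the module
// is instantiated (fine under -sWASM_ASYNC_COMPILATION, the default).
async function getWasmBinary(file) {
  return await tryParseAsDataURI(file); // was previously a synchronous call
}
```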

@msqr1
Author

msqr1 commented Feb 26, 2024

Added the feature; the binary is much smaller now.

msqr1 closed this as not planned on Apr 26, 2024