re: missing whitespace in root node #34
Comments
You can still comment on closed issues. This isn’t about valid or invalid. We use an AST (abstract). Not a CST (concrete). We don’t generate the exact input document. We generate an equivalent document.
sorry .__.
rendering html != transforming html
the parse tree is 99% concrete already. the resulting parser may be less "strict" and more "loose" than a perfectly spec-compliant parser ...
That’s not possible through the GH UI, unfortunately!
It’s extremely complex. A concrete tree would house all original sources of character references. It would house whitespace inside tags. It would house the casing of attributes. It would include information on whether double or single quotes are used. It would include information in doctypes and processing instructions. It would contain casing of tag names. It would be months of work to create.
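To make the list above concrete, here is a minimal sketch of how an abstract and a concrete attribute node might differ. All type and function names (`AbstractAttribute`, `ConcreteAttribute`, `serializeAbstract`, `serializeConcrete`) are hypothetical and exist only for illustration; they are not part of hast or any unified package, and the sketch ignores escaping and character references.

```typescript
// Hypothetical shapes for illustration only, not the hast or unified API.

// An abstract node keeps the meaning: a normalized name and a value.
interface AbstractAttribute {
  name: string;
  value: string;
}

// A concrete node would also have to keep how the source was written.
interface ConcreteAttribute extends AbstractAttribute {
  rawName: string; // original casing, e.g. "CLASS"
  quote: '"' | "'" | null; // original quote style, null if unquoted
  leading: string; // whitespace that preceded the attribute inside the tag
}

// Serializing from the abstract form yields an equivalent attribute
// in one canonical style (assumed here: one space, double quotes).
function serializeAbstract(attribute: AbstractAttribute): string {
  return ` ${attribute.name}="${attribute.value}"`;
}

// Serializing from the concrete form can reproduce the input exactly.
function serializeConcrete(attribute: ConcreteAttribute): string {
  const quote = attribute.quote ?? "";
  return `${attribute.leading}${attribute.rawName}=${quote}${attribute.value}${quote}`;
}

// Source text was:   CLASS='a'   (two leading spaces, upper case, single quotes)
const attribute: ConcreteAttribute = {
  name: "class",
  value: "a",
  rawName: "CLASS",
  quote: "'",
  leading: "  ",
};

const equivalent = serializeAbstract(attribute);
const exact = serializeConcrete(attribute);
```

Here `equivalent` is the normalized form and `exact` matches the original source text. Every extra field is something a parser would have to record and every utility would have to understand, which is where the cost described above comes from.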
It’s a ton of work, probably a full year to implement a new parser and design the data structure. If you are interested in paying me to make it reality I would likely quote you $200k.
This project adheres to the HTML spec.
In this particular case of only the whitespace? It would indeed probably not be terribly complex to implement, but we have had similar conversations about this:
thanks.
yes. i would use a concrete parser like
with a different transformer (different node names, different data structure) edit: for my use case "html to html transformer"
Initial checklist
Affected packages and versions
#33
Link to runnable example
No response
Steps to reproduce
#33
so my input html is invalid?
is valid html per https://validator.w3.org/nu/#textarea
the unified toolchain mangles this to
i dont know where exactly the bug is.
maybe in hast-util-to-html, maybe somewhere else
its certainly useful for writing lossless transformers, which produce minimal diffs
maybe this could cause trouble in some tree consumers like hast-util-select
which expect "no whitespace in the root node"?
no, i want to preserve the original whitespace
Expected behavior
#33
Actual behavior
#33
Affected runtime and version
#33
Affected package manager and version
No response
Affected OS and version
No response
Build and bundle tools
No response