-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Safe JSON #4196
Safe JSON #4196
Conversation
for insertion in HTML, to avoid XSS
) | ||
_swap_orjson = _swap_json + ( | ||
("\u2028", "\\u2028"), | ||
("\u2029", "\\u2029"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are apparently JavaScript line terminator characters. The JS parser will see this end-of-line, see that the data structure is incomplete, and throw an error. AFAICT this isn't in itself a security issue, but it's a bug - if you ever wanted these characters in your figure for real it wouldn't work (when inserted in HTML).
The standard library json
converts at least these unicode characters into escape sequences anyway, so we don't need to worry about them, but orjson
sends them as unicode characters.
if unsafe_char in out: | ||
out = out.replace(unsafe_char, safe_char) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing first whether the character is present in the string is substantially faster than .replace
when it finds even a single instance of the character - so given that much of the time you won't have some of these characters it's worthwhile including the if
first. And looping over each character in turn is a lot faster than any solution I could find that only searches the string once, ie a regexp.
@@ -19,3 +19,4 @@ matplotlib==2.2.3 | |||
scikit-image==0.18.1 | |||
psutil==5.7.0 | |||
kaleido | |||
orjson==3.8.12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added orjson
only to py3.7 and py3.9 optional tests, so that we'd run a bunch of the other tests using json
. Recent orjson
doesn't support py3.6 anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💃
Fixing a possible XSS issue when our JSON is inserted into an HTML string. We could apply this just to
fig.to_html
, but to be even safer I'm applying it to all JSON serialization, in case users use a different mechanism to insert this into HTML.Performance is a concern here, but in my testing except in really pathological situations these substitutions are a good deal faster than serialization even by
orjson
.plotly.graph_objects
, my modifications concern thecodegen
files and not generated files.modified existing tests.