text plugin incorrectly deserializes scalar strings when running python 3 #647
Comments
This is working as intended. The docs for If you don't want to be the case, you can consider formatting these strings as code, by using either
This yields the following result: I'll cross-post to SO. |
(Hey, Nick! Fancy running into you here...) I'm not sure this is actually WAI. I can't repro this behavior either with a fresh install of tensorflow 1.3.0 (bundled with tensorboard 0.1.8) from PyPI or building tensorboard independently at HEAD and running it against a sample events file from tensorflow nightly. In both cases, all three text samples produce the same expected rendering with underscores shown: @nrhine1 - could you check what version of the The python markdown package (like many others including github markdown) now defaults to not italicizing mid-word because it interferes so often with filenames using underscores. This behavior is behind an option (smart_emphasis) but it defaults to True, at least currently. And when it does italicize, backslash escaping the italics should work. See e.g. using
So I'm at a bit of a loss why you're getting the "non-smart" behavior. |
Hi @nfelt (indeed funny running into you here :) I was running with 2.6.9:
I installed 2.6.8 as suggested, and also printed the markdown version inside the tensorboard python executable to ensure it was 2.6.8 as expected. The resulting parsed Markdown in tensorboard still treats the underscores as emphasis. However, using backticks to escape the string as previously suggested does result in the expected rendering. I've run I've also tried explicitly passing |
Hmm, strange. Are you able to reproduce it in a fresh virtual environment with only tensorflow/tensorboard and deps installed? I'm also wondering a little if the presence of the DLMC tensorboard package could be interfering somehow. |
I am able to reproduce the error in a fresh virtualenv. I created a new python 3.4.3 virtualenv, activated it, ran I doubt it's relevant (and hope it's not), but I'm running the script and tensorboard remotely on a server, and visualizing the result in my local browser. New result of
|
Okay, thanks for the additional information - I could also repro this in a python 3 environment and it turns out to be a python 2-3 compat issue stemming from some TensorBoard code that works only in python 2. I'll revise the issue title accordingly and this should be a pretty simple fix. Essentially, the issue is the following line of code: The use of
In Python 2, `tostring()` returns the raw bytes, which works because numpy's storage format matches Python's. In Python 3, the raw bytes don't match Python's string representation: every character is followed by 3 null bytes due to the UTF-32 encoding. This in turn causes the markdown library to see non-word characters surrounding every letter in the string, which means that intra-word underscores like |
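A minimal sketch of the effect described above, using only the standard library. It assumes that encoding a string as little-endian UTF-32 is a fair stand-in for the raw numpy buffer mentioned in the comment; the sample string and the regex check are illustrative and are not taken from TensorBoard or the markdown package.

```python
import re

text = "hello_world_foo"

# Assumption: little-endian UTF-32 stands in for the raw bytes that
# numpy exposes for a Python 3 string (4 bytes per character).
raw = text.encode("utf-32-le")

# Treating those raw bytes as UTF-8 keeps every padding byte as a
# literal NUL character in the resulting string.
mangled = raw.decode("utf-8")

# Each ASCII character is now followed by three NUL characters.
nul_count = mangled.count("\x00")

# In the original string the underscore sits between word characters,
# so a "mid-word underscore" check succeeds.
intra_word_before = re.search(r"\w_\w", text) is not None

# In the mangled string the underscore is surrounded by NULs, which are
# not word characters, so the same check fails.
intra_word_after = re.search(r"\w_\w", mangled) is not None

print(nul_count, intra_word_before, intra_word_after)
```

With the underscores no longer seen as mid-word, a markdown processor that only suppresses emphasis inside words would be free to treat them as emphasis markers, which matches the symptom reported in this issue.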
This fixes #647 in which TextPlugin incorrectly decodes scalar string values as their raw numpy byte representation before passing them to `markdown_to_safe_html()`. When running python 2 this happens to work because numpy and Python use the same raw byte representation for string data, but in python 3 this is not the case, and the result is a string with spurious null bytes, leading the markdown processor to interpret certain patterns incorrectly. See the issue for details. This both fixes the TextPlugin decoding logic itself, and also fixes `markdown_to_safe_html()` so that it will discard null bytes and log a warning about them before doing markdown interpretation. Tested under python 2 and 3 via `bazel test --python_path=...`.
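The two parts of the fix described in this commit message can be sketched roughly as follows. This is not the actual TensorBoard code: `decode_scalar` and `strip_null_bytes` are hypothetical helper names, and the sketch assumes that scalar string values arrive as UTF-8 encoded bytes.

```python
import logging

logger = logging.getLogger(__name__)


def decode_scalar(value):
    """Hypothetical helper: decode a scalar string value explicitly.

    Assumes the value is UTF-8 encoded bytes, instead of relying on the
    raw in-memory representation of an array.
    """
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return value


def strip_null_bytes(text):
    """Hypothetical helper: drop NUL characters before markdown rendering.

    Logs a warning when any are found, since their presence suggests the
    input was decoded incorrectly upstream.
    """
    null_count = text.count("\x00")
    if null_count:
        logger.warning(
            "Found %d null bytes in markdown input; removing them", null_count
        )
        text = text.replace("\x00", "")
    return text


# Usage: a correctly decoded value passes through unchanged, while a
# value polluted with NUL padding is cleaned up.
clean = strip_null_bytes(decode_scalar(b"my_file_name"))
repaired = strip_null_bytes("a\x00\x00\x00_\x00\x00\x00b\x00\x00\x00")
```

Stripping null bytes is a defensive second layer; the explicit decode is what addresses the root cause described in the issue.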
Strings with underscores in them aren't being rendered correctly by Tensorboard.
See the SO question here
Package versions: Tensorflow 1.3.0, TensorBoard 0.1.8