| 1 | +# Implementation Notes for Internationalization of Jupyter Notebook |
| 2 | + |
| 3 | +This is a prototype implementation of i18n features for Jupyter notebook, and should not |
| 4 | +yet be considered ready for production use. For the first cut I have focused on the public |
| 5 | +user interfaces in the notebook and left most of the console messages untranslated, as their |
| 6 | +usefulness in a translated environment is questionable at best. |
| 7 | + |
| 8 | +### Using a prototype translated version |
| 9 | + |
| 10 | +In order to use this preliminary version, you need to do a few things after installing the |
| 11 | +notebook as normal: |
| 12 | + |
| 13 | +1. Set the LANG environment variable in your shell to "xx_XX" or just "xx", |
| 14 | +where "xx" is the code of the language you want to run in. If you're |
| 15 | +running on Windows, I've found the easiest way to do this is to use Windows PowerShell, |
| 16 | +and run the command: |
| 17 | + |
| 18 | +`${Env:LANG} = "xx_XX"` |
| 19 | + |
| 20 | +2. Set the preferred language for web pages in your browser to YourLanguage (xx). At the moment, |
| 21 | +it has to be first in the list. |
| 22 | + |
| 23 | +3. Run the `jupyter notebook` command to start the notebook. |
| 24 | + |
| 25 | +### Message extraction: |
| 26 | + |
| 27 | +I have split out the translatable material for the notebook into 3 POT files, as follows: |
| 28 | + |
| 29 | +notebook/i18n/notebook.pot - Console and startup messages, basically anything that is |
| 30 | + produced by Python code. |
| 31 | + |
| 32 | +notebook/i18n/nbui.pot - User interface strings, as extracted from the Jinja2 templates |
| 33 | + in notebook/templates/*.html |
| 34 | + |
| 35 | +notebook/i18n/nbjs.pot - JavaScript strings and dialogs, which contain much of the visible |
| 36 | + user interface for Jupyter notebook. |
| 37 | + |
| 38 | +Whenever new material is added, use the `pybabel` command to extract the messages from the |
| 39 | +source code as follows |
| 40 | +(assuming you are in the base directory for Jupyter notebook): |
| 41 | + |
| 42 | +`pybabel extract -F notebook/i18n/babel_notebook.cfg -o notebook/i18n/notebook.pot --no-wrap --project Jupyter .` |
| 43 | +`pybabel extract -F notebook/i18n/babel_nbui.cfg -o notebook/i18n/nbui.pot --no-wrap --project Jupyter .` |
| 44 | +`pybabel extract -F notebook/i18n/babel_nbjs.cfg -o notebook/i18n/nbjs.pot --no-wrap --project Jupyter .` |
| 45 | + |
| 46 | +(Note: the '.' at the end of these commands is required; it is the directory that pybabel scans for source files.) |
| 47 | + |
| 48 | +After this is complete you have 3 POT files that you can give to a translator for your favorite language. |
| 49 | +Babel's documentation has instructions on how to integrate this into your setup.py so that eventually |
| 50 | +we can just do: |
| 51 | + |
| 52 | +`python setup.py extract_messages` |
| 53 | + |
| 54 | +I hope to get this working at some point in the near future. |
| 55 | + |
| 56 | +### Post translation procedures |
| 57 | + |
| 58 | +After the source material has been translated, you should have 3 PO files with the same base names |
| 59 | +as the POT files above. Put them in `notebook/i18n/${LANG}/LC_MESSAGES`, where ${LANG} is the language |
| 60 | +code for your desired language (e.g. German = "de", Japanese = "ja"). The first 2 files |
| 61 | +(notebook.po and nbui.po) then need to be converted from PO to MO format for use at runtime. There are |
| 62 | +many ways to do this, but pybabel has an option to do it as follows: |
| 63 | + |
| 64 | +`pybabel compile -D notebook -f -l ${LANG} -i notebook/i18n/${LANG}/LC_MESSAGES/notebook.po -o notebook/i18n/${LANG}/LC_MESSAGES/notebook.mo` |
| 65 | + |
| 66 | +`pybabel compile -D nbui -f -l ${LANG} -i notebook/i18n/${LANG}/LC_MESSAGES/nbui.po -o notebook/i18n/${LANG}/LC_MESSAGES/nbui.mo` |
| 67 | + |
| 68 | +The nbjs.po needs to be converted to JSON for use within the JavaScript code. I'm using po2json for this, as follows: |
| 69 | + |
| 70 | +`po2json -p -F -f jed1.x -d nbjs notebook/i18n/${LANG}/LC_MESSAGES/nbjs.po notebook/i18n/${LANG}/LC_MESSAGES/nbjs.json` |
| 71 | + |
| 72 | +The conversions from PO to MO probably can and should be done during setup.py. |
| 73 | + |
| 74 | +When new languages get added, their language codes should be added to notebook/i18n/nbjs.json |
| 75 | +under the "supported_languages" element. |
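As a rough illustration of how a compiled MO file is consumed at runtime, here is a minimal sketch using Python's standard `gettext` module. The helper name `load_catalog` and the sample message are hypothetical and are not taken from the notebook source. The only firm fact relied on is that `gettext` looks for catalogs at `<localedir>/<language>/LC_MESSAGES/<domain>.mo`.

```python
import gettext
import os


def load_catalog(domain, language, localedir=os.path.join("notebook", "i18n")):
    """Return a translations object for `domain` in `language`.

    gettext searches <localedir>/<language>/LC_MESSAGES/<domain>.mo.
    With fallback=True a missing catalog yields NullTranslations, so
    untranslated strings come back unchanged instead of raising an error.
    (`load_catalog` is a hypothetical helper, not part of the notebook.)
    """
    return gettext.translation(
        domain, localedir=localedir, languages=[language], fallback=True
    )


catalog = load_catalog("notebook", "de")
_ = catalog.gettext
# Hypothetical message; translated only if notebook.mo for "de" is present.
print(_("The notebook server is starting"))
```

Because of the fallback, the same code runs whether or not a translation for the requested language has been compiled yet.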
| 76 | + |
| 77 | +### Tips for Jupyter developers |
| 78 | + |
| 79 | +The biggest "mistake" I found while doing i18n enablement was the habit of constructing UI messages |
| 80 | +from English "piece parts". For example, code like: |
| 81 | + |
| 82 | + |
| 83 | +`var msg = "Enter a new " + type + " name:"` |
| 84 | + |
| 85 | +where "type" is either "file", "directory", or "notebook".... |
| 86 | + |
| 87 | +is problematic when doing translations, because the surrounding text may need to vary |
| 88 | +depending on the inserted word. In this case, you need to switch on the type and use complete phrases, |
| 89 | +as follows: |
| 90 | + |
| 91 | +```javascript |
| 92 | +var rename_msg = function (type) { |
| 93 | + switch(type) { |
| 94 | + case 'file': return _("Enter a new file name:"); |
| 95 | + case 'directory': return _("Enter a new directory name:"); |
| 96 | + case 'notebook': return _("Enter a new notebook name:"); |
| 97 | + default: return _("Enter a new name:"); |
| 98 | + } |
| 99 | +} |
| 100 | +``` |
| 101 | + |
| 102 | +Also you need to remember that adding an "s" or "es" to an English word to |
| 103 | +create the plural form doesn't translate well. Some languages have as many as 5 or 6 different |
| 104 | +plural forms for differing numbers, so using an API such as ngettext() is necessary in order |
| 105 | +to handle these cases properly. |
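To make the plural-forms point concrete, here is a small sketch using `ngettext()` from Python's standard `gettext` module. The function `files_selected` and its message text are hypothetical examples, not strings from the notebook. With no catalog loaded, `NullTranslations` applies the English rule (singular only when the count is 1); a real catalog would apply the plural rule declared in its PO header.

```python
import gettext


def files_selected(count, translations=None):
    """Return a correctly pluralized message for `count` files.

    `translations` is any gettext translations object; when omitted,
    NullTranslations is used and the English forms are returned.
    """
    t = translations or gettext.NullTranslations()
    # Both complete phrases go to the translator; ngettext picks the form.
    template = t.ngettext(
        "%(num)d file selected", "%(num)d files selected", count
    )
    return template % {"num": count}


for n in (0, 1, 2):
    print(files_selected(n))
```

Named placeholders such as `%(num)d` also let translators move the number within the sentence, which some languages require.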
| 106 | + |
| 107 | +### Known issues |
| 108 | + |
| 109 | +1. Right now there are two different places where the desired language is set. The Jupyter console's messages follow the ${LANG} environment variable |
| 110 | +as set in the shell at startup time. Unfortunately, this is also when the Jinja2 |
| 111 | +environment is set up, which means that the template strings always come from this setting. |
| 112 | +We really want to follow the browser's settings for everything that happens in the |
| 113 | +browser, so we need to be able to retrieve this information after the browser is started and somehow |
| 114 | +communicate it back to Jinja2. I haven't yet figured out how to do this, which means that if ${LANG} at startup doesn't match the browser's settings, you could get a mix |
| 115 | +of languages in the UI (never a good thing). |
| 116 | + |
| 117 | +2. We will need to decide if console messages should be translatable, and enable them if desired. |
| 118 | +3. The keyboard shortcut editor was implemented after the i18n work was completed, so that portion |
| 119 | +does not have translation support at this time. |
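For known issue 1, one possible direction (not something this prototype does) is to read the browser's preference from the standard HTTP `Accept-Language` request header and choose the catalog per request. The sketch below shows only the header parsing. The function name `preferred_language`, its parameters, and the idea of passing in the "supported_languages" list are assumptions made for illustration.

```python
def preferred_language(accept_language, supported, default="en"):
    """Pick the best supported language from an Accept-Language header.

    Hypothetical helper: `supported` is a list of language codes such as
    ["en", "de", "ja"]; `default` is returned when nothing matches.
    """
    candidates = []
    for position, item in enumerate(accept_language.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # a missing q parameter means q=1
        for param in parts[1:]:
            name, _sep, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        # Sort by descending quality, then by order of appearance.
        candidates.append((-quality, position, tag))

    by_tag = {code.lower().replace("_", "-"): code for code in supported}
    for neg_quality, _position, tag in sorted(candidates):
        if neg_quality == 0:
            continue  # q=0 marks a language as not acceptable
        if tag in by_tag:
            return by_tag[tag]
        base = tag.split("-")[0]  # fall back from "de-de" to "de"
        if base in by_tag:
            return by_tag[base]
    return default


print(preferred_language("de-DE,de;q=0.9,en;q=0.8", ["en", "de", "ja"]))  # prints de
```

How the chosen code would then reach Jinja2 is still the open question described in issue 1; this sketch covers only the selection step.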
| 120 | + |
| 121 | +If you have any questions or comments, please let me know: @JCEmmons on GitHub ( [email protected]) |
| 122 | + |