|
1 | 1 | Tools for specific data sources
|
2 | 2 | *******************************
|
3 | 3 |
|
4 |
| -IEA World Energy Balances |
5 |
| -========================= |
| 4 | +.. _tools-iea: |
6 | 5 |
|
7 |
| -.. currentmodule:: message_ix_models.tools.iea_web |
| 6 | +International Energy Agency (IEA) (:mod:`.tools.iea`) |
| 7 | +===================================================== |
8 | 8 |
|
9 |
| -.. automodule:: message_ix_models.tools.iea_web |
10 |
| - :members: |
| 9 | +The IEA publishes many kinds of data. |
| 10 | +Each distinct data source is handled by a separate submodule of :mod:`message_ix_models.tools.iea`. |
11 | 11 |
|
12 |
| - The raw data are in CSV or compressed CSV format. |
13 |
| - They have file names like: |
| 12 | +Documentation for all module contents: |
14 | 13 |
|
15 |
| - - :file:`cac5fa90-en.zip` —the complete, extended energy balances, ZIP compressed, containing a single file with a name like :file:`WBIG_2021-2021-1-EN-20211119T100005.csv`. |
| 14 | +.. currentmodule:: message_ix_models.tools |
16 | 15 |
|
17 |
| - - :file:`WBAL_12052022124930839.csv` —a subset or ‘highlights’ |
| 16 | +.. autosummary:: |
| 17 | + :toctree: _autosummary |
| 18 | + :template: autosummary-module.rst |
| 19 | + :recursive: |
18 | 20 |
|
19 |
| - The data have the following structure: |
| 21 | + iea |
20 | 22 |
|
21 |
| - =========== ====================== |
22 |
| - Column name Example value |
23 |
| - =========== ====================== |
24 |
| - UNIT [1]_ KTOE |
25 |
| - Unit ktoe |
26 |
| - COUNTRY WLD |
27 |
| - Country World |
28 |
| - PRODUCT COAL |
29 |
| - Product Coal and coal products |
30 |
| - FLOW INDPROD |
31 |
| - Flow Production |
32 |
| - TIME 2012 |
33 |
| - Time 2012 |
34 |
| - Value 1234.5678 |
35 |
| - Flag Codes M |
36 |
| - Flags Missing value; data cannot exist |
37 |
| - =========== ====================== |
| 23 | +.. _tools-iea-web: |
38 | 24 |
|
39 |
| - .. [1] the column is sometimes labelled "MEASURE", but the contents appear to be the same. |
| 25 | +(Extended) World Energy Balances (:mod:`.tools.iea.web`) |
| 26 | +-------------------------------------------------------- |
40 | 27 |
|
41 |
| -Code lists |
42 |
| ----------- |
43 |
| -The following files, in :file:`message_ix_models/data/iea/`, contain code lists extracted from the paired columns of the raw data. |
44 |
| -The (longer, human-readable) names are not returned by :func:`.load_data`; only the (shorter) code IDs. |
| 28 | +.. contents:: |
| 29 | + :local: |
| 30 | + :backlinks: none |
45 | 31 |
|
46 |
| -These can be used with other package utilities: |
| 32 | +.. note:: These data are **proprietary** and require a paid subscription. |
47 | 33 |
|
48 |
| -.. code-block:: python |
| 34 | +The approach to handling proprietary data is the same as in :mod:`.project.advance` and :mod:`.project.ssp`: |
| 35 | + |
| 36 | +- Copies of the data are stored in the (private) :mod:`message_data` repository using Git LFS. |
| 37 | + This repository is accessible only to users who have a license for the data.
| 38 | +- :mod:`message_ix_models` contains only a ‘fuzzed’ version of the data (same structure, random values) for testing purposes. |
| 39 | +- Non-IIASA users must obtain their own license to access and use the data; obtain the data themselves; and place it on the system where they use :mod:`message_ix_models`. |
| 40 | + |
| 41 | +The module :mod:`message_ix_models.tools.iea.web` attempts to detect and support both of the providers/formats described below.
| 42 | +The code supports using data from any of these locations and formats, in multiple ways:
| 43 | + |
| 44 | +- Use :func:`.tools.iea.web.load_data` to load data as :class:`pandas.DataFrame` and apply further pandas processing. |
| 45 | +- Use :class:`.IEA_EWEB` via :func:`.tools.exo_data.prepare_computer` to use the data in :mod:`genno` structured calculations. |
| 46 | + |
| 47 | +The **documentation** for the `2023 edition <https://iea.blob.core.windows.net/assets/0acb1453-1221-421b-9131-632ce71a4c1a/WORLDBAL_Documentation.pdf>`__ of the IEA source/format is publicly available. |
49 | 48 |
|
50 |
| - from message_ix_models.util import as_codes, load_package_data |
| 49 | +Structure |
| 50 | +~~~~~~~~~ |
51 | 51 |
|
52 |
| - # a list of sdmx.model.Code objects |
53 |
| - cl = as_codes(load_package_data("iea", "product.yaml")) |
| 52 | +The data have the following conceptual dimensions, each enumerated by a different list of codes: |
54 | 53 |
|
55 |
| - # …etc. |
| 54 | +- ``FLOW``, ``PRODUCT``: for both of these, the lists of codes appearing in the data are the same from 2021 to 2023 inclusive.
| 55 | +- ``COUNTRY``: The data provided by IEA directly contain codes that are all caps, abbreviated country names, for instance "DOMINICANR". |
| 56 | + The data provided by the OECD contain ISO 3166-1 alpha-3 codes, for instance "DOM". |
| 57 | + In both cases, there are additional labels denoting country groupings; these are defined in the documentation linked above. |
56 | 58 |
|
| 59 | + Changes visible in these lists include: |
57 | 60 |
|
58 |
| -.. literalinclude:: ../../message_ix_models/data/iea/country.yaml |
59 |
| - :language: yaml |
60 |
| - :caption: COUNTRY / node (:file:`country.yaml`) |
| 61 | + - 2022 → 2023: |
61 | 62 |
|
62 |
| -.. literalinclude:: ../../message_ix_models/data/iea/product.yaml |
63 |
| - :language: yaml |
64 |
| - :caption: PRODUCT / commodity (:file:`product.yaml`) |
| 63 | + - New codes: ASEAN, BFA, GREENLAND, MALI, MRT, PSE, TCD. |
| 64 | + - Removed: MASEAN. |
65 | 65 |
|
66 |
| -.. literalinclude:: ../../message_ix_models/data/iea/flag-codes.yaml |
67 |
| - :language: yaml |
68 |
| - :caption: FLAG (:file:`flag-codes.yaml`) |
| 66 | + - 2021 → 2022: |
| 67 | + |
| 68 | + - New codes: GNQ, MDG, MKD, RWA, SWZ, UGA. |
| 69 | + - Removed: EQGUINEA, GREENLAND, MALI, MBURKINAFA, MCHAD, MMADAGASCA, MMAURITANI, MPALESTINE, MRWANDA, MUGANDA, NORTHMACED. |
| 70 | + |
| 71 | +- ``TIME``: always a year.
| 72 | +- ``MEASURE``: unit of measurement, either "TJ" or "ktoe".
| 73 | + |
| 74 | +:mod:`message_ix_models` is packaged with SDMX structure data (stored in :file:`message_ix_models/data/sdmx/`) comprising code lists extracted from the raw data for the COUNTRY, FLOW, and PRODUCT dimensions. |
| 75 | +These can be used with other package utilities, for instance: |
| 76 | + |
| 77 | +.. code-block:: python |
69 | 78 |
|
70 |
| -.. literalinclude:: ../../message_ix_models/data/iea/flow.yaml |
71 |
| - :language: yaml |
72 |
| - :caption: FLOW (:file:`flow.yaml`) |
| 79 | + >>> from message_ix_models.util.sdmx import read |
| 80 | +
|
| 81 | + # Read a code list from file: codes used in the |
| 82 | + # 2022 edition data from the OECD provider |
| 83 | + >>> cl = read("IEA:PRODUCT_OECD(2022)") |
| 84 | +
|
| 85 | + # Show some of its elements |
| 86 | + >>> print("\n".join(sorted(cl.items)[:5]))
| 87 | + ADDITIVE |
| 88 | + ANTCOAL |
| 89 | + AVGAS |
| 90 | + BIODIESEL |
| 91 | + BIOGASES |
| 92 | +
|
| 93 | +The documentation linked above has full descriptions of each code. |
| 94 | + |
| 95 | +IEA provider/format |
| 96 | +~~~~~~~~~~~~~~~~~~~ |
| 97 | + |
| 98 | +From 2023 (or earlier), the data are provided directly on the IEA website at https://www.iea.org/data-and-statistics/data-product/world-energy-balances. |
| 99 | +These data are available in two formats: ‘IVT’ or “Beyond 20/20” format (not supported by this module), and fixed-width text files.
| 100 | +The latter are characterized by: |
| 101 | + |
| 102 | +- Multiple ZIP archives with names like :file:`WBIG[12].zip`, each containing a portion of the data and typically 110–130 MiB compressed size |
| 103 | +- …each containing a single, fixed-width TXT file with a name like :file:`WORLDBIG[12].TXT`, typically 3–4 GiB uncompressed,
| 104 | +- …with no column headers, but data resembling:: |
| 105 | + |
| 106 | + WORLD HARDCOAL 1960 INDPROD KTOE .. |
| 107 | + |
| 108 | + …that appear to correspond, respectively, to the COUNTRY, PRODUCT, TIME, FLOW, and MEASURE dimensions and the "Value" column of the above data.
| 109 | + |
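As an illustration only, one record of the fixed-width format could be split into the dimensions named above as in the following sketch. It is not the parser used by :mod:`message_ix_models.tools.iea.web`. The ``Record`` type and ``parse_line`` function are hypothetical names, and the sketch assumes that no field contains internal whitespace and that ``..`` marks a missing value.

```python
from typing import NamedTuple, Optional


class Record(NamedTuple):
    """One observation; field order follows the sample line above."""

    country: str
    product: str
    time: int
    flow: str
    measure: str
    value: Optional[float]


def parse_line(line: str) -> Record:
    """Split one whitespace-separated record into its six fields.

    Assumption: ".." denotes a missing value and is mapped to None.
    """
    country, product, time, flow, measure, value = line.split()
    return Record(
        country,
        product,
        int(time),
        flow,
        measure,
        None if value == ".." else float(value),
    )


record = parse_line("WORLD HARDCOAL 1960 INDPROD KTOE ..")
```

Splitting on whitespace is simpler than slicing fixed character offsets, but it would fail if any code contained a space; the real column widths would need to be checked against the files.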
| 110 | +OECD provider/format |
| 111 | +~~~~~~~~~~~~~~~~~~~~ |
| 112 | + |
| 113 | +Up until 2023, the EWEB data were available from the OECD iLibrary with DOI `10.1787/enestats-data-en <https://doi.org/10.1787/enestats-data-en>`__. |
| 114 | +These files were characterized by: |
| 115 | + |
| 116 | +- Single ZIP archives with names like :file:`cac5fa90-en.zip`; typically ~850 MiB compressed size, |
| 117 | +- …containing a single CSV file with a name like :file:`WBIG_2022-2022-1-EN-20230406T100006.csv`, typically >20 GiB uncompressed, |
| 118 | +- …with a particular list of columns like: "MEASURE", "Unit", "COUNTRY", "Country", "PRODUCT", "Product", "FLOW", "Flow", "TIME", "Time", "Value", "Flag Codes", "Flags", |
| 119 | +- …with contents that duplicated code IDs—for instance, in the "FLOW" column—with human-readable labels—for instance in the "Flow" column: |
| 120 | + |
| 121 | + ============ === |
| 122 | + Column name Example value |
| 123 | + ============ === |
| 124 | + MEASURE [1]_ KTOE |
| 125 | + Unit ktoe |
| 126 | + COUNTRY WLD |
| 127 | + Country World |
| 128 | + PRODUCT COAL |
| 129 | + Product Coal and coal products |
| 130 | + FLOW INDPROD |
| 131 | + Flow Production |
| 132 | + TIME 2012 |
| 133 | + Time 2012 |
| 134 | + Value 1234.5678 |
| 135 | + Flag Codes M |
| 136 | + Flags Missing value; data cannot exist |
| 137 | + ============ === |
| 138 | + |
| 139 | + .. [1] the column is sometimes labelled "UNIT", but the contents appear to be the same. |
| 140 | +
|
| 141 | +This source is discontinued and will not publish subsequent editions of the data. |
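Because the OECD-format columns come in ID/label pairs, a reader typically needs only the upper-case ID columns plus "Value". The following is a minimal sketch of that selection using only the standard library. It is not the implementation of :func:`.tools.iea.web.load_data`; ``read_ids`` and ``SAMPLE`` are hypothetical names, and the sample row is made up from the example values in the table above.

```python
import csv
import io

# Columns holding code IDs; the paired label columns are dropped
ID_COLUMNS = ["MEASURE", "COUNTRY", "PRODUCT", "FLOW", "TIME"]

# Illustrative two-line file using the column names listed above
SAMPLE = (
    "MEASURE,Unit,COUNTRY,Country,PRODUCT,Product,FLOW,Flow,"
    "TIME,Time,Value,Flag Codes,Flags\n"
    "KTOE,ktoe,WLD,World,COAL,Coal and coal products,INDPROD,Production,"
    "2012,2012,1234.5678,,\n"
)


def read_ids(stream):
    """Return one dict per row with only the ID columns and a float Value."""
    rows = []
    for row in csv.DictReader(stream):
        record = {column: row[column] for column in ID_COLUMNS}
        # Assumption: an empty "Value" cell means no observation
        record["Value"] = float(row["Value"]) if row["Value"] else None
        rows.append(record)
    return rows


rows = read_ids(io.StringIO(SAMPLE))
```

For the real files, which are tens of GiB uncompressed, the rows would need to be processed incrementally or with a columnar tool rather than collected in a list.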