Fix typing for coords argument in DataArray #5671

Closed
DeltaSigma130 opened this issue Aug 3, 2021 · 3 comments

Comments

@DeltaSigma130

Problem

Mypy reports an error when anything other than a list of tuples is passed to coords in the xr.DataArray constructor, even though xr.Variable and xr.DataArray objects are valid list items at runtime.

Code

Contained in file type_test.py.

import numpy as np
import xarray as xr

t = np.linspace(0, 2*np.pi, 12)
tv = xr.Variable('time', t)  # line 5

d2sin = xr.DataArray(np.sin(2 * tv), coords=[('time', t)])  # using tuple
# mypy reports the error at column: ----v
dsin = xr.DataArray(np.sin(tv), coords=[tv])  # using xr.Variable
dcos = xr.DataArray(np.cos(tv), coords=[dsin['time']])  # using xr.DataArray

print(f"Types:\n\t{type(tv)=}\n\t{type(dsin['time'])=}")  # line 12
print("**d2sin**:", d2sin)
print("**dsin**:", dsin)
print("**dcos**:", dcos)

Running Python

(jupyter) PS C:\Users\<USER>\Documents\Projects\Jupyter\sim_optical_disk> python .\type_test.py
Types:
        type(tv)=<class 'xarray.core.variable.Variable'>
        type(dsin['time'])=<class 'xarray.core.dataarray.DataArray'>
**d2sin**: <xarray.DataArray (time: 12)>
array([ 0.00000000e+00,  9.09631995e-01,  7.55749574e-01, -2.81732557e-01,
       -9.89821442e-01, -5.40640817e-01,  5.40640817e-01,  9.89821442e-01,
        2.81732557e-01, -7.55749574e-01, -9.09631995e-01, -4.89858720e-16])
Coordinates:
  * time     (time) float64 0.0 0.5712 1.142 1.714 ... 4.57 5.141 5.712 6.283
**dsin**: <xarray.DataArray (time: 12)>
array([ 0.00000000e+00,  5.40640817e-01,  9.09631995e-01,  9.89821442e-01,
        7.55749574e-01,  2.81732557e-01, -2.81732557e-01, -7.55749574e-01,
       -9.89821442e-01, -9.09631995e-01, -5.40640817e-01, -2.44929360e-16])
Coordinates:
  * time     (time) float64 0.0 0.5712 1.142 1.714 ... 4.57 5.141 5.712 6.283
**dcos**: <xarray.DataArray (time: 12)>
array([ 1.        ,  0.84125353,  0.41541501, -0.14231484, -0.65486073,
       -0.95949297, -0.95949297, -0.65486073, -0.14231484,  0.41541501,
        0.84125353,  1.        ])
Coordinates:
  * time     (time) float64 0.0 0.5712 1.142 1.714 ... 4.57 5.141 5.712 6.283

Running MyPy

(jupyter) PS C:\Users\<USER>\Documents\Projects\Jupyter\sim_optical_disk> mypy .\type_test.py
type_test.py:9: error: List item 0 has incompatible type "Variable"; expected "Tuple[Any, ...]"
type_test.py:10: error: List item 0 has incompatible type "DataArray"; expected "Tuple[Any, ...]"
Found 2 errors in 1 file (checked 1 source file)
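
The mismatch between runtime and type checker can be reproduced without xarray. The sketch below is a minimal stand-in, not xarray's implementation: `FakeVariable`, `coord_names` and `CoordsArg` are hypothetical names, and the annotation is an assumption modeled on the "expected Tuple[Any, ...]" message above. It shows that a list item that is not a tuple works at runtime but falls outside the declared type, while the tuple form and an explicitly annotated mapping both fit the annotation.

```python
from collections.abc import Mapping as MappingABC
from typing import Any, Hashable, List, Mapping, Sequence, Tuple, Union

# Assumed annotation for `coords`: a sequence of tuples, a mapping, or None.
CoordsArg = Union[Sequence[Tuple[Any, ...]], Mapping[Hashable, Any], None]


class FakeVariable:
    """Hypothetical stand-in for xr.Variable: one named dimension plus values."""

    def __init__(self, dim: str, values: Sequence[float]) -> None:
        self.dim = dim
        self.values = list(values)


def coord_names(coords: CoordsArg) -> List[str]:
    """Hypothetical helper: return the coordinate names found in `coords`."""
    if coords is None:
        return []
    if isinstance(coords, MappingABC):
        return [str(name) for name in coords]
    # At runtime a list item may be a (name, values) tuple or an object with `.dim`,
    # even though the annotation only declares tuples.
    return [str(c[0]) if isinstance(c, tuple) else str(getattr(c, "dim")) for c in coords]


tv = FakeVariable("time", [0.0, 0.5, 1.0])

# List of tuples: matches the annotation.
print(coord_names([("time", tv.values)]))

# Mapping keyed by dimension name, annotated explicitly: matches the annotation.
by_name: Mapping[Hashable, Any] = {tv.dim: tv}
print(coord_names(by_name))

# Runs fine, but a type checker is expected to reject the list item type.
print(coord_names([tv]))
```

If the same reasoning carries over to xarray, passing coords as a tuple list or as a mapping keyed by dimension name would sidestep the error until the annotation is widened; that is an assumption, not something verified against a specific xarray release.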

Environment

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.9.1 (default, Dec 11 2020, 09:29:25) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: English_United States.1252
libhdf5: 1.10.6
libnetcdf: 4.6.1

xarray: 0.18.0
pandas: 1.2.1
numpy: 1.20.2
scipy: 1.6.2
netCDF4: 1.5.7
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.5.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.3.2
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 52.0.0.post20210125
pip: 21.1.1
conda: None
pytest: None
IPython: 7.19.0
sphinx: None

@max-sixty
Collaborator

Good spot, thanks @DeltaSigma130.

I think this should be a fairly easy fix if anyone is up for taking this. I'm somewhat surprised that it's not covered in our existing code such that mypy fails within xarray.

@H0R5E

H0R5E commented Dec 23, 2021

I can confirm this issue when specifying dimensions with coordinates, as per this example:

import xarray as xr

z = -1
x = [1, 2, 3]
y = [4, 5, 6]
data = [0, 0, 0]

coords = {"z": z,
          "x": ("dim_0", x),
          "y": ("dim_0", y)}

xr.DataArray(data,
             coords=coords)

MyPy output:

error: Argument "coords" to "DataArray" has incompatible type "Dict[str, object]"; expected "Union[Sequence[Tuple[Any, ...]], Mapping[Hashable, Any], None]"

I can work around it by specifying the type of coords explicitly:

from typing import Any, Hashable, Mapping

import xarray as xr

z = -1
x = [1, 2, 3]
y = [4, 5, 6]
data = [0, 0, 0]

coords: Mapping[Hashable, Any] = {"z": z,
                                  "x": ("dim_0", x),
                                  "y": ("dim_0", y)}

xr.DataArray(data,
             coords=coords)

MyPy output:

Success: no issues found in 1 source file

There is some discussion about str and Hashable in this issue, but type invariance is still a little over my head, TBH.
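
For readers in the same position, here is a small sketch of the invariance point, independent of xarray. `typing.Mapping` is invariant in its key type (and covariant only in its value type), so a mapping keyed by `str` is not accepted where `Mapping[Hashable, Any]` is expected, even though every `str` is hashable. The function `describe` is a hypothetical stand-in for a parameter annotated like `coords` in the error message above.

```python
from typing import Any, Hashable, List, Mapping


def describe(coords: Mapping[Hashable, Any]) -> List[str]:
    """Hypothetical function whose parameter is annotated like `coords`."""
    return sorted(str(name) for name in coords)


# No annotation: per the error message above, mypy infers Dict[str, object]
# here, because the mixed value types are joined to `object`.
inferred = {"z": -1, "x": ("dim_0", [1, 2, 3])}

# Explicit annotation: the literal is checked against Mapping[Hashable, Any]
# directly, so no Dict[str, ...] type is ever inferred for it.
annotated: Mapping[Hashable, Any] = {"z": -1, "x": ("dim_0", [1, 2, 3])}

print(describe(annotated))  # ['x', 'z']

# describe(inferred) returns the same result at runtime, but a type checker
# is expected to reject it: the key type `str` is not exactly `Hashable`.
```

Both dictionaries are identical at runtime; the annotation only changes which type the checker assigns to the variable.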

@H0R5E

H0R5E commented Dec 23, 2021

My issue might actually be fixed in main.

EDIT: Change was made in #5728
