Commit dc66f0d

phofl and Illviljan authored
Fix bug for categorical pandas index with categories with EA dtype (#8481)
* Fix bug for categorical pandas index with categories with EA dtype
* Add whatsnew
* Update xarray/tests/test_dataset.py

Co-authored-by: Illviljan <[email protected]>

---------

Co-authored-by: Illviljan <[email protected]>
1 parent 71c2f61 commit dc66f0d

3 files changed: 15 additions and 0 deletions

doc/whats-new.rst

Lines changed: 2 additions & 0 deletions
@@ -42,6 +42,8 @@ Deprecations
 Bug fixes
 ~~~~~~~~~

+- Fix dtype inference for ``pd.CategoricalIndex`` when categories are backed by a ``pd.ExtensionDtype`` (:pull:`8481`)
+

 Documentation
 ~~~~~~~~~~~~~

xarray/core/utils.py

Lines changed: 2 additions & 0 deletions
@@ -114,6 +114,8 @@ def get_valid_numpy_dtype(array: np.ndarray | pd.Index):
     elif hasattr(array, "categories"):
         # category isn't a real numpy dtype
         dtype = array.categories.dtype
+        if not is_valid_numpy_dtype(dtype):
+            dtype = np.dtype("O")
     elif not is_valid_numpy_dtype(array.dtype):
         dtype = np.dtype("O")
     else:
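
The effect of the two added lines can be reproduced outside xarray with a minimal sketch. The helper `resolve_dtype_sketch` below is a hypothetical, simplified stand-in for `get_valid_numpy_dtype` (it covers only the branches shown in this hunk), and the body of `is_valid_numpy_dtype` is an assumption about how such a check could be written, not a copy of xarray's code:

```python
import numpy as np
import pandas as pd


def is_valid_numpy_dtype(dtype) -> bool:
    # Assumed implementation: a dtype is "valid" if NumPy can interpret it.
    try:
        np.dtype(dtype)
    except (TypeError, ValueError):
        return False
    return True


def resolve_dtype_sketch(array):
    # Hypothetical, reduced version of the branches touched by this commit.
    if hasattr(array, "categories"):
        # "category" is not a real NumPy dtype, so look at the categories.
        dtype = array.categories.dtype
        if not is_valid_numpy_dtype(dtype):
            # Categories backed by a pandas extension dtype (e.g. "string")
            # cannot be expressed in NumPy; fall back to object.
            dtype = np.dtype("O")
    elif not is_valid_numpy_dtype(array.dtype):
        dtype = np.dtype("O")
    else:
        dtype = array.dtype
    return dtype


# Categorical index whose categories use the pandas "string" extension dtype.
string_cat = pd.CategoricalIndex(
    pd.Categorical.from_codes(
        np.array([1, 1, 0, 2]),
        categories=pd.Index(["foo", "bar", "baz"], dtype="string"),
    )
)
# Categorical index whose categories are plain integers.
int_cat = pd.CategoricalIndex([1, 2, 1])

string_result = resolve_dtype_sketch(string_cat)
int_result = resolve_dtype_sketch(int_cat)
```

Without the inner check, the first case would hand the extension dtype on unchanged, which is the situation the commit title describes. Integer categories still resolve to their own NumPy dtype.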

xarray/tests/test_dataset.py

Lines changed: 11 additions & 0 deletions
@@ -4697,6 +4697,17 @@ def test_from_dataframe_categorical(self) -> None:
         assert len(ds["i1"]) == 2
         assert len(ds["i2"]) == 2

+    def test_from_dataframe_categorical_string_categories(self) -> None:
+        cat = pd.CategoricalIndex(
+            pd.Categorical.from_codes(
+                np.array([1, 1, 0, 2]),
+                categories=pd.Index(["foo", "bar", "baz"], dtype="string"),
+            )
+        )
+        ser = pd.Series(1, index=cat)
+        ds = ser.to_xarray()
+        assert ds.coords.dtypes["index"] == np.dtype("O")
+
     @requires_sparse
     def test_from_dataframe_sparse(self) -> None:
         import sparse
