pivot_table does not accept Series.name as index, columns or value #1212

Closed
cmp0xff opened this issue May 15, 2025 · 3 comments · Fixed by #1216
Comments

@cmp0xff
Contributor

cmp0xff commented May 15, 2025

Describe the bug

pivot_table does not accept Series.name as index, columns or value

To Reproduce

Provide a minimal runnable pandas example that is not properly checked by the stubs.

from collections.abc import Hashable, Sequence

import pandas as pd

ind = pd.Series([0, 0, 1, 1], name="ind")
col = pd.Series([0, 1, 0, 1], name="col")
val = pd.Series([0, 1, 2, 3], name="val")
piv = pd.concat([ind, col, val], axis=1).pivot_table(index=ind.name, columns=col.name, values=val.name)


def foo(data: pd.DataFrame, ind: Hashable | Sequence[Hashable], col: Hashable, val: Hashable) -> pd.DataFrame:
    return data.pivot_table(index=ind, columns=col, values=val)

Indicate which type checker you are using (mypy or pyright).

mypy

Show the error message received from that type checker while checking your example.

sandbox\pd_series.py:8:60: error: Argument "index" to "pivot_table" of "DataFrame" has incompatible type "Hashable"; expected "str | Grouper | Sequence[Any] | None"  [arg-type]
sandbox\pd_series.py:8:78: error: Argument "columns" to "pivot_table" of "DataFrame" has incompatible type "Hashable"; expected "str | Grouper | Sequence[Any] | None"  [arg-type]
sandbox\pd_series.py:8:95: error: Argument "values" to "pivot_table" of "DataFrame" has incompatible type "Hashable"; expected "str | Sequence[str] | None"  [arg-type]
sandbox\pd_series.py: note: In function "foo":
sandbox\pd_series.py:12:35: error: Argument "index" to "pivot_table" of "DataFrame" has incompatible type "Hashable | Sequence[Hashable]"; expected "str | Grouper | Sequence[Any] | None"  [arg-type]
sandbox\pd_series.py:12:48: error: Argument "columns" to "pivot_table" of "DataFrame" has incompatible type "Hashable"; expected "str | Grouper | Sequence[Any] | None"  [arg-type]
sandbox\pd_series.py:12:60: error: Argument "values" to "pivot_table" of "DataFrame" has incompatible type "Hashable"; expected "str | Sequence[str] | None"  [arg-type]
Found 6 errors in 1 file (checked 1 source file)
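Until the stubs change, one way to get past these errors is to narrow the names to `str` before calling `pivot_table`. The sketch below is only an illustration: `pivot_by_str_names` and `as_str_name` are hypothetical helpers, not part of pandas or pandas-stubs. It assumes the `pivot_table` signature quoted in the error messages above, which accepts `str` for all three arguments.

```python
from __future__ import annotations

from collections.abc import Hashable, Sequence

import pandas as pd


def as_str_name(name: Hashable) -> str:
    """Hypothetical helper: narrow a Series name to str, or fail loudly."""
    if not isinstance(name, str):
        raise TypeError(f"expected a str name, got {type(name).__name__}")
    return name


def pivot_by_str_names(
    data: pd.DataFrame,
    ind: str | Sequence[str],
    col: str,
    val: str,
) -> pd.DataFrame:
    """Hypothetical wrapper annotated with str instead of Hashable."""
    return data.pivot_table(index=ind, columns=col, values=val)


ind = pd.Series([0, 0, 1, 1], name="ind")
col = pd.Series([0, 1, 0, 1], name="col")
val = pd.Series([0, 1, 2, 3], name="val")
data = pd.concat([ind, col, val], axis=1)

# Series.name is narrowed to str at runtime before it reaches pivot_table.
piv = pivot_by_str_names(
    data,
    as_str_name(ind.name),
    as_str_name(col.name),
    as_str_name(val.name),
)
```

The runtime check rejects names that are not strings, so this only suits code where every name is known to be a `str`.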

Please complete the following information:

OS: Windows
OS Version: 10
python version: 3.13.2
version of type checker: 1.15.0
version of installed pandas-stubs: 2.2.3.250308

Additional context

Related to pandas-dev/pandas#61432

@Dr-Irv
Collaborator

Dr-Irv commented May 15, 2025

This is a tough call as to how to proceed. There are 2 options:

  1. Change the name properties of Series and Index to be str because that's what most people do for names. (Note that Index.name currently has no typing - so that ought to get fixed anyway.)
  2. Change the arguments of pivot_table() to accept Hashable instead of str

While it is valid to write something like s=pd.Series([1,2,3], name=("a", 3)), my sense is that it would help people more if the stubs enforced that names need to be strings, so I'd prefer option (1). As an example, in the stubs, DataFrame.columns is Index[str], meaning that we are assuming that column names are strings. I tested making that change and all of our current typing tests pass.

If we go with option (2), my concern is that Hashable is too wide and won't catch common errors.
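To illustrate the trade-off, here is a small runtime sketch showing that pandas itself accepts names that are not strings. The data and the integer column labels are made up for illustration. Option (1) would make a type checker flag code like this even though it runs.

```python
import pandas as pd

# A tuple is a valid Series name at runtime.
s = pd.Series([1, 2, 3], name=("a", 3))

# Integer column labels (illustrative values) also work with pivot_table.
df = pd.DataFrame(
    {
        10: [0, 0, 1, 1],  # used as the index
        20: [0, 1, 0, 1],  # used as the columns
        30: [0, 1, 2, 3],  # used as the values
    }
)
piv = df.pivot_table(index=10, columns=20, values=30)
```

Stubs annotated with `str` would reject both the tuple name and the integer labels, while stubs annotated with `Hashable` would accept them along with many accidental arguments.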

@cmp0xff
Contributor Author

cmp0xff commented May 15, 2025

Hi @Dr-Irv, thank you for your reply. Personally I would be happy to go with option 1. Should we wait for others to join the discussion?

Furthermore, how should we proceed? As one can see in the linked pandas-dev/pandas#61432, the pandas documentation also needs to be fixed, one way or another. Should we fix the stubs or the documentation first?

@Dr-Irv
Collaborator

Dr-Irv commented May 15, 2025

In this case, I'm fine if we change the stubs. I'm not sure there will be agreement on changing the docs to say that a name should be just a str. That's because there are operations where column names can end up being hashable objects other than strings. E.g.,

>>> df = pd.DataFrame({"x":[1,2,3], "let":["a", "b", "c"], "val":[10,20,30]}).set_index(["x", "let"])
>>> df
       val
x let
1 a     10
2 b     20
3 c     30
>>> dft = df.T
>>> dft
x     1   2   3
let   a   b   c
val  10  20  30
>>> dft.columns[0]
(np.int64(1), 'a')
>>> dft[dft.columns[0]].name
(np.int64(1), 'a')

There are other operations that can create a DataFrame that has column names that are tuples, and then the "name" of each column is a tuple.
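As one further illustration (not taken from the thread), aggregating with several functions per column is another such operation: the result has MultiIndex columns, so each column label, and the name of each selected column, is a tuple. The frame and its labels below are made up.

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "a", "b"], "val": [10, 20, 30]})

# Two aggregations on one column produce two-level (MultiIndex) columns.
agg = df.groupby("key").agg({"val": ["min", "max"]})

first_label = agg.columns[0]      # a tuple of (column, aggregation)
col_name = agg[first_label].name  # the selected Series carries the same tuple
```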

cmp0xff added a commit to cmp0xff/pandas-stubs that referenced this issue May 16, 2025
Dr-Irv pushed a commit that referenced this issue May 16, 2025
* fix: #1212 Index.name currently has no typing #1212 (comment)

* fix: #1212 use typing from pandas.core.reshape.pivot

* fix(comment): #1216 (comment)