SparseDataFrame loc raises InvalidIndexError when repeated keys are passed #24270

Closed
vindex10 opened this issue Dec 13, 2018 · 0 comments · Fixed by #28425
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves Sparse Sparse Data Type


Code Sample, a copy-pastable example if possible

import pandas as pd

tmp = pd.SparseDataFrame([[0]*330]*447)
tmp.loc[[0]*61]
Traceback:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/vindex10/downloads/pand/pandas/pandas/core/base.py", line 77, in __repr__
    return str(self)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/base.py", line 56, in __str__
    return self.__unicode__()
  File "/home/vindex10/downloads/pand/pandas/pandas/core/frame.py", line 628, in __unicode__
    line_width=width, show_dimensions=show_dimensions)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/frame.py", line 708, in to_string
    line_width=line_width)
  File "/home/vindex10/downloads/pand/pandas/pandas/io/formats/format.py", line 428, in __init__
    self._chk_truncate()
  File "/home/vindex10/downloads/pand/pandas/pandas/io/formats/format.py", line 497, in _chk_truncate
    frame = concat((frame.iloc[:row_num, :],
  File "/home/vindex10/downloads/pand/pandas/pandas/core/indexing.py", line 1490, in __getitem__
    return self._getitem_tuple(key)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/indexing.py", line 2155, in _getitem_tuple
    retval = getattr(retval, self.name)._getitem_axis(key, axis=axis)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/indexing.py", line 2206, in _getitem_axis
    return self._get_slice_axis(key, axis=axis)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/indexing.py", line 2176, in _get_slice_axis
    return self._slice(slice_obj, axis=axis, kind='iloc')
  File "/home/vindex10/downloads/pand/pandas/pandas/core/indexing.py", line 151, in _slice
    return self.obj._slice(obj, axis=axis, kind=kind)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/sparse/frame.py", line 528, in _slice
    return self.reindex(index=new_index, columns=new_columns)
  File "/home/vindex10/downloads/pand/pandas/pandas/util/_decorators.py", line 186, in wrapper
    return func(*args, **kwargs)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/frame.py", line 3641, in reindex
    return super(DataFrame, self).reindex(**kwargs)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/generic.py", line 4331, in reindex
    fill_value, copy).__finalize__(self)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/frame.py", line 3573, in _reindex_axes
    fill_value, limit, tolerance)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/sparse/frame.py", line 679, in _reindex_index
    indexer = self.index.get_indexer(index, method, limit=limit)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/indexes/base.py", line 2680, in get_indexer
    raise InvalidIndexError('Reindexing only valid with uniquely'
pandas.core.indexes.base.InvalidIndexError: Reindexing only valid with uniquely valued Index objects
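
The last frame of the traceback is `Index.get_indexer` rejecting an index whose values are not unique. A minimal sketch of that check in isolation, assuming a pandas version in which `get_indexer` still refuses non-unique indexes; `dup_index` and `error_name` are illustrative names, not part of the report:

```python
import pandas as pd

# An index that repeats the same label, like the row index that
# results from selecting label 0 sixty-one times.
dup_index = pd.Index([0] * 61)

try:
    dup_index.get_indexer([0])
    error_name = None
except Exception as exc:
    # Record the exception class name rather than importing it,
    # since its import location differs between pandas versions.
    error_name = type(exc).__name__

print(error_name)
```

The sketch only isolates the final check; it does not go through the sparse `_slice` and `reindex` path shown in the traceback.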

Problem description

When asking for 60 rows it works:

tmp.loc[[0]*60]

It also works with 61 rows on a regular DataFrame, so the behavior is unexpected.
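
For comparison, a short sketch of the same selection on a regular DataFrame, which returns and displays 61 rows. The names `dense` and `result` are illustrative. The traceback enters through `__repr__` and `_chk_truncate`, and the default `display.max_rows` option is 60, so the 60/61 threshold may come from display truncation rather than from `loc` itself; that reading is an inference from the traceback, not something confirmed in this report.

```python
import pandas as pd

# Same shape as the sparse frame in the report, but dense.
dense = pd.DataFrame([[0] * 330] * 447)

# Select row label 0 sixty-one times.
result = dense.loc[[0] * 61]

# Building the repr exercises the truncation path seen in the traceback.
text = repr(result)

print(result.shape)
print(pd.get_option("display.max_rows"))
```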

Expected Output

A SparseDataFrame with 61 rows.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: c037128
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.14.83-gentoo
machine: x86_64
processor: Intel(R) Core(TM) i5-4460 CPU @ 3.20GHz
byteorder: little
LC_ALL: en_US.UTF8
LANG: en_US.utf8
LOCALE: en_US.UTF-8

pandas: 0.24.0.dev0+1274.gc0371288f
pytest: None
pip: 18.1
setuptools: 39.0.1
Cython: 0.29.1
numpy: 1.15.4
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

@mroeschke mroeschke added Bug Indexing Related to indexing on series/frames, not to indexes themselves Sparse Sparse Data Type labels Jan 13, 2019