BUG: DataFrame.to_dict when orient=index data loss #22810
…ame with non-unique index
Hello @jameswinegar! Thanks for updating the PR.
Comment last updated on September 23, 2018 at 02:01 Hours UTC
Codecov Report
@@            Coverage Diff             @@
##           master   #22810      +/-   ##
==========================================
+ Coverage   92.19%   92.19%   +<.01%
==========================================
  Files         169      169
  Lines       50873    50913      +40
==========================================
+ Hits        46904    46941      +37
- Misses       3969     3972       +3
can you add a note in the api changes section about this for 0.24.0
pandas/core/frame.py
Outdated
@@ -1163,8 +1163,13 @@ def to_dict(self, orient='dict', into=dict):
for k, v in zip(self.columns, np.atleast_1d(row)))
for row in self.values]
elif orient.lower().startswith('i'):
return into_c((t[0], dict(zip(self.columns, t[1:])))
for t in self.itertuples())
if self.index.unique().size == self.index.size:
you can just use self.index.is_unique
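A quick sketch of why the two checks are interchangeable, assuming a reasonably recent pandas install. The index values here are made up for illustration and are not taken from the PR.

```python
import pandas as pd

# Illustrative indexes (not from the PR): one unique, one with a repeated label.
unique_idx = pd.Index(["a", "b", "c"])
dup_idx = pd.Index(["a", "b", "a"])

for idx in (unique_idx, dup_idx):
    # The check as first written in the PR: compare deduplicated size to full size.
    long_form = idx.unique().size == idx.size
    # The reviewer's suggestion: the built-in property gives the same answer.
    assert long_form == idx.is_unique

print(unique_idx.is_unique)
print(dup_idx.is_unique)
```

The property reads more directly and avoids building a deduplicated copy of the index just to compare sizes.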
pandas/core/frame.py
Outdated
if self.index.unique().size == self.index.size:
return into_c((t[0], dict(zip(self.columns, t[1:])))
for t in self.itertuples())
else:
no need for an else!
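A minimal sketch of the shape being asked for, written in plain Python rather than against pandas internals. The helper `index_oriented_dict` and its error message are hypothetical, used only to show that a branch ending in `return` makes the `else` redundant.

```python
def index_oriented_dict(index, columns, rows):
    """Hypothetical stand-in for the orient='index' branch; not pandas code."""
    # Happy path returns immediately when every index label is distinct.
    if len(set(index)) == len(index):
        return {key: dict(zip(columns, row)) for key, row in zip(index, rows)}
    # Reached only for duplicate labels, so no `else` is required.
    raise ValueError("index must be unique for orient='index'")


print(index_oriented_dict(["x", "y"], ["a", "b"], [(1, 2), (3, 4)]))
```

With duplicate labels such as `["x", "x"]`, the dict comprehension alone would silently keep only the last row per label, which is the data loss the PR title refers to.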
does this affect any other orient types?
can you add some equiv tests for .to_json? (or is this affected?)
@jreback Where is the API changes documentation?
doc/source/whatsnew/v0.24.0.txt
Outdated
@@ -191,6 +191,7 @@ Other Enhancements
- :meth:`Series.resample` and :meth:`DataFrame.resample` have gained the :meth:`Resampler.quantile` (:issue:`15023`).
- :meth:`Index.to_frame` now supports overriding column name(s) (:issue:`22580`).
- New attribute :attr:`__git_version__` will return git commit sha of current build (:issue:`21295`).
- :func:`~DataFrame.to_dict` with orient on index and a non-unique index will raise a ValueError instead of losing data (:issue:`22801`).
move to other api changes
@jreback any more edits needed?
doc/source/whatsnew/v0.24.0.txt
Outdated
@@ -804,6 +804,7 @@ Reshaping
- Bug in :meth:`Series.replace` and meth:`DataFrame.replace` when dict is used as the ``to_replace`` value and one key in the dict is is another key's value, the results were inconsistent between using integer key and using string key (:issue:`20656`)
- Bug in :meth:`DataFrame.drop_duplicates` for empty ``DataFrame`` which incorrectly raises an error (:issue:`20516`)
- Bug in :func:`pandas.wide_to_long` when a string is passed to the stubnames argument and a column name is a substring of that stubname (:issue:`22468`)
- Bug in :func:`DataFrame.to_dict` raises ``ValueError`` when used with ``orient='index'`` and a non-unique index instead of losing data (:issue:`22801`)
move to API Breaking changes
@jreback is that move good?
lgtm @jreback
doc/source/whatsnew/v0.24.0.txt
Outdated
@@ -847,7 +847,7 @@ Reshaping
- Bug in :meth:`Series.replace` and meth:`DataFrame.replace` when dict is used as the ``to_replace`` value and one key in the dict is is another key's value, the results were inconsistent between using integer key and using string key (:issue:`20656`)
- Bug in :meth:`DataFrame.drop_duplicates` for empty ``DataFrame`` which incorrectly raises an error (:issue:`20516`)
- Bug in :func:`pandas.wide_to_long` when a string is passed to the stubnames argument and a column name is a substring of that stubname (:issue:`22468`)
-
- Bug in :func:`merge` when merging ``datetime64[ns, tz]`` data that contained a DST transition (:issue:`18885`)
Resolved merge conflict
rebase and ping on green.
@jreback fixed. long wait last night for circleci

Bug in :func:`DataFrame.to_dict` raises ``ValueError`` when used with
``orient='index'`` and a non-unique index instead of losing data (:issue:`22801`)
can you add a mini-example here (just showing what kind of data frame will raise). the prior example was too long, but I think we need something.
Added an ipython block that will raise the error. @jreback
right but now the docs will not work. so, you can do this as an ipython block but you need an :okexcept: to handle the exception.
also show the df after defining it
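For reference, a small sketch of the kind of frame that triggers the new error, assuming pandas 0.24 or later. The column names and values are invented here and are not the example that ended up in the whatsnew file.

```python
import pandas as pd

# Invented data: two rows sharing the index label "A".
df = pd.DataFrame({"a": [1, 2], "b": [0.5, 0.75]}, index=["A", "A"])
print(df)

try:
    # With a non-unique index, orient="index" raises instead of dropping rows.
    result = df.to_dict(orient="index")
except ValueError as exc:
    result = None
    print(f"ValueError: {exc}")
```

In the docs the same two statements would sit inside an ipython block marked `:okexcept:`, so the build records the exception rather than failing on it.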
Add example that will raise value error.
doc/source/whatsnew/v0.24.0.txt
Outdated
``orient='index'`` and a non-unique index instead of losing data (:issue:`22801`)

.. ipython:: python
:okexcept:
line this up as below, e.g.
.. ipython:: python
   :okexcept:

   df = ......
one more doc correction, ping on green.
thanks @jameswinegar
git diff upstream/master -u -- "*.py" | flake8 --diff