BUG: GH3468 Fix assigning a new index to a duplicate index in a DataFrame would fail #3483
@y-p when you have a chance... I think I figured out the duplicate columns across dtypes issue. I non-trivially create the indexer, which seems to fix this problem (I added a few tests). Pretty sure it also fixes the whole to_csv mess that we had. I think some of your #3458 can be reverted. Also, I seem to have lost the issue that was created at the very end of the to_csv stuff last month when we punted on fixing dup columns (well, it failed fast in certain cases). We should also now review the other dup columns stuff.
@y-p thanks... that looks right... still some issues remaining... but making progress...
@y-p what class of issues/PRs are we doing for 0.11.1? Mostly just bug fixes, or minor features? Sort of like a mini-release, just shorter in time?
Bug fixes and small-to-medium enhancements that are back-compatible. No sudden moves.
k... great... hopefully will have most of the dup columns issues resolved very shortly
I'll add the config option to control dupe col name mangling tomorrow, and will decide on a default before the release.
@y-p I think I got all the dup issues; the only one remaining is to_csv. This will work (I pseudo-tested), but we need to remove the exception that is raised when the columns are duped. You can do the same type of indexing on a dup column as on a non-index column (I think that was the issue). I still can't figure out which issue it was?
hmmm. #3095?
that looks right...
That is closed, don't we have an open one? (Or did we just shove the exception in that one and punt?)
Yeah, I think we did. The block positional indexer issue was all I remember.
@wesm any thoughts before we merge this?
I added #3495 to cover the to_csv stuff
…get) when using a non-unique index (GH2786 for the warning and GH3230 for applymap) TST: test for GH2194 (which is fixed)
closed in favor of #3509, which is cleaner and has to_csv support
partially fixes #3468
This would previously raise (same-dtype assignment to a non-multi-dtype frame with dup indices)
Construction of a multi-dtype frame with a dup index (#2194) is fixed
This also would previously raise
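As a rough illustration of the kinds of cases listed above, here is a small sketch. The frames, labels, and values are assumptions chosen for illustration, not the examples originally posted in this PR, and the behavior shown is that of current pandas releases rather than the 0.11-era code under discussion.

```python
import pandas as pd

# Single-dtype frame whose column index contains a duplicate label
# (illustrative data, not taken from the PR).
df = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "a"])

# Assign a new index to the axis that held the duplicates.
df.columns = ["a", "a.1"]
new_cols = list(df.columns)

# Multi-dtype frame (int, object, float) built with a duplicate row index.
mixed = pd.DataFrame(
    {"x": [1, 2, 3], "y": ["p", "q", "r"], "z": [1.0, 2.0, 3.0]},
    index=["i", "i", "j"],
)
had_dups = not mixed.index.is_unique

# Replace the duplicate row index with a unique one.
mixed.index = ["i", "k", "j"]
```

After the reassignment, label-based lookups such as `mixed.loc["k", "x"]` address a single row again.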
For those of you interested... here is the new ref_loc indexer for duplicate columns
It is by necessity a block-oriented indexer: it maps each column (by column number) to a tuple of the block and the index within that block. It is only created when needed (e.g. when trying to get a column via iget and the index is non-unique), and the results are cached. This is #3092
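To make the idea of a column-number-to-(block, index-in-block) map more concrete, here is a minimal pure-Python sketch. It is not the pandas internals code: `Block`, `build_ref_loc`, and the dict-based cache are hypothetical stand-ins, and only the name `iget` and the shape of the mapping come from the description above.

```python
from typing import Dict, List, Tuple


class Block:
    """Hypothetical stand-in for an internal block: one dtype, several columns."""

    def __init__(self, dtype: str, positions: List[int], values: List[list]):
        self.dtype = dtype
        self.positions = positions  # frame column numbers stored in this block
        self.values = values        # one list of values per stored column


def build_ref_loc(blocks: List[Block]) -> Dict[int, Tuple[Block, int]]:
    """Map each frame column number to (block, index within that block)."""
    ref_loc = {}
    for block in blocks:
        for i, col in enumerate(block.positions):
            ref_loc[col] = (block, i)
    return ref_loc


def iget(blocks: List[Block], col: int, cache: dict) -> list:
    """Fetch a column by position; the map is built lazily and then cached."""
    if "ref_loc" not in cache:
        cache["ref_loc"] = build_ref_loc(blocks)
    block, i = cache["ref_loc"][col]
    return block.values[i]


# Frame with columns ["a", "b", "a"]: the two "a" columns share the int block,
# so a label lookup is ambiguous but a positional lookup is not.
int_block = Block("int64", [0, 2], [[1, 2], [5, 6]])
obj_block = Block("object", [1], [["x", "y"]])
blocks = [int_block, obj_block]

cache = {}
first_a = iget(blocks, 0, cache)
b_col = iget(blocks, 1, cache)
second_a = iget(blocks, 2, cache)
```

Because the lookup is keyed on column number rather than label, the two `"a"` columns resolve to different slots of the same block, which is the property the description relies on for non-unique indexes.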
Fixed the #2786, #3230 bug that caused applymap to not work (we had temporarily worked around it by raising a ValueError; that check is now removed)