BUG: fix DataFrame(data=[None, 1], dtype='timedelta64[ns]') raising ValueError #60081
Merged
15 commits
06f317e BUG: fix DataFrame(data=[None, 1], dtype='timedelta64[ns]') raising V… (yuanx749)
b5bec39 ci (yuanx749)
4fe0ffa Merge remote-tracking branch 'upstream/main' into sequence_to_td64ns (yuanx749)
98dd50f Add whatsnew (yuanx749)
9c34762 Move to maybe_downcast_to_dtype (yuanx749)
65a48e1 Merge remote-tracking branch 'upstream/main' into sequence_to_td64ns (yuanx749)
0627217 Fix mypy failure (yuanx749)
d93d5b2 Move upper level (yuanx749)
84c01da Merge remote-tracking branch 'upstream/main' into sequence_to_td64ns (yuanx749)
bc106df Merge remote-tracking branch 'upstream/main' into sequence_to_td64ns (yuanx749)
11a9a7e Update (yuanx749)
bf680ce mypy (yuanx749)
fc2405f Merge remote-tracking branch 'upstream/main' into sequence_to_td64ns (yuanx749)
5c36932 Merge remote-tracking branch 'upstream/main' into sequence_to_td64ns (yuanx749)
e19bf19 Merge remote-tracking branch 'upstream/main' into sequence_to_td64ns (yuanx749)

Why would e.g. `[None, 1]` be converted to a 2D array?

Following the traceback, the input list is converted to a 2D array in `ndarray_to_mgr`, where `_prep_ndarraylike` returns `_ensure_2d(values)` (pandas/pandas/core/internals/construction.py, line 267 in 9e10119).

It seems `_ensure_2d` is also called for other types of input.

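To illustrate the reshaping step described in the comment above, here is a minimal NumPy-only sketch. It assumes the private helper behaves roughly like the function below; the name `ensure_2d_sketch` is hypothetical and this is not the pandas source.

```python
import numpy as np


def ensure_2d_sketch(values: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the private helper (an assumption, not pandas code)."""
    if values.ndim == 1:
        # A flat sequence of n items becomes a single column of shape (n, 1).
        values = values.reshape((values.shape[0], 1))
    elif values.ndim != 2:
        raise ValueError(f"Must pass 2-d input. shape={values.shape}")
    return values


data = np.array([None, 1], dtype=object)  # the list from the PR title
as_2d = ensure_2d_sketch(data)
print(data.shape, as_2d.shape)
```

Under this assumption, any converter that expects a flat sequence would receive a one-column 2D array rather than the original list.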
OK, I see. Generally I think there would still be problems if a user passes a nested list (e.g. a "2x2" nested list) or passes `dtype="datetime64[unit]"`.

I think generally `maybe_cast_to_datetime` should assume the incoming `value` is 1D, since the `_from_sequence` calls also assume the values are 1D, so the DataFrame code should apply this column-wise.

For nested list, `nested_data_to_arrays` in `DataFrame.__init__` processes the data column-wise, so there is no problem.

There is no error for datetime64, because `DatetimeArray._from_sequence` actually happens to work with 2D array.

I think it is better to move the ravel and reshape in `_try_cast` below, sort of like the Unicode string dtype elif branch, so as to ensure the input of `maybe_cast_to_datetime` is 1D (pandas/pandas/core/construction.py, lines 796 to 810 in 4651ddb).

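The ravel-and-reshape idea can be sketched with NumPy alone. This is an illustration under stated assumptions, not the change made in this PR: `to_td64ns_1d` and `cast_any_shape` are hypothetical names, and the converter only handles `None` and integer nanoseconds.

```python
import numpy as np


def to_td64ns_1d(values) -> np.ndarray:
    """Hypothetical 1D-only converter: None -> NaT, integers -> nanoseconds."""
    arr = np.asarray(values, dtype=object)
    if arr.ndim != 1:
        raise ValueError("expected 1D input")
    out = np.empty(arr.shape[0], dtype="timedelta64[ns]")
    for i, item in enumerate(arr):
        if item is None:
            out[i] = np.timedelta64("NaT", "ns")
        else:
            out[i] = np.timedelta64(int(item), "ns")
    return out


def cast_any_shape(values, converter) -> np.ndarray:
    """Flatten, run the 1D converter, then restore the original shape."""
    arr = np.asarray(values, dtype=object)
    flat = converter(arr.ravel())  # the converter only ever sees 1D input
    return flat.reshape(arr.shape)


column = np.array([[None], [1]], dtype=object)  # shape (2, 1)
result = cast_any_shape(column, to_td64ns_1d)
print(result.shape, result.dtype)
```

Calling `to_td64ns_1d(column)` directly raises `ValueError` in this sketch, which is the kind of failure the wrapper is meant to avoid.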
cc @jbrockmendel if you have thoughts on this approach