SparseSeries accepts scipy.sparse.spmatrix in constructor #16617
just a quick look. this is probably going to be tricky to make work.
doc/source/whatsnew/v0.20.2.txt
Outdated
@@ -25,6 +25,9 @@ Enhancements
has been added to return the group order (:issue:`11642`); see
:ref:`here <groupby.ngroup>`.

- ``SparseSeries`` and ``SparseArray`` now support 1d ``scipy.sparse.spmatrix`` in constructor. Additionally, ``SparseDataFrame`` can be assigned columns of ``scipy.sparse.spmatrix``; see :ref:`here <sparse.scipysparse_series>`. (:issue:`15634`)
will be for 0.21.0
pandas/core/sparse/frame.py
Outdated
else:
    # 2d; make it iterable
    value = list(value.tocsc().T)
super().__setitem__(key, value)
use the fully qualified call
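One reading of this comment, assuming "fully qualified" means the explicit two-argument form of `super`, which Python 2 requires (the zero-argument form is Python 3 only). A minimal sketch, with a hypothetical `FrameSketch` class standing in for `SparseDataFrame`; it is not pandas code:

```python
class FrameSketch(dict):
    """Hypothetical stand-in for SparseDataFrame; not pandas code."""

    def __setitem__(self, key, value):
        # Illustrative normalization step: turn a tuple into a list of values.
        if isinstance(value, tuple):
            value = list(value)
        # Explicit class and instance: valid on both Python 2 and Python 3.
        super(FrameSketch, self).__setitem__(key, value)


frame = FrameSketch()
frame["x"] = (1, 2, 3)
```

The two-argument call names the class and the instance explicitly, so it does not depend on the compiler support that the bare `super()` form needs.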
pandas/core/sparse/series.py
Outdated
@@ -722,6 +726,9 @@ def combine_first(self, other):

def to_coo(self, row_levels=(0, ), column_levels=(1, ), sort_labels=False):
"""
DEPRECATED; instead, make a SparseSeries with a two-level index,
unstack it, then use .to_coo() on the resulting SparseDataFrame.
use the deprecated sphinx directive (I think we are changing these all over)
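A minimal sketch of what the requested docstring could look like, using the Sphinx `.. deprecated::` directive together with a runtime `FutureWarning`. The class `SparseSeriesSketch` is hypothetical, the version number is taken from the "0.21.0" mentioned in this review, and the wording of the messages is an assumption:

```python
import warnings


class SparseSeriesSketch(object):
    """Hypothetical stand-in for SparseSeries; not pandas code."""

    @classmethod
    def from_coo(cls, A, dense_index=False):
        """
        Create a sparse series from a scipy.sparse.coo_matrix.

        .. deprecated:: 0.21.0
           Pass 1d scipy.sparse matrices directly to the constructor instead.
        """
        # Warn at call time as well; stacklevel=2 points at the caller.
        warnings.warn("from_coo is deprecated; pass the matrix to the "
                      "constructor instead", FutureWarning, stacklevel=2)
        return cls()
```

Sphinx renders the directive as a highlighted deprecation note in the API docs, while the warning reaches users who never read the docstring.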
pandas/core/sparse/series.py
Outdated
@@ -779,6 +786,9 @@ def to_coo(self, row_levels=(0, ), column_levels=(1, ), sort_labels=False):
@classmethod
def from_coo(cls, A, dense_index=False):
"""
DEPRECATED; instead, pass 1d scipy.sparse matrices directly into
SparseSeries constructor, and 2d into SparseDataFrame constructor.
same
Why do you think so? It does seem to work for the moment. 😃
Codecov Report
@@ Coverage Diff @@
## master #16617 +/- ##
==========================================
- Coverage 90.96% 90.95% -0.01%
==========================================
Files 161 161
Lines 49263 49287 +24
==========================================
+ Hits 44810 44827 +17
- Misses 4453 4460 +7
Continue to review full report at Codecov.
In Python 2, an exception caught in an inner block seems to overwrite the outer block's active exception, so a bare `raise` afterwards re-raises the inner one. We're supposed to re-raise the latter, i.e. the outer exception.
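A minimal sketch of one way to avoid the problem described above: bind the outer exception to a name and re-raise it explicitly instead of relying on a bare `raise`. The function names are hypothetical and this is not the code from the PR:

```python
def risky_cleanup():
    # Hypothetical helper that fails while the outer error is being handled.
    raise KeyError("cleanup failed")


def process():
    try:
        raise ValueError("original failure")
    except ValueError as outer:
        try:
            risky_cleanup()
        except KeyError:
            # Swallow the inner error. On Python 2 this handler replaces the
            # "current" exception, so a bare `raise` below would re-raise
            # KeyError rather than ValueError.
            pass
        # Re-raise the outer exception by name; same result on 2 and 3.
        raise outer
```

On Python 2, `raise outer` discards the original traceback, so this trades traceback detail for predictable behaviour across both versions.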
looks pretty good. a couple of comments.
sdf[['z', 'w']] = sp_arr[:, [7, 8]]
sdf.iloc[:, -5:]

Below interface is deprecated.
say that this is deprecated in 0.21.0
@@ -25,6 +25,10 @@ New features
- Added `__fspath__` method to :class`:pandas.HDFStore`, :class:`pandas.ExcelFile`,
and :class:`pandas.ExcelWriter` to work properly with the file system path protocol (:issue:`13823`)

- ``SparseSeries`` and ``SparseArray`` now support 1d ``scipy.sparse.spmatrix`` in constructor.
in the constructor
@@ -25,6 +25,10 @@ New features
- Added `__fspath__` method to :class`:pandas.HDFStore`, :class:`pandas.ExcelFile`,
and :class:`pandas.ExcelWriter` to work properly with the file system path protocol (:issue:`13823`)

- ``SparseSeries`` and ``SparseArray`` now support 1d ``scipy.sparse.spmatrix`` in constructor.
Additionally, ``SparseDataFrame`` can be assigned columns of ``scipy.sparse.spmatrix``;
make this 2nd sentence a separate bullet point (you can use same issue on both of them, or 2nd one should be the PR number maybe)
kind=self._default_kind)
else:
    # 2d; make it iterable
    value = list(value.tocsc().T)
does this materialize?
spm = csr_matrix(np.arange(len(sdf))).T
sdf['X'] = spm
assert _equal(sdf[['X']].to_coo(), spm)
this comparison on the scipy side is fine, but also let's compare with assert_sparse_series/frame_equal
# 1d row -- changing series contents not yet supported
spm = csr_matrix(np.arange(sdf.shape[1], dtype=float))
idx = np.zeros(sdf.shape[0], dtype=bool)
can you test with .loc/.iloc as well (there might already be another issue about this; if not, and it's too complicated for here, then create a new issue)
can you rebase / update
@kernc if you have time can you rebase / update
For some reason (probably
oh, we can do the other one first. what was that number?
Right, it's #17785.
closing as stale. if you want to continue working, pls ping.
git diff upstream/master --name-only -- '*.py' | flake8 --diff