>>> from sklearn.feature_extraction.text import CountVectorizer
 
 Load some Data
 **************
@@ -156,6 +157,20 @@ Only columns that are listed in the DataFrameMapper are kept. To keep a column b
 [ 1., 0., 0., 5.],
 [ 0., 0., 1., 4.]])
 
+
+Working with sparse features
+****************************
+
+`DataFrameMapper`s will return a dense feature array by default. Setting `sparse=True` in the mapper will return a sparse array whenever any of the extracted features is sparse. Example:
+
+>>> mapper4 = DataFrameMapper([
+... ('pet', CountVectorizer()),
+... ], sparse=True)
+>>> type(mapper4.fit_transform(data))
+<class 'scipy.sparse.csr.csr_matrix'>
+
+The stacking of the sparse features is done without ever densifying them.
+
 Cross-Validation
 ----------------
 
@@ -179,6 +194,7 @@ Changelog
 ********************
 
 * Raise ``KeyError`` when selecting unexistent columns in the dataframe. Fixes #30.
+* Return sparse feature array if any of the features is sparse and `sparse` argument is `True`. Defaults to `False` to avoid potential breaking of existing code. Resolves #34.
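
The README addition in this diff says that sparse features are stacked without being densified and that the result is sparse only when `sparse=True`. The sketch below illustrates one way such behaviour could work, using `scipy.sparse.hstack`. It is not taken from the sklearn-pandas source: the helper `stack_features` and its `want_sparse` parameter are hypothetical names, and the hand-built `pet_counts` matrix merely stands in for the output of a transformer such as `CountVectorizer`.

```python
import numpy as np
from scipy import sparse


def stack_features(blocks, want_sparse=False):
    """Horizontally stack per-column feature blocks.

    Hypothetical helper for illustration only; this is an assumption
    about the approach, not the sklearn-pandas implementation.
    """
    if want_sparse and any(sparse.issparse(b) for b in blocks):
        # Wrap dense blocks as CSR so hstack can combine everything;
        # blocks that are already sparse are passed through unchanged.
        as_sparse = [
            b if sparse.issparse(b) else sparse.csr_matrix(b) for b in blocks
        ]
        return sparse.hstack(as_sparse).tocsr()
    # Default path: convert any sparse block to dense and stack with NumPy.
    dense = [b.toarray() if sparse.issparse(b) else np.asarray(b) for b in blocks]
    return np.hstack(dense)


# Stand-in for a sparse bag-of-words block (three rows, three tokens).
pet_counts = sparse.csr_matrix(
    np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
)
# A dense numeric feature column.
children = np.array([[4.0], [6.0], [3.0]])

stacked_sparse = stack_features([pet_counts, children], want_sparse=True)
stacked_dense = stack_features([pet_counts, children])
```

With `want_sparse=True` the sparse block is never converted to a dense array; only the small dense column is wrapped. Leaving the flag at its default returns a plain NumPy array, which mirrors the changelog's stated reason for defaulting `sparse` to `False`.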