Add getting learner's data as pandas.DataFrame; add learner.to_dataframe method #358
What problem does this PR solve? I believe a pandas DataFrame isn't a go-to choice for numerical data; is there a use case which is sufficiently broadly relevant?
OK, I see: the entries indeed have different meanings and possibly dimensions (x, y, value). For that reason pandas seems appropriate.
Codecov Report
@@ Coverage Diff @@
## master #358 +/- ##
==========================================
- Coverage 80.67% 80.50% -0.17%
==========================================
Files 38 38
Lines 5071 5336 +265
Branches 948 997 +49
==========================================
+ Hits 4091 4296 +205
- Misses 835 873 +38
- Partials 145 167 +22
Description
The code in this PR allows exporting a learner to a `pandas.DataFrame`. This is useful when running many learners (e.g., in a `BalancingLearner`), because the resulting `DataFrame`s can simply be concatenated to get the combined result.
Currently, saving a learner creates a `pickle` file. While this is fine for immediate use, it is bad practice for long-term storage.
I chose `pandas` over `xarray` because we are dealing with tabular data rather than ND dense or sparse arrays.
Ping @CameronKing
Example:
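A minimal sketch of the workflow described above, not the PR's actual implementation. `ToyLearner`, its `tell` method, and the column names `x`, `y`, and `learner` are hypothetical stand-ins chosen for illustration; only the pandas calls (`pd.DataFrame`, `DataFrame.assign`, `pd.concat`, `to_csv`, `read_csv`) are real API.

```python
import io

import pandas as pd


class ToyLearner:
    """Hypothetical stand-in for a 1D learner that stores points as {x: y}."""

    def __init__(self, function):
        self.function = function
        self.data = {}

    def tell(self, x):
        # Evaluate the function at x and remember the result.
        self.data[x] = self.function(x)

    def to_dataframe(self, x_name="x", y_name="y"):
        # One row per sampled point, sorted by x for a stable order.
        points = sorted(self.data.items())
        return pd.DataFrame(points, columns=[x_name, y_name])


learners = {
    "square": ToyLearner(lambda x: x**2),
    "cube": ToyLearner(lambda x: x**3),
}
for learner in learners.values():
    for x in (0.0, 0.5, 1.0):
        learner.tell(x)

# Tag each frame with the learner it came from, then concatenate them.
frames = [lrn.to_dataframe().assign(learner=name) for name, lrn in learners.items()]
combined = pd.concat(frames, ignore_index=True)

# Plain-text round trip, as one possible alternative to pickle for storage.
buffer = io.StringIO()
combined.to_csv(buffer, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer)
```

Adding a `learner` column before concatenating keeps the origin of each row recoverable after the frames are merged, which matters when many learners share the same column names.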

Checklist
- `pre-commit run --all` (first install using `pip install pre-commit`)
- `pytest` passed
Type of change
Check relevant option(s).