Add getting learner's data as pandas.DataFrame; add learner.to_dataframe method #358

Merged: @basnijholt merged 32 commits into master from as-pandas-dataframe on Sep 19, 2022

@basnijholt (Member) commented Sep 6, 2022

Description

The code in this PR allows exporting a learner's data to a pandas.DataFrame. This is useful when running many learners (e.g., in a BalancingLearner): the resulting DataFrames can simply be concatenated to get the combined result.
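
A minimal sketch of the concatenation workflow described above, using only pandas. The `data_to_frame` helper, the made-up sample values, and the `learner_index` column name are assumptions for illustration; they stand in for whatever `learner.to_dataframe` returns and are not this PR's implementation.

```python
import pandas as pd


def data_to_frame(data: dict) -> pd.DataFrame:
    """Hypothetical stand-in for an export step: one row per sampled point."""
    xs = sorted(data)
    return pd.DataFrame({"x": xs, "y": [data[x] for x in xs]})


# Made-up samples from two independent 1D runs, each mapping x -> y.
runs = [
    {0.0: 0.0, 0.5: 0.25, 1.0: 1.0},
    {0.0: 0.0, 1.0: 2.0},
]

# Tag each frame with the index of the run it came from, then stack them.
frames = [data_to_frame(d).assign(learner_index=i) for i, d in enumerate(runs)]
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Tagging each frame before concatenating keeps the origin of every row recoverable, so the combined table can later be split again with `combined.groupby("learner_index")`.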

Currently, saving a learner creates a pickle file. While this is fine for immediate use, it is bad practice for long-term storage.
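
To illustrate the storage point: once the data is in a DataFrame, it can be written to a plain-text format that does not depend on Python class definitions the way a pickle does. The sketch below round-trips a small, made-up table through CSV. CSV is just one possible choice here, not something this PR prescribes.

```python
import io

import pandas as pd

# Made-up sampled points; in practice this would be a learner's exported data.
df = pd.DataFrame({"x": [0.0, 0.5, 1.0], "y": [0.0, 0.25, 1.0]})

# Serialize to plain text (no path given, so to_csv returns a string).
csv_text = df.to_csv(index=False)

# Read the text back and confirm that the table survives the round trip.
restored = pd.read_csv(io.StringIO(csv_text))
pd.testing.assert_frame_equal(df, restored)
```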

I chose pandas over xarray because we are dealing with tabular data rather than N-dimensional dense or sparse arrays.
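
The tabular-versus-array distinction can be sketched as follows. Adaptively sampled points generally do not lie on a regular grid, so forcing them into a dense 2D array leaves most cells empty. The sample values and the column names `x`, `y`, and `value` are made up for illustration.

```python
import pandas as pd

# Made-up samples of a 2D function, keyed by the (x, y) point.
data = {(0.0, 0.0): 1.0, (0.5, 0.25): 2.0, (1.0, 0.75): 3.0}

# Long (tabular) form: one row per sampled point, no empty cells.
rows = [(x, y, value) for (x, y), value in sorted(data.items())]
df = pd.DataFrame(rows, columns=["x", "y", "value"])

# Dense grid form: one cell per (x, y) combination, mostly NaN here.
grid = df.pivot(index="x", columns="y", values="value")
filled = int(grid.notna().sum().sum())
print(f"table rows: {len(df)}, grid cells: {grid.size}, filled cells: {filled}")
```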

Ping @CameronKing

Example: [screenshot not preserved]

Checklist

  • Fixed style issues using pre-commit run --all (first install using pip install pre-commit)
  • pytest passed

Type of change

Check relevant option(s).

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • (Code) style fix or documentation update
  • This change requires a documentation update

@akhmerov (Contributor) commented Sep 6, 2022

What problem does this PR solve? I believe a pandas DataFrame isn't a go-to choice for numerical data; is there a use case which is sufficiently broadly relevant?

@basnijholt changed the title from "WIP: Add getting learner to pandas; add learner.as_dataframe" to "WIP: Add getting learner to pandas; add learner.to_dataframe" Sep 13, 2022
@basnijholt changed the title from "WIP: Add getting learner to pandas; add learner.to_dataframe" to "Add getting learner to pandas; add learner.to_dataframe" Sep 13, 2022
@basnijholt changed the title from "Add getting learner to pandas; add learner.to_dataframe" to "Add getting learner's data as pandas.DataFrame; add learner.to_dataframe" Sep 13, 2022
@basnijholt changed the title from "Add getting learner's data as pandas.DataFrame; add learner.to_dataframe" to "Add getting learner's data as pandas.DataFrame; add learner.to_dataframe method" Sep 13, 2022
@basnijholt assigned jbweston and unassigned jbweston Sep 13, 2022
@basnijholt requested a review from jbweston September 13, 2022 18:10
@akhmerov (Contributor) commented

OK, I see: the entries indeed have different meanings and possibly dimensions (x, y, value). For that reason pandas seems appropriate.

@codecov-commenter commented Sep 13, 2022

Codecov Report

Merging #358 (e091772) into master (2c8d2ad) will decrease coverage by 0.16%.
The diff coverage is 76.00%.

@@            Coverage Diff             @@
##           master     #358      +/-   ##
==========================================
- Coverage   80.67%   80.50%   -0.17%     
==========================================
  Files          38       38              
  Lines        5071     5336     +265     
  Branches      948      997      +49     
==========================================
+ Hits         4091     4296     +205     
- Misses        835      873      +38     
- Partials      145      167      +22     
Impacted Files                            Coverage Δ
adaptive/learner/integrator_learner.py    88.51% <33.33%> (-2.65%) ⬇️
adaptive/learner/average_learner1D.py     77.51% <62.50%> (-1.30%) ⬇️
adaptive/learner/learnerND.py             60.26% <65.21%> (+0.14%) ⬆️
adaptive/learner/average_learner.py       83.76% <71.42%> (-3.11%) ⬇️
adaptive/learner/learner1D.py             89.47% <71.42%> (-0.50%) ⬇️
adaptive/learner/learner2D.py             78.75% <72.72%> (-0.55%) ⬇️
adaptive/learner/sequence_learner.py      85.29% <75.00%> (-3.17%) ⬇️
adaptive/learner/balancing_learner.py     75.00% <80.00%> (+0.52%) ⬆️
adaptive/learner/data_saver.py            86.66% <80.00%> (-3.34%) ⬇️
adaptive/utils.py                         78.49% <84.00%> (+2.02%) ⬆️
... and 4 more

@basnijholt requested a review from akhmerov September 14, 2022 05:17
@basnijholt merged commit 21fb3b6 into master Sep 19, 2022
@basnijholt deleted the as-pandas-dataframe branch September 19, 2022 22:37

4 participants