FEA allow any resampler in the BalancedBaggingClassifier #808


Conversation

@glemaitre (Member) commented Feb 16, 2021

closes #653

It allows implementing the following methods just by swapping the sampler (see the sketch after this list):

  • Over-Bagging
  • Under-Bagging
  • Under-Over-Bagging
  • SMOTE-Bagging
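
For instance, assuming the sampler parameter that this PR introduces, each variant above reduces to passing the matching resampler (a minimal sketch; the base estimator defaults to a decision tree):

from sklearn.datasets import make_classification

from imblearn.ensemble import BalancedBaggingClassifier
from imblearn.over_sampling import SMOTE, RandomOverSampler
from imblearn.under_sampling import RandomUnderSampler

X, y = make_classification(n_samples=1_000, weights=[0.9, 0.1], random_state=0)

# Under-Bagging: undersample the majority class in each bag.
under_bagging = BalancedBaggingClassifier(
    sampler=RandomUnderSampler(), random_state=0
).fit(X, y)

# Over-Bagging: oversample the minority class in each bag.
over_bagging = BalancedBaggingClassifier(
    sampler=RandomOverSampler(), random_state=0
).fit(X, y)

# SMOTE-Bagging: generate synthetic minority samples in each bag.
smote_bagging = BalancedBaggingClassifier(
    sampler=SMOTE(), random_state=0
).fit(X, y)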

TODO:

  • Add more details in documentation
  • Add reference to articles
  • Add an example
  • Add a small section in the user guide.
  • Add a test to ensure support for FunctionSampler by bypassing the sampling_strategy validation (see the sketch below).
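
A custom resampling function can then be plugged in through FunctionSampler, which exposes no sampling_strategy attribute — hence the validation bypass above. A hypothetical sketch (roughly_balanced_subsample is an illustrative helper, not part of the library):

import numpy as np
from sklearn.datasets import make_classification

from imblearn import FunctionSampler
from imblearn.ensemble import BalancedBaggingClassifier


def roughly_balanced_subsample(X, y):
    # Keep every minority sample and an equally sized random subset
    # of the majority class.
    rng = np.random.RandomState(42)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    minority_idx = np.flatnonzero(y == minority)
    majority_idx = np.flatnonzero(y != minority)
    picked = rng.choice(majority_idx, size=minority_idx.size, replace=False)
    keep = np.concatenate([minority_idx, picked])
    return X[keep], y[keep]


X, y = make_classification(n_samples=1_000, weights=[0.9, 0.1], random_state=0)

# FunctionSampler has no sampling_strategy, hence the TODO item about
# bypassing that validation inside BalancedBaggingClassifier.
clf = BalancedBaggingClassifier(
    sampler=FunctionSampler(func=roughly_balanced_subsample),
    random_state=0,
).fit(X, y)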

@glemaitre (Member Author) commented

I think we need to be careful with the bootstrap option, to make sure we don't draw some bootstrap samples twice.
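
For instance, a sampler configured with replacement=True already draws with replacement, so combining it with bootstrap=True would resample the same points twice. A sketch of the safer configuration, assuming the new sampler parameter:

from imblearn.ensemble import BalancedBaggingClassifier
from imblearn.under_sampling import RandomUnderSampler

# The sampler draws with replacement itself, so the bagging bootstrap
# is switched off to avoid drawing bootstrap samples twice.
clf = BalancedBaggingClassifier(
    sampler=RandomUnderSampler(replacement=True),
    bootstrap=False,
    random_state=0,
)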


codecov bot commented Feb 16, 2021

Codecov Report

Merging #808 (d1d9546) into master (76abfa4) will decrease coverage by 2.79%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #808      +/-   ##
==========================================
- Coverage   98.59%   95.80%   -2.79%     
==========================================
  Files          93       91       -2     
  Lines        6045     5959      -86     
  Branches      503      504       +1     
==========================================
- Hits         5960     5709     -251     
- Misses         84      192     +108     
- Partials        1       58      +57     
Impacted Files Coverage Δ
imblearn/ensemble/_bagging.py 96.15% <100.00%> (-1.68%) ⬇️
imblearn/ensemble/tests/test_bagging.py 100.00% <100.00%> (ø)
imblearn/keras/tests/test_generator.py 7.93% <0.00%> (-84.13%) ⬇️
imblearn/keras/_generator.py 55.71% <0.00%> (-41.43%) ⬇️
imblearn/tensorflow/tests/test_generator.py 65.59% <0.00%> (-34.41%) ⬇️
imblearn/utils/deprecation.py 80.00% <0.00%> (-20.00%) ⬇️
imblearn/tensorflow/_generator.py 83.87% <0.00%> (-16.13%) ⬇️
imblearn/datasets/_imbalance.py 76.47% <0.00%> (-11.77%) ⬇️
imblearn/over_sampling/_smote/filter.py 86.44% <0.00%> (-6.78%) ⬇️
imblearn/ensemble/_weight_boosting.py 92.13% <0.00%> (-5.62%) ⬇️
... and 13 more


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 76abfa4...f04dc13.


pep8speaks commented Feb 18, 2021

Hello @glemaitre! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 28:1: E402 module level import not at top of file
Line 39:1: E402 module level import not at top of file
Line 50:1: E402 module level import not at top of file
Line 51:1: E402 module level import not at top of file
Line 69:1: E402 module level import not at top of file
Line 70:1: E402 module level import not at top of file
Line 79:1: E402 module level import not at top of file
Line 97:1: E402 module level import not at top of file
Line 121:1: E402 module level import not at top of file
Line 122:1: E402 module level import not at top of file
Line 123:1: E402 module level import not at top of file

Line 31:1: E402 module level import not at top of file
Line 32:1: E402 module level import not at top of file
Line 50:1: E402 module level import not at top of file
Line 57:1: E402 module level import not at top of file
Line 58:1: E402 module level import not at top of file
Line 67:1: E402 module level import not at top of file
Line 68:1: E402 module level import not at top of file
Line 85:1: E402 module level import not at top of file
Line 86:1: E402 module level import not at top of file
Line 114:1: E402 module level import not at top of file
Line 134:1: E402 module level import not at top of file
Line 135:1: E402 module level import not at top of file
Line 182:1: E402 module level import not at top of file
Line 183:1: E402 module level import not at top of file

Line 251:13: W503 line break before binary operator

Comment last updated at 2021-02-18 11:24:39 UTC

@glemaitre mentioned this pull request Feb 18, 2021
@glemaitre merged commit aa1add3 into scikit-learn-contrib:master Feb 18, 2021
Development

Successfully merging this pull request may close these issues.

Add reference for BalancedBaggingClassifier