
[ENH] Non-IID data distribution #931


Closed
yumetsuro opened this issue Oct 26, 2022 · 1 comment

Comments

@yumetsuro


Is your feature request related to a problem? Please describe

Describe the solution you'd like

Non-IID data often arise when training models across distributed devices: the amount of data is unbalanced across devices, and the label distribution differs from device to device as well. Federated Learning, for example, has plenty of such cases.


I want to propose an algorithm that takes some data and distributes it in a non-IID fashion. I had to do this for an experiment, but I didn't find any general algorithm that does it, so I'm proposing to create one. I don't know if this is the right place.

Describe alternatives you've considered

Ideally, we take the data and the labels, and then distribute the data in one of two ways, or a mix of the two: unbalancing the amount of data across the sub-distributions, or unbalancing the labels within each sub-distribution.
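To make the two kinds of skew concrete, here is a minimal sketch in plain Python. It is not part of imblearn or any other library: `partition_non_iid`, `label_alpha`, and `size_alpha` are hypothetical names made up for this illustration, and drawing the proportions from a Dirichlet distribution is an assumption (it is a common choice in Federated Learning experiments, but nothing above prescribes it). A smaller alpha gives a stronger skew; leaving both at `None` falls back to an even, stratified split.

```python
import random
from collections import defaultdict


def dirichlet(alpha, k, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) over k categories."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    if total == 0.0:  # extremely small alpha can underflow; fall back to uniform
        return [1.0 / k] * k
    return [d / total for d in draws]


def split_by_proportions(indices, proportions):
    """Cut a list of indices into consecutive chunks sized by proportions."""
    n = len(indices)
    chunks, start, cumulative = [], 0, 0.0
    for i, p in enumerate(proportions):
        cumulative += p
        # The last chunk takes whatever is left, so no index is dropped.
        end = n if i == len(proportions) - 1 else min(n, round(cumulative * n))
        chunks.append(indices[start:end])
        start = end
    return chunks


def partition_non_iid(labels, n_parts, label_alpha=None, size_alpha=None, seed=0):
    """Return n_parts lists of sample indices (hypothetical helper).

    label_alpha: if set, skew the class mix of each part.
    size_alpha:  if set, skew the number of samples per part.
    """
    rng = random.Random(seed)
    uniform = [1.0 / n_parts] * n_parts
    # One set of size weights shared by all classes (quantity skew).
    size_w = dirichlet(size_alpha, n_parts, rng) if size_alpha else uniform

    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    parts = [[] for _ in range(n_parts)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Fresh weights per class (label skew), combined with the size weights.
        label_w = dirichlet(label_alpha, n_parts, rng) if label_alpha else uniform
        mixed = [a * b for a, b in zip(label_w, size_w)]
        total = sum(mixed) or 1.0
        props = [m / total for m in mixed]
        for part, chunk in zip(parts, split_by_proportions(idxs, props)):
            part.extend(chunk)
    return parts


if __name__ == "__main__":
    toy_labels = [i % 3 for i in range(300)]  # three balanced classes
    for part in partition_non_iid(toy_labels, 5, label_alpha=0.5, size_alpha=0.5):
        counts = [sum(1 for i in part if toy_labels[i] == c) for c in range(3)]
        print(len(part), counts)
```

The function returns index lists rather than copies of the data, so the same partition can be applied to both the features and the labels. Every sample lands in exactly one part, but some parts may end up empty when alpha is small.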

Additional context

@hayesall
Member

hayesall commented Nov 7, 2022

This question is a little too broad at the moment; we mainly focus on extensions and issues for imblearn here.

More-general Q&A forums for machine learning topics (e.g. https://stats.stackexchange.com/) might be a better fit for this.

Feel free to follow up in the future if there's a good way to approach this. #105 is also where notes on new methods are currently tracked.
