[ENH] Non-IID data distribution #931

yumetsuro · 2022-10-26T08:56:31Z


Is your feature request related to a problem? Please describe

Describe the solution you'd like

Non-IID data sometimes arise when training models on distributed devices: the amount of data is unbalanced across devices, and the label distribution differs from device to device as well. Federated Learning, for example, has plenty of such cases.

I want to propose an algorithm that takes some data and distributes it in a non-IID fashion. I had to do this for an experiment but couldn't find any general algorithm that does it, so I'm proposing to create one. I don't know if this is the right place.

Describe alternatives you've considered

Ideally, we take the data and the labels, and then distribute the data in one of two ways, or a mix of the two: unbalancing the amount of data in each sub-distribution, or unbalancing the labels within each sub-distribution.
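To make the idea concrete, here is a rough sketch of what such a splitter could look like. It is only an illustration, not an existing imblearn or scikit-learn API: the names `partition_non_iid`, `dirichlet`, `quantity_alpha` and `label_alpha` are made up for this example, and using Dirichlet-distributed shares to control the skew is an assumption on my side (one common choice, not the only one). Labels are assumed to be sortable values such as ints or strings.

```python
import random
from collections import defaultdict


def dirichlet(rng, n, alpha):
    """Draw n non-negative shares summing to 1 (symmetric Dirichlet via gamma draws)."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(n)]
    total = sum(draws)
    if total == 0.0:  # guard against underflow for very small alpha
        return [1.0 / n] * n
    return [d / total for d in draws]


def partition_non_iid(labels, n_parts, quantity_alpha=None, label_alpha=None, seed=0):
    """Split sample indices into n_parts subsets (hypothetical helper).

    quantity_alpha: skews how many samples each subset gets (None = equal shares).
    label_alpha: skews the label mix of each subset (None = same mix everywhere).
    Smaller alpha values give a stronger skew.
    """
    rng = random.Random(seed)
    if quantity_alpha is None:
        size_share = [1.0 / n_parts] * n_parts
    else:
        size_share = dirichlet(rng, n_parts, quantity_alpha)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    parts = [[] for _ in range(n_parts)]
    for cls in sorted(by_class):
        idx = by_class[cls]
        rng.shuffle(idx)
        if label_alpha is None:
            share = size_share
        else:
            # Combine the per-subset size share with a per-class skew.
            skew = dirichlet(rng, n_parts, label_alpha)
            weighted = [s * k for s, k in zip(size_share, skew)]
            total = sum(weighted)
            share = [w / total for w in weighted]
        # Slice the shuffled class indices at the cumulative share boundaries.
        start, cum = 0, 0.0
        for p in range(n_parts):
            cum += share[p]
            end = len(idx) if p == n_parts - 1 else min(len(idx), int(cum * len(idx)))
            parts[p].extend(idx[start:end])
            start = end
    return [sorted(p) for p in parts]


labels = [0] * 100 + [1] * 100 + [2] * 100
parts = partition_non_iid(labels, n_parts=4, quantity_alpha=0.5, label_alpha=0.5, seed=1)
for k, p in enumerate(parts):
    counts = {c: sum(labels[i] == c for i in p) for c in (0, 1, 2)}
    print(k, len(p), counts)
```

Setting only `quantity_alpha` unbalances the subset sizes, setting only `label_alpha` unbalances the labels, and setting both gives the mix of the two. Every sample ends up in exactly one subset.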

Additional context

hayesall · 2022-11-07T17:38:29Z

This question is a little too broad at the moment; we mainly focus on extensions and issues for imblearn here.

More-general Q&A forums for machine learning topics (e.g. https://stats.stackexchange.com/) might be a better fit for this.

Follow up in the future if there's a good way to approach this. #105 is also where notes on new methods are currently tracked.

hayesall closed this as completed Nov 7, 2022

hayesall added the Type: Question label Nov 7, 2022
