Is your feature request related to a problem? Please describe
Describe the solution you'd like
Non-IID data often arise when training models on distributed devices: the amount of data is unbalanced across devices, and the label distribution differs from device to device as well. Such data are common in Federated Learning, for example.
I would like to propose an algorithm that takes a dataset and distributes it in a non-IID fashion. I needed this for an experiment but could not find any general algorithm that does it, so I am proposing to create one. I am not sure whether this is the right place for it.
Describe alternatives you've considered
Ideally, we take the data and the labels and then distribute the data in one of two ways, or a mix of the two: unbalancing the amount of data in each sub-distribution, or unbalancing the labels within each sub-distribution.
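To make the idea concrete, here is a rough sketch of what such a splitter could look like. Everything in it is an assumption for illustration: the name `partition_non_iid`, its parameters, and the use of Dirichlet sampling to produce the skew are my own choices, not an existing API. A small `label_alpha` skews the labels within each part, and setting `size_alpha` additionally skews how much data each part receives.

```python
import numpy as np

def partition_non_iid(labels, n_parts, label_alpha=0.5, size_alpha=None, seed=0):
    """Hypothetical helper: split sample indices into n_parts non-IID subsets."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)

    # Quantity skew: relative size of each part (equal sizes if size_alpha is None).
    if size_alpha is None:
        size_weights = np.full(n_parts, 1.0 / n_parts)
    else:
        size_weights = rng.dirichlet(np.full(n_parts, size_alpha))

    parts = [[] for _ in range(n_parts)]
    for cls in np.unique(labels):
        # Shuffle the indices of this class before cutting them up.
        idx = rng.permutation(np.flatnonzero(labels == cls))
        # Label skew: share of this class assigned to each part,
        # reweighted by the part sizes and renormalized.
        shares = rng.dirichlet(np.full(n_parts, label_alpha)) * size_weights
        shares /= shares.sum()
        # Turn cumulative shares into cut positions inside idx.
        cuts = (np.cumsum(shares)[:-1] * len(idx)).astype(int)
        for part, chunk in zip(parts, np.split(idx, cuts)):
            part.extend(chunk.tolist())

    return [np.array(part, dtype=int) for part in parts]


# Usage: 3 classes with 20 samples each, split across 4 simulated devices.
y = np.repeat(np.arange(3), 20)
parts = partition_non_iid(y, n_parts=4, label_alpha=0.3, size_alpha=1.0)
```

Each returned array holds the sample indices for one sub-distribution, so `X[parts[i]]` and `y[parts[i]]` would give the data for device `i`. Every sample is assigned to exactly one part, and some parts may end up empty when the concentration parameters are very small.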
Additional context