Open question: is the DataFrame no longer supported in the current version of the machinelearning SDK? #6138

Closed
ness001 opened this issue Mar 24, 2022 · 6 comments
Assignees
Labels
Microsoft.Data.Analysis All DataFrame related issues and PRs needs-author-action P3 Doc bugs, questions, minor issues, etc. question Further information is requested
Milestone

Comments

@ness001

ness001 commented Mar 24, 2022

System information

  • OS version/distro:
  • .NET Version (eg., dotnet --info):

Issue

  • What did you do?
  • What happened?
  • What did you expect?
    I couldn't find a reference for Microsoft.dataAnalytics on the reference page. Also, the code base has not changed in 21 months.

Source code / logs

Please paste or attach the code or logs or traces that would be helpful to diagnose the issue you are reporting.

@ghost ghost added the untriaged New issue has not been triaged label Mar 24, 2022
@luisquintanilla luisquintanilla self-assigned this Mar 29, 2022
@luisquintanilla luisquintanilla added question Further information is requested P3 Doc bugs, questions, minor issues, etc. Microsoft.Data.Analysis All DataFrame related issues and PRs and removed untriaged New issue has not been triaged labels Mar 29, 2022
@luisquintanilla luisquintanilla added this to the ML.NET Future milestone Mar 29, 2022
@luisquintanilla
Contributor

Hi @ness001

Thanks for raising this issue. In short, the project is not dead, just not under active development at this time. We're evaluating the data preparation / data wrangling story as outlined in the roadmap. We are using issues and feature requests like these to inform common uses, asks, and pain points with the existing API. They are going to help frame our investigations and prioritize our efforts.

What is your current experience with the Microsoft.Data.Analysis (DataFrame) API? Are there specific issues you're running into?

@ghost

ghost commented Mar 29, 2022

This issue has been marked needs-author-action since it may be missing important information. Please refer to our contribution guidelines for tips on how to report issues effectively.

@MgSam
Contributor

MgSam commented Mar 30, 2022

@luisquintanilla The existing API is not complete, is it? There are tons of DataFrame-related issues filed in this repo since you guys released the first version in 2019.

I don't understand why you guys abandoned this project. .NET needs a good DataFrame library to be competitive with other ecosystems. You cannot do effective data analysis without a DataFrame. Releasing an alpha version 3 years ago and promptly abandoning it is not acceptable. If you're looking purely at usage stats, it's not going to get usage until it's in a strong, usable state.

@luisquintanilla
Contributor

@MgSam that is correct, the API is not complete yet and is still in preview. As you mentioned, it was released in 2019 as an experiment under the corefxlab repo. Last year it was moved to the ML.NET repo because we understand that the DataFrame has a large role to play, especially when it comes to exploratory data analysis and data wrangling. However, there were already ongoing workstreams in other areas, and the DataFrame API was not the highest priority at the time. That does not mean that the project is abandoned. Considering how important data preparation is to the machine learning process, the DataFrame is part of our roadmap and we'd like the community's help to make it the best it can be. I'm in the process of prioritizing the issues you mentioned and am tracking them in this issue. Thanks again for the feedback. As mentioned, I'd encourage adding further comments and concerns to issue #6144 and continuing the discussion there.

@MgSam
Contributor

MgSam commented Mar 30, 2022

With all due respect, if Microsoft was serious about it they should have developers dedicated to its development. There is no reason it cannot be developed in parallel with ML.NET. It's useful for ML, yes, but its use is far wider than that and is totally orthogonal to the main things ML.NET does.

It shouldn't be a hobby or side project for when you guys have time (which will be half-past-never, as we all know newer priorities always come up). Moving it to ML.NET seemed like a good step, except that it's been just as ignored here as it was in corefxlab.

The net result of all this is that I, and I'm sure many others, would love to take a dependency on an official .NET DataFrame in my projects but cannot do so because of the preview status and the increasing likelihood that it will be officially abandoned. I've been around the MS ecosystem long enough to see what happens to projects that get "paused" at Microsoft: I've never once seen such a project get restarted again.

@michaelgsharp
Member

Closing this issue as everything is being tracked in #6144 now.

@ghost ghost locked as resolved and limited conversation to collaborators May 12, 2022
4 participants