Description
With work underway on Tensor, and the new perf optimizations available via Span, I think it's time the .NET team seriously considered adding a DataFrame type. Working with large amounts of data has become critical in many applications these days, and is the reason libraries like Pandas for Python have been so successful. It's time to bring similar capabilities to .NET, and it needs to start with the framework itself adding a DataFrame type.
DataFrame libraries for .NET do exist, but each has its own DataFrame implementation that is minimally compatible with the BCL or with any of the other libraries. In my experience, support for many of these libraries is also pretty weak.
I think the BCL team implementing its own DataFrame is the first and most important step toward improving the state of working with data in .NET. All we have right now is DataTable, a powerful but ancient type that is not well optimized for many scenarios and that I've seen .NET team members refer to as "legacy" on various occasions.
I'm creating this issue to gather opinions on the topic. I don't have a specific API to propose at this time, as gauging interest first is probably more important.
Let's make .NET as great for data analysis as any other platform.