-
Notifications
You must be signed in to change notification settings - Fork 1.9k
F# Records not compatible with ML .NET #180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
should this be part of unittests and samples? How can it bind to backing fields in the first place!? |
The "backing fields" bit is an assumption on my part - normally Having said that, if we want a truly F#-friendly API then even |
@asthana86 This is a common issue when APIs are designed for use with C# and use e.g. reflection. I'm not sure if @cartermp will have any answers here - you probably need to look into the ML .NET codebase to see where and how it does the hydration. |
Assuming F# records are just producing get/set properties it’s not even just a problem for F# — public fields aren’t canonical C# code either. We gave the same feedback in the API review but it didn’t make the cut yet. Presumably #254 would fix that then. |
In practice, the trick for something like working with an ORM or equivalent piece of machinery that wants to do things to properties with reflection is ye olde [<CLIMutable>]
type Hoopty =
{ Doopty: int } Which effectively turns it into a property with So I would expect #254 to fix this. That said, it would be nice if we didn't need to set that somehow. |
@terrajobst They produce getter-only properties (with backing fields that are generated with a different convention to C# backing fields). You can avoid the need for |
I see. Supporting immutable types might be harder in the engine. @glebuk would know for sure. |
I would hope that the library doesn't modify data that you supply to it! :-) |
To be honest something else we discussed was to allow the caller to supply data to the library rather than the library hydrating data itself. This is kind of a prerequisite for in memory data support anyway. At that point it should not matter whether it's a record or not. But that's kind of a separate thing. |
Even if it was "harder" I don't understand why you couldn't construct a mutable from the record, then construct a new record from the final result of the mutation. record -> mutable ~> mutable -> record Creating a kind of "sh#t sandwich" . No judgement intended it's just a euphemism for wrapping something you don't want, with two things you do. |
@voronoipotato that would probably be even more work as it would still mean creating initial records with default values for all the fields and then modifying them later manually by creating new records. I think either use the CLIMutable approach, which is probably easier but sub-optimal, or create it based on the ctor. |
#254 is definitely the underlying problem here. @asthana86 Who is the expert on the schema mapping code? If we wanted to allow get/set properties (rather than just public fields) as the basis for schema mappings then who should we talk to? Thanks |
@terrajobst re this:
I could go ahead and do the work to allow get/set properties but I'd need it carefully reviewed. Do you know who is the right person in the ML.NET team to talk to about this? Thanks |
I searched for uses of "GetFields" and here's what I see: Relevant:
Not relevant:
|
Any chance of getting ctor support as well to avoid the need for CLIMutable? |
@isaacabraham I'm not sure. First things first. |
@isaacabraham Some notes on immutable F# records: For immutable records, a schema mapper or serialization framework really has no choice but to work with the "internal" representation of the record, i.e. the hidden, private fields. This is what binary serialization does, for example. This is because there are no setter properties on those immutable records, so there's really no other choice: the IDataView schema mapper works by setting fields one at a time on target objects and there is no way we could convert it to use the immutable object constructor for example. However inputs to ML.NET currently uses field/property names in string form, e.g.
So, if the ML.NET IDataView schema mapper is dealing with internal fields (e.g. private field
It's possible but F# emits no specific metadata about this relationship, so the rule would have to be F#-specific. It's a stretch but not impossible. However would require more subtle coding than I'm willing to try for now, so CLIMutable is a reasonable stepping point. |
Indeed, CLIMutable is a great step forward. But constructor support seems like it's a no-go because of how schema mapping in ML .NET works internally i.e. setting fields one at a time? Which, as you say, leaves just some custom "F# field mapping" based on convention as the only way to remove CLIMutable. It's definitely not the end of the world leaving it there, but it would be nice to not require it. |
Fixed in #616 |
type SentimentData =
{ [<Column(ordinal = "0")>] SentimentText : string
[<Column(ordinal = "1", name = "Label")>] Sentiment : float }
[<CLIMutable>]
type SentimentPrediction =
{ [<ColumnName "PredictedLabel">] Sentiment : bool } To me this also looks monstrous. Magic strings everywhere, even ints represented as strings, loss of compiler helpfulness with the reliance on annotations. Can this be done in ML.Net in a way that is not reflection-based? |
It's been a while since I looked at this but can you now actually access ML net without being coupled to the data access layer? If so, that would ideally remove all this stuff and we could just worry about that ourselves. |
Two ways to do this! |
Perhaps related to #92. Apparently ML .NET doesn't work with F# records! Is this perhaps to do with the fact that F# Records use a different naming convention to Classes for backing fields, and the ML .NET Library. If you use records, the library simply can't find the columns and complains of missing columns. If you port from records to classes with mutable properties, it doesn't fail with that error any longer.
However, mutable classes are pretty much non-idiomatic in F# - no one uses them for the sorts of data-bound workloads that you'll see with ML.
Instead of this:
You'll have to use something like this monstrosity.
You can't even use the
member val
shorthand syntax that F# provides for mutable get / set properties since theColumnName
attribute doesn't work with them. Plus, you lose all the standard features that Records bring such as lightweight syntax, easy creation, copy-and-update, immutability and structural equality.I strongly recommend adding support for them by ensuring that whatever internal hydration logic that is currently coupled to classes supports records as well.
The text was updated successfully, but these errors were encountered: