Calculated Feature #595

Closed
markus-renezeder opened this issue Jul 28, 2018 · 8 comments
Labels
API Issues pertaining the friendly API


@markus-renezeder

Hello,

Is there a way to add a calculated feature?
I receive the data in a CSV file, which is loaded with a TextLoader, and I cannot change the contents of the file. For my model I would like to use a calculated feature (e.g. NewFeature = Feature1 + Feature2) or other calculations such as manual binning (e.g. if Feature1 < 10 then NewFeature = 1, else if Feature1 >= 10 and Feature1 < 20 then NewFeature = 2, else NewFeature = 3).

Azure Machine Learning Studio has modules for this kind of work, e.g. SQL Transformation, R scripts ...

@WladdGorshenin

Hi, I'm also curious how I can convert a date field into a set of new fields such as year, month, day_of_month, day_of_week, etc.

@AbhiOnGithub

For dynamic fields, use .NET dictionaries: Dictionary<"YourNewField", Value>

@Zruty0
Contributor

Zruty0 commented Jul 30, 2018

Hi @gironymo,

Your case and @WladdGorshenin's are perfect examples of why we want to make https://github.com/dotnet/machinelearning/blob/master/src/Microsoft.ML.Api/MapTransform.cs available through the user API.

When that is done, you will be able to add your own transformations to the pipeline and mix and match them with the standard ones provided by ML.NET.

Right now, even though MapTransform is available as a public class, you would have to rewrite your entire experiment to use the low-level API (and not LearningPipeline) in order to use it.

Once we resolve #371 and #581, we will reconcile the two.

Zruty0 added the API (Issues pertaining the friendly API) label Jul 30, 2018

@WladdGorshenin

Hi @AbhiOnGithub could you please elaborate a bit more?

@Zruty0
Contributor

Zruty0 commented Aug 1, 2018

@WladdGorshenin , the comment from @AbhiOnGithub seems to me more like a feature request.

We definitely do not support dictionaries as sources of data right now. Nor do we have plans to do this in the near future, unless there is a compelling argument that outweighs the perf and data consistency implications of this potential feature.

@WladdGorshenin

@Zruty0 is there any workaround?

@Zruty0
Contributor

Zruty0 commented Aug 2, 2018

@WladdGorshenin I assume you are referring to this:

> how can I convert a date field to a set of new fields like: year, month, day_of_month, day_of_week, etc?

My best recommendation at the moment would be to perform this pre-processing before passing the object to ML.NET.
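
As a rough illustration of that kind of pre-processing, here is a minimal sketch in plain C# with no ML.NET calls. The type and member names (`DateFeatures`, `DatePreprocessing.Expand`) and the `yyyy-MM-dd` input format are assumptions made for the example; they do not come from ML.NET or from this thread.

```csharp
using System;
using System.Globalization;

// Hypothetical row type holding the expanded date parts as floats.
public class DateFeatures
{
    public float Year;
    public float Month;
    public float DayOfMonth;
    public float DayOfWeek;
}

public static class DatePreprocessing
{
    // Parses a date string (assumed format: yyyy-MM-dd) and expands it
    // into numeric columns before the data is handed to the learner.
    public static DateFeatures Expand(string rawDate)
    {
        DateTime d = DateTime.ParseExact(rawDate, "yyyy-MM-dd", CultureInfo.InvariantCulture);
        return new DateFeatures
        {
            Year = d.Year,
            Month = d.Month,
            DayOfMonth = d.Day,
            DayOfWeek = (float)d.DayOfWeek // Sunday = 0 ... Saturday = 6
        };
    }
}
```

Each raw row would be passed through `Expand` once, and the resulting numeric fields used as ordinary feature columns.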

Actually, now that we're merging #616 , you should be able to get away with something like this:

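        using System;

        public class MyDataRow
        {
            // Raw timestamp; it has to be populated when the row is built (not shown here).
            private DateTime _dateTime;

            // Derived features exposed as float properties.
            // The setters are stubs that throw; only the getters are used in this sketch.
            public float Day { get { return _dateTime.Day; } set { throw new NotImplementedException(); } }
            public float DayOfWeek { get { return (float)_dateTime.DayOfWeek; } set { throw new NotImplementedException(); } }
            // etc
        }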
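
The same property-based idea could be applied to the calculated features from the original question (a sum and a manual binning). This is only a sketch: `MyFeatureRow`, `Sum`, and `Bin` are hypothetical names, and whether a given ML.NET version reads computed properties like these is an assumption that depends on #616.

```csharp
using System;

// Hypothetical row type; Feature1 and Feature2 stand in for columns from the CSV file.
public class MyFeatureRow
{
    public float Feature1;
    public float Feature2;

    // NewFeature = Feature1 + Feature2
    public float Sum
    {
        get { return Feature1 + Feature2; }
        set { throw new NotSupportedException(); }
    }

    // Manual binning: below 10 gives 1, from 10 up to (not including) 20 gives 2, otherwise 3.
    public float Bin
    {
        get
        {
            if (Feature1 < 10f) return 1f;
            if (Feature1 < 20f) return 2f;
            return 3f;
        }
        set { throw new NotSupportedException(); }
    }
}
```

The throwing setters mirror the snippet above; the calculated values are derived on read and are never assigned directly.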

@Zruty0
Contributor

Zruty0 commented Aug 9, 2018

I think this should be closed. Feel free to reopen to add more comments.

Zruty0 closed this as completed Aug 9, 2018
ghost locked as resolved and limited conversation to collaborators Mar 29, 2022