Skip to content

[WIP] TT Template for managing samples... #3017

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

zeahmed
Copy link
Contributor

@zeahmed zeahmed commented Mar 19, 2019

As we are going along with making sample standalone, I have been observing that there is a lot of code duplication in the sample. I am proposing to use tt template for managing our samples.

The code in this PR shows how to make a template for samples to reduce the copy-paste of common code as well as minimize the code to review every time someone pushes a sample.

Please do not merge this PR. I have already tagged it as WIP.


namespace Microsoft.ML.Samples.Dynamic.Trainers.Regression
{
public static class FastTree2
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is auto generated from FastTreeTemplate.tt and RegressionTemplate.txt.

var trainingData = mlContext.Data.LoadFromEnumerable(examples);

// Define the trainer.
<#=TrainingCode#>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything in <# ... #> is a placeholder.

<#@ include file="RegressionTemplate.txt"#>

<#+
string ClassName="FastTree2";
Copy link
Contributor Author

@zeahmed zeahmed Mar 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have identified these placeholder in regression samples.

using System.Linq;
using Microsoft.ML.Data;

namespace Microsoft.ML.Samples.Dynamic.Trainers.Regression
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a txt, do we get Intellisence in VS?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its is a stub code (an agreed upon template) e.g. I took it from Shahab's samples for regression and made it as template. You wont want get Intellisence for it but you get Intellisence for the generated file like FastTreeTemplate.cs.


In reply to: 267094827 [](ancestors = 267094827)

Copy link
Contributor

@rogancarr rogancarr Mar 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if managing lots of txt files will be more work than standardizing files directly. I'm on the fence.

For example, it's really hard to get indentation correct in txt files.


In reply to: 267099333 [](ancestors = 267099333,267094827)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there will be one file for each ML task we have e.g. Regression, Classification etc. and that also in case when we have multiple samples using same pattern and one .tt for each sample. The .cs file will be auto generated.

So, this is how it will work. Create a sample in a normal way using .cs file. Once the code is agreed upon and there are a couple of places the same pattern is used then turn that into .tt file and make placeholders in it. Also, if there is a problem in template, the generated csharp file will have the same problem as the template file. Any indentation errors etc. will definitely be caught in the review (or even during development).

I think t4 template is just a way of automatically doing copy-paste.


In reply to: 267100884 [](ancestors = 267100884,267099333,267094827)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think in the long term having templates like this will improve maintainability and consistency; also to me it's easier to to than copy paste.


In reply to: 267112240 [](ancestors = 267112240,267100884,267099333,267094827)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also I think we can get away with having to hard code the outputs and have the template calculate and print it directly. that would simplify the .tt files substantially. I'm working on that in my new PR.


In reply to: 267114254 [](ancestors = 267114254,267112240,267100884,267099333,267094827)

Copy link
Member


<ItemGroup>
<Service Include="{508349b6-6b84-4df5-91f0-309beebad82d}" />
</ItemGroup>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure! I added .tt files and VS studio did it automatically. Maybe @eerhardt can tell us.

@@ -35,4 +35,23 @@

</ItemGroup>

<ItemGroup>
<None Update="Dynamic\Trainers\Regression\FastTreeTemplate.tt">

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FastTreeTemplate.tt" [](start = 46, length = 20)

is this Runtime Text Template or Text Template? At what point does it run?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be more specific, it is design (or compile) time template. In VS studio, FastTreeTemplate.cs is generated from this file as soon as the .tt file is saved (Ctrl + S) in the VS.


In reply to: 267099394 [](ancestors = 267099394)

<DesignTime>True</DesignTime>
<AutoGen>True</AutoGen>
<DependentUpon>FastTreeTemplate.tt</DependentUpon>
</Compile>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you add these manually or through VS UI?

@codecov
Copy link

codecov bot commented Mar 19, 2019

Codecov Report

Merging #3017 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #3017      +/-   ##
==========================================
- Coverage   72.33%   72.33%   -0.01%     
==========================================
  Files         803      803              
  Lines      143424   143424              
  Branches    16169    16169              
==========================================
- Hits       103752   103744       -8     
- Misses      35250    35256       +6     
- Partials     4422     4424       +2
Flag Coverage Δ
#Debug 72.33% <ø> (-0.01%) ⬇️
#production 68.04% <ø> (-0.01%) ⬇️
#test 88.53% <ø> (+0.01%) ⬆️
Impacted Files Coverage Δ
...c/Microsoft.ML.FastTree/Utils/ThreadTaskManager.cs 79.48% <0%> (-20.52%) ⬇️
src/Microsoft.ML.Transforms/Text/LdaTransform.cs 89.26% <0%> (-0.63%) ⬇️
...soft.ML.Data/DataLoadSave/Text/TextLoaderCursor.cs 84.9% <0%> (+0.2%) ⬆️
...soft.ML.TestFramework/DataPipe/TestDataPipeBase.cs 74.03% <0%> (+0.33%) ⬆️

Copy link

@shmoradims shmoradims left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@zeahmed
Copy link
Contributor Author

zeahmed commented Mar 21, 2019

Closing this PR because we already have solution incorporated in #3035.

@zeahmed zeahmed closed this Mar 21, 2019
@ghost ghost locked as resolved and limited conversation to collaborators Mar 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants