Skip to content

Bindings in ChooseColumnsByIndexTransform not ISchema #1879

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Dec 17, 2018

Conversation

wschin
Copy link
Member

@wschin wschin commented Dec 14, 2018

This PR is a part of #1501. We refactorize the Binding in ChooseColumnsByIndexTransform by making it not an ISchema but still maintaining necessary functionalities for connecting input and output. For the functionalities remained, please see the non-private member functions of Bindings. Some comments are added for a better readability.

@wschin wschin self-assigned this Dec 14, 2018
@wschin wschin requested a review from codemzs December 14, 2018 17:14
verWrittenCur: 0x00010001, // Initial
verReadableCur: 0x00010001,
verWeCanReadBack: 0x00010001,
verWrittenCur: 0x00010002,
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Dec 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0x00010002 [](start = 31, length = 10)

I would say, if you can preserve old saving mechanism, try to preserve it.
Do you have any particular reason to switch to new format? #Closed

Copy link
Member Author

@wschin wschin Dec 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This format has been changed previously. You can see that the old doc string describing the binary format doesn't match what previously stored.


In reply to: 241838591 [](ancestors = 241838591)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've check internal repo, and format is same as in code you changing.


In reply to: 241843452 [](ancestors = 241843452,241838591)

// bool (as byte): operation mode
// int[]: selected source column indices
_drop = ctx.Reader.ReadBoolByte();
_sources = ctx.Reader.ReadIntArray();
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Dec 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This look suspicious.

So imagine I create transform with params [0,2,3] and drop:false
And we do it for two Schemas, one with 4 columns, one with 5.

result for both of them would be schema with 3 columns [0,2,3].

now let's work with [0,2,3] and drop:true

I pass schema with 4 columns to this transform and save it.
we will save [1] in source.

Now I load this transform and pass schema with 5 columns.
result would be same schema with [1] column.

But if I start on 5 columns schema I will have [1,4] and if I pass that saved transform schema with 4 columns, i'm screwed.

So please, preserve old logic and respect drop parameter.
#Closed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also would be nice if you throw some test coverage for this transform.

  [TestMethod, TestCategory("DataPipeSerialization")]
        public void SavePipeChooseColumnsByIndex()
        {
            TestCore(null, true,
                new[] {
                    "loader=Text{",
                    "  col=Label:U1[0-1]:0",
                    "  col=Features:U2:1-*",
                    "  col=A:U1[1-5]:1",
                    "  col=B:U1[3-8]:2",
                    "}",
                    "xf=ChooseColumnsByIndex{ind=2 ind=0}"
                });

            Done();
        }

        [TestMethod, TestCategory("DataPipeSerialization")]
        public void SavePipeChooseColumnsByIndexDrop()
        {
            TestCore(null, true,
                new[] {
                    "loader=Text{",
                    "  col=Label:U1[0-1]:0",
                    "  col=Features:U2:1-*",
                    "  col=A:U1[1-5]:1",
                    "  col=B:U1[3-8]:2",
                    "}",
                    "xf=ChooseColumnsByIndex{ind=3 ind=0 drop}"
                });

            Done();
        }


In reply to: 241891989 [](ancestors = 241891989)

Copy link
Member Author

@wschin wschin Dec 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is true? There is no TestCore with prototype TestCore(string, bool, ...). The cloest example I found about TestCore is in TestCommandBase.cs but there is no TestMethod attribute. Is that attribute needed?
[Update] I guess those invalid tests are just examples. I will create my own.


In reply to: 241892330 [](ancestors = 241892330,241891989)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are added and old logic is back.


In reply to: 241903270 [](ancestors = 241903270,241892330,241891989)

Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🕐

@wschin wschin force-pushed the another-not-ischema branch from 99407e4 to a8f16d4 Compare December 16, 2018 20:56
@wschin wschin requested a review from shauheen December 16, 2018 23:22
@wschin
Copy link
Member Author

wschin commented Dec 16, 2018

@shauheen, I added some inline comments to this PR. They doesn't only contain information about each attribute but also describe how different attributes work together. Feel free to comment if you found something not understandable. Thanks.

Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

Copy link
Contributor

@Zruty0 Zruty0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@wschin wschin merged commit 0b35b89 into dotnet:master Dec 17, 2018
@wschin wschin deleted the another-not-ischema branch December 17, 2018 18:15
@ghost ghost locked as resolved and limited conversation to collaborators Mar 26, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants