Bindings in ChooseColumnsByIndexTransform not ISchema #1879
verWrittenCur: 0x00010001, // Initial
verReadableCur: 0x00010001,
verWeCanReadBack: 0x00010001,
verWrittenCur: 0x00010002,
0x00010002
I would say that if you can preserve the old saving mechanism, you should.
Do you have a particular reason to switch to a new format? #Closed
This format has been changed previously. You can see that the old doc string describing the binary format doesn't match what was previously stored.
In reply to: 241838591
I've checked the internal repo, and the format is the same as in the code you are changing.
In reply to: 241843452
// bool (as byte): operation mode
// int[]: selected source column indices
_drop = ctx.Reader.ReadBoolByte();
_sources = ctx.Reader.ReadIntArray();
This looks suspicious.
Imagine I create the transform with params [0,2,3] and drop:false,
and apply it to two schemas, one with 4 columns and one with 5.
The result for both of them would be a schema with 3 columns, [0,2,3].
Now let's work with [0,2,3] and drop:true.
I pass a schema with 4 columns to this transform and save it.
We will save [1] in sources.
Now I load this transform and pass it a schema with 5 columns.
The result would be the same schema with only column [1], even though column 4 should also be kept.
But if I start with the 5-column schema, I will have [1,4] saved, and if I pass that saved transform a schema with 4 columns, I'm screwed, because column 4 does not exist there.
So please preserve the old logic and respect the drop parameter.
#Closed
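To make the scenario above concrete, here is a minimal sketch of resolving the kept columns at bind time from the user's indices, the drop flag, and the width of the incoming schema. `ChooseColumnsSketch` and `ResolveSources` are hypothetical names used only for illustration; this is not the ML.NET implementation.

```csharp
using System;
using System.Linq;

public static class ChooseColumnsSketch
{
    // Hypothetical helper: resolves the user's indices against a concrete
    // input schema that has inputColumnCount columns.
    public static int[] ResolveSources(int[] selected, bool drop, int inputColumnCount)
    {
        if (!drop)
            return selected.ToArray(); // keep mode: indices are used as given

        // Drop mode: keep every input column that is NOT listed.
        return Enumerable.Range(0, inputColumnCount)
            .Where(i => !selected.Contains(i))
            .ToArray();
    }

    public static void Main()
    {
        var selected = new[] { 0, 2, 3 };
        // Same parameters, different input widths, different kept columns.
        Console.WriteLine(string.Join(",", ResolveSources(selected, true, 4))); // prints "1"
        Console.WriteLine(string.Join(",", ResolveSources(selected, true, 5))); // prints "1,4"
    }
}
```

Because the drop-mode result depends on the input width, persisting the already-resolved list ties the saved transform to the schema it was first bound to, whereas persisting the raw indices plus the flag lets it be re-resolved on load.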
It would also be nice to add some test coverage for this transform.
[TestMethod, TestCategory("DataPipeSerialization")]
public void SavePipeChooseColumnsByIndex()
{
    TestCore(null, true,
        new[] {
            "loader=Text{",
            " col=Label:U1[0-1]:0",
            " col=Features:U2:1-*",
            " col=A:U1[1-5]:1",
            " col=B:U1[3-8]:2",
            "}",
            "xf=ChooseColumnsByIndex{ind=2 ind=0}"
        });
    Done();
}

[TestMethod, TestCategory("DataPipeSerialization")]
public void SavePipeChooseColumnsByIndexDrop()
{
    TestCore(null, true,
        new[] {
            "loader=Text{",
            " col=Label:U1[0-1]:0",
            " col=Features:U2:1-*",
            " col=A:U1[1-5]:1",
            " col=B:U1[3-8]:2",
            "}",
            "xf=ChooseColumnsByIndex{ind=3 ind=0 drop}"
        });
    Done();
}
In reply to: 241891989
What is true? There is no TestCore with prototype TestCore(string, bool, ...). The closest example of TestCore I found is in TestCommandBase.cs, but there is no TestMethod attribute there. Is that attribute needed?
[Update] I guess those invalid tests are just examples. I will create my own.
In reply to: 241892330
Tests are added and old logic is back.
In reply to: 241903270
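For readers following the thread, the layout named in the code comments quoted earlier (a bool stored as a byte, then an int array) can be sketched as a round trip with the standard `BinaryWriter` and `BinaryReader`. This assumes the int array is written as a 32-bit length followed by its elements; the thread does not spell that out, and `ChooseColumnsFormatSketch`, `Save`, and `Load` are hypothetical names rather than the ML.NET `ctx.Writer`/`ctx.Reader` code.

```csharp
using System;
using System.IO;
using System.Linq;

public static class ChooseColumnsFormatSketch
{
    // Writes the drop flag as one byte, then the raw user indices
    // (assumed layout: 32-bit count followed by the elements).
    public static byte[] Save(bool drop, int[] indices)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write((byte)(drop ? 1 : 0));
            writer.Write(indices.Length);
            foreach (int index in indices)
                writer.Write(index);
            writer.Flush();
            return stream.ToArray();
        }
    }

    // Reads the payload back in the same order it was written.
    public static (bool Drop, int[] Indices) Load(byte[] payload)
    {
        using (var reader = new BinaryReader(new MemoryStream(payload)))
        {
            bool drop = reader.ReadByte() != 0;
            int count = reader.ReadInt32();
            var indices = new int[count];
            for (int i = 0; i < count; i++)
                indices[i] = reader.ReadInt32();
            return (drop, indices);
        }
    }

    public static void Main()
    {
        var loaded = Load(Save(true, new[] { 0, 2, 3 }));
        Console.WriteLine(loaded.Drop + " " + string.Join(",", loaded.Indices)); // prints "True 0,2,3"
    }
}
```

Nothing in this payload depends on the width of the schema the transform was first applied to, which is the property the review asked to keep.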
🕐
99407e4 to a8f16d4
@shauheen, I added some inline comments to this PR. They not only describe each attribute but also explain how the different attributes work together. Feel free to comment if you find anything unclear. Thanks.
This PR is a part of #1501. We refactor the Bindings in ChooseColumnsByIndexTransform so that it is no longer an ISchema, while still maintaining the functionality needed to connect input and output. For the functionality that remains, please see the non-private member functions of Bindings. Some comments are added for better readability.