-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Don't create parallel cursor if we have only one element in dataview #197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't create parallel cursor if we have only one element in dataview #197
Conversation
@@ -143,7 +143,7 @@ public override IRowCursor[] GetRowCursorSet(out IRowCursorConsolidator consolid | |||
var inputs = Source.GetRowCursorSet(out consolidator, predicateInput, n, rand); | |||
Contracts.AssertNonEmpty(inputs); | |||
|
|||
if (inputs.Length == 1 && n > 1 && WantParallelCursors(predicate)) | |||
if (inputs.Length == 1 && n > 1 && WantParallelCursors(predicate) && Source.GetRowCount() != 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!= 1 [](start = 101, length = 5)
<=n perhaps- no point of setting up N pipelines for <n items. The overhead is too high. #Closed
@@ -143,7 +143,7 @@ public override IRowCursor[] GetRowCursorSet(out IRowCursorConsolidator consolid | |||
var inputs = Source.GetRowCursorSet(out consolidator, predicateInput, n, rand); | |||
Contracts.AssertNonEmpty(inputs); | |||
|
|||
if (inputs.Length == 1 && n > 1 && WantParallelCursors(predicate)) | |||
if (inputs.Length == 1 && n > 1 && WantParallelCursors(predicate) && Source.GetRowCount() != 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GetRowCount [](start = 88, length = 11)
However, the issue with this solution is that if user sets up a true enumerable, we will do a parallel clustering setup. At least emprirically it appears that setting up parallel cursors for scoring is 10k x slower vs not. Why we even doing this at all? The only time it will be worth doing if we are having on the order of 10k items to score. #WontFix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't know size of Source in general, so nothing we can do. At least with current interfaces. I guess I can propagate parameter which would forbid usage of parallel cursor, but's it's a lot of changes.
Plus in current situation, we are forcing data to be array
foreach (var item in _predictor.Predict(new[] { input }, reuseRowObjects: false)) |
In reply to: 189992635 [](ancestors = 189992635)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specifically that array can possibly be recycled.
Longer term we should use the row to row mappers that Yael has developed.
In reply to: 190086316 [](ancestors = 190086316,189992635)
@@ -432,6 +432,8 @@ public override bool CanShuffle | |||
|
|||
public override long? GetRowCount(bool lazy = true) | |||
{ | |||
if (_data is ICollection<TRow> collection) | |||
return collection.Count; | |||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Might be simpler as return (_data as ICollection<TRow>)?.Count;
. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Ivanidzo4ka -- this strikes me as a positive change, not the final word on this subject I'm sure but it at least partially solves the most important subcase.
…otnet#197) * Don't create parallel cursor if we have only one element in dataview * update for comments
related to #179