Skip to content

Avoid forking in AbstractSearchAsyncAction #71580

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Apr 15, 2021
Merged

Conversation

dnhatn
Copy link
Member

@dnhatn dnhatn commented Apr 12, 2021

We don't need to fork when handling unassigned shard failures in AbstractSearchAsyncAction as we never call it recursively.

Relates #70792

@dnhatn dnhatn marked this pull request as ready for review April 12, 2021 23:58
@dnhatn dnhatn added the :Search/Search Search-related issues that do not fall into other categories label Apr 12, 2021
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Apr 12, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@dnhatn dnhatn added v8.0.0 v7.13.0 v7.12.1 v6.8.16 >bug and removed Team:Search Meta label for search team labels Apr 12, 2021
@dnhatn dnhatn requested a review from jimczi April 14, 2021 01:19
Copy link
Contributor

@jimczi jimczi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, should also remove the fork in

fork(() -> onShardFailure(shardIndex, shard, shardIt, e));
since executeNext will fork if needed ?

@gwbrown gwbrown added v7.12.2 and removed v7.12.1 labels Apr 14, 2021
@dnhatn
Copy link
Member Author

dnhatn commented Apr 15, 2021

@jimczi I think it's not entirely safe to remove fork here. If we have a lot of shards and we fail to get connections for many of them, then we recursively execute [performPhaseOnShard -> onShardFailure -> performPhaseOnShard -> onShardFailure] many times. This can lead to StackOverflow.

I think we should fork only after several executions instead of a single execution (the current behavior). I will look into it in a follow-up.

@jimczi
Copy link
Contributor

jimczi commented Apr 15, 2021

Right, I read too quickly and thought that executeNext would handle the retry on another replica. Let's leave it for a follow up. Thanks

@dnhatn
Copy link
Member Author

dnhatn commented Apr 15, 2021

Thanks Jim!

@dnhatn dnhatn merged commit f2228cf into elastic:master Apr 15, 2021
@dnhatn dnhatn deleted the avoid-forking branch April 15, 2021 17:00
dnhatn added a commit that referenced this pull request Apr 15, 2021
We don't need to fork when handling unassigned shard failures
in AbstractSearchAsyncAction as we never call it recursively.

Relates #70792
dnhatn added a commit that referenced this pull request Apr 15, 2021
We don't need to fork when handling unassigned shard failures
in AbstractSearchAsyncAction as we never call it recursively.

Relates #70792
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Search/Search Search-related issues that do not fall into other categories v7.12.2 v7.13.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants