Skip to content

kernel: sched: use _is_thread_ready() in should_preempt() #8126

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

mike-scott
Copy link
Contributor

@mike-scott mike-scott commented Jun 2, 2018

We are using _is_thread_prevented_from_running() to see if the
_current thread can be preempted in should_preempt(). The idea
being that even if the _current thread is a high priority coop
thread, we can still preempt it when it's pending, suspended,
etc.

This does not take into account if the thread is sleeping.

k_sleep() merely removes the thread from the ready_q and calls
Swap(). The scheduler will swap away from the thread temporarily
and then on the next cycle get stuck to the sleeping thread for
however long the sleep timeout is, doing exactly nothing because
other functions like _ready_thread() use _is_thread_ready() as a
check before proceeding.

We should use !_is_thread_ready() to take into account when threads
are waiting on a timer, and let other threads run in the meantime.

Fixes: #8128

Signed-off-by: Michael Scott [email protected]

@mike-scott mike-scott requested a review from nashif June 2, 2018 21:57
@mike-scott mike-scott self-assigned this Jun 2, 2018
@mike-scott mike-scott added area: Kernel bug The issue is a bug, or the PR is fixing a bug priority: high High impact/importance bug labels Jun 2, 2018
@mike-scott mike-scott added this to the v1.12.0 milestone Jun 2, 2018
@mike-scott
Copy link
Contributor Author

@andyross Please take a look.

@mike-scott
Copy link
Contributor Author

@andyross I'm going to say "I told you so" here. Not really to be mean, but because this took another 24 hours of my time to debug.

I was using _is_thread_ready() in the patch I originally posted to fix the scheduler spin. When integrated into the rebase you were doing, you swapped to using _is_thread_prevented_from_running() instead. Then we had a discussion about it and I let you sway me into believing that _is_thread_prevented_from_running() was going to work.

So, "I told you so". :)

@mike-scott
Copy link
Contributor Author

There could be an argument as to why k_sleep() doesn't pend() a thread. But, I think at some point we have to fix what we have for v1.12.

@mike-scott
Copy link
Contributor Author

Looking into the sanity check failures, they seem very appropriate.

Copy link
Collaborator

@andyross andyross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I stumbled on the k_sleep() issue from the other direction this morning and literally started working on a change to fix it to use the pend API directly. That would obviate the need for this, but is a slightly more invasive change that can wait for 1.13 if this fixes the issues.

To be clear, though: this fixes your issues, right?

@andyross
Copy link
Collaborator

andyross commented Jun 2, 2018

Grr... except it looks like something is failing with this applied. I'll bang on this some tonight when I get a chance too. I totally buy that this is the right issue. If you're curious about the fix-from-the-other-side I started working on this morning (it doesn't work yet either!) and want to try something like it, it's in andyross@1ca847c

@mike-scott
Copy link
Contributor Author

@andyross I can't confirm this fixes the issue entirely, but so far so good (only a few hours into the test tho).

@mike-scott
Copy link
Contributor Author

@andyross nope have a few devices fall offline. Will hook a debugger up after dinner.

(not that k_sleep isn't an issue, it just isn't the only issue)

Copy link
Contributor Author

@mike-scott mike-scott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found the issue with this patch.. will resubmit :)

kernel/sched.c Outdated
@@ -122,7 +122,7 @@ static int should_preempt(struct k_thread *th, int preempt_ok)
}

/* Or if we're pended/suspended/dummy (duh) */
if (!_current || _is_thread_prevented_from_running(_current)) {
if (!_current || _is_thread_ready(_current)) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic is backwards here. If !_is_thread_ready() then we should preempt.

We are using _is_thread_prevented_from_running() to see if the
_current thread can be preempted in should_preempt().  The idea
being that even if the _current thread is a high priority coop
thread, we can still preempt it when it's pending, suspended,
etc.

This does not take into account if the thread is sleeping.

k_sleep() merely removes the thread from the ready_q and calls
Swap().  The scheduler will swap away from the thread temporarily
and then on the next cycle get stuck to the sleeping thread for
however long the sleep timeout is, doing exactly nothing because
other functions like _ready_thread() use _is_thread_ready() as a
check before proceeding.

We should use !_is_thread_ready() to take into account when threads
are waiting on a timer, and let other threads run in the meantime.

Signed-off-by: Michael Scott <[email protected]>
@mike-scott mike-scott force-pushed the master-fix-k_sleep-spin branch from 20c890d to 03cd198 Compare June 4, 2018 00:36
@codecov-io
Copy link

Codecov Report

Merging #8126 into master will not change coverage.
The diff coverage is 0%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #8126   +/-   ##
=======================================
  Coverage   64.54%   64.54%           
=======================================
  Files         420      420           
  Lines       40142    40142           
  Branches     6765     6765           
=======================================
  Hits        25911    25911           
  Misses      11110    11110           
  Partials     3121     3121
Impacted Files Coverage Δ
kernel/sched.c 81.17% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7a18b08...03cd198. Read the comment docs.

@nashif nashif merged commit 6c95daf into zephyrproject-rtos:master Jun 4, 2018
Copy link
Collaborator

@agross-oss agross-oss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. would be nice to have time to review this before it being merged. especially when it comes in on a weekend.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Kernel bug The issue is a bug, or the PR is fixing a bug priority: high High impact/importance bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants