Skip to content
This repository was archived by the owner on Jan 16, 2025. It is now read-only.

Commit 1de73bf

Browse files
authored
fix(runners): increase the log level to WARN when using the enable_job_queued_check parameter (#3046)
The readme section detailing the `enable_job_queued_check` makes it an appealing function, but it was unclear to me that it could cause some issues when used with ephemeral runners and matrix workflows. It is particularly challenging to notice for organizations with a very large number of repositories, at times with low activity the issue is not present, but when matrix jobs are started or when multiple repositories are triggering jobs around the same time window (such as cron'd workflow), this issue is more likely to be faced. For example, on the execution that allowed me to find the cause, when starting a matrix of 9 jobs, 4 of them were actually triggering this log message: ``` 2023-03-10 15:29:09.728 INFO [scale-up:a0502c68-f9b9-5d2d-afd6-70edb7486316 index.js:135249 isJobQueued] Job not queued ``` This log message, aside from being one amongst many other `INFO` messages, does not provide many details about the consequences. I suggest bumping the level to `WARN` and adding more details in the message itself. Related discussion: https://github.com/philips-labs/terraform-aws-github-runner/discussions/2931
1 parent a3eb81e commit 1de73bf

File tree

2 files changed

+2
-2
lines changed

2 files changed

+2
-2
lines changed

Diff for: README.md

+1-1
Original file line numberDiff line numberDiff line change
@@ -324,7 +324,7 @@ You can configure runners to be ephemeral, runners will be used only for one job
324324

325325
- The scale down lambda is still active, and should only remove orphan instances. But there is no strict check in place. So ensure you configure the `minimum_running_time_in_minutes` to a value that is high enough to got your runner booted and connected to avoid it got terminated before executing a job.
326326
- The messages sent from the webhook lambda to scale-up lambda are by default delayed delayed by SQS, to give available runners to option to start the job before the decision is made to scale more runners. For ephemeral runners there is no need to wait. Set `delay_webhook_event` to `0`.
327-
- All events on the queue will lead to a new runner crated by the lambda. By setting `enable_job_queued_check` to `true` you can enforce only create a runner if the event has a correlated queued job. Setting this can avoid creating useless runners, for example whn jobs got cancelled before a runner is created. We suggest to use this in combination with a pool.
327+
- All events on the queue will lead to a new runner created by the lambda. By setting `enable_job_queued_check` to `true` you can enforce only create a runner if the event has a correlated queued job. Setting this can avoid creating useless runners, for example when jobs got cancelled before a runner is created or if the job was already picked up by another runner. We suggest to use this in combination with a pool.
328328
- To ensure runners are created in the same order GitHub sends the events we use by default a FIFO queue, this is mainly relevant for repo level runners. For ephemeral runners you can set `enable_fifo_build_queue` to `false`.
329329
- Error related to scaling should be retried via SQS. You can configure `job_queue_retention_in_seconds` `redrive_build_queue` to tune the behavior. We have no mechanism to avoid events will never processed, which means potential no runner could be created and the job in GitHub can time out in 6 hours.
330330

Diff for: modules/runners/lambdas/runners/src/scale-runners/scale-up.ts

+1-1
Original file line numberDiff line numberDiff line change
@@ -110,7 +110,7 @@ async function isJobQueued(githubInstallationClient: Octokit, payload: ActionReq
110110
throw Error(`Event ${payload.eventType} is not supported`);
111111
}
112112
if (!isQueued) {
113-
logger.info(`Job not queued`, LogFields.print());
113+
logger.warn(`Job not queued in GitHub. A new runner instance will NOT be created for this job.`, LogFields.print());
114114
}
115115
return isQueued;
116116
}

0 commit comments

Comments
 (0)