You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
feat: opt-out ssm parameters for github app (#4335)
Hi! 👋
Throughout my recent experiences, we had strong requirements to **not**
have secret values in the state. To handle that, I suggest to make
optional the creation of SSM parameters as part of the regular flow so
it doesn't leak secrets in Terraform state.
We could alternatively use the latest
[`ephemeral`](https://developer.hashicorp.com/terraform/language/resources/ephemeral)
feature, but I'm not sure everyone is using Terraform 1.10+ atm since
it's quite recent. I also think relying on ephemerals should be part of
a new breaking release as it won't be compatible with Terraform < 1.10.
Let me know what you think of this!
---------
Co-authored-by: Niek Palm <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Niek Palm <[email protected]>
Copy file name to clipboardExpand all lines: README.md
+1-1
Original file line number
Diff line number
Diff line change
@@ -141,7 +141,7 @@ Join our discord community via [this invite link](https://discord.gg/bxgXW8jJGh)
141
141
| <aname="input_eventbridge"></a> [eventbridge](#input\_eventbridge)| Enable the use of EventBridge by the module. By enabling this feature events will be put on the EventBridge by the webhook instead of directly dispatching to queues for scaling.<br/><br/> `enable`: Enable the EventBridge feature.<br/> `accept_events`: List can be used to only allow specific events to be putted on the EventBridge. By default all events, empty list will be be interpreted as all events. | <pre>object({<br/> enable = optional(bool, true)<br/> accept_events = optional(list(string), null)<br/> })</pre> |`{}`| no |
142
142
| <aname="input_ghes_ssl_verify"></a> [ghes\_ssl\_verify](#input\_ghes\_ssl\_verify)| GitHub Enterprise SSL verification. Set to 'false' when custom certificate (chains) is used for GitHub Enterprise Server (insecure). |`bool`|`true`| no |
143
143
| <aname="input_ghes_url"></a> [ghes\_url](#input\_ghes\_url)| GitHub Enterprise Server URL. Example: https://github.internal.co - DO NOT SET IF USING PUBLIC GITHUB. However if you are using Github Enterprise Cloud with data-residency (ghe.com), set the endpoint here. Example - https://companyname.ghe.com|`string`|`null`| no |
144
-
| <aname="input_github_app"></a> [github\_app](#input\_github\_app)| GitHub app parameters, see your github app. Ensure the key is the base64-encoded `.pem` file (the output of `base64 app.private-key.pem`, not the content of `private-key.pem`). | <pre>object({<br/> key_base64 = string<br/> id = string<br/> webhook_secret = string<br/> })</pre> | n/a | yes |
144
+
| <a name="input_github_app"></a> [github\_app](#input\_github\_app) | GitHub app parameters, see your github app. <br/> You can optionally create the SSM parameters yourself and provide the ARN and name here, through the `*_ssm` attributes.<br/> If you chose to provide the configuration values directly here, <br/> please ensure the key is the base64-encoded `.pem` file (the output of `base64 app.private-key.pem`, not the content of `private-key.pem`).<br/> Note: the provided SSM parameters arn and name have a precedence over the actual value (i.e `key_base64_ssm` has a precedence over `key_base64` etc). | <pre>object({<br/> key_base64 = optional(string)<br/> key_base64_ssm = optional(object({<br/> arn = string<br/> name = string<br/> }))<br/> id = optional(string)<br/> id_ssm = optional(object({<br/> arn = string<br/> name = string<br/> }))<br/> webhook_secret = optional(string)<br/> webhook_secret_ssm = optional(object({<br/> arn = string<br/> name = string<br/> }))<br/> })</pre> | n/a | yes |
145
145
| <aname="input_idle_config"></a> [idle\_config](#input\_idle\_config)| List of time periods, defined as a cron expression, to keep a minimum amount of runners active instead of scaling down to 0. By defining this list you can ensure that in time periods that match the cron expression within 5 seconds a runner is kept idle. | <pre>list(object({<br/> cron = string<br/> timeZone = string<br/> idleCount = number<br/> evictionStrategy = optional(string, "oldest_first")<br/> }))</pre> |`[]`| no |
146
146
| <aname="input_instance_allocation_strategy"></a> [instance\_allocation\_strategy](#input\_instance\_allocation\_strategy)| The allocation strategy for spot instances. AWS recommends using `price-capacity-optimized` however the AWS default is `lowest-price`. |`string`|`"lowest-price"`| no |
147
147
| <aname="input_instance_max_spot_price"></a> [instance\_max\_spot\_price](#input\_instance\_max\_spot\_price)| Max price price for spot instances per hour. This variable will be passed to the create fleet as max spot price for the fleet. |`string`|`null`| no |
Copy file name to clipboardExpand all lines: docs/configuration.md
+37-13
Original file line number
Diff line number
Diff line change
@@ -10,7 +10,7 @@ To be able to support a number of use-cases, the module has quite a lot of confi
10
10
- Linux vs Windows. You can configure the OS types linux and win. Linux will be used by default.
11
11
- Re-use vs Ephemeral. By default runners are re-used, until detected idle. Once idle they will be removed from the pool. To improve security we are introducing ephemeral runners. Those runners are only used for one job. Ephemeral runners only work in combination with the workflow job event. For ephemeral runners the lambda requests a JIT (just in time) configuration via the GitHub API to register the runner. [JIT configuration](https://docs.github.com/en/actions/security-guides/security-hardening-for-github-actions#using-just-in-time-runners) is limited to ephemeral runners (and currently not supported by GHES). For non-ephemeral runners, a registration token is always requested. In both cases the configuration is made available to the instance via the same SSM parameter. To disable JIT configuration for ephemeral runners set `enable_jit_config` to `false`. We also suggest using a pre-build AMI to improve the start time of jobs for ephemeral runners.
12
12
- Job retry (**Beta**). By default the scale-up lambda will discard the message when it is handled. Meaning in the ephemeral use-case an instance is created. The created runner will ask GitHub for a job, no guarantee it will run the job for which it was scaling. Result could be that with small system hick-up the job is keeping waiting for a runner. Enable a pool (org runners) is one option to avoid this problem. Another option is to enable the job retry function. Which will retry the job after a delay for a configured number of times.
13
-
- GitHub Cloud vs GitHub Enterprise Server (GHES). The runners support GitHub Cloud (Public GitHub - github.com), GitHub Data Residency instances (ghe.com), and GitHub Enterprise Server. For GHES, we rely on our community for support and testing. We have no capability to test GHES ourselves.
13
+
- GitHub Cloud vs GitHub Enterprise Server (GHES). The runners support GitHub Cloud (Public GitHub - github.com), GitHub Data Residency instances (ghe.com), and GitHub Enterprise Server. For GHES, we rely on our community for support and testing. We have no capability to test GHES ourselves.
14
14
- Spot vs on-demand. The runners use either the EC2 spot or on-demand life cycle. Runners will be created via the AWS [CreateFleet API](https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_CreateFleet.html). The module (scale up lambda) will request via the CreateFleet API to create instances in one of the subnets and of the specified instance types.
15
15
- ARM64 support via Graviton/Graviton2 instance-types. When using the default example or top-level module, specifying `instance_types` that match a Graviton/Graviton 2 (ARM64) architecture (e.g. a1, t4g or any 6th-gen `g` or `gd` type), you must also specify `runner_architecture = "arm64"` and the sub-modules will be automatically configured to provision with ARM64 AMIs and leverage GitHub's ARM64 action runner. See below for more details.
16
16
- Disable default labels for the runners (os, architecture and `self-hosted`) can achieve by setting `runner_disable_default_labels` = true. If enabled, the runner will only have the extra labels provided in `runner_extra_labels`. In case you on own start script is used, this configuration parameter needs to be parsed via SSM.
@@ -24,17 +24,44 @@ The module uses the AWS System Manager Parameter Store to store configuration fo
24
24
|`ssm_paths.root/var.prefix?/app/`| App secrets used by Lambda's |
25
25
|`ssm_paths.root/var.prefix?/runners/config/<name>`| Configuration parameters used by runner start script |
26
26
|`ssm_paths.root/var.prefix?/runners/tokens/<ec2-instance-id>`| Either JIT configuration (ephemeral runners) or registration tokens (non ephemeral runners) generated by the control plane (scale-up lambda), and consumed by the start script on the runner to activate / register the runner. |
27
-
|`ssm_paths.root/var.prefix?/webhook/runner-matcher-config`| Runner matcher config used by webhook to decide the target for the webhook event.|
27
+
|`ssm_paths.root/var.prefix?/webhook/runner-matcher-config`| Runner matcher config used by webhook to decide the target for the webhook event. |
Manually creating the SSM parameters that hold the configuration of your GitHub App avoids leaking critical plain text values in your terraform state and version control system. This is a recommended security practice for handling sensitive credentials.
63
+
64
+
You can read more [over here](../examples/external-managed-ssm-secrets/README.md).
38
65
39
66
## Encryption
40
67
@@ -124,7 +151,6 @@ You can configure runners to be ephemeral, in which case runners will be used on
124
151
125
152
The example for [ephemeral runners](examples/ephemeral.md) is based on the [default example](examples/default.md). Have look at the diff to see the major configuration differences.
126
153
127
-
128
154
## Job retry (**Beta**)
129
155
130
156
You can enable the job retry function to retry a job after a delay for a configured number of times. The function is disabled by default. To enable the function set `job_retry.enable` to `true`. The function will check the job status after a delay, and when the is still queued, it will create a new runner. The new runner is created in the same way as the others via the scale-up function. Hence the same configuration applies.
@@ -133,7 +159,6 @@ For checking the job status a API call is made to GitHub. Which can exhaust the
133
159
134
160
The option `job_retry.delay_in_seconds` is the delay before the job status is checked. The delay is increased by the factor `job_retry.delay_backoff` for each attempt. The upper bound for a delay is 900 seconds, which is the max message delay on SQS. The maximum number of attempts is configured via `job_retry.max_attempts`. The delay should be set to a higher value than the time it takes to start a runner.
135
161
136
-
137
162
## Prebuilt Images
138
163
139
164
This module also allows you to run agents from a prebuilt AMI to gain faster startup times. The module provides several examples to build your own custom AMI. To remove old images, an [AMI housekeeper module](modules/public/ami-housekeeper.md) can be used. See the [AMI examples](ami-examples/index.md) for more details.
@@ -231,7 +256,7 @@ The watcher is listening for spot termination warnings and create a log message
231
256
### Termination handler
232
257
233
258
!!! warning
234
-
This feature will only work once the CloudTrail is enabled.
259
+
This feature will only work once the CloudTrail is enabled.
235
260
236
261
The termination handler is listening for spot terminations by capture the `BidEvictedEvent` via CloudTrail. The handler will log and optionally create a metric for each termination. The intend is to enhance the logic to inform the user about the termination via the GitHub Job or Workflow run. The feature is disabled by default. The feature is enabled once the watcher is enabled, the feature can be disabled explicit by setting `instance_termination_watcher.features.enable_spot_termination_handler = false`.
NOTE: By default, a runner AMI update requires a re-apply of this terraform config (the runner AMI ID is looked up by a terraform data source). To avoid this, you can use `ami_id_ssm_parameter_name` to have the scale-up lambda dynamically lookup the runner AMI ID from an SSM parameter at instance launch time. Said SSM parameter is managed outside of this module (e.g. by a runner AMI build workflow).
Copy file name to clipboardExpand all lines: examples/default/README.md
+1-1
Original file line number
Diff line number
Diff line change
@@ -6,7 +6,7 @@ This module shows how to create GitHub action runners. Lambda release will be do
6
6
7
7
Steps for the full setup, such as creating a GitHub app can be found in the root module's [README](https://github.com/github-aws-runners/terraform-aws-github-runner). First download the Lambda releases from GitHub. Alternatively you can build the lambdas locally with Node or Docker, there is a simple build script in `<root>/.ci/build.sh`. In the `main.tf` you can simply remove the location of the lambda zip files, the default location will work in this case.
8
8
9
-
> The default example assumes local built lambda's available. Ensure you have built the lambda's. Alternativly you can downlowd the lambda's. The version needs to be set to a GitHub release version, see https://github.com/github-aws-runners/terraform-aws-github-runner/releases
9
+
> The default example assumes local built lambda's available. Ensure you have built the lambda's. Alternatively you can download the lambda's. The version needs to be set to a GitHub release version, see https://github.com/github-aws-runners/terraform-aws-github-runner/releases
0 commit comments