Possible race condition with aws_iam_role.instance recreate #583


Closed
dzinek opened this issue Nov 29, 2022 · 6 comments
Labels
bug 🐛 (Something isn't working), stale (Issue/PR is stale and closed automatically)

Comments

@dzinek

dzinek commented Nov 29, 2022

Hi, I decided to upgrade the module from version 5.4.1 to 5.5.0 because it includes the fix for #557.

The plan showed that the ASG launch template would change (because of the user_data changes for the pull policy and gzip from #565).
It also showed that aws_iam_role.instance had to be recreated:

  # module.gitlab_runner.aws_iam_role.instance will be destroyed
  # (because resource uses count or for_each)
  - resource "aws_iam_role" "instance" {

and then

  # module.gitlab_runner.aws_iam_role.instance[0] will be created
  + resource "aws_iam_role" "instance" {

The plan looked fine (destroy the role first, then create it), so I applied it.
The launch template and ASG changes went through, but recreating the role failed. It looks like a race condition: creation of the new role starts before the old one has finished being destroyed:

module.gitlab_runner.aws_iam_role.instance: Destroying... [id=runner-instance]
module.gitlab_runner.aws_iam_role.instance[0]: Creating...
module.gitlab_runner.aws_iam_role.instance: Destruction complete after 3s
╷
│ Error: failed creating IAM Role (runner-instance): EntityAlreadyExists: Role with name runner-instance already exists.
│ 	status code: 409, request id: xxxxxxx
│
│   with module.gitlab_runner.aws_iam_role.instance[0],
│   on .terraform/modules/gitlab_runner/main.tf line 331, in resource "aws_iam_role" "instance":
│  331: resource "aws_iam_role" "instance" {

I had to run apply three times: the first run produced this error, and it took two more runs to get past it.

In the final state, the new runner could not be registered because user_data failed to fetch the runner token from SSM. The instance profile had lost its assigned role, so the EC2 instance could not fetch any data from SSM.
I had to reassign the role to the instance profile manually to restore access and get the runner registered.

@kayman-mk
Collaborator

I can't confirm this using Terraform 1.3.2.

The plan says:
module.gitlab_runner.module.gitlab_runner_test["subnet-xyz"].aws_iam_role.instance has moved to module.gitlab_runner.module.gitlab_runner_test["subnet-xyz"].aws_iam_role.instance[0].

The resource is not destroyed and recreated.
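
For readers unfamiliar with the "has moved to" plan line: one way a module can produce it is an explicit `moved` block, supported since Terraform 1.1. The fragment below is only a sketch of that mechanism. It assumes it would live inside the module next to the `aws_iam_role.instance` resource, and it is not taken from the module's actual source.

```hcl
# Sketch only (assumption: placed inside the module that declares
# aws_iam_role.instance). Requires Terraform 1.1 or later.
# It tells Terraform that the existing state entry for the un-indexed
# resource now belongs to index 0 of the counted resource, so the plan
# reports a move instead of a destroy followed by a create.
moved {
  from = aws_iam_role.instance
  to   = aws_iam_role.instance[0]
}
```

With a move recorded like this, the role is never deleted, so the name clash from the original report (EntityAlreadyExists for runner-instance) would not be expected to occur.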

@dzinek
Author

dzinek commented Nov 29, 2022

OK, we are in the process of migrating to 1.3.5. I will test it on the newer Terraform version.

@kayman-mk
Collaborator

Seems to be related to #591

@kayman-mk kayman-mk added the bug 🐛 Something isn't working label Jan 1, 2023
@kayman-mk
Collaborator

Any news here, @dzinek? Or can we close this issue?

@github-actions
Contributor

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 15 days.

@github-actions github-actions bot added the stale Issue/PR is stale and closed automatically label May 11, 2023
@github-actions
Contributor

This issue was closed because it has been stalled for 15 days with no activity.

@github-actions github-actions bot closed this as not planned May 26, 2023