Skip to content

NC | lifecycle | continue last run #8925

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 2, 2025

Conversation

nadavMiz
Copy link
Contributor

@nadavMiz nadavMiz commented Apr 1, 2025

Describe the Problem

lifecycle run might fail or timeout before all the objects were processed. add a flag to continue the ran from where it last stopped

Explain the Changes

  1. add new flag continue to resume the last run. in case last run finished. this will end the ran. in case there was no last run. start a new run
  2. move function to load previous run status to a new file lifecycle_utils and use it both in nc_lifecycle and health scripts
  3. create init function for bucket and rule status, and init objects if they don't exist
  4. in case we run with continue. load the previous run states to this run status object

Issues: Fixed #xxx / Gap #xxx

Testing Instructions:

automatic tests:
sudo npx jest test_nc_lifecycle_cli
manual test:

  1. set timeout less then run time, but more then a single cycle
  2. keep starting new runs with continue flag, until a lifecycle finishes successfully
  3. all new lifecycle runs with continue should pass without doing anything
  • Doc added/updated
  • Tests added

@nadavMiz nadavMiz requested a review from romayalon April 1, 2025 16:53
@nadavMiz nadavMiz force-pushed the lifecycle-continue-argument branch from 4fd3474 to f735ef9 Compare April 1, 2025 16:58
@nadavMiz nadavMiz self-assigned this Apr 1, 2025
@nadavMiz nadavMiz force-pushed the lifecycle-continue-argument branch 3 times, most recently from 1d8f8da to d04893f Compare April 2, 2025 09:25
for (const [bucket_name, prev_bucket_status] of Object.entries(previous_run.buckets_statuses)) {
if (!buckets.includes(bucket_name)) continue;
const bucket_json = await config_fs.get_bucket_by_name(bucket_name, config_fs_options);
if (!bucket_json.lifecycle_configuration_rules) continue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why bucket json and lifecycle_configuration_rules check is needed? AFAIU you can just copy and nullify/undefindify ( :) ) the stats and times of each bucket/rule
I believe we care about the general structure of buckets and rules and state

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buckets and rules can change between runs. since we save this in the log file. we will have empty object for rule/ bucket with irrelevant state. there may still be concurrency issue. but I think we should at least check it between runs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes we can nullify the state if it doesn't exist. I can change it work like that

//ignore error
}
await exec_manage_cli(TYPES.LIFECYCLE, '', {continue: 'true', disable_service_validation: 'true', disable_runtime_validation: 'true', config_root}, undefined, undefined);
const object_list2 = await object_sdk.list_objects({bucket: test_bucket});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is the assert that the run didn't do anything?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not the test that the run didn't do anything. its a test that it did finish the job. might be a redundant test. but I though it might find general bugs with continue flow. I can test that the lifecycle finished successfully. the test that test that it didn't is the previous one

@nadavMiz nadavMiz force-pushed the lifecycle-continue-argument branch from d04893f to d59c586 Compare April 2, 2025 12:51
@nadavMiz nadavMiz force-pushed the lifecycle-continue-argument branch from d59c586 to 73edf4c Compare April 2, 2025 12:54
@nadavMiz nadavMiz merged commit af53218 into noobaa:master Apr 2, 2025
11 of 13 checks passed
@romayalon romayalon mentioned this pull request May 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants