Skip to content

Add --disable-hook flag to cdi generate command #1077

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

ArangoGutierrez
Copy link
Collaborator

@ArangoGutierrez ArangoGutierrez commented May 12, 2025

When running the nvidia-ctk cdi generate command, a user should be able to opt out of specific hooks. We propose to add a flag --disable-hooks that will take a comma-separated list of hooks that will be skipped when creating the CDI spec.

Fixes: #1074

@ArangoGutierrez ArangoGutierrez requested review from elezar and Copilot May 12, 2025 16:53
@ArangoGutierrez ArangoGutierrez self-assigned this May 12, 2025
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a new CLI flag (--skip-hooks) to allow users to opt out of specific hooks when generating the CDI specification.

  • Updated tests to include a skip-hook option.
  • Modified the command initialization and spec generation logic to support disabling hooks based on the provided flag.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
cmd/nvidia-ctk/cdi/generate/generate_test.go Added a test case to validate the skip-hook functionality and updated expected spec/options accordingly.
cmd/nvidia-ctk/cdi/generate/generate.go Introduced a new CLI flag and added logic to disable hooks based on user input.
Comments suppressed due to low confidence (1)

cmd/nvidia-ctk/cdi/generate/generate_test.go:40

  • [nitpick] The variable name 'skipdHook' may be a typo and could be renamed to 'skipHook' for consistency and clarity.
skipdHook := cli.NewStringSlice("enable-cuda-compat")

Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for a new flag (--skip-hooks) to the nvidia-ctk cdi generate command so that users can opt out of specific hooks when creating the CDI spec.

  • In generate.go, a new skipHook option and corresponding CLI flag are introduced and processed when creating CDI library options.
  • In generate_test.go, a new test case verifies the behavior when a non-default hook is skipped.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
cmd/nvidia-ctk/cdi/generate/generate_test.go Added test case for --skip-hooks flag and its effect.
cmd/nvidia-ctk/cdi/generate/generate.go Introduced skipHook flag and integrated it into initialization options.

Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a new flag --skip-hooks to the cdi generate command to allow users to disable specific hooks while generating the CDI specification.

  • Introduces a new CLI flag in generate.go to capture the hooks to skip
  • Updates the options struct and test cases in generate_test.go to include and validate the new skipHook functionality

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
cmd/nvidia-ctk/cdi/generate/generate_test.go Adds a test case for skipHook using a new CLI string slice
cmd/nvidia-ctk/cdi/generate/generate.go Registers the new --skip-hook flag and passes the provided hooks to disable options during CDI spec generation
Comments suppressed due to low confidence (1)

cmd/nvidia-ctk/cdi/generate/generate_test.go:40

  • [nitpick] Consider adding a test case to verify that when a hook name is provided via skipHook, the generated CDI specification omits or disables that hook, ensuring the new flag's functionality is fully verified.
skipHook := cli.NewStringSlice("enable-cuda-compat")

Comment on lines 288 to 291
if len(opts.skipHook.Value()) > 0 {
for _, hook := range opts.skipHook.Value() {
initOpts = append(initOpts, nvcdi.WithDisabledHook(nvcdi.HookName(hook)))
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we want to check for validity of the names, or do we want to add that as a follow-up?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's hold this PR until:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's hold this PR until:

Should this PR not just work automatically with those PRs -- assuming those hooks can be opted out of using the same option?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It now allows for disabling all hooks

@@ -176,6 +177,12 @@ func (m command) build() *cli.Command {
Usage: "Specify a pattern the CSV mount specifications.",
Destination: &opts.csv.ignorePatterns,
},
&cli.StringSliceFlag{
Name: "skip-hook",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about disable-hook (we could even add an alias as disable-hooks) instead? Thinking ahead, it may be useful to allow users to do something like:

nvidia-ctk cdi generate \
    --disable-hook=update-ldcache \
    --disable-hook=enable-cuda-compat \
    --mode=nvml

Even further forward, more conservative users may want to consider:

nvidia-ctk cdi generate --disable-hooks=all

although that is out of scope of this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@klueska @jgehrcke what are your thoughts on the UX?

@ArangoGutierrez ArangoGutierrez changed the title Add --skip-hooks flag to cdi generate command Add --disable-hook flag to cdi generate command May 14, 2025
@ArangoGutierrez ArangoGutierrez requested review from elezar and Copilot May 14, 2025 10:09
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for a new flag that allows users to disable specific hooks during the CDI spec generation. Key changes include:

  • Adding conditional logic to check for supported hooks in the NVML driver library.
  • Introducing the new CLI flag (and corresponding option) to allow disabling hooks.
  • Updating tests to reflect the new --disable-hook(s) functionality.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
pkg/nvcdi/driver-nvml.go Wraps hook-based discoverer creation in conditionals based on hook support.
cmd/nvidia-ctk/cdi/generate/generate_test.go Adds test cases and sample disable-hook values in accordance with the new functionality.
cmd/nvidia-ctk/cdi/generate/generate.go Introduces a new flag and integrates hook disabling into the initialization options.
Comments suppressed due to low confidence (1)

cmd/nvidia-ctk/cdi/generate/generate.go:181

  • [nitpick] Consider renaming the flag from "disable-hook" to "disable-hooks" to match the PR title and description for consistency.
Name:        "disable-hook",

When running the nvidia-ctk cdi generate command, a user should be able to opt out of specific hooks.
We propose to add a flag --disable-hook that will take a comma-separated list of hooks that will be skipped when creating the CDI spec.

Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>

if len(opts.disableHooks.Value()) > 0 {
for _, hook := range opts.disableHooks.Value() {
if hook == "all" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all should be a valid hook name implemented in the API. For how I could imagine this working, see the way we handle NVIDIA_VISIBLE_DEVICES in internal/config/image/devices.go.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the idea, PTAL

pkg/nvcdi/api.go Outdated
)

// NewHookName takes a string and returns a []HookName, empty if the HookName
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the right place to implement this. we have a pkg/nvcdi/hooks.go file where this can go instead.

Also, in terms of the implementation, does adding special handling to the disabledHooks type for the all hook not make this cleaner?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved

Comment on lines 288 to 291
if len(opts.disableHooks.Value()) > 0 {
for _, hook := range opts.disableHooks.Value() {
for _, hookName := range nvcdi.NewHookName(hook) {
initOpts = append(initOpts, nvcdi.WithDisabledHook(hookName))
}
}
}
Copy link
Member

@elezar elezar May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The len check is not required. One can always iterate over an empty slice.

Suggested change
if len(opts.disableHooks.Value()) > 0 {
for _, hook := range opts.disableHooks.Value() {
for _, hookName := range nvcdi.NewHookName(hook) {
initOpts = append(initOpts, nvcdi.WithDisabledHook(hookName))
}
}
}
for _, hook := range opts.disableHooks.Value() {
initOpts = append(initOpts, nvcdi.WithDisabledHook(hook))
}

(note that I have updated WithDisabledHook to:

// WithDisabledHook allows specific hooks to the disabled.
// This option can be specified multiple times for each hook.
func WithDisabledHook[T string | HookName](hook T) Option {
	return func(o *nvcdilib) {
		if o.disabledHooks == nil {
			o.disabledHooks = make(map[HookName]bool)
		}
		o.disabledHooks[HookName(hook)] = true
	}
}

to allow it to accept both string and HookName arguments.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion taken

@@ -57,6 +57,7 @@ type options struct {

configSearchPaths cli.StringSlice
librarySearchPaths cli.StringSlice
disableHooks cli.StringSlice
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
disableHooks cli.StringSlice
disabledHooks cli.StringSlice

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

l.nvidiaCDIHookPath,
)
discoverers = append(discoverers, driverDotSoSymlinksDiscoverer)
if l.HookIsSupported(HookCreateSymlinks) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this hook specifically (and the update-ldcache hook too), there are many places in the code where we are injecting this hook. As such this is not going to be sufficient to prevent their injection. This was why I suggested that handling these hooks is out of scope for this PR.

@@ -176,6 +177,12 @@ func (m command) build() *cli.Command {
Usage: "Specify a pattern the CSV mount specifications.",
Destination: &opts.csv.ignorePatterns,
},
&cli.StringSliceFlag{
Name: "disable-hook",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also add a disable-all-hooks flag so people don't have to enumerate all of them to disable them.

Also, should we add a list-hooks command so people can discover what hooks are available for disabling

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@klueska my thoughts on how to handle the disabling of all hooks is to also accept all as a valid hook name. So that --disable-hook(s)=all can be specified.

Copy link
Member

@elezar elezar May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, should we add a list-hooks command so people can discover what hooks are available for disabling

Do you mean something like:

nvidia-ctk cdi list-hooks

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added an alias of disable-hooks and a better Usage description.

Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces the --disable-hook flag for the CDI generate command so that users can opt out of specific hooks when creating the CDI spec. Key changes include:

  • Updating the options API to support disabling hooks via a generic WithDisabledHook function.
  • Adding a Set method to the disabledHooks type to handle the "all" keyword.
  • Integrating the new flag into the CLI command and updating tests to validate its behavior.

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
pkg/nvcdi/options.go Updated WithDisabledHook to accept both string and HookName types.
pkg/nvcdi/hooks.go Added a Set method on disabledHooks to support the "all" keyword.
pkg/nvcdi/driver-nvml.go Wrapped hook discovery calls with HookIsSupported checks.
pkg/nvcdi/api.go Introduced new hook definitions and an AllHooks list.
cmd/nvidia-ctk/cdi/generate/generate.go Added the disable-hook flag to the CLI command and injected the disabledHooks option.
cmd/nvidia-ctk/cdi/generate/generate_test.go Extended tests to check disabling hooks including the special "all" value.
Comments suppressed due to low confidence (1)

pkg/nvcdi/options.go:159

  • Fix the typo in the comment to read 'allows specific hooks to be disabled.'
// WithDisabledHook allows specific hooks to the disabled.

Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
@ArangoGutierrez
Copy link
Collaborator Author

PR #1090 will refactor how we handle the Hook Creation, and after it, I'll update this PR to build on that and provide a better handling of disabled hooks via flags

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a flag to nvidia-ctk cdi generate to allow users to specificy which Hooks to be skipped
3 participants