Skip to content

move nvidia-ctk hook command into own binary #474

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 21, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
# NVIDIA Container Toolkit Changelog

* Move `nvidia-ctk hook` commands to a separate `nvidia-cdi-hook` binary. The same subcommands are supported.

## v1.15.0

* Remove `nvidia-container-runtime` and `nvidia-docker2` packages.
Expand Down
31 changes: 31 additions & 0 deletions cmd/nvidia-cdi-hook/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
# NVIDIA CDI Hook

The CLI `nvidia-cdi-hook` provides container device runtime hook capabilities when
called by a container runtime, as specific in a
[Container Device Interface](https://tags.cncf.io/container-device-interface/blob/main/SPEC.md)
file.

## Generating a CDI

The CDI itself is created for an NVIDIA-capable device using the
[`nvidia-ctk cdi generate`](../nvidia-ctk/) command.

When `nvidia-ctk cdi generate` is run, the CDI specification is generated as a yaml file.
The CDI specification provides instructions for a container runtime to set up devices, files and
other resources for the container prior to starting it. Those instructions
may include executing command-line tools to prepare the filesystem. The execution
of such command-line tools is called a hook.

`nvidia-cdi-hook` is the CLI tool that is expected to be called by the container runtime,
when specified by the CDI file.

See the [`nvidia-ctk` documentation](../nvidia-ctk/README.md) for more information
on generating a CDI file.

## Functionality

The `nvidia-cdi-hook` CLI provides the following functionality:

* `chmod` - Change the permissions of a file or directory inside the directory path to be mounted into a container.
* `create-symlinks` - Create symlinks inside the directory path to be mounted into a container.
* `update-ldcache` - Update the dynamic linker cache inside the directory path to be mounted into a container.
File renamed without changes.
36 changes: 36 additions & 0 deletions cmd/nvidia-cdi-hook/commands/commands.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
/**
# Copyright 2024 NVIDIA CORPORATION
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/

package commands

import (
"github.com/urfave/cli/v2"

"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-cdi-hook/chmod"
symlinks "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-cdi-hook/create-symlinks"
ldcache "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-cdi-hook/update-ldcache"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"
)

// New creates the commands associated with supported CDI hooks.
// These are shared by the nvidia-cdi-hook and nvidia-ctk hook commands.
func New(logger logger.Interface) []*cli.Command {
return []*cli.Command{
ldcache.NewCommand(logger),
symlinks.NewCommand(logger),
chmod.NewCommand(logger),
}
}
93 changes: 93 additions & 0 deletions cmd/nvidia-cdi-hook/main.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,93 @@
/**
# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
**/

package main

import (
"os"

"github.com/sirupsen/logrus"

"github.com/NVIDIA/nvidia-container-toolkit/internal/info"

cli "github.com/urfave/cli/v2"

"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-cdi-hook/commands"
)

// options defines the options that can be set for the CLI through config files,
// environment variables, or command line flags
type options struct {
// Debug indicates whether the CLI is started in "debug" mode
Debug bool
// Quiet indicates whether the CLI is started in "quiet" mode
Quiet bool
}

func main() {
logger := logrus.New()

// Create a options struct to hold the parsed environment variables or command line flags
opts := options{}

// Create the top-level CLI
c := cli.NewApp()
c.Name = "NVIDIA CDI Hook"
c.UseShortOptionHandling = true
c.EnableBashCompletion = true
c.Usage = "Command to structure files for usage inside a container, called as hooks from a container runtime, defined in a CDI yaml file"
c.Version = info.GetVersionString()

// Setup the flags for this command
c.Flags = []cli.Flag{
&cli.BoolFlag{
Name: "debug",
Aliases: []string{"d"},
Usage: "Enable debug-level logging",
Destination: &opts.Debug,
EnvVars: []string{"NVIDIA_CDI_DEBUG"},
},
&cli.BoolFlag{
Name: "quiet",
Usage: "Suppress all output except for errors; overrides --debug",
Destination: &opts.Quiet,
EnvVars: []string{"NVIDIA_CDI_QUIET"},
},
}

// Set log-level for all subcommands
c.Before = func(c *cli.Context) error {
logLevel := logrus.InfoLevel
if opts.Debug {
logLevel = logrus.DebugLevel
}
if opts.Quiet {
logLevel = logrus.ErrorLevel
}
logger.SetLevel(logLevel)
return nil
}

// Define the subcommands
c.Commands = commands.New(logger)

// Run the CLI
err := c.Run(os.Args)
if err != nil {
logger.Errorf("%v", err)
os.Exit(1)
}
}
15 changes: 9 additions & 6 deletions cmd/nvidia-ctk/cdi/generate/generate.go
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,7 @@ type options struct {
deviceNameStrategies cli.StringSlice
driverRoot string
devRoot string
nvidiaCTKPath string
nvidiaCDIHookPath string
ldconfigPath string
mode string
vendor string
Expand Down Expand Up @@ -132,9 +132,12 @@ func (m command) build() *cli.Command {
Destination: &opts.librarySearchPaths,
},
&cli.StringFlag{
Name: "nvidia-ctk-path",
Usage: "Specify the path to use for the nvidia-ctk in the generated CDI specification. If this is left empty, the path will be searched.",
Destination: &opts.nvidiaCTKPath,
Name: "nvidia-cdi-hook-path",
Aliases: []string{"nvidia-ctk-path"},
Usage: "Specify the path to use for the nvidia-cdi-hook in the generated CDI specification. " +
"If not specified, the PATH will be searched for `nvidia-cdi-hook`. " +
"NOTE: That if this is specified as `nvidia-ctk`, the PATH will be searched for `nvidia-ctk` instead.",
Destination: &opts.nvidiaCDIHookPath,
},
&cli.StringFlag{
Name: "ldconfig-path",
Expand Down Expand Up @@ -198,7 +201,7 @@ func (m command) validateFlags(c *cli.Context, opts *options) error {
}
}

opts.nvidiaCTKPath = config.ResolveNVIDIACTKPath(m.logger, opts.nvidiaCTKPath)
opts.nvidiaCDIHookPath = config.ResolveNVIDIACDIHookPath(m.logger, opts.nvidiaCDIHookPath)

if outputFileFormat := formatFromFilename(opts.output); outputFileFormat != "" {
m.logger.Debugf("Inferred output format as %q from output file name", outputFileFormat)
Expand Down Expand Up @@ -262,7 +265,7 @@ func (m command) generateSpec(opts *options) (spec.Interface, error) {
nvcdi.WithLogger(m.logger),
nvcdi.WithDriverRoot(opts.driverRoot),
nvcdi.WithDevRoot(opts.devRoot),
nvcdi.WithNVIDIACTKPath(opts.nvidiaCTKPath),
nvcdi.WithNVIDIACDIHookPath(opts.nvidiaCDIHookPath),
nvcdi.WithLdconfigPath(opts.ldconfigPath),
nvcdi.WithDeviceNamers(deviceNamers...),
nvcdi.WithMode(opts.mode),
Expand Down
11 changes: 2 additions & 9 deletions cmd/nvidia-ctk/hook/hook.go
Original file line number Diff line number Diff line change
Expand Up @@ -17,13 +17,10 @@
package hook

import (
chmod "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/hook/chmod"
"github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-cdi-hook/commands"
"github.com/NVIDIA/nvidia-container-toolkit/internal/logger"

"github.com/urfave/cli/v2"

symlinks "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/hook/create-symlinks"
ldcache "github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/hook/update-ldcache"
)

type hookCommand struct {
Expand All @@ -46,11 +43,7 @@ func (m hookCommand) build() *cli.Command {
Usage: "A collection of hooks that may be injected into an OCI spec",
}

hook.Subcommands = []*cli.Command{
ldcache.NewCommand(m.logger),
symlinks.NewCommand(m.logger),
chmod.NewCommand(m.logger),
}
hook.Subcommands = commands.New(m.logger)

return &hook
}
28 changes: 26 additions & 2 deletions internal/config/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -33,8 +33,9 @@ const (
configOverride = "XDG_CONFIG_HOME"
configFilePath = "nvidia-container-runtime/config.toml"

nvidiaCTKExecutable = "nvidia-ctk"
nvidiaCTKDefaultFilePath = "/usr/bin/nvidia-ctk"
nvidiaCTKExecutable = "nvidia-ctk"
nvidiaCTKDefaultFilePath = "/usr/bin/nvidia-ctk"
nvidiaCDIHookDefaultFilePath = "/usr/bin/nvidia-cdi-hook"

nvidiaContainerRuntimeHookExecutable = "nvidia-container-runtime-hook"
nvidiaContainerRuntimeHookDefaultPath = "/usr/bin/nvidia-container-runtime-hook"
Expand Down Expand Up @@ -177,6 +178,8 @@ var getDistIDLike = func() []string {
// This executable is used in hooks and needs to be an absolute path.
// If the path is specified as an absolute path, it is used directly
// without checking for existence of an executable at that path.
//
// Deprecated: Use ResolveNVIDIACDIHookPath directly instead.
func ResolveNVIDIACTKPath(logger logger.Interface, nvidiaCTKPath string) string {
return resolveWithDefault(
logger,
Expand All @@ -186,6 +189,27 @@ func ResolveNVIDIACTKPath(logger logger.Interface, nvidiaCTKPath string) string
)
}

// ResolveNVIDIACDIHookPath resolves the path to the nvidia-cdi-hook binary.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this and ResolveNVIDIACTKPath above again, what do you think of rewriting this as:

// ResolveNVIDIACTKPath resolves the path to the nvidia-ctk binary.
// This executable is used in hooks and needs to be an absolute path.
// If the path is specified as an absolute path, it is used directly
// without checking for existence of an executable at that path.
//
// Deprecated: Use ResolveNVIDIACDIHookPath directly instead.
func ResolveNVIDIACTKPath(logger logger.Interface, nvidiaCTKPath string) string {
	return resolveWithDefault(
		logger,
		"NVIDIA Container Toolkit CLI",
		nvidiaCTKPath,
		nvidiaCTKDefaultFilePath,
	)
}

// ResolveNVIDIACDIHookPath resolves the path to the nvidia-cdi-hook binary.
// This executable is used in hooks and needs to be an absolute path.
// If the path is specified as an absolute path, it is used directly
// without checking for existence of an executable at that path.
func ResolveNVIDIACDIHookPath(logger logger.Interface, nvidiaCDIHookPath string) string {
	if filepath.Base(nvidiaCDIHookPath) == "nvidia-ctk" {
		return resolveWithDefault(
			logger,
			"NVIDIA Container Toolkit CLI",
			nvidiaCDIHookPath,
			nvidiaCTKDefaultFilePath,
		)
	}
	return resolveWithDefault(
		logger,
		"NVIDIA CDI Hook CLI",
		nvidiaCDIHookPath,
		nvidiaCDIHookDefaultFilePath,
	)
}

and then adding:

	//nolint:staticcheck  // TODO(elezar): We should swith the nvidia-container-runtime from using nvidia-ctk to using nvidia-cdi-hook.
	cfg.NVIDIACTKConfig.Path = config.ResolveNVIDIACTKPath(r.logger, cfg.NVIDIACTKConfig.Path)

where we call ResolveNVIDIACTKPath?

We can work on properly handling the config changes for the runtime in a follow-up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it. Change made and pushed.

// This executable is used in hooks and needs to be an absolute path.
// If the path is specified as an absolute path, it is used directly
// without checking for existence of an executable at that path.
func ResolveNVIDIACDIHookPath(logger logger.Interface, nvidiaCDIHookPath string) string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In responding to cncf-tags/container-device-interface#203 (comment) I had the thought that the concept of a CDI Hook executable could be standardized so that the nvidia-cdi-hook is a vendor-specific implementation. Does that mean that we should use cdiHookPath instead of using nvidiaCDIHookPath for our variables and config values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it matters all that much. Until such time as that actually happens, might as well get this in and iterate on it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. It can definitely be done as a follow-up.

if filepath.Base(nvidiaCDIHookPath) == "nvidia-ctk" {
return resolveWithDefault(
logger,
"NVIDIA Container Toolkit CLI",
nvidiaCDIHookPath,
nvidiaCTKDefaultFilePath,
)
}
return resolveWithDefault(
logger,
"NVIDIA CDI Hook CLI",
nvidiaCDIHookPath,
nvidiaCDIHookDefaultFilePath,
)
}

// ResolveNVIDIAContainerRuntimeHookPath resolves the path the nvidia-container-runtime-hook binary.
func ResolveNVIDIAContainerRuntimeHookPath(logger logger.Interface, nvidiaContainerRuntimeHookPath string) string {
return resolveWithDefault(
Expand Down
Loading