[tmpnet] Refactor runtime configuration in preparation for kube #3867

Merged
merged 6 commits into from
Apr 14, 2025

@maru-ava maru-ava commented Apr 8, 2025

PR Chain: tmpnet+kube

This PR chain enables tmpnet to deploy temporary networks to Kubernetes. Early PRs refactor tmpnet to support the addition in #3615 of a new tmpnet node runtime for kube.

Why this should be merged

Adding the kube runtime for tmpnet requires being able to configure it via CLI args. This PR refactors the existing runtime configuration so that the kube runtime configuration can be introduced alongside it.

How this works

How this was tested

CI, manual use of tmpnetctl

Need to be documented in RELEASES.md?

N/A

TODO

@maru-ava maru-ava added the testing This primarily focuses on testing label Apr 8, 2025
@maru-ava maru-ava self-assigned this Apr 8, 2025
@github-project-automation github-project-automation bot moved this to Backlog 🗄️ in avalanchego Apr 8, 2025
@maru-ava maru-ava moved this from Backlog 🗄️ to In Progress 🏗 in avalanchego Apr 8, 2025
@maru-ava maru-ava changed the title [tmpnet] Refactor runtime handling to enable kube extension [tmpnet] Refactor runtime configuration in preparation for kube Apr 8, 2025
Base automatically changed from tmpnet-use-content-keys to master April 9, 2025 04:09
@maru-ava maru-ava force-pushed the tmpnet-refactor-runtime-config branch from 6a78622 to ca512a2 Compare April 9, 2025 06:24
@maru-ava maru-ava changed the base branch from master to tmpnet-add-network-reference-to-node April 9, 2025 06:24
@maru-ava maru-ava force-pushed the tmpnet-refactor-runtime-config branch 6 times, most recently from 4983079 to b824686 Compare April 9, 2025 06:35
if err := n.EnsureNodeConfig(node); err != nil {
	return err
}
if err := node.Write(); err != nil {
	return err
}

// Check the VM binaries after EnsureNodeConfig to ensure node.RuntimeConfig is non-nil
if err := checkVMBinaries(log, n.Subnets, node.RuntimeConfig.AvalancheGoPath, pluginDir); err != nil {
Contributor Author

Moved this check to the node

@@ -298,17 +284,6 @@ func (n *Network) Create(rootDir string) error {
}
n.Dir = canonicalDir

// Ensure the existence of the plugin directory or nodes won't be able to start.
pluginDir, err := n.GetPluginDir()
Contributor Author

Moved this to node start

@maru-ava maru-ava force-pushed the tmpnet-refactor-runtime-config branch from b824686 to ff2f225 Compare April 9, 2025 06:44
@@ -220,15 +218,6 @@ func (n *Network) EnsureDefaultConfig(log logging.Logger, avalancheGoPath string
n.DefaultFlags = FlagsMap{}
}

// Only configure the plugin dir with a non-empty value to ensure
Contributor Author

Plugin dir is now only ever set in node flags rather than as a network default flag

@maru-ava maru-ava force-pushed the tmpnet-refactor-runtime-config branch 2 times, most recently from 77b1a04 to 7db5b52 Compare April 9, 2025 06:58
@maru-ava maru-ava moved this from In Progress 🏗 to In Review 👀 in avalanchego Apr 9, 2025
@maru-ava maru-ava marked this pull request as ready for review April 9, 2025 07:00
Contributor

@Copilot Copilot AI left a comment

Copilot reviewed 16 out of 16 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (2)

tests/fixture/tmpnet/network.go:1039

  • The PluginDir field is used directly in forming the VM binary path. Consider adding a validation to ensure PluginDir is not empty (or providing a default) to prevent constructing an invalid path.
vmPath := filepath.Join(config.PluginDir, chain.VMID.String())

tests/antithesis/compose.go:76

  • The removal of avalancheGoPath and pluginDir parameters in initBootstrapDB and the subsequent call to BootstrapNewNetwork could lead to misconfiguration in environments that expect these values. Consider ensuring that default values or validations are in place to maintain correct behavior.
if err := initBootstrapDB(network, bootstrapVolumePath); err != nil {
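
A minimal sketch of the validation the first suppressed comment asks for, assuming the plugin directory and VM ID are available as plain strings. The helper `vmBinaryPath` and the sentinel `errEmptyPluginDir` are hypothetical names introduced for illustration and are not part of tmpnet.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
)

// errEmptyPluginDir is a hypothetical sentinel used only in this sketch.
var errEmptyPluginDir = errors.New("plugin dir must not be empty")

// vmBinaryPath is a hypothetical helper that refuses to build a VM binary
// path from an empty plugin directory, since filepath.Join would otherwise
// silently return a path relative to the working directory.
func vmBinaryPath(pluginDir, vmID string) (string, error) {
	if pluginDir == "" {
		return "", errEmptyPluginDir
	}
	return filepath.Join(pluginDir, vmID), nil
}

func main() {
	// An empty plugin dir is rejected up front.
	if _, err := vmBinaryPath("", "exampleVMID"); err != nil {
		fmt.Println("rejected:", err)
	}

	// A non-empty plugin dir yields a joined path.
	path, err := vmBinaryPath("plugins", "exampleVMID")
	fmt.Println(path, err)
}
```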

Base automatically changed from tmpnet-add-network-reference-to-node to master April 10, 2025 22:07
case processRuntime:
processRuntimeConfig, err := v.processRuntimeVars.getProcessRuntimeConfig()
if err != nil {
return nil, err
Contributor

nit: when supporting more runtimes (kube), it might be useful to mention which runtime we tried to get, in case it's misconfigured:

Suggested change
- return nil, err
+ return nil, fmt.Errorf("for runtime %s: %w", processRuntime, err)

Contributor Author

Done
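
The suggestion in this thread can be sketched as follows, assuming the runtime is identified by a string. The `processRuntime` value, the helper names, and `errMissingPath` are stand-ins chosen for illustration; the real tmpnet types and signatures may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// processRuntime mirrors the case label in the reviewed snippet; its type
// and value here are assumptions.
const processRuntime = "process"

// errMissingPath is a hypothetical underlying configuration error.
var errMissingPath = errors.New("avalanchego path is required")

// getProcessRuntimeConfig stands in for the real lookup and fails when no
// path is supplied.
func getProcessRuntimeConfig(path string) (string, error) {
	if path == "" {
		return "", errMissingPath
	}
	return path, nil
}

// getRuntimeConfig wraps the underlying error with the runtime name so a
// misconfiguration reports which runtime was being resolved.
func getRuntimeConfig(runtime, path string) (string, error) {
	switch runtime {
	case processRuntime:
		cfg, err := getProcessRuntimeConfig(path)
		if err != nil {
			return "", fmt.Errorf("for runtime %s: %w", runtime, err)
		}
		return cfg, nil
	default:
		return "", fmt.Errorf("unsupported runtime %q", runtime)
	}
}

func main() {
	_, err := getRuntimeConfig(processRuntime, "")
	fmt.Println(err)                            // for runtime process: avalanchego path is required
	fmt.Println(errors.Is(err, errMissingPath)) // true
}
```

Because `%w` is used, callers can still match the underlying error with `errors.Is` while the message names the runtime.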

Comment on lines +52 to +54
IsEphemeral bool `json:",omitempty"`
Flags FlagsMap `json:",omitempty"`
RuntimeConfig *NodeRuntimeConfig `json:",omitempty"`
Contributor

Are these omitempty tags necessary, or are they an extra addition to the PR?

Contributor Author

It's secondary cleanup related to how runtime configuration is serialized.
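
To illustrate the effect of the `omitempty` tags on serialization, here is a small sketch with simplified stand-in types. `Flags` is modeled as a plain map and `ProcessRuntimeConfig` carries a single assumed field; the real tmpnet structs have more fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified stand-ins for the tmpnet types discussed in the review.
type ProcessRuntimeConfig struct {
	AvalancheGoPath string
}

type NodeRuntimeConfig struct {
	Process *ProcessRuntimeConfig
}

type Node struct {
	IsEphemeral   bool               `json:",omitempty"`
	Flags         map[string]string  `json:",omitempty"`
	RuntimeConfig *NodeRuntimeConfig `json:",omitempty"`
}

// marshal serializes a node and panics on failure to keep the sketch short.
func marshal(n Node) string {
	b, err := json.Marshal(n)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// With omitempty, a false bool, a nil map, and a nil pointer are dropped.
	fmt.Println(marshal(Node{})) // {}

	// A populated runtime config is serialized under its field name.
	n := Node{RuntimeConfig: &NodeRuntimeConfig{
		Process: &ProcessRuntimeConfig{AvalancheGoPath: "/tmp/avalanchego"},
	}}
	fmt.Println(marshal(n)) // {"RuntimeConfig":{"Process":{"AvalancheGoPath":"/tmp/avalanchego"}}}
}
```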

Comment on lines 55 to +57
type NodeRuntimeConfig struct {
	Process *ProcessRuntimeConfig
}
Contributor

So the aim is to then have a Process and Kube field right?
Could/Should we instead have an interface here, with the process implementation and a future kube implementation? 🤔
Maybe I'm missing the point completely though, in which case, my apologies 😉

Contributor Author

It's a struct, it has fields, and there are no fields in common between process and kube to suggest an interface that would be common between them. There is an interface for the runtime, which covers the methods common to both types.
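
A rough sketch of the split described in this reply: configuration stays a struct with one optional field per runtime, while behavior sits behind a shared interface. `KubeRuntimeConfig`, its fields, the `NodeRuntime` interface, and `runtimeFor` are hypothetical names for illustration, since the kube runtime is not part of this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// ProcessRuntimeConfig holds process-only settings (fields assumed).
type ProcessRuntimeConfig struct {
	AvalancheGoPath   string
	ReuseDynamicPorts bool
}

// KubeRuntimeConfig is hypothetical; it shares no fields with the process config.
type KubeRuntimeConfig struct {
	Namespace string
	Image     string
}

// NodeRuntimeConfig carries at most one populated runtime configuration.
type NodeRuntimeConfig struct {
	Process *ProcessRuntimeConfig
	Kube    *KubeRuntimeConfig
}

// NodeRuntime is a hypothetical interface for behavior common to all runtimes.
type NodeRuntime interface {
	Start() string
}

type processNodeRuntime struct{ cfg *ProcessRuntimeConfig }

func (r processNodeRuntime) Start() string { return "exec " + r.cfg.AvalancheGoPath }

type kubeNodeRuntime struct{ cfg *KubeRuntimeConfig }

func (r kubeNodeRuntime) Start() string { return "create pod in " + r.cfg.Namespace }

// runtimeFor selects the runtime implementation from whichever field is set.
func runtimeFor(c NodeRuntimeConfig) (NodeRuntime, error) {
	switch {
	case c.Process != nil && c.Kube != nil:
		return nil, errors.New("only one runtime may be configured")
	case c.Process != nil:
		return processNodeRuntime{cfg: c.Process}, nil
	case c.Kube != nil:
		return kubeNodeRuntime{cfg: c.Kube}, nil
	default:
		return nil, errors.New("no runtime configured")
	}
}

func main() {
	rt, err := runtimeFor(NodeRuntimeConfig{
		Process: &ProcessRuntimeConfig{AvalancheGoPath: "/bin/avalanchego"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(rt.Start()) // exec /bin/avalanchego
}
```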

Contributor

@felipemadero felipemadero left a comment

LGTM. Thanks for this. Preapproving with just a couple of nits/questions.

@@ -518,7 +479,8 @@ func (n *Network) StartNode(ctx context.Context, log logging.Logger, node *Node)

// Restart a single node.
func (n *Network) RestartNode(ctx context.Context, log logging.Logger, node *Node) error {
- if node.RuntimeConfig.ReuseDynamicPorts {
+ runtimeConfig := node.getRuntimeConfig()
+ if runtimeConfig.Process != nil && runtimeConfig.Process.ReuseDynamicPorts {
Contributor

Q: Couldn't ReuseDynamicPorts be attempted on kube? I wonder if this setting can be separated from the kind of runtime.

Contributor Author

'Reuse' implies the use of dynamic ports in the first place, but dynamic ports are not compatible with kube because the kubelet needs a fixed port to perform readiness checks against. Reuse is also unnecessary there: a kube pod gets its own IP, so there is no potential for port clashes with other nodes deployed as pods.
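
The nil-guarded check from the diff above can be sketched in isolation. The types are simplified stand-ins, and `shouldReusePorts` is a hypothetical helper name; the point is only that the setting lives on the process configuration and is treated as off when that configuration is absent.

```go
package main

import "fmt"

// Simplified stand-ins for the tmpnet types in the diff.
type ProcessRuntimeConfig struct {
	ReuseDynamicPorts bool
}

type NodeRuntimeConfig struct {
	Process *ProcessRuntimeConfig
}

// shouldReusePorts reports whether dynamic ports should be reused on restart.
// It is false whenever no process runtime is configured, which would be the
// case for a node using a different runtime such as kube.
func shouldReusePorts(c NodeRuntimeConfig) bool {
	return c.Process != nil && c.Process.ReuseDynamicPorts
}

func main() {
	fmt.Println(shouldReusePorts(NodeRuntimeConfig{}))                                                        // false
	fmt.Println(shouldReusePorts(NodeRuntimeConfig{Process: &ProcessRuntimeConfig{ReuseDynamicPorts: true}})) // true
}
```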

@@ -37,8 +37,9 @@ const (
var (
AvalancheGoPluginDirEnvName = config.EnvVarName(config.EnvPrefix, config.PluginDirKey)

errNodeAlreadyRunning = errors.New("failed to start node: node is already running")
errNotRunning = errors.New("node is not running")
errNodeAlreadyRunning = errors.New("failed to start node: node is already running")
Contributor

Might it be desirable for tmpnet, as a library, to export errors?

Contributor Author

I tend to prefer YAGNI, and I'm not getting any requests to export these errors. The driver for exporting is more likely to be implementing e2e testing of tmpnet itself, which is a near-term goal to ensure behavior that isn't implicitly tested as part of avalanchego's e2e suite.

@maru-ava maru-ava added this pull request to the merge queue Apr 14, 2025
Merged via the queue into master with commit d6ce7ec Apr 14, 2025
24 checks passed
@maru-ava maru-ava deleted the tmpnet-refactor-runtime-config branch April 14, 2025 16:55
@github-project-automation github-project-automation bot moved this from In Review 👀 to Done ✅ in avalanchego Apr 14, 2025
Labels
testing This primarily focuses on testing
Projects
Archived in project
4 participants