Skip to content

fix(ssh): use TOFU known_hosts for interactive SSH sessions#653

Merged
ArangoGutierrez merged 1 commit intoNVIDIA:mainfrom
ArangoGutierrez:fix/ssh-host-key-consistency
Feb 13, 2026
Merged

fix(ssh): use TOFU known_hosts for interactive SSH sessions#653
ArangoGutierrez merged 1 commit intoNVIDIA:mainfrom
ArangoGutierrez:fix/ssh-host-key-consistency

Conversation

@ArangoGutierrez
Copy link
Collaborator

Summary

  • Replace StrictHostKeyChecking=no with accept-new + holodeck's known_hosts file
  • Interactive SSH sessions now verify host keys consistently with the Go SSH TOFU pattern

Audit Finding

Changes

  • cmd/cli/ssh/ssh.go: Use UserKnownHostsFile pointing to holodeck cache, StrictHostKeyChecking=accept-new

Test plan

  • gofmt — no formatting issues
  • go build — compiles
  • go test ./pkg/... — all tests pass

Interactive SSH used StrictHostKeyChecking=no, completely bypassing
host key verification. Use holodeck's known_hosts file with
accept-new for consistency with the Go SSH TOFU pattern.

Audit finding NVIDIA#31 (LOW).

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Copilot AI review requested due to automatic review settings February 12, 2026 20:10
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses an audit finding by improving interactive SSH host key verification to follow a TOFU (trust-on-first-use) approach, aligning interactive system ssh behavior with Holodeck’s Go SSH TOFU pattern.

Changes:

  • Replaces StrictHostKeyChecking=no with StrictHostKeyChecking=accept-new for interactive system SSH.
  • Configures interactive system SSH to use a Holodeck-specific known_hosts file under the user cache directory.

Comment on lines +176 to +178
cacheBase, _ := os.UserCacheDir()
knownHostsPath := filepath.Join(cacheBase, "holodeck", "known_hosts")

Copy link

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

os.UserCacheDir() error is ignored here, and this also bypasses the CLI --cachepath setting (and instances.NewManager default). If UserCacheDir fails or a custom cache path is used, the interactive ssh session will read/write host keys in an unexpected location (or relative to CWD). Consider deriving knownHostsPath from m.cachePath when set (same base as instance cache), and return an error when the cache directory cannot be determined.

Suggested change
cacheBase, _ := os.UserCacheDir()
knownHostsPath := filepath.Join(cacheBase, "holodeck", "known_hosts")
var knownHostsPath string
if m.cachePath != "" {
// When a custom cache path is configured, keep known_hosts alongside the instance cache.
knownHostsPath = filepath.Join(m.cachePath, "known_hosts")
} else {
cacheBase, err := os.UserCacheDir()
if err != nil {
return fmt.Errorf("failed to determine user cache directory: %w", err)
}
knownHostsPath = filepath.Join(cacheBase, "holodeck", "known_hosts")
}
// Ensure the directory for the known_hosts file exists.
if err := os.MkdirAll(filepath.Dir(knownHostsPath), 0o755); err != nil {
return fmt.Errorf("failed to create known_hosts directory: %w", err)
}

Copilot uses AI. Check for mistakes.
Comment on lines +177 to +182
knownHostsPath := filepath.Join(cacheBase, "holodeck", "known_hosts")

args := []string{
"-i", keyPath,
"-o", "StrictHostKeyChecking=no",
"-o", "UserKnownHostsFile=/dev/null",
"-o", "StrictHostKeyChecking=accept-new",
"-o", fmt.Sprintf("UserKnownHostsFile=%s", knownHostsPath),
Copy link

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before calling system ssh, ensure the parent directory for knownHostsPath exists (e.g., os.MkdirAll(filepath.Dir(knownHostsPath), 0700)). Otherwise OpenSSH will fail to create/update the file when the holodeck cache directory hasn't been created yet (notably when using a non-default cache location).

Copilot uses AI. Check for mistakes.
@coveralls
Copy link

Pull Request Test Coverage Report for Build 21962538360

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 47.501%

Totals Coverage Status
Change from base Build 21955389842: 0.0%
Covered Lines: 2500
Relevant Lines: 5263

💛 - Coveralls

@ArangoGutierrez ArangoGutierrez merged commit ef1f8c2 into NVIDIA:main Feb 13, 2026
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants