talm upgrade extracts image from node body, not values.yaml #176

@lexfrei

Problem

talm upgrade extracts the target installer image from the node-body patch (machine.install.image in nodes/<name>.yaml), not from the top-level image: key in values.yaml. Operators who change image: in values.yaml expecting the upgrade to use that image instead see the upgrade run against a different image, with no warning.

This diverges from talm apply / talm template, which do honor values.yaml overlays.

Reproduction

dev17 OCI cluster, running cozystack/talos:v1.12.6:

  1. values.yaml: image: "ghcr.io/siderolabs/installer:v1.13.0" (operator's intent: upgrade to 1.13).
  2. nodes/node0.yaml still carries the historical machine.install.image: ghcr.io/cozystack/cozystack/talos:v1.12.6.
  3. talm upgrade -f nodes/node0.yaml.

Wrapper code at pkg/commands/upgrade_handler.go:104-119:

configBundle, machineType, err := engine.FullConfigProcess(eopts, patches)
result, err := engine.SerializeConfiguration(configBundle, machineType)
config, err := configloader.NewFromBytes(result)
image := config.Machine().Install().Image()

engine.FullConfigProcess(eopts, patches) calls InitializeConfigBundle, which generates a default Talos config via machinery's generate.NewInput (NOT a chart render), then ApplyPatches(loadedPatches=[nodeFile]). The chart's values.yaml is never read, so Install().Image() reflects whatever machine.install.image the node body declares, falling back to the machinery default when the node body declares none.

apply and template use engine.Render() against the chart, which DOES read values.yaml. Two code paths, two different sources of truth for the same value.
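The divergence between the two paths can be illustrated with a toy model. This is only a sketch: the inputs struct and both functions are hypothetical and do not exist in talm, and the chart path is reduced to "returns whatever values.yaml says", ignoring any patching that apply/template may do afterwards.

```go
package main

import "fmt"

// inputs is a hypothetical stand-in for the three places an installer
// image can come from. None of these names exist in talm.
type inputs struct {
	valuesImage      string // top-level image: key in values.yaml
	nodeBodyImage    string // machine.install.image in nodes/<name>.yaml
	machineryDefault string // image from the generated default Talos config
}

// upgradePathImage models the upgrade path as described in this issue:
// start from the generated defaults, then apply the node file as a patch.
// valuesImage is never consulted.
func upgradePathImage(in inputs) string {
	image := in.machineryDefault
	if in.nodeBodyImage != "" {
		image = in.nodeBodyImage
	}
	return image
}

// chartPathImage models apply/template only to the extent that the chart
// render reads values.yaml. This is a simplification, not the real render.
func chartPathImage(in inputs) string {
	return in.valuesImage
}

func main() {
	in := inputs{
		valuesImage:      "ghcr.io/siderolabs/installer:v1.13.0",
		nodeBodyImage:    "ghcr.io/cozystack/cozystack/talos:v1.12.6",
		machineryDefault: "<machinery default>", // placeholder, real value not shown here
	}
	fmt.Println("upgrade path:", upgradePathImage(in))
	fmt.Println("chart path:  ", chartPathImage(in))
}
```

With the values from the reproduction, the two functions return different images, which is the mismatch reported here.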

Expected

Either:

  1. talm upgrade uses the same render path as apply/template, picking up values.yaml.image and only falling back to node-body machine.install.image when the chart didn't render one.
  2. Or talm upgrade documents the divergence loudly and emits a warning: line when the node-body image diverges from the chart's values.yaml.image.
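A minimal sketch of how the two options could combine: prefer the chart-rendered image, fall back to the node-body image, and produce a warning: line when they differ. resolveUpgradeImage is a hypothetical helper, not an existing talm function, and it assumes both images have already been extracted as plain strings (empty meaning "not set").

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// resolveUpgradeImage is a hypothetical helper, not part of talm.
// chartImage is the image rendered from values.yaml ("" if the chart
// rendered none); nodeImage is machine.install.image from the node body
// ("" if absent). It returns the image to upgrade to and an optional
// warning describing a divergence between the two sources.
func resolveUpgradeImage(chartImage, nodeImage string) (image, warning string, err error) {
	switch {
	case chartImage == "" && nodeImage == "":
		return "", "", errors.New("no installer image in values.yaml or node body")
	case chartImage == "":
		// Chart did not render an image: fall back to the node body.
		return nodeImage, "", nil
	case nodeImage != "" && nodeImage != chartImage:
		// Both set and different: values.yaml wins, but say so loudly.
		warning = fmt.Sprintf(
			"warning: node-body machine.install.image %q differs from values.yaml image %q; using values.yaml",
			nodeImage, chartImage)
		return chartImage, warning, nil
	default:
		return chartImage, "", nil
	}
}

func main() {
	image, warning, err := resolveUpgradeImage(
		"ghcr.io/siderolabs/installer:v1.13.0",
		"ghcr.io/cozystack/cozystack/talos:v1.12.6",
	)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if warning != "" {
		fmt.Fprintln(os.Stderr, warning)
	}
	fmt.Println(image)
}
```

Returning the warning as a value rather than printing it inside the helper keeps the precedence rule testable without capturing stderr. Whether values.yaml or the node body should win is a design decision for the maintainers; the sketch picks values.yaml only because option 1 above does.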

Why this matters

Verified on dev17: talm upgrade -f nodes/node0.yaml with values.yaml image: siderolabs/installer:v1.13.0 and node-body cozystack/talos:v1.12.6 ran against cozystack/talos:v1.12.6, a no-op same-version upgrade. The operator's values.yaml change had zero effect. Operator footgun severity: HIGH on fleets where values.yaml is the single source of truth.

Surfaced during Phase 2C real-Talos validation (see #175 / PR #173 context).

Labels: area/commands, area/upgrade, kind/bug, priority/important-soon, triage/accepted
