Skip to content

jsonformat: Tolerate incorrect paths in plan relevant_attributes#2988

Merged
apparentlymart merged 1 commit intomainfrom
b-relevantattrs-incorrect-through-dynamic
Jul 3, 2025
Merged

jsonformat: Tolerate incorrect paths in plan relevant_attributes#2988
apparentlymart merged 1 commit intomainfrom
b-relevantattrs-incorrect-through-dynamic

Conversation

@apparentlymart
Copy link
Copy Markdown
Contributor

@apparentlymart apparentlymart commented Jul 2, 2025

The code for matching relevant_attributes against resource_drift entries (a part of the heuristic for deciding whether to show "changes outside of OpenTofu" in the human-oriented plan UI) was previously assuming that paths in resource_drift would always be valid for the associated resource instance object values because in most cases the language runtime will detect invalid references and so fail to generate a plan at all.

However, when the reference is to something within a dynamically-typed argument (such as the manifest in kubernetes_manifest) and when it appears only as an argument to either the try or can functions (so the dynamic error is intentionally suppressed) the language runtime can't catch it and so the incorrect reference will leak out into relevant_attributes, thereby violating assumptions made by the path matcher.

Instead then, we'll continue the existing precedent that this "relevant attributes" mechanism is a best-effort heuristic that prefers to succeed with an incomplete result rather than to fail, extending that to the traversals in the plan renderer which will now treat incorrectly-typed steps as not matching rather than causing OpenTofu to crash completely.

Since a reference to something that doesn't exist cannot succeed it also cannot possibly actually contribute directly to the final result of the expression it appeared in, so in practice it should be okay to disregard these invalid references for the purposes of deciding which changes outside of OpenTofu seem likely to have caused the actions that OpenTofu is proposing to make during the apply phase.


This problem was originally reported as a provider bug in hashicorp/terraform-provider-helm#1657. Although it was a change to the provider that caused existing valid references to become incorrect after upgrading, it's OpenTofu CLI's responsibility to describe that situation correctly without crashing, and so that's what this change aims to achieve.

Since this is arguably a pretty rare situation that arises only when a few different components interact together in a specific way, I'm ambivalent about whether this justifies a v1.10 backport. I've left this without a changelog entry for now so we can discuss that risk/reward tradeoff, but if we ultimately decide not to backport it then I'll add another commit here to introduce the changelog entry on the main branch rather than on the v1.10 branch.

The code for matching relevant_attributes against resource_drift entries
(a part of the heuristic for deciding whether to show "changes outside of
OpenTofu" in the human-oriented plan UI) was previously assuming that paths
in resource_drift would always be valid for the associated resource
instance object values because in most cases the language runtime will
detect invalid references and so fail to generate a plan at all.

However, when the reference is to something within a dynamically-typed
argument (such as the manifest in kubernetes_manifest) and when it appears
only as an argument to either the "try" or "can" functions (so the dynamic
error is intentionally suppressed) the language runtime can't catch it
and so the incorrect reference will leak out into relevant_attributes,
thereby violating assumptions made by the path matcher.

Instead then, we'll continue the existing precedent that this "relevant
attributes" mechanism is a best-effort heuristic that prefers to succeed
with an incomplete result rather than to fail, extending that to the
traversals in the plan renderer which will now treat incorrectly-typed
steps as not matching rather than causing OpenTofu to crash completely.

Since a reference to something that doesn't exist cannot succeed it also
cannot possibly _actually_ contribute directly to the final result of the
expression it appeared in, so in practice it should be okay to disregard
these invalid references for the purposes of deciding which changes outside
of OpenTofu seem likely to have caused the actions that OpenTofu is
proposing to make during the apply phase.

Signed-off-by: Martin Atkins <mart@degeneration.co.uk>
@apparentlymart apparentlymart self-assigned this Jul 2, 2025
@apparentlymart apparentlymart requested a review from a team as a code owner July 2, 2025 18:31
@opentofu opentofu deleted a comment from github-actions Bot Jul 2, 2025
@cam72cam
Copy link
Copy Markdown
Member

cam72cam commented Jul 2, 2025

I'm in favor of a backport. It's a unlikely situation granted, but it's still a clearly visible bug to users.

@apparentlymart
Copy link
Copy Markdown
Contributor Author

Cool, okay! In that case, I'll leave this without a changelog entry and will then add the changelog entry in the subsequent backlog PR, so that it ends up on the changelog of the correct branch.

Copy link
Copy Markdown
Contributor

@yottta yottta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test in plan_test.go makes this clearer. Thanks a lot for it.
LGTM

@cam72cam
Copy link
Copy Markdown
Member

cam72cam commented Jul 3, 2025

It looks like HC is working on a change to HCL to work around this issue: hashicorp/hcl#763

@apparentlymart
Copy link
Copy Markdown
Contributor Author

apparentlymart commented Jul 3, 2025

That HCL change does address a proximate cause of the problem reported against the hashicorp/helm provider in hashicorp/terraform-provider-helm#1657: in that case, v3 of the provider changed the static schema of helm_release so that metadata is now a single object rather than a list of objects, and so there is some type information available that HCL could potentially rely on but it was previously ignoring that information when encountering unknown values.

Making use of that schema information means that (once upgraded to a newer version of HCL with that fix) OpenTofu would catch the invalid use of [0] on the non-list value in that helm_release case in particular during the planning phase, returning an error about it.

However, it doesn't address the root cause: the code that populates RelevantAttributes is inherently imprecise and best-effort because it relies only on the hcl.Expression.Variables result rather than on hcl.Expression.Value. The integration test I added here illustrates the worst-case scenario where the incorrect index operator is applied to an attribute of cty.DynamicPseudoType, thereby defeating OpenTofu's attempt at static validation and forcing it to be checked only at runtime, and then using try to intentionally ignore the result of that runtime check. The test case I added in plan_test.go here still fails even when using the patch from hashicorp/hcl#763, for that reason.

The HCL fix is not wrong though: it complements the root-cause fix I made here by adding some additional cases where the language runtime will be able to reject an expression as invalid before trying to analyze the references in it. I do think we should also adopt that HCL patch once it's been included in a release, but we won't need to backport it to the v1.10 branch because this PR will prevent that situation from causing a crash. (Backporting seems a little risky because it subtly changes HCL's treatment of an invalid expression in a way that might affect existing uses of try and can, though I think that's pretty marginal.)

@apparentlymart apparentlymart merged commit 647ea6a into main Jul 3, 2025
17 checks passed
@apparentlymart apparentlymart deleted the b-relevantattrs-incorrect-through-dynamic branch July 3, 2025 15:50
apparentlymart added a commit that referenced this pull request Jul 3, 2025
Signed-off-by: Martin Atkins <mart@degeneration.co.uk>
apparentlymart added a commit that referenced this pull request Jul 3, 2025
Signed-off-by: Martin Atkins <mart@degeneration.co.uk>
cam72cam pushed a commit that referenced this pull request Jul 14, 2025
Signed-off-by: Martin Atkins <mart@degeneration.co.uk>
Signed-off-by: Christian Mesh <christianmesh1@gmail.com>
cam72cam pushed a commit that referenced this pull request Jul 14, 2025
Signed-off-by: Martin Atkins <mart@degeneration.co.uk>
Signed-off-by: Christian Mesh <christianmesh1@gmail.com>
cam72cam pushed a commit that referenced this pull request Jul 14, 2025
Signed-off-by: Martin Atkins <mart@degeneration.co.uk>
Signed-off-by: Christian Mesh <christianmesh1@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants