fix(cli): hide diffs of mangled unicode strings#25525
fix(cli): hide diffs of mangled unicode strings#25525
Conversation
CloudFormation's `GetStackTemplate` irrecoverably mangles any character not in the 7-bit ASCII range. This causes noisy output from `cdk diff` when a template contains non-English languages or emoji. We can detect this case and consider these strings equal. *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
aws-cdk-automation
left a comment
There was a problem hiding this comment.
The pull request linter has failed. See the aws-cdk-automation comment below for failure reasons. If you believe this pull request should receive an exemption, please comment and provide a justification.
A comment requesting an exemption should contain the text Exemption Request. Additionally, if clarification is needed add Clarification Request to a comment.
|
Exemption Request - This change only affects the CLI output and therefore an integration test isn't possible. I'm happy to spend more time documenting or working through other prerequisites but I do want to make sure that, conceptually, this change is one that the team will accept. I have also included more information about the benefits and tradeoffs of this change in the original PR message. |
|
@skinny85 any chance I could get some eyes on this? |
|
@laverdet unfortunately, I'm no longer on the CDK team, so I can't help with the PR 🙂. |
|
@corymhall Is there any chance you or someone on the team could take a quick look at this? It is a 3-line CLI-only QOL improvement and I'm a little confused as to why I haven't gotten any communication from the team in the 2 months since I opened #24557. If this is not something the team wants then that's ok but right now I'm caught in an awkward holding pattern. |
|
|
||
| test('mangled strings', () => { | ||
| expect(deepEqual('foo', 'foo')).toBeTruthy(); | ||
| expect(deepEqual('????', '文字化け')).toBeTruthy(); |
There was a problem hiding this comment.
Equality, being an equivalence relation, should be reflexive, symmetric and transitive. In particular, for the symmetric property, the following should hold:
deepEqual('????', '文字化け') deepEqual('文字化け', '????')
which is not the case. The left-hand side is true and the right-hand side is false. But even if we fix this (and assuming transitivity also holds), we get the following nonsensical implication:
// Given that the following statements are true:
deepEqual('文字化け', '????')
deepEqual('????', '🤦🏻♂️')
// Then, by transitivity:
deepEqual('文字化け', '🤦🏻♂️')There was a problem hiding this comment.
@otaviomacedo thank you for the response.
I agree in principle that equality should be equality but we must confront the reality of CloudFormation's limitations. This is not a general purpose deep equality function and already contains special cases for CloudFormation (see: DependsOn).
const left = {
DependsOn: ['a', 'b'],
};
const right = {
DependsOn: ['c', 'd'],
};
expect(deepEqual(left, right)).toBeTruthy(); // passes (bug?)I don't think we should let the notions of ideal mathematical purity get in the way of fixing a very real and plainly observable issue in CDK. GetStackTemplate is not reflexive in relation to CreateChangeSet and therefore this function must operate under the same imperfect framework. Viewing this function under any other lens is incorrect in the same way applying euclidean geometry on a non-euclidean surface is incorrect.
Here lvalue is "what CloudFormation has given us" and rvalue is "what the developer intended." Under this exotic framework, I believe that the implementation is sound.
|
This PR has been in the CHANGES REQUESTED state for 3 weeks, and looks abandoned. To keep this PR from being closed, please continue work on it. If not, it will automatically be closed in a week. |
|
Please don't close it again. It is clearly not abandoned. |
|
@otaviomacedo - Could you kindly dismiss the automation error? The diff is not abandoned and I would like to not open a new PR tomorrow. |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
|
This PR has been deemed to be abandoned, and will be automatically closed. Please create a new PR for these changes if you think this decision has been made in error. |
|
The pull request linter fails with the following errors: PRs must pass status checks before we can provide a meaningful review. If you would like to request an exemption from the status checks or clarification on feedback, please leave a comment on this PR containing |
I am reopening this from #25525 and following up on my comments here: #24557 (comment) #24557 (comment) #25008 (comment) #25008 (comment) #25008 (comment) #25008 (comment) #25008 (comment) #25008 (comment) #25525 (comment) #25525 (comment) 🫠 #25525 (comment) 🫠 --- Fixes #25309 Fixes #22203 Fixes #20212 Fixes #13634 Fixes #10523 Fixes #10219 See also: aws-cloudformation/cloudformation-coverage-roadmap#1220 See also: aws-cloudformation/cloudformation-coverage-roadmap#814 --- 👻 I have retitled this PR as a `chore` instead of a `fix` because @aws-cdk-automation keeps closing my PRs as abandoned even though they are clearly not abandoned. > This PR has been deemed to be abandoned, and will be automatically closed. Please create a new PR for these changes if you think this decision has been made in error. --- @otaviomacedo @rix0rrr @TheRealAmazonKendra - I'm happy to adjust the approach, add more tests, or do what else needs to be done. I'm not getting any feedback from the team so I'm not sure how to proceed. The diff noise with non-ASCII information in cdk diff makes it difficult to find meaningful changes to our stacks. 🗿🗞️📬 **Crucially, this change only affects the CLI output and therefore an integration test isn't possible.** --- CloudFormation's `GetStackTemplate` irrecoverably mangles any character not in the 7-bit ASCII range. This causes noisy output from `cdk diff` when a template contains non-English languages or emoji. We can detect this case and consider these strings equal. *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license* Many AWS services accept non-ASCII input, eg many "description" fields. CloudFormation will correctly dispatch these templates but when invoking `GetStackTemplate` the result is mangled. This causes annoying noise in the output of `cdk diff`: ``` Resources [~] AWS::Lambda::Function Lambda/Resource └─ [~] Description ├─ [-] ????? └─ [+] 🤦🏻♂️ ``` This change modifies the diff algorithm to consider the string equal if the lvalue is a mangled version of the rvalue. Of course this runs the risk of hiding changesets which modify only a single non-ASCII character to another non-ASCII character, but these fields already tend to be informative in nature.
I am reopening this from #25008
and following up on my comments here:
#24557 (comment)
#24557 (comment)
#25008 (comment)
#25008 (comment)
#25008 (comment)
#25008 (comment)
#25008 (comment)
#25008 (comment)
@aws-cdk-automation @rix0rrr @TheRealAmazonKendra - I'm happy to adjust the approach, add more tests, or do what else needs to be done. I'm not getting any feedback from the team so I'm not sure how to proceed. The diff noise with non-ASCII information in cdk diff makes it difficult to find meaningful changes to our stacks.
🗿🗞️📬 Crucially, this change only affects the CLI output and therefore an integration test isn't possible.
CloudFormation's
GetStackTemplateirrecoverably mangles any character not in the 7-bit ASCII range. This causes noisy output fromcdk diffwhen a template contains non-English languages or emoji. We can detect this case and consider these strings equal.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license
Many AWS services accept non-ASCII input, eg many "description" fields. CloudFormation will correctly dispatch these templates but when invoking
GetStackTemplatethe result is mangled. This causes annoying noise in the output ofcdk diff:This change modifies the diff algorithm to consider the string equal if the lvalue is a mangled version of the rvalue.
Of course this runs the risk of hiding changesets which modify only a single non-ASCII character to another non-ASCII character, but these fields already tend to be informative in nature.
fixes #25309