Skip to content
This repository was archived by the owner on Sep 30, 2025. It is now read-only.

Fix overescaping of quote to fix warning with gawk 5.0.1#70

Merged
pbrisbin merged 2 commits intopbrisbin:masterfrom
aragon999:fix/overescaping-in-awk
Sep 3, 2019
Merged

Fix overescaping of quote to fix warning with gawk 5.0.1#70
pbrisbin merged 2 commits intopbrisbin:masterfrom
aragon999:fix/overescaping-in-awk

Conversation

@aragon999
Copy link
Contributor

Although I just noticed that you will be sunsetting aurget, I want to open this pull request, to fix an awk warning from the update of gawk to version 5.0.1.

I understand the reasoning why you are sunsetting aurget (especially since you do not use it yourself), my reason to use it, is because of its simplicity and I never had any issues :-) just for your information.
Thank you very much for your work so far.

@restyled-io
Copy link

restyled-io bot commented Aug 30, 2019

Hey there-

I'm a bot, here to let you know that some code in this PR might not
match the team's automated styling. I ran the team's auto-reformatting tools on
the files changed in this PR and found some differences. Those differences can
be seen in #71.

Please see that Pull Request's description for more details.

@pbrisbin
Copy link
Owner

Thank you very much! I do hope that aurget can remain useful to folks in its current state, and as much as I can merge fixes such as this to ensure that, I'm happy to.

This fix looks good to me, but would you be able to add a test case for it? I think adding a case to the search.t tests that searches for a package with quotes somewhere in its information should hit this path -- and show your warning in the output until this fix is applied.

@aragon999
Copy link
Contributor Author

Thank you for the quick reply, but I am not really sure about the tests here, or how you would do them. I initially thought how you would write a test for that. Unfortunately I do not really know cram (yet).
To trigger the error in the tests, one would need to update gawk to version 5.x.

The commit which is responsible for the behavior of awk, should be: https://git.savannah.gnu.org/cgit/gawk.git/commit/?id=3ddc932b0a808a4f5f55519ccf1d65f6caa41666

I.e. the functionality is not changed, but just a warning is added, both versions seem to work just fine, i.e.:

$ awk '{ gsub(/\\"/, "\""); print; print "" }'
\"
"

"
"

\\"
\"
$ awk '{ gsub(/\\\"/, "\""); print; print "" }'
awk: cmd:1: Warning: regexp escape sequence `\"' is not a known regexp operator
\"
"

"
"

\\"
\"

The warning is shown in every action where the search is used, i.e. aurget -Syu, aurget -Ss <pkg-name> and so on. Unfortunately I was not able to find a package with a quote.

@pbrisbin
Copy link
Owner

pbrisbin commented Aug 30, 2019

I initially thought how you would write a test for that. Unfortunately I do not really know cram (yet).

Did you look at the search.t test I mentioned? I had hoped that would be helpful and self-explanatory, so I tried not to overwhelm with too much up-front explanation.

Now that you asked, here are more details:

To test this, I would add a new case that searches for a package that we know includes a " somewhere in its info. I found the 3dsconv package by searching the AUR for "hi" (though I think I just got lucky, I don't think the search term was particularly good).

diff --git i/test/search.t w/test/search.t
index 42325e0..7253e73 100644
--- i/test/search.t
+++ w/test/search.t
@@ -79,3 +79,7 @@ Searching for info on chruby and aurget
   Votes       : 21
   Description : Changes the current ruby. Supports both zsh and bash.
   
+
+Searching for info with "s
+
+  $ aurget -Ssi --nocolor 3dsconv

If you checkout cram's documentation (it's short), I like to use the -i option:

% cram -i test/search.t
!
--- search.t
+++ search.t.err
@@ -83,3 +83,11 @@
 Searching for info with "s
 
   $ aurget -Ssi --nocolor 3dsconv
+  Repository  : aur
+  Name        : 3dsconv
+  Version     : 4.1-0
+  URL         : https://github.com/ihaveamac/3dsconv
+  Out of date : Yes
+  Votes       : 2
+  Description : Tool to convert Nintendo 3DS CTR Cart Image files (CCI, ".3ds") to the CTR Importable Archive format (CIA).
+  
Accept this change? [yN]

It my case, it seems I don't have the awk that warns, but I would guess that you'd see something more like this (from current master):

 Searching for info with "s
 
   $ aurget -Ssi --nocolor 3dsconv
+  awk: cmd:1: Warning: regexp escape sequence `\"' is not a known regexp operator
+  Repository  : aur
+  Name        : 3dsconv
+  Version     : 4.1-0
+  URL         : https://github.com/ihaveamac/3dsconv
+  Out of date : Yes
+  Votes       : 2
+  Description : Tool to convert Nintendo 3DS CTR Cart Image files (CCI, ".3ds") to the CTR Importable Archive format (CIA).
+  
Accept this change? [yN]

Now, you would Accept that, but then manually change the test to remove the warning since that's what you are trying to make happen. At that point, the test should fail.

Apply your fix and confirm it passes! In addition to asserting the warning is not seen, this test also asserts quotes are handled correctly generally.

NOTE: this will create a file at test/fixtures/curl. This file was recorded from the first time you ran the test and it actually invoked curl and talked to the AUR to perform the search, but subsequent runs will not actually call curl and would use the fixture instead -- you should commit this new directory along with your test.

Hope that made sense!

@aragon999
Copy link
Contributor Author

Yup, I looked at the tests, and especially the search.t (before submitting this PR I also ran the tests locally). I got the basic idea of them and how they work, in general this should not be the problem. Sorry I misstated that.

The problem is, that I am not sure how one want to test this particular issue. Since the warning is only shown when you have gawk >= 5.0.0, which is not the case for the docker images used, i.e. python:2.7. I updated Arch this morning, where the version of gawk was updated from 4.2.1 to 5.0.1. After the normal pacman update, I ran the AUR update, where I first noticed this warning.

I quickly looked through the python Docker images, the only distribution which should have yield the error is python:2.7-alpine3.10, see here: https://pkgs.alpinelinux.org/packages?name=gawk&branch=v3.10

All the other images still have some a version of gawk 4.x, for example Debian: https://packages.debian.org/search?keywords=gawk

The code, and therefore the warning is displayed for every search action performed, since the regex is applied no matter if there is a match or not (even aurget -Ss shows the warning).

Therefore I meant to really test this, one would need to also change the test setup and so on. Do you think this is a reasonable thing to do? If this is really needed, I will look into how to restructure the test setup. But this will take some days I guess. Or would one want to compile gawk on the docker image itself, and then compare the two versions?

The behavior itself is not changed, since the old version also works, i.e. the following is correctly displayed:

$ aurget -Ss 3dsconv  
awk: cmd:4: Warning: regexp escape sequence `\"' is not a known regexp operator
aur/3dsconv 4.1-0 [out of date]
    Tool to convert Nintendo 3DS CTR Cart Image files (CCI, ".3ds") to the CTR Importable Archive format (CIA).
aur/3dsconv-git r83.00a95c3-1 [out of date]
    Tool to convert Nintendo 3DS CTR Cart Image files (CCI, ".3ds") to the CTR Importable Archive format (CIA).

@pbrisbin
Copy link
Owner

pbrisbin commented Aug 30, 2019

Ah, gotcha. Sorry about the confusion.

Therefore I meant to really test this, one would need to also change the test setup and so on. Do you think this is a reasonable thing to do? If this is really needed, I will look into how to restructure the test setup

I don't think we need to get crazy rigorous here. And since you've identified complexity in doing so, I would actually recommend against it.

All I need to accept this PR is a test confirming this change doesn't break existing quote handling (regardless of awk version). The test I have in my previous comment does that. Please add it :)

It would be nice to have a test that actually validates your fix by ensuring warnings aren't visible in the newer awk. But as you said (and I only now realize), all existing tests are doing that already -- if/when run on newer awk. So IMO, we're all set. I don't feel a need to fight with CI on the matter.

@aragon999
Copy link
Contributor Author

I added the test case, as you suggested.

Furthermore I noticed #73 where a different approach was chosen, i.e. replacing the " with \x22. I am not sure which of the two solutions I would prefer, or which advantages the Hex code yields.

@aragon999 aragon999 force-pushed the fix/overescaping-in-awk branch 2 times, most recently from 347f082 to d28845e Compare September 2, 2019 10:50
@aragon999 aragon999 force-pushed the fix/overescaping-in-awk branch from d28845e to ca877df Compare September 2, 2019 11:03
@pbrisbin pbrisbin merged commit 9c06b96 into pbrisbin:master Sep 3, 2019
@pbrisbin
Copy link
Owner

pbrisbin commented Sep 3, 2019

Thanks! Going to merge this and then ask in the other PR if/why it might be better.

@aragon999 aragon999 deleted the fix/overescaping-in-awk branch September 3, 2019 12:53
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants