Skip to content

printf: fix mingw-w64 format checks, turn C99 format strings into macro#14643

Closed
vszakats wants to merge 32 commits intocurl:masterfrom
vszakats:mingw-w64-7-global-fix-test
Closed

printf: fix mingw-w64 format checks, turn C99 format strings into macro#14643
vszakats wants to merge 32 commits intocurl:masterfrom
vszakats:mingw-w64-7-global-fix-test

Conversation

@vszakats
Copy link
Member

@vszakats vszakats commented Aug 22, 2024

Change mingw-w64 printf format checks to use __MINGW_PRINTF_FORMAT
instead of gnu_printf, to sync the format checker with the format
string macros published via curl/system.h. (Also disable format checks
for mingw-w64 older than 3.0.0 (2013-09-20) and classic-mingw, which
do not support this macro.)

This fixes bogus format checker -Wformat warnings in 3rd party code
using curl format strings with the curl printf functions, when using
mingw-w64 7.0.0 (2019-11-10) and older (with GCC, MSVCRT).

It also allows to delete two workaounds for this within curl itself:

But, __MINGW_PRINTF_FORMAT creates a new problem, where the C99 %z
(size_t) format string is no longer accepted by the format checker
when using mingw-w64 7.0.0 (2019-11-10) and earlier, which default to
ms_printf format string style.

Fix this by adding new, internal, size_t format string macros, and
replace %z uses across the codebase with them.

The macro adapts to mingw-w64 and other build environments
automatically.

After this patch the curl/system.h format string macros, the format
checker style set in curl/mprintf.h (and lib/curl_setup.h), and the
format strings used across the source code always match.

Also:

  • lib/ws.c: add missing space to a string.
  • fix to use standard printf in the example ftpgetinfo.

Verification script:

: check if format string translations were accurate:
curl -LO https://github.com/curl/curl/pull/14643.diff
grep '^-' 14643.diff > 14643-out.diff
grep '^+' 14643.diff > 14643-in.diff
grep -E -o '%[0-9]*z[xud]' 14643-out.diff > 14643-out-fmt.diff
grep -E -o '%[0-9]*" CURL_FORMAT_[XS]?SIZE_T' 14643-in.diff > 14643-in-fmt.diff
paste 14643-in-fmt.diff 14643-out-fmt.diff | sort | uniq -c > 14643-FORMATS.txt
: check if strings remained intact after reflow:
grep -E '^(\+|@)' 14643.diff | grep -E -o '("[^"]+"|^\+\+\+.+|^@@)' | tr '\n' '\t' | sed -E -e 's/\"\t\"//g' -e 's/\+\+\+ b/--- a/g'       | tr '\t' '\n' > 14643-in.diff
grep -E '^(-|@)'  14643.diff | grep -E -o '("[^"]+"|^---.+|^@@)'    | tr '\n' '\t' | sed -E -e 's/\"\t\"//g' -e 's/(%[0-9]*)z(x|u|d)/\1/g' | tr '\t' '\n' > 14643-out.diff
diff -u 14643-out.diff 14643-in.diff > 14643-STRINGS.diff

@vszakats vszakats marked this pull request as draft August 22, 2024 08:55
@vszakats vszakats added the build label Aug 22, 2024
@vszakats
Copy link
Member Author

vszakats commented Aug 22, 2024

OK, so at first try this revealed (via new -Wformat warnings) existing code using
CURL_FORMAT_CURL_OFF_T[U] with CRT printf() functions:
https://github.com/curl/curl/actions/runs/10504973605/job/29101585339?pr=14643#step:12:158

/cc @bagder

@bagder
Copy link
Member

bagder commented Aug 22, 2024

Clearly the format warning on those mingw compiler versions is not functional for how we do things.

@vszakats
Copy link
Member Author

Another approach to try is this:

--- a/include/curl/mprintf.h
+++ b/include/curl/mprintf.h
@@ -36,9 +36,9 @@ extern "C" {
   defined(__IAR_SYSTEMS_ICC__)) &&                                      \
   defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) &&         \
   !defined(CURL_NO_FMT_CHECKS)
-#if defined(__MINGW32__) && !defined(__clang__)
+#if defined(__MINGW32__)
 #define CURL_TEMP_PRINTF(fmt, arg) \
-  __attribute__((format(gnu_printf, fmt, arg)))
+  __attribute__((format(__MINGW_PRINTF_FORMAT, fmt, arg)))
 #else
 #define CURL_TEMP_PRINTF(fmt, arg) \
   __attribute__((format(printf, fmt, arg)))

I'll try it once fixed the stray standard printfs in http/client.

@bagder
Copy link
Member

bagder commented Aug 22, 2024

The CURL_FORMAT_ defines are meant to work with any printf(). We typically discourage applications from using the curl provided printf functions.

@vszakats
Copy link
Member Author

vszakats commented Aug 22, 2024

The CURL_FORMAT_ defines are meant to work with any printf(). We typically discourage applications from using the curl provided printf functions.

Ah OK, got it. It throws out the original idea here, but the format-spec patch might be a way to go that's compatible with it. Basically we need to make sure the formats and format-checker are in sync with each other and with standard printf, while curl printf accepts any format flavour anyway.

Also perhaps introducing curl printf into example was a mistake in the previous PR, and should be reverted. (...along with the fprintf() converstions I'm doing now in this PR.)

@vszakats
Copy link
Member Author

The initial idea checks out at this point of the PR. Now turning onto the other solution.

@vszakats vszakats force-pushed the mingw-w64-7-global-fix-test branch 2 times, most recently from bf78f32 to 01624a2 Compare August 22, 2024 10:06
@vszakats vszakats changed the title build: try curl/system.h solution for mingw-w64 <= 7.0.0 mingw: revise printf format checks Aug 22, 2024
@vszakats vszakats added the Windows Windows-specific label Aug 22, 2024
@vszakats
Copy link
Member Author

vszakats commented Aug 22, 2024

With the clean approach we bump into another problem: curl internally uses the %z format (for size_t* and curl_socket_t), which doesn't fit the __MINGW_PRINTF_FORMAT spec with mingw-w64 v7 (2019-11-10) and earlier:
https://github.com/curl/curl/actions/runs/10506559918/job/29106453350?pr=14643

(This might have been a reason why I didn't go with that after testing it in #12489)

One clean (IMO), but laborous fix is to swap %z with a macro that's set to work for all mingw versions. And adjust CURL_FORMAT_SOCKET_T.

The upside is that this only touches internals, not the public headers.

@vszakats vszakats force-pushed the mingw-w64-7-global-fix-test branch from 2b5b229 to ca506e6 Compare August 22, 2024 15:41
@vszakats
Copy link
Member Author

@bagder: After a series of tests, the only way I see to replace
the build hack in place now, is to swap %zd and %zu within
the source to the newly added CURL_FORMAT_[S]SIZE_T.

This brings the format-check, the format specifiers, and the
source code in sync with each other, regardless of mingw
version. It also gives a tool to handle it with any other compiler
that has issues with C99 format specifiers.

(To selectively disable format checks for old mingw-w64
versions, we'd need to include _mingw.h to the public curl
headers. I wouldn't want to take that risk. This header may
alter the compiler env, and in a public header this is an issue.)

Do you think I should go ahead with this?

@vszakats vszakats force-pushed the mingw-w64-7-global-fix-test branch from ca506e6 to 2a112c6 Compare August 26, 2024 23:26
@vszakats vszakats marked this pull request as ready for review August 26, 2024 23:49
@vszakats
Copy link
Member Author

As discussed, I replaced all %z format strings with new CURL_FORMAT_[X|S]SIZE_T macros.

@vszakats
Copy link
Member Author

vszakats commented Aug 27, 2024

Script to verify that format-string replacements were done correctly:

curl -LO https://github.com/curl/curl/pull/14643.diff
grep '^-' 14643.diff > 14643-out.diff
grep '^+' 14643.diff > 14643-in.diff
grep -E -o '%[0-9]*z(x|u|d)' 14643-out.diff > 14643-out-fmt.diff
grep -E -o '%[0-9]*" CURL_FORMAT_(X|S)?SIZE_T' 14643-in.diff > 14643-in-fmt.diff
paste 14643-in-fmt.diff 14643-out-fmt.diff | sort | uniq -c > 14643-match.txt

The 493 changes checked out OK:

 340 %" CURL_FORMAT_SIZE_T	%zu
 141 %" CURL_FORMAT_SSIZE_T	%zd
   7 %" CURL_FORMAT_XSIZE_T	%zx
   5 %04" CURL_FORMAT_XSIZE_T	%04zx

@vszakats
Copy link
Member Author

vszakats commented Aug 27, 2024

Script to verify if the strings remained the same after this patch:

curl -LO https://github.com/curl/curl/pull/14643.diff
grep -E '^(\+|@)' 14643.diff | grep -E -o '("[^"]+"|^\+\+\+.+|^@@)' | tr $'\n' $'\t' | sed -E -e 's/\"\t\"//g' -e 's/\+\+\+ b/--- a/g'       | tr $'\t' $'\n' > in
grep -E '^(-|@)'  14643.diff | grep -E -o '("[^"]+"|^---.+|^@@)'    | tr $'\n' $'\t' | sed -E -e 's/\"\t\"//g' -e 's/(%[0-9]*)z(x|u|d)/\1/g' | tr $'\t' $'\n' > out
diff -u out in > 14643-strings.diff

They did, except one places where I added missing space (plus code changes outside of strings):

--- out
+++ in
@@ -1,9 +1,7 @@
 --- a/configure.ac
 @@
-"$curl_cv_native_windows$have_mingw"
 --- a/docs/examples/CMakeLists.txt
 @@
-"__USE_MINGW_ANSI_STDIO=1"
 --- a/docs/examples/Makefile.am
 @@
 --- a/docs/examples/ftpgetinfo.c
@@ -85,10 +83,10 @@
 "Included max number of cookies (%) in request!"
 --- a/lib/curl_setup.h
 @@
-"lldllucurl/mprintf.h"
 @@
+"curl/mprintf.h"
 @@
-"zd"
+"xudzxzuzd"
 --- a/lib/cw-out.c
 @@
 "cw_out, wrote % %s bytes -> %"
@@ -485,7 +483,7 @@
 @@
 "WS: decode ending with % frame bytes remaining"
 @@
-"WS: starting new frame with % bytes from last oneremaining to be sent"
+"WS: starting new frame with % bytes from last one remaining to be sent"
 @@
 "WS, using chunk size %"
 @@
@@ -531,7 +529,6 @@
 "Bad variable name length (%), skipping"
 --- a/tests/http/clients/CMakeLists.txt
 @@
-"__USE_MINGW_ANSI_STDIO=1"
 --- a/tests/http/clients/Makefile.am
 @@
 --- a/tests/libtest/lib1156.c

@vszakats vszakats changed the title mingw: revise printf format checks printf: fix mingw-w64 format checks and make C99 format strings a macro Aug 27, 2024
@vszakats vszakats changed the title printf: fix mingw-w64 format checks and make C99 format strings a macro printf: fix mingw-w64 format checks, make C99 format strings a macro Aug 27, 2024
@vszakats vszakats changed the title printf: fix mingw-w64 format checks, make C99 format strings a macro printf: fix mingw-w64 format checks, turn C99 format strings into macro Aug 27, 2024
@vszakats vszakats force-pushed the mingw-w64-7-global-fix-test branch from ba3265c to 56d9b0f Compare August 28, 2024 12:38
@jan2000
Copy link
Contributor

jan2000 commented Aug 31, 2024

As discussed, I replaced all %z format strings with new CURL_FORMAT_[X|S]SIZE_T macros.

Might I ask where this is discussed? Because I fail to see the pros of this change vs the cons. This seem like a step backward for code legibility. A workaround for an older runtime over the whole code base, while there already is a less impactful workaround in place.

@vszakats
Copy link
Member Author

vszakats commented Aug 31, 2024

Format masks/checks are a long time issue (and not a fun one at that). Despite spending quite a bit of time finding a workaround for making it work everywhere on all platforms (in particular all mingw-w64 versions), I have found none that doesn't have other downsides (like including _mingw.h in a public curl header).

This and referenced PRs (and #14703 made after this PR) have/had plenty of tries in sub commits with just as much dead-ends.

edit: The current solution is build hacks for examples and tests/http/clients. That however doesn't solve this problem for anyone trying to use the public headers in the problematic envs.

The root issue is using C99 format strings in a C89 project.

Certainly the clean solution here is "annoying" (just like the existing solution for 64-bit formats).

If you know a better one that makes pass all cases and still offers format checks, I guess everyone would be happy to use it as a better alternative.

@bagder
Copy link
Member

bagder commented Aug 31, 2024

The root issue is using C99 format strings in a C89 project.

I would perhaps go a little further and say:

The root issue is using C99 format strings in a C89 project - and asking the compiler to verify the printf formatting.

@bagder
Copy link
Member

bagder commented Aug 31, 2024

We could also just switch off the printf checking and save ourselves from a lot of trouble...

@jan2000
Copy link
Contributor

jan2000 commented Aug 31, 2024

Ultimately the problem is using C99 format strings in a C89 project.

Is this the problem though? If the C99 format is parsed by C89 code this seems perfectly fine to me...

and asking the compiler to verify the printf formatting.

This indeed seems more like the problem, to verify with compilers that are incompatible with the format strings used.

We could also just switch off the printf checking and save ourselves from a lot of trouble...

I would agree this would be a far better choice (for incompatible compilers) than to decrease the code legibility all over the place.

@bagder
Copy link
Member

bagder commented Aug 31, 2024

@vszakats we could probably switch if off for all mingw compilers without that being too complicated

@vszakats
Copy link
Member Author

vszakats commented Aug 31, 2024

Switching it off for all mingw is not complicated, but in that case we'll lose checks for that platform completely.

It basically means that formats will no longer be checked on Windows at all, and also that Windows-specific strings (or Windows-specific uses of strings) will no longer be checked at all. Returning to pre-PR #12489 where these fallouts were fixed after adding format checks for the first time for these.

I don't think that's a useful way to go.

@vszakats
Copy link
Member Author

Ultimately the problem is using C99 format strings in a C89 project.

Is this the problem though? If the C99 format is parsed by C89 code this seems perfectly fine to me...

C99 format is or isn't parsed perfectly depending on Windows compiler vendor, compiler version, CRT, format check configuration. Windows is a weird place.

@bagder
Copy link
Member

bagder commented Aug 31, 2024

It basically means that formats will no longer be checked on Windows at all

Right, but that's not a terribly big deal IMHO. We have not checked the format for long (meaning that we can survive pretty well without them) and most flags can still be checked fine on other platforms.

I don't think that's a useful way to go.

I'm beginning to think it is. Right now that seems like a lower price than this format change.

@jan2000
Copy link
Contributor

jan2000 commented Aug 31, 2024

C99 format is or isn't parsed perfectly depending on Windows compiler vendor, compiler version, CRT, format check configuration. Windows is a weird place.

The C99 format is parsed in curl by the code in mprintf.c, right? Then it does not depend at all on those things.

It basically means that formats will no longer be checked on Windows at all, and also that Windows-specific strings (or Windows-specific uses of strings) will no longer be checked at all.

Does mingw-w64 >= v8.0.0 have a compatible format checker? Maybe it can be kept on with those versions without having to do a format change.

@vszakats
Copy link
Member Author

vszakats commented Sep 1, 2024

C99 format is or isn't parsed perfectly depending on Windows compiler vendor, compiler version, CRT, format check configuration. Windows is a weird place.

The C99 format is parsed in curl by the code in mprintf.c, right? Then it does not depend at all on those things.

Correct, the actual parsing doesn't depend on it. However the compiler's checker can only check for the standard it can understand. It doesn't know it's curl's printf, which understands everything.

It basically means that formats will no longer be checked on Windows at all, and also that Windows-specific strings (or Windows-specific uses of strings) will no longer be checked at all.

Does mingw-w64 >= v8.0.0 have a compatible format checker? Maybe it can be kept on with those versions without having to do a format change.

This is what I tried first in this PR. It works. But to check for the mingw-w64 version, curl's public header must include mingw-w64's internal header _mingw.h which sets its version macro. Pulling that in implicitly to all code using libcurl seems ...risky. _mingw.h has side effects like setting a default Windows target version. That's actually the reason why we include it to curl's own code. For other code it might be undesired.

(Notice that mingw-w64 GCC might not be the sole target affected by issues around this. mingw-w64 just happens to be the first "problematic" target where we (I) enabled format checks. We haven't tried with the more popular MSVC.)

@vszakats
Copy link
Member Author

vszakats commented Sep 1, 2024

I gave #14703 another go and managed to make it work (though CI haven't fully finished yet).

The idea is to override the format check in public curl/mprintf.h with curl internal ones while compiling curl itself.

This makes it possible to sync public format checks with public format strings in curl/system.h.
Which fixes format checks for 3rd-party code using libcurl. Also examples and tests/http/clients, allowing to drop build-level hacks.

Internal format checks continue to use GNU-style, which supports %z with all mingw-w64 versions.

The downside is that this is still a compiler-specific hack (albeit the best looking one so far). It relies on the mingw-w64 feature that allows controlling format check style, adapting to C99 code internally, while sticking to whatever the compiler default is externally.

As for this PR, I agree these format macros are long and not pleasing to type or read (true for existing 64-bit ones too).
@icing already added nicer aliases for the latter: CURL_PRId64 / CURL_PRIu64. I wonder if reception would be more positive if using a friendlier name for size_t, such as CURL_zd / CURL_zu? Food for thought in case we ever want to go full portable.

@vszakats vszakats marked this pull request as draft September 2, 2024 08:56
@vszakats
Copy link
Member Author

vszakats commented Sep 2, 2024

Closing this one in favour of #14703.

@vszakats vszakats closed this Sep 2, 2024
@vszakats vszakats deleted the mingw-w64-7-global-fix-test branch September 2, 2024 16:10
vszakats added a commit that referenced this pull request Sep 2, 2024
Change mingw-w64 printf format checks in public curl headers to use
`__MINGW_PRINTF_FORMAT` instead of `gnu_printf`. This syncs the format
checker with format string macros published via `curl/system.h`. (Also
disable format checks for mingw-w64 older than 3.0.0 (2013-09-20) and
classic-mingw, which do not support this macro.)

This fixes bogus format checker `-Wformat` warnings in 3rd party code
using curl format strings with the curl printf functions, when using
mingw-w64 7.0.0 (2019-11-10) and older (with GCC, MSVCRT).

It also allows to delete two workaounds for this within curl itself:
- setting `-D__USE_MINGW_ANSI_STDIO=1` for mingw-w64 via cmake and
  configure for `docs/examples` and `tests/http/clients`.
  Ref: c730c85 #14640

The format check macro is incompatible (depending on mingw-w64 version
and configuration) with the C99 `%z` (`size_t`) format string used
internally by curl.

To work around this problem, override the format check style in curl
public headers to use `gnu_printf`. This is compatible with `%z` in all
mingw-w64 versions and allows keeping the C99 format strings internally.

Also:
- lib/ws.c: add missing space to an error message.
- docs/examples/ftpgetinfo.c: fix to use standard printf.

Ref: #14643 (take 1)
Follow-up to 3829759 #12489

Closes #14703
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants