Skip to content

Fix various loss of data warnings#1762

Merged
Dead2 merged 1 commit intozlib-ng:developfrom
carlossanlop:FixLossOfDataBugs
Aug 16, 2024
Merged

Fix various loss of data warnings#1762
Dead2 merged 1 commit intozlib-ng:developfrom
carlossanlop:FixLossOfDataBugs

Conversation

@carlossanlop
Copy link
Copy Markdown
Contributor

@carlossanlop carlossanlop commented Jul 25, 2024

Fixes #1760

Fixed in the dotnet/runtime repo copy of zlib-ng: dotnet/runtime#105433

WD4242 and WD4244 are compiler warnings that should not be suppressed because the warn about possible loss of data.

  • WD4242 shows up in /arch/*/slide_hash*.c files and comes from the arguments passed to the slide_hash_chain method.
  • WD4244 happens in Windows when building in Debug configuration, in various deflate*.c files, and comes from the arguments passed to the check_match method.
  • Also fixed a couple additional failures that showed up after upgrading to v2.2.1.

Fixed by:

  • Adding asserts to verify the values are below the maximum allowed for their type.
  • Casting them the proper type before passing them as arguments to their methods.

@Dead2
Copy link
Copy Markdown
Member

Dead2 commented Jul 29, 2024

The datatype changes in some of the slide_hash functions would need to be checked for performance impacts.

As a side note, wsize cannot be bigger than 32768 without breaking with the Deflate format (Deflate64 however can be ~64k, but we don't support that). So we could in theory make s->window_size an uint_16 as well and avoid some checking, although that would have to be investigated as it could present challenges other places, and could end up costing us performance instead.

What do you think @nmoinvaz?

@codecov
Copy link
Copy Markdown

codecov bot commented Jul 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.92%. Comparing base (2c801bd) to head (281dee5).
Report is 2 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1762      +/-   ##
===========================================
- Coverage    82.99%   82.92%   -0.08%     
===========================================
  Files          135      135              
  Lines        10292    10299       +7     
  Branches      2784     2700      -84     
===========================================
- Hits          8542     8540       -2     
- Misses        1053     1064      +11     
+ Partials       697      695       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mtl1979
Copy link
Copy Markdown
Collaborator

mtl1979 commented Jul 30, 2024

unsigned short and unsigned char are slower than unsigned int. That's why we originally changed from using Pos in some places to using unsigned int. It also helped with aligning struct members. The issue with non-inlined function is that either compiler does the casting of function parameter implicitly or the developer does it explicitly.

Copy link
Copy Markdown
Collaborator

@mtl1979 mtl1979 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@nmoinvaz
Copy link
Copy Markdown
Member

nmoinvaz commented Aug 2, 2024

Changing s->window_size to uint32_t might be a good optimization.

@KungFuJesus
Copy link
Copy Markdown
Collaborator

Changing s->window_size to uint32_t might be a good optimization.

At the cost of halving the throughput of slide_hash, though.

@nmoinvaz
Copy link
Copy Markdown
Member

nmoinvaz commented Aug 5, 2024

Sorry, I meant uint16_t, I was responding to @Dead2.

@nmoinvaz
Copy link
Copy Markdown
Member

nmoinvaz commented Aug 5, 2024

These commits should be squashed and commit message cleaned up, maybe use the same message as the name & description of this PR.

@KungFuJesus
Copy link
Copy Markdown
Collaborator

Sorry, I meant uint16_t, I was responding to @Dead2.

Ah. Yeah making it a 16 bit field would at least eliminate the branch at the top of the function. And it'd shrink the struct to boot. So long as we aren't violating API compatibility I don't see a reason not to do that.

@carlossanlop
Copy link
Copy Markdown
Contributor Author

carlossanlop commented Aug 9, 2024

@nmoinvaz @Dead2 @mtl1979 - I'm updating dotnet/runtime to 2.2.1 and I found a few more loss of data bugs. Before squashing as requested by @nmoinvaz, I would like to include the fixes for the new data loss issues in this same PR.

Here is the PR updating dotnet/runtime to 2.2.1 dotnet/runtime#105771 . The errors are:

C:\Users\calope\source\repos\runtime\src\native\external\zlib-ng\deflate.c(207): 
    warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)
C:\Users\calope\source\repos\runtime\src\native\external\zlib-ng\arch\generic\crc32_braid_c.c(195): 
    error C4242: '=': conversion from 'z_word_t' to 'uint32_t', possible loss of data
C:\Users\calope\source\repos\runtime\src\native\external\zlib-ng\deflate_p.h(81): 
    error C4242: '=': conversion from 'uint32_t' to 'uint16_t', possible loss of data
C:\Users\calope\source\repos\runtime\src\native\external\zlib-ng\deflate_p.h(82): 
    error C4242: '=': conversion from 'uint32_t' to 'unsigned char', possible loss of data

Now I'm wondering: Is zlib-ng might be suppressing C4242 and C4334? Why aren't these errors being caught in this repo? It's outside the scope of this PR (meant for only fixing the build warnings/errors), and I don't have the expertise or time to apply and verify such changes.

@carlossanlop carlossanlop requested a review from mtl1979 August 9, 2024 22:58
@carlossanlop
Copy link
Copy Markdown
Contributor Author

Changing s->window_size to uint32_t might be a good optimization.

Sorry, I meant uint16_t, I was responding to @Dead2.

Ah. Yeah making it a 16 bit field would at least eliminate the branch at the top of the function. And it'd shrink the struct to boot. So long as we aren't violating API compatibility I don't see a reason not to do that.

Can someone else please take care of these improvements after this PR gets merged?

@carlossanlop
Copy link
Copy Markdown
Contributor Author

I squashed all the commits, and I included extra fixes from similar problems found in v2.2.1 coming from crc32_braid_c.c, deflate_p.h and deflate.c.

@mtl1979
Copy link
Copy Markdown
Collaborator

mtl1979 commented Aug 10, 2024

Sorry, I meant uint16_t, I was responding to @Dead2.

Ah. Yeah making it a 16 bit field would at least eliminate the branch at the top of the function. And it'd shrink the struct to boot. So long as we aren't violating API compatibility I don't see a reason not to do that.

It will alter the alignment of the following struct members, so extra padding is required. It will also slow down access a lot as I already mentioned. In some of the supported architectures copying less than 32 bits to a register needs explicit zeroing of the unused bits.

@KungFuJesus
Copy link
Copy Markdown
Collaborator

KungFuJesus commented Aug 10, 2024

It will alter the alignment of the following struct members, so extra padding is required. It will also slow down access a lot as I already mentioned. In some of the supported architectures copying less than 32 bits to a register needs explicit zeroing of the unused bits.

So move where it's declared in the struct? I'm sure there are several 8 bit fields or other 16 bit fields to put it near that could result in the same sized struct or even smaller. Inflating the padding up to 7 bytes doesn't really seem like nearly as much a net loss as removing a branch for this function every time is a gain.

@mtl1979
Copy link
Copy Markdown
Collaborator

mtl1979 commented Aug 10, 2024

It will alter the alignment of the following struct members, so extra padding is required. It will also slow down access a lot as I already mentioned. In some of the supported architectures copying less than 32 bits to a register needs explicit zeroing of the unused bits.

So move where it's declared in the struct? I'm sure there are several 8 bit fields or other 16 bit fields to put it near that could result in the same sized struct or even smaller. Inflating the padding up to 7 bytes doesn't really seem like nearly as much a net loss as removing a branch for this function every time is a gain.

The structs were originally optimized for 32-bit pointers, so things get ugly if we want the layout be clean for both 32-bit and 64-bit targets. In worst case we need only 4 bytes of padding.

@Dead2
Copy link
Copy Markdown
Member

Dead2 commented Aug 10, 2024

I did a quick test with just changing s->window_size to be uint16_t, but at least on my x86-64 test this resulted in a noticeable reduction in performance and an increased code size. So that probably means it increased the need for casts elsewhere, and it could perhaps be fixed by finding those regressions and improve the related code.

@carlossanlop This is certainly not something we expect you to do. We just sometimes have ideas for further/deeper improvements and we encourage freely sharing and discussing those. It can make the PR discussions a little more cluttered, but usually it is ok.

This PR is looking pretty good so far, but after the last changes it throws an error

[ 17%] Building C object CMakeFiles/zlib.dir/deflate.c.o
In file included from /home/runner/work/zlib-ng/zlib-ng/deflate.c:52:
/home/runner/work/zlib-ng/zlib-ng/deflate_p.h:82:19: error: use of undeclared identifier 'UCHAR_MAX'
    Assert(len <= UCHAR_MAX, "len should fit in unsigned char");
                  ^
1 error generated.

@KungFuJesus
Copy link
Copy Markdown
Collaborator

KungFuJesus commented Aug 10, 2024

I did a quick test with just changing s->window_size to be uint16_t, but at least on my x86-64 test this resulted in a noticeable reduction in performance and an increased code size. So that probably means it increased the need for casts elsewhere, and it could perhaps be fixed by finding those regressions and improve the related code.

That's maybe a little surprising but it might have to do with the fact that going from a 32 bit value to a 16 bit is virtually free on x86 but in the opposite direction it requires an explicit move as 16 bit writes I think leave the second half of a register alone. I suspect you're correct that in changing every access to be 16 bit will fix that.

@carlossanlop
Copy link
Copy Markdown
Contributor Author

@carlossanlop This is certainly not something we expect you to do. We just sometimes have ideas for further/deeper improvements and we encourage freely sharing and discussing those. It can make the PR discussions a little more cluttered, but usually it is ok.

Noted, and thank you for the clarification. Then feel free to discuss all you want, hope the final decisions bring fruitful results!

This PR is looking pretty good so far, but after the last changes it throws an error

[ 17%] Building C object CMakeFiles/zlib.dir/deflate.c.o
In file included from /home/runner/work/zlib-ng/zlib-ng/deflate.c:52:
/home/runner/work/zlib-ng/zlib-ng/deflate_p.h:82:19: error: use of undeclared identifier 'UCHAR_MAX'
    Assert(len <= UCHAR_MAX, "len should fit in unsigned char");
                  ^
1 error generated.

@Dead2 Where are you seeing this error? In the dotnet/runtime repo, the PR CI passed without failures, and the build also passed in my local machine. We do have the warnings enabled in that repo.

Related question: are we suppressing the warnings in this repo somewhere? I am assuming that whatever I fix and push here, won't show up as a warning or error in the local build or in the CI, hence why I kept catching them in the dotnet/runtime repo. What cmake change should I make to make sure we get those build warnings and errors from now on?

@mtl1979
Copy link
Copy Markdown
Collaborator

mtl1979 commented Aug 12, 2024

I did a quick test with just changing s->window_size to be uint16_t, but at least on my x86-64 test this resulted in a noticeable reduction in performance and an increased code size. So that probably means it increased the need for casts elsewhere, and it could perhaps be fixed by finding those regressions and improve the related code.

That's maybe a little surprising but it might have to do with the fact that going from a 32 bit value to a 16 bit is virtually free on x86 but in the opposite direction it requires an explicit move as 16 bit writes I think leave the second half of a register alone. I suspect you're correct that in changing every access to be 16 bit will fix that.

As I said before, only 32-bit operations zero the upper bits automatically, explaining why 32-bit operations have smaller performance penalty than 8- and 16-bit operations... Also, some operations will widen implicitly if used with narrower than 32 bits. Shifts will widen to largest possible bit width that can handle the whole expression, not just operands of the shift itself.

@carlossanlop
Copy link
Copy Markdown
Contributor Author

Last remaining suggestion addressed. If this is ready, feel free to approve and merge.

@carlossanlop carlossanlop changed the title Fix loss of data warnings when calling check_match and slide_hash_chain Fix various loss of data warnings Aug 12, 2024
@carlossanlop
Copy link
Copy Markdown
Contributor Author

And hopefully we can consider enabling the warnings so they do show up in this repo. It could become too time consuming (and even unsafe) to keep addressing the issues in the dotnet/runtime repo every time we upgrade (weeks or months after a new release comes out), and then have to come back here to submit the upstream fix.

@mtl1979
Copy link
Copy Markdown
Collaborator

mtl1979 commented Aug 12, 2024

And hopefully we can consider enabling the warnings so they do show up in this repo. It could become too time consuming (and even unsafe) to keep addressing the issues in the dotnet/runtime repo every time we upgrade (weeks or months after a new release comes out), and then have to come back here to submit the upstream fix.

We currently use /W3 for Visual C++, we don't treat any warnings as errors... It will be easy to make a pull request that treats some common warnings as errors to avoid introducing regressions. Just enabling certain warnings will result in most of developers just ignoring them as they don't read the CI logs if the build passes.

@carlossanlop
Copy link
Copy Markdown
Contributor Author

Just enabling certain warnings will result in most of developers just ignoring them as they don't read the CI logs if the build passes.

Yep. This is one of the reasons why we now treat all warnings as errors in the dotnet/runtime repo.

…s due to data type mismatch in various files.
@mtl1979
Copy link
Copy Markdown
Collaborator

mtl1979 commented Aug 14, 2024

Just enabling certain warnings will result in most of developers just ignoring them as they don't read the CI logs if the build passes.

Yep. This is one of the reasons why we now treat all warnings as errors in the dotnet/runtime repo.

I've been developer for over 40 years... I'm not afraid to say aloud things that I think help make certain project look more professional. That doesn't mean I always suggest the best course of action or that I don't make mistakes... The whole point of open-source community is that there is always other people to review changes and suggest improvements. zlib-ng pretty much started as one-man project, but after few years it made the stock zlib look like something written by child in kindergarten. Last few months there has been a lot of activity in stock zlib, but that doesn't change the fact that it was unmaintained for over 5 years.

@carlossanlop
Copy link
Copy Markdown
Contributor Author

That doesn't mean I always suggest the best course of action or that I don't make mistakes... The whole point of open-source community is that there is always other people to review changes and suggest improvements

I fully appreciate what you guys have achieved with zlib-ng. The impact of this project has been impressive to say the least. I became such a big fan of zlib-ng when I discovered it that I brought it to the attention of the .NET team so we would consider switching to it instead of stock zlib, and it makes me very happy to say that we are finally going to include it in our next release.

I'll try to help with improvements whenever possible, and for the things I am unable to help with (due to lack of time or knowledge), I will always make sure to mention it in the comments or open issues so other maintainers become aware.

Copy link
Copy Markdown
Member

@Dead2 Dead2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Dead2 Dead2 merged commit 3da40c2 into zlib-ng:develop Aug 16, 2024
@carlossanlop carlossanlop deleted the FixLossOfDataBugs branch August 16, 2024 17:30
This was referenced Sep 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Windows error C4242: 'function': Conversion from 'unsigned int' to 'Pos', possible loss of data

5 participants