Do not hardcode char/bytes having 8 bits by tautschnig · Pull Request #917 · diffblue/cbmc

tautschnig · 2017-05-13T18:41:30Z

The C standard does not guarantee that char is exactly 8 bits, and there are DSPs (such as the Texas Instruments C55x) that do not have 8-bit bytes. Use the configuration value instead.

tautschnig · 2017-05-23T11:59:44Z

~~Marking do-not-merge as it includes a commit that overlaps with #955.~~

tautschnig · 2018-05-30T13:53:32Z

~~Marking do-not-merge as with the changes from #2246 the byte width should become part of byte_extract/byte_update.~~

thomasspriggs · 2021-03-15T10:56:29Z

@tautschnig I have just been taking a look at this PR as part of the effort to clean up the backlog of open PRs. You stated that this PR was marked "Do not merge" due to overlap with #955. However, #955 has now been merged. Does this mean that this PR can now be rebased and merged, or is the effort no longer worthwhile due to the amount of accumulated merge conflicts?

tautschnig · 2021-03-15T13:33:30Z

@thomasspriggs Thank you for undertaking a spring clean! This PR, however, is still of interest to me. I can't quite promise when I'll get to work on it, but I do intend to get this merged eventually.

tautschnig · 2021-04-03T17:49:29Z

Cleanup done, but the item noted above ("with the changes from #2246 the byte width should become part of byte_extract/byte_update") is still to be done.

codecov · 2021-04-03T19:27:04Z

Codecov Report

Base: 77.99% // Head: 74.12% // Decreases project coverage by -3.87% ⚠️

Coverage data is based on head (4c871ca) compared to base (99ce20c).
Patch coverage: 92.15% of modified lines in pull request are covered.

❗ Current head 4c871ca differs from pull request most recent head 9ab636f. Consider uploading reports for the commit 9ab636f to get more accurate results

Additional details and impacted files

@@             Coverage Diff             @@
##           develop     #917      +/-   ##
===========================================
- Coverage    77.99%   74.12%   -3.88%     
===========================================
  Files         1619     1444     -175     
  Lines       187184   157472   -29712     
===========================================
- Hits        145999   116729   -29270     
+ Misses       41185    40743     -442

Impacted Files	Coverage Δ
src/analyses/goto_rw.cpp	`52.80% <0.00%> (-13.80%)`	⬇️
src/ansi-c/literals/convert_character_literal.cpp	`51.21% <0.00%> (-34.50%)`	⬇️
src/goto-symex/symex_function_call.cpp	`92.89% <ø> (-2.72%)`	⬇️
src/goto-symex/symex_other.cpp	`84.40% <ø> (-2.01%)`	⬇️
src/solvers/flattening/boolbv_index.cpp	`79.57% <0.00%> (+6.66%)`	⬆️
unit/util/expr_cast/expr_cast.cpp	`100.00% <ø> (ø)`
unit/util/pointer_offset_size.cpp	`100.00% <ø> (ø)`
src/util/simplify_expr_struct.cpp	`69.04% <66.66%> (-5.34%)`	⬇️
src/solvers/flattening/boolbv_byte_extract.cpp	`69.62% <75.00%> (-0.51%)`	⬇️
src/util/pointer_offset_size.cpp	`92.08% <82.50%> (-0.76%)`	⬇️
... and 1497 more


peterschrammel

tautschnig created on 13 May 2017

Last one on the stack?

thomasspriggs · 2022-10-17T15:02:04Z

src/goto-instrument/synthesizer/expr_enumerator.cpp

 std::vector<std::size_t> get_ones_pos(std::size_t v)
 {
-  const std::size_t length = sizeof(std::size_t) * 8;
+  const std::size_t length = sizeof(std::size_t) * CHAR_BIT;


The PR description states that you are using the width from the configuration. But this line uses the width for the machine for which cbmc is being compiled. Shouldn't config.ansi_c.char_width be used instead? The distinction could be important if cbmc is to be used for programs written for a machine such as a DSP, like you suggested in the PR description. Wouldn't these architectures be more likely to be targeted using a cross compiler rather than by porting cbmc to the target hardware itself?

You are right that care is required in deciding where CHAR_BIT is the right choice and where config.ansi_c.char_width is to be used instead. I hope to have made the right choice in all places, but of course I may have made mistakes. For this particular case: as far as I understand the code, this is about the width of bytes on the platform that CBMC is being compiled for, and not whatever the verification target might be. So I claim that CHAR_BIT is correct. You are also right that, in all likelihood, one would use a cross compiler for such a DSP, and it may well be that no platform that CBMC ever runs on has CHAR_BIT != 8. But I'd also like to avoid magic numbers, which make it very difficult to tell whether a given use of "8" should be config.ansi_c.char_width instead, or is actually fine because it is the compile-time byte width on all relevant platforms.

If you are convinced that we are depending on the right thing in this case, then I am happy to merge this as is.

This specific case is about enumerating various bit sequences, and this enumeration is done by the compiled CBMC binary.
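To make the host-versus-target distinction in this thread concrete, here is a minimal sketch of a function with the same signature as the one in the diff. Only the signature and the `length` line come from the diff; the loop body and the low-to-high ordering of the result are assumptions made for illustration and may differ from the actual code in expr_enumerator.cpp.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical body: only the signature and the `length` line are taken
// from the diff under review.
std::vector<std::size_t> get_ones_pos(std::size_t v)
{
  // Number of bits in std::size_t on the machine running this code (the
  // host), so CHAR_BIT is the relevant constant, not a target setting.
  const std::size_t length = sizeof(std::size_t) * CHAR_BIT;

  std::vector<std::size_t> result;
  for(std::size_t i = 0; i < length; ++i)
  {
    // Record the index of every bit of v that is set.
    if((v >> i) & 1u)
      result.push_back(i);
  }
  return result;
}
```

Because `v` is a value held in the running binary's own `std::size_t`, the loop bound depends on the host's byte width. A verification target with a different char width does not change how many bits the host's `std::size_t` has.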

tautschnig · 2022-10-18T07:34:22Z

tautschnig created on 13 May 2017

Last one on the stack?

🤣 You wish!

thomasspriggs

FYI - CHAR_BIT is the C standard library version of this constant. The C++ equivalent is std::numeric_limits<unsigned char>::digits (for plain char, digits excludes the sign bit when char is signed). In this case digits refers to binary digits on the underlying hardware and std::numeric_limits<unsigned char>::radix is 2.

🤔 I am happy for debugging and addition of tests for a platform where char is not 8 bits to be left for a follow-up PR if and when this becomes a priority.
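The relationship between CHAR_BIT and std::numeric_limits can be checked at compile time. The sketch below uses only standard library facilities; the helper name `host_char_bits` is hypothetical. Note that for plain `char`, `digits` does not count the sign bit, so `unsigned char` is the type whose `digits` always equals CHAR_BIT.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// unsigned char has no sign bit, so its digit count is the byte width.
static_assert(
  std::numeric_limits<unsigned char>::digits == CHAR_BIT,
  "unsigned char digits match CHAR_BIT");

// For plain char the sign bit (if char is signed) is excluded from digits.
static_assert(
  std::numeric_limits<char>::digits ==
    CHAR_BIT - (std::numeric_limits<char>::is_signed ? 1 : 0),
  "char digits exclude the sign bit");

// The digits above are binary digits.
static_assert(
  std::numeric_limits<unsigned char>::radix == 2,
  "digits are counted in base 2");

// Hypothetical helper: byte width of the host, expressed in C++ terms.
constexpr int host_char_bits()
{
  return std::numeric_limits<unsigned char>::digits;
}
```

Both spellings describe the machine the code is compiled for, so neither is a substitute for config.ansi_c.char_width when the verification target is meant.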

feliperodri · 2022-10-18T14:14:47Z

src/util/pointer_offset_size.cpp

      bit_field_bits += w;
-      result += bit_field_bits / 8;
-      bit_field_bits %= 8;
+      result += bit_field_bits / config.ansi_c.char_width;


I'm assuming we guarantee that char_width is always non-zero in all these cases, right?

Yes, though we might one day want to have some architecture sanity check.

It can be loaded from file through configt::set_from_symbol_table and symtab2gb. Therefore it could theoretically be set to 0 in the file and I am not sure we have appropriate validation that the value from file is not 0.
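The sanity check discussed here could look roughly like the sketch below. Everything in it is hypothetical: `target_configt` is a stand-in and not CBMC's `configt`, and `check_target_config` and `add_bit_field` are illustrative names. The modulo step mirrors the removed `bit_field_bits %= 8;` line and is an assumption about the part of the diff that is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for the target configuration (not CBMC's configt).
struct target_configt
{
  std::size_t char_width; // bits per byte on the verification target
};

// Reject configurations that would cause a division by zero later on,
// e.g. a zero width read from a file.
void check_target_config(const target_configt &config)
{
  if(config.char_width == 0)
    throw std::invalid_argument("char_width must be non-zero");
}

// Running totals while laying out consecutive bit-fields.
struct bit_field_statet
{
  std::size_t result = 0;         // whole bytes consumed so far
  std::size_t bit_field_bits = 0; // leftover bits, less than one byte
};

// Add a bit-field of width w, carrying whole bytes into `result`.
// Assumes check_target_config has already accepted `config`.
void add_bit_field(
  bit_field_statet &state,
  std::size_t w,
  const target_configt &config)
{
  state.bit_field_bits += w;
  state.result += state.bit_field_bits / config.char_width;
  state.bit_field_bits %= config.char_width;
}
```

Validating once, at the point where the configuration is loaded, keeps the arithmetic free of per-use checks. Where such a check would live in CBMC is left open in the discussion above.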

remi-delmas-3000 · 2022-10-18T14:19:50Z

src/goto-instrument/contracts/memory_predicates.cpp


 array_typet is_fresh_baset::get_memmap_type()
 {
-  return array_typet(c_bool_typet(8), infinity_exprt(size_type()));


Do we really need to propagate this to instrumentation code? This code does not represent anything running on an actual processor. Should c_bool_typet(8) be deprecated completely?

I'm ok with an arbitrary type, but we must not have "8" as a magic number. I assumed this was "8" in the first place so as to have byte granularity? Was I wrong?

My usual assumption is that architecture should be well defined at the point of running the front end. This is because the C front end runs an external C pre-processor which can have architecture specific macro expansions. As goto-instrument runs after the front end, it is operating on an architecture specific program form. Note for example that the c_bool_type() construction function in src/util/c_types.cpp references the same config field.

tautschnig force-pushed the byte-config branch from dd4210c to bc36ba0 Compare May 23, 2017 11:58

tautschnig self-assigned this May 23, 2017

tautschnig added the do not merge label May 23, 2017

tautschnig force-pushed the byte-config branch 3 times, most recently from d6010a2 to 4047f38 Compare May 23, 2017 17:43

tautschnig removed the do not merge label May 23, 2017

tautschnig force-pushed the byte-config branch from 4047f38 to a2edaec Compare May 24, 2017 08:04

tautschnig assigned kroening and unassigned tautschnig May 25, 2017

tautschnig force-pushed the byte-config branch from a2edaec to e62a5f5 Compare May 29, 2017 15:45

tautschnig force-pushed the byte-config branch from e62a5f5 to 5111350 Compare June 8, 2017 06:30

tautschnig added the cleanup label Jun 9, 2017

tautschnig force-pushed the byte-config branch 2 times, most recently from c5fb9c2 to ebc4c75 Compare June 28, 2017 14:16

tautschnig force-pushed the byte-config branch 2 times, most recently from ff0e2f6 to 437462a Compare July 15, 2017 18:22

tautschnig force-pushed the byte-config branch from 437462a to 90365a2 Compare August 7, 2017 12:21

tautschnig changed the base branch from master to develop August 22, 2017 12:25

tautschnig assigned tautschnig and unassigned kroening Sep 2, 2017

tautschnig changed the title ~~Do not assume that chars/bytes have 8 bits (and some cleanup)~~ [depends: #1331,#1332,#1333] Do not assume that chars/bytes have 8 bits (and some cleanup) Sep 2, 2017

tautschnig changed the title ~~[depends: #1331,#1332,#1333] Do not assume that chars/bytes have 8 bits (and some cleanup)~~ [depends: #1333] Do not assume that chars/bytes have 8 bits (and some cleanup) Sep 4, 2017

tautschnig mentioned this pull request Nov 17, 2017

Simplify equalities of constants #1598

Merged

tautschnig force-pushed the byte-config branch from 90365a2 to 3f38349 Compare January 5, 2018 19:22

tautschnig requested review from cesaro, chrisr-diffblue, kroening and martin-cs as code owners January 5, 2018 19:22

tautschnig force-pushed the byte-config branch from 38ac042 to 0ea8715 Compare April 24, 2018 17:51

tautschnig unassigned smowton Apr 24, 2018

tautschnig force-pushed the byte-config branch from 0ea8715 to 6ed0fa9 Compare May 21, 2018 08:00

tautschnig mentioned this pull request May 29, 2018

smt2: bswap and popcount #2246

Merged

tautschnig assigned tautschnig and unassigned kroening and peterschrammel May 30, 2018

tautschnig added the do not merge label May 30, 2018

tautschnig mentioned this pull request Oct 16, 2018

Handle empty structs in the back-end (and a number of induced fixes) #2161

Merged

tautschnig mentioned this pull request Jan 24, 2019

Feature banner helper #3920

Merged

7 tasks

tautschnig mentioned this pull request Jan 15, 2021

Fix unpack_array_vector to produce an array of bytes #5750

Merged

3 tasks

martin-cs mentioned this pull request Feb 27, 2021

Simplify byte-extract from struct or union expressions #5873

Merged

4 tasks

tautschnig mentioned this pull request Mar 19, 2021

Do not assume that build architecture has byte/char==8 bits #5962

Merged

3 tasks

tautschnig force-pushed the byte-config branch from 6ed0fa9 to 1b08132 Compare April 3, 2021 17:42

tautschnig requested review from allredj and romainbrenguier as code owners April 3, 2021 17:42

tautschnig mentioned this pull request Apr 27, 2021

make_byte_{extract,update} to build byte_{extract,update} expressions #6056

Merged

4 tasks

peterschrammel mentioned this pull request Sep 13, 2021

Fix applicability condition of simplify_byte_update #6311

Merged

7 tasks

Do not hardcode char/bytes having 8 bits

9ab636f

The C standard does not guarantee that char is exactly 8 bits, and there are DSPs (such as the Texas Instruments C55x) that do not have 8-bit bytes. Use the configuration value instead.

peterschrammel approved these changes Oct 17, 2022

thomasspriggs reviewed Oct 17, 2022

thomasspriggs approved these changes Oct 18, 2022

feliperodri approved these changes Oct 18, 2022

remi-delmas-3000 reviewed Oct 18, 2022
