Use small intrusive pointer in irep by reuk · Pull Request #786 · diffblue/cbmc

reuk · 2017-04-09T21:05:05Z

Follows up #685. Rather than using std::shared_ptr, which uses more memory than is strictly necessary (an extra pointer to the control block in the shared_ptr itself, plus the control block memory), we use a custom small_shared_ptr, which is based on Boost's intrusive_ptr.

This approach uses exactly the same amount of memory as was used previously, but separating the resource management into a separate class makes the logic a bit clearer, and greatly simplifies the copy- and move-assignment operators. It also means we don't need to explicitly redeclare custom move/copy constructors/assignment-ops in irep.h.
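To illustrate the idea described above, here is a minimal sketch of an intrusive pointer. It is not the code from this PR: the names ref_counted, intrusive_ptr_sketch and node are hypothetical, and the count is not thread-safe. The point is that the reference count lives inside the pointee, so the pointer object itself is one raw pointer wide, and taking the assignment operand by value gives copy- and move-assignment in a single function.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical base class: the count is stored in the pointee itself.
class ref_counted
{
public:
  ref_counted() = default;
  // A copied object starts with its own fresh count.
  ref_counted(const ref_counted &) {}
  ref_counted &operator=(const ref_counted &) { return *this; }

  void add_ref() const { ++count_; }
  // Returns true when the last reference has been dropped.
  bool release() const
  {
    assert(count_ > 0);
    return --count_ == 0;
  }
  std::size_t use_count() const { return count_; }

protected:
  ~ref_counted() = default;

private:
  mutable std::size_t count_ = 0;
};

// Hypothetical pointer type: a single T* and nothing else.
template <typename T>
class intrusive_ptr_sketch
{
public:
  intrusive_ptr_sketch() = default;
  explicit intrusive_ptr_sketch(T *t) : t_(t)
  {
    if(t_)
      t_->add_ref();
  }
  intrusive_ptr_sketch(const intrusive_ptr_sketch &rhs) : t_(rhs.t_)
  {
    if(t_)
      t_->add_ref();
  }
  intrusive_ptr_sketch(intrusive_ptr_sketch &&rhs) noexcept : t_(rhs.t_)
  {
    rhs.t_ = nullptr;
  }
  // Taking rhs by value covers both copy- and move-assignment.
  intrusive_ptr_sketch &operator=(intrusive_ptr_sketch rhs) noexcept
  {
    swap(rhs);
    return *this;
  }
  ~intrusive_ptr_sketch()
  {
    if(t_ && t_->release())
      delete t_;
  }

  void swap(intrusive_ptr_sketch &rhs) noexcept { std::swap(t_, rhs.t_); }
  T *get() const { return t_; }
  T &operator*() const { return *t_; }
  T *operator->() const { return t_; }
  explicit operator bool() const { return t_ != nullptr; }

private:
  T *t_ = nullptr;
};

// Example pointee, for illustration only.
struct node : ref_counted
{
  int value = 0;
};
```

A class that holds an intrusive_ptr_sketch member can rely on the compiler-generated copy and move operations, which is the simplification the description refers to.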

tautschnig

A few questions; I overall like this approach, but what we really need for this to move forward is a solid performance testing setup, which has long been talked about.

tautschnig · 2017-04-10T07:59:23Z

src/util/irep.h

What is the rationale for this change?

See following comment.

Same here: keep it simple. We don't need to solve problems we don't have.

tautschnig · 2017-04-10T08:05:52Z

src/util/small_shared_ptr.h

What's the idea of the separate using directive, instead of just doing std::swap(t_, rhs.t_); ?

In the general case, writing using std::swap; followed by an unqualified call to swap allows you to place a custom swap for your type in a namespace other than std, because adding your own overloads to namespace std isn't allowed. The custom overload is found by Koenig lookup (argument-dependent lookup); if none is found, the call falls back to std::swap. In this case, this behaviour isn't strictly required, so I could just use std::swap directly.
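A small sketch of the idiom, for readers who have not seen it. The namespace widgets, the type handle and the helper generic_swap are hypothetical and exist only for this illustration; the swap_calls counter is there to make it observable which overload ran.

```cpp
#include <cassert>
#include <utility>

namespace widgets // hypothetical namespace, for illustration only
{
struct handle
{
  int id = 0;
  int swap_calls = 0; // counts how often the custom swap ran
};

// Custom swap in the type's own namespace, found by argument-dependent lookup.
inline void swap(handle &a, handle &b) noexcept
{
  std::swap(a.id, b.id);
  ++a.swap_calls;
  ++b.swap_calls;
}
} // namespace widgets

template <typename T>
void generic_swap(T &a, T &b)
{
  using std::swap; // make std::swap available as the fallback
  swap(a, b);      // unqualified call, so ADL can pick widgets::swap
}
```

Calling generic_swap on two widgets::handle objects runs the custom overload, while calling it on two ints falls back to std::swap. For a class whose only member is a raw pointer, both routes do the same work, which is why the direct std::swap call is sufficient here.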

Keep it simple.

tautschnig · 2017-04-10T08:08:03Z

src/util/small_shared_ptr.h

What is the use case of this one?

Like std::make_unique and std::make_shared, this has exception-safety benefits.

In the expression my_func(small_shared_ptr<X>(new X), small_shared_ptr<Y>(new Y)), the compiler is allowed to sequence the operations like so:

new X
new Y <-- This may throw and leak previously-allocated X
construct small_shared_ptr<X>
construct small_shared_ptr<Y>
my_func

Whereas my_func(make_small_shared_ptr<X>(), make_small_shared_ptr<Y>()) cannot leak because the calls to make_small_shared_ptr won't be interleaved.

Shouldn't that be my_func(make_small_shared_ptr<X, Y>()) ?

No, empty parameter packs are valid. You can see at irep.cpp:30 and irep.h:242 how the function is used in practice.

Ok, I think I just misunderstood: your argument was about the order of invoking new vs the construction of small_shared_ptr, and not the order of the small_shared_ptr constructs (which is still unspecified). Fair enough. You may wish to add a comment to the constructor of small_shared_ptr to say that make_small_shared_ptr should be used in case of a new T argument.

What would be a use case where the parameter pack is non-empty? Is this just to support types other than dt that require >= 1 argument in their constructor?

Yes, if you have a type with a constructor like my_type(Foo, Bar, Baz) then you can create a small_shared_ptr to it using make_small_shared_ptr<my_type>(Foo(), Bar(), Baz()) (for example).

This also avoids repetition of the identifier my_type (the alternative would be small_shared_ptr<my_type>(new my_type(Foo(), Bar(), Baz())), which is needlessly repetitive).
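The shape of such a factory can be sketched as follows. This is an assumption-laden illustration, not the PR's code: owner_sketch is a hypothetical move-only stand-in for small_shared_ptr, and make_owner_sketch and labelled are invented names. Because allocation and wrapping happen inside one function call, the evaluation of another argument cannot be interleaved between the new and the constructor of the owning pointer. (As far as I know, C++17 tightened argument evaluation so that this interleaving is no longer permitted, but the concern applies to C++11 and C++14.)

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical move-only owner, standing in for small_shared_ptr.
template <typename T>
class owner_sketch
{
public:
  explicit owner_sketch(T *t) : t_(t) {}
  owner_sketch(owner_sketch &&rhs) noexcept : t_(rhs.t_) { rhs.t_ = nullptr; }
  owner_sketch(const owner_sketch &) = delete;
  owner_sketch &operator=(const owner_sketch &) = delete;
  ~owner_sketch() { delete t_; }
  T *operator->() const { return t_; }

private:
  T *t_;
};

// Variadic factory: forwards any number of arguments (including none)
// to T's constructor, so the type name is written only once.
template <typename T, typename... Ts>
owner_sketch<T> make_owner_sketch(Ts &&... ts)
{
  return owner_sketch<T>(new T(std::forward<Ts>(ts)...));
}

// Example type with a default and a two-argument constructor.
struct labelled
{
  labelled() : name("default"), weight(0) {}
  labelled(std::string n, int w) : name(std::move(n)), weight(w) {}
  std::string name;
  int weight;
};
```

With this sketch, make_owner_sketch<labelled>() uses the empty parameter pack and make_owner_sketch<labelled>("x", 3) forwards two arguments to the second constructor.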

Ok, thanks!

tautschnig · 2017-04-10T11:05:29Z

As said before, this looks good to me, and includes quite a bit of long overdue cleanup. Yet I don't dare give any approval here in the absence of a performance test suite.

tautschnig · 2017-04-12T14:29:46Z

With 22170b7 my looks-good-to-me no longer applies.

reuk · 2017-04-12T19:31:29Z

Yes, sorry about that - I think the copy-on-write implementation currently in irep is buggy because it allows you to make changes which affect copies in unexpected ways:

irept a;
auto& comments = a.get_comments();
const auto b(a);
comments.<some mutating method>();
// b is changed here!

I'm guessing this is why get_comments and other similar methods are marked DANGEROUS. True copy-on-write would mark the internal structure 'unshareable' when giving out a mutable reference to a data member. Then, the example above becomes:

irept a;
auto& comments = a.get_comments(); // This detaches if necessary, and
                                   // marks the internal data unshareable
const auto b(a); // This is a deep copy because a is unshareable
comments.<some mutating method>(); // Only affects 'a'
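The 'unshareable' scheme described above can be sketched like this. It is a simplified illustration under stated assumptions, not irep's implementation: cow_sketch, payload, read, write and shares_with are hypothetical names, and std::shared_ptr is used only to keep the example short, even though the PR itself moves away from it.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical copy-on-write wrapper around a list of comments.
class cow_sketch
{
private:
  struct payload
  {
    std::vector<std::string> comments;
    bool shareable = true; // false once a mutable reference has escaped
  };

  static std::shared_ptr<payload> clone(const payload &p)
  {
    auto copy = std::make_shared<payload>(p);
    copy->shareable = true; // a fresh copy has no outstanding references
    return copy;
  }

  std::shared_ptr<payload> data_;

public:
  cow_sketch() : data_(std::make_shared<payload>()) {}

  // Share the payload if allowed, otherwise deep-copy it.
  cow_sketch(const cow_sketch &rhs)
    : data_(rhs.data_->shareable ? rhs.data_ : clone(*rhs.data_))
  {
  }

  cow_sketch &operator=(const cow_sketch &rhs)
  {
    if(this != &rhs)
      data_ = rhs.data_->shareable ? rhs.data_ : clone(*rhs.data_);
    return *this;
  }

  const std::vector<std::string> &read() const { return data_->comments; }

  // Detach if shared, then mark unshareable: the caller may keep the
  // returned reference and mutate through it later.
  std::vector<std::string> &write()
  {
    if(data_.use_count() > 1)
      data_ = clone(*data_);
    data_->shareable = false;
    return data_->comments;
  }

  bool shares_with(const cow_sketch &other) const
  {
    return data_ == other.data_;
  }
};
```

In this sketch a payload stays unshareable for the rest of its life once write() has been called, so later copies of that object are always deep copies. A real implementation would have to decide whether and when it is safe to make the data shareable again.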

I'll close the PR and re-open when I have something that fixes this bug.

tautschnig · 2017-04-12T19:41:23Z

I don't necessarily think you have to go as far as closing this one. How about making get_comments and get_sub private? There are only a handful of users of those out there.

This commit merges the best aspects of two approaches to hash-based loop identification:
- Clean implementation from PR diffblue#732 (bigweaver/clone-cbmc-private-20251130-231902)
- Comprehensive testing from PR diffblue#786 (bigweaver/clone-cbmc-private-20251209-144542)

Core Implementation (from PR diffblue#732):
- Enhanced loop_idt struct with hash support and backward compatibility
- compute_loop_hashes() using AST fingerprinting approach
- Hash based on source location, loop structure, and body characteristics
- Uses hash_combine() and hash_finalize() utilities
- Clean separation of concerns with modular design

Testing Infrastructure (from PR diffblue#786):
- Unit tests: unit/goto-programs/loop_hash.cpp (Catch2-based)
- Basic regression: 3 test suites (types, nested, stability)
- Comprehensive suite: 21 automated test cases covering:
  * Position independence (11 tests)
  * Sensitivity to changes (4 tests)
  * Determinism (4 tests)
  * Special cases (2 tests)
- Multiple test frameworks: Python (basic + enhanced) and Bash
- Test utilities for hash comparison and extraction

Key Benefits:
- Loop identifiers stable across unrelated code changes
- Hashes change appropriately when loop logic changes
- Backward compatible with existing loop_number system
- Comprehensive test coverage (21+ test cases)
- Well-documented with extensive README files

Files Modified:
- Core: 5 implementation files in src/goto-programs/
- Tests: 52 test files (unit + regression)
- Build: unit/Makefile updated

See LOOP_HASH_MERGE_SUMMARY.md for detailed documentation.

reuk force-pushed the small-shared-ptr branch from d6ee7ec to 0e57b66 · April 9, 2017 21:48

tautschnig reviewed Apr 10, 2017


Use small intrusive pointer in irep

d86342e

reuk force-pushed the small-shared-ptr branch from 0e57b66 to d86342e · April 10, 2017 08:42

Add intrusive copy-on-write class

22170b7

reuk closed this Apr 12, 2017

reuk mentioned this pull request Apr 15, 2017

Fix functions marked 'DANGEROUS' in irep #834

Closed
