Skip to content

Various bugfixes / enhancements for <flat_set>#3985

Closed
achabense wants to merge 20 commits intomicrosoft:feature/flat_setfrom
achabense:_For_flat_set
Closed

Various bugfixes / enhancements for <flat_set>#3985
achabense wants to merge 20 commits intomicrosoft:feature/flat_setfrom
achabense:_For_flat_set

Conversation

@achabense
Copy link
Contributor

@achabense achabense commented Aug 22, 2023

This pr is becoming large and hard to review, I'm going to replace this pr with a new one with all commits re-grouped when it is finished. This pr has been replaced by pr 3993.

This pr contains a lot of bugfixes/enhancements for <flat_set>:

About each commit:

  • 1st: Fix usage of _Compressed_pair. _Compressed_pair tries to compress its first member, but the implementation was erroneously trying to compress the container instead of key_compare.
  • 2nd: Add #pragma pack(push, _CRT_PACKING) etc.
  • 3rd: (This turns out to be merely a nitpick. see this comment.)
  • 4th: _Emplace_hint should return iterator instead of void, and const_iterator _Where may be not be convertible to iterator. (As to const const_iterator _Begin = cbegin(); const const_iterator _End = cend();, they are enhancements to match with _Where.)
  • 5th: Nitpicks to enhance constness.
  • 6th: Regroup/reorder non-constructor methods to match with the standard. insert(initializer_list) is found missing.
  • 7th: Implement insert(initializer_list).
  • 8th: Improve implementation of contains by using binary_search.
  • 9th: Fix insert methods. Especially, insert(const _Kty&) was broken before this commit.
  • 10th: operator=(initializer_list) was not observable from flat_set/multiset. flat_set/multiset should using _Mybase::operator=;. (this turns out to be problematic; fixed by 14th commit)
  • 11th: Various cleanups.
  • 12th: insert(const iter&, const iter&) should be insert(iter, iter)
  • 13th : Implement LWG-3884 (*flat_foo is missing allocator-extended copy/move constructors*), as applied in N4958.
  • 14th: Fix 10th commit. Add tests for insertion methods, the new constructor and ebco optimization; test_insert is a general test covering commit 4, 7, 9 and 12.
  • 15th: emplace forgot to return the result of _Emplace; use const key_compare& for comparision consistently; some other nitpicks.
  • 16th: emplace and emplace_hint should not construct temporary for key_type (so that a key_type&& will not be moved away if insertion fails); try to add debug check for flat_set::insert([hint,]auto&&).
  • 17th: For conformance and efficiency, flat_multiset::emplace should use upper_bound instead of lower_bound.
  • 18th: Fix incorrect bound-finding behavior of emplace_hint (flat_set should always look for lower_bound); update test.

@achabense achabense requested a review from a team as a code owner August 22, 2023 16:57
@achabense

This comment was marked as outdated.

@achabense achabense marked this pull request as draft August 22, 2023 17:18
@achabense
Copy link
Contributor Author

achabense commented Aug 23, 2023

It seems the standard doesn't say what flat_set::emplace should do? (only flat_multiset::emplace is specified)

@frederick-vs-ja
Copy link
Contributor

It seems the standard doesn't say what flat_set::emplace should do? (only flat_multiset::emplace is specified)

May be specified in [associative.reqmts.general]/49.

But it seems to me that flat_set::emplace possibly uses Alloc::construct, while flat_multiset::emplace doesn't. An LWG issue may be needed.

@achabense

This comment was marked as resolved.

const iterator _End = end();
const iterator _Where = lower_bound(_Val);
const const_iterator _End = end();
const const_iterator _Where = lower_bound(_Val);
Copy link
Contributor Author

@achabense achabense Aug 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iterator is not convertible to const_iterator (return type) in many cases. see below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain a bit? I thought it was required to have iterator convertible to const_iterator https://eel.is/c++draft/container.reqmts#6

Copy link
Contributor Author

@achabense achabense Aug 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, my mistake; correction: const_iterator is not required to convert to iterator.
Update: the 3rd commit doesn't "fix" anything (but can be seen as nitpick/enhancement).

(There are real iterator fixes, see 4th commit (it's returning _Where in a branch which is a const_iterator, but the function should return an iterator) )

stl/inc/flat_set Outdated
return _Cont.insert(_Where, _STD forward<_Ty>(_Val));
}
return _Where;
return _Cont.begin() + (_Where - _Begin);
Copy link
Contributor Author

@achabense achabense Aug 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just making this line work (convert const_iterator to iterator); may need further improvements in the future (hopefully after the implementation settling down).

@achabense
Copy link
Contributor Author

achabense commented Aug 23, 2023

STL/stl/inc/flat_set

Lines 498 to 500 in d6e6b5a

const iterator _Where = lower_bound(_Val);
if (_Where != _End && _Keys_equal(*_Where, _Val)) {
return _Where;

There are several places that, due to the nature of lower_bound, the usage of _Keys_equal (!_Compare(_Lhs, _Rhs) && !_Compare(_Rhs, _Lhs);) can be simplified to a single comparision (see std::binary_search for example). However this will make it harder to review. I think we can do these changes as enhancements after the implementation settling down.

@achabense
Copy link
Contributor Author

achabense commented Aug 23, 2023

STL/stl/inc/flat_set

Lines 137 to 141 in d6e6b5a

_Deriv& operator=(initializer_list<_Kty> _Ilist) {
_Get_cont() = container_type(_Ilist.begin(), _Ilist.end());
_Make_invariants_fulfilled();
return static_cast<_Deriv&>(*this);
}

I'm afraid this is not observable from flat_set/multiset, as flat_set/multiset is currently not using base class's operator=

Update: fixed by 10th commit.

auto insert(const value_type& _Val) {
return _Insert<_Kty>(_Val);
return _Emplace(_Val);
}
Copy link
Contributor Author

@achabense achabense Aug 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before this change, even the following code will fail to compile:

flat_set<int> s;
int i = 1;
s.insert(i);

As, a const _Kty& is not convertible to _Kty&& previously in _Insert...

}

template <class _Ty>
requires (!_Multi && _Keylt_transparent && is_constructible_v<_Kty, _Ty>) || is_same_v<_Ty, _Kty>
Copy link
Contributor Author

@achabense achabense Aug 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

about this removal:

  • it's enough to constrain in the caller site (insert / emplace).
  • shouldn't be is_same_v<_Ty, _Kty>, as _Ty can be const _Kty&

@achabense achabense changed the title <flat_set>: _Base_flat_set's _Compressed_pair should take key_compare as its first argument Various bugfixes / enhancements for <flat_set> Aug 23, 2023
@achabense
Copy link
Contributor Author

achabense commented Aug 24, 2023

As to this function, I can imagine it causing surprises in practice. I think we should try to add debug check in the future (after the implementation settling down).

    template <class _Other>
        requires (!_Multi && _Keylt_transparent && is_constructible_v<_Kty, _Other>)
    auto insert(_Other&& _Val) {
        return _Emplace(_STD forward<_Other>(_Val));
    }

image
I think this will also cause unexpected regressions sometimes:

std::flat_set<int, std::less<>> s{ 1,2,3,5,7 };
s.insert(4.0); // will be slower than s.insert(int(4.0))

Also see this comment.

@achabense
Copy link
Contributor Author

achabense commented Aug 24, 2023

STL/stl/inc/flat_set

Lines 548 to 550 in d6e6b5a

void _Restore_invariants_after_insert(const size_type& _Old_size) {
key_compare& _Compare = _Get_comp();
const iterator _Old_end = begin() + static_cast<difference_type>(_Old_size);

Is it always correct to use const key_compare&? Is BinaryPredicate allowed to offer different behaviors based on key_compare& vs const key_compare&?

That is, is qux a valid BinaryPredicate? (I think if so nothing can make sense; but I'm having difficulty finding some formal citations.)

struct qux {
    bool operator()(int a, int b) {
        return a < b;
    }
    bool operator()(int a, int b)const {
        return a > b;
    }
};

@achabense
Copy link
Contributor Author

I've force-pushed this pr to add serial numbers for each commit description. Nothing else is changed.

@achabense achabense marked this pull request as ready for review August 24, 2023 11:41
@@ -132,9 +119,74 @@ void test_constructors() {
flat_set<int> a{};
a = {1, 7, 7, 7, 2, 100, -1};
Copy link
Contributor Author

@achabense achabense Aug 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before this pr, this was passing as the initializer_list was implicitly converted to flat_multiset and operator=(flat_set&&) was called here. Now I can make sure this is calling operator=(initializer_list)

@achabense
Copy link
Contributor Author

achabense commented Aug 24, 2023

Folowup to this comment:

Hmm, is this UB or another implementation bug?

    std::flat_set<int, std::less<>> s{ 1,2,3,5,7 };
    s.insert(2.5);
    for (auto i : s) {
        std::cout << i << ' '; // ?! 1 2 2 3 5 7
    }

Update:
assert(s.find(2.5) == s.find(int(2.5))); turns false, so this is UB....

I think this example deserves some extra attention, as it shows the precondition is fairly tricky, and implies potential defects in the current implementation. At least I think we should try to catch such errors in debug mode.

Update:
Effects: If the set already contains an element equivalent to x, *this and x is unchanged. Otherwise, inserts a new element as if by: return emplace(std::forward<K>(x));.
I think insert(_Other&&) should be re-implemented.

(-update re-implemented in 16th commit)

@achabense

This comment was marked as resolved.

…e_hint`; use `const key_compare&` consistently; test nitpick
return _Emplace(_Kty{_STD forward<_Args>(_Vals)...});
constexpr bool _Is_key_type = _In_place_key_extract_set<_Kty, remove_cvref_t<_Args>...>::_Extractable;
if constexpr (_Is_key_type) {
return _Emplace(_STD forward<_Args>(_Vals)...);
Copy link
Contributor Author

@achabense achabense Aug 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As to emplace and emplace_hint:
When input is (const)_Kty(&/&&), I believe there shouldn't be a temporary before insertion, even for flat_multiset; especially, for flat_set a _Kty&& input should not be moved to a temporary if insertion fails.
For example, the following code holds true for set and unordered_set, and should hold true for flat_set as well.

    Set<shared_ptr<int>> s;
    auto a = shared_ptr<int>(new int);
    auto b = a, c = a;
    assert(a == b && a == c);
    assert(!(a < b) && !(b < a));
    s.insert(a);
    s.emplace(move(b));
    assert(b != nullptr);
    s.insert(move(c));
    assert(c != nullptr);

_STL_ASSERT(lower_bound(_Keyval) == _Where && !_Keys_equal(_Keyval, *_Where),
"find(val) was not equal to find(key_type{forward(val)})");
return pair{_Cont.emplace(_Where, _STD move(_Keyval)), true};
}
Copy link
Contributor Author

@achabense achabense Aug 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for debug of flat_set::insert([hint,]auto&&); there are fairly dangerous use patterns like:

    std::flat_set<int, std::less<>> s{ 1,2,3,5,7 };
    s.insert(2.5);

Which can break invariant unexpectedly. See details in this comment.

Otherwise, this is actually equivalent to the if constexpr (is_same_v<remove_cvref_t<_Ty>, _Kty>) branch (as _Cont.emplace(_Where, _STD forward<_Ty>(_Val)) can do the same thing).

Due to the trouble it introduces, I believe flat_set::insert([hint,]auto&&) should be renamed in the standard to avoid accidental misuse (or at least explained in more details).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get why this is dangerous. Wouldn't 2.5 be converted to 2 first and cause the insertion to fail? I tested the 2.5 insertion with flat_map and the flat_map was unchanged.

Copy link
Contributor Author

@achabense achabense Aug 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

find(2.5) would get end(); find(int(2.5))->find(2) would get 2; this is precondition violation and is UB.

See this link

Copy link
Contributor Author

@achabense achabense Aug 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For flat_set::emplace, I think a conversion to the key_type is required before insertion, as this is explicitly required in flat_multiset::emplace:
image
Conversion happens before insertion even if flat_multiset can have transparent comparators...

image
This means I cannot call emplace[_hint]() directly for insert(auto&&), as in emplace, for non-key types I need to firstly convert auto&& to a key_type, which may change the value of auto&&.

I think [flat.set.modifiers] should explicitly define flat_set::emplace...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

find(2.5) whould get end(); find(int(2.5))->find(2) would get 2; this is precondition violation and is UB.

See this link

So the implementation is not wrong then ... ? It's just counter-intuitive?

Copy link
Contributor Author

@achabense achabense Aug 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not confident whether the implementation is correct for insert([hint,]auto&&); but yes, s.insert(2.5); is just UB behavior so has nothing to do with implementation correctness; I think this method is just too tricky and can cause a lot of surprises in practice.

}
} else {
// _Val < (*_Where - 1)
_Where = _STD upper_bound(_Begin, _Where, _Val, _Compare);
Copy link
Contributor Author

@achabense achabense Aug 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line was making the following test failing:

        flat_set<int, lt, C> a{0, 5};
        assert_all_requirements_and_equals(a, {0, 5});
        a.emplace_hint(a.end());
        assert_all_requirements_and_equals(a, {0, 5}); // < got {0, 0, 5}!

flat_set should always use lower_bound for correctness; flat_multiset should use upper_bound for conformance and efficiency.

if constexpr (_Multi) {
_STL_INTERNAL_STATIC_ASSERT(is_same_v<remove_cvref_t<_Ty>, _Kty>);
return _Cont.emplace(_Where, _STD forward<_Ty>(_Val));
return _Cont.emplace(upper_bound(_Val), _STD forward<_Ty>(_Val));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

@achabense achabense marked this pull request as draft August 25, 2023 16:36
@achabense
Copy link
Contributor Author

achabense commented Aug 26, 2023

For ease of review, I've opened #3993 to replace this one.

@achabense achabense closed this Aug 26, 2023
@CaseyCarter CaseyCarter added the uncharted Excluded from the Status Chart label Aug 29, 2023
@achabense achabense deleted the _For_flat_set branch August 30, 2023 03:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

uncharted Excluded from the Status Chart

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants