Optimize is_permutation for reversed sequences#2043
Optimize is_permutation for reversed sequences#2043StephanTLavavej merged 12 commits intomicrosoft:mainfrom
is_permutation for reversed sequences#2043Conversation
miscco
left a comment
There was a problem hiding this comment.
I have to say that I find this overly complicated for a rather simple matching algorithm.
From what I can tell, we should be able to putt all in the first loop at 1005 without any issue
if constexpr (bidirectional_iterator<_It1> && bidirectional_iterator<_It2>) {
// determine final iterator values
auto _Final1 = _Find_last_iterator(_First1, _Last1, _Count);
auto _Final2 = _Find_last_iterator(_First2, _Last2, _Count);
for (;;--_Count) { // trim
if (_Count == 0) { // everything matched
return true;
}
// Matching prefix
if (_STD invoke(_Pred, _STD invoke(_Proj1, *_First1), _STD invoke(_Proj2, *_First2))) {
++_First1;
++_First2;
continue;
}
// Matching suffix
--_Final1;
--_Final2;
if (_STD invoke(_Pred, _STD invoke(_Proj1, *_Final1), _STD invoke(_Proj2, *_Final2))) {
continue;
}
// Matching reversed
if (_STD invoke(_Pred, _STD invoke(_Proj1, *_First1), _STD invoke(_Proj2, *_Final2))) {
++_First1;
++_Final1;
continue;
}
// Matching reversed
if (_STD invoke(_Pred, _STD invoke(_Proj1, *_Final1), _STD invoke(_Proj2, *_First2))) {
++_First2;
++_Final2;
continue;
}
break;
}
}Note this assume that we completely split into bidirectional / forward ranges
We could also do the dance and optimize bidirectional + forward range that way
|
I think we could merge everything without loosing efficiency into _It1 _Final1;
if constexpr (bidirectional_iterator<_It1>) {
_Final1 = _Find_last_iterator(_First1, _Last1, _Count);
}
_It2 _Final2;
if constexpr (bidirectional_iterator<_It2>) {
_Final2 = _Find_last_iterator(_First2, _Last2, _Count);
}
for (;;) { // trim
// Matching suffix
if (_STD invoke(_Pred, _STD invoke(_Proj1, *_First1), _STD invoke(_Proj2, *_First2))) {
++_First1;
++_First2;
continue;
}
// Matching reversed
if constexpr (bidirectional_iterator<_It2>) {
if (_STD invoke(_Pred, _STD invoke(_Proj1, *_First1), _STD invoke(_Proj2, *--_Final2))) {
++_First1;
continue;
}
}
// Matching reversed
if constexpr (bidirectional_iterator<_It1>) {
if (_STD invoke(_Pred, _STD invoke(_Proj1, *--_Final1), _STD invoke(_Proj2, *_First2))) {
++_First2;
if constexpr (bidirectional_iterator<_It2>) {
++_Final2;
}
continue;
}
}
// Matching suffix
if constexpr (bidirectional_iterator<_It1> && bidirectional_iterator<_It2>) {
if (_STD invoke(_Pred, _STD invoke(_Proj1, *_Final1), _STD invoke(_Proj2, *_Final2))) {
continue;
}
}
if (--_Count == 0) { // everything matched
return true;
}
break;
} |
That loop makes a lot of unnecessary comparisons. Consider ranges like { 1, 2, 3, 4, 5 } and { 2, 1, 3, 4, 5 }: |
|
I believe that the same thing happens with your loop with slightly changed inputs. That said you can easily work around this template <class _It1, class _Se1, class _It2, class _Se2, class _Pr, class _Pj1, class _Pj2>
_NODISCARD static constexpr bool _Is_permutation_sized(_It1 _First1, _Se1 _Last1, _It2 _First2, _Se2 _Last2,
iter_difference_t<_It1> _Count, _Pr _Pred, _Pj1 _Proj1, _Pj2 _Proj2) {
_STL_INTERNAL_STATIC_ASSERT(forward_iterator<_It1>);
_STL_INTERNAL_STATIC_ASSERT(sentinel_for<_Se1, _It1>);
_STL_INTERNAL_STATIC_ASSERT(forward_iterator<_It2>);
_STL_INTERNAL_STATIC_ASSERT(sentinel_for<_Se2, _It2>);
_STL_INTERNAL_STATIC_ASSERT(
indirect_equivalence_relation<_Pr, projected<_It1, _Pj1>, projected<_It2, _Pj2>>);
_STL_INTERNAL_CHECK(_RANGES distance(_First1, _Last1) == _Count);
_STL_INTERNAL_CHECK(_RANGES distance(_First2, _Last2) == _Count);
_It1 _Final1;
if constexpr (bidirectional_iterator<_It1>) {
_Final1 = _Find_last_iterator(_First1, _Last1, _Count);
}
_It2 _Final2;
if constexpr (bidirectional_iterator<_It2>) {
_Final2 = _Find_last_iterator(_First2, _Last2, _Count);
}
_Trim_checked _Checked = _Trim_checked::_Nothing_checked;
for (;; --_Count) { // trim
if (_Count == 0) { // everything matched
return true;
}
// Matching prefix
if (!(_Checked & _Trim_checked::_Prefix_checked)
&& _STD invoke(_Pred, _STD invoke(_Proj1, *_First1), _STD invoke(_Proj2, *_First2))) {
++_First1;
++_First2;
_Checked &= _Trim_checked::_Suffix_checked;
continue;
} else {
_Checked |= _Trim_checked::_Prefix_checked;
}
// Matching reversed
if constexpr (bidirectional_iterator<_It2>) {
--_Final2;
if (!(_Checked & _Trim_checked::_Reversed12_checked)
&& _STD invoke(_Pred, _STD invoke(_Proj1, *_First1), _STD invoke(_Proj2, *_Final2))) {
++_First1;
_Checked &= _Trim_checked::_Reversed21_checked;
continue;
} else {
_Checked |= _Trim_checked::_Reversed12_checked;
}
}
// Matching reversed
if constexpr (bidirectional_iterator<_It1>) {
--_Final1;
if (!(_Checked & _Trim_checked::_Reversed21_checked)
&& _STD invoke(_Pred, _STD invoke(_Proj1, *_Final1), _STD invoke(_Proj2, *_First2))) {
++_First2;
if constexpr (bidirectional_iterator<_It2>) {
++_Final2;
}
_Checked &= _Trim_checked::_Reversed12_checked;
continue;
} else {
_Checked |= _Trim_checked::_Reversed21_checked;
}
}
// Matching suffix
if constexpr (bidirectional_iterator<_It1> && bidirectional_iterator<_It2>) {
if (!(_Checked & _Trim_checked::_Suffix_checked)
&& _STD invoke(_Pred, _STD invoke(_Proj1, *_Final1), _STD invoke(_Proj2, *_Final2))) {
_Checked &= _Trim_checked::_Prefix_checked;
continue;
} else {
_Checked |= _Trim_checked::_Suffix_checked;
}
}
if constexpr (bidirectional_iterator<_It1>) {
++_Final1;
}
if constexpr (bidirectional_iterator<_It2>) {
++_Final2;
}
break;
}
if (_Count == 1) { // single non-matching elements remain; not a permutation
return false;
}
// If we get here, _Count > 1, initial elements do not match, and final elements do not match.
if constexpr (bidirectional_iterator<_It1>) {
return _Match_counts(_STD move(_First1), _STD move(_Final1), _STD move(_First2), _STD move(_Last2),
_Pred, _Proj1, _Proj2);
} else if constexpr (bidirectional_iterator<_It2>) {
return _Match_counts(_STD move(_First1), _STD move(_Last1), _STD move(_First2), _STD move(_Final2),
_Pred, _Proj1, _Proj2);
} else if constexpr (bidirectional_iterator<_It1> && bidirectional_iterator<_It2>) {
return _Match_counts(_STD move(_First1), _STD move(_Final1), _STD move(_First2), _STD move(_Final2),
_Pred, _Proj1, _Proj2);
} else {
return _Match_counts(_STD move(_First1), _STD move(_Last1), _STD move(_First2), _STD move(_Last2),
_Pred, _Proj1, _Proj2);
}
}EDIT fixed a bug |
Could you elaborate on this? |
|
FYI, I put an POC of my proposal (only ranges edition) here https://github.com/miscco/STL/tree/is_permutation Happy to discuss alternatives / performance |
|
I made some benchmarks: AdamBucior/is_permutation-optmization miscco/is_permutation Your implementation introduces a lot of overhead even for equal arrays. Benchmark code: |
|
I am getting slightly different results So essentially your algorithm obliterates mine in sorted case and is better in the pure reversal case. Otherwise it performs roughly equal |
is_permutation for reversed sequences
StephanTLavavej
left a comment
There was a problem hiding this comment.
Thanks, this looks good! (And a video review of this will be available in the near future.) Just a few issues, nothing major.
mnatsuhara
left a comment
There was a problem hiding this comment.
This looks awesome! Thanks for this great work :)
I've made a few comments, but they're for small comment nitpicks and one suggested restructuring (hence, no changes requested).
If you don't want to reset testing for these comments, perhaps @StephanTLavavej can add them to his list of cleanups for the next big cleanup PR! :)
|
Thanks! Since this code is fresh in our minds, I'll validate and push changes instead of deferring cleanups until later. 🧠 |
Co-authored-by: Miya Natsuhara <46756417+mnatsuhara@users.noreply.github.com>
Co-authored-by: Miya Natsuhara <46756417+mnatsuhara@users.noreply.github.com>
|
@AdamBucior @mnatsuhara I've pushed changes, plus a merge with |
|
I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed. |
|
Thanks for implementing this major optimization in the STL's highest-complexity algorithm! 🚀 😻 🎉 |
Fixes #279.
Also fixes
P0202R3_constexpr_algorithm_and_exchangetest fornext_permutationandprev_permutation(it was broken and never called) and addsis_permutationcoverage to this test (which was apparently missed).