Skip to content

<functional>: _Move_only_function_base::_Construct_with_null() appears to be doing too much work #5316

@StephanTLavavej

Description

@StephanTLavavej

⚙️ move_only_function representation

_Move_only_function_data is a union with the following representation:

STL/stl/inc/functional

Lines 1301 to 1307 in 3b2a52e

// _Move_only_function_data is defined as an array of pointers.
// The first element is always a pointer to _Move_only_function_base::_Impl_t; it emulates a vtable pointer.
// The other pointers are used as storage for a small functor;
// if the functor does not fit in, the second pointer is the pointer to allocated storage, the rest are unused.
union alignas(max_align_t) _Move_only_function_data {
void* _Pointers[_Small_object_num_ptrs];
const void* _Impl;

STL/stl/inc/functional

Lines 1335 to 1337 in 3b2a52e

void _Set_large_fn_ptr(void* const _Value) noexcept {
_Pointers[1] = _Value;
}

_Move_only_function_base then stores _Move_only_function_data _Data; and has:

STL/stl/inc/functional

Lines 1504 to 1511 in 3b2a52e

void _Construct_with_null() noexcept {
_Data._Impl = nullptr;
_Data._Set_large_fn_ptr(nullptr); // initialize, since we'll be copying it
}
void _Reset_to_null() noexcept {
_Data._Impl = nullptr;
}

❓ Questions

  • Why does _Construct_with_null() do the extra work of _Data._Set_large_fn_ptr(nullptr);?
  • What does the comment of // initialize, since we'll be copying it actually mean?

🕵️ Analysis

There are three calls to _Construct_with_null():

STL/stl/inc/functional

Lines 1885 to 1891 in 3b2a52e

move_only_function() noexcept {
this->_Construct_with_null();
}
move_only_function(nullptr_t) noexcept {
this->_Construct_with_null();
}

STL/stl/inc/functional

Lines 1895 to 1907 in 3b2a52e

template <class _Fn>
requires _Enable_one_arg_constructor<_Fn>
move_only_function(_Fn&& _Callable) {
using _Vt = decay_t<_Fn>;
static_assert(is_constructible_v<_Vt, _Fn>, "_Vt should be constructible from _Fn. "
"(N4950 [func.wrap.move.ctor]/6)");
if constexpr (is_member_pointer_v<_Vt> || is_pointer_v<_Vt> || _Is_specialization_v<_Vt, move_only_function>) {
if (_Callable == nullptr) {
this->_Construct_with_null();
return;
}
}

They're the default constructor and the constructors from nullptr_t and null callables (function pointers, PMFs, PMDs, and other move_only_functions).

What happens when we perform such a construction, then move the move_only_function? It has a defaulted move ctor, so we need to look at _Move_only_function_base:

move_only_function(move_only_function&&) noexcept = default;

STL/stl/inc/functional

Lines 1499 to 1502 in 3b2a52e

_Move_only_function_base(_Move_only_function_base&& _Other) noexcept {
_Checked_move(_Data, _Other._Data);
_Other._Reset_to_null();
}

STL/stl/inc/functional

Lines 1530 to 1537 in 3b2a52e

static void _Checked_move(_Move_only_function_data& _Data, _Move_only_function_data& _Src) noexcept {
const auto _Impl = _Get_impl(_Src);
if (_Impl->_Move) {
_Impl->_Move(_Data, _Src);
} else {
_Function_move_large(_Data, _Src);
}
}

_Checked_move() starts with const auto _Impl = _Get_impl(_Src);. That doesn't get the null previously set by _Data._Impl = nullptr;. It does something fancy:

STL/stl/inc/functional

Lines 1595 to 1604 in 3b2a52e

_NODISCARD static const _Impl_t* _Get_impl(const _Move_only_function_data& _Data) noexcept {
static constexpr _Impl_t _Null_move_only_function = {
_Function_not_callable<_Rx, _Types...>,
nullptr,
nullptr,
};
const auto _Ret = static_cast<const _Impl_t*>(_Data._Impl);
return _Ret ? _Ret : &_Null_move_only_function;
}

Because _Data._Impl is null, we get &_Null_move_only_function instead, which has null function pointers for _Move and _Destroy:

STL/stl/inc/functional

Lines 1477 to 1493 in 3b2a52e

struct _Impl_t { // A per-callable-type structure acting as a virtual function table.
// Using vtable emulations gives more flexibility for optimizations and reduces the amount of RTTI data.
// (The RTTI savings may be significant as with lambdas and binds there may be many distinct callable types.
// Here we don't have a distinct wrapper class for each callable type, only distinct functions when needed.)
// _Move and _Destroy are nullptr if trivial. Besides being an optimization, this enables assigning an
// empty function from a DLL that is unloaded later, and then safely moving/destroying that empty function.
// Calls target
_Invoke_t<_Noexcept>::_Call _Invoke;
// Moves the data, including pointer to "vtable", AND destroys old data (not resetting its "vtable").
// nullptr if we can trivially move two pointers.
void(__stdcall* _Move)(_Move_only_function_data&, _Move_only_function_data&) _NOEXCEPT_FNPTR;
// Destroys data (not resetting its "vtable").
// nullptr if destruction is a no-op.
void(__stdcall* _Destroy)(_Move_only_function_data&) _NOEXCEPT_FNPTR;
};

Going back to _Checked_move(), because _Impl->_Move is null, we call _Function_move_large(_Data, _Src);. That is, when we move a default constructed (etc.) move_only_function, we treat it as "large" (dynamically allocated). What happens now?

STL/stl/inc/functional

Lines 1385 to 1387 in 3b2a52e

inline void __stdcall _Function_move_large(_Move_only_function_data& _Self, _Move_only_function_data& _Src) noexcept {
_CSTD memcpy(&_Self._Data, &_Src._Data, _Minimum_function_size); // Copy Impl* and functor data
}

This is a memcpy() of the first two pointers of _Move_only_function_data, i.e. the _Data._Impl that we previously set to null (and definitely had to), followed by the _Data._Pointers[1] that _Data._Set_large_fn_ptr(nullptr); previously assigned to (i.e. the assignment in question here):

STL/stl/inc/functional

Lines 1340 to 1342 in 3b2a52e

// Size of a large function. Treat an empty function as if it has this size.
// Treat a small function as if it has this size too if it fits and is trivially copyable.
inline constexpr size_t _Minimum_function_size = 2 * sizeof(void*);

💡 Conclusions

Because memcpy() can copy anything, including uninitialized bytes, @AlexGuteniev (the original author) and I now believe that _Construct_with_null() does not need to do the extra work of _Data._Set_large_fn_ptr(nullptr);.

The comment of // initialize, since we'll be copying it was written under the impression that if we left the _Large_fn_ptr (i.e. _Pointers[1]) uninitialized, that would spell trouble when a later move of the move_only_function attempted to copy that pointer's bits (followed by nulling out the source). But since we copy this pointer with memcpy(), leaving it uninitialized would have been fine.

When _Data._Impl is null, nothing else in the code should care about the _Large_fn_ptr (i.e. _Pointers[1]). We inspect _Data._Impl to determine whether the move_only_function is empty:

STL/stl/inc/functional

Lines 1961 to 1963 in 3b2a52e

_NODISCARD explicit operator bool() const noexcept {
return !this->_Is_null();
}

STL/stl/inc/functional

Lines 1582 to 1584 in 3b2a52e

_NODISCARD bool _Is_null() const noexcept {
return _Data._Impl == nullptr;
}

And if someone tries to call an empty move_only_function, we go through _Get_invoke() which will pick up the _Null_move_only_function's _Function_not_callable<_Rx, _Types...>:

STL/stl/inc/functional

Lines 1591 to 1593 in 3b2a52e

_NODISCARD auto _Get_invoke() const noexcept {
return _Get_impl(_Data)->_Invoke;
}

This makes sense, because _Data._Impl is a pseudo-vtable, and needs to control what the move_only_function does with its representation. It switches between dealing with the representation in null mode, small mode, or large mode (and we've seen that null mode bypasses looking at _Large_fn_ptr). We can't possibly "punch through" and directly inspect _Large_fn_ptr without going through _Data._Impl, because that would be wrong if we were in small mode.

The only thing that manipulates _Large_fn_ptr in null mode is _Checked_move() as we've already seen - which unifies the null mode and large mode cases into a memcpy() of two function pointers.

🛠️ Potential change

I believe that we can remove _Construct_with_null() and replace its usage with _Reset_to_null(), avoiding the unnecessary work of nulling out the _Large_fn_ptr which we don't need to use. This would be strictly simpler, and remove the confusing line/comment that prompted my investigation (it made me worry that we actually did need to care about _Large_fn_ptr in null mode).

It could be possible to split how we handle null mode and large mode in _Function_move_large() (copying only one null pointer for null mode), but that would be a lot of logic change for a minimal improvement. (How often are we moving empty move_only_functions?) I recommend against exploring that path.

(I noticed this while implementing Destructor Tombstoning but I'm filing a followup issue since I don't want to randomize that PR with unnecessary logic changes.)

Metadata

Metadata

Assignees

No one assigned

    Labels

    fixedSomething works now, yay!performanceMust go faster

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions