Foundation Classes - Tree & collection performance optimizations, move semantics, unified map API#1065
Conversation
…e semantics, dead code cleanup NCollection_UBTree/EBTree: - Add move constructor and move assignment operators - Replace recursive Select() and delNode() with iterative stack-based traversal to avoid stack overflow on deeply unbalanced trees - Optimize EBTree::Add() and Remove() to use single-lookup ChangeSeek() instead of double-lookup UnBind()+Bind() / Contains()+operator() - Remove unused DEFINE_HUBTREE / DEFINE_HEBTREE / IMPLEMENT_HUBTREE / IMPLEMENT_HEBTREE macros - Remove unused includes from EBTree (Standard_Type, Standard_Transient, NCollection_List, Standard_Integer, NCollection_Sequence) - Fix doxygen @param tags and comment style NCollection_LocalArray: - Add move constructor and move assignment operators with optimized three-way branching (stack-stack copy, heap-heap swap, stack-heap steal) - Add Reallocate() method supporting grow-with-copy for use as a dynamically growable stack NCollection_CellFilter: - Replace const_cast destructive-copy hack in Cell with proper copy and move semantics using NCollection_LocalArray move support - Replace C-style cast with explicit const_cast in add() - Change ListNode default constructor from runtime throw to = delete - Use size_t for dimension loops and add dimension size guard in IsEqual - Remove SUN WorkShop 5.3 workaround - Fix typo "usially" -> "usually" in class documentation NCollection_Map / NCollection_FlatMap: - Add Seek() and ChangeSeek() methods returning pointer to key in map (nullptr if not found) for single-lookup access patterns Dead compiler workaround removal: - NCollection_DefineAlloc: remove Borland/SUN #if branch, keep only the version with placement delete - NCollection_SparseArrayBase: remove SUN WorkShop 5.3 workaround GTests: - Add move constructor/assignment tests for LocalArray, UBTree, EBTree - Add Seek/ChangeSeek tests for NCollection_Map and NCollection_FlatMap - Add CellFilter tests and UBTree deep-unbalanced-tree stress test
There was a problem hiding this comment.
Pull request overview
Performance- and robustness-focused update to OCCT FoundationClasses collections/trees, adding move semantics, removing legacy compiler workarounds/dead code, and reducing recursion risks on deep/unbalanced structures.
Changes:
- Added move ctor/assignment and iterative traversal to UBTree/EBTree; optimized EBTree map updates to single-lookups.
- Enhanced
NCollection_LocalArraywith move semantics andReallocate()for growable-stack use. - Added
Seek()/ChangeSeek()APIs and expanded GTest coverage for new behaviors; removed obsolete SUN/Borland workarounds.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| src/FoundationClasses/TKernel/NCollection/NCollection_UBTree.hxx | Iterative traversal for Select/delete to avoid recursion; adds move operations; removes dead macros/comments. |
| src/FoundationClasses/TKernel/NCollection/NCollection_EBTree.hxx | Adds move operations; optimizes Add/Remove with single-lookups; removes unused includes/macros. |
| src/FoundationClasses/TKernel/NCollection/NCollection_LocalArray.hxx | Adds move ctor/assignment and Reallocate() to support growable buffers. |
| src/FoundationClasses/TKernel/NCollection/NCollection_Map.hxx | Adds Seek() / ChangeSeek() returning pointers for single-lookup patterns. |
| src/FoundationClasses/TKernel/NCollection/NCollection_FlatMap.hxx | Adds Seek() / ChangeSeek() returning pointers for single-lookup patterns. |
| src/FoundationClasses/TKernel/NCollection/NCollection_CellFilter.hxx | Removes obsolete compiler workaround; improves Cell copy/move; uses ChangeSeek; cleans up empty cells; fixes typo. |
| src/FoundationClasses/TKernel/NCollection/NCollection_SparseArrayBase.hxx | Removes SUN WorkShop 5.3 workaround in access modifier section. |
| src/FoundationClasses/TKernel/NCollection/NCollection_DefineAlloc.hxx | Removes Borland/SUN conditional branch; keeps placement delete version only. |
| src/FoundationClasses/TKernel/GTests/NCollection_UBTree_Test.cxx | New tests for UBTree/EBTree selection, deep unbalance, and move operations. |
| src/FoundationClasses/TKernel/GTests/NCollection_Map_Test.cxx | Adds tests for Seek() / ChangeSeek(). |
| src/FoundationClasses/TKernel/GTests/NCollection_FlatMap_Test.cxx | Adds tests for Seek() / ChangeSeek(). |
| src/FoundationClasses/TKernel/GTests/NCollection_LocalArray_Test.cxx | Adds tests for Reallocate() and move operations. |
| src/FoundationClasses/TKernel/GTests/NCollection_CellFilter_Test.cxx | New tests for add/inspect/remove/purge and empty-cell cleanup. |
| src/FoundationClasses/TKernel/GTests/FILES.cmake | Registers new GTest sources. |
| //! Reallocate the array to a new size. | ||
| //! @param[in] theNewSize new number of elements | ||
| //! @param[in] theToCopy if true, existing elements are copied to the new buffer | ||
| void Reallocate(const size_t theNewSize, bool theToCopy = true) | ||
| { | ||
| Deallocate(); | ||
| if (theSize > MAX_ARRAY_SIZE) | ||
| myPtr = (theItem*)Standard::Allocate(theSize * sizeof(theItem)); | ||
| else | ||
| myPtr = myBuffer; | ||
| if (theNewSize <= mySize) | ||
| { | ||
| // Shrinking - just update the logical size, keep existing allocation | ||
| mySize = theNewSize; | ||
| return; | ||
| } | ||
|
|
||
| const bool isOnHeap = (myPtr != myBuffer); | ||
| const size_t aNewBytes = theNewSize * sizeof(theItem); | ||
|
|
||
| mySize = theSize; | ||
| if (theNewSize <= static_cast<size_t>(MAX_ARRAY_SIZE)) | ||
| { | ||
| // New size fits in stack buffer | ||
| if (isOnHeap) | ||
| { | ||
| if (theToCopy && mySize > 0) | ||
| { | ||
| std::memcpy(myBuffer, myPtr, std::min(mySize, theNewSize) * sizeof(theItem)); | ||
| } | ||
| Standard::Free(myPtr); | ||
| myPtr = myBuffer; | ||
| } | ||
| mySize = theNewSize; | ||
| return; | ||
| } | ||
|
|
||
| if (isOnHeap) | ||
| { | ||
| // Already on heap - use Standard::Reallocate (preserves content when growing) | ||
| if (theToCopy) | ||
| { | ||
| myPtr = (theItem*)Standard::Reallocate(myPtr, aNewBytes); | ||
| } | ||
| else | ||
| { | ||
| Standard::Free(myPtr); | ||
| myPtr = (theItem*)Standard::Allocate(aNewBytes); | ||
| } | ||
| } | ||
| else | ||
| { | ||
| // Stack to heap transition | ||
| myPtr = (theItem*)Standard::Allocate(aNewBytes); | ||
| if (theToCopy && mySize > 0) | ||
| { | ||
| std::memcpy(myPtr, myBuffer, std::min(mySize, theNewSize) * sizeof(theItem)); | ||
| } | ||
| } | ||
| mySize = theNewSize; | ||
| } |
There was a problem hiding this comment.
NCollection_LocalArray is templated on arbitrary theItem, but Reallocate() uses std::memcpy() and Standard::Reallocate() which are only safe for trivially copyable/trivially relocatable types. This can silently break invariants (and/or cause double-destruction / leaks) when theItem has non-trivial copy/move/destructor. Consider enforcing this with a static_assert(std::is_trivially_copyable_v<theItem> && std::is_trivially_destructible_v<theItem>), or reworking the implementation to perform element-wise move/copy construction into newly allocated storage and properly destroy old elements.
| //! Copy constructor | ||
| Cell(const Cell& theOther) | ||
| : index(theOther.index.Size()) | ||
| : index(theOther.index.Size()), | ||
| Objects(theOther.Objects) | ||
| { | ||
| (*this) = theOther; | ||
| const size_t aDim = theOther.index.Size(); | ||
| for (size_t anIdx = 0; anIdx < aDim; anIdx++) | ||
| index[anIdx] = theOther.index[anIdx]; | ||
| } | ||
|
|
||
| //! Assignment operator: ensure that list is not deleted twice | ||
| void operator=(const Cell& theOther) noexcept | ||
| //! Move constructor: transfers ownership of the object list | ||
| Cell(Cell&& theOther) noexcept | ||
| : index(std::move(theOther.index)), | ||
| Objects(theOther.Objects) | ||
| { | ||
| int aDim = int(theOther.index.Size()); | ||
| for (int anIdx = 0; anIdx < aDim; anIdx++) | ||
| index[anIdx] = theOther.index[anIdx]; | ||
| theOther.Objects = nullptr; | ||
| } | ||
|
|
||
| Objects = theOther.Objects; | ||
| ((Cell&)theOther).Objects = nullptr; | ||
| Cell& operator=(Cell&& theOther) noexcept | ||
| { | ||
| index = std::move(theOther.index); | ||
| Objects = theOther.Objects; | ||
| theOther.Objects = nullptr; | ||
| return *this; | ||
| } |
There was a problem hiding this comment.
The Cell copy ctor currently copies Objects (a owning linked-list head), which can lead to shared ownership and double-destruction when copies occur (e.g., copying the map/filter). Additionally, move-assignment overwrites Objects without first releasing the current list, which can leak targets (their destructors are never called). A safer model here is: (1) copy ctor copies only index and sets Objects = nullptr (since Objects is not part of the key), and (2) move assignment should either clear the current list before taking ownership or implement swap-based move to preserve exception safety and avoid leaks.
| //! ChangeSeek returns modifiable pointer to key in map. Returns NULL if not found. | ||
| TheKeyType* ChangeSeek(const TheKeyType& theKey) | ||
| { | ||
| MapNode* p; | ||
| if (!lookup(theKey, p)) | ||
| return nullptr; | ||
| return &p->ChangeValue(); |
There was a problem hiding this comment.
ChangeSeek() exposes a modifiable pointer to a key stored inside a hash set. Allowing callers to mutate keys in-place can break the container invariants (hash/equality) and lead to hard-to-debug corruption. If the intent is single-lookup access without allowing mutation, prefer returning const TheKeyType* from the non-const overload as well (or returning an opaque node/iterator type). If mutation must remain possible for internal use, strongly document that modifying the key is forbidden (or add runtime checks in debug builds), and consider renaming to avoid implying that mutation is safe.
| //! ChangeSeek returns modifiable pointer to key in map. Returns NULL if not found. | |
| TheKeyType* ChangeSeek(const TheKeyType& theKey) | |
| { | |
| MapNode* p; | |
| if (!lookup(theKey, p)) | |
| return nullptr; | |
| return &p->ChangeValue(); | |
| //! ChangeSeek returns pointer to key in map. Returns NULL if not found. | |
| const TheKeyType* ChangeSeek(const TheKeyType& theKey) | |
| { | |
| MapNode* p; | |
| if (!lookup(theKey, p)) | |
| return nullptr; | |
| return &p->Value(); |
| //! ChangeSeek returns modifiable pointer to key in map. Returns NULL if not found. | ||
| TheKeyType* ChangeSeek(const TheKeyType& theKey) | ||
| { | ||
| if (mySize == 0) | ||
| return nullptr; | ||
| const std::optional<size_t> aIdx = findSlot(theKey); | ||
| if (!aIdx.has_value()) | ||
| return nullptr; | ||
| return &mySlots[*aIdx].Key(); |
There was a problem hiding this comment.
Same concern as NCollection_Map::ChangeSeek(): returning a modifiable pointer to a key in a set-like container makes it easy to accidentally mutate the key and break hashing/ordering invariants. If you only need a single-lookup access path, consider returning const TheKeyType* for both const and non-const overloads (or an iterator/pointer-to-slot with restricted operations). Also, if there is a dedicated non-const accessor (e.g., ChangeKey()), use it for clarity; Key() may be const in other contexts and this pattern is easy to misuse.
| //! ChangeSeek returns modifiable pointer to key in map. Returns NULL if not found. | |
| TheKeyType* ChangeSeek(const TheKeyType& theKey) | |
| { | |
| if (mySize == 0) | |
| return nullptr; | |
| const std::optional<size_t> aIdx = findSlot(theKey); | |
| if (!aIdx.has_value()) | |
| return nullptr; | |
| return &mySlots[*aIdx].Key(); | |
| //! ChangeSeek returns pointer to key in map. Returns NULL if not found. | |
| const TheKeyType* ChangeSeek(const TheKeyType& theKey) const | |
| { | |
| return Seek(theKey); |
| // Collect children arrays during pre-order traversal, free them after. | ||
| NCollection_LocalArray<TreeNode*, 64> aChildArrays(64); | ||
| NCollection_LocalArray<TreeNode*, 64> aStack(64); | ||
| int aNumArrays = 0; | ||
| int aTop = 0; | ||
|
|
||
| aStack[aTop++] = theNode; | ||
|
|
||
| while (aTop > 0) | ||
| { | ||
| if (theNode->myChildren) | ||
| TreeNode* aNode = aStack[--aTop]; | ||
| if (aNode->myChildren) | ||
| { | ||
| delNode(&theNode->myChildren[0], theAlloc); | ||
| delNode(&theNode->myChildren[1], theAlloc); | ||
| theAlloc->Free(theNode->myChildren); | ||
| // Record children array for later freeing | ||
| if (aNumArrays >= static_cast<int>(aChildArrays.Size())) | ||
| aChildArrays.Reallocate(aChildArrays.Size() * 2, true); | ||
| aChildArrays[aNumArrays++] = aNode->myChildren; | ||
|
|
||
| // Push both children for traversal | ||
| if (aTop + 2 > static_cast<int>(aStack.Size())) | ||
| aStack.Reallocate(aStack.Size() * 2, true); | ||
| aStack[aTop++] = &aNode->myChildren[1]; | ||
| aStack[aTop++] = &aNode->myChildren[0]; | ||
| } | ||
| theNode->~TreeNode(); | ||
| aNode->~TreeNode(); | ||
| } | ||
|
|
||
| // Free all collected children arrays | ||
| for (int i = 0; i < aNumArrays; ++i) | ||
| theAlloc->Free(aChildArrays[i]); |
There was a problem hiding this comment.
delNode() avoids recursion (good), but the current pre-order approach defers freeing every myChildren allocation by collecting them all in aChildArrays, which adds extra O(number_of_internal_nodes) memory on top of the traversal stack. You can avoid this extra storage by doing an explicit post-order traversal (e.g., push frames with a visited flag, or push node twice) so each myChildren block is freed immediately after both children are processed.
…st, NCollection_LocalArray_Test, and NCollection_UBTree_Test
…access and add TryEmplace methods in NCollection_Map
| NCollection_LocalArray<TreeNode*, 64> aChildArrays(64); | ||
| NCollection_LocalArray<TreeNode*, 64> aStack(64); |
There was a problem hiding this comment.
Magic number 64 is used as initial stack size in multiple places. Consider defining a named constant (e.g., INITIAL_STACK_SIZE) to improve maintainability and clarify the meaning of this value.
| NCollection_LocalArray<TreeNode*, 64> aChildArrays(64); | |
| NCollection_LocalArray<TreeNode*, 64> aStack(64); | |
| constexpr int INITIAL_STACK_SIZE = 64; | |
| NCollection_LocalArray<TreeNode*, INITIAL_STACK_SIZE> aChildArrays(INITIAL_STACK_SIZE); | |
| NCollection_LocalArray<TreeNode*, INITIAL_STACK_SIZE> aStack(INITIAL_STACK_SIZE); |
| class NCollection_LocalArray | ||
| { | ||
| static_assert(std::is_trivially_copyable<theItem>::value, | ||
| "NCollection_LocalArray uses memcpy/realloc and requires trivially copyable types"); |
There was a problem hiding this comment.
The static_assert error message should clarify that this is a compile-time requirement. Consider updating the message to: 'NCollection_LocalArray requires trivially copyable types for memcpy/realloc operations'.
| "NCollection_LocalArray uses memcpy/realloc and requires trivially copyable types"); | |
| "NCollection_LocalArray requires trivially copyable types for memcpy/realloc operations"); |
| Cell(const CellIndex& theIndex) | ||
| : index(theIndex.Size()), | ||
| Objects(nullptr) |
There was a problem hiding this comment.
The parameter name 'theIndex' is slightly misleading as it suggests a single index value, whereas it's actually a cell index array. Consider renaming to 'theIndexArray' or 'theCellIndex' for clarity.
| const TheKeyType* Seek(const TheKeyType& theKey) const | ||
| { | ||
| MapNode* p; | ||
| if (!lookup(theKey, p)) | ||
| return nullptr; | ||
| return &p->Value(); | ||
| } |
There was a problem hiding this comment.
The variable name 'p' is not descriptive. Consider renaming to 'aNode' to match the naming convention used elsewhere in the codebase (e.g., in the emplaceImpl method).
| CellIndex& theIndex, | ||
| const Cell& theMinIndex, | ||
| const Cell& theMaxIndex, |
There was a problem hiding this comment.
Parameter names 'theMinIndex' and 'theMaxIndex' suggest they are index values, but they are actually Cell objects containing bounding information. Consider renaming to 'theMinCell' and 'theMaxCell' for clarity.
| TheKeyType* ChangeSeek(const TheKeyType& theKey) | ||
| { | ||
| MapNode* p; | ||
| if (!lookup(theKey, p)) | ||
| return nullptr; | ||
| return &p->ChangeValue(); | ||
| } |
There was a problem hiding this comment.
The variable name 'p' is not descriptive. Consider renaming to 'aNode' to match the naming convention used elsewhere in the codebase.
…ethod for key existence checks, replacing Seek method usage in tests and improving code clarity.
…ved readability and consistency
…nce from NCollection_CellFilter_Inspector, improving clarity and consistency in point handling.
…s static constexpr int
…sistency and improved usability
NCollection_UBTree/EBTree:
traversal to avoid stack overflow on deeply unbalanced trees
instead of double-lookup UnBind()+Bind() / Contains()+operator()
IMPLEMENT_HEBTREE macros
NCollection_List, Standard_Integer, NCollection_Sequence)
NCollection_LocalArray:
three-way branching (stack-stack copy, heap-heap swap, stack-heap steal)
dynamically growable stack
NCollection_CellFilter:
semantics; delete copy constructor and copy assignment
use TryEmplaced() for single-lookup cell insertion
instead of C-style cast on Seek()
NCollection map API unification (Contained, TryEmplace, TryBind):
std::reference_wrapper; key-only maps return const key ref,
data maps return std::pair of const key ref + value ref
NCollection_IndexedMap for parity with NCollection_Map
NCollection_DataMap and NCollection_FlatDataMap
Contained())
Dead compiler workaround removal:
the version with placement delete
GTests: