Foundation Classes - Optimize NCollection_FlatMap and NCollection_FlatDataMap internals#1103
Conversation
…d NCollection_FlatDataMap Replace uint8_t probe distance storage/counters with size_t in both flat hash containers, so long Robin Hood probe chains are handled without overflow and without throwing "excessive probe length" exceptions. - remove THE_MAX_PROBE_DISTANCE and probe-length throw checks - update probe bookkeeping in find/insert/try-insert/bind/emplace paths - keep lookup early-exit logic based on stored probe distance - update class documentation: probe distance is bounded by table capacity - add collision stress tests: - NCollection_FlatMapTest.LongProbeSequence - NCollection_FlatDataMapTest.LongProbeSequence
There was a problem hiding this comment.
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (6)
src/FoundationClasses/TKernel/NCollection/NCollection_FlatDataMap.hxx:905
- The removal of exception throwing for excessive probe length changes the behavior for users who may have been catching Standard_OutOfRange exceptions. This is an API behavior change that could affect existing code. While the new behavior is arguably better (no arbitrary limits), consider documenting this breaking change in release notes or comments to help users understand that:
- The exception "NCollection_FlatDataMap: excessive probe length" will no longer be thrown
- Operations may take longer with pathological hash functions instead of failing fast
- Users should ensure their hash functions distribute keys reasonably well
This is particularly important since the old behavior provided a safety mechanism against severely degraded performance.
++aProbe;
aIndex = (aIndex + 1) & aMask;
}
src/FoundationClasses/TKernel/NCollection/NCollection_FlatMap.hxx:83
- Changing probe distance from uint8_t to size_t significantly increases memory usage per slot. On 64-bit systems, this adds 7 bytes per slot (8 bytes for size_t vs 1 byte for uint8_t). For a table with many entries, this could substantially increase memory footprint.
Consider whether there's a middle ground: could you use uint16_t or uint32_t to handle reasonable probe distances while limiting memory impact? A uint16_t would allow probe distances up to 65,535 which should be sufficient for most real-world scenarios while only adding 1 byte per slot. Alternatively, document this as a known tradeoff if size_t is deemed necessary.
size_t myProbeDistance; //!< Distance from ideal bucket (for Robin Hood)
src/FoundationClasses/TKernel/NCollection/NCollection_FlatDataMap.hxx:94
- Changing probe distance from uint8_t to size_t significantly increases memory usage per slot. On 64-bit systems, this adds 7 bytes per slot (8 bytes for size_t vs 1 byte for uint8_t). For a table with many entries, this could substantially increase memory footprint.
Consider whether there's a middle ground: could you use uint16_t or uint32_t to handle reasonable probe distances while limiting memory impact? A uint16_t would allow probe distances up to 65,535 which should be sufficient for most real-world scenarios while only adding 1 byte per slot. Alternatively, document this as a known tradeoff if size_t is deemed necessary.
size_t myProbeDistance; //!< Distance from ideal bucket (for Robin Hood)
src/FoundationClasses/TKernel/GTests/NCollection_FlatMap_Test.cxx:172
- The test uses 400 elements with a constant hasher to create worst-case collisions. While this effectively tests the removal of the probe cap, consider testing with even higher values to ensure the new size_t-based implementation truly handles extreme cases. Also, consider adding a comment explaining why 400 was chosen as the test size (presumably because it's significantly larger than the old 250 limit).
TEST_F(NCollection_FlatMapTest, LongProbeSequence)
{
struct ConstantHasher
{
size_t operator()(int) const { return 0; }
bool operator()(int theKey1, int theKey2) const { return theKey1 == theKey2; }
};
constexpr int THE_NUM_ELEMENTS = 400;
NCollection_FlatMap<int, ConstantHasher> aMap;
for (int i = 0; i < THE_NUM_ELEMENTS; ++i)
{
EXPECT_TRUE(aMap.Add(i));
}
EXPECT_EQ(THE_NUM_ELEMENTS, aMap.Size());
for (int i = 0; i < THE_NUM_ELEMENTS; ++i)
{
EXPECT_TRUE(aMap.Contains(i)) << "Key " << i << " not found";
}
}
src/FoundationClasses/TKernel/GTests/NCollection_FlatDataMap_Test.cxx:325
- The test uses 400 elements with a constant hasher to create worst-case collisions. While this effectively tests the removal of the probe cap, consider testing with even higher values to ensure the new size_t-based implementation truly handles extreme cases. Also, consider adding a comment explaining why 400 was chosen as the test size (presumably because it's significantly larger than the old 250 limit).
TEST_F(NCollection_FlatDataMapTest, LongProbeSequence)
{
struct ConstantHasher
{
size_t operator()(int) const { return 0; }
bool operator()(int theKey1, int theKey2) const { return theKey1 == theKey2; }
};
constexpr int THE_NUM_ELEMENTS = 400;
NCollection_FlatDataMap<int, int, ConstantHasher> aMap;
for (int i = 0; i < THE_NUM_ELEMENTS; ++i)
{
EXPECT_TRUE(aMap.Bind(i, i * 3));
}
EXPECT_EQ(THE_NUM_ELEMENTS, aMap.Size());
for (int i = 0; i < THE_NUM_ELEMENTS; ++i)
{
EXPECT_TRUE(aMap.IsBound(i)) << "Key " << i << " not found";
EXPECT_EQ(i * 3, aMap.Find(i));
}
}
src/FoundationClasses/TKernel/NCollection/NCollection_FlatMap.hxx:678
- The removal of exception throwing for excessive probe length changes the behavior for users who may have been catching Standard_OutOfRange exceptions. This is an API behavior change that could affect existing code. While the new behavior is arguably better (no arbitrary limits), consider documenting this breaking change in release notes or comments to help users understand that:
- The exception "NCollection_FlatMap: excessive probe length" will no longer be thrown
- Operations may take longer with pathological hash functions instead of failing fast
- Users should ensure their hash functions distribute keys reasonably well
This is particularly important since the old behavior provided a safety mechanism against severely degraded performance.
++aProbe;
aIndex = (aIndex + 1) & aMask;
}
…tDataMap internals - Encode slot state in probe distance (`myProbeDistancePlus1`): 0 = empty, >0 = used and remove explicit `SlotState`/tombstone handling paths. - Replace internal `findSlot()` optional-index API with `findSlotIndex()` bool + out index. - Consolidate insertion logic into `insertRehashedImpl()` variants and reuse cached hash during rehash to avoid redundant hash recomputation. - Tune growth policy to max load factor 13/16 (81.25%) and update `reserve()` math. - Keep behavior and API intact while reducing per-slot metadata overhead and hot-path branching.
myProbeDistancePlus1): 0 = empty, >0 = usedand remove explicit
SlotState/tombstone handling paths.findSlot()optional-index API withfindSlotIndex()bool + out index.insertRehashedImpl()variants and reuse cached hashduring rehash to avoid redundant hash recomputation.
reserve()math.