Remote entity reservation v2 #18266
Conversation
Allocation order is not guaranteed. This test did not reflect that.
flushing is no longer needed for allocation, freeing, etc, so there is no such thing as `needs_flush`, and `worth_flushing` can be true even just by allocating directly.
…iottjPierce/bevy into remote-entity-reservation-v2
This makes AtomicPtr use correct in theory.
still has leak to prevent double free.
I don't want to admit how long I was trying to figure this out. `get_mut` returns `&mut *mut T`, which coerces to `*mut T` **AT THE FIRST REF**
Co-authored-by: Brezak <bezak.adam@proton.me>
This was missing from a review suggestion. I should have caught it there, but instead I'll fix it here.
chescock left a comment
Is the idea here roughly
```rust
struct EntityReservations {
    pending: RwLock<Vec<Entity>>,
    next_pending_index: AtomicIdCursor,
}
```

except that you're combining the atomic for the lock with the atomic for `next_pending_index`, so that instead of three atomic operations (lock, decrement, unlock), you only have to do two (lock-decrement, unlock-decrement)?
I think that could be made to work! It might be easier to reason about if you can encapsulate the lock-plus-counter in a generic lock type, and present a safe API that EntityReservations can use.
But there are still a lot of ways to introduce subtle race conditions, so I'd be wary of building all of this just to go from 3 atomic ops to 2.
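For reference, a minimal sketch of that shape, with std's `AtomicIsize` standing in for `AtomicIdCursor` and a toy `Entity` type (both stand-ins, not the PR's actual types). It uses `fetch_update` so the counter can never go negative, which sidesteps the undo problem discussed below:

```rust
use std::sync::RwLock;
use std::sync::atomic::{AtomicIsize, Ordering};

// Stand-in for bevy_ecs's Entity; the real type is more involved.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Entity(pub u32);

pub struct EntityReservations {
    pub pending: RwLock<Vec<Entity>>,
    pub next_pending_index: AtomicIsize,
}

impl EntityReservations {
    /// Reserve `num` pending entities: read-lock, one atomic decrement, unlock on drop.
    /// The `fetch_update` closure refuses to go below zero, so no rollback is needed.
    pub fn reserve(&self, num: isize) -> Option<Vec<Entity>> {
        let pending = self.pending.read().ok()?; // lock
        let end = self
            .next_pending_index
            .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |index| {
                (index >= num).then_some(index - num)
            })
            .ok()?; // decrement; returns the previous value on success
        Some(pending[(end - num) as usize..end as usize].to_vec())
        // unlock happens when the read guard drops
    }
}

fn main() {
    let reservations = EntityReservations {
        pending: RwLock::new((0..10).map(Entity).collect()),
        next_pending_index: AtomicIsize::new(10),
    };
    println!("{:?}", reservations.reserve(4));
}
```

This is the three-op baseline (lock, decrement, unlock); the PR's trick is to fold the decrement into the lock/unlock atomics themselves.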
```rust
    .fetch_sub(num, Ordering::Relaxed);
Some(result)
} else {
    // unsuccessful, so we need to undo the change `next_pending_index_intention` to keep parity.
```
I think you need to use a compare-exchange loop instead of making a change and trying to undo it. Something like
```rust
self.next_pending_index_intention
    .fetch_update(set_order, fetch_order, |index| (index >= num).then_some(index - num))
```

Otherwise you can get weird values while some operations are in progress. For example, suppose `upper_pending_index` is 10, and we start operations for `num` of 20, 1, and 5, and they interleave like this:
- index is 10, op for 20 does fetch_sub(20)
- index is -10, op for 1 does fetch_sub(1)
- index is -11, op for 20 rolls back and does fetch_add(20)
- index is 9, op for 5 does fetch_sub(5)
- index is 4, op for 1 rolls back and does fetch_add(1)
- index is 5
The only successful operation was for 5, but it takes the slice [4..9] instead of [5..10]. So the entity at index 10 is never taken, and the one at index 4 will be taken twice.
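Replaying that interleaving deterministically on a single thread (a sketch on a plain `AtomicIsize`, not the PR's code) shows the bad slice directly:

```rust
use std::sync::atomic::{AtomicIsize, Ordering};

// Replay the interleaving above. Returns (value seen by the op for 5, final index).
fn replay() -> (isize, isize) {
    let index = AtomicIsize::new(10);
    index.fetch_sub(20, Ordering::Relaxed); // op for 20; index = -10
    index.fetch_sub(1, Ordering::Relaxed); // op for 1; index = -11
    index.fetch_add(20, Ordering::Relaxed); // op for 20 rolls back; index = 9
    // The op for 5 sees 9 mid-rollback, so it claims [4..9] instead of [5..10].
    let seen_by_5 = index.fetch_sub(5, Ordering::Relaxed); // index = 4
    index.fetch_add(1, Ordering::Relaxed); // op for 1 rolls back; index = 5
    (seen_by_5, index.load(Ordering::Relaxed))
}

fn main() {
    let (seen_by_5, final_index) = replay();
    // The op for 5 takes the slice [seen_by_5 - 5 .. seen_by_5] = [4..9].
    println!("op for 5 saw {seen_by_5}, final index {final_index}");
}
```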
Maybe I'm missing something. The first fetch_sub(20) isn't added back, it's subtracted from next_pending_index_truth, signifying a successful op. So that first one would get indices [0..10] exclusively. This leaves next_pending_index_intention at -10.
All the other ops just subtract num, see that it's negative, and add it back.
I don't think there's an issue here. Unless I'm not seeing some ordering problem??
Oh, sorry, I thought the <= 0 check was on the new value, not the old value.
I think there is still a problem if the next_pending_index_intention goes from negative to positive in between the fetch_sub and fetch_add, though. It looks like that can only happen from pending_scope_mut. So, I think the sequence is
- `pending_scope_mut` does a fetch_sub
- op for 1 does fetch_sub(1) and fails
- `pending_scope_mut` finishes and does a fetch_add
- op for 5 does fetch_sub(5) and succeeds
- op for 1 rolls back and does fetch_add(1)
And the op for 5 saw a value that was too small because it started before the op for 1 finished rolling back.
Ok, I'm going to walk through this to help me think through it. Let's start from both the intention and truth values being equal to some medium positive value x, let's say 4.
- `pending_scope_mut` does a fetch_sub. Intention: x - u32::MAX, truth: x
- reserve for 1 starts. Intention: x - u32::MAX - 1, truth: x
- `pending_scope_mut` finishes with exclusive access. Intention: x - 1 + y, truth: x + y (where y is the `diff` for resizes/rebalancing).
- reserve for 5 starts. Intention: x + y - 1 - 5, truth: x + y.
- reserve for 1 fails (it was negative). Intention: x + y - 5, truth: x + y. (The reserve then returns a new index. One may have been available, but it was not safe to access.)
- reserve for 5 succeeds. It uses the slice ending on x + y - 1. Intention: x + y - 5, truth: x + y - 5. Oh no! We skipped the pending entity at x + y!
Yikes! Good catch!
When we base the slice on the intention, and the intention fails and is added to, reservations that happen in the middle skip whatever slice the failed reservation would have used.
I'm going to think of a fix here, but I wanted to go ahead and reply.
```rust
let prev_pending_cap = pending.capacity();

if next_pending_index == next_pending_index_intention {
    // SAFETY: we just told all the remote entities that the next index is negative,
```
This claims that things happen in a certain order, but all of the operations are Relaxed, so there are very few guarantees. For example, the compiler and CPU are free to reorder the accesses in pending_scope to be after the final next_pending_index_truth.fetch_sub. You need that fetch_sub to use Ordering::Release and the load here to use Ordering::Acquire in order to ensure that these accesses to the Vec happen after those.
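A minimal sketch of the Release/Acquire pairing being described, with a single atomic slot standing in for the `Vec` contents (names here are illustrative, not the PR's):

```rust
use std::sync::atomic::{AtomicI32, AtomicIsize, Ordering};
use std::thread;

// Message-passing pattern: the writer's data store must be visible to any
// reader that observes the Release decrement with an Acquire load.
fn message_passing() -> i32 {
    let data = AtomicI32::new(0); // stands in for a slot in the pending Vec
    let truth = AtomicIsize::new(1); // stands in for next_pending_index_truth
    thread::scope(|s| {
        s.spawn(|| {
            data.store(42, Ordering::Relaxed); // write the "Vec" contents
            // Release: the store above cannot be reordered after this decrement.
            truth.fetch_sub(1, Ordering::Release);
        });
        // Acquire: once we observe the decrement, we also observe the data write.
        while truth.load(Ordering::Acquire) != 0 {
            std::hint::spin_loop();
        }
        data.load(Ordering::Relaxed)
    })
}

fn main() {
    println!("{}", message_passing());
}
```

With both operations `Relaxed`, the reader could legally observe the decrement but not the data write; the Release/Acquire pair rules that out.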
That makes sense. I may not get the chance to fix this until Monday, but I do see the problem and I will fix it.
This is exactly the kind of thing I'll need help catching, so thanks!
```rust
self.pending_scope_mut(|pending, next_pending_index, in_use_elsewhere| {
    // flush
    let new_pending_len = (next_pending_index + 1).max(0) as usize;
    for reused in pending.drain(new_pending_len..) {
```
This touches pending before checking in_use_elsewhere. Won't this be UB because drain takes &mut access to the contents of the slice while another thread could have & access?
The other threads can only access the contents at/before next_pending_index, so the drain doesn't affect them. That is what determines the "length" in terms of what is still pending, though it's not the length in terms of Vec::len. Hopefully that made sense :).
Drain doesn't cause reallocations or changes to capacity, so it satisfies safety. All it does is bring Vec::len back into agreement with next_pending_index. We have to return before the rebalancing since that can resize the vec or access contents before next_pending_index.
It's also worth mentioning that there's almost never a conflict here. A remote registration would have to be in progress when flush is called.
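The claim that `drain` only brings `len` down without reallocating or moving the buffer can be checked directly on a plain `Vec`:

```rust
// Drain a tail range from `pending`, returning the drained elements, the
// remaining head, and whether capacity and the buffer pointer were preserved.
fn drain_tail(mut pending: Vec<u32>, new_len: usize) -> (Vec<u32>, Vec<u32>, bool, bool) {
    let cap_before = pending.capacity();
    let ptr_before = pending.as_ptr();
    let reused: Vec<u32> = pending.drain(new_len..).collect();
    let same_cap = pending.capacity() == cap_before; // no reallocation
    let same_ptr = pending.as_ptr() == ptr_before; // same backing buffer
    (reused, pending, same_cap, same_ptr)
}

fn main() {
    let (reused, pending, same_cap, same_ptr) = drain_tail((0..8).collect(), 3);
    println!("reused={reused:?} pending={pending:?} same_cap={same_cap} same_ptr={same_ptr}");
}
```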
Oh, I see, the implementation of drain() never actually touches the rest of the slice, so that's not UB.
Making an &mut Vec when we don't have exclusive access to the entire slice still makes me really nervous! I think it would be a little safer to create only a &[T] when the slice was partially locked. You could combine the pending and in_use_elsewhere parameters into a Result, or even pass two closures for the two cases.
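A toy sketch of the `Result` idea, with a plain `bool` standing in for the real lock state (the names and shape here are hypothetical, not the PR's API):

```rust
// Encode the access level in a Result: the closure only gets `&mut Vec`
// when access is exclusive, and a read-only `&[T]` view otherwise.
struct Pending {
    entities: Vec<u32>,
    in_use_elsewhere: bool, // stand-in for "the slice is partially locked"
}

impl Pending {
    fn scope<R>(&mut self, f: impl FnOnce(Result<&mut Vec<u32>, &[u32]>) -> R) -> R {
        if self.in_use_elsewhere {
            f(Err(&self.entities)) // shared: read-only view
        } else {
            f(Ok(&mut self.entities)) // exclusive: may resize, drain, etc.
        }
    }
}

fn main() {
    let mut p = Pending { entities: vec![1, 2, 3], in_use_elsewhere: false };
    p.scope(|access| {
        if let Ok(v) = access {
            v.push(4); // only possible with exclusive access
        }
    });
    p.in_use_elsewhere = true;
    let len = p.scope(|access| access.map(|v| v.len()).unwrap_or_else(|s| s.len()));
    println!("{len}");
}
```

The type system then enforces that no `&mut Vec` exists while other threads may hold shared references, instead of relying on safety comments.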
> Making an `&mut Vec` when we don't have exclusive access to the entire slice still makes me really nervous!

Agreed. I'm making better safety comments to help combat this.

> I think it would be a little safer to create only a `&[T]` when the slice was partially locked. You could combine the `pending` and `in_use_elsewhere` parameters into a `Result`, or even pass two closures for the two cases.
I tried this originally. I agree this would be better in an ideal world. But even if something is accessing the pending part of the vec, we still need to be able to change the vec's length to signify that the flush happened. I don't see a good solution. Instead, I'm throwing safety comments everywhere. It's very unsafe code, but it's also only a tiny function to check the safety for.
```rust
Entity::from_raw(
    u32::try_from(self.meta.len() as IdCursor - n).expect("too many entities"),
)

self.pending_scope(1, |reserved| {
```
If I understand correctly, then when a flush is running, pending_scope_mut ensures that calls to pending_scope return None. And when that happens, we fall back to allocating new entity indexes. Is that right?
Won't that mean that meta_len continues to grow as the app runs, even if the total number of live entities doesn't? Even if there are always enough recycled Entity values to use, we'll occasionally lose the race for the lock and allocate more memory in entity_meta.
That's exactly right. Keep in mind that this only happens if someone is remotely reserving an entity at precisely the same time as flush_pending. And even that reserved entity can be reused. We have 4 billion entities. This might eat four or five of those, maybe.
If you can think of a better solution, I'm all for it, but I think this is a reasonable price to pay IMO.
My concern is that even a small memory leak will make Bevy unusable for really long-running applications.
I suppose it's mitigated here by the fact that flush is only called with &mut World access, so if you're running in a regular system then commands.spawn() is guaranteed to be able to acquire a read lock. But if we start using these for assets, then an application that cycles through different assets will leak memory.
To clarify, this isn't a memory leak. The entities that are reserved in this (very rare) case are not lost, and they are recycled as well.
This only ever happens when you have something like allocating from async during an `&mut World` flush. Very rare. Let's say this happens 10_000 times, which is completely absurd. That's wasted memory of a grand total of 200_000 bytes, not even a megabyte.
And it's only wasted if the app doesn't end up using them. Having extra pending entities can help balance pending vs owned. At the very least, these extra entities would speed up alloc. But again, there would be very few of these.
I really think this is a non-issue. But maybe I'm missing something. Let me know if you think of a situation that could cause a problem here.
> To clarify, this isn't a memory leak. The entities that are reserved in this (very rare) case are not lost, and they are recycled as well.
I was being a little loose with my terminology, and referring to the fact that meta: Vec<EntityMeta> and pending will grow indefinitely. So it's not a memory leak in the sense that any individual EntityMeta will eventually get re-used, but it is a memory leak in the sense that the application will consume more memory over time even if the number of live entities stays constant.
Anyway, if you're planning to switch to an actual RwLock and we think contention is rare, then we can just use blocking read() and write() locks and the problem will go away!
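A small sketch of that blocking-lock behavior: a reservation that races with a flush simply waits for the write lock, then reuses a recycled index instead of allocating a fresh one (all names and values here are illustrative):

```rust
use std::sync::{mpsc, Arc, RwLock};
use std::thread;
use std::time::Duration;

// `pending` stands in for the recycled-entity list. A blocking read() waits
// out a concurrent flush, so `meta` never grows just because we lost a race.
fn reserve_during_flush() -> Option<u32> {
    let pending = Arc::new(RwLock::new(vec![7u32, 8, 9]));
    let (started_tx, started_rx) = mpsc::channel();

    let flusher = {
        let pending = Arc::clone(&pending);
        thread::spawn(move || {
            let mut guard = pending.write().unwrap(); // flush holds the write lock
            started_tx.send(()).unwrap(); // signal: the flush has definitely started
            thread::sleep(Duration::from_millis(20)); // simulate rebalancing work
            guard.push(10); // flush recycles another entity
        })
    };

    started_rx.recv().unwrap();
    // Blocks until the flush releases the lock, then reuses a pending entity.
    let reused = pending.read().unwrap().last().copied();
    flusher.join().unwrap();
    reused
}

fn main() {
    println!("{:?}", reserve_during_flush());
}
```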
Ok. I see. I still think this is such a small leak that it's a non-issue, but you're right, the RwLock will fix it anyway, so let's not worry about it.
Just to keep you in the loop, I've had another idea for how to do this. Should be even faster, but we'll see. I'm going to experiment with that and then come back to the RwLock if it doesn't work. I've been inspired by some of your ideas and the left_right crate. I think I can cook up something that is fully lock free with minimal atomics. But I'll know more soon.
this was the breaking change I forgot about
That's pretty much exactly the idea, yes. There's some extra data in there like
I could do that. I don't think we would reuse it elsewhere, but it may make it easier to understand. Then again, it's also very connected to the rest of
This is hard, and it needs lots of review, but we don't just save one atomic op for this. Here's the inner RwLock's read functions:

```rust
#[inline]
pub fn try_read(&self) -> bool {
    self.state.fetch_update(Acquire, Relaxed, read_lock).is_ok()
}

#[inline]
pub fn read(&self) {
    if !self.try_read() {
        self.lock_contended(false)
    }
}
```

So this trick actually goes from 5 atomic ops with some additional bit work to 2 atomic ops that just add/subtract (which does have direct hardware support), so this actually saves quite a lot. We are leveraging the fact that these ops are commutative and the ordering doesn't matter to get the speed up. I know this takes careful review to get right, and I don't pretend to have caught all the race conditions here myself, but if we can sort out those details, I think this custom solution is worth the effort.
@chescock has found a lot of nasty bugs. I have a solution for them. When I get the chance, I'll open a v3 of this.
I'm going to close this in favor of v4. At time of writing, the final choice is up in the air, but it's pretty clear that v4 is better than this in every way. Thanks for all the review on this!
fixes #18003
Objective
Long story short: we need to be able to reserve entities from outside `&World`.
This is an alternative to #18195.
Solution
Thanks to everyone who helped brainstorm on discord. There's still ideas being tossed around, but here's how this PR works:
All entity meta is kept in a single unlocked `Vec<EntityMeta>`. We keep a pending list that can be used remotely to reserve entities, etc.
When we allocate an entity directly, we pop from the new `owned: Vec<Entity>`, and if it's empty, we reserve some more entities in bulk. These entities are not flushed.
To ensure we don't lose entities anywhere, flushing rebalances `pending` and `owned`. We can tweak the proportions later, but currently, it splits the pending entities between the two evenly.
Practical Change
It used to be that a brand new `Entities` would yield sequential indices via `alloc`. This was never a guarantee, but many tests relied on this ordering. The ordering is still deterministic but is no longer sequential or aligned. In other words, if you allocate 5 entities, you might get back indices `[0, 4, 2, 6, 7]`. Generally, entities like this are opaque, so this should not affect users. However, tests and benchmarks that depended on this previous, non-guaranteed behavior had to be reworked.
One practical effect this may have is that systems that run through `Entity`s sequentially will see a different order. This may affect query orders, observer ordering, etc. None of these things were guarantees before, but this may expose some existing bugs in those systems. I suspect this is why some rendering systems broke.
Future Work
Flushing is now completely unnecessary for many of the interactions with [`Entities`]. We still need to do it before applying command queues, etc., but we don't need to flush before allocations or freeing. If we go in this direction, we should explore relaxing those requirements for performance.
Functionality hasn't changed, so no new tests were added. A few old tests were revised because of the practical changes.
However, it makes sense to add tests for remote registration, so if anyone has specific concerns, I'd be happy to write a test for it.