Flush commands after every mutation in WorldEntityMut#16219
Flush commands after every mutation in WorldEntityMut#16219alice-i-cecile merged 10 commits intobevyengine:mainfrom
WorldEntityMut#16219Conversation
90f42bc to
82340bf
Compare
|
Yes please, I'd like a small Migration Guide note here. |
maniwani
left a comment
There was a problem hiding this comment.
Yes, this is pretty much what I was thinking. I assume branch prediction should hide any overhead from this since the condition should basically never be true.
I think hitting this case should issue a warning. And we might want to have a #[track_caller] on the affected EntityWorldMut methods so that anyone who sees the warning knows where to start looking.
Something like...
#[inline(never)]
#[cold]
fn on_entity_invalid(entity: Entity, caller: &'static Location) {
warn!("error[B<insert-error-number-here>]: {caller} tried to modify the entity {:?}, but it no longer exists. See: https://bevyengine.org/learn/errors/b<insert-error-number-here>", entity);
}|
Ah, wait. Since Some of the |
|
|
|
Hello! 👋 I lack context about the changes in this PR, but by petition of @alice-i-cecile , I checked out this branch to see if I can still reproduce the crash of The answer is that no, I cannot. The thing is, I can't repro it in bevy main neither, so this bug must have been fixed at some point between Bevy 0.14.2 (the version where I encountered it) and current Bevy main atow Here, for reference: this is my minimal reproduceable example, on top of Bevy 0.14.2: https://github.com/Maximetinu/bevy/commits/minimal-reproduceable-bug-on-remove-hook/ see changes in latest commit running that modified example does reproduce the bug notice that the cause is just spawning an empty entity during a on_remove hook This is my minimal reproduceable example cherry-picked to this branch: see changes in latest commit, they are the same as in the branch above running that modified example does not reproduce the bug so, this branch works 🥳 However, bevy main also works: https://github.com/Maximetinu/bevy/commits/reproducing-on-remove-hook-panic-on-bevy-main/ which means that, well, this branch does not fix per se this bug, because it is already fixed on Bevy main but at least it means that it's not re-introducing it neither 🙂 Maybe my minimal reproduceable example can help to write a test covering from this specific scenario I hope this helps! |
|
I'd love to see that turned into a test, either in this PR or separately :) |
I have made a test for this, that fails at 0.14.2 but passes in current bevy main, so it should cover from this regression #[test]
fn test_spawn_entities_during_on_remove_hook() {
#[derive(Component)]
struct MyComponent;
App::new()
.add_plugins(MinimalPlugins)
.add_systems(Startup, |world: &mut World| {
world.register_component_hooks::<MyComponent>().on_remove(
|mut world, _entity, _component_id| {
world.commands().spawn_empty();
},
);
})
.add_systems(
Update,
|mut commands: Commands,
my_component_q: Query<Entity, With<MyComponent>>,
mut exit_event: EventWriter<AppExit>,
mut after_first_frame: Local<bool>| {
for entity in &my_component_q {
commands.entity(entity).despawn();
}
commands.spawn(MyComponent);
if *after_first_frame {
exit_event.send(AppExit::Success);
}
*after_first_frame = true;
},
)
.run();
}It's a bit of an unusual test (I'm not even sure if |
|
Let's spin this out into a new issue :) |
|
@maniwani Updated design (just quick notes for now):
|
12d4e58 to
ef4fad4
Compare
|
I changed this pull to instead panic rather than warn and try to behave if nothing is wrong. The reason for this is that if we only try to warn, there's a quite wide ranging knock-off effect into |
|
@alice-i-cecile @maniwani I'd love to get feedback if the new panicing approach is good or bad. Also, is it reasonable to add a panics section to all the methods which might panic for this. It should truly be a corner case when this happens, so I'm not sure if it's sensible to spam it everywhere (and it cannot be put into some of the non-fallible |
|
Implementation of panic is still placeholderish, just to note. |
For what it's worth, this behavior is widely hated :) I think I'm broadly on board with panicking here though, especially as a first pass. This really is an edge case, and I think that trying to figure out better ways to prevent this bad state from ever arising is more useful than bubbling the error up. |
59f2798 to
5075420
Compare
edc5cd2 to
eddf012
Compare
eddf012 to
aca60b6
Compare
b90ed0b to
4668d66
Compare
|
If #16499 gets merged before this one, one of the test cases will break, as it should when the ordering of hooks and observers is changed. |
|
@alice-i-cecile -S-Waiting-on-Author +S-Needs-Review 🙇 |
alice-i-cecile
left a comment
There was a problem hiding this comment.
Excellent tests and good analysis. I'd like to clean up the (very niche) panics here by swapping to results, but that's best left for follow-up.
chescock
left a comment
There was a problem hiding this comment.
I don't entirely understand what world.flush() does, but I do believe this PR calls assert_not_despawned(); before every read or write of the entity or its components and calls world.flush() and update_location() after every write, so self.location should always be correct.
The one exception is that the world_scope() method calls update_location() but not world.flush(). Should that call world.flush() as well?
| #[inline(always)] | ||
| #[track_caller] | ||
| fn assert_not_despawned(&self) { | ||
| if self.location.archetype_id == ArchetypeId::INVALID { |
There was a problem hiding this comment.
This could use the is_despawned() method if you wanted.
| if self.location.archetype_id == ArchetypeId::INVALID { | |
| if self.is_despawned() { |
|
An alternative to checking in every method if the entity has been despawned would be to panic after calling world.flush instead of checking in each method. This would disallow despawning the entity in hooks, but I think that makes sense that you wouldn't be able to despawn the EntityWorldMut while holding it. |
World flush runs any The There has been talk of having commands apply immediately when holding a |
It would be an alternative, but instead the aim is to go to another direction, to make the methods fallible instead. Also, there are two other reasons. First, there aren't any other ways to make archetype changes to entities but to take Second, you can't really prevent hooks and observers from making these types of changes in some special cases. So there isn't a workaround for that. Where as just dropping a |
|
FWIW i think panicking is fine here; there's similar stuff in the engine that already works this way. We can reconsider it in a broader review of bevy's error handling. But especially for guarding invariant going into unsafe code, panicking is probably the way to go. |
…6219) # Objective - Currently adding observers spawns an entity which implicitly flushes the command queue, which can cause undefined behaviour if the `WorldEntityMut` is used after this - The reason `WorldEntityMut` attempted to (unsuccessfully) avoid flushing commands until finished was that such commands may move or despawn the entity being referenced, invalidating the cached location. - With the introduction of hooks and observers, this isn't sensible anymore as running the commands generated by hooks immediately is required to maintain correct ordering of operations and to not expose the world in an inconsistent state - Objective is to make command flushing deterministic and fix the related issues - Fixes bevyengine#16212 - Fixes bevyengine#14621 - Fixes bevyengine#16034 ## Solution - Allow `WorldEntityMut` to exist even when it refers to a despawned entity by allowing `EntityLocation` to be marked invalid - Add checks to all methods to panic if trying to access a despawned entity - Flush command queue after every operation that might trigger hooks or observers - Update entity location always after flushing command queue ## Testing - Added test cases for currently broken behaviour - Added test cases that flushes happen in all operations - Added test cases to ensure hooks and commands are run exactly in correct order when nested --- Todo: - [x] Write migration guide - [x] Add tests that using `EntityWorldMut` on a despawned entity panics - [x] Add tests that commands are flushed after every operation that is supposed to flush them - [x] Add tests that hooks, observers and their spawned commands are run in the correct order when nested --- ## Migration Guide Previously `EntityWorldMut` triggered command queue flushes in unpredictable places, which could interfere with hooks and observers. Now the command queue is flushed always immediately after any call in `EntityWorldMut` that spawns or despawns an entity, or adds, removes or replaces a component. This means hooks and observers will run their commands in the correct order. As a side effect, there is a possibility that a hook or observer could despawn the entity that is being referred to by `EntityWorldMut`. This could already currently happen if an observer was added while keeping an `EntityWorldMut` referece and would cause unsound behaviour. If the entity has been despawned, calling any methods which require the entity location will panic. This matches the behaviour that `Commands` will panic if called on an already despawned entity. In the extremely rare case where taking a new `EntityWorldMut` reference or otherwise restructuring the code so that this case does not happen is not possible, there's a new `is_despawned` method that can be used to check if the referred entity has been despawned.
Objective
WorldEntityMutis used after thisWorldEntityMutattempted to (unsuccessfully) avoid flushing commands until finished was that such commands may move or despawn the entity being referenced, invalidating the cached location.observeonWorldEntityMutmay cause the reference to be invalidated #16212Commandsapply at unexpected times with exclusiveWorldaccess #14621Solution
WorldEntityMutto exist even when it refers to a despawned entity by allowingEntityLocationto be marked invalidTesting
Todo:
EntityWorldMuton a despawned entity panicsMigration Guide
Previously
EntityWorldMuttriggered command queue flushes in unpredictable places, which could interfere with hooks and observers. Now the command queue is flushed always immediately after any call inEntityWorldMutthat spawns or despawns an entity, or adds, removes or replaces a component. This means hooks and observers will run their commands in the correct order.As a side effect, there is a possibility that a hook or observer could despawn the entity that is being referred to by
EntityWorldMut. This could already currently happen if an observer was added while keeping anEntityWorldMutreferece and would cause unsound behaviour. If the entity has been despawned, calling any methods which require the entity location will panic. This matches the behaviour thatCommandswill panic if called on an already despawned entity. In the extremely rare case where taking a newEntityWorldMutreference or otherwise restructuring the code so that this case does not happen is not possible, there's a newis_despawnedmethod that can be used to check if the referred entity has been despawned.