Steps to reproduce
- On an Android device using the Impeller OpenGLES backend (ImpellerBackend set to opengles), run a flutter_gpu renderer that suballocates per-draw data (for example model transforms) from a gpu.HostBuffer and submits more than one command buffer per frame, continuing to emplace between submits.
- Animate the emplaced data every frame.
Expected results
Each draw reads the data that was flushed for it.
Actual results
Draws intermittently read stale buffer contents, several frames old. Visually, objects lag behind their true transforms and snap back. When a buffer's only flush is lost (one-time geometry uploads), draws read uninitialized buffer contents, which can hang the GLES driver.
The cause is a race in DeviceBufferGLES. Flush() merges dirty ranges on the writer thread (the UI thread for host-visible flutter_gpu buffers) while BindAndUploadDataIfNecessary() consumes the range on a reactor thread. The upload reads dirty_range_, calls glBufferSubData, and clears dirty_range_ afterwards, so any flush landing during the upload is merged and then wiped without being uploaded. The race window is the full duration of glBufferSubData, so with multiple command buffers per frame it fires constantly.
Verified by mapping the GL buffer at bind time and comparing against the CPU backing store on a Pixel 10 Pro. 98% of instance-rate vertex buffer binds read mismatched GPU bytes, and instrumented write/upload generation counters showed binds where multiple flushes had occurred since the last upload while the dirty range was already empty. Taking and clearing the dirty range before the upload under a lock shared with Flush() eliminates all mismatches.
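The ordering problem and the fix can be modelled in isolation. The following is a minimal sketch, not the engine's code: `DirtyTracker`, `Range`, `UploadThenClear`, `TakeThenUpload` and `LateFlushSurvives` are hypothetical names, and the `upload` callback stands in for `glBufferSubData`. The racy variant takes a lock around each individual access only to keep the model well-defined in C++; the lost update comes purely from clearing the range after the upload instead of before it.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <mutex>
#include <optional>

// Hypothetical stand-in for the buffer's dirty-range bookkeeping.
struct Range {
  size_t begin;
  size_t end;  // exclusive
};

class DirtyTracker {
 public:
  // Writer side: merge a newly written range into the pending one.
  void Flush(Range r) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (dirty_) {
      dirty_->begin = std::min(dirty_->begin, r.begin);
      dirty_->end = std::max(dirty_->end, r.end);
    } else {
      dirty_ = r;
    }
  }

  // Racy order: read the range, upload, clear afterwards.
  // A Flush() that lands during upload() is merged and then wiped.
  void UploadThenClear(const std::function<void(Range)>& upload) {
    std::optional<Range> pending;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      pending = dirty_;
    }
    if (!pending) return;
    upload(*pending);
    std::lock_guard<std::mutex> lock(mutex_);
    dirty_.reset();
  }

  // Fixed order: take and clear in one critical section, then upload
  // outside the lock. A Flush() during upload() stays pending.
  void TakeThenUpload(const std::function<void(Range)>& upload) {
    std::optional<Range> pending;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      pending.swap(dirty_);
    }
    if (pending) upload(*pending);
  }

  bool HasPending() {
    std::lock_guard<std::mutex> lock(mutex_);
    return dirty_.has_value();
  }

 private:
  std::mutex mutex_;
  std::optional<Range> dirty_;
};

// Replays the interleaving deterministically: a second Flush() lands
// while the upload of the first range is still in progress.
bool LateFlushSurvives(bool take_before_upload) {
  DirtyTracker tracker;
  tracker.Flush({0, 64});
  auto upload = [&tracker](Range) { tracker.Flush({64, 128}); };
  if (take_before_upload) {
    tracker.TakeThenUpload(upload);
  } else {
    tracker.UploadThenClear(upload);
  }
  return tracker.HasPending();
}
```

In this model `LateFlushSurvives(false)` reports that the second flush is gone after the upload, while `LateFlushSurvives(true)` leaves it pending for the next bind. Keeping the upload outside the critical section also means the writer thread never blocks for the duration of the GL call.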
Code sample
// Per frame, with a shared gpu.HostBuffer `transients`:
final pass1 = gpu.gpuContext.createCommandBuffer();
// ... encode draws using transients.emplace(...) views ...
pass1.submit();
// More emplaces for the next pass land while the reactor may be
// encoding/uploading pass1 on the raster thread.
final view = transients.emplace(modelTransformBytes);
final pass2 = gpu.gpuContext.createCommandBuffer();
// ... encode draws using `view` ...
pass2.submit();
Logs
Instrumented readback comparison at bind time (gpu_t/cpu_t are the translation column of the bound matrix in the GL buffer vs the CPU backing store, wgen counts Flush() calls, ugen the write generation last uploaded, dirty_in the dirty range on entry).
[INSTDBG-MISMATCH] handle=8 off=39936 len=64 gpu_t=(-2.37201,0.575955,-1.60164) cpu_t=(-2.33892,1.69092,0) wgen=13184 ugen=13175 dirty_in=[-1,-1) tid=496072007360
[INSTDBG-MISMATCH] handle=8 off=41216 len=64 gpu_t=(0,0,0) cpu_t=(-2.37201,0.575955,1.60164) wgen=13184 ugen=13175 dirty_in=[-1,-1) tid=496072007360
Both binds show nine flushes since the last upload (wgen=13184 vs ugen=13175) while the dirty range was already empty on entry. The (0,0,0) read comes from a range whose upload was lost entirely.
Flutter Doctor output
[!] Flutter (Channel [user-branch], 3.45.0-1.0.pre-465, on macOS 15.5 24F74 darwin-arm64, locale en-US)
• Framework revision 907c8a3719 (2026-06-11), engine revision e58da08c34
• Dart version 3.13.0 (build 3.13.0-184.0.dev)
[✓] Android toolchain - develop for Android devices (Android SDK version 36.1.0)
[✓] Xcode - develop for iOS and macOS (Xcode 16.4)
[✓] Connected device, Pixel 10 Pro (Android 16)
Reproduced at tip-of-tree with a locally built engine.