Skip to content

Conversation

@digaobarbosa
Copy link
Contributor

@digaobarbosa digaobarbosa commented Dec 29, 2025

Description

Connection closes unexpectedly when running on modal through TURN server bug fix.

Added an ACK mechanism, where the client is responsible to sending the last frame it received.
This should limit the number of frames/data that are in transit on the TURN server.

The belief is that when sending frames through the datachannel, we hit some limits on the TURN server, for example the inference data sending is probably faster than the client can download from it.

With the ACK, we know an information from the client on really where it is reading currently.

Details:

  • only enabled for realtime_processing=false (video files)
  • default window of 20 frames
  • backwards compatible (the client can send ack to the old version, and the server keeps old behaviour if it's not receiving acks)

Type of change

  • Bug fix (non-breaking change which fixes an issue)

How has this change been tested, please provide a testcase or example of how you tested the change?

locally/modal

Any specific deployment considerations

modal

Docs

  • Docs updated? What were the changes:

@digaobarbosa digaobarbosa self-assigned this Dec 29, 2025
"LOG_LEVEL": LOG_LEVEL,
"ONNXRUNTIME_EXECUTION_PROVIDERS": "[CUDAExecutionProvider,CPUExecutionProvider]",
"PROJECT": PROJECT,
"PYTHONASYNCIODEBUG": str(os.getenv("PYTHONASYNCIODEBUG", "0")),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adds the possibility to enable asyncio debugging

)
cls_with_options = cls_with_options.with_options(
ram=requested_ram_mb,
memory=requested_ram_mb,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discovered the right parameter is memory
https://modal.com/docs/reference/modal.Cls#with_options

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting! Modal was definitely allocating requested amount of RAM - maybe something changed in their docs

return frame
except StopIteration:
loop = asyncio.get_running_loop()
frame = await loop.run_in_executor(None, lambda: next(self._iterator, None))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

efforts to reduce main loop cpu pressure.

heartbeat_callback()
await asyncio.sleep(WEBRTC_DATA_CHANNEL_BUFFER_DRAINING_DELAY)

async def wait_for_buffer_drain() -> None:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

efforts to reduce main loop cpu pressure.

self.realtime_processing = realtime_processing

# Optional receiver-paced flow control (enabled only after first ACK is received)
self._ack_last: int = 0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Properties related to the ack window behaviour.

workflow_output: Dict[str, Any],
data_output_mode: DataOutputMode,
) -> Tuple[Dict[str, Any], List[str]]:
"""Serialize workflow outputs in a thread to avoid blocking the event loop."""
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

efforts to reduce main loop cpu pressure.
This one was that improved the asyncio warning the most.
I believe because of the base64 jpeg conversion

)
cls_with_options = cls_with_options.with_options(
ram=requested_ram_mb,
memory=requested_ram_mb,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting! Modal was definitely allocating requested amount of RAM - maybe something changed in their docs

@digaobarbosa digaobarbosa merged commit 63297ae into main Dec 30, 2025
51 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants