A thread might skip the line in publisher flow controller #421

@plamut

Description

In the publisher FlowController, the checks that decide whether a message can go through, based on the current message count, are not strict enough.

When a thread blocks because the message limit is hit, it goes to sleep in the add() method until a slot for another message becomes available. This happens when publishing one of the earlier messages completes and that message is released from the flow controller. Assuming there is now enough capacity again to accept at least one new message, the sleeping threads get notified.

Now, just when our sleeping thread wakes up, but before it re-acquires the lock, another thread might arrive, call add() and acquire the lock right in front of the woken-up thread's nose. The new thread sees that there's room for a new message and takes it, forcing the older thread to wait until the next opportunity.
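To make the race easier to see, here is a minimal sketch of the pattern described above. It is not the actual FlowController code: the class name NaiveFlowController and its attributes are hypothetical, and only the message count is modeled (no byte limit).

```python
import threading


class NaiveFlowController:
    """Hypothetical, simplified stand-in for the publisher flow controller."""

    def __init__(self, message_limit):
        self._message_limit = message_limit
        self._message_count = 0
        self._cond = threading.Condition()

    def add(self):
        with self._cond:
            # The only check is "is there room right now?". A thread that
            # has just arrived passes it without regard to earlier waiters.
            while self._message_count >= self._message_limit:
                # wait() releases the lock. After a notification the thread
                # has to re-acquire it, and a newcomer calling add() may win
                # that race and take the freed slot first.
                self._cond.wait()
            self._message_count += 1

    def release(self):
        with self._cond:
            self._message_count -= 1
            # Wake the sleepers; none of them is guaranteed the slot.
            self._cond.notify_all()
```

The limit itself is never exceeded here; the only problem is the order in which blocked threads are let through.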

Expected result:
Threads should always be let through in FIFO order.

The fix is to force threads to reserve a slot for the message in the same way they need to reserve the free bytes capacity. Also, any released message slots must be distributed in FIFO order among the queued up threads.
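One possible shape of such a fix, as a rough sketch only: the names below (FifoFlowController, _waiting) are hypothetical and not taken from the library, and the byte-capacity reservation is left out to keep the example short. Each blocked thread queues a ticket, and release() hands the freed slot directly to the oldest ticket instead of making it generally available.

```python
import collections
import threading


class FifoFlowController:
    """Hypothetical sketch: message slots are handed out in FIFO order."""

    def __init__(self, message_limit):
        self._message_limit = message_limit
        # Counts messages in flight plus slots already handed to waiters.
        self._message_count = 0
        # One threading.Event per blocked thread, oldest first.
        self._waiting = collections.deque()
        self._lock = threading.Lock()

    def add(self):
        with self._lock:
            # Go straight through only if there is room AND nobody is
            # already queued; otherwise line up behind the earlier threads.
            if not self._waiting and self._message_count < self._message_limit:
                self._message_count += 1
                return
            ticket = threading.Event()
            self._waiting.append(ticket)
        # Block outside the lock until release() hands this thread a slot.
        ticket.wait()

    def release(self):
        with self._lock:
            if self._waiting:
                # Transfer the freed slot to the oldest waiter. The count is
                # left unchanged, so a newly arriving thread cannot take it.
                self._waiting.popleft().set()
            else:
                self._message_count -= 1
```

Because the slot is transferred while the lock is held, it no longer matters which thread wins the lock after a wake-up: a newcomer either finds the queue non-empty or finds the count still at the limit, and queues up in both cases.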

In an extreme case, this issue can also cause thread starvation, though prolonged starvation becomes increasingly unlikely, as the waiting thread will eventually get lucky and grab the lock before the threads arriving after it do.

Labels

api: pubsub (Issues related to the googleapis/python-pubsub API.)
priority: p2 (Moderately-important priority. Fix may not be included in next release.)
type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
