[Bug] SGLang PD logging a lot of Abort request message after some requests failed. #11040

@llc-kc

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.
  • 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
  • 4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
  • 5. Please use English, otherwise it will be closed.

Describe the bug

When benchmarking an SGLang PD deployment at high concurrency, a large number of "Abort ... request" log lines sometimes appear after some requests fail. For example, on the prefill side:

Abort bootstrap queue request. req.rid='46e7f65cacc149f1b8324548e3640ac8'
Abort queued request. req.rid='e70c35086c82432aa946efab6507b018'
Abort bootstrap queue request. req.rid='988ce7ffe69645708e0f545c3aff27ba'
Abort bootstrap queue request. req.rid='a4126825397541328fcfaff001b2fff2'

On the decode side:

Abort transfer queue request. decode_req.req.rid='3d088e464d8a4a9889f43f92746e76c3'
Abort transfer queue request. decode_req.req.rid='ec3c4d7bccb14bdf852e986ad7462e94'
Abort transfer queue request. decode_req.req.rid='25013b592b254378b3633e378fa29a87'

These requests are not actually aborted. This looks like a bug in Scheduler.abort_request(), which logs every request in each queue before checking whether it matches the abort request:

        # Delete requests not in the waiting queue when PD disaggregation is enabled
        if self.disaggregation_mode == DisaggregationMode.PREFILL:
            # Abort requests that have not yet been bootstrapped
            for i, req in enumerate(self.disagg_prefill_bootstrap_queue.queue):
                logger.debug(f"Abort bootstrap queue request. {req.rid=}")
                if recv_req.abort_all or req.rid.startswith(recv_req.rid):
                    if hasattr(req.disagg_kv_sender, "abort"):
                        req.disagg_kv_sender.abort()

            # Abort in-flight requests
            for i, req in enumerate(self.disagg_prefill_inflight_queue):
                logger.debug(f"Abort inflight queue request. {req.rid=}")
                if recv_req.abort_all or req.rid.startswith(recv_req.rid):
                    if hasattr(req.disagg_kv_sender, "abort"):
                        req.disagg_kv_sender.abort()

        elif self.disaggregation_mode == DisaggregationMode.DECODE:
            # Abort requests that have not yet finished preallocation
            for i, decode_req in enumerate(self.disagg_decode_prealloc_queue.queue):
                logger.debug(f"Abort prealloc queue request. {decode_req.req.rid=}")
                if recv_req.abort_all or decode_req.req.rid.startswith(recv_req.rid):
                    if hasattr(decode_req.kv_receiver, "abort"):
                        decode_req.kv_receiver.abort()

            # Abort requests waiting for kvcache to release tree cache
            for i, decode_req in enumerate(self.disagg_decode_transfer_queue.queue):
                logger.debug(f"Abort transfer queue request. {decode_req.req.rid=}")
                if recv_req.abort_all or decode_req.req.rid.startswith(recv_req.rid):
                    if hasattr(decode_req.kv_receiver, "abort"):
                        decode_req.kv_receiver.abort()

These abort log calls should be inside the corresponding if statements, so that a line is logged only for requests that are actually aborted.
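
A minimal sketch of the suggested change, not a patch against the real Scheduler: `AbortReq`, `QueuedReq`, and `abort_matching` are hypothetical stand-ins introduced here only to show the log call moved under the match check. The real code calls `abort()` on `req.disagg_kv_sender` or `decode_req.kv_receiver` behind a `hasattr` check, which this sketch simplifies to a single `abort()` method.

```python
import logging
from dataclasses import dataclass
from typing import Iterable, List

logger = logging.getLogger("abort_sketch")


@dataclass
class AbortReq:
    """Hypothetical stand-in for the abort message the scheduler receives."""
    rid: str
    abort_all: bool = False


@dataclass
class QueuedReq:
    """Hypothetical stand-in for a request waiting in a disaggregation queue."""
    rid: str
    aborted: bool = False

    def abort(self) -> None:
        # Stands in for disagg_kv_sender.abort() / kv_receiver.abort().
        self.aborted = True


def abort_matching(
    queue: Iterable[QueuedReq], recv_req: AbortReq, queue_name: str
) -> List[str]:
    """Abort the matching requests and log only those; return their rids."""
    aborted = []
    for req in queue:
        if recv_req.abort_all or req.rid.startswith(recv_req.rid):
            # The log call now sits inside the match check, so requests
            # that are merely scanned no longer produce an "Abort" line.
            logger.debug("Abort %s queue request. req.rid=%r", queue_name, req.rid)
            req.abort()
            aborted.append(req.rid)
    return aborted
```

In Scheduler.abort_request() the equivalent change would be to move each of the four logger.debug(...) lines down under its `if recv_req.abort_all or ...` check.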

Reproduction

Benchmark an SGLang PD deployment with high concurrency and a large number of requests.

Environment

0.5.2
