feat: channel concurrency settings, close #1130 by looplj · Pull Request #1322 · looplj/axonhub

looplj · 2026-04-08T01:14:26Z

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.

gemini-code-assist

Code Review

This pull request introduces a concurrency limit feature for channels, adding a maxConcurrent field to the rate limit settings across the frontend UI, GraphQL API, and backend scoring logic. The RateLimitAwareStrategy has been updated to utilize a connectionTracker for enforcing these limits. Feedback highlights an inconsistency where a concurrency limit of zero is treated as unlimited rather than a strict block, and notes a potential integer overflow risk in the GraphQL schema where the Int type is used for values that are int64 in the backend.

gemini-code-assist · 2026-04-08T01:21:19Z

 	}

+	// Check concurrent requests
+	if rl.MaxConcurrent != nil && *rl.MaxConcurrent > 0 && s.connectionTracker != nil {


The check *rl.MaxConcurrent > 0 implies that a concurrency limit of 0 is treated as unlimited (since it skips the limit check). However, the documentation in internal/objects/channel.go states that nil = unlimited. Usually, a limit of 0 should mean 'no requests allowed'. While the frontend validation ensures positive values, the backend should ideally handle 0 as a strict block to be consistent with the nil vs non-nil logic.

gemini-code-assist · 2026-04-08T01:21:19Z

 	}

+	// Check concurrent requests
+	if rl.MaxConcurrent != nil && *rl.MaxConcurrent > 0 && s.connectionTracker != nil {


Similar to the Score function, a concurrency limit of 0 will be ignored here due to the *rl.MaxConcurrent > 0 check. If nil is the designated value for 'unlimited', then 0 should likely be treated as 'exhausted'.

gemini-code-assist · 2026-04-08T01:21:19Z

 input ChannelRateLimitInput {
  rpm: Int
  tpm: Int
+  maxConcurrent: Int


The maxConcurrent field is defined as Int in GraphQL, which is a signed 32-bit integer. In the Go backend, it is defined as int64. While 2 billion concurrent requests is likely sufficient, using Int for rate limits (like TPM) can sometimes lead to overflows if the values are very large. Consider if this consistency with rpm and tpm is desired or if a larger type should be used.

* feat: per-channel concurrency queue + admission control, close #1130 PR #1322 only added scoring-layer down-ranking, which was a no-op when a model only had one channel. This change adds true admission enforcement. Backend: - ChannelLimiter: soft mode (count only) and hard mode (FIFO queue + capacity ceiling + per-channel timeout), with slot-handoff Release for FIFO fairness. - ChannelLimiterManager: per-channel limiter cache keyed by config struct equality; exposes Snapshot for observability. - ChannelQueueError: synthetic 429 + structured body (channel_queue_full / channel_queue_timeout). No Retry-After to avoid the cooldown middleware misclassifying local rejections as upstream 429s. - New ChannelRateLimit fields: QueueSize, QueueTimeoutMs. - Middleware order: channel admission runs before rate-limit tracking so locally rejected requests do not consume RPM budget. - LB scoring composes RPM/TPM and concurrency sub-scores via min(); hard- mode queueing capped at 30% so idle peers keep a clear advantage. - Removed legacy ConnectionTracker; ChannelLimiterManager is the single source of truth for in-flight stats. - OpenTelemetry: 5 channel-level instruments (inflight / queue_waiting gauges, queue_full / queue_timeout counters, queue_wait_seconds histogram). - GraphQL Channel.liveLimiterStats field for real-time UI display. - ValidateRateLimit guards against negatives and queueSize without maxConcurrent. Frontend: - channels-rate-limit-dialog: queueSize + queueTimeoutMs inputs with cross-field validation. - channel-limiter-cell: text-only inFlight/cap and Q waiting/queueSize display, color-coded by utilization, polled every 5s. - i18n strings for new fields and tooltips (en + zh-CN). * feat(channels): warn when MaxConcurrent is set without a queue When a user configures only MaxConcurrent and leaves Queue Size empty, the limiter runs in soft mode — the cap only down-ranks the channel in load-balancer scoring, it does NOT block excess upstream requests. This is easy to miss; surface an inline amber advisory under the Queue Size field so users know to set Queue Size for hard enforcement. * fix(channels): drop stale limiter entry, defensive scoring, monotonic hard-mode score Address PR #1503 review findings from copilot/gemini-code-assist: - ChannelLimiterManager.GetOrCreate now drops the cached entry when capacity transitions to 0, so Stats/Snapshot stop reporting a channel that no longer has a concurrency limit and downstream code can no longer dereference now-nil rate-limit pointers. - RateLimitAwareStrategy.scoreConcurrency adds a defensive nil check on channel.Settings.RateLimit / MaxConcurrent so a brief stale Stats hit before the next GetOrCreate cannot panic. - Hard-mode below-capacity score is shifted to [queueCeiling, maxScore] so a 9/10 channel always outranks a 10/10 channel with an empty queue (previously 10 vs 30 — load balancer preferred the saturated channel). - zh-CN softModeWarning drops literal **markers** that the dialog renders as plain text. - channel_queue_error.go docstring refreshed (connectionTracking middleware was renamed to channelLimiter). * fix(channels): per-attempt limiter slot to plug retry-induced leak The channel concurrency limiter middleware held a struct-scoped sync.Once for release. Pipeline.Process re-enters OnOutboundRawRequest on same-channel retry and channel switch, calling Acquire on the same middleware instance again -- but the struct-scoped Once short-circuits every release after the first, permanently leaking the slot. Under sustained upstream 429s with retry enabled, MaxConcurrent slots leak one-by-one until the limiter is fully exhausted and every new request times out in the queue. Replace acquired+releaseOnce with atomic.Pointer[limiterSlot] where each Acquire mints a fresh slot owning its own Once. Stream wrappers capture the slot pointer at wrap time so they release the slot they were minted with, not whatever's current at Close time. * test(channels): mark rate-limit subtests parallel to satisfy tparallel Parent tests already call t.Parallel(); subtests must too or the linter flags the asymmetry.

feat: channel concurrency settings, close #1130

d482b6f

devin-ai-integration Bot reviewed Apr 8, 2026

View reviewed changes

looplj merged commit 4a48f26 into release/v0.9.x Apr 8, 2026
5 checks passed

gemini-code-assist Bot reviewed Apr 8, 2026

View reviewed changes

suixinio mentioned this pull request Apr 26, 2026

feat: per-channel concurrency queue + admission control #1503

Merged

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: channel concurrency settings, close #1130#1322

feat: channel concurrency settings, close #1130#1322
looplj merged 1 commit into
release/v0.9.xfrom
dev-tmp

looplj commented Apr 8, 2026 •

edited by devin-ai-integration Bot

Loading

Uh oh!

devin-ai-integration Bot left a comment

Uh oh!

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Apr 8, 2026

Uh oh!

gemini-code-assist Bot Apr 8, 2026

Uh oh!

gemini-code-assist Bot Apr 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

looplj commented Apr 8, 2026 • edited by devin-ai-integration Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

✅ Devin Review: No Issues Found

Uh oh!

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

looplj commented Apr 8, 2026 •

edited by devin-ai-integration Bot

Loading