Feature: implement Prometheus metrics (#21)
Conversation
Pull request overview
This pull request implements comprehensive Prometheus metrics instrumentation for the GOModel LLM gateway using a hooks-based architecture. The implementation provides enterprise-grade observability without modifying business logic, collecting metrics on request rates, latencies, errors, and concurrent requests across all provider implementations.
Key Changes:
- Added a hooks-based observability system in the llmclient package that allows non-invasive metrics collection
- Implemented Prometheus metrics (request counters, duration histograms, in-flight gauges) with comprehensive labels for provider, model, endpoint, and status
- Added configurable metrics endpoint (default: /metrics) with enable/disable functionality
Reviewed changes
Copilot reviewed 17 out of 18 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| internal/observability/metrics.go | Core Prometheus metrics implementation with hooks for request lifecycle instrumentation |
| internal/observability/metrics_test.go | Comprehensive unit tests covering success, error, network error, streaming, and in-flight request scenarios |
| internal/pkg/llmclient/client.go | Added hooks system (RequestInfo/ResponseInfo structs) and instrumentation in doRequest and DoStream methods |
| internal/providers/factory.go | Global hooks registry with SetGlobalHooks/GetGlobalHooks for provider-wide metrics injection |
| internal/providers/openai/openai.go | Applies global hooks during provider initialization |
| internal/providers/anthropic/anthropic.go | Applies global hooks during provider initialization |
| internal/providers/gemini/gemini.go | Applies global hooks during provider initialization (including models client) |
| config/config.go | Added MetricsConfig struct with enabled flag and endpoint path configuration |
| config.yaml.example | Example configuration file demonstrating metrics configuration options |
| .env.template | Added METRICS_ENABLED and METRICS_ENDPOINT environment variable templates |
| cmd/gomodel/main.go | Conditional metrics initialization based on configuration, sets up hooks before provider creation |
| internal/server/http.go | Conditionally registers Prometheus /metrics endpoint using promhttp.Handler() |
| go.mod, go.sum | Added prometheus/client_golang v1.23.2 and related dependencies |
| internal/providers/registry_race_test.go | New concurrency test for registry operations |
| PROMETHEUS_IMPLEMENTATION.md | Comprehensive implementation documentation with architecture, metrics details, queries, and examples |
| METRICS_CONFIGURATION.md | Configuration guide for enabling/disabling metrics and customizing the endpoint |
| .claude/settings.local.json | Added permission for go test commands |
Pull request overview
Copilot reviewed 20 out of 21 changed files in this pull request and generated 6 comments.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 21 out of 22 changed files in this pull request and generated 5 comments.
```yaml
# TODO: the default value should be taken here from ENV variable METRICS_ENABLED explicitly
enabled: false
```
The TODO comment indicates that the enabled field should be expandable from an environment variable, but currently the expandEnvVars function in config/config.go only expands Metrics.Endpoint, not Metrics.Enabled. This means users cannot use environment variable references like "${METRICS_ENABLED:-false}" in the config.yaml file. Either remove this TODO if it's not needed, or implement the feature by adding boolean environment variable expansion support.
Suggested change:

```diff
-# TODO: the default value should be taken here from ENV variable METRICS_ENABLED explicitly
-enabled: false
+# Controlled via METRICS_ENABLED environment variable; defaults to false when unset.
+enabled: "${METRICS_ENABLED:-false}"
```
```go
	}, nil
}

// Implement other interface methods as no-ops with CORRECT Go types
```
The comment "Implement other interface methods as no-ops with CORRECT Go types" on this line contains an unnecessary emphasis on "CORRECT". This appears to be a residual comment from development/debugging and should be simplified to just "Implement other interface methods as no-ops" or removed entirely since it's self-explanatory from the code.
Suggested change:

```diff
-// Implement other interface methods as no-ops with CORRECT Go types
+// Implement other interface methods as no-ops
```
Note: CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Adds Prometheus-based observability via a hooks system: global hook registration, provider/client instrumentation, three Prometheus metrics, conditional /metrics endpoint exposure, config/env support, documentation, Docker Compose Prometheus service, and tests.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant CLI as cmd/gomodel/main.go
    participant Obs as internal/observability/metrics.go
    participant Factory as internal/providers/factory.go
    participant Provider as Provider (OpenAI/Anthropic/Gemini)
    participant Client as internal/pkg/llmclient/client.go
    participant Prom as Prometheus Registry

    CLI->>Obs: NewPrometheusHooks() (if enabled)
    Obs->>Factory: hooks object
    CLI->>Factory: SetGlobalHooks(hooks)
    Factory->>Provider: GetGlobalHooks() during New()
    Provider->>Client: New(config with Hooks)
    Client->>Client: store hooks
    Note over Client,Obs: Request lifecycle
    Client->>Client: OnRequestStart(RequestInfo)
    Client->>Prom: increment in-flight
    Client->>Client: execute request
    Client->>Client: OnRequestEnd(ResponseInfo)
    Client->>Prom: record duration, increment counter, decrement in-flight
```

```mermaid
sequenceDiagram
    participant User as HTTP Client
    participant Server as internal/server/http.go
    participant Prom as Prometheus Registry

    User->>Server: GET /health
    Server->>User: 200 OK
    alt Metrics Enabled
        User->>Server: GET /metrics
        Server->>Prom: gather metrics
        Prom->>Server: text metrics
        Server->>User: 200 + metrics (text/plain)
    else Metrics Disabled
        User->>Server: GET /metrics
        Server->>User: 404 Not Found
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

Pre-merge checks and finishing touches: ❌ 1 failed check (warning), ✅ 2 passed.

@coderabbitai review

✅ Actions performed: Review triggered.
Actionable comments posted: 5
♻️ Duplicate comments (8)
METRICS_CONFIGURATION.md (3)
55-55: Past review comment should have been addressed. A previous review indicated that the default value for `METRICS_ENABLED` should be `false` to match the implementation in config.go. Verify that this has been corrected in the current version.

165-167: Past review comment should have been addressed. A previous review indicated that the documentation incorrectly states the metrics endpoint is "not protected" by master key authentication. The actual implementation in `internal/server/http.go` protects the metrics endpoint with the auth middleware when a master key is configured. Verify that this has been corrected.

228-228: Incorrect default value comment. The comment states "defaults to true" but the actual default is `false` per the implementation in config.go line 84 and the documentation elsewhere in this file.

🔎 Proposed fix

```diff
-echo $METRICS_ENABLED  # Should be "true" or empty (defaults to true)
+echo $METRICS_ENABLED  # Should be "true" or empty (defaults to false)
```

internal/providers/registry_race_test.go (2)
5-5: Past review comment should have been addressed. A previous review indicated that the "Correct import" comment on this line is unnecessary and should be removed.

29-29: Past review comment should have been addressed. A previous review indicated that the comment emphasizing "CORRECT" is unnecessary and should be simplified or removed.
internal/server/http.go (1)
46-53: Metrics endpoint registration logic is correct. The conditional registration with default path fallback is well implemented. The endpoint is correctly wrapped using `echo.WrapHandler(promhttp.Handler())`.

Note: The metrics endpoint is registered after the authentication middleware (lines 37-39), so it will be protected when `MasterKey` is configured. The test `TestServerWithMasterKeyAndMetrics` confirms this is the intended behavior. Ensure the documentation in `METRICS_CONFIGURATION.md` is updated to reflect this, as a past review flagged a discrepancy stating the endpoint was documented as "not protected."

config/config.yaml (1)
27-28: Address the TODO: the `enabled` field should support environment variable expansion. The `enabled` field is hardcoded to `false` while `endpoint` uses `${METRICS_ENDPOINT:-/metrics}` for environment variable expansion. This inconsistency means users cannot override `enabled` via the config file using environment variable syntax.

Either implement boolean environment variable expansion in `expandEnvVars` to support `"${METRICS_ENABLED:-false}"`, or remove the TODO if this is intentional and document why.

internal/providers/factory.go (1)
18-19: Race condition: `globalHooks` lacks synchronization. The `globalHooks` variable is accessed without mutex protection. While the current usage pattern (set hooks before creating providers) is safe, this creates a fragile design that could break with future refactoring or concurrent initialization.

🔎 Proposed fix with sync.RWMutex
```diff
 import (
 	"fmt"
+	"sync"

 	"gomodel/config"
 	"gomodel/internal/core"
 	"gomodel/internal/pkg/llmclient"
 )

 // registry holds all registered provider builders
 var registry = make(map[string]Builder)

 // globalHooks holds the observability hooks to inject into all providers
 var globalHooks llmclient.Hooks
+var globalHooksMu sync.RWMutex

 // SetGlobalHooks configures observability hooks that will be injected into all providers.
 // This must be called before Create() to take effect.
 // This enables metrics, tracing, and logging without modifying provider implementations.
 func SetGlobalHooks(hooks llmclient.Hooks) {
+	globalHooksMu.Lock()
+	defer globalHooksMu.Unlock()
 	globalHooks = hooks
 }

 // GetGlobalHooks returns the currently configured global hooks
 func GetGlobalHooks() llmclient.Hooks {
+	globalHooksMu.RLock()
+	defer globalHooksMu.RUnlock()
 	return globalHooks
 }
```
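A runnable, stdlib-only reduction of the same RWMutex-protected global pattern (the `Hooks` type here is an illustrative stand-in, not the repo's `llmclient.Hooks`):

```go
package main

import (
	"fmt"
	"sync"
)

// Hooks is a stand-in for llmclient.Hooks.
type Hooks struct{ Name string }

var (
	globalHooks   *Hooks
	globalHooksMu sync.RWMutex
)

// SetGlobalHooks is safe to call concurrently with GetGlobalHooks.
func SetGlobalHooks(h *Hooks) {
	globalHooksMu.Lock()
	defer globalHooksMu.Unlock()
	globalHooks = h
}

// GetGlobalHooks takes a read lock, so many readers can proceed in parallel.
func GetGlobalHooks() *Hooks {
	globalHooksMu.RLock()
	defer globalHooksMu.RUnlock()
	return globalHooks
}

func main() {
	SetGlobalHooks(&Hooks{Name: "prometheus"})

	// Concurrent readers; run with `go run -race` to verify no data race.
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = GetGlobalHooks()
		}()
	}
	wg.Wait()
	fmt.Println(GetGlobalHooks().Name) // prometheus
}
```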
📜 Review details
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)

`go.sum` is excluded by `!**/*.sum`
📒 Files selected for processing (21)
.claude/settings.local.json.env.templateMETRICS_CONFIGURATION.mdPROMETHEUS_IMPLEMENTATION.mdcmd/gomodel/main.goconfig/config.goconfig/config.yamlconfig/config_helpers_test.godocker-compose.yamlgo.modinternal/observability/metrics.gointernal/observability/metrics_test.gointernal/pkg/llmclient/client.gointernal/providers/anthropic/anthropic.gointernal/providers/factory.gointernal/providers/gemini/gemini.gointernal/providers/openai/openai.gointernal/providers/registry_race_test.gointernal/server/http.gointernal/server/http_test.goprometheus.yml
🧰 Additional context used
🧬 Code graph analysis (9)
internal/providers/gemini/gemini.go (5)
- internal/pkg/llmclient/client.go (4): `DefaultConfig` (89-104), `Hooks` (46-54), `New` (119-135), `NewWithHTTPClient` (138-154)
- internal/pkg/httpclient/client.go (1): `DefaultConfig` (38-49)
- internal/providers/factory.go (1): `GetGlobalHooks` (66-68)
- internal/providers/anthropic/anthropic.go (2): `New` (38-45), `NewWithHTTPClient` (48-55)
- internal/providers/openai/openai.go (2): `New` (30-37), `NewWithHTTPClient` (40-47)

internal/providers/anthropic/anthropic.go (3)
- internal/pkg/llmclient/client.go (5): `DefaultConfig` (89-104), `Hooks` (46-54), `New` (119-135), `NewWithHTTPClient` (138-154), `Client` (110-116)
- internal/providers/factory.go (1): `GetGlobalHooks` (66-68)
- internal/providers/openai/openai.go (3): `New` (30-37), `NewWithHTTPClient` (40-47), `Provider` (24-27)

internal/providers/factory.go (1)
- internal/pkg/llmclient/client.go (1): `Hooks` (46-54)

internal/pkg/llmclient/client.go (3)
- internal/core/types.go (2): `Model` (43-48), `ChatRequest` (4-10)
- internal/core/errors.go (3): `NewProviderError` (82-90), `ParseProviderError` (137-170), `GatewayError` (27-34)
- internal/core/responses.go (1): `ResponsesRequest` (5-14)

cmd/gomodel/main.go (2)
- internal/observability/metrics.go (1): `NewPrometheusHooks` (50-106)
- internal/providers/factory.go (1): `SetGlobalHooks` (61-63)

config/config_helpers_test.go (1)
- config/config.go (3): `Config` (13-18), `ServerConfig` (42-45), `ProviderConfig` (59-64)

internal/providers/openai/openai.go (3)
- internal/pkg/llmclient/client.go (5): `DefaultConfig` (89-104), `Hooks` (46-54), `New` (119-135), `NewWithHTTPClient` (138-154), `Client` (110-116)
- internal/providers/factory.go (1): `GetGlobalHooks` (66-68)
- internal/providers/anthropic/anthropic.go (3): `New` (38-45), `NewWithHTTPClient` (48-55), `Provider` (32-35)

internal/observability/metrics.go (1)
- internal/pkg/llmclient/client.go (2): `RequestInfo` (25-31), `ResponseInfo` (34-42)

internal/observability/metrics_test.go (4)
- internal/observability/metrics.go (6): `NewPrometheusHooks` (50-106), `RequestsTotal` (18-24), `InFlightRequests` (38-44), `RequestDuration` (28-35), `HealthCheck` (163-176), `GetMetrics` (147-153)
- internal/pkg/llmclient/client.go (2): `RequestInfo` (25-31), `ResponseInfo` (34-42)
- internal/core/types.go (1): `Model` (43-48)
- internal/core/errors.go (1): `NewProviderError` (82-90)
🪛 Gitleaks (8.30.0)
METRICS_CONFIGURATION.md
[high] 242-244: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.
(curl-auth-header)
PROMETHEUS_IMPLEMENTATION.md
[high] 392-394: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.
(curl-auth-header)
🪛 LanguageTool
PROMETHEUS_IMPLEMENTATION.md
[style] ~448-~448: To form a complete sentence, be sure to include a subject.
Context: ...Enhancements ### Token Usage Tracking Could be added by extracting usage data from ...
(MISSING_IT_THERE)
🪛 markdownlint-cli2 (0.18.1)
METRICS_CONFIGURATION.md
19-19: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
26-26: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
33-33: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
74-74: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
86-86: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
96-96: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
183-183: Fenced code blocks should be surrounded by blank lines
(MD031, blanks-around-fences)
195-195: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
PROMETHEUS_IMPLEMENTATION.md
18-18: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
233-233: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
240-240: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
🔇 Additional comments (38)
.env.template (1)
9-13: LGTM! Clear metrics configuration template. The commented configuration options with descriptive guidance make it easy for users to enable and configure Prometheus metrics. The disabled-by-default approach aligns with the PR objectives.
.claude/settings.local.json (1)
6-7: LGTM! Test command permissions appropriately expanded. The addition of `go test:*` and `make test:*` permissions supports running the new test suites introduced in this PR.

internal/providers/registry_race_test.go (1)
50-97: LGTM! Comprehensive race condition test. The test effectively exercises concurrent reads and writes to the registry with:
- A background writer performing periodic Refresh operations
- 50 concurrent readers calling GetProvider, ListModels, and Supports
- Proper use of context, timeouts, and WaitGroup for synchronization
This will help catch data races when running with the `-race` flag.

PROMETHEUS_IMPLEMENTATION.md (1)
1-576: LGTM! Comprehensive implementation documentation. This documentation provides excellent coverage of:
- Architecture and design principles
- Hook system and instrumentation points
- Metric types and PromQL query examples
- Configuration options and usage
- Testing and alerting guidance
- Future extensibility considerations
The document will be valuable for both users and future contributors.
prometheus.yml (1)
1-9: LGTM! Appropriate Prometheus configuration. The configuration is well-suited for the Docker Compose setup:

- 15-second intervals are reasonable for development/testing
- Target `gomodel:8080` correctly references the Docker Compose service
- Metrics path `/metrics` matches the default endpoint

internal/providers/anthropic/anthropic.go (2)
38-45: LGTM! Consistent hooks integration. The constructor properly:

- Creates a default config using `llmclient.DefaultConfig`
- Applies global hooks from `providers.GetGlobalHooks()`
- Passes the configured client to `llmclient.New`

This pattern is consistent with the OpenAI and Gemini providers, enabling Prometheus metrics collection when enabled.
47-55: LGTM! Custom HTTP client path also supports hooks. The `NewWithHTTPClient` method correctly applies the same hooks pattern, ensuring metrics are collected even when a custom HTTP client is provided.

go.mod (1)
8-8: Version `v1.23.2` is the latest stable release with no known security vulnerabilities. The Prometheus client library is correctly specified at the current stable version (released 2025-09-05). No CVEs affect this version.
internal/providers/gemini/gemini.go (2)
45-48: LGTM: Consistent hook injection pattern. The global hooks integration follows the same pattern as the OpenAI and Anthropic providers, ensuring consistent observability across all providers.

127-131: Good: ListModels also applies global hooks. The models client correctly applies global hooks, ensuring metrics are captured for model listing requests as well. This is a thorough implementation.
internal/providers/openai/openai.go (1)
30-36: LGTM: Global hooks properly integrated. The hook injection follows the established pattern and is consistent with other provider implementations.
config/config.go (4)
47-56: Well-structured MetricsConfig with clear documentation. The struct is properly documented with default values clearly stated in comments.

115-118: Correct fallback for environment-only configuration. When no config file is present, metrics settings are properly loaded from environment variables using Viper's `GetBool` and `GetString`, which handle type conversion correctly.

152-154: Note: Only `Metrics.Endpoint` is expanded, not `Metrics.Enabled`. This is intentional since `Enabled` is a boolean field. Environment variable expansion via `${VAR:-default}` syntax in YAML only works for string fields. For boolean fields, the value must be read directly from environment variables (as done in the no-config-file path on lines 115-118) or set as a literal in the YAML.

This aligns with the TODO in config.yaml, which requests explicit env var handling for the `enabled` field.

179-196: Improved `expandString` correctly handles empty defaults. The `hasDefault` flag ensures that `${VAR:-}` (empty default) returns an empty string rather than the unexpanded placeholder. This is the correct behavior for optional configuration values.

internal/server/http_test.go (3)
10-103: Comprehensive table-driven tests for metrics endpoint configuration. Good coverage of various scenarios including default path, empty endpoint fallback, disabled state, nil config, and custom paths. The test structure is clean and maintainable.

153-184: Good: Tests confirm the metrics endpoint is auth-protected when MasterKey is set. This test explicitly validates that the `/metrics` endpoint requires authentication when a master key is configured, which documents the intended security behavior.

84-102: `mockProvider` is defined in handlers_test.go (line 17) and correctly accessible in this test. Since both test files are in the same package, the mock is properly scoped and available for use across test files in `internal/server`.

docker-compose.yaml (1)

40-41: Verify the prometheus.yml configuration file exists and includes the gomodel scrape target. The volume mount on line 41 references `./prometheus.yml`, which must exist in the repository root for Prometheus to properly scrape the gomodel service. Ensure the file is present and contains the appropriate scrape configuration for gomodel.

cmd/gomodel/main.go (2)
87-95: LGTM! Metrics hooks are correctly wired before provider initialization. The placement of the hooks setup before `initCache` and provider creation ensures that all providers will have access to the metrics hooks when they are created. The conditional logging of the metrics status is helpful for debugging configuration.

169-173: LGTM! Server configuration correctly propagates metrics settings. The metrics configuration is properly passed to the server, enabling conditional exposure of the `/metrics` endpoint based on the configuration.

internal/observability/metrics_test.go (7)
16-29: LGTM! Basic hook validation is appropriate. The test correctly verifies that `NewPrometheusHooks()` returns non-nil callbacks for both lifecycle events.

31-79: LGTM! Success path test is well-structured. The test correctly simulates a successful request flow and verifies that the counter is incremented with the expected label values.

81-125: LGTM! Error path test correctly validates error status labeling. The test properly verifies that HTTP errors are tracked with `status_type="error"` and the appropriate status code.

127-171: LGTM! Network error test validates the special "network_error" status code. This correctly tests the edge case where `StatusCode` is 0 (network failures), ensuring the `status_code` label is set to `"network_error"`.

173-217: LGTM! Streaming request test validates the `stream="true"` label. The test correctly verifies that streaming requests are distinguished in metrics via the `stream` label.

219-299: LGTM! In-flight gauge test thoroughly validates concurrent request tracking. The test correctly verifies that the gauge increments on request start, accumulates for concurrent requests, and decrements appropriately on request end, returning to zero when all requests complete.

347-376: LGTM! HealthCheck and GetMetrics tests provide basic sanity checks. These tests verify that the health check function works and that `GetMetrics()` returns a properly initialized struct with all metric collectors.

config/config_helpers_test.go (2)
106-135: LGTM! New test cases for empty default values and master key patterns. These test cases correctly validate the `${VAR:-}` syntax for optional environment variables, which is essential for the master key configuration where an empty default should result in an empty string (disabling authentication) rather than keeping the placeholder.

689-771: Excellent security-focused test for master key expansion. This dedicated test function provides targeted coverage for master key expansion behavior, which is critical for preventing authentication bypass bugs. The explicit cleanup of `GOMODEL_MASTER_KEY` before each subtest (line 750) ensures reliable test isolation.

internal/pkg/llmclient/client.go (5)
24-55: LGTM! Well-designed observability types. The `RequestInfo`, `ResponseInfo`, and `Hooks` types provide a clean contract for instrumentation. The documentation clearly explains the streaming behavior, where duration measures time to stream establishment, not total stream duration.

207-228: Excellent documentation explaining metrics behavior. The table clearly illustrates the difference between per-attempt vs. DoRaw-level hook invocation, making the design decision transparent. This addresses the previous review concern about inflated metrics from retry attempts.

229-270: LGTM! Hooks correctly wired at the logical request level. The `OnRequestStart` hook is called once at the beginning of `DoRaw`, and the `callEndHook` helper ensures consistent invocation at all exit points. The circuit breaker open path correctly calls the end hook with `ServiceUnavailable` status.
279-341: LGTM! All exit paths in the retry loop correctly call the end hook. Verified exit paths:

- Lines 285-286: Context cancellation during backoff → calls `callEndHook`
- Lines 321-322: Non-retryable error (4xx) → calls `callEndHook`
- Line 329: Success → calls `callEndHook`
- Lines 335-340: All retries exhausted → calls `callEndHook`

This ensures the in-flight gauge is always decremented.
343-469: LGTM! DoStream correctly instruments all paths. All error and success paths correctly call `OnRequestEnd`:

- Lines 370-382: Circuit breaker open
- Lines 387-399: Build request error
- Lines 408-420: HTTP client error
- Lines 436-448: Non-200 response
- Lines 455-466: Success (stream established)

The duration correctly measures time to stream establishment as documented.
internal/observability/metrics.go (3)
16-45: LGTM! Well-designed Prometheus metrics with appropriate labels. The metric design follows Prometheus best practices:

- Counter for total requests with detailed labels for slicing
- Histogram for latency with appropriate LLM-focused buckets (0.1s to 60s)
- Gauge for in-flight tracking with minimal labels (model excluded since it's not available at request start for all paths)

The `status_type` label is a good abstraction for alerting on error rates without needing to enumerate status codes.
50-106: LGTM! Hook implementation correctly instruments the request lifecycle. The implementation correctly:

- Increments the in-flight gauge on start, decrements on end
- Handles network errors (status code 0) specially with a `"network_error"` label
- Defensively marks HTTP 4xx/5xx as errors even if `Error` is nil
- Records both counter and histogram on request completion

108-138: Helpful example queries for operators. The Prometheus and Grafana query examples are valuable documentation for operators setting up monitoring dashboards. Consider moving these to the METRICS_CONFIGURATION.md file mentioned in the PR description if more detailed documentation is needed.
```yaml
prometheus:
  image: prom/prometheus:latest
  ports:
    - "9090:9090"
  volumes:
    - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
    - prometheus_data:/prometheus
  command:
    - "--config.file=/etc/prometheus/prometheus.yml"
    - "--storage.tsdb.path=/prometheus"
    - "--web.enable-lifecycle"
  depends_on:
    - gomodel
  restart: unless-stopped
```
🧹 Nitpick | 🔵 Trivial
Consider pinning the Prometheus image version for reproducibility.
Using prom/prometheus:latest can lead to unexpected behavior when the image is updated. For production and reproducible builds, pin to a specific version.
🔎 Proposed fix
```diff
 prometheus:
-  image: prom/prometheus:latest
+  image: prom/prometheus:v2.54.1
   ports:
     - "9090:9090"
```

Additionally, consider adding a healthcheck for the prometheus service to match the pattern used by the redis service:
```yaml
healthcheck:
  test: ["CMD", "wget", "-q", "--spider", "http://localhost:9090/-/healthy"]
  interval: 10s
  timeout: 5s
  retries: 5
```

🤖 Prompt for AI Agents
In docker-compose.yaml around lines 36 to 49, the Prometheus service uses the
floating tag prom/prometheus:latest and lacks a healthcheck; update the image to
a specific pinned version (e.g., prom/prometheus:v2.x.y) to ensure reproducible
builds and replace the tag everywhere it’s referenced, and add a healthcheck
block that queries Prometheus’ /-/healthy endpoint with sensible
interval/timeout/retries to mirror the Redis pattern.
```go
// Verify at least one observation was recorded
hist := observer.(prometheus.Histogram)
if hist == nil {
	t.Fatal("Expected histogram, got nil")
}
```
🧹 Nitpick | 🔵 Trivial
Consider using `testutil.ToFloat64` on the histogram observer instead of a type assertion.
The type assertion to `prometheus.Histogram` may panic if the underlying type changes. However, since `GetMetricWithLabelValues` returns `prometheus.Observer`, this approach is acceptable for verifying the metric exists.
🤖 Prompt for AI Agents
In internal/observability/metrics_test.go around lines 340 to 344, replace the
unsafe type assertion to prometheus.Histogram with using
prometheus/testutil.ToFloat64 to read the recorded value; obtain the metric via
the same GetMetricWithLabelValues call (or the metric variable you already
have), call testutil.ToFloat64(...) on that metric to get a float value, and
assert that the value is > 0 (or otherwise non-zero) to verify an observation
was recorded; also add the testutil import if missing.
```go
// ResetMetrics resets all metrics to zero (useful for testing)
func ResetMetrics() {
	RequestsTotal.Reset()
	RequestDuration.Reset()
	InFlightRequests.Reset()
}

// HealthCheck verifies that metrics are being collected
func HealthCheck() error {
	// Try to collect metrics
	mfs, err := prometheus.DefaultGatherer.Gather()
	if err != nil {
		return fmt.Errorf("failed to gather metrics: %w", err)
	}

	// Check that we have some metrics
	if len(mfs) == 0 {
		return fmt.Errorf("no metrics registered")
	}

	return nil
}
```
🧹 Nitpick | 🔵 Trivial
🧩 Analysis chain
🏁 Scripts executed (Repository: ENTERPILOT/GOModel):

```shell
cd internal/observability && wc -l metrics.go
head -n 200 internal/observability/metrics.go | cat -n
grep -n "RequestsTotal\|RequestDuration\|model" internal/observability/metrics.go | head -50
```
🌐 Web query:
Prometheus high cardinality labels best practices monitoring
💡 Result:
Concise best practices for avoiding high-cardinality label problems in Prometheus:

- Follow Prometheus' guidance: prefer small, bounded label sets; don't procedurally generate metric names or use highly variable values as labels. Use labels only when you'll actually query by them. [1]
- Never use unbounded/dynamic values as label values (user IDs, email, UUIDs, full timestamps, raw URLs with IDs). Replace them with parameterized paths, or send that data to logs/traces instead. [2][3]
- Keep the number of unique values per label small and stable. Aim for most metrics to have few or no labels; reserve high-cardinality labels for a very small number of metrics. Monitor counts of series per metric (e.g. `count(count by(...)(metric))`). [1][2]
- Filter or drop high-cardinality labels at scrape time using relabeling / `metric_relabel_configs` (labeldrop/labelmap, regex) to prevent them from entering the TSDB. [2]
- Aggregate before ingesting: use client-side aggregation, recording rules, or push aggregated metrics (`sum by(...)`, histograms) instead of emitting per-entity series. Use histograms/summaries for distributions rather than per-value labels. [1][3]
- Use recording rules to precompute expensive aggregations and to migrate label schemes safely (add before remove). Monitor series growth during changes. [2]
- If you need to keep high-cardinality data for debugging, route it to a dedicated system (logs, traces, or a separate long-term metrics store designed for high cardinality). Consider remote-write to scalable long-term storage only when necessary. [2][3]
- Continuously monitor overall cardinality and memory usage; set alerts for rapid increases in series count. Use tooling/queries to find top metrics by series count and prune as needed. [2][4]

Sources: Prometheus instrumentation docs and best-practice guidance; practical cardinality-control patterns and relabel examples; observability workshop notes. [1][2][3][4]
**Monitor cardinality growth from the `model` label.**

The `model` label in `RequestsTotal` and `RequestDuration` could accumulate high cardinality if requests contain many distinct model names (including typos or dynamically generated values). While this is typically acceptable for LLM gateways with bounded model sets, add cardinality monitoring to detect unexpected growth.
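One way to act on this is to normalize the label before recording it. The sketch below is a minimal illustration under stated assumptions, not the gateway's actual code: `sanitizeModelLabel` and the `knownModels` allowlist are hypothetical names, and in practice the allowlist would be built from the configured model registry.

```go
package main

import "fmt"

// knownModels is a hypothetical allowlist; in practice it would be built
// from the gateway's configured model registry.
var knownModels = map[string]bool{
	"gpt-4":          true,
	"claude-3-opus":  true,
	"gemini-1.5-pro": true,
}

// sanitizeModelLabel collapses unknown or dynamically generated model
// names into a single "other" bucket so the model label stays bounded.
func sanitizeModelLabel(model string) string {
	if model == "" {
		return "unknown"
	}
	if knownModels[model] {
		return model
	}
	return "other"
}

func main() {
	fmt.Println(sanitizeModelLabel("gpt-4"))       // known model passes through
	fmt.Println(sanitizeModelLabel("gpt-4o-typo")) // unknown name collapses to "other"
	fmt.Println(sanitizeModelLabel(""))            // empty name maps to "unknown"
}
```

This keeps the worst-case series count at (number of configured models + 2) per label combination, regardless of what callers send.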
```go
// extractModel attempts to extract the model name from a request body
func extractModel(body interface{}) string {
	if body == nil {
		return "unknown"
	}

	// Try ChatRequest
	if chatReq, ok := body.(*core.ChatRequest); ok && chatReq != nil {
		return chatReq.Model
	}

	// Try ResponsesRequest
	if respReq, ok := body.(*core.ResponsesRequest); ok && respReq != nil {
		return respReq.Model
	}

	// Unknown request type
	return "unknown"
}

// extractStatusCode tries to extract HTTP status code from an error
func extractStatusCode(err error) int {
	if err == nil {
		return 0
	}

	// Try to extract from GatewayError
	if gwErr, ok := err.(*core.GatewayError); ok {
		return gwErr.StatusCode
	}

	// Network or unknown error
	return 0
}
```
🧹 Nitpick | 🔵 Trivial
Consider extending `extractModel` to handle non-pointer types.

The current implementation only handles pointer types (`*core.ChatRequest`, `*core.ResponsesRequest`). If a request body is passed as a value type, the function will incorrectly return `"unknown"`.
🔎 Proposed enhancement to handle value types

```diff
 func extractModel(body interface{}) string {
 	if body == nil {
 		return "unknown"
 	}
 	// Try ChatRequest
 	if chatReq, ok := body.(*core.ChatRequest); ok && chatReq != nil {
 		return chatReq.Model
 	}
+	if chatReq, ok := body.(core.ChatRequest); ok {
+		return chatReq.Model
+	}
 	// Try ResponsesRequest
 	if respReq, ok := body.(*core.ResponsesRequest); ok && respReq != nil {
 		return respReq.Model
 	}
+	if respReq, ok := body.(core.ResponsesRequest); ok {
+		return respReq.Model
+	}
 	// Unknown request type
 	return "unknown"
 }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
// extractModel attempts to extract the model name from a request body
func extractModel(body interface{}) string {
	if body == nil {
		return "unknown"
	}
	// Try ChatRequest
	if chatReq, ok := body.(*core.ChatRequest); ok && chatReq != nil {
		return chatReq.Model
	}
	if chatReq, ok := body.(core.ChatRequest); ok {
		return chatReq.Model
	}
	// Try ResponsesRequest
	if respReq, ok := body.(*core.ResponsesRequest); ok && respReq != nil {
		return respReq.Model
	}
	if respReq, ok := body.(core.ResponsesRequest); ok {
		return respReq.Model
	}
	// Unknown request type
	return "unknown"
}

// extractStatusCode tries to extract HTTP status code from an error
func extractStatusCode(err error) int {
	if err == nil {
		return 0
	}
	// Try to extract from GatewayError
	if gwErr, ok := err.(*core.GatewayError); ok {
		return gwErr.StatusCode
	}
	// Network or unknown error
	return 0
}
```
🤖 Prompt for AI Agents
In internal/pkg/llmclient/client.go around lines 471 to 504, extractModel only
checks pointer types and returns "unknown" for value-type requests; update the
function to also handle non-pointer (value) types by adding type assertions (or
a type switch) for core.ChatRequest and core.ResponsesRequest in addition to
their pointer forms, safely extracting and returning the Model field from either
the value or pointer without causing panics.
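As a sketch of what the prompt describes, a type switch can cover both pointer and value forms in one construct. The `ChatRequest` and `ResponsesRequest` structs below are local stand-ins for the `core` types, used only to keep the example self-contained:

```go
package main

import "fmt"

// Stand-in types for core.ChatRequest / core.ResponsesRequest,
// defined locally so this sketch compiles on its own.
type ChatRequest struct{ Model string }
type ResponsesRequest struct{ Model string }

// extractModel handles both pointer and value request types via a
// type switch; typed nil pointers fall through to "unknown".
func extractModel(body interface{}) string {
	switch req := body.(type) {
	case *ChatRequest:
		if req != nil {
			return req.Model
		}
	case ChatRequest:
		return req.Model
	case *ResponsesRequest:
		if req != nil {
			return req.Model
		}
	case ResponsesRequest:
		return req.Model
	}
	return "unknown"
}

func main() {
	fmt.Println(extractModel(&ChatRequest{Model: "gpt-4"})) // pointer form
	fmt.Println(extractModel(ChatRequest{Model: "gpt-4"}))  // value form
	fmt.Println(extractModel(nil))                          // nil is "unknown"
}
```

A type switch also avoids the growing chain of paired assertions if more request types are added later.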
```go
standardMetrics := []string{
	"go_goroutines",
	"go_gc_duration_seconds",
	"go_memstats_alloc_bytes",
	"process_cpu_seconds_total",
}

for _, metric := range standardMetrics {
	if !strings.Contains(body, metric) {
		t.Errorf("response should contain standard metric %q", metric)
	}
}
```
🧹 Nitpick | 🔵 Trivial
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Check if tests are expected to run on multiple platforms
echo "=== Checking CI configuration for cross-platform testing ==="
find .github -type f -name "*.yml" -o -name "*.yaml" | xargs grep -l "GOOS\|windows\|darwin\|linux" 2>/dev/null || echo "No files matching pattern found"

# Also check Makefile
echo -e "\n=== Checking Makefile ==="
grep -n "GOOS\|windows\|darwin\|linux" Makefile 2>/dev/null || echo "No cross-platform config in Makefile"

# List CI workflow files
echo -e "\n=== Available CI workflow files ==="
ls -la .github/workflows/ 2>/dev/null || echo ".github/workflows not found"
```

Repository: ENTERPILOT/GOModel
Length of output: 424

🏁 Script executed:

```bash
#!/bin/bash
# Read the test file to understand context
echo "=== Reading http_test.go (lines 120-155) ==="
sed -n '120,155p' internal/server/http_test.go
```

Repository: ENTERPILOT/GOModel
Length of output: 1139

🏁 Script executed:

```bash
#!/bin/bash
# Check if there are any existing platform-specific skips in the test file
echo "=== Checking for skip/platform-specific logic in http_test.go ==="
grep -n "skip\|GOOS\|windows\|darwin\|linux\|build" internal/server/http_test.go | head -20
```

Repository: ENTERPILOT/GOModel
Length of output: 127

🏁 Script executed:

```bash
#!/bin/bash
# Look for any metrics-related test setup or documentation
echo "=== Checking for metrics handling in the codebase ==="
rg -n "process_cpu_seconds_total|StandardRegistry" internal/ --type go | head -20
```

Repository: ENTERPILOT/GOModel
Length of output: 179

🏁 Script executed:

```bash
cat .github/workflows/test.yml
```

Repository: ENTERPILOT/GOModel
Length of output: 1539
Update the code comment and consider a conditional metric check for cross-platform readiness.

The `process_cpu_seconds_total` metric is Unix-specific and unavailable on Windows. While the current CI only runs on Linux (`ubuntu-latest`), the code comment claiming these are "standard Go runtime metrics that are always present" is inaccurate. Either update the comment to clarify Unix-specific availability, or make the check conditional if cross-platform support is planned.
🤖 Prompt for AI Agents
In internal/server/http_test.go around lines 133 to 144, the test treats
"process_cpu_seconds_total" as a metric that is "always present" but that metric
is not available on Windows; either update the comment to clarify that some
metrics are OS-specific, or make the assertion conditional: split the metrics
into truly cross-platform ones and OS-specific ones, and only check for
"process_cpu_seconds_total" when runtime.GOOS != "windows" (or otherwise detect
absence) so the test passes on Windows while preserving Linux assertions.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Actionable comments posted: 5
♻️ Duplicate comments (1)
internal/providers/registry_race_test.go (1)
**29-29:** Remove the "CORRECT" emphasis from the comment.

The comment still contains "CORRECT Go types", which was flagged in a previous review. Simplify to just "Implement other interface methods as no-ops" or remove it entirely since the code is self-explanatory.
🔎 Proposed fix

```diff
-// Implement other interface methods as no-ops with CORRECT Go types
+// Implement other interface methods as no-ops
```
📜 Review details
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
- METRICS_CONFIGURATION.md
- internal/providers/registry_race_test.go
🧰 Additional context used
🪛 Gitleaks (8.30.0)
METRICS_CONFIGURATION.md
[high] 242-244: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.
(curl-auth-header)
🪛 markdownlint-cli2 (0.18.1)
METRICS_CONFIGURATION.md
19-19: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
26-26: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
33-33: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
74-74: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
86-86: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
96-96: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
183-183: Fenced code blocks should be surrounded by blank lines
(MD031, blanks-around-fences)
195-195: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
🔇 Additional comments (2)
internal/providers/registry_race_test.go (2)
**13-27:** LGTM!

The `slowMockProvider` and its `ListModels` implementation properly handle context cancellation using the select pattern, which is appropriate for simulating network latency in race condition tests.

**74-91:** LGTM!

The reader goroutines correctly use the shared context for cancellation and exercise multiple registry methods concurrently. The 1ms sleep helps prevent CPU saturation while maintaining high contention.
```go
func TestRegistry_Concurrency(t *testing.T) {
	registry := NewModelRegistry()
	// Register a provider that takes 10ms to respond
	registry.RegisterProvider(&slowMockProvider{delay: 10 * time.Millisecond})

	var wg sync.WaitGroup
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
```
🧹 Nitpick | 🔵 Trivial
Add a comment explaining this test relies on the race detector.
The test has no explicit assertions—it passes if there are no panics or data races. Consider adding a brief comment (e.g., // Run with -race to detect data races) to clarify that this test is designed to be run with go test -race.
🔎 Proposed improvement

```diff
 func TestRegistry_Concurrency(t *testing.T) {
+	// This test detects race conditions via the -race flag.
+	// It passes if no data races or panics occur during concurrent access.
 	registry := NewModelRegistry()
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
func TestRegistry_Concurrency(t *testing.T) {
	// This test detects race conditions via the -race flag.
	// It passes if no data races or panics occur during concurrent access.
	registry := NewModelRegistry()
	// Register a provider that takes 10ms to respond
	registry.RegisterProvider(&slowMockProvider{delay: 10 * time.Millisecond})

	var wg sync.WaitGroup
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
```
🤖 Prompt for AI Agents
In internal/providers/registry_race_test.go around lines 50 to 57, this
concurrency test has no explicit assertions and is intended to surface data
races rather than functional failures; add a brief inline comment above the test
(e.g., "// This test is intended to be run with `go test -race` to detect data
races") to clarify it relies on the race detector and why it passes without
assertions. Ensure the comment is concise and placed just before
TestRegistry_Concurrency so future readers/runers know to use the -race flag.
```go
	for {
		select {
		case <-ctx.Done():
			return
		default:
			_ = registry.Refresh(context.Background())
			time.Sleep(15 * time.Millisecond)
		}
	}
}()
```
🧹 Nitpick | 🔵 Trivial
Consider using a cancellable context for Refresh.
Line 68 uses context.Background() for the Refresh call, which means in-flight refresh operations won't be cancelled when the test's context is cancelled. This could slightly delay test teardown. If graceful shutdown is desired, pass a derived context.
However, this is acceptable if the intent is to let ongoing refreshes complete naturally.
🔎 Proposed change (optional)

```diff
-	_ = registry.Refresh(context.Background())
+	_ = registry.Refresh(ctx)
```

🤖 Prompt for AI Agents
In internal/providers/registry_race_test.go around lines 63 to 72, the goroutine
calls registry.Refresh with context.Background(), so ongoing Refresh calls are
not cancelled when the test's ctx is cancelled; change the call to pass the
test's cancellable context (e.g., use the ctx variable from the outer scope or a
derived ctx) so Refresh is cancellable, and optionally check ctx.Done()
before/after the call to avoid starting new refreshes once cancellation is
requested.
```markdown
### Disable Metrics

**Option 1: Environment Variable**
```
🛠️ Refactor suggestion | 🟠 Major
Fix markdown heading syntax violations.
Multiple lines use bold emphasis (**text**) as section headers instead of proper markdown heading syntax (###). This violates markdown linting rules and reduces semantic clarity.
Update the following lines to use heading syntax:
- Line 19: `**Option 1: Environment Variable**` → `### Option 1: Environment Variable`
- Line 26: `**Option 2: .env file**` → `### Option 2: .env file`
- Line 33: `**Option 3: config.yaml**` → `### Option 3: config.yaml`
- Line 74: `**.env**` → `### .env`
- Line 86: `**.env**` → `### .env`
- Line 96: `**config.yaml**` → `### config.yaml`
- Line 195: `**prometheus.yml**` → `### prometheus.yml`
🔎 Proposed fixes for markdown heading violations

```diff
-**Option 1: Environment Variable**
+### Option 1: Environment Variable
```

Apply the same pattern to all other instances listed above.
Also applies to: 26-26, 33-33, 74-74, 86-86, 96-96, 195-195
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
19-19: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
🤖 Prompt for AI Agents
In METRICS_CONFIGURATION.md around lines 19, 26, 33, 74, 86, 96 and 195,
multiple section titles use bold emphasis (**text**) instead of proper Markdown
heading syntax; replace each bolded title with an appropriate heading (e.g.,
change "**Option 1: Environment Variable**" to "### Option 1: Environment
Variable", and similarly update lines 26, 33, 74, 86, 96, and 195 to use "###"
heading syntax) to satisfy markdown linting and restore semantic structure.
3. **Reverse proxy with authentication:**

```nginx
location /metrics {
    auth_basic "Metrics";
    auth_basic_user_file /etc/nginx/.htpasswd;
    proxy_pass http://gomodel:8080/metrics;
}
```
🛠️ Refactor suggestion | 🟠 Major
Add blank lines around fenced code block.
The nginx reverse proxy code block at line 183 is not surrounded by blank lines. Markdown lint requires blank lines before and after fenced code blocks.
🔎 Proposed fix

````diff
 - Deploy Prometheus in the same VPC/network
+
 3. **Reverse proxy with authentication:**
+
 ```nginx
 location /metrics {
     auth_basic "Metrics";
     auth_basic_user_file /etc/nginx/.htpasswd;
     proxy_pass http://gomodel:8080/metrics;
 }
 ```
+
````
Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools

🪛 markdownlint-cli2 (0.18.1)

183-183: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

🤖 Prompt for AI Agents

In METRICS_CONFIGURATION.md around lines 182 to 190, the fenced nginx code block is not preceded and followed by blank lines, which violates markdown lint rules; add a single blank line before the opening nginx fence and a single blank line after the closing fence so the code block is isolated from surrounding text.
<!-- This is an auto-generated comment by CodeRabbit -->
```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-master-key" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hi"}]}'

# Then check metrics
curl http://localhost:8080/metrics | grep gomodel_requests_total
```
🧹 Nitpick | 🔵 Trivial
Clarify that curl example uses a placeholder authentication token.
The curl command on lines 242–244 includes "Authorization: Bearer your-master-key", which static analysis flagged as a potential leaked credential. While this is a documentation example with an obvious placeholder, consider adding a clarifying comment to explicitly indicate it's a template value, not an actual token.
🔎 Proposed clarification

```diff
 curl -X POST http://localhost:8080/v1/chat/completions \
   -H "Content-Type: application/json" \
-  -H "Authorization: Bearer your-master-key" \
+  -H "Authorization: Bearer your-master-key" \  # Replace with your actual master key
   -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hi"}]}'
```

Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Gitleaks (8.30.0)
[high] 242-244: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.
(curl-auth-header)
🤖 Prompt for AI Agents
In METRICS_CONFIGURATION.md around lines 242 to 249, the Authorization header in
the curl example uses a literal-looking token ("your-master-key") that can be
mistaken for a real credential; update the example to explicitly mark it as a
placeholder and safe to replace (e.g. use <your-master-key> or add a short
inline comment like "# replace with your master key") and optionally add a
one-sentence note reminding readers not to share real tokens publicly.
Summary by CodeRabbit
New Features
Documentation
Chores
Tests