Skip to content

feat(all): Resource monitoring for CPU, memory, disk#1461

Merged
yohamta0 merged 20 commits into
mainfrom
resource-monitoring
Dec 7, 2025
Merged

feat(all): Resource monitoring for CPU, memory, disk#1461
yohamta0 merged 20 commits into
mainfrom
resource-monitoring

Conversation

@yohamta0

@yohamta0 yohamta0 commented Dec 7, 2025

Copy link
Copy Markdown
Collaborator

Summary by CodeRabbit

  • New Features

    • Resource history API (GET /services/resources/history) returning CPU, memory, disk and load time series; supports duration and remote-node query params.
    • Background resource monitoring service with in-memory retention and configurable sampling interval.
    • Resource Usage charts added to System Status with loading/error states, current-value display, auto-refresh and last-updated timestamp.
  • Chores

    • Monitoring config defaults (retention 24h, interval 5s), config parsing and environment bindings.
    • Server startup integrates resource service lifecycle.
  • Tests

    • New unit and integration tests for the monitoring service, store, ring buffer, and API.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai

coderabbitai Bot commented Dec 7, 2025

Copy link
Copy Markdown

Walkthrough

Adds a resource-monitoring subsystem that samples CPU, memory, disk, and load periodically into an in-memory retention store, exposes historical metrics at GET /services/resources/history, integrates the service into server lifecycle and frontend API, and surfaces charts in the UI.

Changes

Cohort / File(s) Summary
API schema & generated server types
api/v2/api.yaml, api/v2/api.gen.go
Added MetricPoint and ResourceHistory schemas, new GET /services/resources/history operation, generated request/response types, middleware, and router registration.
Module dependencies
go.mod
Added github.com/shirou/gopsutil/v3 and several indirect platform/metrics dependencies; promoted gopkg.in/yaml.v3 to direct.
Config surface & loader
internal/common/config/config.go, internal/common/config/definition.go, internal/common/config/loader.go, internal/common/config/loader_test.go
Introduced MonitoringConfig/MonitoringDef, added Monitoring to config/definition, set defaults (24h/5s), added env bindings and parsing, and updated tests.
Resource service implementation & store
internal/service/resource/service.go, internal/service/resource/store.go, internal/service/resource/ringbuffer.go, internal/service/resource/*_test.go
New resource.Service with Start/Stop/GetHistory; MemoryStore time-series store and generic RingBuffer for MetricPoint with retention pruning and filtering; periodic collection using gopsutil; unit tests.
Server lifecycle & wiring
internal/cmd/server.go, internal/cmd/startall.go, internal/cmd/context.go, internal/test/server.go
NewServer signatures updated to accept *resource.Service. Resource service created, started before server init, and stopped during shutdown; startup/shutdown error handling added.
Frontend API wiring & handler
internal/service/frontend/server.go, internal/service/frontend/api/v2/api.go, internal/service/frontend/api/v2/resources.go, internal/integration/queue_shell_test.go, internal/service/frontend/api/v2/resources_test.go
Frontend API holds *resource.Service; constructors updated. Implemented GetResourceHistory handler with duration parsing, retention enforcement, conversion to API types, and error-path test.
UI types, component & page integration
ui/src/api/v2/schema.ts, ui/src/features/system-status/components/ResourceChart.tsx, ui/src/pages/system-status/index.tsx
Added TS schemas for MetricPoint/ResourceHistory, created ResourceChart component, integrated resource-history fetching and four charts (CPU, Memory, Disk, Load) with auto-refresh controls.
Tests & integration updates
internal/service/resource/service_test.go, internal/service/resource/store_test.go, internal/service/resource/ringbuffer_test.go, internal/integration/*, internal/test/server.go
New tests for resource service lifecycle, store and ringbuffer behavior, API error path; adjusted integration/test call sites to match updated NewServer signature.
sequenceDiagram
    participant UI as Web UI
    participant API as V2 API Handler
    participant RS as Resource Service
    participant Store as MemoryStore
    participant Sys as System

    UI->>API: GET /services/resources/history?duration=...
    API->>RS: GetHistory(duration)
    RS->>Store: GetHistory(duration)
    Store-->>RS: ResourceHistory
    RS-->>API: ResourceHistory
    API-->>UI: 200 JSON (cpu,memory,disk,load arrays)

    Note over RS,Store: Background collection loop (started at boot)
    loop every interval
        RS->>Sys: collect metrics (cpu, mem, disk, load)
        Sys-->>RS: metric values
        RS->>Store: Add(cpu, mem, disk, load)
        Store->>Store: prune old points by retention
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Focus areas:
    • internal/service/resource/service.go, internal/service/resource/store.go: concurrency, ticker/cancel semantics, pruning and timestamp handling.
    • internal/common/config/loader.go: duration parsing, defaults, and env bindings.
    • internal/service/frontend/api/v2/resources.go: duration validation, error handling, and API response mapping.
    • internal/cmd/startall.go, internal/cmd/server.go, internal/cmd/context.go: startup/shutdown ordering and error propagation.
    • UI components: ui/src/pages/system-status/index.tsx, ui/src/features/system-status/components/ResourceChart.tsx: data fetching, auto-refresh coordination, and chart rendering.

Poem

🐇
I counted beats and logged each byte,
I sampled dawn and sampled night,
I pruned the old and kept the true,
I stored the hum and shared the view,
Now little charts will hop into sight.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 37.84% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(all): Resource monitoring for CPU, memory, disk' directly describes the primary change: adding resource monitoring capabilities for system metrics.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch resource-monitoring

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (6)
internal/service/resource/store_test.go (1)

27-60: Test name could better reflect what's being tested.

The test name TestMemoryStore_Retention suggests it tests automatic pruning, but as the comments explain (lines 43-51), it actually tests GetHistory filtering by duration. Consider renaming to TestMemoryStore_GetHistoryFiltering or adding a separate test for the actual retention/pruning behavior if that's important to verify.

ui/src/features/system-status/components/ResourceChart.tsx (1)

74-84: Unnecessary optional chaining on currentValue.

At line 82, currentValue?.toFixed(1) uses optional chaining, but currentValue is always a number (defaulting to 0 on line 75). The ?. is harmless but misleading.

-          {currentValue?.toFixed(1)}
+          {currentValue.toFixed(1)}
internal/cmd/startall.go (1)

213-216: Consider shutdown ordering for resource service.

The resource service is stopped before waiting for the server to finish (wg.Wait()). If the server is still handling in-flight requests to /services/resources/history, GetHistory could be called on a stopped service.

While the current implementation likely handles this gracefully (the store data remains accessible), consider moving the resource service stop after wg.Wait() for cleaner shutdown semantics:

+	// Wait for all services to finish with timeout
+	done := make(chan struct{})
+	go func() {
+		wg.Wait()
+		close(done)
+	}()
+
+	select {
+	case <-done:
+		logger.Info(ctx, "All services stopped gracefully")
+	case <-time.After(30 * time.Second):
+		logger.Error(ctx, "Timeout waiting for services to stop")
+	}
+
 	// Stop resource service
 	if err := resourceService.Stop(ctx); err != nil {
 		logger.Error(ctx, "Failed to stop resource service", tag.Error(err))
 	}
-
-	// Wait for all services to finish with timeout
-	done := make(chan struct{})
-	go func() {
-		wg.Wait()
-		close(done)
-	}()
-
-	select {
-	case <-done:
-		logger.Info(ctx, "All services stopped gracefully")
-	case <-time.After(30 * time.Second):
-		logger.Error(ctx, "Timeout waiting for services to stop")
-	}
ui/src/pages/system-status/index.tsx (1)

55-59: Manual refresh only updates resource data, not service status.

The handleRefresh function only calls mutateResource(), but the scheduler and coordinator queries don't expose their mutate functions. Consider whether users would expect the refresh button to also update service status.

+  const {
+    data: schedulerData,
+    error: schedulerError,
+    mutate: mutateScheduler,
+  } = useQuery('/services/scheduler', {
     params: {
       query: {
         remoteNode: appBarContext.selectedRemoteNode || 'local',
       },
     },
   });

+  const {
+    data: coordinatorData,
+    error: coordinatorError,
+    mutate: mutateCoordinator,
+  } = useQuery('/services/coordinator', {
     params: {
       query: {
         remoteNode: appBarContext.selectedRemoteNode || 'local',
       },
     },
   });

   const handleRefresh = async () => {
     setIsRefreshing(true);
-    await Promise.all([mutateResource()]);
+    await Promise.all([mutateResource(), mutateScheduler(), mutateCoordinator()]);
     setIsRefreshing(false);
   };
internal/service/resource/service.go (2)

31-48: Consider guarding against double-Start.

If Start is called twice, the second call overwrites s.cancel, potentially orphaning the first goroutine. Consider adding a guard or documenting single-Start semantics.

+func (s *Service) Start(ctx context.Context) error {
+	if s.cancel != nil {
+		return nil // Already started
+	}
 	ctx, cancel := context.WithCancel(ctx)
 	s.cancel = cancel

73-104: Consider handling collection failures more gracefully.

When metric collection fails, zero values are stored, which could create misleading data points in charts. Consider either skipping storage entirely on failure or using the previous known value.

This is a minor concern for initial implementation - the current approach is functional.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 85b04da and ef0043f.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (21)
  • api/v2/api.gen.go (12 hunks)
  • api/v2/api.yaml (2 hunks)
  • go.mod (5 hunks)
  • internal/cmd/context.go (3 hunks)
  • internal/cmd/server.go (2 hunks)
  • internal/cmd/startall.go (4 hunks)
  • internal/common/config/config.go (1 hunks)
  • internal/common/config/definition.go (2 hunks)
  • internal/common/config/loader.go (3 hunks)
  • internal/common/config/loader_test.go (3 hunks)
  • internal/integration/queue_shell_test.go (1 hunks)
  • internal/service/frontend/api/v2/api.go (4 hunks)
  • internal/service/frontend/api/v2/resources.go (1 hunks)
  • internal/service/frontend/server.go (2 hunks)
  • internal/service/resource/service.go (1 hunks)
  • internal/service/resource/store.go (1 hunks)
  • internal/service/resource/store_test.go (1 hunks)
  • internal/test/server.go (1 hunks)
  • ui/src/api/v2/schema.ts (4 hunks)
  • ui/src/features/system-status/components/ResourceChart.tsx (1 hunks)
  • ui/src/pages/system-status/index.tsx (6 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*.go: Backend entrypoint in cmd/ orchestrates the scheduler and CLI; runtime, persistence, and service layers sit under internal/* (for example internal/runtime, internal/persistence)
Keep Go files gofmt/goimports clean; use tabs, PascalCase for exported symbols (SchedulerClient), lowerCamelCase for locals, and Err... names for package-level errors
Repository linting relies on golangci-lint; prefer idiomatic Go patterns, minimal global state, and structured logging helpers in internal/common

Files:

  • internal/common/config/definition.go
  • internal/common/config/config.go
  • internal/service/frontend/api/v2/api.go
  • internal/common/config/loader.go
  • internal/service/frontend/api/v2/resources.go
  • internal/test/server.go
  • internal/service/resource/store.go
  • internal/common/config/loader_test.go
  • internal/cmd/context.go
  • internal/cmd/server.go
  • internal/service/frontend/server.go
  • internal/cmd/startall.go
  • internal/service/resource/store_test.go
  • internal/service/resource/service.go
  • internal/integration/queue_shell_test.go
  • api/v2/api.gen.go
ui/**/*.{ts,tsx,jsx,js}

📄 CodeRabbit inference engine (ui/CLAUDE.md)

ui/**/*.{ts,tsx,jsx,js}: Use developer-centric UI design with high information density, minimal whitespace, compact components, and no unnecessary decorations
Support both light and dark modes for all UI components using Tailwind CSS class pairs like dark:bg-slate-700
NEVER use full-page loading overlays or LoadingIndicator components that hide content - show stale data while fetching updates instead
Use compact modal design with small headers, minimal padding (p-2 or p-3), tight spacing, and support keyboard navigation (arrows, enter, escape)
Use small heights for form elements: select boxes h-7 or smaller, buttons h-7 or h-8, inputs with compact padding (py-0.5 or py-1)
Minimize row heights in tables and lists while maintaining readability, merge related columns, and always handle long text with whitespace-normal break-words
Use consistent metadata styling with bg-slate-200 dark:bg-slate-700 backgrounds and maintain text hierarchy with primary/secondary/muted text colors
Use flexbox-first layouts with min-h-0 and overflow-hidden to prevent layout breaks, account for fixed elements when setting heights
Maintain keyboard navigation support in all interactive components with appropriate focus indicators and ARIA labels

Files:

  • ui/src/pages/system-status/index.tsx
  • ui/src/api/v2/schema.ts
  • ui/src/features/system-status/components/ResourceChart.tsx
ui/**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

ui/**/*.{ts,tsx}: The React + TypeScript frontend resides in ui/, with production bundles copied to internal/service/frontend/assets by make ui
UI code follows ESLint + Prettier (2-space indent) and Tailwind utilities; name React components in PascalCase (JobList.tsx) and hooks with use* (useJobs.ts)

Files:

  • ui/src/pages/system-status/index.tsx
  • ui/src/api/v2/schema.ts
  • ui/src/features/system-status/components/ResourceChart.tsx
**/*_test.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*_test.go: Co-locate Go tests as *_test.go; favour table-driven cases and cover failure paths
Use stretchr/testify/require and shared fixtures from internal/test instead of duplicating mocks

Files:

  • internal/common/config/loader_test.go
  • internal/service/resource/store_test.go
  • internal/integration/queue_shell_test.go
{api/v1,api/v2}/**/*.{proto,yaml,yml,json}

📄 CodeRabbit inference engine (AGENTS.md)

API definitions live in api/v1 and api/v2; generated server stubs land in internal/service, while matching TypeScript clients flow into ui/src/api

Files:

  • api/v2/api.yaml
🧠 Learnings (5)
📚 Learning: 2025-12-04T10:34:17.051Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T10:34:17.051Z
Learning: Applies to ui/**/*.{ts,tsx} : The React + TypeScript frontend resides in `ui/`, with production bundles copied to `internal/service/frontend/assets` by `make ui`

Applied to files:

  • ui/src/pages/system-status/index.tsx
📚 Learning: 2025-12-04T10:34:17.051Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T10:34:17.051Z
Learning: `make run` starts the Go scheduler and serves the compiled UI (fails fast if `ui/dist` is missing)

Applied to files:

  • internal/test/server.go
📚 Learning: 2025-12-04T10:34:17.051Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T10:34:17.051Z
Learning: Applies to **/*.go : Backend entrypoint in `cmd/` orchestrates the scheduler and CLI; runtime, persistence, and service layers sit under `internal/*` (for example `internal/runtime`, `internal/persistence`)

Applied to files:

  • internal/cmd/context.go
📚 Learning: 2025-12-04T10:34:17.051Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T10:34:17.051Z
Learning: Applies to **/*.go : Repository linting relies on `golangci-lint`; prefer idiomatic Go patterns, minimal global state, and structured logging helpers in `internal/common`

Applied to files:

  • go.mod
📚 Learning: 2025-12-04T10:34:17.051Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T10:34:17.051Z
Learning: Applies to **/*_test.go : Co-locate Go tests as `*_test.go`; favour table-driven cases and cover failure paths

Applied to files:

  • internal/service/resource/store_test.go
🧬 Code graph analysis (11)
internal/common/config/config.go (1)
internal/common/logger/tag/tag.go (2)
  • Duration (222-224)
  • Interval (217-219)
internal/service/frontend/api/v2/api.go (3)
internal/service/resource/service.go (1)
  • Service (16-21)
internal/common/logger/tag/tag.go (1)
  • Service (200-202)
internal/service/coordinator/service.go (1)
  • Service (20-29)
ui/src/pages/system-status/index.tsx (2)
ui/src/lib/utils.ts (1)
  • cn (4-6)
api/v2/api.gen.go (2)
  • SchedulerInstance (735-747)
  • CoordinatorInstance (163-178)
internal/test/server.go (1)
internal/service/frontend/server.go (1)
  • NewServer (47-69)
internal/service/resource/store.go (2)
api/v2/api.gen.go (2)
  • MetricPoint (519-523)
  • ResourceHistory (688-693)
internal/persistence/legacy/model/status.go (1)
  • Time (62-64)
internal/common/config/loader_test.go (1)
internal/common/config/config.go (1)
  • MonitoringConfig (45-50)
internal/cmd/context.go (2)
internal/service/frontend/server.go (2)
  • NewServer (47-69)
  • Server (38-44)
internal/service/resource/service.go (1)
  • Service (16-21)
internal/cmd/server.go (2)
internal/service/resource/service.go (1)
  • NewService (23-29)
internal/service/frontend/server.go (1)
  • NewServer (47-69)
internal/service/frontend/server.go (2)
internal/service/resource/service.go (1)
  • Service (16-21)
internal/service/frontend/api/v2/api.go (1)
  • New (50-83)
internal/cmd/startall.go (2)
internal/service/resource/service.go (1)
  • NewService (23-29)
internal/service/frontend/server.go (1)
  • NewServer (47-69)
ui/src/features/system-status/components/ResourceChart.tsx (4)
internal/service/resource/store.go (1)
  • MetricPoint (10-13)
api/v2/api.gen.go (1)
  • MetricPoint (519-523)
ui/src/api/v2/schema.ts (1)
  • components (757-1343)
ui/src/components/ui/card.tsx (4)
  • Card (85-85)
  • CardHeader (86-86)
  • CardTitle (88-88)
  • CardContent (91-91)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Test on ubuntu-latest
  • GitHub Check: Go Linter
  • GitHub Check: Build
🔇 Additional comments (37)
go.mod (3)

40-40: Consider upgrading gopsutil to v4, as v3 is no longer maintained.

v3 will not be updated except for high level security issues. The v4 version has new fixes, improves and better platform coverages. Given that this is a new resource monitoring feature, starting with v4 would provide better long-term maintainability and access to improvements.

If v3 is intentional (e.g., due to breaking changes in v4 affecting your usage), please verify that the v3 API meets all your resource monitoring requirements. The only breaking changes from v3 to v4 include host/SensorsTemperatures() moving to a new sensors package. If your implementation doesn't use SensorsTemperatures(), migration to v4 should be straightforward.


91-91: Indirect gopsutil dependencies are appropriately declared.

The platform-specific transitive dependencies (go-ole, lufia/plan9stats, power-devops/perfstat, shoenig/go-m1cpu, tklauser/go-sysconf, tklauser/numcpus, yusufpapurcu/wmi) are expected and properly marked as indirect dependencies from gopsutil. These enable system metrics collection across different operating systems.

Also applies to: 108-108, 128-128, 134-134, 139-139, 147-147


383-383: Verify that yaml.v3 is now directly imported in the codebase.

Changing gopkg.in/yaml.v3 from indirect to direct indicates that code now directly imports this package (rather than it being only a transitive dependency). Ensure this reflects an actual direct usage and is intentional. If yaml.v3 was previously only pulled in transitively and is now directly imported for the resource monitoring feature, this change is correct; otherwise, it should remain indirect.

internal/integration/queue_shell_test.go (1)

50-50: LGTM!

Passing nil for the resource service parameter is acceptable in this test context, as resource monitoring functionality is not being exercised here.

internal/common/config/config.go (1)

37-50: LGTM!

The monitoring configuration is well-structured with proper documentation. The memory estimation comment is helpful for capacity planning.

internal/cmd/context.go (1)

177-199: LGTM!

The dependency injection of the resource service is correctly implemented. The signature change aligns with the updated frontend.NewServer constructor.

internal/test/server.go (1)

67-67: LGTM!

Passing nil for the resource service is appropriate in this test helper, as resource monitoring is not being tested.

internal/service/frontend/api/v2/api.go (1)

47-82: LGTM!

The resource service is correctly wired into the API struct through dependency injection. The field assignment and parameter passing are properly implemented.

internal/common/config/definition.go (1)

136-137: LGTM!

The monitoring configuration definition is properly structured with appropriate mapstructure tags. Using string types for duration fields at the definition level is correct, as they are parsed into time.Duration during config loading.

Also applies to: 268-276

internal/common/config/loader.go (2)

345-367: LGTM!

The monitoring configuration parsing correctly handles duration validation and applies sensible defaults. The error handling appropriately logs warnings for invalid values rather than failing hard, which is consistent with other configuration sections.


591-593: LGTM!

The default values and environment variable bindings are properly configured. The 24-hour retention and 5-second collection interval are reasonable defaults for system monitoring.

Also applies to: 693-695

internal/cmd/server.go (2)

46-56: Verify resource cleanup on early return.

The resource service lifecycle is properly managed with start/stop. However, if ctx.NewServer(resourceService) (line 57) fails and returns an error, the deferred Stop will still execute, which is correct. The implementation properly ensures cleanup.


57-57: LGTM!

The resource service is correctly passed to the server constructor, enabling resource monitoring functionality.

internal/service/frontend/server.go (1)

29-29: LGTM - Clean dependency injection for resource service.

The resource service is properly injected through NewServer and passed to the v2 API layer. The wiring aligns with the apiv2.New signature shown in the relevant snippets.

Also applies to: 47-54

internal/service/frontend/api/v2/resources.go (3)

13-22: LGTM - Proper nil-safety check for resource service.

Good defensive programming returning a clear error when the resource service is unavailable.


24-33: Duration parsing handles edge cases correctly.

The logic properly handles:

  • Missing duration parameter (defaults to 1 hour)
  • Invalid duration strings (logs warning, uses default)
  • Zero/negative durations (rejected by d > 0 check)
  • Durations exceeding retention (capped with min)

50-58: Efficient metric conversion with pre-allocated slice.

Good use of make([]api.MetricPoint, len(points)) to avoid reallocations during the loop.

internal/common/config/loader_test.go (2)

793-817: Good test coverage for monitoring configuration.

The test covers all three configuration sources (YAML, environment variables, defaults) with appropriate assertions. The structure follows the existing test patterns in this file.


210-213: Monitoring defaults correctly added to existing test expectations.

Both TestLoad_Env and TestLoad_YAML now include the expected default Monitoring configuration, ensuring the new field is validated in comprehensive integration tests.

Also applies to: 444-447

internal/service/resource/store_test.go (2)

62-92: Thorough edge case coverage for filterPoints.

Good coverage including: all points kept, partial filtering, all excluded, nil slice handling, and the no-copy optimization path.


10-25: Basic add/get test validates core functionality.

Tests all four metric types (CPU, Memory, Disk, Load) and verifies ordering of values.

api/v2/api.yaml (2)

1528-1555: Well-structured API endpoint definition.

The endpoint follows existing patterns in the spec with proper tagging, parameter definition, and response schema reference. The default duration of "1h" aligns with the Go handler implementation.


2838-2870: Clean schema definitions for resource metrics.

MetricPoint correctly marks timestamp and value as required. ResourceHistory appropriately leaves array properties optional since they may be empty.

ui/src/features/system-status/components/ResourceChart.tsx (3)

116-116: Consider Y-axis domain for non-percentage metrics.

The domain={[0, 'auto']} works well for CPU, Memory, and Disk percentages (0-100%), but system load averages can exceed 100 on multi-core systems. The current 'auto' upper bound handles this, but you may want to verify the Load chart displays correctly with high load values.


38-64: Good handling of error and loading states.

The component properly shows distinct UI for error and loading states before attempting to render data, preventing potential runtime errors from undefined data.


66-72: Safe data transformation with fallbacks.

The timestamp-to-time conversion and value normalization handle edge cases appropriately with fallback values.

internal/cmd/startall.go (1)

95-98: LGTM - Resource service initialization and startup.

The resource service is properly created and started before the server begins serving requests, ensuring metrics collection is active when the API becomes available.

Also applies to: 130-133

ui/src/pages/system-status/index.tsx (1)

137-169: LGTM!

The Resource Usage section is well-structured with consistent prop patterns across all four charts. The inline color comments help document which Tailwind palette colors are being used.

ui/src/api/v2/schema.ts (1)

1-4: LGTM!

This is an auto-generated file from OpenAPI spec. The new ResourceHistory and MetricPoint types align correctly with the backend definitions in api/v2/api.gen.go.

internal/service/resource/store.go (3)

9-21: LGTM!

The MetricPoint and ResourceHistory types are well-defined and align with the API schema in api/v2/api.gen.go. The struct fields match the expected JSON serialization format.


48-69: LGTM!

The Add method is thread-safe with appropriate write locking. The periodic pruning strategy (every minute) is a good balance between memory efficiency and avoiding excessive lock contention. The binary search in filterPoints ensures O(log n) pruning complexity.


86-110: LGTM!

The filterPoints helper is well-implemented with binary search for O(log n) complexity. The copySlice parameter correctly differentiates between internal pruning (can reuse backing array) and external queries (must copy to prevent mutation). The edge cases (empty slice, all points old) are handled correctly.

api/v2/api.gen.go (5)

3-3: Generated types for metrics history look consistent

MetricPoint and ResourceHistory match the rest of the generated models (exported fields, json tags, pointer-to-slice pattern for optional arrays). The updated generator version line is also in the expected format. No changes needed.

Also applies to: 518-523, 687-693


1453-1457: Query parameter binding for duration is correctly modeled

GetResourceHistoryParams.Duration as an optional string with form:"duration,omitempty" / json:"duration,omitempty" and the wrapper’s BindQueryParameter("duration", &params.Duration) call align with how other optional query params are handled. Any validation or defaulting can be cleanly layered in the concrete handler implementation.


1747-1749: Endpoint wiring for /services/resources/history is complete

The new GetResourceHistory method is:

  • Declared on ServerInterface and Unimplemented with the usual 501 behavior.
  • Exposed via ServerInterfaceWrapper.GetResourceHistory, correctly binding the duration query param.
  • Registered in HandlerWithOptions at GET /services/resources/history.

This mirrors patterns used by neighboring endpoints, so the routing/plumbing looks correct.

Also applies to: 1984-1988, 3982-4015, 4311-4313


5775-5802: Strict-mode request/response types for GetResourceHistory are well-formed

GetResourceHistoryRequestObject, the GetResourceHistoryResponseObject interface, the 200 JSON response aliasing ResourceHistory, and the default JSON error wrapper all follow the existing convention used for other operations. Content-type and status handling in the Visit... methods are consistent.


5984-5986: Strict handler integration for GetResourceHistory is aligned with other operations

StrictServerInterface gains GetResourceHistory(...), and strictHandler.GetResourceHistory:

  • Wraps GetResourceHistoryParams into a GetResourceHistoryRequestObject.
  • Passes it through the strict middleware chain with the correct operation name.
  • Asserts the response as GetResourceHistoryResponseObject and dispatches via VisitGetResourceHistoryResponse.

This matches the strict-handler pattern elsewhere in the file; no adjustments needed.

Also applies to: 7120-7144

Comment thread internal/service/resource/service.go

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
api/v2/api.gen.go (1)

687-693: ResourceHistory shape matches store type; consider future type sharing

ResourceHistory has the same JSON surface as internal/service/resource/store.go (cpu, memory, disk, load), but uses pointer-to-slice fields and Cpu vs CPU identifiers. That’s fine for the API layer, but it does mean the convert helper must keep doing explicit field-by-field mapping between the service and API types; if this area grows, you might eventually want a shared DTO package or codegen for the store layer as well to avoid drift.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ef0043f and e1c8cb1.

📒 Files selected for processing (4)
  • api/v2/api.gen.go (11 hunks)
  • api/v2/api.yaml (2 hunks)
  • ui/src/api/v2/schema.ts (4 hunks)
  • ui/src/pages/system-status/index.tsx (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • api/v2/api.yaml
🧰 Additional context used
📓 Path-based instructions (3)
ui/**/*.{ts,tsx,jsx,js}

📄 CodeRabbit inference engine (ui/CLAUDE.md)

ui/**/*.{ts,tsx,jsx,js}: Use developer-centric UI design with high information density, minimal whitespace, compact components, and no unnecessary decorations
Support both light and dark modes for all UI components using Tailwind CSS class pairs like dark:bg-slate-700
NEVER use full-page loading overlays or LoadingIndicator components that hide content - show stale data while fetching updates instead
Use compact modal design with small headers, minimal padding (p-2 or p-3), tight spacing, and support keyboard navigation (arrows, enter, escape)
Use small heights for form elements: select boxes h-7 or smaller, buttons h-7 or h-8, inputs with compact padding (py-0.5 or py-1)
Minimize row heights in tables and lists while maintaining readability, merge related columns, and always handle long text with whitespace-normal break-words
Use consistent metadata styling with bg-slate-200 dark:bg-slate-700 backgrounds and maintain text hierarchy with primary/secondary/muted text colors
Use flexbox-first layouts with min-h-0 and overflow-hidden to prevent layout breaks, account for fixed elements when setting heights
Maintain keyboard navigation support in all interactive components with appropriate focus indicators and ARIA labels

Files:

  • ui/src/pages/system-status/index.tsx
  • ui/src/api/v2/schema.ts
ui/**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

ui/**/*.{ts,tsx}: The React + TypeScript frontend resides in ui/, with production bundles copied to internal/service/frontend/assets by make ui
UI code follows ESLint + Prettier (2-space indent) and Tailwind utilities; name React components in PascalCase (JobList.tsx) and hooks with use* (useJobs.ts)

Files:

  • ui/src/pages/system-status/index.tsx
  • ui/src/api/v2/schema.ts
**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*.go: Backend entrypoint in cmd/ orchestrates the scheduler and CLI; runtime, persistence, and service layers sit under internal/* (for example internal/runtime, internal/persistence)
Keep Go files gofmt/goimports clean; use tabs, PascalCase for exported symbols (SchedulerClient), lowerCamelCase for locals, and Err... names for package-level errors
Repository linting relies on golangci-lint; prefer idiomatic Go patterns, minimal global state, and structured logging helpers in internal/common

Files:

  • api/v2/api.gen.go
🧠 Learnings (1)
📚 Learning: 2025-12-04T10:34:17.051Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T10:34:17.051Z
Learning: Applies to ui/**/*.{ts,tsx} : The React + TypeScript frontend resides in `ui/`, with production bundles copied to `internal/service/frontend/assets` by `make ui`

Applied to files:

  • ui/src/pages/system-status/index.tsx
🧬 Code graph analysis (2)
ui/src/pages/system-status/index.tsx (2)
ui/src/lib/utils.ts (1)
  • cn (4-6)
api/v2/api.gen.go (2)
  • SchedulerInstance (735-747)
  • CoordinatorInstance (163-178)
api/v2/api.gen.go (1)
internal/service/resource/store.go (2)
  • MetricPoint (10-13)
  • ResourceHistory (16-21)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Test on ubuntu-latest
🔇 Additional comments (15)
ui/src/pages/system-status/index.tsx (4)

1-10: LGTM!

The import organization is clean and all imported modules are properly utilized throughout the component.


82-108: LGTM!

The auto-refresh toggle implementation follows the coding guidelines with compact sizing, proper dark mode support, and clear visual feedback when enabled.


112-145: LGTM!

Service card instances are correctly mapped from the API response data, with proper null safety and error handling. The field extraction matches the backend schema definitions.


147-179: LGTM!

The Resource Usage section is well-structured with responsive grid layout, consistent chart configuration, and proper loading/error state handling. The color choices provide good visual distinction between metrics.

ui/src/api/v2/schema.ts (1)

1-3244: Auto-generated schema file looks correct.

This file is auto-generated from the OpenAPI specification and includes the new resource monitoring endpoint, schemas, and operation definitions. The type definitions are properly structured and consistent with the OpenAPI patterns used throughout the file.

api/v2/api.gen.go (10)

518-523: MetricPoint wire model looks correct and consistent

MetricPoint matches the internal internal/service/resource/store.go shape at the JSON level (timestamp, value), and the Unix timestamp comment is clear. No issues here.


1453-1460: GetResourceHistoryParams follows existing params conventions

RemoteNode and optional Duration are wired exactly like other query params in this file (form + json tags, pointer types for optionality). This is consistent with the rest of the API surface.


1750-1752: ServerInterface extended cleanly with GetResourceHistory

The new method and comments align with the existing pattern for other /services/* endpoints, so implementers get a straightforward hook point.


1987-1991: Unimplemented stub for GetResourceHistory is consistent

Returning http.StatusNotImplemented matches all the other Unimplemented methods; nothing to change here.


3985-4026: ServerInterfaceWrapper binding for GetResourceHistory is correct

Middleware setup, context enrichment, and query binding for remoteNode and duration mirror the patterns used by other endpoints. Error paths correctly delegate to ErrorHandlerFunc.


4322-4324: Router registration for /services/resources/history is aligned

The route is grouped and registered in the same way as neighboring /services/* routes, using wrapper.GetResourceHistory and options.BaseURL correctly.


5786-5813: Strict GetResourceHistory request/response types are well-formed

The strict request wrapper, response interface, 200 JSON response (using ResourceHistory), and default error response all follow the same visitation and encoding patterns as the other strict endpoints.


5995-5997: StrictServerInterface correctly exposes GetResourceHistory

The strict interface method signature matches the request/response types above and is consistent with other entries in StrictServerInterface.


7131-7155: strictHandler wiring for GetResourceHistory matches the standard pattern

Request construction, middleware wrapping, dispatch to ssi.GetResourceHistory, and response handling mirror other strict handlers; type assertions and error paths are handled uniformly.


7259-7379: Embedded swaggerSpec updated; ensure it stays source-of-truth–driven

The base64/gzip swaggerSpec blob has changed, which is expected to reflect the new MetricPoint / ResourceHistory schemas and the /services/resources/history operation. Since this file is generated, just make sure the blob always comes from regenerating via oapi-codegen from api/v2/api.yaml rather than manual edits, so it doesn’t drift from the spec.

If you haven’t already, please rerun your usual OpenAPI → Go codegen pipeline (e.g., oapi-codegen invocation) and confirm this file matches the generated output in CI.

Comment thread ui/src/pages/system-status/index.tsx Outdated
Comment thread ui/src/pages/system-status/index.tsx

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (2)
ui/src/pages/system-status/index.tsx (1)

79-83: Missing error handling leaves refresh button stuck if any request fails.

If any mutate call fails, Promise.all rejects and setIsRefreshing(false) is never called, leaving the button indefinitely disabled with a spinning icon.

 const handleRefresh = async () => {
   setIsRefreshing(true);
-  await Promise.all([mutateResource(), mutateScheduler(), mutateCoordinator()]);
-  setIsRefreshing(false);
+  try {
+    await Promise.all([mutateResource(), mutateScheduler(), mutateCoordinator()]);
+  } finally {
+    setIsRefreshing(false);
+  }
 };
internal/service/resource/service.go (1)

80-85: First CPU reading is invalid with interval=0.

The past review correctly identified this issue: cpu.PercentWithContext(ctx, 0, false) returns the percentage computed against the last prior call. On the first invocation (line 65 calls collect immediately), there is no baseline, so the returned value is meaningless.

As suggested in the previous review, use a positive interval:

-	if cpuPercent, err := cpu.PercentWithContext(ctx, 0, false); err != nil {
+	if cpuPercent, err := cpu.PercentWithContext(ctx, 100*time.Millisecond, false); err != nil {
🧹 Nitpick comments (2)
ui/src/pages/system-status/index.tsx (1)

195-199: Minor: Timestamp reflects render time, not actual data fetch time.

new Date().toLocaleTimeString() shows when the component rendered, which may differ from when data was actually fetched (especially on initial load or after errors). Consider deriving the timestamp from the data response if precise accuracy is needed, though the current implementation is acceptable for a monitoring dashboard with auto-refresh.

internal/service/resource/service.go (1)

23-29: Consider validating config parameter.

If cfg or cfg.Monitoring is nil, the constructor will panic when accessing cfg.Monitoring.Retention. For defensive coding, consider adding a nil check.

 func NewService(cfg *config.Config) *Service {
+	if cfg == nil || cfg.Monitoring == nil {
+		panic("config and config.Monitoring must not be nil")
+	}
 	return &Service{
 		config: cfg,
 		store:  NewMemoryStore(cfg.Monitoring.Retention),
 		done:   make(chan struct{}),
 	}
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e1c8cb1 and adc414a.

📒 Files selected for processing (4)
  • internal/service/resource/service.go (1 hunks)
  • internal/service/resource/store_test.go (1 hunks)
  • ui/src/features/system-status/components/ResourceChart.tsx (1 hunks)
  • ui/src/pages/system-status/index.tsx (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • internal/service/resource/store_test.go
  • ui/src/features/system-status/components/ResourceChart.tsx
🧰 Additional context used
📓 Path-based instructions (3)
ui/**/*.{ts,tsx,jsx,js}

📄 CodeRabbit inference engine (ui/CLAUDE.md)

ui/**/*.{ts,tsx,jsx,js}: Use developer-centric UI design with high information density, minimal whitespace, compact components, and no unnecessary decorations
Support both light and dark modes for all UI components using Tailwind CSS class pairs like dark:bg-slate-700
NEVER use full-page loading overlays or LoadingIndicator components that hide content - show stale data while fetching updates instead
Use compact modal design with small headers, minimal padding (p-2 or p-3), tight spacing, and support keyboard navigation (arrows, enter, escape)
Use small heights for form elements: select boxes h-7 or smaller, buttons h-7 or h-8, inputs with compact padding (py-0.5 or py-1)
Minimize row heights in tables and lists while maintaining readability, merge related columns, and always handle long text with whitespace-normal break-words
Use consistent metadata styling with bg-slate-200 dark:bg-slate-700 backgrounds and maintain text hierarchy with primary/secondary/muted text colors
Use flexbox-first layouts with min-h-0 and overflow-hidden to prevent layout breaks, account for fixed elements when setting heights
Maintain keyboard navigation support in all interactive components with appropriate focus indicators and ARIA labels

Files:

  • ui/src/pages/system-status/index.tsx
ui/**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

ui/**/*.{ts,tsx}: The React + TypeScript frontend resides in ui/, with production bundles copied to internal/service/frontend/assets by make ui
UI code follows ESLint + Prettier (2-space indent) and Tailwind utilities; name React components in PascalCase (JobList.tsx) and hooks with use* (useJobs.ts)

Files:

  • ui/src/pages/system-status/index.tsx
**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*.go: Backend entrypoint in cmd/ orchestrates the scheduler and CLI; runtime, persistence, and service layers sit under internal/* (for example internal/runtime, internal/persistence)
Keep Go files gofmt/goimports clean; use tabs, PascalCase for exported symbols (SchedulerClient), lowerCamelCase for locals, and Err... names for package-level errors
Repository linting relies on golangci-lint; prefer idiomatic Go patterns, minimal global state, and structured logging helpers in internal/common

Files:

  • internal/service/resource/service.go
🧠 Learnings (1)
📚 Learning: 2025-12-04T10:34:17.051Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T10:34:17.051Z
Learning: Applies to ui/**/*.{ts,tsx} : The React + TypeScript frontend resides in `ui/`, with production bundles copied to `internal/service/frontend/assets` by `make ui`

Applied to files:

  • ui/src/pages/system-status/index.tsx
🧬 Code graph analysis (1)
ui/src/pages/system-status/index.tsx (2)
ui/src/lib/utils.ts (1)
  • cn (4-6)
api/v2/api.gen.go (2)
  • SchedulerInstance (735-747)
  • CoordinatorInstance (163-178)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test on ubuntu-latest
  • GitHub Check: Build
🔇 Additional comments (7)
ui/src/pages/system-status/index.tsx (4)

24-77: Previous feedback addressed - auto-refresh now applies to all data sources.

The queries for scheduler, coordinator, and resource data all have consistent refreshInterval configuration tied to the autoRefresh state, and all extract mutate functions for manual refresh. This addresses the previous review feedback.


97-121: Good implementation of compact controls with proper dark mode support.

The button styling follows the coding guidelines with compact sizing (h-7, text-xs), and the auto-refresh toggle has appropriate visual feedback for both light and dark modes (dark:bg-green-950). The spinning animation on the refresh button provides clear feedback during manual refresh.


126-159: Service cards correctly map API data with proper loading states.

The explicit type annotations on the map callbacks (s: SchedulerInstance, c: CoordinatorInstance) improve clarity, and the isLoading pattern (!data && !error) ensures stale data is shown while fetching updates, aligning with the coding guidelines.


161-193: Resource charts implemented with consistent patterns.

The four ResourceChart components follow a consistent prop structure. The unit="" for Load Average is correctly omitted since load averages aren't percentages. The color choices provide good visual distinction between metrics.

internal/service/resource/service.go (3)

54-56: LGTM!

Clean delegation to the store for history retrieval.


58-75: LGTM!

The loop properly handles initialization (immediate collection), periodic collection via ticker, and graceful shutdown via context cancellation. Cleanup with deferred channel close and ticker stop is correct.


87-109: LGTM!

Memory, disk, and load average collection logic is correct:

  • Memory: Uses VirtualMemoryWithContext to get system memory usage percentage
  • Disk: Monitors the configured data directory with UsageWithContext
  • Load: Captures 1-minute load average

Error handling logs failures but continues collection, allowing partial metrics to be stored, which is a reasonable resilience strategy.

Comment thread internal/service/resource/service.go
Comment thread internal/service/resource/service.go
Comment thread internal/service/resource/service.go

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
internal/service/resource/service_test.go (1)

147-166: Refactor to avoid manipulating internal state.

Closing svc.done directly at line 161 tests implementation details rather than observable behavior. If the service's synchronization mechanism changes, this test will break unnecessarily.

Consider one of these approaches:

  1. Simply call Stop without manipulating done and verify it returns no error (testing idempotent Stop)
  2. If the concern is blocking, use a short timeout context instead

Apply this diff to test idempotent Stop behavior without state manipulation:

-	// Create a new done channel that's already closed to prevent blocking
-	close(svc.done)
-
-	// Stop should handle nil cancel gracefully
+	// Stop without Start should be a no-op
 	err := svc.Stop(ctx)
 	require.NoError(t, err)
internal/service/frontend/api/v2/resources_test.go (2)

13-28: Good coverage of missing resource service error path

This test cleanly exercises the 500-path when resourceService is nil and asserts both error code and message, which is useful for guarding future refactors of the wiring and error text. Only thing to keep in mind is that it implicitly depends on test.SetupServer(t) continuing to construct the server without a resource service; if that default ever changes, this test will need to be updated to pass an explicit option for “no resource service”.


30-44: Duration-param error-path test is clear; consider future success/validation cases

This nicely verifies that adding a duration query parameter still hits the same internal-error path when the resource service is missing. As a follow-up (not for this PR), you might add tests for:

  • a successful history response when a resource service is injected, and
  • bad duration values (e.g. duration=bogus) returning a 4xx.

That would round out coverage for this endpoint.

internal/service/resource/store_test.go (2)

11-53: Solid coverage for Add/Get and duration filtering; watch time-based test brittleness

These tests give good confidence that MemoryStore.Add + GetHistory preserve ordering and that the duration window behaves as expected. The use of time.Sleep(1100 * time.Millisecond) to force a different Unix second is pragmatic, but it does add >1s per test and introduces a small risk of flakiness on very slow CI (e.g. GetHistory(50 * time.Millisecond) being called more than a second after the second Add). If this ever becomes an issue, consider:

  • injecting a time source into MemoryStore to allow deterministic timestamps in tests, or
  • directly constructing []MetricPoint with controlled timestamps when you only need to validate GetHistory/filtering behavior.

For now, the tests are acceptable as-is, just something to keep in mind.


55-81: Prune behavior test is effective but tightly coupled to internals

The test does a nice job of forcing and verifying pruning, including checking the internal slice length under a read lock. Just note that relying on store.lastPruned and store.cpu ties the test to the current implementation details; any refactor of the internal representation will require touching this test. If you want to reduce that coupling, a future refactor could assert prune behavior via public APIs only (e.g. by observing what GetHistory returns before/after prune), though that may be less direct.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between adc414a and 083979e.

📒 Files selected for processing (3)
  • internal/service/frontend/api/v2/resources_test.go (1 hunks)
  • internal/service/resource/service_test.go (1 hunks)
  • internal/service/resource/store_test.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*.go: Backend entrypoint in cmd/ orchestrates the scheduler and CLI; runtime, persistence, and service layers sit under internal/* (for example internal/runtime, internal/persistence)
Keep Go files gofmt/goimports clean; use tabs, PascalCase for exported symbols (SchedulerClient), lowerCamelCase for locals, and Err... names for package-level errors
Repository linting relies on golangci-lint; prefer idiomatic Go patterns, minimal global state, and structured logging helpers in internal/common

Files:

  • internal/service/resource/service_test.go
  • internal/service/resource/store_test.go
  • internal/service/frontend/api/v2/resources_test.go
**/*_test.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*_test.go: Co-locate Go tests as *_test.go; favour table-driven cases and cover failure paths
Use stretchr/testify/require and shared fixtures from internal/test instead of duplicating mocks

Files:

  • internal/service/resource/service_test.go
  • internal/service/resource/store_test.go
  • internal/service/frontend/api/v2/resources_test.go
🧠 Learnings (2)
📚 Learning: 2025-12-04T10:34:17.051Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T10:34:17.051Z
Learning: Applies to **/*_test.go : Co-locate Go tests as `*_test.go`; favour table-driven cases and cover failure paths

Applied to files:

  • internal/service/resource/store_test.go
📚 Learning: 2025-12-04T10:34:17.051Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T10:34:17.051Z
Learning: Applies to **/*_test.go : Use `stretchr/testify/require` and shared fixtures from `internal/test` instead of duplicating mocks

Applied to files:

  • internal/service/resource/store_test.go
🧬 Code graph analysis (3)
internal/service/resource/service_test.go (2)
internal/common/config/config.go (1)
  • MonitoringConfig (45-50)
internal/common/config/path.go (1)
  • Paths (13-30)
internal/service/resource/store_test.go (2)
internal/service/resource/store.go (2)
  • NewMemoryStore (41-45)
  • MetricPoint (10-13)
api/v2/api.gen.go (1)
  • MetricPoint (519-523)
internal/service/frontend/api/v2/resources_test.go (2)
internal/runtime/data.go (1)
  • Parallel (69-73)
internal/test/server.go (1)
  • SetupServer (25-50)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test on ubuntu-latest
  • GitHub Check: Build
🔇 Additional comments (8)
internal/service/resource/service_test.go (6)

13-29: LGTM!

Constructor test follows best practices: uses t.Parallel(), validates non-nil initialization, and confirms config assignment.


31-57: LGTM!

Basic lifecycle test validates Start/Stop operations complete without error. Uses t.TempDir() appropriately for temporary data directory.


59-86: LGTM!

Validates idempotent Start behavior—calling Start twice should not error. Good coverage of concurrent lifecycle edge cases.


88-124: LGTM!

Properly validates that GetHistory returns collected metrics. Using GreaterOrEqual is appropriate for timing-based collection where exact counts are non-deterministic.


126-145: LGTM!

Good edge-case coverage—validates that GetHistory returns an empty but non-nil result when the service hasn't started.


168-197: LGTM!

Validates periodic metric collection over multiple intervals. The conservative assertion (≥2 samples with a 50ms interval over 200ms) appropriately accounts for scheduling and timing uncertainties.

internal/service/resource/store_test.go (2)

83-133: Nice coverage of empty history, metric mapping, and copy semantics

These tests clearly validate:

  • nil slices for empty history,
  • correct mapping of CPU/Memory/Disk/Load values, and
  • that GetHistory returns independent copies.

This is exactly the kind of behavioral-contract coverage that will catch regressions if the internal storage is ever reworked.


135-167: This review comment is based on incorrect code snapshots and is no longer applicable.

The actual code at line 153 already uses assert.Same(t, &points[0], &result[0], "should return same slice") to verify slice identity, which is exactly the fix the review suggests. Additionally, the second test function TestFilterPoints_CopySliceBehavior referenced at lines 204–232 does not exist in the file. All recommendations in this review have either already been implemented or address non-existent code.

Likely an incorrect or invalid review comment.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
ui/src/pages/system-status/index.tsx (1)

118-128: Add accessible label to the icon-only refresh button

The refresh button is icon-only, so screen readers will not get a meaningful label. Add an aria-label (and optionally title) to describe the action, e.g. "Refresh system status", while keeping the compact styling.

Example:

-          <Button
+          <Button
             variant="outline"
             size="sm"
             onClick={handleRefresh}
             disabled={isRefreshing}
             className="h-7 px-2"
+            aria-label="Refresh system status"
+            title="Refresh system status"
           >

This keeps keyboard navigation intact and improves accessibility for assistive technologies.

As per coding guidelines, interactive components should include appropriate ARIA labels.

internal/service/resource/service_test.go (2)

31-66: Lifecycle test covers key behaviors; consider making wait more robust

TestService_Lifecycle validates idempotent Start, periodic collection, and clean Stop, which is good coverage. The fixed time.Sleep(150 * time.Millisecond) should be enough with a 50ms interval and immediate initial collect, but you might consider require.Eventually around the history length check to make the test more resilient to slow CI environments.

Example sketch:

require.Eventually(t, func() bool {
  h := svc.GetHistory(time.Hour)
  return h != nil && len(h.CPU) >= 1
}, time.Second, 50*time.Millisecond)

Not required, but would reduce timing sensitivity.


85-103: StopWithoutStart test can drop manual done channel closing

Given the current Stop implementation skips waiting on s.done when s.cancel is nil, pre‑closing svc.done is no longer needed and the comment is now misleading. Simplifying the test will better reflect the real behavior and avoid white‑box coupling to the internal done channel.

 func TestService_StopWithoutStart(t *testing.T) {
   t.Parallel()

   cfg := &config.Config{
     Monitoring: config.MonitoringConfig{
       Retention: time.Hour,
       Interval:  time.Second,
     },
   }

   svc := NewService(cfg)
   ctx := context.Background()

-  // Pre-close done channel to prevent blocking
-  close(svc.done)
-
   err := svc.Stop(ctx)
   require.NoError(t, err)
 }
internal/service/resource/service.go (1)

25-33: Guard against zero or negative monitoring intervals to avoid ticker panics

loop creates a ticker with time.NewTicker(s.config.Monitoring.Interval). If Monitoring.Interval is ever zero or negative due to misconfiguration, this will panic at runtime. Even if higher-level config code intends to enforce a positive interval, adding a small guard here would make the service more robust and self-contained.

For example:

func NewService(cfg *config.Config) *Service {
  if cfg == nil {
    panic("config must not be nil")
  }
+ if cfg.Monitoring.Interval <= 0 {
+   // fall back to a safe default, or panic with a clearer message
+   cfg.Monitoring.Interval = 10 * time.Second
+ }
  return &Service{
    config: cfg,
    store:  NewMemoryStore(cfg.Monitoring.Retention),
    done:   make(chan struct{}),
  }
}

Or, if you prefer a hard failure, update the panic message and explicitly check Interval to surface configuration errors early.

Please double-check whether the config layer already guarantees a positive Monitoring.Interval; if so, documenting that assumption here would also help future readers.

Also applies to: 69-74

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 083979e and 4bdf728.

📒 Files selected for processing (5)
  • internal/service/frontend/api/v2/resources_test.go (1 hunks)
  • internal/service/resource/service.go (1 hunks)
  • internal/service/resource/service_test.go (1 hunks)
  • internal/service/resource/store_test.go (1 hunks)
  • ui/src/pages/system-status/index.tsx (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/service/frontend/api/v2/resources_test.go
🧰 Additional context used
📓 Path-based instructions (4)
ui/**/*.{ts,tsx,jsx,js}

📄 CodeRabbit inference engine (ui/CLAUDE.md)

ui/**/*.{ts,tsx,jsx,js}: Use developer-centric UI design with high information density, minimal whitespace, compact components, and no unnecessary decorations
Support both light and dark modes for all UI components using Tailwind CSS class pairs like dark:bg-slate-700
NEVER use full-page loading overlays or LoadingIndicator components that hide content - show stale data while fetching updates instead
Use compact modal design with small headers, minimal padding (p-2 or p-3), tight spacing, and support keyboard navigation (arrows, enter, escape)
Use small heights for form elements: select boxes h-7 or smaller, buttons h-7 or h-8, inputs with compact padding (py-0.5 or py-1)
Minimize row heights in tables and lists while maintaining readability, merge related columns, and always handle long text with whitespace-normal break-words
Use consistent metadata styling with bg-slate-200 dark:bg-slate-700 backgrounds and maintain text hierarchy with primary/secondary/muted text colors
Use flexbox-first layouts with min-h-0 and overflow-hidden to prevent layout breaks, account for fixed elements when setting heights
Maintain keyboard navigation support in all interactive components with appropriate focus indicators and ARIA labels

Files:

  • ui/src/pages/system-status/index.tsx
ui/**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

ui/**/*.{ts,tsx}: The React + TypeScript frontend resides in ui/, with production bundles copied to internal/service/frontend/assets by make ui
UI code follows ESLint + Prettier (2-space indent) and Tailwind utilities; name React components in PascalCase (JobList.tsx) and hooks with use* (useJobs.ts)

Files:

  • ui/src/pages/system-status/index.tsx
**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*.go: Backend entrypoint in cmd/ orchestrates the scheduler and CLI; runtime, persistence, and service layers sit under internal/* (for example internal/runtime, internal/persistence)
Keep Go files gofmt/goimports clean; use tabs, PascalCase for exported symbols (SchedulerClient), lowerCamelCase for locals, and Err... names for package-level errors
Repository linting relies on golangci-lint; prefer idiomatic Go patterns, minimal global state, and structured logging helpers in internal/common

Files:

  • internal/service/resource/service.go
  • internal/service/resource/store_test.go
  • internal/service/resource/service_test.go
**/*_test.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*_test.go: Co-locate Go tests as *_test.go; favour table-driven cases and cover failure paths
Use stretchr/testify/require and shared fixtures from internal/test instead of duplicating mocks

Files:

  • internal/service/resource/store_test.go
  • internal/service/resource/service_test.go
🧠 Learnings (3)
📚 Learning: 2025-12-04T10:34:17.051Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T10:34:17.051Z
Learning: Applies to ui/**/*.{ts,tsx} : The React + TypeScript frontend resides in `ui/`, with production bundles copied to `internal/service/frontend/assets` by `make ui`

Applied to files:

  • ui/src/pages/system-status/index.tsx
📚 Learning: 2025-12-04T10:34:01.023Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: ui/CLAUDE.md:0-0
Timestamp: 2025-12-04T10:34:01.023Z
Learning: Applies to ui/**/*.{ts,tsx,jsx,js} : NEVER use full-page loading overlays or LoadingIndicator components that hide content - show stale data while fetching updates instead

Applied to files:

  • ui/src/pages/system-status/index.tsx
📚 Learning: 2025-12-04T10:34:17.051Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T10:34:17.051Z
Learning: Applies to **/*_test.go : Co-locate Go tests as `*_test.go`; favour table-driven cases and cover failure paths

Applied to files:

  • internal/service/resource/store_test.go
  • internal/service/resource/service_test.go
🧬 Code graph analysis (4)
ui/src/pages/system-status/index.tsx (3)
ui/src/lib/utils.ts (1)
  • cn (4-6)
api/v2/api.gen.go (2)
  • SchedulerInstance (735-747)
  • CoordinatorInstance (163-178)
ui/src/pages/dags/index.tsx (1)
  • mutate (213-213)
internal/service/resource/service.go (2)
internal/service/resource/store.go (3)
  • Store (24-27)
  • NewMemoryStore (41-45)
  • ResourceHistory (16-21)
internal/common/config/path.go (1)
  • Paths (13-30)
internal/service/resource/store_test.go (2)
internal/service/resource/store.go (2)
  • NewMemoryStore (41-45)
  • MetricPoint (10-13)
api/v2/api.gen.go (1)
  • MetricPoint (519-523)
internal/service/resource/service_test.go (5)
internal/runtime/data.go (1)
  • Parallel (69-73)
internal/common/config/config.go (1)
  • MonitoringConfig (45-50)
internal/common/logger/tag/tag.go (1)
  • Interval (217-219)
internal/service/resource/service.go (1)
  • NewService (25-34)
internal/common/config/path.go (1)
  • Paths (13-30)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Test on ubuntu-latest
🔇 Additional comments (4)
ui/src/pages/system-status/index.tsx (2)

24-41: Resource polling and manual refresh wiring looks correct

Scheduler, coordinator, and resource queries all use the same remoteNode parameter and refreshInterval: autoRefresh ? 5000 : 0, and handleRefresh revalidates all three via Promise.all with a try/finally to clear isRefreshing. This matches the PR’s intent and avoids stale data or stuck loading state.

Also applies to: 43-77, 79-89


168-200: Resource Usage section integrates cleanly with resource history data

The four ResourceChart instances correctly bind to resourceData.{cpu,memory,disk,load} and reuse consistent isLoading/error conditions. The compact grid layout (md:grid-cols-2 lg:grid-cols-4 with tight gaps) matches the high‑density UI guideline while keeping the section readable.

internal/service/resource/store_test.go (1)

11-158: Comprehensive coverage of MemoryStore behavior

The tests exercise all critical behaviors of MemoryStore and filterPoints: per-metric mapping, ordering, time-window filtering, pruning via lastPruned, empty history semantics, and slice copy/aliasing guarantees. The use of distinct values and targeted assertions makes regressions easy to catch, and the parallelization is safe since each test uses its own store instance.

internal/service/resource/service.go (1)

17-33: Service lifecycle and metric collection logic look solid

The Service struct, Start/Stop lifecycle, and loop/collect logic are coherent:

  • Mutex protects cancel against concurrent Start/Stop.
  • Stop no-ops when never started and safely waits on done only when a loop is running.
  • loop collects once immediately, then on each ticker tick, and always closes done on exit.
  • collect degrades gracefully by logging metric errors and still appending a sample to the store.

This matches the intended behavior and integrates cleanly with the in-memory Store.

Also applies to: 69-86, 88-119

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ui/src/pages/system-status/index.tsx (1)

204-209: Timestamp shows render time, not actual data update time.

Line 206 computes new Date().toLocaleTimeString() on every render, which displays the component render time rather than when data was last fetched. This can mislead users if auto-refresh is off or if fetches fail, as the timestamp will update even when data hasn't.

Consider tracking the actual fetch timestamp by capturing it in the query callbacks or storing it in state when mutations succeed:

+  const [lastUpdateTime, setLastUpdateTime] = React.useState<Date>(new Date());
+
   const handleRefresh = async () => {
     setIsRefreshing(true);
     try {
       await Promise.all([
         mutateResource(),
         mutateScheduler(),
         mutateCoordinator(),
       ]);
+      setLastUpdateTime(new Date());
     } finally {
       setIsRefreshing(false);
     }
   };

   ...

   <div className="text-xs text-muted-foreground text-center">
-    Last updated: {new Date().toLocaleTimeString()}
+    Last updated: {lastUpdateTime.toLocaleTimeString()}
     {autoRefresh && ' • Refreshing every 5 seconds'}
   </div>

Then update lastUpdateTime when auto-refresh succeeds as well (e.g., via the onSuccess callback in useQuery options, if available).

🧹 Nitpick comments (1)
internal/service/resource/service.go (1)

58-69: LGTM with optional note on restart support.

The nil-check on line 60 correctly prevents the deadlock identified in previous review. The mutex unlock before waiting on done (line 62) avoids holding the lock during the blocking wait.

Note: The service does not currently support restart (calling Start after Stop) because cancel is never reset to nil and the done channel is never recreated. This is acceptable for a single-lifecycle service (typical server pattern), but if restart is needed in the future, you would need to reset state in Stop or guard against it in Start.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4bdf728 and cf4d56a.

📒 Files selected for processing (3)
  • internal/service/resource/service.go (1 hunks)
  • internal/service/resource/service_test.go (1 hunks)
  • ui/src/pages/system-status/index.tsx (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/service/resource/service_test.go
🧰 Additional context used
📓 Path-based instructions (3)
**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*.go: Backend entrypoint in cmd/ orchestrates the scheduler and CLI; runtime, persistence, and service layers sit under internal/* (for example internal/runtime, internal/persistence)
Keep Go files gofmt/goimports clean; use tabs, PascalCase for exported symbols (SchedulerClient), lowerCamelCase for locals, and Err... names for package-level errors
Repository linting relies on golangci-lint; prefer idiomatic Go patterns, minimal global state, and structured logging helpers in internal/common

Files:

  • internal/service/resource/service.go
ui/**/*.{ts,tsx,jsx,js}

📄 CodeRabbit inference engine (ui/CLAUDE.md)

ui/**/*.{ts,tsx,jsx,js}: Use developer-centric UI design with high information density, minimal whitespace, compact components, and no unnecessary decorations
Support both light and dark modes for all UI components using Tailwind CSS class pairs like dark:bg-slate-700
NEVER use full-page loading overlays or LoadingIndicator components that hide content - show stale data while fetching updates instead
Use compact modal design with small headers, minimal padding (p-2 or p-3), tight spacing, and support keyboard navigation (arrows, enter, escape)
Use small heights for form elements: select boxes h-7 or smaller, buttons h-7 or h-8, inputs with compact padding (py-0.5 or py-1)
Minimize row heights in tables and lists while maintaining readability, merge related columns, and always handle long text with whitespace-normal break-words
Use consistent metadata styling with bg-slate-200 dark:bg-slate-700 backgrounds and maintain text hierarchy with primary/secondary/muted text colors
Use flexbox-first layouts with min-h-0 and overflow-hidden to prevent layout breaks, account for fixed elements when setting heights
Maintain keyboard navigation support in all interactive components with appropriate focus indicators and ARIA labels

Files:

  • ui/src/pages/system-status/index.tsx
ui/**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

ui/**/*.{ts,tsx}: The React + TypeScript frontend resides in ui/, with production bundles copied to internal/service/frontend/assets by make ui
UI code follows ESLint + Prettier (2-space indent) and Tailwind utilities; name React components in PascalCase (JobList.tsx) and hooks with use* (useJobs.ts)

Files:

  • ui/src/pages/system-status/index.tsx
🧠 Learnings (2)
📚 Learning: 2025-12-04T10:34:17.051Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T10:34:17.051Z
Learning: Applies to ui/**/*.{ts,tsx} : The React + TypeScript frontend resides in `ui/`, with production bundles copied to `internal/service/frontend/assets` by `make ui`

Applied to files:

  • ui/src/pages/system-status/index.tsx
📚 Learning: 2025-12-04T10:34:01.023Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: ui/CLAUDE.md:0-0
Timestamp: 2025-12-04T10:34:01.023Z
Learning: Applies to ui/**/*.{ts,tsx,jsx,js} : NEVER use full-page loading overlays or LoadingIndicator components that hide content - show stale data while fetching updates instead

Applied to files:

  • ui/src/pages/system-status/index.tsx
🧬 Code graph analysis (1)
ui/src/pages/system-status/index.tsx (3)
ui/src/lib/utils.ts (1)
  • cn (4-6)
api/v2/api.gen.go (2)
  • SchedulerInstance (735-747)
  • CoordinatorInstance (163-178)
ui/src/pages/dags/index.tsx (1)
  • mutate (213-213)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test on ubuntu-latest
  • GitHub Check: Build
🔇 Additional comments (10)
ui/src/pages/system-status/index.tsx (6)

1-10: LGTM!

Imports are well-organized and include the necessary icons and components for resource monitoring. Type imports follow TypeScript best practices.


24-77: Auto-refresh properly implemented for all data sources.

The refreshInterval is now correctly configured for scheduler, coordinator, and resource queries, ensuring consistent auto-refresh behavior across all displayed data.


79-90: Manual refresh now updates all data sources correctly.

The refresh handler properly awaits all three data sources and uses try/finally to ensure the refreshing state is cleared even if mutations fail. Good error handling.


92-132: LGTM!

The header controls follow the coding guidelines with compact design (h-7), dark mode support, proper ARIA attributes for accessibility, and clear visual feedback for the auto-refresh state.


134-168: LGTM!

The services grid correctly maps scheduler and coordinator data, includes the port field for coordinators, and properly handles loading and error states per component without blocking the entire page.


170-202: LGTM!

The resource usage section uses a responsive grid layout with distinct colors for each metric. Each chart properly handles loading and error states independently, following the guideline to avoid full-page loading overlays.

internal/service/resource/service.go (4)

1-40: LGTM! Struct and constructor properly implement synchronization.

The addition of sync.Mutex at line 22 addresses the previous concurrency concern. The constructor appropriately validates the monitoring interval (lines 32-34), guards against nil config, and initializes the store and done channel. The separation of cancel initialization (deferred to Start) is correct.


42-56: LGTM! Start method properly synchronized.

The mutex acquisition (lines 43-44) and already-started check address the previous concurrency concern. The context creation and goroutine launch are correct.


71-92: LGTM! History access and collection loop are well-structured.

GetHistory correctly delegates to the store (line 72). The loop method properly defers channel close and ticker cleanup, collects an initial metric immediately (line 82) before entering the tick loop, and handles context cancellation cleanly.


94-126: LGTM! Metrics collection properly fixed and resilient.

The CPU interval fix (line 98: 100*time.Millisecond) addresses the previous critical issue where interval=0 returned meaningless data on the first call. All four metrics (CPU, memory, disk, load) are collected with appropriate error handling that logs failures but continues collection—a reasonable resilience trade-off. The bounds check on line 100 prevents index-out-of-range errors.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
ui/src/pages/system-status/index.tsx (2)

95-99: Consider tracking all data sources for timestamp updates.

The lastUpdateTime updates only when resourceData changes, but during auto-refresh all three data sources (scheduler, coordinator, resources) refresh independently. While they typically complete within milliseconds of each other, the timestamp could occasionally be slightly inaccurate if one query is slower.

For perfect consistency, consider updating the effect to track all three sources:

  React.useEffect(() => {
-   if (resourceData) {
+   if (resourceData || schedulerData || coordinatorData) {
      setLastUpdateTime(new Date());
    }
- }, [resourceData]);
+ }, [resourceData, schedulerData, coordinatorData]);

Note: The current approach works well in practice since queries refresh synchronously. The above is a minor refinement for theoretical completeness.


113-126: Add ARIA label for consistency and accessibility.

The auto-refresh button lacks an aria-label, while the refresh button below (lines 133-134) includes both aria-label and title. For consistency and to follow the coding guideline "Maintain keyboard navigation support in all interactive components with appropriate focus indicators and ARIA labels," consider adding an accessible label.

  <Button
    variant="outline"
    size="sm"
    onClick={() => setAutoRefresh(!autoRefresh)}
    className={cn(
      'h-7 px-2',
      autoRefresh && 'bg-green-50 dark:bg-green-950 border-green-500'
    )}
+   aria-label={`Auto-refresh ${autoRefresh ? 'enabled' : 'disabled'}`}
+   title={`Toggle auto-refresh (currently ${autoRefresh ? 'ON' : 'OFF'})`}
  >
    <Activity
      className={cn('h-3 w-3 mr-1', autoRefresh && 'text-green-500')}
    />
    <span className="text-xs">Auto: {autoRefresh ? 'ON' : 'OFF'}</span>
  </Button>
internal/service/resource/store_test.go (3)

12-36: Good coverage for Add/Get mapping and ordering

This test nicely checks both per‑metric lengths and CPU value ordering, and also validates that the different arguments map to the correct metric slices. You might optionally also assert the second values for Memory/Disk/Load for full symmetry, but as‑is the intent is clear and sufficient.


64-88: Prune behavior test is effective but a bit white‑box

Forcing pruning by manipulating lastPruned and then inspecting store.cpu under the lock is an effective way to exercise the pruning path, but it does couple the test to internal fields.

If you want to make this more black‑box and resilient to internal refactors, you could instead assert via GetHistory after the second Add, e.g., checking that only the newer point remains in the history.


118-164: Strengthen filterPoints copy/aliasing tests

The table‑driven subtests exercise the cutoff logic well, and the additional checks for nil on empty input and “no copy when idx == 0 && copySlice == false” are nice. The last scenario, however, is slightly under‑asserted:

  • The comment says // Copy when idx > 0 even with copySlice=false, but the test currently only checks Len(result) == 2, not that the returned slice is actually a copy and not an alias of points.

You can tighten this by asserting non‑aliasing in that case as well:

-	// Copy when idx > 0 even with copySlice=false
-	result = filterPoints(points, now-30, false)
-	assert.Len(t, result, 2)
+	// Copy when idx > 0 even with copySlice=false: should not alias original slice
+	result = filterPoints(points, now-30, false)
+	assert.Len(t, result, 2)
+	assert.NotSame(t, &points[2], &result[0], "should return a copied slice when dropping prefix")

This makes the intended contract of filterPoints fully enforced by the tests. The overall use of a table‑driven test here aligns well with the project’s testing guidelines. Based on learnings, this follows the preferred style for *_test.go files.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cf4d56a and 06bef7e.

📒 Files selected for processing (2)
  • internal/service/resource/store_test.go (1 hunks)
  • ui/src/pages/system-status/index.tsx (5 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*.go: Backend entrypoint in cmd/ orchestrates the scheduler and CLI; runtime, persistence, and service layers sit under internal/* (for example internal/runtime, internal/persistence)
Keep Go files gofmt/goimports clean; use tabs, PascalCase for exported symbols (SchedulerClient), lowerCamelCase for locals, and Err... names for package-level errors
Repository linting relies on golangci-lint; prefer idiomatic Go patterns, minimal global state, and structured logging helpers in internal/common

Files:

  • internal/service/resource/store_test.go
**/*_test.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*_test.go: Co-locate Go tests as *_test.go; favour table-driven cases and cover failure paths
Use stretchr/testify/require and shared fixtures from internal/test instead of duplicating mocks

Files:

  • internal/service/resource/store_test.go
ui/**/*.{ts,tsx,jsx,js}

📄 CodeRabbit inference engine (ui/CLAUDE.md)

ui/**/*.{ts,tsx,jsx,js}: Use developer-centric UI design with high information density, minimal whitespace, compact components, and no unnecessary decorations
Support both light and dark modes for all UI components using Tailwind CSS class pairs like dark:bg-slate-700
NEVER use full-page loading overlays or LoadingIndicator components that hide content - show stale data while fetching updates instead
Use compact modal design with small headers, minimal padding (p-2 or p-3), tight spacing, and support keyboard navigation (arrows, enter, escape)
Use small heights for form elements: select boxes h-7 or smaller, buttons h-7 or h-8, inputs with compact padding (py-0.5 or py-1)
Minimize row heights in tables and lists while maintaining readability, merge related columns, and always handle long text with whitespace-normal break-words
Use consistent metadata styling with bg-slate-200 dark:bg-slate-700 backgrounds and maintain text hierarchy with primary/secondary/muted text colors
Use flexbox-first layouts with min-h-0 and overflow-hidden to prevent layout breaks, account for fixed elements when setting heights
Maintain keyboard navigation support in all interactive components with appropriate focus indicators and ARIA labels

Files:

  • ui/src/pages/system-status/index.tsx
ui/**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

ui/**/*.{ts,tsx}: The React + TypeScript frontend resides in ui/, with production bundles copied to internal/service/frontend/assets by make ui
UI code follows ESLint + Prettier (2-space indent) and Tailwind utilities; name React components in PascalCase (JobList.tsx) and hooks with use* (useJobs.ts)

Files:

  • ui/src/pages/system-status/index.tsx
🧠 Learnings (1)
📚 Learning: 2025-12-04T10:34:17.051Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T10:34:17.051Z
Learning: Applies to **/*_test.go : Co-locate Go tests as `*_test.go`; favour table-driven cases and cover failure paths

Applied to files:

  • internal/service/resource/store_test.go
🧬 Code graph analysis (2)
internal/service/resource/store_test.go (2)
internal/service/resource/store.go (2)
  • NewMemoryStore (41-45)
  • MetricPoint (10-13)
api/v2/api.gen.go (1)
  • MetricPoint (519-523)
ui/src/pages/system-status/index.tsx (2)
ui/src/lib/utils.ts (1)
  • cn (4-6)
api/v2/api.gen.go (2)
  • SchedulerInstance (735-747)
  • CoordinatorInstance (163-178)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test on ubuntu-latest
  • GitHub Check: Build
🔇 Additional comments (6)
ui/src/pages/system-status/index.tsx (2)

146-176: Well-structured service card rendering with proper data mapping.

The implementation correctly:

  • Maps API responses to ServiceCard instances with appropriate property extraction
  • Provides || [] fallbacks for undefined data
  • Uses loading state !data && !error to show stale data during refreshes (per coding guidelines)
  • Includes semantic icons (Calendar for Scheduler, Server for Coordinator)
  • Handles errors by passing them to child components

The data mappings align with the SchedulerInstance and CoordinatorInstance schemas from the API.


179-211: Excellent implementation of Resource Usage section.

The new resource monitoring section demonstrates good practices:

  • Semantic HTML with <h2> heading
  • Responsive grid layout (1→2→4 columns) following coding guidelines
  • Distinct, accessible colors for each metric (blue/violet/emerald/amber)
  • Consistent loading and error handling across all charts
  • Optional chaining (resourceData?.cpu) prevents runtime errors
  • Appropriate empty string unit for Load Average (dimensionless metric)
  • Compact spacing (gap-4) maintains high information density per guidelines

The implementation integrates cleanly with the existing system status page and provides clear visual differentiation between metric types.

internal/service/resource/store_test.go (4)

1-10: Imports and test dependencies look appropriate

Package name, imports (including testing/synctest and testify) and overall layout are idiomatic and gofmt/goimports‑friendly. No issues here.


38-62: Deterministic time‑window filtering test looks solid

Using synctest.Test with a clear age gap between the two points gives a robust check that GetHistory respects the duration cutoff semantics (all points vs only recent ones). The expectations on counts and returned values are clear and non‑flaky.


90-100: Empty history expectations are clear

Asserting that all metric slices are nil for an empty store clearly documents the intended API semantics for GetHistory. This is straightforward and useful for callers to rely on.


102-116: Copy semantics test for GetHistory is correct

This is a good, minimal check that GetHistory returns independent data: mutating one history does not affect another. Given MetricPoint is a value type, this is sufficient to catch slice aliasing bugs.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
ui/src/pages/system-status/index.tsx (3)

25-42: Unified auto‑refresh and manual refresh behavior looks solid; consider small robustness tweaks

The three useQuery calls now share the same remoteNode handling and refreshInterval, and handleRefresh correctly revalidates all three datasets in parallel while always clearing isRefreshing and updating lastUpdateTime on success. The side effect that bumps lastUpdateTime when resourceData changes nicely keeps the “Last updated” label in sync with auto‑refresh.

Two optional polish ideas:

  • If you care about avoiding unhandled promise rejections from mutate*, you could add a catch branch (e.g., log or surface a toast) while still keeping the finally for setIsRefreshing(false).
  • If you ever see cases where resources fail but services succeed, you might want to base lastUpdateTime on any of the three datasets changing, rather than only resourceData.

These are non‑blocking; the current behavior is already coherent.

Also applies to: 44-60, 62-78, 82-90, 94-99, 217-217


118-123: Improve toggle semantics for the auto‑refresh button for assistive tech

The visual and textual indication of auto‑refresh state is clear, but for screen readers this would be more robust if the button exposed itself as a toggle:

  • Add aria-pressed={autoRefresh} to the auto‑refresh button so AT can treat it as a toggle button rather than a generic action.

Everything else about the controls (compact size, labels, titles, and spinner on refresh) fits the guidelines well.

Also applies to: 124-128, 135-140


181-213: Resource Usage grid is compact and handles loading/error states appropriately

The 1×2/4 grid with four ResourceChart instances is compact and information‑dense, and using isLoading={!resourceData && !resourceError} ensures you only show a loading state before the first payload while keeping charts visible during refreshes, as per the “stale data over blocking spinners” guideline. Passing distinct colors per metric also improves quick scanning.

If ResourceChart doesn’t already encode units from the data itself, you might consider explicitly passing units for each metric (e.g., "%" for CPU/load, "MiB" or similar for memory, "GiB"/"%" for disk) to make axes and legends clearer.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 06bef7e and 607bf1d.

📒 Files selected for processing (2)
  • internal/service/resource/store_test.go (1 hunks)
  • ui/src/pages/system-status/index.tsx (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/service/resource/store_test.go
🧰 Additional context used
📓 Path-based instructions (2)
ui/**/*.{ts,tsx,jsx,js}

📄 CodeRabbit inference engine (ui/CLAUDE.md)

ui/**/*.{ts,tsx,jsx,js}: Use developer-centric UI design with high information density, minimal whitespace, compact components, and no unnecessary decorations
Support both light and dark modes for all UI components using Tailwind CSS class pairs like dark:bg-slate-700
NEVER use full-page loading overlays or LoadingIndicator components that hide content - show stale data while fetching updates instead
Use compact modal design with small headers, minimal padding (p-2 or p-3), tight spacing, and support keyboard navigation (arrows, enter, escape)
Use small heights for form elements: select boxes h-7 or smaller, buttons h-7 or h-8, inputs with compact padding (py-0.5 or py-1)
Minimize row heights in tables and lists while maintaining readability, merge related columns, and always handle long text with whitespace-normal break-words
Use consistent metadata styling with bg-slate-200 dark:bg-slate-700 backgrounds and maintain text hierarchy with primary/secondary/muted text colors
Use flexbox-first layouts with min-h-0 and overflow-hidden to prevent layout breaks, account for fixed elements when setting heights
Maintain keyboard navigation support in all interactive components with appropriate focus indicators and ARIA labels

Files:

  • ui/src/pages/system-status/index.tsx
ui/**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

ui/**/*.{ts,tsx}: The React + TypeScript frontend resides in ui/, with production bundles copied to internal/service/frontend/assets by make ui
UI code follows ESLint + Prettier (2-space indent) and Tailwind utilities; name React components in PascalCase (JobList.tsx) and hooks with use* (useJobs.ts)

Files:

  • ui/src/pages/system-status/index.tsx
🧠 Learnings (1)
📚 Learning: 2025-12-04T10:34:17.051Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T10:34:17.051Z
Learning: Applies to ui/**/*.{ts,tsx} : The React + TypeScript frontend resides in `ui/`, with production bundles copied to `internal/service/frontend/assets` by `make ui`

Applied to files:

  • ui/src/pages/system-status/index.tsx
🧬 Code graph analysis (1)
ui/src/pages/system-status/index.tsx (2)
ui/src/lib/utils.ts (1)
  • cn (4-6)
api/v2/api.gen.go (2)
  • SchedulerInstance (735-747)
  • CoordinatorInstance (163-178)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test on ubuntu-latest
  • GitHub Check: Build
🔇 Additional comments (2)
ui/src/pages/system-status/index.tsx (2)

1-9: Imports and new timestamp state are consistent with the added functionality

The new imports (schema types, ResourceChart, ServiceCard, useQuery, icons) and the lastUpdateTime Date state are all used correctly later in the file; nothing stands out as redundant or inconsistent with the new resource-monitoring UI.

Also applies to: 19-19


150-157: ServiceCard instance mapping and loading/error flags are well structured

Mapping scheduler/coordinator payloads into plain instances arrays with a defensive || [] fallback, plus deriving isLoading from !data && !error, gives ServiceCard stable inputs and shows stale data while refreshing, which matches the UI guidelines. The field selection (instanceId, host, port, status, startedAt) is aligned with the API types from components['schemas'].

Also applies to: 166-174

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
internal/service/resource/store.go (3)

38-51: Clarify/guard retention semantics for zero or negative durations

If retention is <= 0, cutoff := now.Add(-s.retention).Unix() will be >= now, so prunePoints will tend to drop almost everything on each prune (and < 0 can effectively nuke all data). If that’s not intentional, consider validating retention in NewMemoryStore (e.g., reject or normalize non‑positive values) or at least document the behavior.


76-107: Consider explicit behavior for non‑positive history durations

With duration <= 0, cutoff := time.Now().Add(-duration).Unix() is >= now, so GetHistory will always return an empty ResourceHistory. Either ensure callers never pass non‑positive durations, or guard here (e.g., clamp to a minimum or interpret <=0 as “use full retention”) to avoid surprising behavior.


109-149: Document the sorted‑by‑time invariant relied on by prune/filter helpers

Both prunePoints and filterPoints use sort.Search assuming points is sorted by Timestamp ascending, which currently holds because Add only appends time.Now() samples. It’d be helpful to add a brief comment near points/dataPoint or Add noting this invariant (and that large clock skews could violate it) so future changes don’t accidentally break pruning/history.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 607bf1d and 08a2d42.

📒 Files selected for processing (2)
  • internal/service/resource/store.go (1 hunks)
  • internal/service/resource/store_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/service/resource/store_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*.go: Backend entrypoint in cmd/ orchestrates the scheduler and CLI; runtime, persistence, and service layers sit under internal/* (for example internal/runtime, internal/persistence)
Keep Go files gofmt/goimports clean; use tabs, PascalCase for exported symbols (SchedulerClient), lowerCamelCase for locals, and Err... names for package-level errors
Repository linting relies on golangci-lint; prefer idiomatic Go patterns, minimal global state, and structured logging helpers in internal/common

Files:

  • internal/service/resource/store.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Test on ubuntu-latest
🔇 Additional comments (2)
internal/service/resource/store.go (2)

9-27: Data model and store interface look clean and focused

MetricPoint, ResourceHistory, and Store are minimal, clear, and map well to time‑series resource metrics; nothing blocking here.


53-74: Add path and write‑side pruning look correct and thread‑safe

Write locking around append and prune, plus minute‑granularity pruning, is idiomatic and should keep memory growth bounded under concurrent callers; no issues from a concurrency standpoint.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
internal/service/resource/ringbuffer.go (1)

3-50: RingBuffer implementation looks correct; consider documenting concurrency and pointer semantics

The wraparound logic (head, count, and start calculation in ForEach) correctly produces oldest‑first iteration in both partial and full states, and Push is O(1) with fixed memory as intended. Given how this underpins the store, this is a solid, simple implementation.

Two minor, doc‑level suggestions:

  • Clarify in a comment that RingBuffer is not concurrency‑safe on its own and should be externally synchronized (as you do in MemoryStore), to avoid misuse elsewhere.
  • Since ForEach exposes *T pointing directly into the internal slice, it may be worth noting that callers must not retain those pointers beyond the call or mutate them concurrently with Push, to avoid surprising aliasing later.

No code changes strictly required; this is more about making the contract explicit for future callers.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 08a2d42 and 257ed68.

📒 Files selected for processing (4)
  • internal/service/resource/ringbuffer.go (1 hunks)
  • internal/service/resource/ringbuffer_test.go (1 hunks)
  • internal/service/resource/store.go (1 hunks)
  • internal/service/resource/store_test.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*.go: Backend entrypoint in cmd/ orchestrates the scheduler and CLI; runtime, persistence, and service layers sit under internal/* (for example internal/runtime, internal/persistence)
Keep Go files gofmt/goimports clean; use tabs, PascalCase for exported symbols (SchedulerClient), lowerCamelCase for locals, and Err... names for package-level errors
Repository linting relies on golangci-lint; prefer idiomatic Go patterns, minimal global state, and structured logging helpers in internal/common

Files:

  • internal/service/resource/ringbuffer.go
  • internal/service/resource/ringbuffer_test.go
  • internal/service/resource/store_test.go
  • internal/service/resource/store.go
**/*_test.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*_test.go: Co-locate Go tests as *_test.go; favour table-driven cases and cover failure paths
Use stretchr/testify/require and shared fixtures from internal/test instead of duplicating mocks

Files:

  • internal/service/resource/ringbuffer_test.go
  • internal/service/resource/store_test.go
🧠 Learnings (2)
📚 Learning: 2025-12-04T10:34:17.051Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T10:34:17.051Z
Learning: Applies to **/*_test.go : Co-locate Go tests as `*_test.go`; favour table-driven cases and cover failure paths

Applied to files:

  • internal/service/resource/ringbuffer_test.go
  • internal/service/resource/store_test.go
📚 Learning: 2025-12-04T10:34:17.051Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T10:34:17.051Z
Learning: Applies to **/*_test.go : Use `stretchr/testify/require` and shared fixtures from `internal/test` instead of duplicating mocks

Applied to files:

  • internal/service/resource/ringbuffer_test.go
  • internal/service/resource/store_test.go
🧬 Code graph analysis (3)
internal/service/resource/ringbuffer_test.go (2)
internal/runtime/data.go (1)
  • Parallel (69-73)
internal/service/resource/ringbuffer.go (1)
  • NewRingBuffer (12-19)
internal/service/resource/store_test.go (1)
internal/service/resource/store.go (2)
  • NewMemoryStore (44-46)
  • NewMemoryStoreWithInterval (49-57)
internal/service/resource/store.go (2)
api/v2/api.gen.go (2)
  • MetricPoint (519-523)
  • ResourceHistory (688-693)
internal/service/resource/ringbuffer.go (2)
  • RingBuffer (5-9)
  • NewRingBuffer (12-19)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test on ubuntu-latest
  • GitHub Check: Build
🔇 Additional comments (9)
internal/service/resource/ringbuffer_test.go (1)

1-107: LGTM! Comprehensive test coverage for RingBuffer.

The test suite thoroughly covers RingBuffer behavior including wraparound, early termination, empty buffers, invalid capacity handling, single-element buffers, and partial fills. All tests are properly marked for parallel execution and use clear assertions.

internal/service/resource/store.go (4)

8-35: LGTM! Clean type definitions.

The type hierarchy is well-structured: MetricPoint for individual measurements, ResourceHistory for aggregated metrics, Store interface for storage abstraction, and internal dataPoint for efficient storage. The distinction between internal and API types is appropriate.


37-57: LGTM! Well-designed constructors with sensible defaults.

The capacity calculation int(retention/interval) + 10 provides a reasonable buffer (e.g., 370 slots for 1-hour retention at 10s intervals), and the interval validation with a 10-second default prevents invalid configurations.


59-71: LGTM! Correct synchronization and efficient implementation.

The Add method properly uses Lock for write protection and stores data points efficiently. The "Zero allocations, O(1)" comment is accurate for steady-state operation after the ring buffer is initialized.


73-118: LGTM! Correct filtering and proper synchronization.

The GetHistory method correctly uses RLock for concurrent reads, handles empty buffers gracefully, and pre-allocates result slices based on a counting pass. The two-pass approach ensures optimal memory allocation, and returning new slices prevents aliasing issues.

internal/service/resource/store_test.go (4)

12-36: LGTM! Thorough test of basic Add and GetHistory operations.

The test properly validates ordering, metric mapping, and uses distinct values to catch potential errors. Good use of require for critical length assertions and assert for value checks.


64-74: LGTM! Clean test of empty history edge case.

Properly validates that an empty store returns empty metric slices for all resource types.


76-90: LGTM! Important test for copy semantics.

Validates that GetHistory returns independent copies, preventing aliasing issues that could lead to data corruption in concurrent scenarios.


92-102: LGTM! Validates capacity calculation and default behavior.

The test correctly verifies the capacity formula (360 + 10 = 370 for 1-hour retention at 10s intervals) and ensures invalid intervals default to 10 seconds. Accessing the unexported buffer.buffer field is acceptable for internal testing within the same package.

Comment thread internal/service/resource/store_test.go
@yohamta0

yohamta0 commented Dec 7, 2025

Copy link
Copy Markdown
Collaborator Author

@coderabbitai generate docstrings

@coderabbitai

coderabbitai Bot commented Dec 7, 2025

Copy link
Copy Markdown
✅ Actions performed

Initiated docstring generation; will generate only if new commits exist.

@coderabbitai

coderabbitai Bot commented Dec 7, 2025

Copy link
Copy Markdown

Note

Docstrings generation - SUCCESS
Generated docstrings for this pull request at #1462

coderabbitai Bot added a commit that referenced this pull request Dec 7, 2025
Docstrings generation was requested by @yottahmd.

* #1461 (comment)

The following files were modified:

* `api/v2/api.gen.go`
* `internal/cmd/server.go`
* `internal/cmd/startall.go`
* `internal/common/config/loader.go`
* `internal/service/frontend/api/v2/api.go`
* `internal/service/frontend/api/v2/resources.go`
* `internal/service/frontend/server.go`
* `internal/service/resource/ringbuffer.go`
* `internal/service/resource/service.go`
* `internal/service/resource/store.go`
* `ui/src/pages/system-status/index.tsx`
coderabbitai Bot and others added 2 commits December 8, 2025 01:13
Docstrings generation was requested by @yottahmd.

* #1461 (comment)

The following files were modified:

* `api/v2/api.gen.go`
* `internal/cmd/server.go`
* `internal/cmd/startall.go`
* `internal/common/config/loader.go`
* `internal/service/frontend/api/v2/api.go`
* `internal/service/frontend/api/v2/resources.go`
* `internal/service/frontend/server.go`
* `internal/service/resource/ringbuffer.go`
* `internal/service/resource/service.go`
* `internal/service/resource/store.go`
* `ui/src/pages/system-status/index.tsx`

<details>
<summary>These files were ignored</summary>

* `internal/common/config/loader_test.go`
* `internal/integration/queue_shell_test.go`
* `internal/service/frontend/api/v2/resources_test.go`
* `internal/service/resource/ringbuffer_test.go`
* `internal/service/resource/service_test.go`
* `internal/service/resource/store_test.go`

</details>

<details>
<summary>These file types are not supported</summary>

* `api/v2/api.yaml`
* `go.mod`

</details>

<details>
<summary>ℹ️ Note</summary><blockquote>

CodeRabbit cannot perform edits on its own pull requests yet.

</blockquote></details>

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
ui/src/pages/system-status/index.tsx (1)

103-108: Consider including all data sources in the dependency array for timestamp accuracy.

The effect only triggers on resourceData changes, so if scheduler or coordinator data updates independently (due to network timing differences), the timestamp won't reflect it. Since all queries share the same 5-second interval this is unlikely to cause noticeable issues, but for completeness you could include all three:

 React.useEffect(() => {
-  if (resourceData) {
+  if (resourceData || schedulerData || coordinatorData) {
     setLastUpdateTime(new Date());
   }
- }, [resourceData]);
+ }, [resourceData, schedulerData, coordinatorData]);

This is optional since the practical impact is minimal given identical refresh intervals.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 257ed68 and 6a8fd48.

📒 Files selected for processing (11)
  • api/v2/api.gen.go (12 hunks)
  • internal/cmd/server.go (2 hunks)
  • internal/cmd/startall.go (5 hunks)
  • internal/common/config/loader.go (4 hunks)
  • internal/service/frontend/api/v2/api.go (4 hunks)
  • internal/service/frontend/api/v2/resources.go (1 hunks)
  • internal/service/frontend/server.go (2 hunks)
  • internal/service/resource/ringbuffer.go (1 hunks)
  • internal/service/resource/service.go (1 hunks)
  • internal/service/resource/store.go (1 hunks)
  • ui/src/pages/system-status/index.tsx (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • internal/service/frontend/server.go
  • internal/service/frontend/api/v2/api.go
  • internal/service/resource/ringbuffer.go
  • internal/cmd/server.go
🧰 Additional context used
📓 Path-based instructions (3)
**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*.go: Backend entrypoint in cmd/ orchestrates the scheduler and CLI; runtime, persistence, and service layers sit under internal/* (for example internal/runtime, internal/persistence)
Keep Go files gofmt/goimports clean; use tabs, PascalCase for exported symbols (SchedulerClient), lowerCamelCase for locals, and Err... names for package-level errors
Repository linting relies on golangci-lint; prefer idiomatic Go patterns, minimal global state, and structured logging helpers in internal/common

Files:

  • internal/common/config/loader.go
  • internal/service/frontend/api/v2/resources.go
  • internal/service/resource/store.go
  • internal/cmd/startall.go
  • internal/service/resource/service.go
  • api/v2/api.gen.go
ui/**/*.{ts,tsx,jsx,js}

📄 CodeRabbit inference engine (ui/CLAUDE.md)

ui/**/*.{ts,tsx,jsx,js}: Use developer-centric UI design with high information density, minimal whitespace, compact components, and no unnecessary decorations
Support both light and dark modes for all UI components using Tailwind CSS class pairs like dark:bg-slate-700
NEVER use full-page loading overlays or LoadingIndicator components that hide content - show stale data while fetching updates instead
Use compact modal design with small headers, minimal padding (p-2 or p-3), tight spacing, and support keyboard navigation (arrows, enter, escape)
Use small heights for form elements: select boxes h-7 or smaller, buttons h-7 or h-8, inputs with compact padding (py-0.5 or py-1)
Minimize row heights in tables and lists while maintaining readability, merge related columns, and always handle long text with whitespace-normal break-words
Use consistent metadata styling with bg-slate-200 dark:bg-slate-700 backgrounds and maintain text hierarchy with primary/secondary/muted text colors
Use flexbox-first layouts with min-h-0 and overflow-hidden to prevent layout breaks, account for fixed elements when setting heights
Maintain keyboard navigation support in all interactive components with appropriate focus indicators and ARIA labels

Files:

  • ui/src/pages/system-status/index.tsx
ui/**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

ui/**/*.{ts,tsx}: The React + TypeScript frontend resides in ui/, with production bundles copied to internal/service/frontend/assets by make ui
UI code follows ESLint + Prettier (2-space indent) and Tailwind utilities; name React components in PascalCase (JobList.tsx) and hooks with use* (useJobs.ts)

Files:

  • ui/src/pages/system-status/index.tsx
🧠 Learnings (2)
📚 Learning: 2025-12-04T10:34:17.051Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T10:34:17.051Z
Learning: Applies to ui/**/*.{ts,tsx} : The React + TypeScript frontend resides in `ui/`, with production bundles copied to `internal/service/frontend/assets` by `make ui`

Applied to files:

  • ui/src/pages/system-status/index.tsx
📚 Learning: 2025-12-04T10:34:17.051Z
Learnt from: CR
Repo: dagu-org/dagu PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T10:34:17.051Z
Learning: Applies to **/*.go : Backend entrypoint in `cmd/` orchestrates the scheduler and CLI; runtime, persistence, and service layers sit under `internal/*` (for example `internal/runtime`, `internal/persistence`)

Applied to files:

  • internal/cmd/startall.go
🧬 Code graph analysis (4)
internal/service/frontend/api/v2/resources.go (3)
api/v2/api.gen.go (6)
  • GetResourceHistoryRequestObject (5790-5792)
  • GetResourceHistoryResponseObject (5794-5796)
  • GetResourceHistorydefaultJSONResponse (5807-5810)
  • Error (432-441)
  • GetResourceHistory200JSONResponse (5798-5798)
  • MetricPoint (519-523)
internal/common/logger/context.go (1)
  • Warn (45-47)
internal/service/resource/store.go (1)
  • MetricPoint (9-12)
internal/service/resource/store.go (2)
api/v2/api.gen.go (2)
  • MetricPoint (519-523)
  • ResourceHistory (688-693)
internal/service/resource/ringbuffer.go (2)
  • RingBuffer (5-9)
  • NewRingBuffer (14-21)
internal/cmd/startall.go (3)
internal/service/resource/service.go (1)
  • NewService (32-45)
internal/service/coordinator/service.go (1)
  • NewService (31-50)
internal/common/sock/server.go (1)
  • NewServer (36-44)
api/v2/api.gen.go (2)
internal/service/resource/store.go (2)
  • MetricPoint (9-12)
  • ResourceHistory (15-20)
internal/common/config/config.go (1)
  • RemoteNode (187-196)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Test on ubuntu-latest
🔇 Additional comments (26)
ui/src/pages/system-status/index.tsx (5)

1-10: LGTM!

Imports are well-organized and all appear to be used in the component. The mix of lucide-react icons, UI components, and hooks is appropriate for this feature.


34-87: LGTM!

Data fetching is correctly implemented with consistent auto-refresh across all three endpoints. The pattern of extracting data, error, and mutate from each query enables unified refresh behavior.


89-101: LGTM!

The handleRefresh function correctly uses try/finally to ensure isRefreshing is reset even on errors, and Promise.all refreshes all data sources together. The timestamp update after successful refresh is appropriate.


122-150: LGTM!

The header controls follow coding guidelines with compact h-7 buttons, proper ARIA labels/titles for accessibility, and dark mode support via dark:bg-green-950. The animated refresh icon provides good visual feedback during loading.


190-222: LGTM!

The resource usage section has a well-structured responsive grid layout (1→2→4 columns) and consistent props across all ResourceChart components. The color comments are helpful for maintainability.

internal/common/config/loader.go (3)

345-367: LGTM! Monitoring configuration parsing follows established patterns.

The monitoring configuration handling mirrors the existing scheduler configuration approach, with proper duration parsing, warning generation for invalid values, and sensible defaults (24h retention, 5s interval).


592-594: LGTM! Viper defaults align with buildConfig defaults.

The default values for monitoring retention and interval are consistent with the fallback values set in buildConfig (lines 363, 366).


697-699: LGTM! Environment variable bindings follow project conventions.

The monitoring configuration environment variables are bound consistently with other settings in the function.

internal/service/frontend/api/v2/resources.go (2)

13-48: LGTM! Handler implements proper validation and error handling.

The GetResourceHistory handler correctly:

  • Guards against missing resource service
  • Validates and constrains the duration parameter
  • Logs warnings for invalid input
  • Returns structured responses

The implementation follows the project's logging and error handling patterns.


50-61: LGTM! Clean metric conversion with proper pre-allocation.

The convertMetrics helper correctly maps internal metric points to API types with efficient memory allocation.

internal/service/resource/service.go (5)

17-23: LGTM! Synchronization issues from previous review have been addressed.

The Service struct now includes a sync.Mutex to protect concurrent access to the cancel field, resolving the data race concerns raised in earlier reviews.


47-61: LGTM! Start method now properly synchronized.

The mutex protection for the cancel field access addresses the concurrency concerns raised in previous reviews.


63-74: LGTM! Stop method now handles the not-started case.

The guard against waiting on done when the service was never started resolves the potential deadlock identified in earlier reviews.


80-97: LGTM! Loop implementation is clean and correct.

The immediate initial collection followed by ticker-driven sampling provides good user experience, and the defer cleanup ensures proper resource management.


99-131: LGTM! CPU collection now uses proper interval.

Line 103 correctly uses a 100ms interval for cpu.PercentWithContext, which resolves the invalid-first-reading issue identified in previous reviews. The error handling appropriately continues collecting remaining metrics when one fails.

internal/service/resource/store.go (5)

8-20: LGTM! Data types are well-defined and match API schema.

The MetricPoint and ResourceHistory structures appropriately model the resource metrics for API exposure.


22-41: LGTM! Store interface and MemoryStore design are sound.

The use of a generic RingBuffer for fixed-capacity storage and RWMutex for read-write concurrency control is appropriate for this use case.


43-61: LGTM! Store constructors are correctly implemented.

The capacity calculation in NewMemoryStoreWithInterval properly accounts for retention and sampling interval. The +10 buffer provides a safety margin.

Note: The service should use NewMemoryStoreWithInterval instead of NewMemoryStore to ensure correct buffer sizing (see comment in service.go).


63-75: LGTM! Add method is simple and efficient.

The implementation correctly protects the ring buffer write with a mutex and uses the current timestamp for each data point.


77-122: LGTM! GetHistory uses efficient two-pass approach.

The implementation correctly pre-allocates result slices by counting matching points first, avoiding slice growth during the second pass. The use of RLock allows concurrent reads.

internal/cmd/startall.go (2)

97-135: LGTM! Resource service integration is clean and correct.

The resource monitoring service is properly:

  • Initialized with the configuration
  • Passed to the server for API wiring
  • Started early in the lifecycle with appropriate error handling
  • Integrated into the service orchestration

216-218: LGTM! Resource service shutdown is properly integrated.

The resource service is stopped during graceful shutdown with appropriate error logging, maintaining the overall shutdown sequence integrity.

api/v2/api.gen.go (4)

518-523: ResourceHistory / MetricPoint models look consistent

Timestamp/value representation and JSON tags for CPU/memory/disk/load match the described API shape and mirror existing patterns in this file. No issues from the generated structs themselves.

Also applies to: 687-693


1453-1460: GetResourceHistory endpoint wiring is coherent end‑to‑end

Params, server interfaces (regular and strict), unimplemented stub, chi routing, middleware wrapper, and typed request/response objects for GET /services/resources/history all follow the same patterns as existing endpoints (e.g., coordinator/scheduler status). Optional remoteNode and duration query binding is handled via the standard runtime helpers. Nothing stands out as incorrect here.

Also applies to: 1750-1752, 1987-1991, 3985-4026, 4326-4328, 5790-5817, 5999-6001, 7135-7159


4193-4198: Expanded HandlerWithOptions doc comment is accurate and helpful

The comment correctly describes default router and error‑handler behavior and improves discoverability without changing semantics.


7216-7383: Updated embedded swaggerSpec appears to be regenerated

This base64+gzipped blob is clearly the re‑encoded OpenAPI spec reflecting the new resource history models/endpoint. Given it’s generator output and not meant for manual editing, there’s nothing actionable to review here beyond ensuring it was regenerated from the canonical api.yaml, which it appears to be.

If you want extra safety, you can verify by re‑running the OpenAPI generator from the spec and confirming this file is unchanged.

Comment thread internal/service/resource/service.go
@yohamta0 yohamta0 merged commit b4c857d into main Dec 7, 2025
5 checks passed
@yohamta0 yohamta0 deleted the resource-monitoring branch December 7, 2025 16:46
@codecov

codecov Bot commented Dec 8, 2025

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 81.98758% with 29 lines in your changes missing coverage. Please review.
✅ Project coverage is 60.01%. Comparing base (b2da767) to head (8e31d14).
⚠️ Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
internal/service/resource/service.go 81.96% 4 Missing and 7 partials ⚠️
internal/common/config/loader.go 57.89% 4 Missing and 4 partials ⚠️
internal/cmd/server.go 42.85% 2 Missing and 2 partials ⚠️
internal/cmd/startall.go 33.33% 2 Missing and 2 partials ⚠️
internal/service/resource/store.go 95.55% 1 Missing and 1 partial ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1461      +/-   ##
==========================================
+ Coverage   59.78%   60.01%   +0.23%     
==========================================
  Files         188      191       +3     
  Lines       21299    21473     +174     
==========================================
+ Hits        12733    12888     +155     
- Misses       7234     7240       +6     
- Partials     1332     1345      +13     
Files with missing lines Coverage Δ
internal/cmd/context.go 70.74% <100.00%> (ø)
internal/common/config/config.go 100.00% <ø> (ø)
internal/service/resource/ringbuffer.go 100.00% <100.00%> (ø)
internal/service/resource/store.go 95.55% <95.55%> (ø)
internal/cmd/server.go 78.94% <42.85%> (-8.56%) ⬇️
internal/cmd/startall.go 72.72% <33.33%> (-2.08%) ⬇️
internal/common/config/loader.go 90.38% <57.89%> (-1.48%) ⬇️
internal/service/resource/service.go 81.96% <81.96%> (ø)

... and 10 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b2da767...8e31d14. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant