logging: add clock-jump recovery and tighten Alloy service ordering #1677
Merged
brianmcgillion merged 1 commit into tiiuae:main on Jan 14, 2026
vunnyso reviewed on Jan 13, 2026
Branch force-pushed from ffc19de to c244e4a
vunnyso reviewed on Jan 14, 2026
- Add ghaf.logging.recovery options and shared clock-jump watcher + recover oneshot.
- Ensure alloy.service is ordered after/requires systemd-journald on client and server.
- Server pipeline: route journald through loki.process, drop entries older than 168h, and align WAL max_segment_age.

Signed-off-by: Everton de Matos <everton.dematos@tii.ae>
Branch force-pushed from c244e4a to 3ea543f
vunnyso approved these changes on Jan 14, 2026
brianmcgillion approved these changes on Jan 14, 2026
Description of Changes
This PR introduces a clock-jump recovery mechanism for Ghaf logging, designed to handle manual or abrupt realtime clock changes that may otherwise disrupt journald ordering and Alloy log shipping. This PR aims to resolve the bug described at https://jira.tii.ae/browse/SSRCSP-7772. Summary of modifications:
- Add ghaf.logging.recovery options and clock-jump watcher + recover oneshot services.
- Ensure alloy.service is ordered after/requires systemd-journald on client and server.
- The recovery logic is in modules/common/logging/common.nix and is reusable across all VMs.
- Enabled for admin-vm, as it aggregates and forwards the system logs. It can be enabled for other VMs with different parameters (e.g., thresholdSeconds, intervalSeconds).
- Server pipeline: route journald through loki.process, drop entries older than 168h, and align WAL max_segment_age. The WAL retention and the log-dropping policy (older_than = 168h) are aligned with each other and with the Grafana 7-day (168h) default policy.

Performance Evaluation
The ghaf-clock-jump-watcher.service was monitored in two 30-minute windows, scenarios (i) and (ii).
The following table summarizes the CPU and memory consumption results for both scenarios:
Graph for scenario (i):

Graph for scenario (ii):

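For readers unfamiliar with clock-jump detection, one common approach is to compare how far the realtime clock and the monotonic clock each advanced between two samples. The Python sketch below illustrates that general technique only; it is not the watcher implemented in this PR. The names `Sample`, `clock_jump_seconds`, and `is_clock_jump` are hypothetical, and `threshold_seconds` merely mirrors the `thresholdSeconds` option mentioned above, whose exact semantics are assumed here.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One paired reading of the realtime and monotonic clocks (seconds)."""
    realtime: float
    monotonic: float


def take_sample() -> Sample:
    # time.time() follows the settable wall clock; time.monotonic() does not.
    return Sample(realtime=time.time(), monotonic=time.monotonic())


def clock_jump_seconds(prev: Sample, cur: Sample) -> float:
    """Return how much further the realtime clock moved than the monotonic one.

    Positive means the wall clock jumped forward, negative means backward.
    """
    realtime_delta = cur.realtime - prev.realtime
    monotonic_delta = cur.monotonic - prev.monotonic
    return realtime_delta - monotonic_delta


def is_clock_jump(prev: Sample, cur: Sample, threshold_seconds: float) -> bool:
    # Assumption: a jump is any discrepancy at or above the threshold,
    # in either direction.
    return abs(clock_jump_seconds(prev, cur)) >= threshold_seconds
```

A polling loop would call `take_sample()` once per interval and trigger recovery when `is_clock_jump` returns true. Under this scheme the per-tick work is two clock reads and a subtraction, so a watcher built this way would be expected to stay cheap.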
Type of Change
Related Issues / Tickets
https://jira.tii.ae/browse/SSRCSP-7772
Checklist
- make-checks was run and it passes

Testing Instructions
Applicable Targets
- aarch64
- aarch64
- x86_64
- x86_64
- x86_64

Installation Method
- nixos-rebuild ... switch

Test Steps To Verify:
You can perform the exact same steps as described at https://jira.tii.ae/browse/SSRCSP-7772:
9.1. This step is not mandatory, as the system does it by default when back online