feat(apply): magic rollback by water-sucks · Pull Request #133 · nix-community/nixos-cli

water-sucks · 2025-11-14T02:58:45Z

Description

This PR implements a magic rollback mechanism that runs on the target, for cases where changes to connection-related settings (e.g. SSH, network interfaces) restart services and cause loss of access to the machine in question.

It offloads a ton of the switch logic into a supervisor bash script, which gets run as a systemd-run transient unit on the host. What makes this hugely different from deploy-rs is that it does not require modifying the destination closure whatsoever, so magic rollback can be used on systems without nixos-cli WHATSOEVER, including on a completely base NixOS system with no extra options.

Closes #119.

Summary by CodeRabbit

Release Notes

New Features
- Added --rollback-timeout and --no-rollback CLI flags for apply command to control automatic system rollback behavior.
- Introduced "magic rollback" for remote system deployments via activation supervisor with acknowledgement-based safety.
- Added new configuration options rollback.enable and rollback.timeout (replaces auto_rollback).
Documentation
- Updated apply command documentation to describe activation behavior and magic rollback for remote systems.
- Expanded debug mode environment variable documentation.
Tests
- Added comprehensive magic rollback test coverage for remote system deployments.

water-sucks · 2026-04-05T03:50:05Z

@coderabbitai review

coderabbitai · 2026-04-05T03:50:10Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai · 2026-04-05T04:03:11Z

Walkthrough

Implements magic rollback for remote system switches via an activation supervisor script and infrastructure. Adds systemd-duration-based rollback timeout configuration, CLI flags (--no-rollback, --rollback-timeout), and refactors generation/activation code to use system abstractions.

Changes

Cohort / File(s)	Summary
Activation Infrastructure `internal/activation/activation.go`, `internal/activation/supervisor.sh`	Introduces activation lock/success constants, `RollbackNixProfile` function, `RunActivationSupervisor` control flow with SSH monitoring and reconnection logic, and a POSIX shell supervisor script implementing acknowledgement-based rollback for remote switches with flock-based concurrency control.
Apply Command Rollback Logic `cmd/apply/apply.go`	Adds `--no-rollback` and `--rollback-timeout` CLI flags with validation, introduces conditional `RunActivationSupervisor` path for remote switches vs fallback direct switch path, tracks generation creation to gate rollback behavior, and modifies logging for activation failures.
Activate Command Logging `cmd/activate/activate.go`, `cmd/activate/run.go`	Adds `PreRun` hook to toggle logger verbosity based on `opts.Verbose`, refactors lockfile handling via `acquireLock` helper using new `ACTIVATION_LOCKFILE` constant, and updates specialisation collection to use system abstraction.
Generation Management `cmd/generation/delete/delete.go`, `cmd/generation/delete/resolver.go`, `cmd/generation/delete/resolver_test.go`, `cmd/generation/rollback/rollback.go`, `cmd/generation/switch/switch.go`, `cmd/generation/shared/utils.go`, `cmd/info/info.go`	Updates deletion flag handling from string to `SystemdDuration`, refactors `GenerationFromDirectory` calls to pass system abstraction, changes rollback gate from `AutoRollback` to `Rollback.Enable`, and updates completion logic for system-aware generation collection.
System Abstraction Expansion `internal/system/fs.go`, `internal/system/local_fs.go`, `internal/system/sftp_fs.go`, `internal/system/ssh.go`, `internal/system/elevator.go`	Extends `Filesystem` interface with `CreateFile`, `ReadDir`, `RealPath`, `Glob` methods; implements these in local and SFTP filesystem backends; replaces dynamic username-based password prompts with static format; updates SSH command building to use new `quoteAndJoin` helper.
Generation Refactoring `internal/generation/generation.go`, `internal/generation/specialisations.go`, `internal/generation/completion.go`	Adds `Path` field to `Generation` struct, updates `GenerationFromDirectory` and `CollectGenerationsInProfile` signatures to accept system abstraction, refactors specialisations collection and completions to use system-aware filesystem operations.
Configuration & Duration Handling `internal/cmd/opts/opts.go`, `internal/settings/settings.go`, `internal/settings/settings_test.go`, `internal/systemd/time.go`, `internal/systemd/time_test.go`, `internal/constants/constants.go`	Introduces `SystemdDuration` type with flag/koanf parsing support, adds `rollback` config section with `enable`/`timeout` fields, deprecates `AutoRollback` in favor of `Rollback.Enable`, updates `DurationFromTimeSpan` return type, adds `NixOSActivationDirectory` constant, and includes new `RemoveDefaultValueDesc` utility.
Logging Infrastructure `internal/logger/replay.go`	Introduces `ReplayLogger` type that buffers logging calls and replays them to an underlying logger via `Flush()`, enabling delayed/conditional log emission across context transitions.
Command Root & Utilities `cmd/root/root.go`, `internal/cmd/utils/utils.go`	Removes upfront config validation from `mainCommand`, moves it to `PersistentPreRunE` post-command-line flag application, and adds `RemoveDefaultValueDesc` utility for flag help text adjustment.
Documentation `doc/man/nixos-cli-apply.1.scd`, `doc/man/nixos-cli-env.5.scd`	Documents new `--no-rollback` and `--rollback-timeout` CLI flags, describes local vs remote activation behavior including magic rollback, and updates `NIXOS_CLI_DEBUG_MODE` environment variable documentation.
Test Infrastructure & Cases `nix/tests/default.nix`, `nix/tests/example.test.nix`, `nix/tests/magic-rollback.test.nix`, `nix/tests/resources/ssh-keys.nix`	Updates test harness to use `pkgs.testers.runNixOSTest`, refactors test module wrapping with `self` context passing, adds new magic rollback integration test provisioning deployer/target VMs with SSH setup and validation of rollback behavior on configuration failures.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

feat(apply): add support for remote building/activation #116 — Introduces ActivateOpts.Verbose flag and system abstraction threading throughout activate command, foundational for the verbose logging refactor and system-aware activation calls in this PR.
fix(ssh): perform root command elevation test once before password input #204 — Modifies root elevation abstraction (RootElevator) and AsRoot signatures used throughout activation supervisor and apply command logic in this PR.
refactor(log): generalize interface and add new log types #111 — Implements logger interface refactoring that enables context-based logger retrieval and SetLogLevel operations leveraged by the PreRun hook in activate command.

Suggested reviewers

Sporif

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 13.33% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'feat(apply): magic rollback' directly reflects the main change—implementing magic rollback functionality for the apply command.
Linked Issues check	✅ Passed	The PR implements all core objectives from issue `#119`: a remote magic rollback mechanism (via supervisor script), self-contained on remote side (supervisor.sh), watchdog detection of SSH disconnection, disabling flag (--no-rollback), and improvements over local auto-rollback.
Out of Scope Changes check	✅ Passed	All changes support the magic rollback feature or related refactoring: system abstraction updates, logging enhancements, configuration management, duration typing, documentation, and test infrastructure are all within scope.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.4)

level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies"

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

internal/systemd/time.go (1)
16-94: ⚠️ Potential issue | 🟠 Major

Harden DurationFromTimeSpan against malformed components.

Inputs such as `1` (a number with no unit) currently panic when the unit-skip loop walks past the end of the string, and oversized numeric components are silently accepted because the ParseInt error is ignored. This parser now backs both flag parsing and config unmarshalling, so bad user input should return an error instead of crashing or producing a nonsense duration. Regression cases for a unit-less `1` and for an overflow-sized component would help lock this down.
Possible hardening
 func DurationFromTimeSpan(span string) (SystemdDuration, error) {
@@
-		num, _ := strconv.ParseInt(span[numStart:i], 10, 64)
+		num, err := strconv.ParseInt(span[numStart:i], 10, 64)
+		if err != nil {
+			return 0, fmt.Errorf("invalid duration component %q: %w", span[numStart:i], err)
+		}
@@
-		for unicode.IsSpace(rune(span[i])) {
+		for i < spanLen && unicode.IsSpace(rune(span[i])) {
 			i += 1
 		}
+		if i >= spanLen {
+			return 0, fmt.Errorf("span components must have units")
+		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/systemd/time.go` around lines 16 - 94, DurationFromTimeSpan can
panic or accept invalid values: check and handle strconv.ParseInt errors, ensure
bounds after skipping spaces (i < len(span)) before reading unit and return a
clear error if unit is missing or empty (fix the loop that skips spaces so it
doesn't advance past end), and guard against numeric overflow when converting
num*durationUnit (return an error on overflow); update the parsing in
DurationFromTimeSpan to validate ParseInt's err, verify unitStart < spanLen and
unit contains letters, and detect overflow when computing totalDuration; add
regression tests for inputs like "1   " and an extremely large numeric
component.
internal/generation/generation.go (1)
180-184: ⚠️ Potential issue | 🟠 Major

Use filesystem abstraction for ReadLink to maintain consistency.

Line 181 uses os.Readlink directly while the rest of the function uses s.FS() methods for filesystem operations. This breaks the abstraction pattern and will fail for remote systems. The Filesystem interface includes a ReadLink method that should be used instead.
Fix: Use filesystem abstraction
-	currentGenerationLink, err := os.Readlink(currentGenerationDirname)
+	currentGenerationLink, err := s.FS().ReadLink(currentGenerationDirname)
 	if err != nil {
 		log.Warnf("unable to determine current generation: %v", err)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/generation/generation.go` around lines 180 - 184, The code uses
os.Readlink directly for currentGenerationDirname (assigning to
currentGenerationLink) which breaks the filesystem abstraction; replace
os.Readlink with the Filesystem abstraction method by calling
s.FS().ReadLink(currentGenerationDirname), propagate/handle the returned value
and error exactly as done elsewhere in this function, and remove the direct os
dependency so remote or mocked filesystems work consistently with the rest of
the code.
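To illustrate why routing `ReadLink` through the abstraction matters, here is a minimal sketch. The `linkReader` interface, `localFS`, `fakeFS`, and `currentGenerationLink` are hypothetical stand-ins and not the project's `Filesystem` interface; the profile path and link name are example values.

```go
package main

import (
	"fmt"
	"os"
	"path"
)

// linkReader is a hypothetical, minimal slice of a filesystem abstraction.
type linkReader interface {
	ReadLink(name string) (string, error)
}

// localFS resolves links on the machine running the program.
type localFS struct{}

func (localFS) ReadLink(name string) (string, error) { return os.Readlink(name) }

// fakeFS stands in for a remote (for example SFTP-backed) filesystem.
type fakeFS map[string]string

func (f fakeFS) ReadLink(name string) (string, error) {
	target, ok := f[name]
	if !ok {
		return "", fmt.Errorf("readlink %s: no such link", name)
	}
	return target, nil
}

// Both backends satisfy the same interface.
var (
	_ linkReader = localFS{}
	_ linkReader = fakeFS{}
)

// currentGenerationLink never touches the os package directly, so it
// behaves the same for local, remote, or mocked filesystems.
func currentGenerationLink(fs linkReader, profileDir string) (string, error) {
	return fs.ReadLink(path.Join(profileDir, "system"))
}

func main() {
	remote := fakeFS{"/nix/var/nix/profiles/system": "system-42-link"}
	target, err := currentGenerationLink(remote, "/nix/var/nix/profiles")
	fmt.Println(target, err)
}
```

A direct `os.Readlink` call inside `currentGenerationLink` would read the deployer's own profile rather than the target's, which is the inconsistency the comment points out.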

🧹 Nitpick comments (5)

internal/logger/replay.go (1)
6-7: Remove unused level state to avoid drift/confusion.

ReplayLogger.level is initialized but never read; either use it for internal filtering or remove it.
♻️ Optional cleanup
type ReplayLogger struct {
-	level   LogLevel
 	entries []logEntry
 	out     Logger
}

func NewReplayLogger(out Logger) *ReplayLogger {
 	return &ReplayLogger{
-		level:   LogLevelInfo,
 		entries: make([]logEntry, 0, 64),
 		out:     out,
 	}
}
Also applies to: 38-39
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/logger/replay.go` around lines 6 - 7, ReplayLogger has an unused
field ReplayLogger.level that should be removed to avoid state drift; remove the
level field from the struct definition and delete any code that initializes or
assigns to ReplayLogger.level (e.g., in the NewReplayLogger constructor or any
setters) and ensure only entries ([]logEntry) remains, updating any methods that
referenced level to use entries or external filtering instead; also remove
corresponding unused imports or tests that reference the removed field.
internal/activation/supervisor.sh (1)
74-100: Consider handling case where prevToplevel is empty.

If readlink /run/current-system fails (line 76), prevToplevel will be empty. The rollback() function would then construct an invalid path like /bin/switch-to-configuration and fail with a confusing error.
♻️ Proposed fix: Validate prevToplevel before rollback
 prevToplevel=$(readlink /run/current-system)
+if [ -z "$prevToplevel" ]; then
+	log "warning: could not determine previous system; rollback may fail"
+fi

 rollback() {
+	if [ -z "$prevToplevel" ]; then
+		log "cannot rollback: previous system toplevel is unknown"
+		exit 1
+	fi
+
 	# $ROLLBACK_PROFILE_ON_FAILURE will not be set when switching
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/activation/supervisor.sh` around lines 74 - 100, The rollback()
function uses prevToplevel (set from readlink /run/current-system) without
validating it, which can produce an invalid path; update rollback() to check if
prevToplevel is non-empty and points to an existing directory before
constructing prevSwitchToConfigurationScript and invoking it (verify the
resolved path exists and is executable); if prevToplevel is empty or invalid,
log a clear error via the same logging mechanism and exit with a non-zero status
instead of attempting to call "/bin/switch-to-configuration"; keep the existing
behavior for PREVIOUS_SPECIALISATION when prevToplevel is valid.
cmd/apply/apply.go (1)
702-749: Verify newGenerationCreated handles edge cases.

The logic compares symlink targets before and after profile creation to detect new generations. Consider these edge cases:

If ReadLink fails on prevLink but succeeds on afterLink, newGenerationCreated stays false (line 744 requires both non-empty)

If both links resolve to the same path (no new generation created), rollback is correctly skipped

This seems correct - when we can't determine the previous state, we conservatively don't rollback. However, the warning at line 716-717 might benefit from noting that rollback won't be attempted if the previous link couldn't be read.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/apply/apply.go` around lines 702 - 749, The ReadLink failure for prevLink
(when calling targetHost.FS().ReadLink(activeProfileLink)) should log that
because prevLink couldn't be determined a rollback will be skipped; update the
warning emitted where prevLink is read (referencing activeProfileLink, prevLink,
and ReadLink) to include a note that rollback of the system profile will not be
attempted when the previous link is unknown, and keep newGenerationCreated logic
(which compares prevLink and afterLink) unchanged.
internal/activation/activation.go (2)
595-598: Use %w for error wrapping consistency.

Line 597 uses %v while line 525 uses %w. Using %w enables proper error unwrapping with errors.Is() and errors.As().
♻️ Proposed fix
 		activationErr := <-activationComplete
 		if activationErr != nil {
-			return fmt.Errorf("activation supervisor exited with error: %v", activationErr)
+			return fmt.Errorf("activation supervisor exited with error: %w", activationErr)
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/activation/activation.go` around lines 595 - 598, Change the
fmt.Errorf call that reports errors read from the activationComplete channel to
use %w so the error is wrapped for unwrapping; specifically, where activationErr
:= <-activationComplete and you currently return fmt.Errorf("activation
supervisor exited with error: %v", activationErr), replace the format verb to
return fmt.Errorf("activation supervisor exited with error: %w", activationErr)
so tools like errors.Is()/errors.As() can inspect the original activationErr.
338-357: World-writable directory is a security consideration worth tracking.

The world-writable trigger directory (0o777) with sticky bit is documented as intentional to enable non-root activation. While the sticky bit prevents users from deleting each other's files, any local user can create trigger files, potentially allowing them to acknowledge activations they shouldn't control.

Consider opening an issue to track this for future hardening, perhaps using group-based permissions or a more restrictive mechanism when security is a concern.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/activation/activation.go` around lines 338 - 357, The
triggerDirectory in EnsureActivationDirectoryExists is intentionally created
world-writable (0o777|os.ModeSticky), which is a security concern to track; open
a tracking issue describing the risk and proposed hardening (e.g., group-based
permissions or configurable restrictiveness), then add a concise TODO comment
above the EnsureActivationDirectoryExists function referencing the new issue
number and summarizing why the directory is world-writable today and what should
be changed in future; optionally, also add a runtime warning log in
EnsureActivationDirectoryExists (mentioning triggerDirectory) that points to the
issue so operators are aware.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/activation/activation.go`:
- Around line 546-568: The goroutine can leak an SSH connection because
reconnectCh is created buffered (reconnectCh := make(chan *system.SSHSystem, 1))
so the send of s2 from s.(*system.SSHSystem).Clone() never blocks and s2 can
remain in the channel if the receiver times out; change reconnectCh to an
unbuffered channel (make(chan *system.SSHSystem)) so the send will block and the
goroutine's timeout branch (select with time.After(opts.AckTimeout)) will run
and call s2.Close(), or alternatively ensure the send path closes s2 when the
receiver times out by adding a coordinating select/cancel; update uses of
reconnectCh, Clone(), s2 and s2.Close() accordingly.

In `@internal/logger/replay.go`:
- Around line 173-175: ReplayLogger.SetLogLevel is currently a no-op so calls
like logger.SetLogLevel(logger.LogLevelDebug) (e.g., in PreRun) have no effect
when a ReplayLogger is active; implement SetLogLevel on type ReplayLogger to
actually honor the requested level by storing/applying it to the logger instance
(or forwarding it to its underlying sink/parent logger) so verbose/debug flags
take effect — update the method on ReplayLogger (SetLogLevel) to set an internal
field or call the underlying sink's SetLogLevel/Configure method (matching your
existing LogLevel type and any sink methods) and ensure any checks that use the
level (e.g., ReplayLogger.log or level checks) read that stored/applied level.

In `@internal/settings/settings.go`:
- Around line 520-534: The code currently only populates cfg.setDeprecatedFields
inside parse( k *koanf.Koanf ), so dynamic updates via (*Settings).SetValue
don't mark deprecated keys as explicitly set; add a single helper on Settings
(e.g., markDeprecatedFieldSet(key string)) that records keys into
setDeprecatedFields and use it from the parse loop (where you currently set
cfg.setDeprecatedFields[key]) and also call it from (*Settings).SetValue
immediately after a successful assignment; update tests to include a regression
case where SetValue("auto_rollback", "false") is applied and Validate() respects
the explicit false.

In `@nix/tests/magic-rollback.test.nix`:
- Around line 148-149: The test references
nodes.target.system.build.networkConfig via targetNetworkJSON but the target
node only defines targetConfig under system.build, causing evaluation failure;
either add a networkConfig attribute to the target node's system.build (e.g.,
populate nodes.target.system.build.networkConfig with the intended network data)
or remove/stop using targetNetworkJSON so the test no longer reads
nodes.target.system.build.networkConfig—update the test to consistently use
either targetConfig or a newly added networkConfig on the target node.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 12b1226b-9172-41ad-8d87-5a819b4cb14d

📥 Commits

Reviewing files that changed from the base of the PR and between ad3c6b9 and d73488e.

📒 Files selected for processing (35)

cmd/activate/activate.go
cmd/activate/run.go
cmd/apply/apply.go
cmd/generation/delete/delete.go
cmd/generation/delete/resolver.go
cmd/generation/delete/resolver_test.go
cmd/generation/rollback/rollback.go
cmd/generation/shared/utils.go
cmd/generation/switch/switch.go
cmd/info/info.go
cmd/root/root.go
doc/man/nixos-cli-apply.1.scd
doc/man/nixos-cli-env.5.scd
internal/activation/activation.go
internal/activation/supervisor.sh
internal/cmd/opts/opts.go
internal/cmd/utils/utils.go
internal/constants/constants.go
internal/generation/completion.go
internal/generation/generation.go
internal/generation/specialisations.go
internal/logger/replay.go
internal/settings/settings.go
internal/settings/settings_test.go
internal/system/elevator.go
internal/system/fs.go
internal/system/local_fs.go
internal/system/sftp_fs.go
internal/system/ssh.go
internal/systemd/time.go
internal/systemd/time_test.go
nix/tests/default.nix
nix/tests/example.test.nix
nix/tests/magic-rollback.test.nix
nix/tests/resources/ssh-keys.nix

💤 Files with no reviewable changes (1)

cmd/root/root.go

…ag/config

Previously, time spans were stored as strings and parsed whenever a time.Duration was needed. This is awkward and requires manual conversions. Using a dedicated type and implementing serialization/deserialization on it directly means it can be used by both cobra and koanf as a flag/config type. This also enables systemd.time(7) spans in koanf configs, with stronger validation.

water-sucks force-pushed the magic-rollback branch 5 times, most recently from a186653 to 42e7674 Compare November 22, 2025 12:42

water-sucks force-pushed the magic-rollback branch from 42e7674 to 9f80b24 Compare January 31, 2026 00:35

water-sucks force-pushed the main branch from ebf709a to 7704548 Compare February 1, 2026 06:34

water-sucks force-pushed the magic-rollback branch 3 times, most recently from e59057e to 6188280 Compare February 11, 2026 23:23

nix-community deleted a comment from coderabbitai bot Feb 12, 2026

water-sucks force-pushed the magic-rollback branch 5 times, most recently from 6e5bcda to 6b0541f Compare February 20, 2026 10:08

water-sucks force-pushed the magic-rollback branch 5 times, most recently from 7f996e0 to c97de8f Compare March 2, 2026 00:16

water-sucks force-pushed the magic-rollback branch 2 times, most recently from 43a2e20 to 6ec21fc Compare March 11, 2026 07:01

water-sucks force-pushed the magic-rollback branch 2 times, most recently from 5435cca to f7c89e8 Compare April 2, 2026 10:08

water-sucks marked this pull request as ready for review April 2, 2026 10:09

water-sucks force-pushed the magic-rollback branch 4 times, most recently from 7b49dc6 to d73488e Compare April 5, 2026 03:50

coderabbitai bot reviewed Apr 5, 2026

water-sucks added 14 commits April 4, 2026 22:12

refactor(generation): make generation info collection system-agnostic

3b0a610

feat(activation): init activation supervisor script

64c86b0

feat(apply): add configuring/disabling magic rollback mode through fl…

eaf6b87

…ag/config

feat(settings): add rollback section with timeout config, deprecate a…

b3fdeed

…uto_rollback

fix(ssh): use manual quoting for args string instead of shlex quoting

7108808

shlex.Quote() is much more strict in escaping than my much more lax utils.Quote() string. Using manual quoting allows passing multi-line strings such as the supervisor script properly.

docs: add section on available activation strategies to nixos-cli-app…

d13909e

…ly(1)

feat(activation): add reconnection retries

bf44e07

feat(logger): add replay logger to defer logs until after exiting raw…

b4d6ace

… mode

fix(activation): fix race condition where new connection is not closed

ab273b2

fix(system): do not use username when prompting for password

bd35f5f

fix(apply): make failed generation creation detection not fatal

3eedcb3

refactor(tests): pass module/pkgs through default args/options

371ec16

feat(tests): add magic rollback tests

35a306c

water-sucks force-pushed the magic-rollback branch from d73488e to 35a306c Compare April 5, 2026 05:45

water-sucks merged commit 1056a66 into nix-community:main Apr 5, 2026
2 checks passed

water-sucks deleted the magic-rollback branch April 5, 2026 05:51
