fix(windows): Windows specific script handling#1439
Conversation
|
Can we have a base structure and based on os populate environment variables. That may make the code more maintainable, testable and traceable. Similarly normalizeShell commands. |
|
@ghansham Good point, thanks for pointing this out! |
|
@coderabbitai review request |
✅ Actions performedReview triggered.
|
WalkthroughCentralizes command construction into a shellCommandBuilder with a Shell registry and per-shell builders; moves environment whitelist population to OS-specific init files; adds fileutil.IsFile, log-reading encoding/decoders, platform script-path normalization, tighter directory permissions, and expanded unit/integration tests. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Config as Step Config
participant Builder as shellCommandBuilder
participant Finder as findShell()
participant Shell as Shell impl
participant OS as OS utils (env/fileutil/loader)
participant Exec as exec.Cmd
Config->>Builder: instantiate (Command, Args, Script, Shell, Dir, ShellPackages)
Builder->>Finder: findShell(b.Shell[0])
Finder-->>Builder: selected Shell impl
alt script present
Builder->>OS: setupScript(workDir, script, shell)
OS-->>Builder: scriptFile path / error
end
Builder->>Shell: Shell.Build(ctx, b) (may call b.normalizeScriptPath())
Shell->>OS: may call fileutil.IsFile / normalizeEnvKey / getDefaultLogEncodingCharset
Shell->>Exec: construct exec.CommandContext(...)
Exec-->>Builder: *exec.Cmd or error
Builder-->>Config: return command or error (exit code normalized)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
|
@yottahmd I'll review the changes in this PR. ✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (11)
internal/common/fileutil/fileutil.go (1)
48-55:IsFilesemantics look good; just be aware of error/links behaviorImplementation is straightforward and correct: it returns true only for regular files and false on error. Just keep in mind this treats errors (including permission issues) the same as “not a regular file”, and follows symlinks because it uses
os.Statrather thanos.Lstat. If that matches the intended use (e.g., script path checks), this is fine.internal/runtime/builtin/command/command_builder.go (1)
21-30: Consider guarding against a nilShellimplementation fromfindShell
BuildassumesfindShell(b.Shell[0])always returns a non-nil implementation. If that invariant is ever broken (e.g., future refactor, tests swapping the registry),shell.Buildwould panic.You could harden this by returning an explicit error when
findShellyields nil:func (b *shellCommandBuilder) Build(ctx context.Context) (*exec.Cmd, error) { if len(b.Shell) == 0 { return nil, fmt.Errorf("shell command is required") } - shell := findShell(b.Shell[0]) - return shell.Build(ctx, b) + shell := findShell(b.Shell[0]) + if shell == nil { + return nil, fmt.Errorf("unknown shell: %s", b.Shell[0]) + } + return shell.Build(ctx, b) }internal/runtime/builtin/command/shell_cmd.go (1)
21-39: Verify working directory propagation and direct-command pathThe overall branching looks good:
- When both
CommandandScriptare set, you bypasscmd.exeand invoke the command directly withArgs + Script, which matches the “direct command + script file” intent.- Otherwise you:
- Clone shell arguments,
- Append step args,
- Ensure
/cor/Cis present,- Append the normalized script path via
b.normalizeScriptPath(),- Then execute
cmd.Two things to double‑check:
Working directory:
exec.CommandContextis created without settingcmd.Dirhere. Unless some higher layer setsDiron the returned*exec.Cmd, the step’s working directory (b.Dir) may be ignored on Windows forcmdShell. If that’s not handled elsewhere, consider:
- return exec.CommandContext(ctx, cmd, args...), nil // nolint: gosec
- c := exec.CommandContext(ctx, cmd, args...) // nolint: gosec
- c.Dir = b.Dir
- return c, nil
and similarly in the direct‑command branch if needed. 2. **Empty script path:** If neither `b.Script` nor `b.ShellCommandArgs` is set, `b.normalizeScriptPath()` might end up empty. That would add an empty argument to `cmd`. If that’s an impossible state by construction, fine; otherwise it may be worth asserting or failing earlier. </blockquote></details> <details> <summary>internal/runtime/builtin/command/command_script.go (3)</summary><blockquote> `13-49`: **`setupScript` is robust; consider optional cleanup on failure** The script setup flow is solid: - Chooses an extension based on the shell (via `GetScriptExtension`), - Uses `os.CreateTemp` in `workDir` with a clear pattern, - Applies preprocessing, writes, syncs, and then `Chmod` to `0750`. Two optional refinements: - On write/sync/Chmod failures you currently leave the temp file behind. If those failure paths are likely (e.g., disk full), you might want to `os.Remove(file.Name())` before returning the error. - The comment says “creates a temporary script file … in the given working directory”; with `workDir == ""`, `os.CreateTemp` falls back to the system temp dir, which is fine but slightly different from the comment. Both are small polish items; the core behavior looks correct. --- `64-72`: **Consider always cloning `args` in `createDirectCommand` for consistency** Logic is correct, but the aliasing behavior differs by branch: - When `scriptFile != ""`, you clone `args` before appending. - When `scriptFile == ""`, you pass `args` directly. For predictable behavior (callers never see their slice modified via `exec.Cmd.Args`), you could always clone: ```diff func createDirectCommand(ctx context.Context, cmd string, args []string, scriptFile string) *exec.Cmd { - if scriptFile != "" { - clonedArgs := cloneArgs(args) - clonedArgs = append(clonedArgs, scriptFile) - return exec.CommandContext(ctx, cmd, clonedArgs...) // nolint: gosec - } - return exec.CommandContext(ctx, cmd, args...) // nolint: gosec + clonedArgs := cloneArgs(args) + if scriptFile != "" { + clonedArgs = append(clonedArgs, scriptFile) + } + return exec.CommandContext(ctx, cmd, clonedArgs...) // nolint: gosec }
74-90:validateCommandSteplogic is sound; clarify combinations if neededThe validation covers:
- Command + Script (both set),
- Command‑only,
- Script‑only,
- SubDAG‑only,
and otherwise returns
core.ErrStepCommandIsRequired. That matches “at least one of Command, Script, or SubDAG must be set”.If the domain model intends to forbid combinations like Command + SubDAG or Script + SubDAG, you might want to tighten the conditions; otherwise this is fine as‑is.
internal/runtime/builtin/command/shell_unix.go (1)
46-62: Consider adding support for additional Unix shells.The current list covers the most common Unix shells, but consider whether
tcshorcshshould be included. However, since these shells have different syntax for errexit (e.g.,cshdoesn't support-e), the current approach of excluding them is likely intentional and correct.internal/runtime/builtin/command/shell_nix.go (2)
26-27: Consider making pure mode optional.The code always enforces
--puremode, but users might want impure nix-shell environments. Consider making this conditional based on whether the user explicitly specified shell arguments.Apply this diff to make pure mode conditional:
- // nix-shell runs in pure mode by default - args = append(args, "--pure") + // Add pure mode if not already specified + if !slices.Contains(args, "--pure") && !slices.Contains(args, "--impure") { + args = append(args, "--pure") + }
35-47: String concatenation for command construction may introduce issues.Building shell commands via string concatenation (lines 39-43) can be fragile with special characters or whitespace in arguments. While this works for the nix-shell use case, consider if proper shell escaping is needed.
internal/runtime/builtin/command/command_test.go (2)
1800-1814: Consider making the test assertion deterministic.The comment "This should work since Shell is now a slice" and the conditional
if err == nilsuggests uncertainty about the expected behavior. For reliable test coverage, consider either:
- Fixing the input to guarantee a specific outcome
- Using a more deterministic test case
t.Run("shellCommandBuilder error with shell cmd args", func(t *testing.T) { - // Test the path where shellCommandBuilder fails due to empty Shell - // when ShellCommandArgs is set config := commandConfig{ Ctx: ctx, - Shell: []string{"invalid shell with\ttabs"}, + Shell: []string{}, // Empty shell should cause error ShellCommandArgs: "echo hello", } - // This should work since Shell is now a slice cmd, err := config.newCmd(ctx, "") - // The command might succeed or fail depending on the shell parsing - if err == nil { - assert.NotNil(t, cmd) - } + assert.Error(t, err) + assert.Nil(t, cmd) })
2114-2123: Test assertion is weak and may not provide meaningful coverage.This test attempts to write to
/devdirectory but the assertionif err != nil { assert.Error(t, err) }is tautological - it only asserts an error when there is already an error. Consider removing this test or making it more deterministic.if _, err := os.Stat("/dev/full"); err == nil { t.Run("write to full device", func(t *testing.T) { - // /dev/full simulates a full disk - _, err := setupScript("/dev", "echo hello", []string{"/bin/sh"}) - // This might fail due to permission or other issues - if err != nil { - assert.Error(t, err) - } + t.Skip("Test is non-deterministic and depends on system permissions") }) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
internal/common/config/env.go(1 hunks)internal/common/config/env_unix.go(1 hunks)internal/common/config/env_windows.go(1 hunks)internal/common/fileutil/fileutil.go(1 hunks)internal/integration/shell_test.go(2 hunks)internal/runtime/builtin/command/command.go(2 hunks)internal/runtime/builtin/command/command_builder.go(1 hunks)internal/runtime/builtin/command/command_script.go(1 hunks)internal/runtime/builtin/command/command_test.go(5 hunks)internal/runtime/builtin/command/command_unix.go(1 hunks)internal/runtime/builtin/command/command_windows.go(1 hunks)internal/runtime/builtin/command/command_windows_test.go(1 hunks)internal/runtime/builtin/command/shell.go(1 hunks)internal/runtime/builtin/command/shell_cmd.go(1 hunks)internal/runtime/builtin/command/shell_direct.go(1 hunks)internal/runtime/builtin/command/shell_nix.go(1 hunks)internal/runtime/builtin/command/shell_powershell.go(1 hunks)internal/runtime/builtin/command/shell_unix.go(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (7)
internal/runtime/builtin/command/shell_unix.go (1)
internal/runtime/builtin/command/shell.go (1)
Shell(13-20)
internal/integration/shell_test.go (3)
internal/test/helper.go (2)
Setup(79-223)DAG(420-423)internal/runtime/subcmd.go (1)
Run(248-270)internal/core/dag.go (1)
DAG(33-135)
internal/runtime/builtin/command/command_builder.go (1)
internal/runtime/builtin/command/shell.go (1)
Shell(13-20)
internal/runtime/builtin/command/shell_nix.go (1)
internal/runtime/builtin/command/shell.go (1)
Shell(13-20)
internal/runtime/builtin/command/command_windows.go (1)
internal/common/fileutil/fileutil.go (1)
IsFile(49-55)
internal/runtime/builtin/command/command_script.go (1)
internal/common/cmdutil/cmdutil.go (1)
GetScriptExtension(306-338)
internal/runtime/builtin/command/command_test.go (2)
internal/runtime/subcmd.go (1)
Run(248-270)internal/runtime/builtin/command/shell.go (1)
Shell(13-20)
🔇 Additional comments (20)
internal/runtime/builtin/command/command_builder.go (1)
9-19: Builder shape and fields look reasonableThe struct cleanly captures all the knobs needed for shell construction (dir, command, args, script, shell metadata, user-specified flag). No issues here from a modeling standpoint.
internal/runtime/builtin/command/shell_cmd.go (1)
9-19:cmdShell.Matchis clear and correctMatch logic for
"cmd.exe"/"cmd"is straightforward and aligned with expectations.internal/runtime/builtin/command/command_script.go (1)
51-62: PowerShell preprocessing correctly enforces failure on non‑zero exitThe
.ps1branch prepends:
$ErrorActionPreference = 'Stop'to fail on cmdlet errors.$PSNativeCommandUseErrorActionPreference = $trueto propagate non‑zero native command exits (in PS 7+; harmless as a normal variable earlier).This aligns well with the PR goal of failing on non‑zero interim commands in scripts. No changes needed here.
internal/common/config/env.go (1)
18-21: Platform‑specific whitelist initialization is reasonable; watch key casingMoving
defaultWhitelistto an empty map and populating it fromenv_windows.go/env_unix.gois a good step toward OS‑specific behavior.Because
filterEnvdoes a case‑sensitiveallow[key]lookup, it’s important that the keys inserted in those init functions match the actual casing returned byos.Environ()on each platform (especially on Windows where env keys are case‑insensitive but case‑preserving). The rest of the logic here looks fine.internal/runtime/builtin/command/command_unix.go (1)
1-11: UnixnormalizeScriptPathbehavior is appropriateOn non‑Windows builds you simply return
b.ShellCommandArgsunchanged, keeping the existing Unix expectation that callers use./scriptor absolute paths. This cleanly isolates the Windows‑specific correction logic to thewindowsbuild without changing Unix semantics.internal/common/config/env_unix.go (1)
5-30: Unix whitelist contents are sensible and consistentThe selected Unix/macOS environment variables (PATH, HOME, USER, SHELL, TMPDIR, TERM, locale vars, TZ, LD_LIBRARY_PATH, XDG_* dirs) are a reasonable default set to propagate. Combined with the shared prefix rules, this should give child processes what they need without over‑exposing the environment.
internal/runtime/builtin/command/shell_direct.go (1)
29-47: LGTM! Clean direct shell implementation.The validation logic correctly enforces array syntax for direct shell commands and provides clear error messages to guide users. The use of
cloneArgsprevents mutation issues.internal/integration/shell_test.go (2)
10-139: Excellent test coverage for direct shell execution.The test suite thoroughly covers direct shell behavior including command arrays, exit codes, special character handling, and Dagu variable substitution. The parallel execution is appropriate for these independent tests.
219-359: Comprehensive shell feature testing.Good coverage of shell-specific behaviors including command substitution, environment variables, multiline scripts, redirection, and glob expansion. These tests properly validate the distinction between direct and shell-wrapped execution.
internal/runtime/builtin/command/command_windows.go (1)
17-43: Well-designed Windows path normalization logic.The implementation correctly handles the Windows-specific behavior where PowerShell and cmd.exe don't search the current directory by default. The restriction to script extensions prevents false positives when commands happen to match filenames.
internal/runtime/builtin/command/shell_powershell.go (1)
21-40: PowerShell command construction looks correct.The implementation properly handles both Windows PowerShell and PowerShell Core, uses the appropriate
-Commandflag, and correctly bypasses shell wrapping for direct command execution.internal/runtime/builtin/command/shell_unix.go (1)
19-44: Solid Unix shell implementation with proper errexit handling.The conditional addition of the
-eflag for Unix-like shells improves error handling while respecting user-specified shell configurations. The direct command bypass path is consistent with other shell implementations.internal/runtime/builtin/command/command_windows_test.go (1)
15-127: Comprehensive test coverage for Windows path normalization.The test cases thoroughly cover edge cases including subdirectories, absolute paths, command collisions, and non-existent files. The use of real temp files provides confidence in the file system interaction logic.
internal/runtime/builtin/command/shell.go (1)
10-42: Excellent shell abstraction design.The Shell interface provides a clean abstraction for platform-specific command building. The registry-based approach with ordered matching is elegant and extensible. The fallback to
unixShellensures robustness.internal/runtime/builtin/command/command.go (2)
131-140: LGTM - Dir field addition supports relative path script execution.The addition of the
Dirfield to theshellCommandBuilderaligns with the PR objective of supporting relative path script execution. This ensures the builder has access to the working directory context when constructing commands.
168-182: Passing Command/Args to shellCommandBuilder when ShellCommandArgs is set is intentional.The shell implementations (
unixShellandnixShell) explicitly ignoreCommandandArgswhenShellCommandArgsis set. The logic branches on whetherScriptexists: ifCommand != "" && Script != "", it usesCommand/Args; otherwise, whenShellCommandArgsis set, it uses onlyShellCommandArgswith the-cflag. This is documented in the code comments (shell_unix.go:35-37) and confirmed by existing tests (command_test.go:2023-2034).internal/runtime/builtin/command/command_test.go (4)
22-85: Good test coverage for the direct shell type.The
TestDirectShellfunction provides comprehensive coverage of the "direct" shell execution path, including:
- Command + Script combination
- Command-only execution
- Verification that ShellCommandArgs is correctly ignored when Command is set
- Proper error handling for invalid configurations
160-205: LGTM - PowerShell test coverage is thorough.The tests properly validate PowerShell command construction including:
- Basic commands with
-Commandflag injection- Cases where
-Commandalready exists in shell args- Script execution bypassing PowerShell when Command is specified
207-252: LGTM - Cmd.exe test coverage is appropriate.The tests validate Windows cmd.exe command construction, including proper
/cflag handling and script execution paths.
1-20: Comprehensive test coverage for the shell abstraction refactoring.The test file provides excellent coverage including:
- Direct shell execution paths
- PowerShell and cmd.exe handling
- Script execution with and without shebangs
- Error handling and edge cases
- Proper platform-specific test skipping
|
@coderabbitai generate docstrings |
✅ Actions performedInitiated docstring generation; will generate only if new commits exist. |
|
Note Docstrings generation - SUCCESS |
Docstrings generation was requested by @yottahmd. * #1439 (comment) The following files were modified: * `internal/common/config/env.go` * `internal/common/config/env_unix.go` * `internal/common/config/env_windows.go` * `internal/common/fileutil/fileutil.go` * `internal/runtime/builtin/command/command.go` * `internal/runtime/builtin/command/command_script.go` * `internal/runtime/builtin/command/command_windows.go` * `internal/runtime/builtin/command/shell.go` * `internal/runtime/builtin/command/shell_unix.go`
Docstrings generation was requested by @yottahmd. * #1439 (comment) The following files were modified: * `internal/common/config/env.go` * `internal/common/config/env_unix.go` * `internal/common/config/env_windows.go` * `internal/common/fileutil/fileutil.go` * `internal/runtime/builtin/command/command.go` * `internal/runtime/builtin/command/command_script.go` * `internal/runtime/builtin/command/command_windows.go` * `internal/runtime/builtin/command/shell.go` * `internal/runtime/builtin/command/shell_unix.go` <details> <summary>These files were ignored</summary> * `internal/integration/shell_test.go` * `internal/runtime/builtin/command/command_test.go` * `internal/runtime/builtin/command/command_windows_test.go` </details> <details> <summary>ℹ️ Note</summary><blockquote> CodeRabbit cannot perform edits on its own pull requests yet. </blockquote></details> --------- Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Co-authored-by: Yota Hamada <yohamta@gmail.com>
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
internal/common/config/env_windows.go (1)
9-25: COMSPEC casing still inconsistent with Windows env / whitelist lookupThe whitelist still uses
"COMSPEC"while Windows commonly exposes this variable as"ComSpec". With a case‑sensitivedefaultWhitelistlookup,ComSpecfromos.Environ()will not be whitelisted, so this base variable is still effectively filtered out.Recommend aligning with the actual casing (e.g. replace with
"ComSpec", or include both"ComSpec"and"COMSPEC"if you want to be extra defensive), or normalize casing infilterEnvbefore lookup.- "COMSPEC", // cmd.exe + "ComSpec", // cmd.exe (canonical Windows casing)
🧹 Nitpick comments (2)
internal/runtime/builtin/command/shell_unix.go (1)
19-44: Defensively guard against an emptyShellslice inunixShell.Build
Buildassumeslen(b.Shell) > 0:cmd := b.Shell[0] args := cloneArgs(b.Shell[1:])If
shellCommandBuilderever routes here with an emptyShellslice, this will panic. That may be impossible by construction, but it’s not enforced locally.Consider adding a small invariant check (returning an error or similar) so a misconfigured builder can’t crash the worker:
if len(b.Shell) == 0 { return nil, fmt.Errorf("unixShell.Build: empty Shell slice") } cmd := b.Shell[0] args := cloneArgs(b.Shell[1:])internal/runtime/builtin/command/command.go (1)
252-305: Exit-code normalization and executor registration are reasonable
exitCodeFromErrorcleanly centralizes exit-code extraction from*exec.ExitError, returning 0 for nil and 1 for non‑exit errors, and it’s used in both theStartandWaiterror paths soExitCode()won’t remain 0 on failures.NewCommand/NewCommandConfigand theinitregistration of"", "shell", "command"executors also look coherent with the runtime/env design (Dir and Shell sourced fromruntime.GetEnv(ctx)andUserSpecifiedShellderived fromstep.Shell != "").If you ever expect wrapped errors around
*exec.ExitError, consider switching toerrors.AsinexitCodeFromErrorfor extra robustness, but it’s not strictly necessary with the current call sites.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
internal/common/config/env.go(1 hunks)internal/common/config/env_unix.go(1 hunks)internal/common/config/env_windows.go(1 hunks)internal/common/fileutil/fileutil.go(1 hunks)internal/runtime/builtin/command/command.go(6 hunks)internal/runtime/builtin/command/command_script.go(1 hunks)internal/runtime/builtin/command/command_windows.go(1 hunks)internal/runtime/builtin/command/shell.go(1 hunks)internal/runtime/builtin/command/shell_unix.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- internal/common/fileutil/fileutil.go
- internal/common/config/env_unix.go
- internal/runtime/builtin/command/command_windows.go
- internal/common/config/env.go
- internal/runtime/builtin/command/command_script.go
🧰 Additional context used
🧬 Code graph analysis (1)
internal/runtime/builtin/command/command.go (1)
internal/common/logger/tag/tag.go (3)
Dir(77-79)Command(403-405)Args(408-410)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test on ubuntu-latest
🔇 Additional comments (2)
internal/runtime/builtin/command/shell.go (1)
10-44: Shell interface and registry wiring look solidThe
Shellinterface, registry ordering, andfindShellnormalization (strings.ToLower(filepath.Base(cmd))) cleanly capture the “first match wins, unixShell as fallback” model. GivenunixShell.Matchalways returns true, the loop guarantees a match as intended.internal/runtime/builtin/command/command.go (1)
127-201: Routing command+script and shell-command flows throughshellCommandBuilderlooks correctThe two updated branches:
cfg.Command != "" && scriptFile != ""len(cfg.Shell) > 0 && cfg.ShellCommandArgs != ""now both use
shellCommandBuilderwithDir,Command,Args,Shell,ShellCommandArgs,ShellPackages, andUserSpecifiedShellpopulated. Given the common post-build setup (cmd.Env,cmd.Dir,cmd.Stdout,cmd.Stderr,cmdutil.SetupCommand), this is consistent with the new shell abstraction and keeps Dir/script handling centralized.I don’t see functional issues here as long as
shellCommandBuilder.BuildhonoursDirandUserSpecifiedShellas expected.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
internal/common/config/env_windows.go (2)
5-7: Relocate the init comment after imports.The comment about
init(lines 5-6) is placed before theimportstatement, which is unconventional. By Go convention, import statements typically follow immediately after the package declaration, with package-level comments placed after imports.Apply this diff to improve code organization:
package config -// init populates the package's defaultWhitelist with common Windows environment -// variable names so they are treated as whitelisted on Windows builds. import "strings" +// init populates the package's defaultWhitelist with common Windows environment +// variable names so they are treated as whitelisted on Windows builds. + // normalizeEnvKey converts to uppercase on Windows for case-insensitive matching.
15-32: LGTM! Addresses the previous casing concern.The uppercase whitelist keys combined with
normalizeEnvKeyinfilterEnv(env.go:50) correctly resolves case-insensitive Windows environment variable matching. Whenos.Environ()returns variables like "ComSpec" or "COMSPEC", they are normalized to uppercase before the whitelist lookup, ensuring consistent matching.The selected environment variables are appropriate for Windows: USERPROFILE for user home, SYSTEMROOT/WINDIR for the system directory, COMSPEC for the command interpreter, PATH/PATHEXT for executables, TEMP/TMP for temporary files, PSMODULEPATH for PowerShell, and HOME for Git/Go compatibility.
However, consider adding explicit tests for Windows-specific behavior: normalizeEnvKey case conversion and whitelist matching with case variations (e.g., verifying "ComSpec=cmd.exe" is properly filtered).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
internal/common/config/env.go(2 hunks)internal/common/config/env_unix.go(1 hunks)internal/common/config/env_windows.go(1 hunks)internal/common/fileutil/fileutil_test.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/common/config/env_unix.go
🧰 Additional context used
🧬 Code graph analysis (1)
internal/common/fileutil/fileutil_test.go (1)
internal/common/fileutil/fileutil.go (1)
IsFile(51-57)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test on ubuntu-latest
🔇 Additional comments (5)
internal/common/fileutil/fileutil_test.go (1)
191-221: LGTM! Comprehensive test coverage for IsFile.The test suite properly covers the three main scenarios: regular file, directory, and non-existent path. The parallel test structure is correctly implemented—
t.TempDir()ensures cleanup waits for all subtests, and each subtest uses unique filenames to avoid conflicts.internal/common/config/env.go (3)
13-14: LGTM! Clear documentation of storage semantics.The updated documentation explicitly clarifies that the slice is stored without copying and recommends defensive copying by callers when isolation is needed. This transparency about memory semantics helps prevent unintended mutations and follows Go best practices.
19-21: LGTM! Architectural improvement for platform-specific behavior.Moving from a hard-coded whitelist to platform-specific initialization via
init()functions improves maintainability and separation of concerns. This aligns with the PR objective to support OS-specific environments and addresses the maintainability concerns raised in the PR comments.
49-50: LGTM! Elegant solution for cross-platform key matching.The introduction of
normalizeEnvKeyelegantly handles platform differences in environment variable casing. On Windows, it uppercases keys for case-insensitive matching; on Unix, it likely returns keys unchanged. This approach resolves the previous review concern about mixed casing in the Windows whitelist while maintaining clean, platform-agnostic code in the core filtering logic.internal/common/config/env_windows.go (1)
9-13: No action required – cross-platform implementation is already in place.The
normalizeEnvKeyfunction is correctly implemented with proper build tags:env_windows.go(//go:build windows) provides case-insensitive matching, whileenv_unix.go(//go:build !windows) returns the key unchanged. Go's build system automatically selects the correct version per platform, soenv.gocan safely call this function without compilation issues.Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
internal/runtime/builtin/command/shell_cmd.go (2)
26-49: Path resolution is reasonable; consider COMSPEC for extra robustnessThe
resolveCmdPathflow (defaultC:\Windows\System32\cmd.exe→SystemRoot\System32 → fall back to original) looks correct and should avoid most Go exec security restrictions around relative paths.If you want to harden this further for non‑standard installs or customized command interpreters, consider optionally checking the
COMSPECenvironment variable first and validating it points to an existingcmd.exebefore falling back to the hard‑coded/default locations. That would better honor user/OS configuration while keeping the same overall behavior.
51-74: Tighten Build arg handling and guard against empty script pathsThe overall Build flow looks correct: resolve
cmd.exe, preserve any shell args, ensure/cor/Cis present, and append the normalized script path.Two small maintainability/robustness tweaks you might consider:
- Avoid cloning
b.Shell[1:]when the earlyb.Command != "" && b.Script != ""branch returns immediately.- Only append the normalized script path if it’s non‑empty, so a future change that allows “no script” for cmd won’t result in
cmd.exe /c "".A refactor along these lines keeps semantics while tightening things up:
func (s *cmdShell) Build(ctx context.Context, b *shellCommandBuilder) (*exec.Cmd, error) { - cmd := resolveCmdPath(b.Shell[0]) - args := cloneArgs(b.Shell[1:]) - - // When running a command directly with a script, don't include cmd.exe arguments - if b.Command != "" && b.Script != "" { - args := cloneArgs(b.Args) - if b.Script != "" { - args = append(args, b.Script) - } - return exec.CommandContext(ctx, b.Command, args...), nil // nolint: gosec - } - - args = append(args, b.Args...) + cmd := resolveCmdPath(b.Shell[0]) + + // When running a command directly with a script, don't include cmd.exe arguments + if b.Command != "" && b.Script != "" { + args := cloneArgs(b.Args) + args = append(args, b.Script) + return exec.CommandContext(ctx, b.Command, args...), nil // nolint: gosec + } + + args := cloneArgs(b.Shell[1:]) + args = append(args, b.Args...) // cmd.exe uses /c instead of -c if !slices.Contains(args, "/c") && !slices.Contains(args, "/C") { args = append(args, "/c") } - args = append(args, b.normalizeScriptPath()) + if scriptPath := b.normalizeScriptPath(); scriptPath != "" { + args = append(args, scriptPath) + } return exec.CommandContext(ctx, cmd, args...), nil // nolint: gosec }This should keep behavior identical for the current flows while making the function a bit safer against future builder changes.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal/runtime/builtin/command/shell_cmd.go(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
internal/runtime/builtin/command/shell_cmd.go (1)
internal/runtime/builtin/command/shell.go (1)
Shell(13-20)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test on ubuntu-latest
🔇 Additional comments (1)
internal/runtime/builtin/command/shell_cmd.go (1)
14-21: cmdShell.Match logic is clear and correctThe matcher cleanly handles both
cmd.exeandcmdand aligns with the documented convention that names are lowercase and pathless. No changes needed here.
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
internal/runtime/builtin/command/shell_unix.go (1)
68-92: Consider addingtcshandcshto shell detection.The
isUnixLikeShellfunction covers common Bourne-compatible shells but omitscsh/tcsh. While these shells don't support-ethe same way, the function could be renamed or documented to clarify it specifically identifies "errexit-compatible" shells.-// isUnixLikeShell reports whether the named shell is known to support the -e (errexit) flag. -// It returns true for "sh", "bash", "zsh", "ksh", "ash", and "dash"; returns false for "fish", an empty string, or any unrecognized shell. -func isUnixLikeShell(shell string) bool { +// isErrexitCompatibleShell reports whether the named shell supports the -e (errexit) flag. +// Returns true for POSIX-compatible shells (sh, bash, zsh, ksh, ash, dash). +// Returns false for fish, csh, tcsh, empty string, or unrecognized shells. +func isErrexitCompatibleShell(shell string) bool {internal/runtime/builtin/command/command_test.go (1)
1380-1420: Consider adding timeout to concurrent test.The concurrent execution test uses
sleep 0.1which could be flaky. Consider adding a test timeout or using a more deterministic synchronization mechanism.func TestCommandExecutor_ConcurrentRun(t *testing.T) { if goruntime.GOOS == "windows" { t.Skip("Skipping Unix-specific test on Windows") } + t.Parallel() step := core.Step{ Name: "test", - Command: "sleep", - Args: []string{"0.1"}, + Command: "true", // Use a fast command instead of sleep }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
internal/runtime/builtin/command/command.go(7 hunks)internal/runtime/builtin/command/command_test.go(8 hunks)internal/runtime/builtin/command/shell_cmd.go(1 hunks)internal/runtime/builtin/command/shell_nix.go(1 hunks)internal/runtime/builtin/command/shell_powershell.go(1 hunks)internal/runtime/builtin/command/shell_unix.go(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (6)
internal/runtime/builtin/command/shell_nix.go (2)
internal/cmd/context.go (1)
Context(38-53)internal/runtime/builtin/command/shell.go (1)
Shell(13-20)
internal/runtime/builtin/command/shell_powershell.go (1)
internal/runtime/builtin/command/shell.go (1)
Shell(13-20)
internal/runtime/builtin/command/command.go (2)
internal/common/logger/tag/tag.go (4)
Dir(77-79)Command(403-405)Args(408-410)ExitCode(99-101)internal/runtime/builtin/command/shell.go (1)
Shell(13-20)
internal/runtime/builtin/command/command_test.go (4)
internal/runtime/subcmd.go (1)
Run(248-270)internal/runtime/builtin/command/shell.go (1)
Shell(13-20)internal/common/logger/tag/tag.go (3)
Command(403-405)Args(408-410)Error(20-22)internal/test/helper.go (1)
Contains(519-519)
internal/runtime/builtin/command/shell_cmd.go (1)
internal/runtime/builtin/command/shell.go (1)
Shell(13-20)
internal/runtime/builtin/command/shell_unix.go (1)
internal/runtime/builtin/command/shell.go (1)
Shell(13-20)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test on ubuntu-latest
🔇 Additional comments (17)
internal/runtime/builtin/command/shell_powershell.go (2)
1-19: LGTM!The Match implementation correctly identifies all common PowerShell executables (Windows PowerShell and PowerShell Core).
26-30: This design is intentional and does not require changes.The code correctly distinguishes between running an explicit interpreter directly (e.g.,
python script.py) versus running a PowerShell script file. When bothCommandandScriptare specified, bypassing PowerShell wrapping provides cross-platform consistency. This behavior is explicitly tested inTestCommandExecutor_CommandWithScriptand is implemented identically inshell_cmd.go, confirming it's a deliberate architectural pattern.internal/runtime/builtin/command/shell_cmd.go (3)
1-21: LGTM on package structure and Match implementation.The
cmdShelltype correctly implements theShellinterface. TheMatchmethod properly handles bothcmd.exeandcmdvariants.
23-58: Solid fallback chain for resolving cmd.exe path.The
resolveCmdPathfunction implements a robust resolution strategy: COMSPEC → default path → SystemRoot fallback. This handles non-standard Windows installations correctly.
60-97: This concern is incorrect;b.Scriptandb.ShellCommandArgsare different fields with different purposes.The
normalizeScriptPath()method operates onb.ShellCommandArgs, notb.Script. In the script-only branch,b.Scriptis a direct file path already provided by the user, while in the command-string branch,b.ShellCommandArgsis a command string that may require path normalization for Windows script extensions. The two code paths intentionally handle different data structures and are not inconsistent.Likely an incorrect or invalid review comment.
internal/runtime/builtin/command/shell_unix.go (2)
11-18: LGTM on fallback shell design.The
unixShellwithMatchalways returningtruecorrectly serves as the catch-all fallback in the shell registry.
20-66: Well-structured Build logic with proper errexit handling.The three execution paths are cleanly separated. The automatic
-eflag injection for Unix-like shells (when not user-specified) aligns with the PR objective of supporting failure on non-zero interim commands.internal/runtime/builtin/command/shell_nix.go (2)
10-15: LGTM on Match implementation.Clean and precise matching for
nix-shell.
17-73:--runargument positioning is correct; the review concern is unfounded.The code correctly positions
--runand its command argument. When line 32 appends"--run"toargs, subsequent calls toappend(args, cmdStr)(lines 49, 59, 69) place the command string immediately after it in the slice, which is the correct order. The test "NixShellWithScriptOnly" validates this by expecting{"--run", "set -e; /tmp/script.sh"}consecutively in the args. Nix-shell requires--runto be followed directly by its command string, and the current implementation satisfies this requirement.Extract the duplicated
set -eprefix logic into a helper function. The prefix prepending pattern appears identically in three branches (lines 45–47, 56–58, 64–66). A helper function would improve readability and maintainability.Likely an incorrect or invalid review comment.
internal/runtime/builtin/command/command.go (4)
70-75: Good security improvement with tighter directory permissions.Changing to
0750restricts group write and other access, following the principle of least privilege.
127-170: Consistent Dir propagation to shell command builders.The
Dirfield is now properly passed toshellCommandBuilderacross all code paths, ensuring working directory context is available for script path normalization.
256-267: Clean exit code extraction with proper error unwrapping.Using
errors.Ascorrectly handles wrapped errors, making this more robust than type assertions.
269-307: Well-documented public API with proper init registration.The docstrings clearly describe the behavior of
NewCommandandNewCommandConfig. Registering for "", "shell", and "command" executor types provides good backward compatibility.internal/runtime/builtin/command/command_test.go (4)
22-85: Comprehensive direct shell tests.Good coverage of the direct shell execution paths including error cases for missing command field and string-only command syntax.
264-335: Excellent shell-script-only coverage across all shell types.Testing each shell's script-only behavior (cmd, PowerShell, pwsh, bash, sh, nix-shell) ensures consistent behavior across platforms.
1099-1105: Appropriate platform-specific permission check.Correctly skipping the Unix permission check on Windows where the permission model differs.
2179-2196: Good edge case test for scanner buffer limits.Testing the 4KB buffer limit for shebang detection helps ensure the code handles edge cases gracefully.
| } | ||
|
|
||
| func (s *powerShell) Build(ctx context.Context, b *shellCommandBuilder) (*exec.Cmd, error) { | ||
| cmd := b.Shell[0] |
There was a problem hiding this comment.
Add bounds check for b.Shell[0].
Accessing b.Shell[0] without verifying that b.Shell is non-empty will panic if the slice is empty or nil.
Apply this diff to add defensive bounds checking:
func (s *powerShell) Build(ctx context.Context, b *shellCommandBuilder) (*exec.Cmd, error) {
+ if len(b.Shell) == 0 {
+ return nil, fmt.Errorf("shell executable not specified")
+ }
cmd := b.Shell[0]Note: You'll need to import "fmt" if not already imported.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| cmd := b.Shell[0] | |
| func (s *powerShell) Build(ctx context.Context, b *shellCommandBuilder) (*exec.Cmd, error) { | |
| if len(b.Shell) == 0 { | |
| return nil, fmt.Errorf("shell executable not specified") | |
| } | |
| cmd := b.Shell[0] |
🤖 Prompt for AI Agents
In internal/runtime/builtin/command/shell_powershell.go around line 22, the code
unconditionally uses b.Shell[0] which will panic if b.Shell is nil or empty; add
a defensive bounds check that verifies len(b.Shell) > 0 before accessing index
0, return or wrap an error (using fmt.Errorf) if the slice is empty, and ensure
"fmt" is imported if not already present.
| if b.Script != "" { | ||
| args := []string{"-ExecutionPolicy", "Bypass", "-File", b.Script} | ||
| return exec.CommandContext(ctx, cmd, args...), nil // nolint: gosec | ||
| } |
There was a problem hiding this comment.
Include script arguments in PowerShell -File invocation.
When running a PowerShell script with the -File flag, any additional arguments in b.Args should be passed to the script. Currently, only b.Script is included, which means script arguments are ignored.
Apply this diff to include script arguments:
// When running just a script file with PowerShell (no explicit command)
// e.g., powershell -ExecutionPolicy Bypass -File script.ps1
if b.Script != "" {
args := []string{"-ExecutionPolicy", "Bypass", "-File", b.Script}
+ args = append(args, b.Args...)
return exec.CommandContext(ctx, cmd, args...), nil // nolint: gosec
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if b.Script != "" { | |
| args := []string{"-ExecutionPolicy", "Bypass", "-File", b.Script} | |
| return exec.CommandContext(ctx, cmd, args...), nil // nolint: gosec | |
| } | |
| if b.Script != "" { | |
| args := []string{"-ExecutionPolicy", "Bypass", "-File", b.Script} | |
| args = append(args, b.Args...) | |
| return exec.CommandContext(ctx, cmd, args...), nil // nolint: gosec | |
| } |
🤖 Prompt for AI Agents
In internal/runtime/builtin/command/shell_powershell.go around lines 34 to 37,
the code builds args for PowerShell with "-ExecutionPolicy", "Bypass", "-File",
b.Script but does not forward b.Args to the script; update the args construction
to append b.Args (if any) to the args slice so exec.CommandContext(ctx, cmd,
args...) passes script arguments through (preserve the existing nolint comment
and ensure b.Args is appended safely).
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/common/fileutil/logutil.go (1)
314-322: Division‑by‑zero risk in estimateLineCount for files with no sampled newlinesIf both
startLineCountandmiddleLineCountare zero (e.g., a large file with a single very long line or very sparse newlines),avgLineLengthwill divide by zero, causing a panic. This is independent of the new encoding changes but is a correctness issue for large logs.- // Calculate average line length from samples - var avgLineLength float64 - if fileSize > sampleSize*2 { - // Average of start and middle samples - avgLineLength = float64(sampleSize*2) / float64(startLineCount+middleLineCount) - } else { - // Just use start sample - avgLineLength = float64(sampleSize) / float64(startLineCount) - } + // Calculate average line length from samples + totalSampleLines := startLineCount + middleLineCount + if totalSampleLines == 0 { + // Fallback to exact count in degenerate cases (e.g. huge single-line files) + exactCount, err := countLinesExact(filePath) + if err != nil { + return 0, false, err + } + return exactCount, false, nil + } + + var avgLineLength float64 + if fileSize > sampleSize*2 { + // Average of start and middle samples + avgLineLength = float64(sampleSize*2) / float64(totalSampleLines) + } else { + // Just use start sample + avgLineLength = float64(sampleSize) / float64(startLineCount) + }Also applies to: 325-329
🧹 Nitpick comments (2)
internal/common/fileutil/logutil.go (2)
40-176: Encoding decoder helper is comprehensive but could be made more data‑drivenThe charset normalization and alias coverage here is extensive and well thought out. Long term, consider refactoring this giant switch into a table‑driven map (normalizedName → encoding) to reduce cognitive load and make it easier to add or tweak aliases.
-func getEncodingDecoder(charset string) *encoding.Decoder { - if charset == "" { - return nil - } - - // Normalize the charset name for comparison - normalized := strings.ToLower(strings.ReplaceAll(charset, "_", "-")) - normalized = strings.ReplaceAll(normalized, " ", "-") - - switch normalized { - // ... - } -} +var encodingDecoders = map[string]*encoding.Decoder{ + // utf-8 / default: handled by nil + "euc-jp": japanese.EUCJP.NewDecoder(), + // (populate with the existing aliases and decoders) +} + +func getEncodingDecoder(charset string) *encoding.Decoder { + if charset == "" { + return nil + } + + normalized := strings.ToLower(strings.ReplaceAll(charset, "_", "-")) + normalized = strings.ReplaceAll(normalized, " ", "-") + + if dec, ok := encodingDecoders[normalized]; ok { + return dec + } + // Unknown or utf-8 → nil (no decoding) + return nil +}
538-544: Use strings.Builder or Join in ReadLogContent to avoid quadratic concatenationBuilding
contentvia repeated+=on strings is O(n²) for many lines. Since you already have[]string, you can construct the result more efficiently.-func ReadLogContent(filePath string, options LogReadOptions) (string, int, int, bool, bool, error) { - result, err := ReadLogLines(filePath, options) +func ReadLogContent(filePath string, options LogReadOptions) (string, int, int, bool, bool, error) { + result, err := ReadLogLines(filePath, options) if err != nil { return "", 0, 0, false, false, err } - content := "" - for i, line := range result.Lines { - content += line - if i < len(result.Lines)-1 { - content += "\n" - } - } + // Preserve existing behavior (newline-separated, no trailing newline) + content := strings.Join(result.Lines, "\n") return content, result.LineCount, result.TotalLines, result.HasMore, result.IsEstimate, nil }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
internal/common/fileutil/logutil.go(13 hunks)internal/common/fileutil/logutil_test.go(3 hunks)internal/service/frontend/api/v2/dagruns.go(4 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
internal/service/frontend/api/v2/dagruns.go (1)
api/v2/api.gen.go (4)
Head(946-946)Tail(967-967)Offset(952-952)Limit(949-949)
internal/common/fileutil/logutil.go (1)
api/v2/api.gen.go (3)
Head(946-946)Tail(967-967)Offset(952-952)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test on ubuntu-latest
🔇 Additional comments (4)
internal/common/fileutil/logutil_test.go (1)
431-431: Tests correctly adapted to new decoder parameterThe extra
nilargument at these call sites matches the updated helper signatures and preserves the previous behavior (no decoding for default UTF‑8 logs). No issues from the change itself.Also applies to: 542-542, 680-680
internal/service/frontend/api/v2/dagruns.go (1)
311-315: Encoding plumbed through log read options consistentlyWiring
Encoding: a.logEncodingCharsetinto all DAG/SubDAG log endpoints keeps pagination behavior the same while enabling charset-aware decoding. The pattern is consistent across all four handlers and looks correct.Also applies to: 356-360, 491-495, 534-538
internal/common/fileutil/logutil.go (2)
24-29: LogReadOptions.Encoding addition is clear and backward compatibleAdding
Encoding stringtoLogReadOptionscleanly extends the API without affecting existing callers; empty string naturally maps to “no decoding”. The inline comment documents expected values sufficiently for now.
202-205: Decoder propagation through log readers is consistent and safeDeriving a single decoder from
options.Encodingand threading it throughreadFirstLines,readLastLines, andreadLinesRange, with conditionaltransform.NewReaderwrapping, keeps the non‑UTF‑8 path localized and preserves behavior whenEncodingis empty/UTF‑8 (decoder is nil). The ownership and lifetime of the decoder are confined to a single read path, so there are no obvious state or concurrency hazards.Also applies to: 213-251, 357-397, 400-470, 473-527
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
internal/common/config/loader_windows.go (1)
1-83: Windows code-page→charset mapping looks good; consider tests and documenting fallback behaviorThe Windows implementation is clear: you read the system code page once and map it through a focused switch, with a sensible UTF‑8 fallback for anything unknown. That should give good defaults for the vast majority of real-world locales.
As non-blocking follow-ups, you might:
- Add a small table-driven unit test for
codePageToCharset(a few known pages plus an unknown) to prevent regressions.- Consider a brief comment or doc note about the fallback-to-UTF‑8 behavior so it’s explicit how truly unknown/legacy code pages will be treated.
internal/common/config/loader_unix.go (1)
1-131: Unix locale→charset detection is solid; minor hardening could be added laterThe Unix-side logic is well-structured: LC_ALL/LANG precedence is correct, explicit encodings are parsed and normalized sensibly, and you fall back to UTF‑8 (via
inferEncodingFromLocale) in ambiguous or modern-default cases. The normalization table covers the usual suspects (UTF‑8 variants, EUC/Shift‑JIS, GBK/Big5, ISO‑8859, KOI8).If you want a bit more robustness later (non-blocking):
strings.TrimSpaceon the initial locale/encoding values would make the parsing more tolerant of stray whitespace.- A few focused unit tests for
normalizeUnixEncodingandgetDefaultLogEncodingCharset(e.g.,ja_JP.UTF-8,en_US.ISO-8859-1,C,POSIX) would lock in the behavior across platforms.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
internal/common/config/loader.go(1 hunks)internal/common/config/loader_unix.go(1 hunks)internal/common/config/loader_windows.go(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test on ubuntu-latest
🔇 Additional comments (1)
internal/common/config/loader.go (1)
548-552: Locale-aware default forui.logEncodingCharsetis wired correctlySwitching the default from a hard-coded
"utf-8"togetDefaultLogEncodingCharset()cleanly delegates charset choice to the new OS-specific helpers while still allowing config/env overrides through viper. The change is minimal, self-contained, and consistent with the rest of the loader flow.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/common/fileutil/logutil.go (1)
363-403: bufio.Scanner's default 64KB token limit requires explicit handlingWhile
transform.NewReaderwrapping the file is the correct pattern, usingbufio.Scannerwithout callingscanner.Buffer()to configure the token size may fail on log files with lines exceeding ~64KB. Either callscanner.Buffer(make([]byte, initialSize), maxTokenSize)before the scan loop to set an appropriate maximum, or usebufio.Reader.ReadString('\n')instead for more robust handling of arbitrarily long lines. Thebufio.ErrTooLongerror handling is present but only catches the failure after it occurs; it doesn't prevent the limitation.
🧹 Nitpick comments (2)
internal/common/fileutil/logutil.go (2)
479-534: Consider consistent handling of very long lines in range reads
readLinesRangenow correctly wraps the reader withtransform.NewReaderwhen a decoder is present, matching the other helpers. One small inconsistency:
readFirstLinesandreadLastLinesspecial‑casebufio.ErrTooLongand return a friendly"Error: Line too long to read."message.readLinesRangestill propagatesscanner.Err()directly, which will surfacebufio.ErrTooLongas a raw error to callers of the range API.If API symmetry matters for your callers, consider matching the head/tail behavior here:
- if err := scanner.Err(); err != nil { - return nil, err - } + if err := scanner.Err(); err != nil { + if errors.Is(err, bufio.ErrTooLong) { + return &LogResult{Lines: []string{"Error: Line too long to read."}}, nil + } + return nil, err + }
40-176: Consider usingianaindexto reduce charset alias maintenance and improve unknown-encoding transparencyThe manual alias table is comprehensive and correct, but long-term maintainability can be improved. Two points:
Silent fallback for unknown charsets: Treating unrecognized charset names as UTF-8 without logging creates silent data loss. For troubleshooting garbled logs, either log a warning when charset is not recognized or optionally return an error. This surfaces the problem instead of masking it.
Charset alias maintenance: The manual switch handles ~100+ aliases. Consider leveraging
golang.org/x/text/encoding/ianaindex.MIME(or.IANA) instead. It provides case-insensitive lookup of IANA-registered charset names and returnsencoding.Encodingobjects compatible with your.NewDecoder()pattern. This eliminates bookkeeping and ensures consistency with IANA standards. Note:ianaindex.Encoding()returns(nil, nil)for valid but unsupported charsets (distinct from unknown names), so you'll need explicit handling for that case when adopting it.These are non-blocking suggestions; the current implementation is functionally correct.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal/common/fileutil/logutil.go(14 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
internal/common/fileutil/logutil.go (1)
api/v2/api.gen.go (3)
Head(946-946)Tail(967-967)Offset(952-952)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test on ubuntu-latest
🔇 Additional comments (3)
internal/common/fileutil/logutil.go (3)
10-29: Struct and import changes look goodThe added imports and the
Encodingfield onLogReadOptionsare straightforward and keep defaults backward compatible (empty string → UTF-8 path via nil decoder). No issues here.
202-251: Decoder plumbing through ReadLogLines and helpers looks correctThreading
decoder := getEncodingDecoder(options.Encoding)through the head/tail/range helpers is clean and keeps UTF‑8 behavior unchanged (nil decoder). The control flow forHead,Tail, andOffset/Limitis preserved, andIsEstimateis still set only at the top level. No functional issues spotted in this wiring.
406-476: readLastLines decoder usage is correct; control flow preservedThe decoder integration mirrors
readFirstLines: you correctly reuse the samedecoderwhen delegating toreadFirstLinesforn >= totalLinesand otherwise wrap the file viatransform.NewReader. The ring‑buffer logic is unchanged and still safe under both decoded and undecoded paths. No correctness issues found.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
internal/common/fileutil/logutil.go (1)
315-319: Address the past review comment: mark zero-sample result as an estimate.This issue was previously flagged: when
totalSampleLines == 0, the function returns(1, false, nil), marking the result as exact even though it's a heuristic guess. TheIsEstimateflag should betrueto indicate that the line count is estimated.Apply this diff:
if totalSampleLines == 0 { // Avoid division by zero; assume 1 line - return 1, false, nil + return 1, true, nil }
🧹 Nitpick comments (2)
internal/common/fileutil/logutil.go (2)
172-175: Consider logging or returning an error for unknown encodings.Returning
nilfor unknown encodings silently treats them as UTF-8, which may mask configuration errors or typos in encoding names. Consider logging a warning to help users detect misconfigurations.
12-19: golang.org/x/text v0.29.0 is safe but not the latest version.The project uses v0.29.0, which is free from known vulnerabilities. However, v0.31.0 is the current latest release (as of December 2025). While the current version is reasonably recent and secure, consider updating to v0.31.0 if newer features or performance improvements are beneficial.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
internal/common/config/loader_unix.go(1 hunks)internal/common/config/loader_windows.go(1 hunks)internal/common/fileutil/logutil.go(15 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- internal/common/config/loader_unix.go
- internal/common/config/loader_windows.go
🧰 Additional context used
🧬 Code graph analysis (1)
internal/common/fileutil/logutil.go (1)
api/v2/api.gen.go (3)
Head(946-946)Tail(967-967)Offset(952-952)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test on ubuntu-latest
🔇 Additional comments (6)
internal/common/fileutil/logutil.go (6)
10-19: LGTM! Imports are appropriate for encoding support.The added imports support the new character encoding functionality across multiple language families (Japanese, Chinese, Korean, Latin, etc.).
24-29: LGTM! Clear field addition with helpful documentation.The
Encodingfield and its comment appropriately document the supported character encodings.
363-403: LGTM! Decoder integration is correct.The function properly wraps the file reader with
transform.NewReaderwhen a decoder is provided, enabling transparent character set conversion while reading.
406-476: LGTM! Decoder integration preserves ring-buffer logic.The function correctly integrates the decoder while maintaining the existing ring-buffer approach for efficiently reading the last n lines.
479-534: LGTM! Consistent decoder integration.The range-reading function follows the same decoder-wrapping pattern as the other helpers, ensuring consistent behavior across all reading modes.
544-544: LGTM! Usingstrings.Joinis more efficient.This is a better approach than string concatenation in a loop.
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
internal/common/fileutil/logutil.go (1)
67-68: GB2312 decoder mapping is still incorrect.This critical issue was previously identified and marked as addressed, but the code still contains the incorrect mapping. GB2312 is an 8-bit encoding, but the code maps it to
HZGB2312.NewDecoder(), which decodes HZ-GB-2312 (a 7-bit ASCII-safe encoding). This will cause incorrect decoding of GB2312-encoded log files.Apply this diff to fix the mapping:
// Simplified Chinese encodings case "gb2312", "gb-2312", "csgb2312": - return simplifiedchinese.HZGB2312.NewDecoder() + return simplifiedchinese.GBK.NewDecoder()Note: GBK is a superset of GB2312 and correctly decodes GB2312-encoded text.
🧹 Nitpick comments (2)
internal/common/config/loader_unix.go (1)
14-22: Consider checkingLC_CTYPEfor complete POSIX precedence.The standard POSIX locale precedence is
LC_ALL > LC_CTYPE > LANG. SinceLC_CTYPEspecifically controls character encoding behavior, it may be worth including it for completeness:func getDefaultLogEncodingCharset() string { // Check LC_ALL first, then LANG (standard precedence) locale := os.Getenv("LC_ALL") if locale == "" { + locale = os.Getenv("LC_CTYPE") + } + if locale == "" { locale = os.Getenv("LANG") }internal/common/fileutil/logutil.go (1)
167-170: Consider consistent BOM handling for explicitly-specified UTF-16 endianness.When UTF-16BE is explicitly specified, the code uses
IgnoreBOM, but UTF-16LE usesUseBOM. For consistency, consider usingIgnoreBOMfor UTF-16LE as well, since the endianness is already known:// Unicode variants -case "utf-16", "utf16", "utf-16le", "utf16le": - return unicode.UTF16(unicode.LittleEndian, unicode.UseBOM).NewDecoder() +case "utf-16", "utf16": + return unicode.UTF16(unicode.LittleEndian, unicode.UseBOM).NewDecoder() +case "utf-16le", "utf16le": + return unicode.UTF16(unicode.LittleEndian, unicode.IgnoreBOM).NewDecoder() case "utf-16be", "utf16be": return unicode.UTF16(unicode.BigEndian, unicode.IgnoreBOM).NewDecoder()This makes the handling symmetric: when endianness is explicit (LE or BE), ignore the BOM; when unspecified, use the BOM to determine endianness.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
internal/common/config/loader_unix.go(1 hunks)internal/common/config/loader_windows.go(1 hunks)internal/common/fileutil/logutil.go(16 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
internal/common/fileutil/logutil.go (1)
api/v2/api.gen.go (3)
Head(946-946)Tail(967-967)Offset(952-952)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test on ubuntu-latest
🔇 Additional comments (13)
internal/common/config/loader_windows.go (2)
15-18: LGTM!Clean delegation to
codePageToCharsetwith the system ANSI code page.
22-51: LGTM!The code page mapping logic is well-structured. The early return for UTF-8 (65001) avoids unnecessary IANA lookup, and the default case correctly handles Windows code pages using the standard "windows-XXX" format that IANA recognizes.
internal/common/config/loader_unix.go (1)
43-53: LGTM!Clean normalization helper that correctly uses the IANA index for canonical charset name resolution.
internal/common/fileutil/logutil.go (10)
10-19: LGTM!The imports are appropriate for the encoding support functionality. All packages are used within the file.
23-29: LGTM!The addition of the
Encodingfield is well-documented with clear examples. The enhanced comments for other fields improve usability.
202-203: LGTM!The decoder is obtained once and efficiently passed through to all helper functions. The integration is clean and consistent across all code paths.
Also applies to: 213-213, 223-223, 237-237, 250-250
315-319: LGTM!The division-by-zero guard is correctly implemented. Returning
IsEstimate: trueproperly indicates that the single-line count is a heuristic for files without newlines in the samples.
363-363: LGTM!The decoder integration in
readFirstLinesfollows the correct pattern: conditionally wrapping the file reader with a transform.NewReader when a decoder is provided. The implementation is clean and maintains backward compatibility.Also applies to: 372-378
406-406: LGTM!The decoder integration in
readLastLinesis consistent withreadFirstLines. The delegation toreadFirstLinescorrectly passes the decoder, and the transform wrapping follows the same pattern.Also applies to: 428-435, 439-439
479-479: LGTM!The decoder integration in
readLinesRangemaintains the same consistent pattern as the other helper functions. All three helpers now uniformly support encoding transformation.Also applies to: 502-508
547-547: LGTM!Using
strings.Joinis more efficient than concatenating strings in a loop. Good optimization.
49-51: LGTM!The charset normalization logic properly handles common variations (case, underscores, spaces), allowing flexible specification while maintaining consistency.
172-175: LGTM!Returning
nilfor unknown encodings (treating them as UTF-8) is a pragmatic default. This provides graceful degradation rather than failing hard, which is appropriate for a log reading utility.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (4)
internal/common/fileutil/logutil.go (4)
24-38: Encoding option inLogReadOptions/LogResultlooks reasonable; consider documenting accepted labelsThe added
Encodingstring onLogReadOptionsplus the existingIsEstimateflag onLogResultgive callers enough control and feedback. To avoid confusion when users pass OS-specific names (e.g., Windows code pages), it would help to clearly document which charset labels are supported and that""/UTF‑8 use the “no decoding” fast path.
365-405: Decoder plumbing throughreadFirstLines/readLastLines/readLinesRangeis consistent; consider scanner buffer tuningPassing a single
*encoding.DecoderfromReadLogLinesinto all three helpers and wrapping viatransform.NewReaderkeeps the behavior uniform for non‑UTF‑8 encodings, and error handling forbufio.ErrTooLongis preserved.One possible follow‑up: if you expect very long log lines (especially in multibyte encodings like UTF‑16 or Shift‑JIS), you may want to raise the scanner’s max token size to avoid tripping the “Line too long to read.” path:
- scanner := bufio.NewScanner(reader) + scanner := bufio.NewScanner(reader) + // Optionally allow longer log lines, e.g. up to 1MB. + scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)You could factor that into a small helper to keep the three functions aligned.
Also applies to: 408-478, 481-539
204-205: Decoder integration inReadLogLinesis straightforward; consider whether unknown encodings should be surfacedUsing
getEncodingDecoder(options.Encoding)once inReadLogLinesand threading thedecoderinto all code paths (head, tail, range, default range) is clean and minimizes repeated charset lookups. The only behavioral question is whether silently treating unknown encodings as UTF‑8 (vianildecoder) is what you want operationally; if not, this is the natural place to fail fast oncegetEncodingDecoderstarts returning an error.Also applies to: 215-216, 225-226, 239-240, 252-253
549-550:ReadLogContentjoin is fine; verify trailing newline semantics vs previous behaviorSwitching to
strings.Join(result.Lines, "\n")is clearer and avoids manual string concatenation. It does, however, fix the exact placement of newlines to “between lines only”. If the old implementation happened to append a trailing newline after the last line, this would be a subtle behavior change. Please double‑check any callers/tests that might rely on a final newline when returning full‑file content.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal/common/fileutil/logutil.go(16 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test on ubuntu-latest
🔇 Additional comments (1)
internal/common/fileutil/logutil.go (1)
317-321: Zero-sample guard inestimateLineCountcorrectly avoids divide-by-zero and marks result as heuristicThe new
totalSampleLines == 0check withreturn 1, true, nilfixes the divide‑by‑zero case for large files with no\nin the sampled regions and correctly marks the count as an estimate. This aligns with the earlier review feedback without changing behavior for normal files.
| // getEncodingDecoder returns an encoding.Decoder for the given charset name. | ||
| // Returns nil for UTF-8 or empty charset (no decoding needed). | ||
| // Supports a wide range of encodings including Japanese, Chinese, Korean, | ||
| // and various ISO-8859 and Windows code pages. | ||
| func getEncodingDecoder(charset string) *encoding.Decoder { | ||
| if charset == "" { | ||
| return nil | ||
| } | ||
|
|
||
| // Normalize the charset name for comparison | ||
| normalized := strings.ToLower(strings.ReplaceAll(charset, "_", "-")) | ||
| normalized = strings.ReplaceAll(normalized, " ", "-") | ||
|
|
||
| switch normalized { | ||
| // UTF-8 (no decoder needed) | ||
| case "utf-8", "utf8": | ||
| return nil | ||
|
|
||
| // Japanese encodings | ||
| case "euc-jp", "eucjp": | ||
| return japanese.EUCJP.NewDecoder() | ||
| case "shift-jis", "shiftjis", "sjis", "s-jis", "x-sjis", "ms-kanji", "csshiftjis": | ||
| return japanese.ShiftJIS.NewDecoder() | ||
| case "iso-2022-jp", "iso2022jp", "csiso2022jp": | ||
| return japanese.ISO2022JP.NewDecoder() | ||
|
|
||
| // Simplified Chinese encodings | ||
| case "gb2312", "gb-2312", "csgb2312": | ||
| return simplifiedchinese.GBK.NewDecoder() | ||
| case "gbk", "cp936", "ms936", "windows-936": | ||
| return simplifiedchinese.GBK.NewDecoder() | ||
| case "gb18030": | ||
| return simplifiedchinese.GB18030.NewDecoder() | ||
| case "hz-gb-2312", "hz": | ||
| return simplifiedchinese.HZGB2312.NewDecoder() | ||
|
|
||
| // Traditional Chinese encodings | ||
| case "big5", "big-5", "csbig5", "x-x-big5", "cn-big5": | ||
| return traditionalchinese.Big5.NewDecoder() | ||
|
|
||
| // Korean encodings | ||
| case "euc-kr", "euckr", "cseuckr", "ks-c-5601-1987", "ksc5601", "iso-ir-149", "korean": | ||
| return korean.EUCKR.NewDecoder() | ||
|
|
||
| // ISO-8859 encodings (Latin character sets) | ||
| case "iso-8859-1", "iso88591", "latin1", "latin-1", "l1", "csisolatin1", "iso-ir-100", "ibm819", "cp819": | ||
| return charmap.ISO8859_1.NewDecoder() | ||
| case "iso-8859-2", "iso88592", "latin2", "latin-2", "l2", "csisolatin2", "iso-ir-101": | ||
| return charmap.ISO8859_2.NewDecoder() | ||
| case "iso-8859-3", "iso88593", "latin3", "latin-3", "l3", "csisolatin3", "iso-ir-109": | ||
| return charmap.ISO8859_3.NewDecoder() | ||
| case "iso-8859-4", "iso88594", "latin4", "latin-4", "l4", "csisolatin4", "iso-ir-110": | ||
| return charmap.ISO8859_4.NewDecoder() | ||
| case "iso-8859-5", "iso88595", "cyrillic", "iso-ir-144", "csisolatincyrillic": | ||
| return charmap.ISO8859_5.NewDecoder() | ||
| case "iso-8859-6", "iso88596", "arabic", "iso-ir-127", "csisolatinarabic", "ecma-114", "asmo-708": | ||
| return charmap.ISO8859_6.NewDecoder() | ||
| case "iso-8859-7", "iso88597", "greek", "greek8", "iso-ir-126", "csisolatingreek", "ecma-118", "elot-928": | ||
| return charmap.ISO8859_7.NewDecoder() | ||
| case "iso-8859-8", "iso88598", "hebrew", "iso-ir-138", "csisolatinhebrew": | ||
| return charmap.ISO8859_8.NewDecoder() | ||
| case "iso-8859-9", "iso88599", "latin5", "latin-5", "l5", "iso-ir-148", "csisolatin5", "turkish": | ||
| return charmap.ISO8859_9.NewDecoder() | ||
| case "iso-8859-10", "iso885910", "latin6", "latin-6", "l6", "iso-ir-157", "csisolatin6": | ||
| return charmap.ISO8859_10.NewDecoder() | ||
| case "iso-8859-13", "iso885913", "latin7", "latin-7", "l7": | ||
| return charmap.ISO8859_13.NewDecoder() | ||
| case "iso-8859-14", "iso885914", "latin8", "latin-8", "l8", "iso-ir-199", "iso-celtic": | ||
| return charmap.ISO8859_14.NewDecoder() | ||
| case "iso-8859-15", "iso885915", "latin9", "latin-9", "l9", "latin0": | ||
| return charmap.ISO8859_15.NewDecoder() | ||
| case "iso-8859-16", "iso885916", "latin10", "latin-10", "l10", "iso-ir-226": | ||
| return charmap.ISO8859_16.NewDecoder() | ||
|
|
||
| // Windows code pages | ||
| case "windows-1250", "cp1250", "x-cp1250": | ||
| return charmap.Windows1250.NewDecoder() | ||
| case "windows-1251", "cp1251", "x-cp1251": | ||
| return charmap.Windows1251.NewDecoder() | ||
| case "windows-1252", "cp1252", "x-cp1252", "ansi": | ||
| return charmap.Windows1252.NewDecoder() | ||
| case "windows-1253", "cp1253", "x-cp1253": | ||
| return charmap.Windows1253.NewDecoder() | ||
| case "windows-1254", "cp1254", "x-cp1254": | ||
| return charmap.Windows1254.NewDecoder() | ||
| case "windows-1255", "cp1255", "x-cp1255": | ||
| return charmap.Windows1255.NewDecoder() | ||
| case "windows-1256", "cp1256", "x-cp1256": | ||
| return charmap.Windows1256.NewDecoder() | ||
| case "windows-1257", "cp1257", "x-cp1257": | ||
| return charmap.Windows1257.NewDecoder() | ||
| case "windows-1258", "cp1258", "x-cp1258": | ||
| return charmap.Windows1258.NewDecoder() | ||
|
|
||
| // Cyrillic encodings | ||
| case "koi8-r", "koi8r", "cskoi8r": | ||
| return charmap.KOI8R.NewDecoder() | ||
| case "koi8-u", "koi8u": | ||
| return charmap.KOI8U.NewDecoder() | ||
|
|
||
| // IBM code pages | ||
| case "ibm437", "cp437", "437", "cspc8codepage437": | ||
| return charmap.CodePage437.NewDecoder() | ||
| case "ibm850", "cp850", "850", "cspc850multilingual": | ||
| return charmap.CodePage850.NewDecoder() | ||
| case "ibm852", "cp852", "852": | ||
| return charmap.CodePage852.NewDecoder() | ||
| case "ibm855", "cp855", "855": | ||
| return charmap.CodePage855.NewDecoder() | ||
| case "ibm858", "cp858", "858": | ||
| return charmap.CodePage858.NewDecoder() | ||
| case "ibm860", "cp860", "860": | ||
| return charmap.CodePage860.NewDecoder() | ||
| case "ibm862", "cp862", "862": | ||
| return charmap.CodePage862.NewDecoder() | ||
| case "ibm863", "cp863", "863": | ||
| return charmap.CodePage863.NewDecoder() | ||
| case "ibm865", "cp865", "865": | ||
| return charmap.CodePage865.NewDecoder() | ||
| case "ibm866", "cp866", "866", "csibm866": | ||
| return charmap.CodePage866.NewDecoder() | ||
|
|
||
| // Mac encodings | ||
| case "macintosh", "mac", "macroman", "csmacintosh", "x-mac-roman": | ||
| return charmap.Macintosh.NewDecoder() | ||
|
|
||
| // Unicode variants | ||
| case "utf-16", "utf16": | ||
| return unicode.UTF16(unicode.LittleEndian, unicode.UseBOM).NewDecoder() | ||
| case "utf-16le", "utf16le": | ||
| return unicode.UTF16(unicode.LittleEndian, unicode.IgnoreBOM).NewDecoder() | ||
| case "utf-16be", "utf16be": | ||
| return unicode.UTF16(unicode.BigEndian, unicode.IgnoreBOM).NewDecoder() | ||
|
|
||
| default: | ||
| // Unknown encoding, return nil (treat as UTF-8) | ||
| return nil | ||
| } | ||
| } |
There was a problem hiding this comment.
Encoding decoder table is solid; add a few common aliases and consider handling unknown encodings explicitly
The normalization and breadth of aliases here are good, and mapping GB2312 → simplifiedchinese.GBK is the right call given GBK is a superset. Two small refinements:
- You’re missing some very common Windows aliases, especially for Shift‑JIS. For example,
cp932is how Windows typically labels Shift‑JIS. Right now that will fall through to the default and be treated as UTF‑8, producing mojibake. Suggest at least:
- case "shift-jis", "shiftjis", "sjis", "s-jis", "x-sjis", "ms-kanji", "csshiftjis":
+ case "shift-jis", "shiftjis", "sjis", "s-jis", "x-sjis", "ms-kanji", "csshiftjis", "cp932", "windows-932":
return japanese.ShiftJIS.NewDecoder()- For unknown charset names you currently return
niland silently treat the file as UTF‑8. That’s safe but can hide configuration/typing errors. Consider optionally surfacing this as an error (e.g., by changinggetEncodingDecoderto return(*encoding.Decoder, error)and propagating a clear “unsupported encoding” message), or at least logging a warning path so operators can see why their logs are garbled.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1439 +/- ##
==========================================
- Coverage 60.13% 59.96% -0.18%
==========================================
Files 176 187 +11
Lines 20756 21022 +266
==========================================
+ Hits 12482 12605 +123
- Misses 6975 7104 +129
- Partials 1299 1313 +14
... and 1 file with indirect coverage changes Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
Summary by CodeRabbit
New Features
Refactor
Tests
Chore
✏️ Tip: You can customize this high-level summary in your review settings.