Skip to content

[duplicate-code] Duplicate Code Pattern: String deduplication/normalization repeated in 3 packages #2051

@github-actions

Description

@github-actions

Part of duplicate code analysis: #2049

Summary

The same string-deduplication algorithm (trim whitespace → skip empties → skip duplicates via map[string]struct{} → optionally sort) appears in three separate places. The pattern is structurally identical but is re-implemented inline each time with minor variations, making future changes (e.g. adding case-folding, changing the empty-skip rule) require three separate edits.

Duplication Details

Pattern: inline string-slice deduplication/normalization

  • Severity: Low
  • Occurrences: 3
  • Locations:
    • internal/cmd/flags_difc.goparseDIFCSinkServerIDs (~lines 94–128)
    • internal/difc/sink_server_ids.goSetSinkServerIDs (~lines 37–56)
    • internal/config/guard_policy.gonormalizeAndValidateScopeArray (~lines 325–350)

Shared structure in all three:

seen := make(map[string]struct{}, len(input))
result := make([]string, 0, len(input))
for _, item := range input {
    trimmed := strings.TrimSpace(item)
    if trimmed == "" { continue }
    if _, exists := seen[trimmed]; exists { /* skip/error */ continue }
    seen[trimmed] = struct{}{}
    result = append(result, trimmed)
}
// sort.Strings(result) — present in SetSinkServerIDs and normalizeAndValidateScopeArray

Additional concern — redundant double-deduplication

parseDIFCSinkServerIDs (in flags_difc.go) already deduplicates before its result is passed to SetSinkServerIDs (in sink_server_ids.go), which deduplicates again. This means every string in the slice is checked for duplicates twice in the same code path.

Impact Analysis

  • Maintainability: Three copies of the algorithm must be kept in sync if the deduplication rule changes.
  • Bug Risk: Behavioral differences have already crept in:
    • parseDIFCSinkServerIDs rejects strings containing whitespace after trimming — SetSinkServerIDs does not.
    • normalizeAndValidateScopeArray returns an error on duplicate — the others silently skip.
  • Code Bloat: ~45 lines of near-identical logic spread across three packages.

Refactoring Recommendations

  1. Add a deduplicateStrings helper to internal/strutil/ (package already exists with truncate.go):

    // DeduplicateStrings returns a new slice with whitespace-trimmed, empty, and
    // duplicate entries removed. When sorted=true the result is sorted.
    func DeduplicateStrings(input []string, sorted bool) []string {
        seen := make(map[string]struct{}, len(input))
        out := make([]string, 0, len(input))
        for _, s := range input {
            s = strings.TrimSpace(s)
            if s == "" { continue }
            if _, exists := seen[s]; exists { continue }
            seen[s] = struct{}{}
            out = append(out, s)
        }
        if sorted { sort.Strings(out) }
        return out
    }
    • Each call site replaces ~15 lines with one line.
    • normalizeAndValidateScopeArray can keep its duplicate-as-error path while delegating the common trim/empty logic.
  2. Eliminate the double-deduplication in the parseDIFCSinkServerIDsSetSinkServerIDs call chain: since the caller already guarantees a deduplicated slice, SetSinkServerIDs can trust its input and skip the second pass (or add a doc comment clarifying the contract).

Implementation Checklist

  • Review duplication findings
  • Add DeduplicateStrings helper in internal/strutil/
  • Update parseDIFCSinkServerIDs and SetSinkServerIDs to use the helper
  • Decide whether normalizeAndValidateScopeArray should use helper or keep error-on-duplicate behavior
  • Add unit tests for the new helper
  • Verify no behavioral change to existing call sites

Parent Issue

See parent analysis report: #2049
Related to #2049

Generated by Duplicate Code Detector ·

  • expires on Mar 24, 2026, 3:02 AM UTC

Metadata

Metadata

Assignees

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions