Skip to content

feat: add DuplicateValuesValidator to check for duplicate values in translation files#15

Merged
konradmichalik merged 3 commits intomainfrom
duplicate-values
Jul 3, 2025
Merged

feat: add DuplicateValuesValidator to check for duplicate values in translation files#15
konradmichalik merged 3 commits intomainfrom
duplicate-values

Conversation

@konradmichalik
Copy link
Copy Markdown
Contributor

@konradmichalik konradmichalik commented Jul 2, 2025

Summary by CodeRabbit

  • New Features
    • Introduced a validator that detects duplicate translation values assigned to different keys within the same file, reporting these as warnings.
  • Bug Fixes
    • Improved message display logic to distinguish between warnings and errors based on result type and dry run status.
  • Tests
    • Added comprehensive tests for the new duplicate values validator.
    • Updated existing tests and fixtures to cover new validation logic and ensure accurate validator registration.
  • Chores
    • Enhanced test output configuration for better visibility of warnings.
  • Documentation
    • Updated README to clarify supported translation formats with framework context.
    • Added details on validator severity levels and introduced the new duplicate values validator in documentation.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Jul 2, 2025

Walkthrough

A new DuplicateValuesValidator was added to detect duplicate translation values across keys in translation files, with corresponding logic integrated into the validator registry and CLI renderer. Test fixtures and unit tests were updated and expanded to support and verify this validator's functionality. Minor configuration and output logic adjustments were also made.

Changes

File(s) Change Summary
src/Validator/DuplicateValuesValidator.php Added new DuplicateValuesValidator class for detecting duplicate translation values.
src/Validator/ValidatorRegistry.php Registered DuplicateValuesValidator in the available validators list.
src/Result/CliRenderer.php Modified message type logic to show warnings if dryRun is true or result type is warning.
phpunit.xml Enabled detailed output for tests that trigger warnings.
tests/composer.json Updated script to use DuplicateValuesValidator instead of DuplicatesValidator.
tests/src/Fixtures/translations/xliff/fail/de.locallang.xlf Added two translation units with empty sources and duplicate targets for testing.
tests/src/Fixtures/translations/xliff/fail/locallang.xlf Added a translation unit with a duplicate source text for testing.
tests/src/Validator/DuplicateValuesValidatorTest.php Introduced comprehensive tests for DuplicateValuesValidator.
tests/src/Validator/ValidatorRegistryTest.php Updated validator count assertion to reflect the new validator.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant ValidatorRegistry
    participant DuplicateValuesValidator
    participant Parser

    User->>CLI: Run validation command
    CLI->>ValidatorRegistry: getAvailableValidators()
    ValidatorRegistry-->>CLI: [DuplicateValuesValidator, ...]
    CLI->>DuplicateValuesValidator: processFile(Parser)
    DuplicateValuesValidator->>Parser: getTranslationKeys()
    Parser-->>DuplicateValuesValidator: keys/values
    DuplicateValuesValidator-->>CLI: Issues (if duplicates found)
    CLI->>DuplicateValuesValidator: postProcess()
    DuplicateValuesValidator-->>CLI: Aggregated duplicate issues
    CLI->>DuplicateValuesValidator: renderIssueSets()
    DuplicateValuesValidator-->>CLI: Formatted output
    CLI-->>User: Display results (warning/error)
Loading

Possibly related PRs

Poem

Hopping through code with a validator new,
Sniffing for doubles in translations' view.
Now warnings abound when values repeat,
And tests ensure the checks are neat.
With every hop, the code grows bright—
Duplicate words? Not on this rabbit's watchful night!
🐇✨


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (4)
tests/src/Validator/ValidatorRegistryTest.php (1)

22-22: Count update is correct, but consider adding explicit assertion.

The count correctly reflects the addition of DuplicateValuesValidator. However, consider adding an explicit assertion to verify its presence in the array for more comprehensive test coverage.

Add this assertion after line 21:

         $this->assertContains(SchemaValidator::class, $validators);
+        $this->assertContains(\MoveElevator\ComposerTranslationValidator\Validator\DuplicateValuesValidator::class, $validators);
         $this->assertCount(4, $validators);
src/Validator/DuplicateValuesValidator.php (2)

66-66: Remove unused parameter.

The $input parameter is not used in this method and should be removed to clean up the method signature.

-public function renderIssueSets(InputInterface $input, OutputInterface $output, array $issueSets): void
+public function renderIssueSets(InputInterface $input, OutputInterface $output, array $issueSets): void

Note: The parameter needs to remain for interface compliance with ValidatorInterface, but the static analysis tool is flagging it as unused.


46-49: Consider edge case for array_unique optimization.

The logic checks for count(array_unique($keys)) > 1 but this could be optimized. Since keys are added one by one, checking count($keys) > 1 after ensuring uniqueness would be more efficient.

-if (count(array_unique($keys)) > 1) {
-    $duplicates[$value] = array_unique($keys);
-}
+$uniqueKeys = array_unique($keys);
+if (count($uniqueKeys) > 1) {
+    $duplicates[$value] = $uniqueKeys;
+}
tests/src/Validator/DuplicateValuesValidatorTest.php (1)

208-213: Simplify string assertion.

The assertIsString check is redundant since the explain() method is guaranteed to return a string. Focus on testing the content instead.

 public function testExplain(): void
 {
     $validator = new DuplicateValuesValidator($this->loggerMock);
-    $this->assertIsString($validator->explain());
+    $explanation = $validator->explain();
+    $this->assertNotEmpty($explanation);
     $this->assertStringContainsString('duplicate values', $validator->explain());
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6309e46 and 06d1b95.

📒 Files selected for processing (9)
  • phpunit.xml (1 hunks)
  • src/Result/CliRenderer.php (1 hunks)
  • src/Validator/DuplicateValuesValidator.php (1 hunks)
  • src/Validator/ValidatorRegistry.php (1 hunks)
  • tests/composer.json (1 hunks)
  • tests/src/Fixtures/translations/xliff/fail/de.locallang.xlf (1 hunks)
  • tests/src/Fixtures/translations/xliff/fail/locallang.xlf (1 hunks)
  • tests/src/Validator/DuplicateValuesValidatorTest.php (1 hunks)
  • tests/src/Validator/ValidatorRegistryTest.php (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
src/Validator/ValidatorRegistry.php (1)
src/Validator/DuplicateValuesValidator.php (1)
  • DuplicateValuesValidator (16-114)
tests/src/Validator/DuplicateValuesValidatorTest.php (3)
src/Parser/XliffParser.php (1)
  • XliffParser (7-76)
src/Parser/YamlParser.php (1)
  • YamlParser (9-82)
src/Validator/DuplicateValuesValidator.php (7)
  • DuplicateValuesValidator (16-114)
  • processFile (24-40)
  • postProcess (42-58)
  • renderIssueSets (66-94)
  • explain (96-100)
  • supportsParser (105-108)
  • resultTypeOnValidationFailure (110-113)
🪛 PHPStan (2.1.15)
tests/src/Validator/DuplicateValuesValidatorTest.php

191-191: Parameter #3 $issueSets of method MoveElevator\ComposerTranslationValidator\Validator\DuplicateValuesValidator::renderIssueSets() expects array<string, array<int, array{file: string, issues: array<string, array<int, string>>}>>, array{array{file: 'file1.xlf', issues: array{valueA: array{'key1', 'key3'}}}, array{file: 'file2.xlf', issues: array{valueX: array{'keyA', 'keyB'}}}} given.
Array does not have offset 'file'.

(argument.type)


211-211: Call to method PHPUnit\Framework\Assert::assertIsString() with string will always evaluate to true.
Because the type is coming from a PHPDoc, you can turn off this check by setting treatPhpDocTypesAsCertain: false in your %configurationFile%.

(method.alreadyNarrowedType)

src/Validator/DuplicateValuesValidator.php

72-72: Offset 'file' does not exist on array<int, array{file: string, issues: array<string, array<int, string>>}>.

(offsetAccess.notFound)


75-75: Offset 'file' does not exist on array<int, array{file: string, issues: array<string, array<int, string>>}>.

(offsetAccess.notFound)


76-76: Offset 'issues' does not exist on array<int, array{file: string, issues: array<string, array<int, string>>}>.

(offsetAccess.notFound)

🪛 GitHub Check: cgl
tests/src/Validator/DuplicateValuesValidatorTest.php

[failure] 211-211:
Call to method PHPUnit\Framework\Assert::assertIsString() with string will always evaluate to true.


[failure] 191-191:
Parameter #3 $issueSets of method MoveElevator\ComposerTranslationValidator\Validator\DuplicateValuesValidator::renderIssueSets() expects array<string, array<int, array{file: string, issues: array<string, array<int, string>>}>>, array{array{file: 'file1.xlf', issues: array{valueA: array{'key1', 'key3'}}}, array{file: 'file2.xlf', issues: array{valueX: array{'keyA', 'keyB'}}}} given.

src/Validator/DuplicateValuesValidator.php

[failure] 76-76:
Offset 'issues' does not exist on array<int, array{file: string, issues: array<string, array<int, string>>}>.


[failure] 75-75:
Offset 'file' does not exist on array<int, array{file: string, issues: array<string, array<int, string>>}>.


[failure] 72-72:
Offset 'file' does not exist on array<int, array{file: string, issues: array<string, array<int, string>>}>.

🪛 PHPMD (2.15.0)
src/Validator/DuplicateValuesValidator.php

66-66: Avoid unused parameters such as '$input'. (Unused Code Rules)

(UnusedFormalParameter)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Code Coverage Report
🔇 Additional comments (10)
phpunit.xml (1)

4-4: Good configuration enhancement for warning-based validators.

This attribute will provide detailed output for tests that generate warnings, which aligns well with the new DuplicateValuesValidator that returns warnings on validation failures.

src/Validator/ValidatorRegistry.php (1)

17-17: Correctly registers the new validator.

The addition of DuplicateValuesValidator::class to the registry is appropriate and follows the existing pattern.

tests/composer.json (1)

32-32: Script correctly updated to use the new validator class name.

The command now properly references DuplicateValuesValidator instead of the previous class name, maintaining consistency with the implementation.

tests/src/Fixtures/translations/xliff/fail/de.locallang.xlf (1)

14-21: Well-designed test data for duplicate value detection.

The added translation units create appropriate test scenarios with duplicate values:

  • key4 duplicates the value from key3 ("German Key #3")
  • key5 duplicates the value from key1 ("German Key #1")

This provides good coverage for testing the DuplicateValuesValidator functionality.

tests/src/Fixtures/translations/xliff/fail/locallang.xlf (1)

12-14: Test fixture addition looks good.

The new trans-unit with id="key3" and source="English Key #1" correctly creates a duplicate value scenario for testing the DuplicateValuesValidator. This provides the necessary test data where the same translation value appears under different keys.

src/Result/CliRenderer.php (1)

39-39: Enhanced warning display logic looks good.

The updated conditional logic correctly handles both dry-run scenarios and validators that return ResultType::WARNING. This change properly supports the new DuplicateValuesValidator which returns warnings rather than errors on validation failure.

src/Validator/DuplicateValuesValidator.php (1)

110-113: Result type configuration is appropriate.

The choice to return ResultType::WARNING for duplicate values is sensible, as duplicates are typically a quality concern rather than a critical error that should block deployment.

tests/src/Validator/DuplicateValuesValidatorTest.php (3)

28-52: Excellent test coverage for duplicate detection.

The test correctly validates the internal state of the validator after processing files with duplicate values. Using reflection to access protected properties is appropriate for unit testing the validator's internal logic.


83-102: Good error handling test.

The test properly verifies that invalid files are handled gracefully with appropriate error logging and no state contamination.


104-144: Comprehensive postProcess testing.

The test effectively validates the duplicate detection logic by manually setting up the internal state and verifying the generated issues structure.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/Validator/DuplicateValuesValidator.php (1)

64-66: $input is unused

renderIssueSets() receives InputInterface $input but never touches it; PHPMD flags this.
If the argument is truly unnecessary, drop it; otherwise use it (e.g. to honour verbosity options).

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 06d1b95 and a442fdc.

📒 Files selected for processing (2)
  • src/Validator/DuplicateValuesValidator.php (1 hunks)
  • tests/src/Validator/DuplicateValuesValidatorTest.php (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/src/Validator/DuplicateValuesValidatorTest.php
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/Validator/DuplicateValuesValidator.php (3)
src/Parser/XliffParser.php (1)
  • XliffParser (7-76)
src/Parser/YamlParser.php (1)
  • YamlParser (9-82)
src/Validator/AbstractValidator.php (1)
  • AbstractValidator (11-91)
🪛 PHPMD (2.15.0)
src/Validator/DuplicateValuesValidator.php

66-66: Avoid unused parameters such as '$input'. (Unused Code Rules)

(UnusedFormalParameter)

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Report test coverage
  • GitHub Check: Test (PHP 8.1, Symfony 6.x, Composer 2.8)
🔇 Additional comments (1)
src/Validator/DuplicateValuesValidator.php (1)

34-40: 👍 Null-value safeguard implemented

The new null-check prevents nulls from becoming array keys—nice fix.

Comment on lines +21 to +24
/**
* @return array<string, int>
*/
public function processFile(ParserInterface $file): array
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Return-type PHPDoc is misleading

processFile() always returns an empty array, yet the docblock advertises array<string,int>.
Either update the PHPDoc to array<mixed> (or similar) or make the method actually return meaningful data to avoid confusing static analysers.

🤖 Prompt for AI Agents
In src/Validator/DuplicateValuesValidator.php around lines 21 to 24, the PHPDoc
for the processFile() method incorrectly states it returns array<string, int>
while the method always returns an empty array. To fix this, either update the
PHPDoc to reflect the actual return type such as array<mixed> or modify the
method to return meaningful data consistent with the documented type.

Comment on lines +113 to +116
public function resultTypeOnValidationFailure(): ResultType
{
return ResultType::WARNING;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Missing ResultType import breaks compilation

ResultType is referenced without a leading backslash or an import, so PHP will look for MoveElevator\ComposerTranslationValidator\Validator\ResultType, which does not exist.
Add the correct use-statement:

 use Symfony\Component\Console\Output\OutputInterface;
+use MoveElevator\ComposerTranslationValidator\Result\ResultType;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
public function resultTypeOnValidationFailure(): ResultType
{
return ResultType::WARNING;
}
use Symfony\Component\Console\Output\OutputInterface;
+use MoveElevator\ComposerTranslationValidator\Result\ResultType;
🤖 Prompt for AI Agents
In src/Validator/DuplicateValuesValidator.php around lines 113 to 116, the
ResultType class is used without being imported or fully qualified, causing a
compilation error. Add a use statement at the top of the file to import the
correct ResultType class so PHP can resolve it properly.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
README.md (1)

49-55: Polish wording & capitalization in the “Supported Formats” table

Minor copy-editing will tighten the descriptions and avoid acronym tautology flagged by LanguageTool:

-| Format | Description                                                                                                  | Framework | Example files                          |
-|--------|--------------------------------------------------------------------------------------------------------------|-----------|----------------------------------------|
-| XLIFF  | Supports source/target translations in xliff language files. | [TYPO3 CMS](https://typo3.org/)          | `locallang.xlf`, `de.locallang.xlf`    |
-| Yaml   | Supports yaml language files.                     | [Symfony Framework](https://symfony.com/)          | `messages.en.yaml`, `messages.de.yaml` |
+| Format | Description                                                                      | Framework | Example files                          |
+|--------|----------------------------------------------------------------------------------|-----------|----------------------------------------|
+| XLIFF  | Supports source/target translations in XLIFF files. | [TYPO3 CMS](https://typo3.org/)          | `locallang.xlf`, `de.locallang.xlf`    |
+| YAML   | Supports YAML files.                                   | [Symfony Framework](https://symfony.com/)          | `messages.en.yaml`, `messages.de.yaml` |

– Replace “xliff/yaml language files” with concise “XLIFF/YAML files”.
– Capitalise “YAML” consistently, mirroring “XLIFF”.
No functional impact, just clearer docs.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a442fdc and c89f8f4.

📒 Files selected for processing (1)
  • README.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
README.md

[style] ~54-~54: This phrase is redundant (‘L’ stands for ‘language’). Use simply “yaml”.
Context: ...locallang.xlf` | | Yaml | Supports yaml language files. | [Symfony F...

(ACRONYM_TAUTOLOGY)

🔇 Additional comments (1)
README.md (1)

60-65: Supports column verified and correct

I’ve checked each supportsParser() implementation and they match the README table:

  • DuplicateKeysValidator → XLIFF
  • DuplicateValuesValidator → XLIFF, Yaml
  • MismatchValidator → XLIFF, Yaml
  • SchemaValidator → XLIFF

No updates to the table are needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant