fix(bookdrop): clean bookdrop metadata before persisting by imnotjames · Pull Request #462 · grimmory-tools/grimmory

imnotjames · 2026-04-11T01:14:57Z

Description

updates the bookdrop initial metadata to truncate values - similar to the FileProcessor - so that data suggested in bookdrop metadata are of valid lengths

Linked Issue: Fixes #452

Changes

adds a cleanInitialMetadata method for truncating initial

Summary by CodeRabbit

Bug Fixes
- Metadata extraction now cleans and truncates long text and identifier fields (titles, descriptions, publishers, ISBNs, language, external IDs) and treats null extractor results gracefully so stored original metadata is consistent.
Tests
- Added and updated tests to verify metadata truncation and correct serialization behavior during initial metadata handling.

coderabbitai · 2026-04-11T01:15:06Z

📝 Walkthrough

Walkthrough

Introduced metadata sanitization in BookdropMetadataService: extracted BookMetadata is passed through a new cleaner that clones and truncates several text and identifier fields (title, subtitle, description, publisher, seriesName, language, ASIN/ISBNs/external IDs) before serialization/persistence.

Changes

Cohort / File(s)	Summary
Metadata Cleaning Logic `booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java`	Added `cleanInitialMetadata(BookMetadata)` that returns null for null input or a cloned `BookMetadata` with truncated text and identifier fields. Updated `extractInitialMetadata(...)` to use the cleaner; added static import for `FileService.truncate`.
Test Coverage `booklore-api/src/test/java/org/booklore/service/BookdropMetadataServiceTest.java`	Relaxed a mock to use a generic matcher in `attachInitialMetadata_shouldHandleNullCoverGracefully`. Added `attachInitialMetadata_shouldTruncateFields` to capture serialized metadata and assert truncation of `asin`, `isbn10`, `isbn13`, and `language` before serialization.

Sequence Diagram(s)

sequenceDiagram
    participant Extractor as MetadataExtractor
    participant Service as BookdropMetadataService
    participant Ser as ObjectMapper
    participant DB as Persistence

    Extractor->>Service: extractMetadata(file)
    Service->>Service: cleanInitialMetadata(extracted)  %% truncates fields
    Service->>Ser: writeValueAsString(cleanedMetadata)
    Service->>DB: attachInitialMetadata(cleanedMetadata)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I found a long ASIN in a heap,
I nibbled and trimmed so the DB can sleep.
Titles and codes now fit quite neat,
I hop away pleased on tiny feet —
Metadata tidy, my work complete.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description check	✅ Passed	The PR description includes both required template sections: a description explaining what the PR does and a linked issue reference (Fixes `#452`), though the Changes section is incomplete but present.
Linked Issues check	✅ Passed	The code changes directly address issue `#452` by implementing metadata truncation in BookdropMetadataService to prevent 'Data too long' database errors when persisting long ASIN and other metadata values.
Out of Scope Changes check	✅ Passed	All changes are focused on the bookdrop metadata cleaning functionality required by issue `#452`; both the production code and test updates are directly related to truncating metadata fields.
Title check	✅ Passed	The pull request title follows the conventional commit format with a type prefix (fix) and a clear, descriptive subject that accurately summarizes the main change of adding metadata cleaning/truncation before persistence.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

✨ Simplify code

Create PR with simplified code

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

imnotjames · 2026-04-11T01:15:19Z

Bookdrop metadata shows a UUID ASIN truncated:

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

booklore-api/src/test/java/org/booklore/service/BookdropMetadataServiceTest.java (1)
136-161: Consider extending truncation assertions to all cleaned ID fields

The new test is solid for ASIN/ISBN/language. Adding assertions for other cleaned external IDs (e.g., doubanId, audibleId) would better protect against metadata-loss regressions in the cleaner.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@booklore-api/src/test/java/org/booklore/service/BookdropMetadataServiceTest.java`
around lines 136 - 161, The test attachInitialMetadata_shouldTruncateFields
should also assert truncation/cleaning for other external ID fields to prevent
regressions: after capturing the BookMetadata via the existing
ArgumentCaptor<BookMetadata> (argument) and obtaining actual, add assertions for
fields like getDoubanId() and getAudibleId() (and any other cleaned ID getters
present on BookMetadata) to verify they match expected truncated/cleaned values
based on the input metadata used in metadataExtractorFactory.extractMetadata;
keep the same mocking flow (when(...).thenReturn(metadata)) and verify calls as
already implemented.
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java (1)
136-179: Reduce drift risk by centralizing truncation limits

The cleaner hardcodes many max lengths inline. Consider extracting named constants (or reusing a shared sanitizer aligned with FileProcessor) to keep schema-limit updates consistent.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java`
around lines 136 - 179, The builder in BookdropMetadataService uses many inline
numeric truncation limits (e.g., truncate(extracted.getTitle(), 1000),
truncate(..., 10), truncate(..., 5000)) which risks drift; refactor by
introducing descriptive constants (e.g., MAX_TITLE_LEN, MAX_SUBTITLE_LEN,
MAX_DESCRIPTION_LEN, MAX_ISBN_LEN, etc.) at the top of BookdropMetadataService
or by calling the shared sanitizer utilities used by FileProcessor, then replace
each numeric literal in the BookMetadata.builder() calls with the appropriate
constant or shared sanitizer method to centralize schema limits and keep them
consistent.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java`:
- Around line 154-163: cleanInitialMetadata drops doubanId and audibleId from
the cleaned object, losing extracted identifiers; update the builder chain in
cleanInitialMetadata to include .doubanId(truncate(extracted.getDoubanId(),
100)) and .audibleId(truncate(extracted.getAudibleId(), 100)) (matching the
existing pattern that uses truncate and extracted getters) so those fields are
preserved during metadata cleaning.

---

Nitpick comments:
In
`@booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java`:
- Around line 136-179: The builder in BookdropMetadataService uses many inline
numeric truncation limits (e.g., truncate(extracted.getTitle(), 1000),
truncate(..., 10), truncate(..., 5000)) which risks drift; refactor by
introducing descriptive constants (e.g., MAX_TITLE_LEN, MAX_SUBTITLE_LEN,
MAX_DESCRIPTION_LEN, MAX_ISBN_LEN, etc.) at the top of BookdropMetadataService
or by calling the shared sanitizer utilities used by FileProcessor, then replace
each numeric literal in the BookMetadata.builder() calls with the appropriate
constant or shared sanitizer method to centralize schema limits and keep them
consistent.

In
`@booklore-api/src/test/java/org/booklore/service/BookdropMetadataServiceTest.java`:
- Around line 136-161: The test attachInitialMetadata_shouldTruncateFields
should also assert truncation/cleaning for other external ID fields to prevent
regressions: after capturing the BookMetadata via the existing
ArgumentCaptor<BookMetadata> (argument) and obtaining actual, add assertions for
fields like getDoubanId() and getAudibleId() (and any other cleaned ID getters
present on BookMetadata) to verify they match expected truncated/cleaned values
based on the input metadata used in metadataExtractorFactory.extractMetadata;
keep the same mocking flow (when(...).thenReturn(metadata)) and verify calls as
already implemented.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5978bad1-1ffc-4cc2-b428-6c0314cc7b35

📥 Commits

Reviewing files that changed from the base of the PR and between 66dcc44 and c801156.

📒 Files selected for processing (2)

booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java
booklore-api/src/test/java/org/booklore/service/BookdropMetadataServiceTest.java

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Test Suite / Backend Tests

🧰 Additional context used

📓 Path-based instructions (2)

booklore-api/src/**/*.java

📄 CodeRabbit inference engine (AGENTS.md)

booklore-api/src/**/*.java: Use 4-space indentation in Java files and match surrounding Java style
Prefer constructor injection via Lombok patterns already used in the codebase. Do not introduce @Autowired field injection
Use MapStruct for entity/DTO mapping

Files:

booklore-api/src/test/java/org/booklore/service/BookdropMetadataServiceTest.java
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java

booklore-api/src/test/**/*.java

📄 CodeRabbit inference engine (AGENTS.md)

Prefer focused unit tests; use @SpringBootTest only when the Spring context is required

Files:

booklore-api/src/test/java/org/booklore/service/BookdropMetadataServiceTest.java

🧠 Learnings (3)

📓 Common learnings

Learnt from: pedronave
Repo: grimmory-tools/grimmory PR: 173
File: booklore-api/src/main/java/org/booklore/service/hardcover/HardcoverSyncService.java:188-199
Timestamp: 2026-03-24T18:46:47.249Z
Learning: In grimmory-tools/grimmory (HardcoverSyncService.java), when a `hardcoverBookId` is stored in book metadata but is malformed (non-numeric), the preferred behavior is to return null and skip the sync rather than falling back to ISBN-based lookup. The reasoning is that an explicitly set bookId represents deliberate user intent, and silently resolving via ISBN could sync progress to a different book than intended.

Learnt from: balazs-szucs
Repo: grimmory-tools/grimmory PR: 372
File: frontend/src/app/features/book/service/app-books-api.service.ts:150-178
Timestamp: 2026-04-04T14:03:28.295Z
Learning: In `frontend/src/app/features/book/service/app-books-api.service.ts`, the `summaryToBook` function intentionally uses `as unknown as Book` to cast a partial object to the `Book` type. The returned object omits some required `Book` fields (e.g. `metadata.bookId`, `pdfProgress.page`) and includes an extra `bookFiles: []` property not in the `Book` interface. This is by design — consumers of `AppBooksApiService.books()` (e.g. `BookCardComponent`) only access the subset of properties that are populated from `AppBookSummary`, so no runtime issues occur. Do not flag this as a type safety error.

📚 Learning: 2026-03-24T18:46:47.249Z

Learnt from: pedronave
Repo: grimmory-tools/grimmory PR: 173
File: booklore-api/src/main/java/org/booklore/service/hardcover/HardcoverSyncService.java:188-199
Timestamp: 2026-03-24T18:46:47.249Z
Learning: In grimmory-tools/grimmory (HardcoverSyncService.java), when a `hardcoverBookId` is stored in book metadata but is malformed (non-numeric), the preferred behavior is to return null and skip the sync rather than falling back to ISBN-based lookup. The reasoning is that an explicitly set bookId represents deliberate user intent, and silently resolving via ISBN could sync progress to a different book than intended.

Applied to files:

booklore-api/src/test/java/org/booklore/service/BookdropMetadataServiceTest.java
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java

📚 Learning: 2026-04-10T08:15:37.436Z

Learnt from: imnotjames
Repo: grimmory-tools/grimmory PR: 449
File: booklore-api/src/main/java/org/booklore/service/book/BookDownloadService.java:139-145
Timestamp: 2026-04-10T08:15:37.436Z
Learning: When using Spring `ContentDisposition.builder(...).filename(name, StandardCharsets.UTF_8).build()` (i.e., explicitly providing UTF-8), the resulting header value should include both the quoted `filename="=?UTF-8?..."` and the RFC 5987 `filename*=` parameters. In this case, any extra ASCII fallback computation (e.g., deriving an ASCII `fallbackFilename` via `NON_ASCII_PATTERN` and calling `.filename(fallbackFilename)`) is likely redundant—prefer calling only `.filename(fallbackName?, StandardCharsets.UTF_8)` as appropriate and let Spring handle the UTF-8 header parameters. Verify by comparing the emitted header for `filename` and `filename*` before deciding to keep an ASCII fallback.

Applied to files:

booklore-api/src/test/java/org/booklore/service/BookdropMetadataServiceTest.java
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java

🔇 Additional comments (2)

booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java (1)

186-186: Good sanitization hook point

Applying cleanup at Line 186 ensures originalMetadata is normalized before serialization/persistence.

booklore-api/src/test/java/org/booklore/service/BookdropMetadataServiceTest.java (1)

127-127: Nice improvement to mock stubbing robustness

Using any() here removes brittle coupling to a specific BookMetadata instance and keeps the test focused on behavior.

coderabbitai

🧹 Nitpick comments (1)

booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java (1)

136-165: Centralize max-length limits to avoid drift.

This block hardcodes many size limits. Consider moving them to shared constants (or a shared sanitizer) so Bookdrop and other ingestion paths stay consistent as schema/provider rules evolve.

Refactor sketch

+private static final int MAX_TITLE_LENGTH = 1000;
+private static final int MAX_SUBTITLE_LENGTH = 1000;
+private static final int MAX_DESCRIPTION_LENGTH = 5000;
+private static final int MAX_PUBLISHER_LENGTH = 1000;
+private static final int MAX_LANGUAGE_LENGTH = 10;
+private static final int MAX_ISBN10_LENGTH = 10;
+private static final int MAX_ISBN13_LENGTH = 13;
+private static final int MAX_ASIN_LENGTH = 10;
+private static final int MAX_EXTERNAL_ID_LENGTH = 100;
...
-                .title(truncate(extracted.getTitle(), 1000))
+                .title(truncate(extracted.getTitle(), MAX_TITLE_LENGTH))
...
-                .isbn10(truncate(extracted.getIsbn10(), 10))
+                .isbn10(truncate(extracted.getIsbn10(), MAX_ISBN10_LENGTH))

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java`
around lines 136 - 165, The builder in BookdropMetadataService that calls
truncate(...) with many hardcoded numeric limits (e.g., title, subtitle,
description, isbn10, isbn13, goodreadsId, etc.) should use centralized constants
so limits don't drift; create or use a shared constants holder (e.g.,
FieldLengthLimits or MetadataSanitizer.MAX_TITLE_LENGTH) and replace the literal
numbers in BookMetadata.builder(...) truncate calls with those constants, update
the truncate(...) signature if needed to accept the constant, and refactor other
ingestion paths to import the same constants so all providers use the same
max-length rules.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In
`@booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java`:
- Around line 136-165: The builder in BookdropMetadataService that calls
truncate(...) with many hardcoded numeric limits (e.g., title, subtitle,
description, isbn10, isbn13, goodreadsId, etc.) should use centralized constants
so limits don't drift; create or use a shared constants holder (e.g.,
FieldLengthLimits or MetadataSanitizer.MAX_TITLE_LENGTH) and replace the literal
numbers in BookMetadata.builder(...) truncate calls with those constants, update
the truncate(...) signature if needed to accept the constant, and refactor other
ingestion paths to import the same constants so all providers use the same
max-length rules.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 07826e1b-e19a-4171-9787-cd579dac9019

📥 Commits

Reviewing files that changed from the base of the PR and between c801156 and a84ae6e.

📒 Files selected for processing (1)

booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Test Suite / Frontend Tests
GitHub Check: Test Suite / Backend Tests
GitHub Check: Analyze (java-kotlin)

🧰 Additional context used

📓 Path-based instructions (1)

booklore-api/src/**/*.java

📄 CodeRabbit inference engine (AGENTS.md)

booklore-api/src/**/*.java: Use 4-space indentation in Java files and match surrounding Java style
Prefer constructor injection via Lombok patterns already used in the codebase. Do not introduce @Autowired field injection
Use MapStruct for entity/DTO mapping

Files:

booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java

🧠 Learnings (4)

📓 Common learnings

Learnt from: pedronave
Repo: grimmory-tools/grimmory PR: 173
File: booklore-api/src/main/java/org/booklore/service/hardcover/HardcoverSyncService.java:188-199
Timestamp: 2026-03-24T18:46:47.249Z
Learning: In grimmory-tools/grimmory (HardcoverSyncService.java), when a `hardcoverBookId` is stored in book metadata but is malformed (non-numeric), the preferred behavior is to return null and skip the sync rather than falling back to ISBN-based lookup. The reasoning is that an explicitly set bookId represents deliberate user intent, and silently resolving via ISBN could sync progress to a different book than intended.

📚 Learning: 2026-03-24T18:46:47.249Z

Learnt from: pedronave
Repo: grimmory-tools/grimmory PR: 173
File: booklore-api/src/main/java/org/booklore/service/hardcover/HardcoverSyncService.java:188-199
Timestamp: 2026-03-24T18:46:47.249Z
Learning: In grimmory-tools/grimmory (HardcoverSyncService.java), when a `hardcoverBookId` is stored in book metadata but is malformed (non-numeric), the preferred behavior is to return null and skip the sync rather than falling back to ISBN-based lookup. The reasoning is that an explicitly set bookId represents deliberate user intent, and silently resolving via ISBN could sync progress to a different book than intended.

Applied to files:

booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java

📚 Learning: 2026-04-04T15:36:56.558Z

Learnt from: balazs-szucs
Repo: grimmory-tools/grimmory PR: 372
File: booklore-api/src/main/java/org/booklore/app/service/AppBookService.java:357-359
Timestamp: 2026-04-04T15:36:56.558Z
Learning: In `booklore-api/src/main/java/org/booklore/app/service/AppBookService.java`, `shelfId` and `magicShelfId` are mutually exclusive navigation contexts in `getFilterOptions`. A request will never supply both at the same time, so the `else if (shelfId != null)` branch after the `magicBookIds != null` check is intentional and correct — there is no missing shelf-∩-magicShelf intersection case. Do not flag this pattern as a bug.

Applied to files:

booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java

📚 Learning: 2026-04-10T08:15:37.436Z

Learnt from: imnotjames
Repo: grimmory-tools/grimmory PR: 449
File: booklore-api/src/main/java/org/booklore/service/book/BookDownloadService.java:139-145
Timestamp: 2026-04-10T08:15:37.436Z
Learning: When using Spring `ContentDisposition.builder(...).filename(name, StandardCharsets.UTF_8).build()` (i.e., explicitly providing UTF-8), the resulting header value should include both the quoted `filename="=?UTF-8?..."` and the RFC 5987 `filename*=` parameters. In this case, any extra ASCII fallback computation (e.g., deriving an ASCII `fallbackFilename` via `NON_ASCII_PATTERN` and calling `.filename(fallbackFilename)`) is likely redundant—prefer calling only `.filename(fallbackName?, StandardCharsets.UTF_8)` as appropriate and let Spring handle the UTF-8 header parameters. Verify by comparing the emitted header for `filename` and `filename*` before deciding to keep an ASCII fallback.

Applied to files:

booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java

🔀 Multi-repo context grimmory-tools/grimmory-docs

[::grimmory-tools/grimmory-docs::] docs/audit-logs.md:123 — notes "Descriptions longer than 1024 characters are truncated." (indicates existing truncation policy for some metadata fields)

[::grimmory-tools/grimmory-docs::] docs/magic-shelf.md:134-136 — documents ISBN-13/ISBN-10 fields and names (isbn13, isbn10); line 205 — lists asin, isbn13, isbn10, language, and many external IDs as recognized metadata fields (shows these identifiers are first-class and surface in UI/filtering)

[::grimmory-tools/grimmory-docs::] docs/metadata/metadata-settings.md:62,75,134 — references provider identifiers including Amazon ASIN and related provider metadata (confirms ASIN is a configured provider identifier)

[::grimmory-tools/grimmory-docs::] docs/metadata/sidecar-files.md:27,41 — examples and mention of external identifiers (ASIN included) in sidecar files (implies persisted metadata flows into file-sidecar usage)

[::grimmory-tools/grimmory-docs::] docs/bookdrop.md & docs/bookdrop-advanced.md (multiple lines) — describe Bookdrop workflow, referencing ASIN and identifier extraction/features and placeholders including {ASIN} (confirms Bookdrop metadata is used in import/finalize flows and filename patterns)

Summary: Documentation shows ASIN, isbn10, isbn13, language and other external IDs are first-class metadata fields used across Bookdrop, sidecar files, filename patterns, and UI filters; there is at least one documented truncation policy (description → 1024 chars). This PR's truncation of initial Bookdrop metadata (ASIN/ISBN/language/etc.) aligns with existing concerns in docs and is relevant to import/finalize flows and sidecar/filename usage.

🔇 Additional comments (2)

booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java (2)

131-165: Good sanitization coverage for the issue path.

Lines 131–165 correctly null-guard and truncate the high-risk metadata fields before persistence, including ASIN/ISBN and external IDs, which matches the Bookdrop finalize failure scenario.

192-192: Nice boundary placement for cleaning.

Line 192 applies cleaning immediately after extraction, so persisted originalMetadata is consistently sanitized for downstream finalize/import flow.

coderabbitai

🧹 Nitpick comments (1)

booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java (1)

142-164: Centralize truncation limits to avoid drift across metadata flows.

The inline numeric limits are correct but scattered and easy to desync from other sanitization paths (e.g., FileProcessor/schema constraints). Prefer named constants (or a shared metadata sanitizer/constraints class) to keep behavior consistent long-term.

Proposed refactor

+    private static final int MAX_TITLE_LENGTH = 1000;
+    private static final int MAX_SUBTITLE_LENGTH = 1000;
+    private static final int MAX_DESCRIPTION_LENGTH = 5000;
+    private static final int MAX_PUBLISHER_LENGTH = 1000;
+    private static final int MAX_SERIES_NAME_LENGTH = 1000;
+    private static final int MAX_LANGUAGE_LENGTH = 10;
+    private static final int MAX_ISBN10_LENGTH = 10;
+    private static final int MAX_ISBN13_LENGTH = 13;
+    private static final int MAX_ASIN_LENGTH = 10;
+    private static final int MAX_AUDIBLE_ID_LENGTH = 10;
+    private static final int MAX_EXTERNAL_ID_LENGTH = 100;
...
-                .title(truncate(extracted.getTitle(), 1000))
-                .subtitle(truncate(extracted.getSubtitle(), 1000))
-                .description(truncate(extracted.getDescription(), 5000))
-                .publisher(truncate(extracted.getPublisher(), 1000))
+                .title(truncate(extracted.getTitle(), MAX_TITLE_LENGTH))
+                .subtitle(truncate(extracted.getSubtitle(), MAX_SUBTITLE_LENGTH))
+                .description(truncate(extracted.getDescription(), MAX_DESCRIPTION_LENGTH))
+                .publisher(truncate(extracted.getPublisher(), MAX_PUBLISHER_LENGTH))
...
-                .seriesName(truncate(extracted.getSeriesName(), 1000))
-                .language(truncate(extracted.getLanguage(), 10))
+                .seriesName(truncate(extracted.getSeriesName(), MAX_SERIES_NAME_LENGTH))
+                .language(truncate(extracted.getLanguage(), MAX_LANGUAGE_LENGTH))
...
-                .isbn10(truncate(extracted.getIsbn10(), 10))
-                .isbn13(truncate(extracted.getIsbn13(), 13))
+                .isbn10(truncate(extracted.getIsbn10(), MAX_ISBN10_LENGTH))
+                .isbn13(truncate(extracted.getIsbn13(), MAX_ISBN13_LENGTH))
...
-                .asin(truncate(extracted.getAsin(), 10))
-                .audibleId(truncate(extracted.getAudibleId(), 10))
-                .goodreadsId(truncate(extracted.getGoodreadsId(), 100))
+                .asin(truncate(extracted.getAsin(), MAX_ASIN_LENGTH))
+                .audibleId(truncate(extracted.getAudibleId(), MAX_AUDIBLE_ID_LENGTH))
+                .goodreadsId(truncate(extracted.getGoodreadsId(), MAX_EXTERNAL_ID_LENGTH))

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java`
around lines 142 - 164, Replace the scattered numeric literals used in
truncate(...) calls in BookdropMetadataService (e.g., title, subtitle,
description, publisher, seriesName, language, isbn10, isbn13, asin, audibleId,
goodreadsId, hardcoverId, hardcoverBookId, googleId, comicvineId,
lubimyczytacId, ranobedbId, doubanId) with shared named constants: either add
private static final constants in BookdropMetadataService or, preferably,
reference a new centralized MetadataConstraints (or MetadataSanitizer) class
that exposes descriptive constants like MAX_TITLE_LENGTH,
MAX_DESCRIPTION_LENGTH, MAX_ISBN10_LENGTH, MAX_EXTERNAL_ID_LENGTH,
MAX_LANGUAGE_LENGTH, etc.; update all truncate(...) calls to use those constants
so truncation limits are centralized and consistent with other sanitation/schema
flows (and add unit tests if present to validate the constant values).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In
`@booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java`:
- Around line 142-164: Replace the scattered numeric literals used in
truncate(...) calls in BookdropMetadataService (e.g., title, subtitle,
description, publisher, seriesName, language, isbn10, isbn13, asin, audibleId,
goodreadsId, hardcoverId, hardcoverBookId, googleId, comicvineId,
lubimyczytacId, ranobedbId, doubanId) with shared named constants: either add
private static final constants in BookdropMetadataService or, preferably,
reference a new centralized MetadataConstraints (or MetadataSanitizer) class
that exposes descriptive constants like MAX_TITLE_LENGTH,
MAX_DESCRIPTION_LENGTH, MAX_ISBN10_LENGTH, MAX_EXTERNAL_ID_LENGTH,
MAX_LANGUAGE_LENGTH, etc.; update all truncate(...) calls to use those constants
so truncation limits are centralized and consistent with other sanitation/schema
flows (and add unit tests if present to validate the constant values).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0e5ec8ea-8e95-4c0e-85a9-d87af2f019af

📥 Commits

Reviewing files that changed from the base of the PR and between a84ae6e and 837e058.

📒 Files selected for processing (1)

booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Analyze (java-kotlin)
GitHub Check: Test Suite / Frontend Tests
GitHub Check: Test Suite / Backend Tests

🧰 Additional context used

📓 Path-based instructions (1)

booklore-api/src/**/*.java

📄 CodeRabbit inference engine (AGENTS.md)

booklore-api/src/**/*.java: Use 4-space indentation in Java files and match surrounding Java style
Prefer constructor injection via Lombok patterns already used in the codebase. Do not introduce @Autowired field injection
Use MapStruct for entity/DTO mapping

Files:

booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java

🧠 Learnings (3)

📓 Common learnings

Learnt from: pedronave
Repo: grimmory-tools/grimmory PR: 173
File: booklore-api/src/main/java/org/booklore/service/hardcover/HardcoverSyncService.java:188-199
Timestamp: 2026-03-24T18:46:47.249Z
Learning: In grimmory-tools/grimmory (HardcoverSyncService.java), when a `hardcoverBookId` is stored in book metadata but is malformed (non-numeric), the preferred behavior is to return null and skip the sync rather than falling back to ISBN-based lookup. The reasoning is that an explicitly set bookId represents deliberate user intent, and silently resolving via ISBN could sync progress to a different book than intended.

📚 Learning: 2026-03-24T18:46:47.249Z

Learnt from: pedronave
Repo: grimmory-tools/grimmory PR: 173
File: booklore-api/src/main/java/org/booklore/service/hardcover/HardcoverSyncService.java:188-199
Timestamp: 2026-03-24T18:46:47.249Z
Learning: In grimmory-tools/grimmory (HardcoverSyncService.java), when a `hardcoverBookId` is stored in book metadata but is malformed (non-numeric), the preferred behavior is to return null and skip the sync rather than falling back to ISBN-based lookup. The reasoning is that an explicitly set bookId represents deliberate user intent, and silently resolving via ISBN could sync progress to a different book than intended.

Applied to files:

booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java

📚 Learning: 2026-04-10T08:15:37.436Z

Learnt from: imnotjames
Repo: grimmory-tools/grimmory PR: 449
File: booklore-api/src/main/java/org/booklore/service/book/BookDownloadService.java:139-145
Timestamp: 2026-04-10T08:15:37.436Z
Learning: When using Spring `ContentDisposition.builder(...).filename(name, StandardCharsets.UTF_8).build()` (i.e., explicitly providing UTF-8), the resulting header value should include both the quoted `filename="=?UTF-8?..."` and the RFC 5987 `filename*=` parameters. In this case, any extra ASCII fallback computation (e.g., deriving an ASCII `fallbackFilename` via `NON_ASCII_PATTERN` and calling `.filename(fallbackFilename)`) is likely redundant—prefer calling only `.filename(fallbackName?, StandardCharsets.UTF_8)` as appropriate and let Spring handle the UTF-8 header parameters. Verify by comparing the emitted header for `filename` and `filename*` before deciding to keep an ASCII fallback.

Applied to files:

booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java

🔀 Multi-repo context grimmory-tools/grimmory-docs

Linked repositories findings

grimmory-tools/grimmory-docs

docs/tools/audit-logs.md:123 — Documents that "Descriptions longer than 1024 characters are truncated." This shows existing truncation policies for metadata fields. [::grimmory-tools/grimmory-docs::docs/tools/audit-logs.md:123]
docs/metadata/metadata-settings.md:62,64,75,134 — Lists provider identifiers including Amazon ASIN and explicitly names Amazon ASIN among provider identifiers; shows ASIN is a first-class provider field exposed in settings. [::grimmory-tools/grimmory-docs::docs/metadata/metadata-settings.md:62,64,75,134]
docs/metadata/sidecar-files.md:27,41–42,53,57–70 — Sidecar files include external identifiers (ASIN, Goodreads ID, Google ID, etc.) and sidecar behavior is configurable; implying Bookdrop metadata flows into sidecars and persisted metadata truncation could affect exported sidecar content. [::grimmory-tools/grimmory-docs::docs/metadata/sidecar-files.md:27,41-42,53,57-70]
docs/bookdrop.md & docs/bookdrop-advanced.md — (multiple references) Describe the Bookdrop workflow and how extracted metadata (including ASIN and other identifiers) are used during finalize/import flows. This confirms Bookdrop metadata is used in finalize/file renaming and persists into the library. [::grimmory-tools/grimmory-docs::docs/bookdrop.md][::grimmory-tools/grimmory-docs::docs/bookdrop-advanced.md]
docs/metadata/file-naming-patterns.md:3,80–99 — File naming patterns use metadata placeholders (e.g., {ASIN}); truncated or cleaned metadata can change filenames/moves applied during finalize/import. [::grimmory-tools/grimmory-docs::docs/metadata/file-naming-patterns.md:3,80-99]
docs/magic-shelf.md:134–136,205 — Magic Shelf field lists include isbn13, isbn10, and asin; many UI features and filters depend on these metadata keys. Truncating these fields affects discoverability, filters, and shelves using those keys. [::grimmory-tools/grimmory-docs::docs/magic-shelf.md:134-136,205]

Summary assessment: Documentation confirms ASIN/isbn/language and other external IDs are first-class metadata used across Bookdrop finalize flows, sidecar exports, file-naming patterns, UI filters, and Magic Shelves. The PR's behavior to truncate/clean initial Bookdrop metadata before persisting aligns with existing truncation policies and is relevant for preventing DB column-length errors and for downstream behavior (sidecars, filename patterns, shelves). [::grimmory-tools/grimmory-docs::]

🔇 Additional comments (1)

booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java (1)

173-173: Sanitization is applied at the right persistence boundary.

Calling cleanInitialMetadata(...) directly in extractInitialMetadata(...) ensures oversized extracted values are normalized before originalMetadata is serialized/saved. Nice fix for the reported ASIN length failure path.

balazs-szucs

Agreed this much better, thank you!

…ols#462) * fix(api): clean bookdrop metadata before persisting * test: add tests for behaviors * fix(api): add missing ratings + IDs * fix(api): use `toBuilder()` to reduce risk of dropping fields

* fix(api): clean bookdrop metadata before persisting * test: add tests for behaviors * fix(api): add missing ratings + IDs * fix(api): use `toBuilder()` to reduce risk of dropping fields

…ols#462) * fix(api): clean bookdrop metadata before persisting * test: add tests for behaviors * fix(api): add missing ratings + IDs * fix(api): use `toBuilder()` to reduce risk of dropping fields

fix(api): clean bookdrop metadata before persisting

6dd15f9

test: add tests for behaviors

c801156

imnotjames force-pushed the fix/452/bookdrop-clean-metadata branch from f3f522a to c801156 Compare April 11, 2026 01:35

imnotjames marked this pull request as ready for review April 11, 2026 01:37

coderabbitai Bot added backend enhancement labels Apr 11, 2026

coderabbitai Bot reviewed Apr 11, 2026

View reviewed changes

Comment thread booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java

imnotjames added 2 commits April 10, 2026 23:05

fix(api): add missing ratings + IDs

a84ae6e

fix(api): use toBuilder() to reduce risk of dropping fields

837e058

coderabbitai Bot reviewed Apr 11, 2026

View reviewed changes

imnotjames changed the title ~~fix(api): clean bookdrop metadata before persisting~~ fix(bookdrop): clean bookdrop metadata before persisting Apr 11, 2026

balazs-szucs approved these changes Apr 11, 2026

View reviewed changes

balazs-szucs merged commit b045191 into grimmory-tools:develop Apr 11, 2026
15 checks passed

imnotjames deleted the fix/452/bookdrop-clean-metadata branch April 11, 2026 23:13

coderabbitai Bot mentioned this pull request Apr 15, 2026

Finalizer still broken? #521

Open

1 task

coderabbitai Bot mentioned this pull request Apr 22, 2026

ASIN metadata property is too long - Preventing book import #765

Closed

2 tasks

coderabbitai Bot mentioned this pull request Apr 25, 2026

schema-width REFRESH failure + /api/v1/books pagination broken #871

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(bookdrop): clean bookdrop metadata before persisting#462

fix(bookdrop): clean bookdrop metadata before persisting#462
balazs-szucs merged 4 commits into
grimmory-tools:developfrom
imnotjames:fix/452/bookdrop-clean-metadata

imnotjames commented Apr 11, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 11, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

imnotjames commented Apr 11, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

balazs-szucs left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

imnotjames commented Apr 11, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Changes

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

imnotjames commented Apr 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Linked repositories findings

grimmory-tools/grimmory-docs

Uh oh!

balazs-szucs left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

imnotjames commented Apr 11, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 11, 2026 •

edited

Loading

imnotjames commented Apr 11, 2026 •

edited

Loading