fix(bookdrop): clean bookdrop metadata before persisting#462
Conversation
📝 WalkthroughWalkthroughIntroduced metadata sanitization in BookdropMetadataService: extracted BookMetadata is passed through a new cleaner that clones and truncates several text and identifier fields (title, subtitle, description, publisher, seriesName, language, ASIN/ISBNs/external IDs) before serialization/persistence. Changes
Sequence Diagram(s)sequenceDiagram
participant Extractor as MetadataExtractor
participant Service as BookdropMetadataService
participant Ser as ObjectMapper
participant DB as Persistence
Extractor->>Service: extractMetadata(file)
Service->>Service: cleanInitialMetadata(extracted) %% truncates fields
Service->>Ser: writeValueAsString(cleanedMetadata)
Service->>DB: attachInitialMetadata(cleanedMetadata)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
✨ Simplify code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
f3f522a to
c801156
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
booklore-api/src/test/java/org/booklore/service/BookdropMetadataServiceTest.java (1)
136-161: Consider extending truncation assertions to all cleaned ID fieldsThe new test is solid for ASIN/ISBN/language. Adding assertions for other cleaned external IDs (e.g.,
doubanId,audibleId) would better protect against metadata-loss regressions in the cleaner.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@booklore-api/src/test/java/org/booklore/service/BookdropMetadataServiceTest.java` around lines 136 - 161, The test attachInitialMetadata_shouldTruncateFields should also assert truncation/cleaning for other external ID fields to prevent regressions: after capturing the BookMetadata via the existing ArgumentCaptor<BookMetadata> (argument) and obtaining actual, add assertions for fields like getDoubanId() and getAudibleId() (and any other cleaned ID getters present on BookMetadata) to verify they match expected truncated/cleaned values based on the input metadata used in metadataExtractorFactory.extractMetadata; keep the same mocking flow (when(...).thenReturn(metadata)) and verify calls as already implemented.booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java (1)
136-179: Reduce drift risk by centralizing truncation limitsThe cleaner hardcodes many max lengths inline. Consider extracting named constants (or reusing a shared sanitizer aligned with
FileProcessor) to keep schema-limit updates consistent.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java` around lines 136 - 179, The builder in BookdropMetadataService uses many inline numeric truncation limits (e.g., truncate(extracted.getTitle(), 1000), truncate(..., 10), truncate(..., 5000)) which risks drift; refactor by introducing descriptive constants (e.g., MAX_TITLE_LEN, MAX_SUBTITLE_LEN, MAX_DESCRIPTION_LEN, MAX_ISBN_LEN, etc.) at the top of BookdropMetadataService or by calling the shared sanitizer utilities used by FileProcessor, then replace each numeric literal in the BookMetadata.builder() calls with the appropriate constant or shared sanitizer method to centralize schema limits and keep them consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java`:
- Around line 154-163: cleanInitialMetadata drops doubanId and audibleId from
the cleaned object, losing extracted identifiers; update the builder chain in
cleanInitialMetadata to include .doubanId(truncate(extracted.getDoubanId(),
100)) and .audibleId(truncate(extracted.getAudibleId(), 100)) (matching the
existing pattern that uses truncate and extracted getters) so those fields are
preserved during metadata cleaning.
---
Nitpick comments:
In
`@booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java`:
- Around line 136-179: The builder in BookdropMetadataService uses many inline
numeric truncation limits (e.g., truncate(extracted.getTitle(), 1000),
truncate(..., 10), truncate(..., 5000)) which risks drift; refactor by
introducing descriptive constants (e.g., MAX_TITLE_LEN, MAX_SUBTITLE_LEN,
MAX_DESCRIPTION_LEN, MAX_ISBN_LEN, etc.) at the top of BookdropMetadataService
or by calling the shared sanitizer utilities used by FileProcessor, then replace
each numeric literal in the BookMetadata.builder() calls with the appropriate
constant or shared sanitizer method to centralize schema limits and keep them
consistent.
In
`@booklore-api/src/test/java/org/booklore/service/BookdropMetadataServiceTest.java`:
- Around line 136-161: The test attachInitialMetadata_shouldTruncateFields
should also assert truncation/cleaning for other external ID fields to prevent
regressions: after capturing the BookMetadata via the existing
ArgumentCaptor<BookMetadata> (argument) and obtaining actual, add assertions for
fields like getDoubanId() and getAudibleId() (and any other cleaned ID getters
present on BookMetadata) to verify they match expected truncated/cleaned values
based on the input metadata used in metadataExtractorFactory.extractMetadata;
keep the same mocking flow (when(...).thenReturn(metadata)) and verify calls as
already implemented.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5978bad1-1ffc-4cc2-b428-6c0314cc7b35
📒 Files selected for processing (2)
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.javabooklore-api/src/test/java/org/booklore/service/BookdropMetadataServiceTest.java
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test Suite / Backend Tests
🧰 Additional context used
📓 Path-based instructions (2)
booklore-api/src/**/*.java
📄 CodeRabbit inference engine (AGENTS.md)
booklore-api/src/**/*.java: Use 4-space indentation in Java files and match surrounding Java style
Prefer constructor injection via Lombok patterns already used in the codebase. Do not introduce@Autowiredfield injection
Use MapStruct for entity/DTO mapping
Files:
booklore-api/src/test/java/org/booklore/service/BookdropMetadataServiceTest.javabooklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java
booklore-api/src/test/**/*.java
📄 CodeRabbit inference engine (AGENTS.md)
Prefer focused unit tests; use
@SpringBootTestonly when the Spring context is required
Files:
booklore-api/src/test/java/org/booklore/service/BookdropMetadataServiceTest.java
🧠 Learnings (3)
📓 Common learnings
Learnt from: pedronave
Repo: grimmory-tools/grimmory PR: 173
File: booklore-api/src/main/java/org/booklore/service/hardcover/HardcoverSyncService.java:188-199
Timestamp: 2026-03-24T18:46:47.249Z
Learning: In grimmory-tools/grimmory (HardcoverSyncService.java), when a `hardcoverBookId` is stored in book metadata but is malformed (non-numeric), the preferred behavior is to return null and skip the sync rather than falling back to ISBN-based lookup. The reasoning is that an explicitly set bookId represents deliberate user intent, and silently resolving via ISBN could sync progress to a different book than intended.
Learnt from: balazs-szucs
Repo: grimmory-tools/grimmory PR: 372
File: frontend/src/app/features/book/service/app-books-api.service.ts:150-178
Timestamp: 2026-04-04T14:03:28.295Z
Learning: In `frontend/src/app/features/book/service/app-books-api.service.ts`, the `summaryToBook` function intentionally uses `as unknown as Book` to cast a partial object to the `Book` type. The returned object omits some required `Book` fields (e.g. `metadata.bookId`, `pdfProgress.page`) and includes an extra `bookFiles: []` property not in the `Book` interface. This is by design — consumers of `AppBooksApiService.books()` (e.g. `BookCardComponent`) only access the subset of properties that are populated from `AppBookSummary`, so no runtime issues occur. Do not flag this as a type safety error.
📚 Learning: 2026-03-24T18:46:47.249Z
Learnt from: pedronave
Repo: grimmory-tools/grimmory PR: 173
File: booklore-api/src/main/java/org/booklore/service/hardcover/HardcoverSyncService.java:188-199
Timestamp: 2026-03-24T18:46:47.249Z
Learning: In grimmory-tools/grimmory (HardcoverSyncService.java), when a `hardcoverBookId` is stored in book metadata but is malformed (non-numeric), the preferred behavior is to return null and skip the sync rather than falling back to ISBN-based lookup. The reasoning is that an explicitly set bookId represents deliberate user intent, and silently resolving via ISBN could sync progress to a different book than intended.
Applied to files:
booklore-api/src/test/java/org/booklore/service/BookdropMetadataServiceTest.javabooklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java
📚 Learning: 2026-04-10T08:15:37.436Z
Learnt from: imnotjames
Repo: grimmory-tools/grimmory PR: 449
File: booklore-api/src/main/java/org/booklore/service/book/BookDownloadService.java:139-145
Timestamp: 2026-04-10T08:15:37.436Z
Learning: When using Spring `ContentDisposition.builder(...).filename(name, StandardCharsets.UTF_8).build()` (i.e., explicitly providing UTF-8), the resulting header value should include both the quoted `filename="=?UTF-8?..."` and the RFC 5987 `filename*=` parameters. In this case, any extra ASCII fallback computation (e.g., deriving an ASCII `fallbackFilename` via `NON_ASCII_PATTERN` and calling `.filename(fallbackFilename)`) is likely redundant—prefer calling only `.filename(fallbackName?, StandardCharsets.UTF_8)` as appropriate and let Spring handle the UTF-8 header parameters. Verify by comparing the emitted header for `filename` and `filename*` before deciding to keep an ASCII fallback.
Applied to files:
booklore-api/src/test/java/org/booklore/service/BookdropMetadataServiceTest.javabooklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java
🔇 Additional comments (2)
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java (1)
186-186: Good sanitization hook pointApplying cleanup at Line 186 ensures
originalMetadatais normalized before serialization/persistence.booklore-api/src/test/java/org/booklore/service/BookdropMetadataServiceTest.java (1)
127-127: Nice improvement to mock stubbing robustnessUsing
any()here removes brittle coupling to a specificBookMetadatainstance and keeps the test focused on behavior.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java (1)
136-165: Centralize max-length limits to avoid drift.This block hardcodes many size limits. Consider moving them to shared constants (or a shared sanitizer) so Bookdrop and other ingestion paths stay consistent as schema/provider rules evolve.
Refactor sketch
+private static final int MAX_TITLE_LENGTH = 1000; +private static final int MAX_SUBTITLE_LENGTH = 1000; +private static final int MAX_DESCRIPTION_LENGTH = 5000; +private static final int MAX_PUBLISHER_LENGTH = 1000; +private static final int MAX_LANGUAGE_LENGTH = 10; +private static final int MAX_ISBN10_LENGTH = 10; +private static final int MAX_ISBN13_LENGTH = 13; +private static final int MAX_ASIN_LENGTH = 10; +private static final int MAX_EXTERNAL_ID_LENGTH = 100; ... - .title(truncate(extracted.getTitle(), 1000)) + .title(truncate(extracted.getTitle(), MAX_TITLE_LENGTH)) ... - .isbn10(truncate(extracted.getIsbn10(), 10)) + .isbn10(truncate(extracted.getIsbn10(), MAX_ISBN10_LENGTH))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java` around lines 136 - 165, The builder in BookdropMetadataService that calls truncate(...) with many hardcoded numeric limits (e.g., title, subtitle, description, isbn10, isbn13, goodreadsId, etc.) should use centralized constants so limits don't drift; create or use a shared constants holder (e.g., FieldLengthLimits or MetadataSanitizer.MAX_TITLE_LENGTH) and replace the literal numbers in BookMetadata.builder(...) truncate calls with those constants, update the truncate(...) signature if needed to accept the constant, and refactor other ingestion paths to import the same constants so all providers use the same max-length rules.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java`:
- Around line 136-165: The builder in BookdropMetadataService that calls
truncate(...) with many hardcoded numeric limits (e.g., title, subtitle,
description, isbn10, isbn13, goodreadsId, etc.) should use centralized constants
so limits don't drift; create or use a shared constants holder (e.g.,
FieldLengthLimits or MetadataSanitizer.MAX_TITLE_LENGTH) and replace the literal
numbers in BookMetadata.builder(...) truncate calls with those constants, update
the truncate(...) signature if needed to accept the constant, and refactor other
ingestion paths to import the same constants so all providers use the same
max-length rules.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 07826e1b-e19a-4171-9787-cd579dac9019
📒 Files selected for processing (1)
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Test Suite / Frontend Tests
- GitHub Check: Test Suite / Backend Tests
- GitHub Check: Analyze (java-kotlin)
🧰 Additional context used
📓 Path-based instructions (1)
booklore-api/src/**/*.java
📄 CodeRabbit inference engine (AGENTS.md)
booklore-api/src/**/*.java: Use 4-space indentation in Java files and match surrounding Java style
Prefer constructor injection via Lombok patterns already used in the codebase. Do not introduce@Autowiredfield injection
Use MapStruct for entity/DTO mapping
Files:
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java
🧠 Learnings (4)
📓 Common learnings
Learnt from: pedronave
Repo: grimmory-tools/grimmory PR: 173
File: booklore-api/src/main/java/org/booklore/service/hardcover/HardcoverSyncService.java:188-199
Timestamp: 2026-03-24T18:46:47.249Z
Learning: In grimmory-tools/grimmory (HardcoverSyncService.java), when a `hardcoverBookId` is stored in book metadata but is malformed (non-numeric), the preferred behavior is to return null and skip the sync rather than falling back to ISBN-based lookup. The reasoning is that an explicitly set bookId represents deliberate user intent, and silently resolving via ISBN could sync progress to a different book than intended.
📚 Learning: 2026-03-24T18:46:47.249Z
Learnt from: pedronave
Repo: grimmory-tools/grimmory PR: 173
File: booklore-api/src/main/java/org/booklore/service/hardcover/HardcoverSyncService.java:188-199
Timestamp: 2026-03-24T18:46:47.249Z
Learning: In grimmory-tools/grimmory (HardcoverSyncService.java), when a `hardcoverBookId` is stored in book metadata but is malformed (non-numeric), the preferred behavior is to return null and skip the sync rather than falling back to ISBN-based lookup. The reasoning is that an explicitly set bookId represents deliberate user intent, and silently resolving via ISBN could sync progress to a different book than intended.
Applied to files:
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java
📚 Learning: 2026-04-04T15:36:56.558Z
Learnt from: balazs-szucs
Repo: grimmory-tools/grimmory PR: 372
File: booklore-api/src/main/java/org/booklore/app/service/AppBookService.java:357-359
Timestamp: 2026-04-04T15:36:56.558Z
Learning: In `booklore-api/src/main/java/org/booklore/app/service/AppBookService.java`, `shelfId` and `magicShelfId` are mutually exclusive navigation contexts in `getFilterOptions`. A request will never supply both at the same time, so the `else if (shelfId != null)` branch after the `magicBookIds != null` check is intentional and correct — there is no missing shelf-∩-magicShelf intersection case. Do not flag this pattern as a bug.
Applied to files:
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java
📚 Learning: 2026-04-10T08:15:37.436Z
Learnt from: imnotjames
Repo: grimmory-tools/grimmory PR: 449
File: booklore-api/src/main/java/org/booklore/service/book/BookDownloadService.java:139-145
Timestamp: 2026-04-10T08:15:37.436Z
Learning: When using Spring `ContentDisposition.builder(...).filename(name, StandardCharsets.UTF_8).build()` (i.e., explicitly providing UTF-8), the resulting header value should include both the quoted `filename="=?UTF-8?..."` and the RFC 5987 `filename*=` parameters. In this case, any extra ASCII fallback computation (e.g., deriving an ASCII `fallbackFilename` via `NON_ASCII_PATTERN` and calling `.filename(fallbackFilename)`) is likely redundant—prefer calling only `.filename(fallbackName?, StandardCharsets.UTF_8)` as appropriate and let Spring handle the UTF-8 header parameters. Verify by comparing the emitted header for `filename` and `filename*` before deciding to keep an ASCII fallback.
Applied to files:
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java
🔀 Multi-repo context grimmory-tools/grimmory-docs
[::grimmory-tools/grimmory-docs::] docs/audit-logs.md:123 — notes "Descriptions longer than 1024 characters are truncated." (indicates existing truncation policy for some metadata fields)
[::grimmory-tools/grimmory-docs::] docs/magic-shelf.md:134-136 — documents ISBN-13/ISBN-10 fields and names (isbn13, isbn10); line 205 — lists asin, isbn13, isbn10, language, and many external IDs as recognized metadata fields (shows these identifiers are first-class and surface in UI/filtering)
[::grimmory-tools/grimmory-docs::] docs/metadata/metadata-settings.md:62,75,134 — references provider identifiers including Amazon ASIN and related provider metadata (confirms ASIN is a configured provider identifier)
[::grimmory-tools/grimmory-docs::] docs/metadata/sidecar-files.md:27,41 — examples and mention of external identifiers (ASIN included) in sidecar files (implies persisted metadata flows into file-sidecar usage)
[::grimmory-tools/grimmory-docs::] docs/bookdrop.md & docs/bookdrop-advanced.md (multiple lines) — describe Bookdrop workflow, referencing ASIN and identifier extraction/features and placeholders including {ASIN} (confirms Bookdrop metadata is used in import/finalize flows and filename patterns)
Summary: Documentation shows ASIN, isbn10, isbn13, language and other external IDs are first-class metadata fields used across Bookdrop, sidecar files, filename patterns, and UI filters; there is at least one documented truncation policy (description → 1024 chars). This PR's truncation of initial Bookdrop metadata (ASIN/ISBN/language/etc.) aligns with existing concerns in docs and is relevant to import/finalize flows and sidecar/filename usage.
🔇 Additional comments (2)
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java (2)
131-165: Good sanitization coverage for the issue path.Lines 131–165 correctly null-guard and truncate the high-risk metadata fields before persistence, including ASIN/ISBN and external IDs, which matches the Bookdrop finalize failure scenario.
192-192: Nice boundary placement for cleaning.Line 192 applies cleaning immediately after extraction, so persisted
originalMetadatais consistently sanitized for downstream finalize/import flow.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java (1)
142-164: Centralize truncation limits to avoid drift across metadata flows.The inline numeric limits are correct but scattered and easy to desync from other sanitization paths (e.g., FileProcessor/schema constraints). Prefer named constants (or a shared metadata sanitizer/constraints class) to keep behavior consistent long-term.
Proposed refactor
+ private static final int MAX_TITLE_LENGTH = 1000; + private static final int MAX_SUBTITLE_LENGTH = 1000; + private static final int MAX_DESCRIPTION_LENGTH = 5000; + private static final int MAX_PUBLISHER_LENGTH = 1000; + private static final int MAX_SERIES_NAME_LENGTH = 1000; + private static final int MAX_LANGUAGE_LENGTH = 10; + private static final int MAX_ISBN10_LENGTH = 10; + private static final int MAX_ISBN13_LENGTH = 13; + private static final int MAX_ASIN_LENGTH = 10; + private static final int MAX_AUDIBLE_ID_LENGTH = 10; + private static final int MAX_EXTERNAL_ID_LENGTH = 100; ... - .title(truncate(extracted.getTitle(), 1000)) - .subtitle(truncate(extracted.getSubtitle(), 1000)) - .description(truncate(extracted.getDescription(), 5000)) - .publisher(truncate(extracted.getPublisher(), 1000)) + .title(truncate(extracted.getTitle(), MAX_TITLE_LENGTH)) + .subtitle(truncate(extracted.getSubtitle(), MAX_SUBTITLE_LENGTH)) + .description(truncate(extracted.getDescription(), MAX_DESCRIPTION_LENGTH)) + .publisher(truncate(extracted.getPublisher(), MAX_PUBLISHER_LENGTH)) ... - .seriesName(truncate(extracted.getSeriesName(), 1000)) - .language(truncate(extracted.getLanguage(), 10)) + .seriesName(truncate(extracted.getSeriesName(), MAX_SERIES_NAME_LENGTH)) + .language(truncate(extracted.getLanguage(), MAX_LANGUAGE_LENGTH)) ... - .isbn10(truncate(extracted.getIsbn10(), 10)) - .isbn13(truncate(extracted.getIsbn13(), 13)) + .isbn10(truncate(extracted.getIsbn10(), MAX_ISBN10_LENGTH)) + .isbn13(truncate(extracted.getIsbn13(), MAX_ISBN13_LENGTH)) ... - .asin(truncate(extracted.getAsin(), 10)) - .audibleId(truncate(extracted.getAudibleId(), 10)) - .goodreadsId(truncate(extracted.getGoodreadsId(), 100)) + .asin(truncate(extracted.getAsin(), MAX_ASIN_LENGTH)) + .audibleId(truncate(extracted.getAudibleId(), MAX_AUDIBLE_ID_LENGTH)) + .goodreadsId(truncate(extracted.getGoodreadsId(), MAX_EXTERNAL_ID_LENGTH))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java` around lines 142 - 164, Replace the scattered numeric literals used in truncate(...) calls in BookdropMetadataService (e.g., title, subtitle, description, publisher, seriesName, language, isbn10, isbn13, asin, audibleId, goodreadsId, hardcoverId, hardcoverBookId, googleId, comicvineId, lubimyczytacId, ranobedbId, doubanId) with shared named constants: either add private static final constants in BookdropMetadataService or, preferably, reference a new centralized MetadataConstraints (or MetadataSanitizer) class that exposes descriptive constants like MAX_TITLE_LENGTH, MAX_DESCRIPTION_LENGTH, MAX_ISBN10_LENGTH, MAX_EXTERNAL_ID_LENGTH, MAX_LANGUAGE_LENGTH, etc.; update all truncate(...) calls to use those constants so truncation limits are centralized and consistent with other sanitation/schema flows (and add unit tests if present to validate the constant values).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java`:
- Around line 142-164: Replace the scattered numeric literals used in
truncate(...) calls in BookdropMetadataService (e.g., title, subtitle,
description, publisher, seriesName, language, isbn10, isbn13, asin, audibleId,
goodreadsId, hardcoverId, hardcoverBookId, googleId, comicvineId,
lubimyczytacId, ranobedbId, doubanId) with shared named constants: either add
private static final constants in BookdropMetadataService or, preferably,
reference a new centralized MetadataConstraints (or MetadataSanitizer) class
that exposes descriptive constants like MAX_TITLE_LENGTH,
MAX_DESCRIPTION_LENGTH, MAX_ISBN10_LENGTH, MAX_EXTERNAL_ID_LENGTH,
MAX_LANGUAGE_LENGTH, etc.; update all truncate(...) calls to use those constants
so truncation limits are centralized and consistent with other sanitation/schema
flows (and add unit tests if present to validate the constant values).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0e5ec8ea-8e95-4c0e-85a9-d87af2f019af
📒 Files selected for processing (1)
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Analyze (java-kotlin)
- GitHub Check: Test Suite / Frontend Tests
- GitHub Check: Test Suite / Backend Tests
🧰 Additional context used
📓 Path-based instructions (1)
booklore-api/src/**/*.java
📄 CodeRabbit inference engine (AGENTS.md)
booklore-api/src/**/*.java: Use 4-space indentation in Java files and match surrounding Java style
Prefer constructor injection via Lombok patterns already used in the codebase. Do not introduce@Autowiredfield injection
Use MapStruct for entity/DTO mapping
Files:
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java
🧠 Learnings (3)
📓 Common learnings
Learnt from: pedronave
Repo: grimmory-tools/grimmory PR: 173
File: booklore-api/src/main/java/org/booklore/service/hardcover/HardcoverSyncService.java:188-199
Timestamp: 2026-03-24T18:46:47.249Z
Learning: In grimmory-tools/grimmory (HardcoverSyncService.java), when a `hardcoverBookId` is stored in book metadata but is malformed (non-numeric), the preferred behavior is to return null and skip the sync rather than falling back to ISBN-based lookup. The reasoning is that an explicitly set bookId represents deliberate user intent, and silently resolving via ISBN could sync progress to a different book than intended.
📚 Learning: 2026-03-24T18:46:47.249Z
Learnt from: pedronave
Repo: grimmory-tools/grimmory PR: 173
File: booklore-api/src/main/java/org/booklore/service/hardcover/HardcoverSyncService.java:188-199
Timestamp: 2026-03-24T18:46:47.249Z
Learning: In grimmory-tools/grimmory (HardcoverSyncService.java), when a `hardcoverBookId` is stored in book metadata but is malformed (non-numeric), the preferred behavior is to return null and skip the sync rather than falling back to ISBN-based lookup. The reasoning is that an explicitly set bookId represents deliberate user intent, and silently resolving via ISBN could sync progress to a different book than intended.
Applied to files:
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java
📚 Learning: 2026-04-10T08:15:37.436Z
Learnt from: imnotjames
Repo: grimmory-tools/grimmory PR: 449
File: booklore-api/src/main/java/org/booklore/service/book/BookDownloadService.java:139-145
Timestamp: 2026-04-10T08:15:37.436Z
Learning: When using Spring `ContentDisposition.builder(...).filename(name, StandardCharsets.UTF_8).build()` (i.e., explicitly providing UTF-8), the resulting header value should include both the quoted `filename="=?UTF-8?..."` and the RFC 5987 `filename*=` parameters. In this case, any extra ASCII fallback computation (e.g., deriving an ASCII `fallbackFilename` via `NON_ASCII_PATTERN` and calling `.filename(fallbackFilename)`) is likely redundant—prefer calling only `.filename(fallbackName?, StandardCharsets.UTF_8)` as appropriate and let Spring handle the UTF-8 header parameters. Verify by comparing the emitted header for `filename` and `filename*` before deciding to keep an ASCII fallback.
Applied to files:
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java
🔀 Multi-repo context grimmory-tools/grimmory-docs
Linked repositories findings
grimmory-tools/grimmory-docs
-
docs/tools/audit-logs.md:123 — Documents that "Descriptions longer than 1024 characters are truncated." This shows existing truncation policies for metadata fields. [::grimmory-tools/grimmory-docs::docs/tools/audit-logs.md:123]
-
docs/metadata/metadata-settings.md:62,64,75,134 — Lists provider identifiers including Amazon ASIN and explicitly names Amazon ASIN among provider identifiers; shows ASIN is a first-class provider field exposed in settings. [::grimmory-tools/grimmory-docs::docs/metadata/metadata-settings.md:62,64,75,134]
-
docs/metadata/sidecar-files.md:27,41–42,53,57–70 — Sidecar files include external identifiers (ASIN, Goodreads ID, Google ID, etc.) and sidecar behavior is configurable; implying Bookdrop metadata flows into sidecars and persisted metadata truncation could affect exported sidecar content. [::grimmory-tools/grimmory-docs::docs/metadata/sidecar-files.md:27,41-42,53,57-70]
-
docs/bookdrop.md & docs/bookdrop-advanced.md — (multiple references) Describe the Bookdrop workflow and how extracted metadata (including ASIN and other identifiers) are used during finalize/import flows. This confirms Bookdrop metadata is used in finalize/file renaming and persists into the library. [::grimmory-tools/grimmory-docs::docs/bookdrop.md][::grimmory-tools/grimmory-docs::docs/bookdrop-advanced.md]
-
docs/metadata/file-naming-patterns.md:3,80–99 — File naming patterns use metadata placeholders (e.g., {ASIN}); truncated or cleaned metadata can change filenames/moves applied during finalize/import. [::grimmory-tools/grimmory-docs::docs/metadata/file-naming-patterns.md:3,80-99]
-
docs/magic-shelf.md:134–136,205 — Magic Shelf field lists include isbn13, isbn10, and asin; many UI features and filters depend on these metadata keys. Truncating these fields affects discoverability, filters, and shelves using those keys. [::grimmory-tools/grimmory-docs::docs/magic-shelf.md:134-136,205]
Summary assessment: Documentation confirms ASIN/isbn/language and other external IDs are first-class metadata used across Bookdrop finalize flows, sidecar exports, file-naming patterns, UI filters, and Magic Shelves. The PR's behavior to truncate/clean initial Bookdrop metadata before persisting aligns with existing truncation policies and is relevant for preventing DB column-length errors and for downstream behavior (sidecars, filename patterns, shelves). [::grimmory-tools/grimmory-docs::]
🔇 Additional comments (1)
booklore-api/src/main/java/org/booklore/service/bookdrop/BookdropMetadataService.java (1)
173-173: Sanitization is applied at the right persistence boundary.Calling
cleanInitialMetadata(...)directly inextractInitialMetadata(...)ensures oversized extracted values are normalized beforeoriginalMetadatais serialized/saved. Nice fix for the reported ASIN length failure path.
balazs-szucs
left a comment
There was a problem hiding this comment.
Agreed this much better, thank you!
…ols#462) * fix(api): clean bookdrop metadata before persisting * test: add tests for behaviors * fix(api): add missing ratings + IDs * fix(api): use `toBuilder()` to reduce risk of dropping fields
…ols#462) * fix(api): clean bookdrop metadata before persisting * test: add tests for behaviors * fix(api): add missing ratings + IDs * fix(api): use `toBuilder()` to reduce risk of dropping fields
* fix(api): clean bookdrop metadata before persisting * test: add tests for behaviors * fix(api): add missing ratings + IDs * fix(api): use `toBuilder()` to reduce risk of dropping fields
* fix(api): clean bookdrop metadata before persisting * test: add tests for behaviors * fix(api): add missing ratings + IDs * fix(api): use `toBuilder()` to reduce risk of dropping fields
…ols#462) * fix(api): clean bookdrop metadata before persisting * test: add tests for behaviors * fix(api): add missing ratings + IDs * fix(api): use `toBuilder()` to reduce risk of dropping fields

Description
updates the bookdrop initial metadata to truncate values - similar to the
FileProcessor- so that data suggested in bookdrop metadata are of valid lengthsLinked Issue: Fixes #452
Changes
cleanInitialMetadatamethod for truncating initialSummary by CodeRabbit