fix(api): use correct field lengths for file processors #459

Merged
balazs-szucs merged 1 commit into grimmory-tools:develop from imnotjames:fix/458/processor-truncation-lengths
Apr 11, 2026

@imnotjames imnotjames commented Apr 11, 2026

Contributor

Description

The file processors truncate extracted metadata fields so that they fit in the corresponding database columns.

However, the ASIN, ISBN, and language fields were truncated to lengths longer than their columns allow, which caused database errors.

This updates the truncation length to 10 for language, 10 for ASIN, 10 for ISBN10, and 13 for ISBN13.

Linked Issue: Fixes #458

Changes

  • Updates truncation lengths for file processors to match BookMetadataEntity field lengths
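
For illustration, the kind of truncation described above could be sketched as follows. This is a minimal sketch, not the code from this PR: the class name, constant names, and the `truncate` helper are hypothetical, and only the numeric limits (10, 10, 10, 13) come from the description.

```java
// Hypothetical sketch; names are illustrative and not taken from the processors.
final class MetadataTruncationSketch {
    // Limits as stated in the PR description.
    public static final int LANGUAGE_MAX = 10;
    public static final int ASIN_MAX = 10;
    public static final int ISBN10_MAX = 10;
    public static final int ISBN13_MAX = 13;

    private MetadataTruncationSketch() {
    }

    // Returns null for null input; otherwise keeps at most maxLength chars.
    // Note: substring counts UTF-16 code units, so this simple version
    // could split a surrogate pair at the cut point.
    public static String truncate(String value, int maxLength) {
        if (value == null) {
            return null;
        }
        return value.length() <= maxLength ? value : value.substring(0, maxLength);
    }

    public static void main(String[] args) {
        // An over-long extracted value is cut down to the column length.
        System.out.println(truncate("9780306406157-extra", ISBN13_MAX));
        // A value already within the limit is returned unchanged.
        System.out.println(truncate("en", LANGUAGE_MAX));
    }
}
```

Keeping the limits in named constants next to the entity's column lengths, rather than as literals in each processor, is one way to keep the two from drifting apart again.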

Summary by CodeRabbit

  • Refactor
    • Standardized metadata field length constraints for ISBN numbers, language codes, and Amazon Standard Identifiers across all supported ebook formats (AZW3, EPUB, MOBI, FB2, CBX, PDF).
    • Enhanced book description field capacity for AZW3 format processing.

coderabbitai Bot commented Apr 11, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d156a460-a043-441a-8182-628a2884af04

📥 Commits

Reviewing files that changed from the base of the PR and between 66dcc44 and 7d40a63.

📒 Files selected for processing (6)
  • booklore-api/src/main/java/org/booklore/service/fileprocessor/Azw3Processor.java
  • booklore-api/src/main/java/org/booklore/service/fileprocessor/CbxProcessor.java
  • booklore-api/src/main/java/org/booklore/service/fileprocessor/EpubProcessor.java
  • booklore-api/src/main/java/org/booklore/service/fileprocessor/Fb2Processor.java
  • booklore-api/src/main/java/org/booklore/service/fileprocessor/MobiProcessor.java
  • booklore-api/src/main/java/org/booklore/service/fileprocessor/PdfProcessor.java
📜 Recent review details
🧰 Additional context used
📓 Path-based instructions (1)
booklore-api/src/**/*.java

📄 CodeRabbit inference engine (AGENTS.md)

booklore-api/src/**/*.java: Use 4-space indentation in Java files and match surrounding Java style
Prefer constructor injection via Lombok patterns already used in the codebase. Do not introduce @Autowired field injection
Use MapStruct for entity/DTO mapping

Files:

  • booklore-api/src/main/java/org/booklore/service/fileprocessor/EpubProcessor.java
  • booklore-api/src/main/java/org/booklore/service/fileprocessor/CbxProcessor.java
  • booklore-api/src/main/java/org/booklore/service/fileprocessor/Fb2Processor.java
  • booklore-api/src/main/java/org/booklore/service/fileprocessor/Azw3Processor.java
  • booklore-api/src/main/java/org/booklore/service/fileprocessor/MobiProcessor.java
  • booklore-api/src/main/java/org/booklore/service/fileprocessor/PdfProcessor.java
🧠 Learnings (6)
📚 Learning: 2026-04-02T09:25:37.417Z
Learnt from: balazs-szucs
Repo: grimmory-tools/grimmory PR: 334
File: booklore-api/src/main/java/org/booklore/service/reader/EpubReaderService.java:402-407
Timestamp: 2026-04-02T09:25:37.417Z
Learning: When reviewing epub-related file processing code in this project, remember that grimmory-tools/grimmory uses the `org.grimmory.epub4j` library (from grimmory-tools/epub4j) and that it is a grimmory-maintained fork fundamentally different from `io.documentnode.epub4j`. Do not assume the same API behavior, limitations, or semantics as `io.documentnode.epub4j`. Before flagging issues or proposing changes, verify the current API/capabilities in the grimmory-tools/epub4j repository.

Applied to files:

  • booklore-api/src/main/java/org/booklore/service/fileprocessor/EpubProcessor.java
📚 Learning: 2026-04-02T09:25:48.330Z
Learnt from: balazs-szucs
Repo: grimmory-tools/grimmory PR: 334
File: booklore-api/src/main/java/org/booklore/service/reader/EpubReaderService.java:402-407
Timestamp: 2026-04-02T09:25:48.330Z
Learning: In grimmory-tools/grimmory, before commenting on any file processing code (epub or pdf), always verify the current state of the relevant upstream grimmory libraries: `grimmory-tools/epub4j` (for epub) and `grimmory-tools/PDFium4j` (for pdf). These custom libraries may have different APIs, capabilities, and limitations compared to the third-party libraries they replace. Issues about streaming, buffering, or API surface may need to be filed/addressed in those upstream repos rather than in grimmory itself.

Applied to files:

  • booklore-api/src/main/java/org/booklore/service/fileprocessor/EpubProcessor.java
📚 Learning: 2026-03-24T18:46:47.249Z
Learnt from: pedronave
Repo: grimmory-tools/grimmory PR: 173
File: booklore-api/src/main/java/org/booklore/service/hardcover/HardcoverSyncService.java:188-199
Timestamp: 2026-03-24T18:46:47.249Z
Learning: In grimmory-tools/grimmory (HardcoverSyncService.java), when a `hardcoverBookId` is stored in book metadata but is malformed (non-numeric), the preferred behavior is to return null and skip the sync rather than falling back to ISBN-based lookup. The reasoning is that an explicitly set bookId represents deliberate user intent, and silently resolving via ISBN could sync progress to a different book than intended.

Applied to files:

  • booklore-api/src/main/java/org/booklore/service/fileprocessor/EpubProcessor.java
  • booklore-api/src/main/java/org/booklore/service/fileprocessor/Fb2Processor.java
  • booklore-api/src/main/java/org/booklore/service/fileprocessor/MobiProcessor.java
  • booklore-api/src/main/java/org/booklore/service/fileprocessor/PdfProcessor.java
📚 Learning: 2026-04-10T08:15:37.436Z
Learnt from: imnotjames
Repo: grimmory-tools/grimmory PR: 449
File: booklore-api/src/main/java/org/booklore/service/book/BookDownloadService.java:139-145
Timestamp: 2026-04-10T08:15:37.436Z
Learning: When using Spring `ContentDisposition.builder(...).filename(name, StandardCharsets.UTF_8).build()` (i.e., explicitly providing UTF-8), the resulting header value should include both the quoted `filename="=?UTF-8?..."` and the RFC 5987 `filename*=` parameters. In this case, any extra ASCII fallback computation (e.g., deriving an ASCII `fallbackFilename` via `NON_ASCII_PATTERN` and calling `.filename(fallbackFilename)`) is likely redundant—prefer calling only `.filename(fallbackName?, StandardCharsets.UTF_8)` as appropriate and let Spring handle the UTF-8 header parameters. Verify by comparing the emitted header for `filename` and `filename*` before deciding to keep an ASCII fallback.

Applied to files:

  • booklore-api/src/main/java/org/booklore/service/fileprocessor/EpubProcessor.java
  • booklore-api/src/main/java/org/booklore/service/fileprocessor/CbxProcessor.java
  • booklore-api/src/main/java/org/booklore/service/fileprocessor/Fb2Processor.java
  • booklore-api/src/main/java/org/booklore/service/fileprocessor/Azw3Processor.java
  • booklore-api/src/main/java/org/booklore/service/fileprocessor/MobiProcessor.java
  • booklore-api/src/main/java/org/booklore/service/fileprocessor/PdfProcessor.java
📚 Learning: 2026-04-02T09:25:48.330Z
Learnt from: balazs-szucs
Repo: grimmory-tools/grimmory PR: 334
File: booklore-api/src/main/java/org/booklore/service/reader/EpubReaderService.java:402-407
Timestamp: 2026-04-02T09:25:48.330Z
Learning: In grimmory-tools/grimmory, epub processing uses `org.grimmory.epub4j` from https://github.com/grimmory-tools/epub4j. This is a grimmory-maintained fork and is fundamentally different from `io.documentnode.epub4j`. Do NOT generalize API knowledge, behavior, or limitations of `io.documentnode.epub4j` to the grimmory fork. Before raising review comments on epub-related file processing code, verify the current API and capabilities in the grimmory-tools/epub4j repository.

Applied to files:

  • booklore-api/src/main/java/org/booklore/service/fileprocessor/Azw3Processor.java
🔇 Additional comments (6)
booklore-api/src/main/java/org/booklore/service/fileprocessor/Fb2Processor.java (1)

111-112: Truncation updates are correct for constrained metadata fields.

These limits now match the intended column sizes for ISBN, language, and ASIN, which should prevent the length-related persistence failures described in the issue.

Also applies to: 116-116, 118-118

booklore-api/src/main/java/org/booklore/service/fileprocessor/MobiProcessor.java (1)

116-117: Good fix: field-length truncation is now aligned with storage constraints.

This should stop metadata writes from exceeding column limits for ISBN, language, and ASIN in MOBI imports.

Also applies to: 121-121, 123-123

booklore-api/src/main/java/org/booklore/service/fileprocessor/CbxProcessor.java (1)

148-148: Nice alignment of CBX metadata truncation limits.

The updated boundaries for language, ISBN-10/13, and ASIN look correct and should eliminate the column-length write errors from oversized extracted values.

Also applies to: 151-152, 155-155

booklore-api/src/main/java/org/booklore/service/fileprocessor/EpubProcessor.java (1)

123-124: EPUB truncation changes look correct and focused.

The revised limits for ISBN, language, and ASIN match the expected constrained-field behavior and should prevent persistence failures from long metadata values.

Also applies to: 128-128, 130-130

booklore-api/src/main/java/org/booklore/service/fileprocessor/Azw3Processor.java (1)

109-109: Azw3 metadata truncation updates look solid.

The constrained fields are now correctly bounded, and increasing description to 5000 remains compatible with the TEXT-backed description column.

Also applies to: 115-116, 120-120, 122-122

booklore-api/src/main/java/org/booklore/service/fileprocessor/PdfProcessor.java (1)

133-133: Great catch in PDF processor: truncation now matches exact DB limits.

Applying these bounds before persistence should prevent overflow errors for language, ASIN, and ISBN fields.

Also applies to: 141-141, 165-165, 168-168


📝 Walkthrough

The PR corrects metadata truncation lengths across 6 file processors (Azw3, Cbx, Epub, Fb2, Mobi, PDF) to align with the actual database field constraints. ISBN-13 is truncated to 13 characters, ISBN-10 to 10, language to 10, ASIN to 10, and description to 5000 characters, fixing the mismatches that caused database errors.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Metadata Truncation Limits Alignment: booklore-api/src/main/java/org/booklore/service/fileprocessor/Azw3Processor.java, CbxProcessor.java, EpubProcessor.java, Fb2Processor.java, MobiProcessor.java, PdfProcessor.java | Standardized truncation lengths across processors: ISBN-13 reduced to 13 chars (64→13), ISBN-10 to 10 chars (64→10), language to 10 chars (1000→10), ASIN to 10 chars (20→10). Azw3Processor also increases description to 5000 chars (2000→5000). PdfProcessor adds missing truncation to previously unconstrained fields. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested labels

backend, enhancement

Poem

🐰 Numbers aligned in rabbit-approved measure,
Thirteen for ISBN—now that's a treasure!
Ten for the tongue, fields perfectly snug,
From chaos to order, a metadata hug!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title follows conventional commit format with a clear scope and concise description of the change. |
| Description check | ✅ Passed | The description covers the problem, solution, and changes clearly, and includes the linked issue reference as required by the template. |
| Linked Issues check | ✅ Passed | All changes directly address issue #458 by correcting truncation lengths across all file processors to match database field limits. |
| Out of Scope Changes check | ✅ Passed | All changes are within scope; only metadata truncation lengths are modified without altering any extraction logic or control flow. |

@imnotjames imnotjames marked this pull request as ready for review April 11, 2026 01:00

@balazs-szucs balazs-szucs left a comment

Contributor

Thank you!

@balazs-szucs balazs-szucs merged commit d29e6b9 into grimmory-tools:develop Apr 11, 2026
14 checks passed
zachyale pushed a commit to zachyale/grimmory that referenced this pull request Apr 17, 2026
zachyale pushed a commit to zachyale/grimmory that referenced this pull request Apr 17, 2026
dsmouse pushed a commit to dsmouse/grimmory that referenced this pull request May 11, 2026

Development

Successfully merging this pull request may close these issues.

Processors have incorrect truncation lengths

2 participants