Fixes Epub.py and NovelFire.py by TheMr-Fool · Pull Request #2993 · lncrawl/lightnovel-crawler

TheMr-Fool · 2026-05-22T21:31:31Z

Removed the # serial-number heading from chapters that already have a number in their title. Specifically, the original code always added

#{chapter.serial}

above every chapter title. Changed it so that line is only added when the chapter title contains no digits. So:

"Chapter 1 Sunny" → has a number → no #1 added
"Sunny" → no number → #1 gets added

That way sites where the title already includes the chapter number won't get the duplicate #1, but sites where the title is just a plain name still get the serial number shown.
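
A minimal sketch of the rule described above, assuming a standalone helper that takes the title and serial directly. The name build_serial_heading is hypothetical and not part of epub.py; the h4 markup mirrors the diff under review.

```python
import re


def build_serial_heading(title: str, serial: int) -> str:
    """Return the '#<serial>' heading, or '' when the title already has a digit."""
    # Hypothetical helper for illustration; the PR inlines this expression.
    if re.search(r"\d", title):
        return ""
    return f'<h4 style="opacity: 0.8">#{serial}</h4>'


print(repr(build_serial_heading("Chapter 1 Sunny", 1)))
print(repr(build_serial_heading("Sunny", 1)))
```

Note that any digit suppresses the heading, so a title such as "The 3 Kings" would also lose its serial even though the 3 is not a chapter number.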

The actual fix to NovelFire now removes duplicate titles. It won't work if the title repeats 3 times (tested). The previous code only worked for some.

Added logic to remove chapter number and duplicate title from chapter content.
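
One way to picture this cleanup is the sketch below. It is not the code in novelfire.py: it assumes the chapter content has already been split into plain text lines, and strip_leading_title is a hypothetical name. Leading lines that are blank, that hold only a chapter number, or that repeat the chapter title are dropped until real body text starts.

```python
import re


def strip_leading_title(lines: list[str], title: str) -> list[str]:
    """Drop leading lines that only repeat the chapter number or title."""

    # Hypothetical helper: works on plain text lines, not on parsed HTML.
    def key(text: str) -> str:
        # Collapse whitespace and ignore case before comparing.
        return re.sub(r"\s+", " ", text).strip().casefold()

    wanted = key(title)
    result = list(lines)
    while result:
        head = key(result[0])
        is_blank = head == ""
        is_number = re.fullmatch(r"(chapter\s+)?\d+[.:]?", head) is not None
        is_title = head == wanted
        if not (is_blank or is_number or is_title):
            break  # first real body line reached
        result.pop(0)
    return result


body = ["Chapter 12", "Chapter 12: The Gate", "", "Rain fell."]
print(strip_leading_title(body, "Chapter 12: The Gate"))
```

Only the start of the chapter is touched, so a later line that happens to contain just a number stays in place.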

Removed the chapter serial display from the HTML output.

Refactor NovelFireCrawler to streamline chapter downloading and novel information extraction.

Reordered import statement and added serial heading to chapter content.

Added a normalization function to standardize text for fuzzy matching, improving chapter title comparison.
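
For illustration only, a normalization helper of this kind might look like the sketch below. The names normalize and titles_match and the 0.9 threshold are assumptions, not the PR's _normalize function; only standard-library calls are used.

```python
import difflib
import re
import unicodedata


def normalize(text: str) -> str:
    """Reduce text to case-folded words separated by single spaces."""
    # NFKC folds compatibility forms such as full-width letters.
    text = unicodedata.normalize("NFKC", text).casefold()
    # Replace every run of punctuation, whitespace, or underscores with one space.
    text = re.sub(r"[\W_]+", " ", text)
    return text.strip()


def titles_match(a: str, b: str, threshold: float = 0.9) -> bool:
    """Fuzzy comparison of two titles after normalization."""
    ratio = difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold


print(normalize("Chapter 1: Sunny!"))
print(titles_match("Chapter 1: Sunny", "  CHAPTER 1 - Sunny "))
```

The threshold trades false positives against missed matches and would need tuning on real chapter titles.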

Removed the _normalize function and its usage for chapter title normalization.

Refactor download_chapter_body to remove leading chapter titles.

Refactor download_chapter_body to improve header handling and add regex checks for chapter titles.

dipu-bd · 2026-05-23T20:59:51Z

+    serial_heading = (
+        ""
+        if re.search(r"\d", chapter.title)
+        else f'<h4 style="opacity: 0.8">#{chapter.serial}</h4>'


You can set the aria-hidden="true" attribute on the h4 instead of hiding it, since your use case is just to keep TTS from speaking it out loud.

serial_heading = (
""
if re.search(r"\d", chapter.title)
else f'
#{chapter.serial}
'
)

I want to show the serial even if the chapter title contains it. Otherwise this breaks consistency: some chapters will have the serial, some won't.

Got ya, that's fine. I can always just edit them out on my own.

This has been stale for a while. As I requested, keep the serial but put the aria-hidden="true" attribute on the h4 tag.

Or you can remove the changes to that file, and we can merge the NovelFire fix.
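
The first option in the request above could look roughly like this sketch. The function name build_aria_hidden_heading is hypothetical, and the style attribute is carried over from the diff under review. Whether a particular EPUB reader's TTS honors aria-hidden is not confirmed here.

```python
def build_aria_hidden_heading(serial: int) -> str:
    """Always emit the serial heading, marked as hidden from assistive technology."""
    # aria-hidden="true" asks screen readers to skip the element
    # while it stays visible on the page.
    return f'<h4 aria-hidden="true" style="opacity: 0.8">#{serial}</h4>'


print(build_aria_hidden_heading(1))
```

Every chapter then gets the same heading regardless of its title, which keeps the output consistent.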

TheMr-Fool

done

* Enhance chapter body download by cleaning content
  Added logic to remove chapter number and duplicate title from chapter content.
* Update novelfire.py
* Update novelfire.py
* Update novelfire.py
* Remove chapter serial from EPUB header
* Update novelfire.py
* Remove chapter serial from HTML output
  Removed the chapter serial display from the HTML output.
* Update novelfire.py
  Refactor NovelFireCrawler to streamline chapter downloading and novel information extraction.
* Update novelfire.py
* Update chapter title removal logic in download_chapter_body
* Update chapter title removal to handle h4 tags
* Refactor epub.py for import order and chapter heading
  Reordered import statement and added serial heading to chapter content.
* Implement text normalization for chapter title matching
  Added a normalization function to standardize text for fuzzy matching, improving chapter title comparison.
* Remove unused _normalize function and related code
  Removed the _normalize function and its usage for chapter title normalization.
* Update novelfire.py
* Refactor chapter body download function
  Refactor download_chapter_body to remove leading chapter titles.
* Update novelfire.py
* Refactor download_chapter_body for header extraction
  Refactor download_chapter_body to improve header handling and add regex checks for chapter titles.
* Fix comments and update logger info formatting
* Refactor serial heading logic in epub.py
* Update epub.py
* Update novelfire.py
* Update novelfire.py
* Update novelfire.py
* Update epub.py

---------

Co-authored-by: Sudipto Chandra <dipu.sudipta@gmail.com>

TheMr-Fool and others added 19 commits May 21, 2026 17:41

Enhance chapter body download by cleaning content

cb39afb

Added logic to remove chapter number and duplicate title from chapter content.

Update novelfire.py

2cda66f

Update novelfire.py

a4b2914

Update novelfire.py

dac02f7

Remove chapter serial from EPUB header

7a38573

Update novelfire.py

4776260

Remove chapter serial from HTML output

6ce0622

Removed the chapter serial display from the HTML output.

Update novelfire.py

3a5f892

Refactor NovelFireCrawler to streamline chapter downloading and novel information extraction.

Update novelfire.py

7ed8d18

Update chapter title removal logic in download_chapter_body

cbd2802

Update chapter title removal to handle h4 tags

e5e84f5

Refactor epub.py for import order and chapter heading

7714dae

Reordered import statement and added serial heading to chapter content.

Implement text normalization for chapter title matching

dc51bae

Added a normalization function to standardize text for fuzzy matching, improving chapter title comparison.

Remove unused _normalize function and related code

d064e1d

Removed the _normalize function and its usage for chapter title normalization.

Update novelfire.py

9153fb9

Refactor chapter body download function

5b2b927

Refactor download_chapter_body to remove leading chapter titles.

Update novelfire.py

3ae7d8f

Refactor download_chapter_body for header extraction

45b10a8

Refactor download_chapter_body to improve header handling and add regex checks for chapter titles.

Merge branch 'dev' into dev

f39bce3

dipu-bd requested changes May 23, 2026


dipu-bd force-pushed the dev branch from 795de41 to bac3b0d May 24, 2026 20:57

Fix comments and update logger info formatting

18cfb6f

TheMr-Fool requested a review from dipu-bd May 25, 2026 21:39

dipu-bd requested changes May 26, 2026


Refactor serial heading logic in epub.py

2e772d4

TheMr-Fool commented May 31, 2026


dipu-bd added 4 commits May 31, 2026 09:52

Update epub.py

757a46d

Update novelfire.py

3ce6122

Update novelfire.py

bd93c23

Update novelfire.py

4950a5e

Update epub.py

0e1c976

dipu-bd merged commit 4870870 into lncrawl:dev May 31, 2026
6 checks passed

dipu-bd merged 26 commits into lncrawl:dev from TheMr-Fool:dev

TheMr-Fool commented May 22, 2026 (edited)

dipu-bd May 23, 2026

TheMr-Fool May 25, 2026

dipu-bd May 26, 2026

TheMr-Fool May 26, 2026

dipu-bd May 30, 2026 (edited)

TheMr-Fool May 31, 2026

TheMr-Fool left a comment
