Releases · lncrawl/lightnovel-crawler

Added

Job notifications — JobNotificationService dispatches email on job state changes (pending → running → success/failure) via a background TaskManager; triggered from handler helpers (_set_running, _set_success, etc.)
Docker healthcheck — server container now exposes a /health probe
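
As a rough illustration of the job-notification flow described above, the sketch below sends a message only when a job's status actually changes. `NotificationSketch`, `set_status`, and the `Job` dataclass are hypothetical stand-ins, not the project's real `JobNotificationService` or handler helpers, and the background `TaskManager` is replaced by a plain callable.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class JobStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCESS = "success"
    FAILURE = "failure"


@dataclass
class Job:
    id: str
    status: JobStatus = JobStatus.PENDING


@dataclass
class NotificationSketch:
    """Hypothetical stand-in for JobNotificationService."""

    # Assumed to be something that queues an email; here it is any callable.
    send: Callable[[str], None]

    def on_change(self, job: Job, old: JobStatus) -> None:
        # Notify only on a real transition, not on a repeated status write.
        if job.status is not old:
            self.send(f"Job {job.id}: {old.value} -> {job.status.value}")


def set_status(job: Job, new: JobStatus, notifier: NotificationSketch) -> None:
    """Illustrative helper in the spirit of _set_running / _set_success."""
    old, job.status = job.status, new
    notifier.on_change(job, old)


sent: List[str] = []
notifier = NotificationSketch(send=sent.append)
job = Job(id="job-1")
set_status(job, JobStatus.RUNNING, notifier)
set_status(job, JobStatus.RUNNING, notifier)  # same status, no message
set_status(job, JobStatus.SUCCESS, notifier)
```

Passing the sender in as a callable keeps the sketch independent of any mail library. In the real service the send step runs in the background, per the note above.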

Changed

Job runner refactored into typed handlers — JobRunner now dispatches via a _HANDLER_REGISTRY of BaseHandler/BatchHandler subclasses; each job type has its own module under scheduler/handlers/
Web app synced before Docker build — lncrawl-web artifacts are pulled in as part of the Docker build step
crawler_version stamped on novel/chapter updates — upserts now use a merge strategy to preserve existing data
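
The registry-based dispatch mentioned for `JobRunner` can be pictured with a minimal sketch like the one below. Only the names `_HANDLER_REGISTRY` and `BaseHandler` come from the notes. The `register` decorator, the `run` method, the `"novel"` key, and `NovelHandlerSketch` are assumptions made for illustration and may not match the actual code under scheduler/handlers/.

```python
from typing import Callable, Dict, Type


class BaseHandler:
    """Hypothetical base class; the real BaseHandler API is not shown in the notes."""

    def run(self, payload: dict) -> str:
        raise NotImplementedError


# Maps a job type key to the handler class responsible for it.
_HANDLER_REGISTRY: Dict[str, Type[BaseHandler]] = {}


def register(job_type: str) -> Callable[[Type[BaseHandler]], Type[BaseHandler]]:
    """Class decorator (an assumption) that records a handler for one job type."""

    def decorator(cls: Type[BaseHandler]) -> Type[BaseHandler]:
        _HANDLER_REGISTRY[job_type] = cls
        return cls

    return decorator


@register("novel")
class NovelHandlerSketch(BaseHandler):
    def run(self, payload: dict) -> str:
        return f"crawl {payload['url']}"


def dispatch(job_type: str, payload: dict) -> str:
    """Look up the handler for the job type and run it."""
    try:
        handler_cls = _HANDLER_REGISTRY[job_type]
    except KeyError:
        raise ValueError(f"no handler registered for job type {job_type!r}") from None
    return handler_cls().run(payload)


result = dispatch("novel", {"url": "https://example.com/novel/1"})
```

The benefit of this shape is that adding a job type means adding one module with one registered class, without touching the dispatcher.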

Fixed

Server hangup — root cause of hang addressed (event lock contention / crawler resource leak)
Server crash — crawler resource leak on shutdown fixed; Docker healthcheck added
crawl.py (#3030) — regression in CLI crawl flow corrected
Torproxy — re-enabled after an unintended regression

Full Changelog: v4.7.0...v4.8.0

@GabrielCWT

Added

Background search jobs — novel search is now a proper background job with two new JobType values:
- SEARCH_SOURCE — searches a single crawlable source; trigger via POST /api/job/create/search-sources?domain=…
- SEARCH_ALL_SOURCES — fans out across every searchable source, spawning one SEARCH_SOURCE child per source; idempotent on retry
- JobRunner handles execution: results stored in job.extra, matched URLs create a NOVEL_BATCH child job
- New Alembic migration (add_jobtype) for PostgreSQL compatibility
PAUSED job status — new JobStatus.PAUSED enum value for finer job lifecycle control
Per-tier search-job rate limiting — BASIC users are capped at 1 concurrent search while the general active-job quota remains independent; search query length validated (2–50 chars); results sorted by match ratio
NovelFire search — SEARCH capability added to NovelFireCrawler (#3009)
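
The query validation and ranking rules above (2 to 50 characters, results sorted by match ratio) could look roughly like this. The notes do not say how the match ratio is computed. `difflib.SequenceMatcher` is swapped in here as an assumption, and `validate_query` and `rank_results` are hypothetical names.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

MIN_QUERY_LEN, MAX_QUERY_LEN = 2, 50  # bounds quoted in the release notes


def validate_query(query: str) -> str:
    """Reject queries outside the allowed length (stripping is an assumption)."""
    query = query.strip()
    if not MIN_QUERY_LEN <= len(query) <= MAX_QUERY_LEN:
        raise ValueError("query must be between 2 and 50 characters")
    return query


def rank_results(query: str, titles: List[str]) -> List[Tuple[str, float]]:
    """Score each title against the query and sort best match first."""
    q = query.lower()
    scored = [(t, SequenceMatcher(None, q, t.lower()).ratio()) for t in titles]
    return sorted(scored, key=lambda item: item[1], reverse=True)


query = validate_query("  solo leveling ")
ranked = rank_results(query, ["Overlord", "Solo Leveling", "Solo Farming"])
best_title, best_ratio = ranked[0]
```

An exact, case-insensitive title match gets a ratio of 1.0, so it always sorts first.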

Changed

Removed vendored lncrawl/cloudscraper — the embedded Cloudflare-bypass fork (v1/v2/v3 handlers, captcha integrations, JS interpreters, 7 913-line browsers.json) has been removed; HTTP scraping is now delegated to the lncrawl-scraper package
JavaScript engine replaced: PyExecJs → quickjs → exejs — lighter dependency, no Node.js or external runtime required
Proxy support in scraper — the lncrawl-scraper integration now supports proxies; build-essentials added to Docker base image (#3014)
BrowserTemplate merged into SoupTemplate — BrowserTemplate is integrated directly into the soup template hierarchy rather than being a standalone class; all browser-based sources refactored accordingly
Job service hardening — event locking and improved update logic in JobService; request timeouts in the scraper adjusted
Docker improvements — faster image builds; updated compose.yml and server-compose files; fixed unintended root access in server-compose
truyenfull: updated domain and search URL; base_url changed to a list to support multiple domains (#3010)
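
For the truyenfull change, a crawler whose `base_url` may be either a single string or a list can be matched against an incoming URL roughly as follows. This is a sketch under assumptions: `normalize_base_urls` and `handles` are hypothetical helpers, and the example.com / example.org domains are placeholders, not the source's real domains.

```python
from typing import List, Union
from urllib.parse import urlsplit


def normalize_base_urls(base_url: Union[str, List[str]]) -> List[str]:
    """Accept the old single-string form as well as the new list form."""
    return [base_url] if isinstance(base_url, str) else list(base_url)


def handles(base_url: Union[str, List[str]], url: str) -> bool:
    """True if the URL's host matches the host of any configured base URL."""
    host = urlsplit(url).hostname
    return any(urlsplit(b).hostname == host for b in normalize_base_urls(base_url))


multi = ["https://example.com/", "https://example.org/"]
on_second_domain = handles(multi, "https://example.org/novel/1")
on_other_domain = handles("https://example.com/", "https://other.test/novel/1")
```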

Fixed

Security — path-traversal / static-file exposure vulnerability fixed in app.py and staticfiles.py (#3005)
katreadingcafe — chapter link validation logic corrected to filter out non-chapter URLs (#3026)
Race condition — parallel search result aggregation could yield inconsistent data under concurrent writes; fixed with proper locking
EPUB + NovelFire (#2993):
- Duplicate chapter title and serial number removed from chapter body content
- download_chapter_body header extraction improved (regex + text normalisation)
- EPUB serial heading logic refactored
Source loading on restart — a failure loading one source no longer aborts the full reload cycle
Cover download — full error stack trace suppressed for non-critical cover fetch failures
PyInstaller packaging (setup_pyi) — fixed a regression in frozen-binary builds
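
The notes do not show the actual patch for the path-traversal fix in app.py and staticfiles.py. A common way to guard static-file serving is to resolve the requested path and refuse anything that lands outside the static root, as in this sketch. `resolve_static` and the demo directory are hypothetical.

```python
import tempfile
from pathlib import Path


def resolve_static(root: Path, requested: str) -> Path:
    """Resolve a request path under root, rejecting escapes such as '../'."""
    root = root.resolve()
    # Strip leading slashes so an absolute request path cannot replace root.
    candidate = (root / requested.lstrip("/")).resolve()
    if not candidate.is_relative_to(root):  # requires Python 3.9+
        raise PermissionError(f"path escapes static root: {requested!r}")
    return candidate


# The directory does not need to exist for the check to work.
static_root = Path(tempfile.gettempdir()) / "static-demo"
ok_path = resolve_static(static_root, "css/app.css")

try:
    resolve_static(static_root, "../secret.txt")
    blocked = False
except PermissionError:
    blocked = True
```

Resolving both the root and the candidate before comparing them matters: a plain string prefix check can be fooled by `..` segments or symlinked directories.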

New Contributors

@GabrielCWT made their first contribution in #3009
@augustanational made their first contribution in #3010
@templeofshadow made their first contribution in #3026

Full Changelog: v4.6.0...v4.7.0

What's Changed

New Features

Novel Recommendations — the server now suggests related novels based on what you're reading
Machine Translation — full translation service with multiple backends (Bing, Google, Lingva, Baidu) with automatic failover; translates chapter content, chapter titles, and artifacts (EPUB/etc.)
Granular translation job types — translation tasks are now split per-resource (chapter, volume, title) instead of one monolithic TRANSLATION job, giving finer progress tracking
Referral / invite system — users can invite others via email with a referral link
Expanded browser detection — Brave, Vivaldi, Yandex, and Whale are now recognized for app-mode launching alongside Chrome/Edge
More supported translation languages
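
Automatic failover across translation backends, as described for the Machine Translation feature, can be sketched as trying each backend in order and returning the first success. The function name, the `Translator` signature, and the two demo backends are assumptions. The real Bing, Google, Lingva, and Baidu integrations are not reproduced here.

```python
from typing import Callable, Iterable, List, Tuple

# Assumed signature: (text, target_lang) -> translated text
Translator = Callable[[str, str], str]


def translate_with_failover(
    text: str,
    target_lang: str,
    backends: Iterable[Tuple[str, Translator]],
) -> Tuple[str, str]:
    """Return (backend_name, translation) from the first backend that succeeds."""
    errors: List[str] = []
    for name, backend in backends:
        try:
            return name, backend(text, target_lang)
        except Exception as exc:  # any backend failure moves on to the next one
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all translation backends failed: " + "; ".join(errors))


def flaky_backend(text: str, target_lang: str) -> str:
    raise ConnectionError("service unavailable")


def echo_backend(text: str, target_lang: str) -> str:
    # Demo only: tags the text instead of translating it.
    return f"[{target_lang}] {text}"


used, translated = translate_with_failover(
    "hello", "fr", [("flaky", flaky_backend), ("echo", echo_backend)]
)
```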

Improvements

Browser automation migrated from Selenium to nodriver for more reliable JS-rendered site scraping
Switched to a custom caching layer instead of cachetools for better control
Announcement banners improved in the web UI
Chapter body cleaning improved when downloading
User activity tracking added (page visits, static file downloads)
Webview fallback now shows just the terminal when no app-mode browser is found
Tightened API access control; auth guards now use Security() instead of Depends()
Removed initial content when a language is pre-defined
Invitation email subject line updated

Bug Fixes

Fixed SQLite compatibility issue with migrations (batch_alter_table)
Fixed Calibre-based artifact generation when using translations
Fixed searching regression

Fixed chapter fetch/translate functions not passing user ID correctly
Fixed select_descendants typo in security module (#2966)
Fixed invalid URL exceptions crashing fetch_chapter and fetch_image
Fixed ensure_load crashing when sync thread was already cleaned up
Fixed app startup issues

Source Updates

wtr-lab.com — multiple fixes and updates
novelfire.py — several iterative fixes
Chapter title tag removal extended to <h4> elements
More sources flagged as rejected/inactive in the index

Internal / Infrastructure

lncrawl-web is no longer a git submodule; web build artifacts are bundled directly
Removed deprecated fetch-novel API endpoint (replaced by fetch-novels)
Python 3.15 excluded from psycopg test matrix (not yet supported upstream)
server-compose.yml updated

Full diff: v4.5.0...v4.6.0

What's Changed

Bug Fixes

Fix crash when downloading novels with more than 9 volumes (#2970)
Fix artifact download failing with 400 Bad Request when filename contains % (#2963)
Fix PostgreSQL database connection broken since v4.2.1 (#2981)
Fix storage path directory not being created before writing URL in _build_url
Add MIME type handling for file responses in the web server
Fix browser detection on Flatpak environments
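
The notes do not say how the `%` filename bug (#2963) was fixed. In general, a literal `%` in a URL path must be percent-encoded as `%25`, otherwise a server may treat it as the start of an invalid escape sequence. The sketch below shows that encoding with the standard library. `artifact_url` and the base URL are hypothetical.

```python
from urllib.parse import quote, unquote


def artifact_url(base: str, filename: str) -> str:
    """Build a download URL with the filename fully percent-encoded."""
    return f"{base.rstrip('/')}/{quote(filename, safe='')}"


filename = "100% Win Rate.epub"
url = artifact_url("http://localhost:8181/artifacts/", filename)

# Decoding the last path segment gives back the original filename.
round_trip = unquote(url.rsplit("/", 1)[1])
```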

New Features

Windows installer: Added Inno Setup-based installer (lncrawl.exe) for proper install/uninstall on Windows
Fallback browser window: When Chrome/Edge is not found, a tkinter window with the app icon is used as fallback
Faster Windows startup: Switched to --onedir mode on Windows (vs --onefile on Mac/Linux) for quicker launch
Added explicit app subcommand to CLI for launching the webview directly
Improved URL building in webview server

New Sources

Added novelfrance.fr (#2946)

Updated Sources

Updated wattpad.com (#2983)

Internal Changes

Refactored LSP session management and source synchronization logic
Enhanced LSP configuration and logging; updated dependencies
Fixed ruff format command syntax in lint workflow

Full Changelog: v4.4.0...v4.5.0

What's Changed

New Features

LSP server: Implemented a built-in Python Language Server (pylsp) for source code editing, with improved readiness checks and restart logic
Source management API: Added API endpoints for source code retrieval, management, and live testing directly from the web UI
GitHub integration: Added GitHub token management and enhanced GitHubClient for fetching/editing remote source files; added remote edit link per source
Source testing for admins: Admin role check and expanded source testing functionality; non-admins receive a proper error when attempting to run modified source code
Domain endpoint: New endpoint to retrieve a source item by domain; extract_host utility for reliable domain extraction in novel creation
PageSoup.prettify: Added prettify method to PageSoup for cleaner debug output in crawler tests
dev Makefile target: New make dev target added; watch dependency updated
Pyright type checking: Added Pyright static analysis to the lint CI workflow

Bug Fixes

Fixed app launch inside the webview (#2942 — also fixes webview not starting on Windows, and UV path in Makefile on Windows)
Fixed empty chapter bodies produced by the NovelFull template
Fixed novelbin and related NovelFull-based sources
Fixed chapter list and chapter body parsing in the novelight source
Fixed executor initialization in CentralNovelCrawler
Fixed port extraction in extract_host when the port value is None
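
The `extract_host` fix concerns URLs with no explicit port. The real function's behavior is not described in the notes, so the sketch below is only one plausible shape: it appends the port when one is present and returns the bare host when `urlsplit` reports `None`. `extract_host_sketch` is a hypothetical name.

```python
from urllib.parse import urlsplit


def extract_host_sketch(url: str) -> str:
    """Return 'host' or 'host:port', handling URLs without a scheme or port."""
    if "://" not in url:
        # Prefix '//' so urlsplit parses the input as a network location.
        url = "//" + url
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # parts.port is None when the URL carries no explicit port.
    return host if parts.port is None else f"{host}:{parts.port}"


no_port = extract_host_sketch("https://Example.com/novel/1")
with_port = extract_host_sketch("http://example.com:8181/novel/1")
no_scheme = extract_host_sketch("example.com/novel/1")
```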

Improvements

Faster startup: Refactored initialization path to make CLI/server startup significantly faster
Chapter sync: ChapterService.sync now preserves is_done flag and merges extras rather than overwriting
BrowserTemplate: Fallback browser now runs in headless mode
TaskManager: Refactored to manage progress bars internally; removed unused proxy module
EPUB metadata: Corrected group position handling in EPUB metadata (#2905)
TextCleaner / Webfic: Enhanced text cleaning and Webfic source processing
Crawler versioning: Updated versioning logic; process_info now captures commit time
PR models: Refactored PR creation models, added PR fetch endpoint, improved error handling and formatting
Type hints: Improved type hint consistency across models, config, json_tools, and scripts
User index: Optimized user index file handling
History limit: Added configurable history limit to project setup
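
The `ChapterService.sync` change (keep `is_done`, merge extras instead of overwriting) can be illustrated with a small sketch. `ChapterRow` and `sync_chapter` are hypothetical and much simpler than the project's models. Letting incoming keys win on conflict is an assumption, since the notes do not specify the merge precedence.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ChapterRow:
    title: str
    is_done: bool = False
    extra: Dict[str, Any] = field(default_factory=dict)


def sync_chapter(existing: ChapterRow, incoming: ChapterRow) -> ChapterRow:
    """Refresh crawled fields while keeping download state and stored extras."""
    return ChapterRow(
        title=incoming.title,
        is_done=existing.is_done,  # a re-sync must not reset download state
        extra={**existing.extra, **incoming.extra},  # incoming wins on conflict
    )


stored = ChapterRow("Chapter 1", is_done=True, extra={"words": 1200, "lang": "en"})
crawled = ChapterRow("Chapter 1: Arrival", extra={"lang": "en-US"})
merged = sync_chapter(stored, crawled)
```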

Source Updates

royalroad.com — updated (×2)
novelcool.com — updated (×2)
freewebnovel — updated
asianovel.net — updated

Dependency Updates

pyease-grpc → 1.8.0
mako → 1.3.12 (#2950)
Added urllib3 version constraint
Updated Dockerfile to sync all extras and groups during build
Updated license metadata in pyproject.toml

Full Changelog: v4.3.2...v4.4.0

Full Changelog: v4.3.1...v4.3.2

Updated the version from 4.3.0 to 4.3.1.
WebView initialization now persists cookies and storage under APP_DIR, so session data is retained between launches.

Full Changelog: v4.3.0...v4.3.1

@dipu-bd

What's Changed

Refactor core components and enhance crawling functionality by @dipu-bd in #2910
Bump mako from 1.3.10 to 1.3.11 by @dependabot[bot] in #2927
Bump cryptography from 46.0.6 to 46.0.7 by @dependabot[bot] in #2918
fix: fenrirealm.com crawler broken after site migration to SvelteKit by @pathsny in #2928
fix: update skydemonorder crawler for Livewire migration by @josegonzalez in #2935
Bump lxml from 6.0.2 to 6.1.0 by @dependabot[bot] in #2931
Update server configuration and improve database handling
- Changed the default server port from 8080 to 8181 in the Docker Compose configuration and server command.
- Enhanced the database connection handling by using engine.begin() for transactions.
- Updated the database schema verification method to improve clarity and logging.
- Refactored EPUB generation logic to ensure proper item addition to the book structure.
- Adjusted the HTML parsing logic in the Freewebnovel template for better selector usage.

New Contributors

@pathsny made their first contribution in #2928

Full Changelog: v4.2.1...v4.3.0
