feat: enhance offline assets support and improve Windows build/test workflows #712

Merged
awwaawwa merged 32 commits into PDFMathTranslate:main from awwaawwa:better-offline-assets
Mar 2, 2025

awwaawwa (Collaborator) commented Mar 2, 2025

This PR improves offline asset management and streamlines the Windows build/test workflows. It makes the following changes:

Offline Assets:

  • Support for offline assets:
    • Added functionality to generate, cache, and restore offline assets in the build process.
    • Integrated offline asset support into pdf2zh using babeldoc.assets.assets.
    • Added test cases to validate offline translation functionality.
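For readers unfamiliar with the generate/restore cycle described above, here is a minimal sketch of the general idea using only the Python standard library. It is not the babeldoc implementation: the function names `generate_offline_package` and `restore_offline_package`, the `manifest.json` layout, and the use of SHA-256 checksums are all assumptions made for illustration.

```python
import hashlib
import json
import zipfile
from pathlib import Path


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def generate_offline_package(asset_dir: Path, package_path: Path) -> dict[str, str]:
    """Zip every file under asset_dir together with a checksum manifest.

    Hypothetical helper: the package format here is an illustration only.
    """
    manifest: dict[str, str] = {}
    with zipfile.ZipFile(package_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for file in sorted(p for p in asset_dir.rglob("*") if p.is_file()):
            rel = file.relative_to(asset_dir).as_posix()
            data = file.read_bytes()
            manifest[rel] = sha256_of(data)
            zf.writestr(rel, data)
        # The manifest lets the restore step detect corrupted assets.
        zf.writestr("manifest.json", json.dumps(manifest, indent=2))
    return manifest


def restore_offline_package(package_path: Path, cache_dir: Path) -> list[str]:
    """Unpack a package into cache_dir, verifying each file's checksum."""
    restored: list[str] = []
    with zipfile.ZipFile(package_path) as zf:
        manifest = json.loads(zf.read("manifest.json"))
        for rel, expected in manifest.items():
            data = zf.read(rel)
            if sha256_of(data) != expected:
                raise ValueError(f"checksum mismatch for {rel}")
            target = cache_dir / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(data)
            restored.append(rel)
    return restored
```

In a CI setting, the generated package would be the file that gets cached or uploaded as an artifact, and the restore step would run before any test that must work without network access.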

GitHub Actions Workflow Updates:

  • Windows Executable Build Workflow:

    • Refactored Python version setup and environment variable handling.
    • Integrated caching for dependencies and offline assets metadata.
    • Enhanced artifact management, including compression level adjustments.
    • Included Visual C++ Redistributable installation in the runtime environment.
    • Introduced test modes for both online and offline scenarios.
  • Fork Build Workflow:

    • Added offline asset generation and validation steps.
    • Uploaded build artifacts with offline assets for testing purposes.
  • Python Publish Workflow:

    • Integrated offline asset preparation for publishable builds.
    • Enhanced repository condition checks and caching logic.
  • Python Test Workflow:

    • Included caching for babeldoc assets to optimize dependency handling.
    • Added support for testing offline assets functionality.

Codebase Enhancements:

  • Logging Improvements:

    • Introduced RichHandler for enhanced logging output.
    • Suppressed verbose logging for external libraries (e.g., httpx, openai).
  • Font Handling Updates:

    • Replaced manual font downloads with utility functions (get_font_and_metadata) for better maintainability.
    • Improved font path resolution for Docker and local environments.
  • Refactorings:

    • Simplified ONNX model loading with get_doclayout_onnx_model_path.
    • Clarified warning messages for missing dependencies (e.g., argostranslate).
    • Updated path handling in static scripts to support offline assets restoration.
  • Dependency Updates:

    • Updated babeldoc to >=0.1.20.
    • Added rich as a dependency for improved logging.
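The logging changes listed above can be sketched roughly as follows. This is an illustrative setup, not the code from this PR: the function name `configure_logging`, the `NOISY_LOGGERS` tuple, and the choice of `WARNING` as the threshold for third-party loggers are assumptions. The sketch falls back to a plain `StreamHandler` when `rich` is not installed.

```python
import logging

try:
    # RichHandler ships with the third-party `rich` package.
    from rich.logging import RichHandler
except ImportError:
    RichHandler = None

# Third-party loggers named in the PR description as overly verbose.
NOISY_LOGGERS = ("httpx", "openai")


def configure_logging(level: int = logging.INFO) -> None:
    """Install one root handler and quiet chatty external libraries."""
    if RichHandler is not None:
        handler = RichHandler()
    else:
        handler = logging.StreamHandler()
    # force=True replaces any handlers already attached to the root logger.
    logging.basicConfig(
        level=level, format="%(message)s", handlers=[handler], force=True
    )
    for name in NOISY_LOGGERS:
        # Assumed threshold: only warnings and errors from these libraries.
        logging.getLogger(name).setLevel(logging.WARNING)
```

Calling `configure_logging(logging.DEBUG)` once at startup keeps the application's own debug output visible while request-level chatter from the HTTP and API clients stays hidden.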

Dockerfile:

  • Disabled font downloads in the Dockerfile to reduce build time.
  • Included babeldoc warmup commands for optimized runtime performance.

awwaawwa added 30 commits March 3, 2025 03:43
awwaawwa merged commit e4f2524 into PDFMathTranslate:main Mar 2, 2025
7 checks passed
awwaawwa deleted the better-offline-assets branch March 2, 2025 21:24