Skip to content

fix: demote misleading LibreOffice 'not found' warning to debug (closes #230)#237

Merged
LarFii merged 2 commits intoHKUDS:mainfrom
jwchmodx:fix/libreoffice-misleading-warning
Apr 7, 2026
Merged

fix: demote misleading LibreOffice 'not found' warning to debug (closes #230)#237
LarFii merged 2 commits intoHKUDS:mainfrom
jwchmodx:fix/libreoffice-misleading-warning

Conversation

@jwchmodx
Copy link
Copy Markdown
Contributor

@jwchmodx jwchmodx commented Apr 3, 2026

Problem

On systems where only soffice is on PATH (the typical case on macOS after
brew install --cask libreoffice), the converter loop logged a WARNING
for the libreoffice candidate before immediately succeeding via the
soffice fallback. Users saw:

WARNING:raganything.parser:LibreOffice command 'libreoffice' not found
INFO:raganything.parser:Successfully converted thaiculture.pptx to PDF using soffice

This made the conversion look broken (#230) even though it completed
successfully. Users were confused and opened support questions.

Root Cause

FileNotFoundError in the candidate loop was always logged at WARNING
level, regardless of whether more candidates remained to try.

Fix

Introduce an is_last flag and only emit a WARNING when the
FileNotFoundError is raised for the final candidate (i.e. all options
are exhausted). For intermediate candidates the message is demoted to
DEBUG, keeping normal logs clean while still allowing the full trace to
appear under --debug.

# Before — always WARNING
cls.logger.warning(f"LibreOffice command '{cmd}' not found")

# After — DEBUG for non-final, WARNING only when all candidates fail
if is_last:
    cls.logger.warning(f"LibreOffice command '{cmd}' not found")
else:
    cls.logger.debug(f"LibreOffice command '{cmd}' not found, trying next candidate")

Behaviour After This Change

Scenario Before After
libreoffice missing, soffice works ⚠️ WARNING + ✅ INFO 🔍 DEBUG + ✅ INFO
Both binaries missing ⚠️ WARNING + ⚠️ WARNING + ❌ RuntimeError ⚠️ WARNING (last) + ❌ RuntimeError
libreoffice works ✅ INFO ✅ INFO

Checklist

  • No new dependencies
  • No breaking API changes
  • ruff check + ruff format pass
  • Existing test suite: same 4 pre-existing failures unrelated to this change, all others pass

jwchmodx added 2 commits April 3, 2026 12:16
HKUDS#159)

Reasoning models (DeepSeek-R1, Qwen2.5-think, etc.) wrap their
chain-of-thought in <think>…</think> blocks before emitting the
final answer.  When _robust_json_parse fails to extract a valid JSON
object from the response, the four modal-processor parse methods
(_parse_response, _parse_table_response, _parse_equation_response,
_parse_generic_response) were returning the **raw** LLM response as
the fallback caption and summary.  This caused internal model
reasoning to be stored in the knowledge graph instead of the actual
content description.

Fix: add a static helper `BaseModalProcessor._strip_thinking_tags`
that removes <think>/<thinking> blocks (case-insensitive, multiline)
and apply it in every fallback branch so only the final-answer text
is stored or returned.

The helper is tested in tests/test_strip_thinking_tags.py with 13
unit tests covering: tag variants, multiline blocks, multiple blocks,
case-insensitivity, and the full fallback path for all four
processor classes.
HKUDS#230)

On systems where only 'soffice' is on PATH (common on macOS), the
existing fallback loop logged a WARNING for the 'libreoffice' candidate
before successfully converting via 'soffice'.  This caused users to see:

  WARNING: LibreOffice command 'libreoffice' not found
  INFO:    Successfully converted file.pptx to PDF using soffice

…and conclude that something was broken, even though the conversion
succeeded.

Fix: log FileNotFoundError at DEBUG level for any non-final candidate
so that routine 'libreoffice' → 'soffice' fallback stays silent in
normal logs.  The WARNING is preserved only when the last candidate in
the list is not found (meaning no usable LibreOffice binary exists at
all and the conversion is about to fail).
@LarFii
Copy link
Copy Markdown
Collaborator

LarFii commented Apr 7, 2026

Thanks for your contribution!

@LarFii LarFii merged commit 86abfd7 into HKUDS:main Apr 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants