refactor(translator): enhance prompt conversion method#637
refactor(translator): enhance prompt conversion method#637awwaawwa merged 18 commits intoPDFMathTranslate:mainfrom Tql-ws1:change-prompt-conversion-method
Conversation
…hen using custom prompts.
Main Changes:
- Replaced `eval()` with `json.loads()` to convert text into a JSON object.
- Emphasized the occurrence of errors that prevent further execution.
- Used `string.Template.substitute()` instead of `string.Template.safe_substitute()` to avoid unintentional use of incorrectly formatted templates.
- Added logging information to inform users whether the custom prompt is being used effectively.
Other Changes:
Refactor!:
- Used `num_predict` parameter to set the output length limit for large language models in ollama.
Perf!:
- Removed broad and redundant exception throwing.
- Disabled streaming output and used regular expressions to remove thought chain outputs.
Test:
- Added unit test cases for `OllamaTranslator`.
Style:
- Sorted library imports.
- Added type hints where necessary.
Related to issue #636.
|
I have converted this PR to draft status. Please mark it as Ready after the work is completed. |
|
_remove_cot_content can use re.sub and |
Good idea! |
Here for you. |
…bstituted variables can cause JSON parsing errors
|
Completed. However, |
|
And you, @hellofinch! If you wanted to solve this issue, you should have raised it four days ago and also initiated a pull request at the same time, rather than waiting until today. This way, others would have one less merge conflict to handle and wouldn't have to clean up the mess you left behind! |
|
Please pass all CI checks, thank you! However, it might be because the format was changed recently but CI wasn't updated........ |
Thank you for your feedback. We understand that merge conflicts can create additional workload. At the same time, we want to thank @hellofinch for their efforts and voluntary contributions. This is an open-source project, and everyone's time and energy are valuable - we're grateful for all participation. To better avoid similar conflicts in the future, everyone can consider raising issues and discussing them early when problems are discovered, and plan solutions in advance. If you have any suggestions or better practices to make collaboration smoother, we very much welcome your sharing! Once again, thank you to every contributor for your dedication in making this project better together! |
|
Undoubtedly, @hellofinch's contributions are certainly commendable. But, code reviewers! Could you please help? How did something like using When I raised issue #636 and initiated pull request #637, I wondered if @hellofinch had reviewed them. I assume they didn't, otherwise, why would they suggest more verbose and less readable code and remind me to conduct thorough testing? If you had reviewed my changes, you might not have posted such content. Interestingly, about ten minutes after posting this code, @hellofinch began submitting a related pull request, with the commit time for That's why I'm frustrated with @hellofinch and the reviewers who approved @hellofinch's commit. It's absurd! Making contributions is cool, but when those contributions contradict the project's direction and quality, and are accepted without refinement, they become a burden. |
Apologies for my mistake in reviewing, and thank you for pointing it out. We will revert the erroneous commits and work on improving the maintainability of the codebase. |
|
GOOD! |
awwaawwa
left a comment
There was a problem hiding this comment.
Could you revise the test when you have time?
…r prompt parsing and add prompt to cache params
@awwaawwa Thank you for pointing out that "the object being mocked should be more concrete" and for helping me further improve the unit tests of
I reviewed your changes here, and I think we should not add any exception handling statements except for those expected exceptions. I raised #636 because when an error occurred, it did not provide specific exception information, making it difficult to pinpoint the root cause. You can try modifying the code in |
|
@Tql-ws1 The exception handling is a problem with the current program architecture. I plan to solve it through pdf2zh 2.0. The current solution is a workaround that can help reduce customer service pressure. As for Additionally, this modification can improve compatibility with models like o1/r1, since these models cannot set system prompts. |
Using template substitution alone can indeed avoid parsing issues caused by formatting, but what about models that rely on system prompts to constrain their behavior? This might reduce translation quality. How about considering releasing a nightly version to crowdtest and compare the effects before and after removing the system prompt? |
|
Unfortunately, the current release CI doesn't handle nightly builds very well... However, I hope to improve this in #645. After improving the release CI, releasing will become much easier, so it should be safe to include this change directly in the official version. If there are many reports of translation quality degradation, we can revert this change.
|
|
Okay, understood. Are there any other areas that need improvement regarding this submission? If not, I think I've covered everything I needed to do. |
Next, I'll wait for @hellofinch to help test the API and CLI custom prompt functionality. After his testing, I will merge this PR. |
|
@Tql-ws1 I just remembered that https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#custom-prompt this file needs some modifications. For example, this prompt I used for testing before: |
|
CLI and WebUI work fine for me. @awwaawwa |
|
Thank you for your contribution. According to https://funstory-ai.github.io/BabelDOC/CONTRIBUTOR_REWARD/ , you can apply for a monthly membership redemption code for Immersive Translate. |
|
Nice work! API works good now! Very Well!! |

#636