Closed
Search before asking
- I have searched the issues and found no similar issues.
Description
Description in English:
During a load, if the source data contains errors, the erroneous rows are written to an error_log file on disk for later debugging. When there are many bad rows, this file can take up a lot of disk space. We want to limit the amount of error data that is saved to disk.
- Become familiar with how Doris' load (import) feature is used and how it is implemented internally
- Add a new BE configuration item, load_error_log_limit_bytes, with a default value of 200 MB
- Use the new threshold to limit the amount of data that RuntimeState::append_error_msg_to_file writes to disk
- Write regression cases for testing and verification
Solution
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct