Closed
Labels
bug: Something isn't working
Description
Describe the bug
A clear and concise description of what the bug is.
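When the JDBC data source pulls data by scanning a table in real-time (polling) mode and the table holds more than 10,000 rows, some rows are duplicated. Rows added afterwards are not duplicated.
Cause: during real-time extraction, the first SQL query issued against the JDBC data source is not ordered by splitKey. As a result, the start position used for the next poll is not the maximum value read in the previous poll, so rows are read twice.
Version where found: 1.12.1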
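The cause described above can be illustrated with a small simulation. This is a simplified model under stated assumptions, not the project's actual code: the function names `poll` and `run` are hypothetical, the table is modeled as a list of split-key values in storage order, and the next start position is assumed to be the split key of the last row read.

```python
def poll(table_keys, start, ordered):
    """Return the split-key values one poll would read (keys greater than start).

    table_keys: keys in the order the database happens to return them.
    ordered:    True models a query with ORDER BY on the split key.
    """
    batch = [k for k in table_keys if start is None or k > start]
    return sorted(batch) if ordered else batch


def run(table_keys, first_query_ordered):
    """Simulate the first full read followed by one incremental poll."""
    first = poll(table_keys, None, first_query_ordered)
    # Assumption: the next start position is the key of the LAST row read,
    # which equals the maximum key only if the rows arrived sorted.
    start = first[-1]
    second = poll(table_keys, start, ordered=True)
    return first + second


keys_in_storage_order = [3, 1, 5, 2, 4]  # hypothetical unsorted storage order

without_order_by = run(keys_in_storage_order, first_query_ordered=False)
with_order_by = run(keys_in_storage_order, first_query_ordered=True)

print("first query unordered:", without_order_by)
print("first query ordered:  ", with_order_by)
```

In this model the unordered first read ends on key 4, so the next poll re-reads key 5. Ordering the first query by the split key makes the last row read the maximum, and the next poll returns nothing new. This also matches the observation that later inserts are not duplicated, provided the later polls are ordered.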
To Reproduce
Steps to reproduce the behavior:
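1. Insert 10,000 rows into an Oracle table.
2. Configure and run a real-time extraction job from Oracle to Hive.
3. Observe that the last few thousand rows are duplicated.
4. Without stopping the job, insert more rows.
5. The newly inserted rows are not duplicated.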
Expected behavior
A clear and concise description of what you expected to happen.
Real-time extraction from Oracle should not produce duplicate rows.
Screenshots
If applicable, add screenshots to help explain your problem.
