In this document, two lines are missing from the output; both are located at the end of the page and start with a digit.
The issue, which at first appeared to be a pdfalto issue, is actually a segmentation model issue:

where the full line is tagged as <page> (see training data output):
