@jazzido commented Jul 28, 2017

The biggest change is moving text extraction out of ObjectExtractorStreamEngine. Before this change, we had pasted code from PDFBox's LegacyPDFStreamEngine into ObjectExtractorStreamEngine.

We now use PDFTextStripper (which extends LegacyPDFStreamEngine) in ObjectExtractor.
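
To illustrate the shape of this refactor, here is a minimal, self-contained sketch of the "extend the library's engine instead of copying its code" pattern. It does not use PDFBox: `Glyph`, `BaseTextEngine`, `onGlyph`, and `CollectingStripper` are hypothetical stand-ins (roughly for a text position, the library's stripper, its per-glyph hook, and the extractor's subclass), so treat it as an analogy rather than the code in this PR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a glyph with a position (not a PDFBox class).
final class Glyph {
    final char ch;
    final float x;

    Glyph(char ch, float x) {
        this.ch = ch;
        this.x = x;
    }
}

// Hypothetical stand-in for the library's text engine. It owns the
// parsing logic and exposes a hook that subclasses can override.
class BaseTextEngine {
    void process(String pageText) {
        for (int i = 0; i < pageText.length(); i++) {
            // Fake layout: each glyph is 6 units wide.
            onGlyph(new Glyph(pageText.charAt(i), i * 6.0f));
        }
    }

    // Hook called once per glyph; does nothing by default.
    protected void onGlyph(Glyph glyph) {
    }
}

// The extractor-side class: it inherits all parsing from the engine and
// only records the glyphs, so no engine code has to be duplicated.
class CollectingStripper extends BaseTextEngine {
    private final List<Glyph> glyphs = new ArrayList<>();

    @Override
    protected void onGlyph(Glyph glyph) {
        glyphs.add(glyph);
    }

    List<Glyph> getGlyphs() {
        return Collections.unmodifiableList(glyphs);
    }
}
```

With this layout, a caller would create a `CollectingStripper`, call `process(...)`, and read the result from `getGlyphs()`. Fixes to the parsing logic then arrive with the base class instead of having to be re-pasted.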

@jazzido merged commit ec02165 into master Jul 28, 2017
@jazzido deleted the fix/171 branch July 28, 2017 16:18
EmpowerZ pushed a commit to EmpowerZ/tabula-java that referenced this pull request Oct 23, 2020
* Started work on tabulapdf#171

* using PDFTextStripper instead of duplicating pdfbox's code in ObjectExtractorStreamEngine

* moved textstripper to its own file

* removed useless fields/methods in ObjectExtractorStreamEngine

* adjust test expectation