
Extraneous string "ID" added to surname with PDF and authors with ORCID links #512

@rtournoy

Hi,
We have noticed that when processing a PDF in which the authors' ORCID URL links are labeled "ID", the "ID" string is appended to the author's surname.

Steps to reproduce:
For instance with this PDF:
https://journals.plos.org/plosbiology/article/file?id=10.1371/journal.pbio.3000097&type=printable

Article page:
https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3000097

Using the "process header document" service with Consolidate header=true (we get the same result with Consolidate header=false).

We get:

<persName xmlns="http://www.tei-c.org/ns/1.0">
    <forename type="first">Thomas</forename>
    <surname>Burgoyne Id</surname>
</persName>

Whereas the expected string is <surname>Burgoyne</surname>
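
Until this is handled in the parser itself, one possible client-side workaround is to post-process the TEI output and drop the stray label from each surname. The following is only a minimal sketch under assumptions, not GROBID's own code: `strip_orcid_label` is a hypothetical helper name, and it assumes the label always appears as a trailing "Id" token after at least one real name token.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def strip_orcid_label(tei_xml: str, label: str = "id") -> str:
    """Remove a trailing ORCID link label (e.g. "Id") from every TEI <surname>.

    Hypothetical helper for illustration; the label is matched case-insensitively.
    """
    # Keep TEI as the default namespace when serializing the result.
    ET.register_namespace("", TEI_NS)
    root = ET.fromstring(tei_xml)
    for surname in root.iter(f"{{{TEI_NS}}}surname"):
        tokens = (surname.text or "").split()
        # Only drop the label when a real name token precedes it,
        # so a surname consisting solely of "Id" is left untouched.
        if len(tokens) > 1 and tokens[-1].lower() == label:
            surname.text = " ".join(tokens[:-1])
    return ET.tostring(root, encoding="unicode")


# The fragment reported above, used as sample input.
sample = """<persName xmlns="http://www.tei-c.org/ns/1.0">
    <forename type="first">Thomas</forename>
    <surname>Burgoyne Id</surname>
</persName>"""

cleaned = strip_orcid_label(sample)
print(cleaned)
```

Note that this would also truncate a genuine multi-word surname whose last token is "Id", so it is a stopgap for affected documents rather than a general fix.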

We are using application version="0.5.6". It also happened with previous versions.

Thank you for the new release 0.5.6 and all the work.

Labels

bug, enhancement
