Importing PDF with 'Any file' filter parses PDF as BibTeX instead of using PdfContentImporter or PdfXmpImporter #7984

@btut

All files imported without explicitly selecting an importer in the file chooser dialog are parsed as BibTeX files before any of the dedicated importers are tried.
This is done in the importUnknownFormat method.
The method:

  • First tries to parse the file (no matter the file type) as BibTeX.
  • Then tries to find a fitting importer.

I noticed that with PDFs the BibTeX parser does not fail (but produces unusable results), so the importers that are actually in place to import PDFs are never even tried.

Before #694, and in particular commit f405cf4, the priority was the other way around: JabRef first tried to find a fitting importer, and only if there was none did the BibTeX parser try to read the file.
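
To illustrate why the order matters, here is a minimal, self-contained sketch. It is not JabRef's code: ImportOrderSketch, SketchImporter, and chooseImporter are hypothetical names, and the "recognizes" checks are simplified stand-ins for what real importers do. The only assumption about the formats is that a PDF file starts with the "%PDF-" header and that the BibTeX step accepts any input without failing, as described above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class ImportOrderSketch {

    /** Hypothetical stand-in for an importer; this is not JabRef's Importer class. */
    static final class SketchImporter {
        final String name;
        final Predicate<String> recognizes;

        SketchImporter(String name, Predicate<String> recognizes) {
            this.name = name;
            this.recognizes = recognizes;
        }
    }

    // PDF files begin with the "%PDF-" header, which serves as a cheap format check here.
    static final SketchImporter PDF =
            new SketchImporter("pdf", content -> content.startsWith("%PDF-"));

    // Assumption for this sketch: the BibTeX step never rejects its input.
    static final SketchImporter BIBTEX =
            new SketchImporter("bibtex", content -> true);

    /** Returns the name of the first importer in the list that accepts the content. */
    static String chooseImporter(String content, List<SketchImporter> order) {
        for (SketchImporter importer : order) {
            if (importer.recognizes.test(content)) {
                return importer.name;
            }
        }
        return "none";
    }

    public static void main(String[] args) {
        String pdfContent = "%PDF-1.7 ...";
        // BibTeX first: the permissive parser wins and the PDF importer is never asked.
        System.out.println(chooseImporter(pdfContent, Arrays.asList(BIBTEX, PDF))); // bibtex
        // Dedicated importers first, BibTeX as the fallback.
        System.out.println(chooseImporter(pdfContent, Arrays.asList(PDF, BIBTEX))); // pdf
    }
}
```

With the permissive step first, every file is claimed before a dedicated importer gets a chance; with it last, it only acts as a fallback for content no other importer recognizes.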

Why was this changed? Maybe @lenhard remembers something?

I noticed that:

  • moving it back to how it was before #694, or
  • removing the BibTeX parsing altogether (if a .bib file is selected, it is imported using the BibtexImporter anyway)

does not result in any failing tests. Are there any objections to moving it back?

JabRef version on

Steps to reproduce the behavior:

  1. File -> Import -> Import into current library
  2. Select a PDF file, leaving the file filter set to 'Any file'
  3. Get an unusable entry for that file
