Hi, I noticed that, at least since v0.7.3, GROBID returns BibTeX by default for /api/processHeaderDocument. This contradicts the documentation at https://grobid.readthedocs.io/en/latest/Grobid-service/#apiprocessheaderdocument, which states that BibTeX requires an explicit Accept: application/x-bibtex header and that the default is TEI XML.
Note that it's possible to get an XML response by using Accept: application/xml.
Steps to reproduce
- Get a PDF (I used https://arxiv.org/pdf/2212.12604v1.pdf but anything will do)
- Make a request against the GROBID API. I used the HuggingFace demo API:
curl https://kermitt2-grobid.hf.space/api/processHeaderDocument --form input=@Downloads/2212.12604v1.pdf
- See that the output contains BibTeX and not TEI XML:
@misc{-1,
author = {},
title = {Search for new physics in the τ lepton plus missing transverse momentum final state in proton-proton collisions at √ s = 13 TeV The CMS Collaboration},
date = {2022-12-23},
year = {2022},
month = {12},
day = {23},
eprint = {arXiv:2212.12604v1[hep-ex]},
abstract = {A search for physics beyond the standard model (SM) in the final state with a hadronically decaying tau lepton and a neutrino is presented. This analysis is based on data recorded by the CMS experiment from proton-proton collisions at a center-ofmass energy of 13 TeV at the LHC, corresponding to a total integrated luminosity of 138 fb-1. The transverse mass spectrum is analyzed for the presence of new physics. No significant deviation from the SM prediction is observed. Limits are set on the production cross section of a W boson decaying into a tau lepton and a neutrino. Lower limits are set on the mass of the sequential SM-like heavy charged vector boson and the mass of a quantum black hole. Upper limits are placed on the couplings of a new boson to the SM fermions. Constraints are put on a nonuniversal gauge interaction model and an effective field theory model. For the first time, upper limits on the cross section of t-channel leptoquark (LQ) exchange are presented. These limits are translated into exclusion limits on the LQ mass and on its coupling in the t-channel. The sensitivity of this analysis extends into the parameter space of LQ models that attempt to explain the anomalies observed in B meson decays. The limits presented for the various interpretations are the most stringent to date. Additionally, a model-independent limit is provided.}
}
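To compare what the service returns for different Accept headers, a small helper along these lines might be handy. This is only a sketch: the function names fetch_header and detect_format and the GROBID_URL variable are made up for illustration and are not part of GROBID, and the format check just looks at the first non-blank line of the response body.

```shell
#!/usr/bin/env bash
# Sketch only: GROBID_URL, fetch_header and detect_format are illustrative
# names, not part of GROBID. The default URL is the demo instance used above.
GROBID_URL="${GROBID_URL:-https://kermitt2-grobid.hf.space}"

# POST a PDF to /api/processHeaderDocument.
# $1 = path to the PDF, $2 = optional value for the Accept header.
fetch_header() {
  local pdf="$1" accept="${2:-}"
  local args=(--silent --show-error --form "input=@${pdf}")
  if [ -n "$accept" ]; then
    args+=(--header "Accept: ${accept}")
  fi
  curl "${args[@]}" "${GROBID_URL}/api/processHeaderDocument"
}

# Read a response body on stdin and print "bibtex", "xml" or "unknown",
# judging only by the first non-blank line (a rough heuristic).
detect_format() {
  local first
  first="$(grep -m1 -v '^[[:space:]]*$')"
  # Strip leading whitespace before looking at the first character.
  first="${first#"${first%%[![:space:]]*}"}"
  case "$first" in
    '@'*) echo "bibtex" ;;
    '<'*) echo "xml" ;;
    *)    echo "unknown" ;;
  esac
}

# Example calls (these hit the network, so they are left commented out):
#   fetch_header Downloads/2212.12604v1.pdf | detect_format
#   fetch_header Downloads/2212.12604v1.pdf application/xml | detect_format
#   fetch_header Downloads/2212.12604v1.pdf application/x-bibtex | detect_format
```

If the documentation were accurate, the first call (no Accept header) should report xml; with the behaviour described in this report it would report bibtex instead.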
Requested info
Linux amd64 through lfoppiano/grobid:0.7.3 Docker image & whatever huggingface is using
- What is your Java version (java --version)?
openjdk 17.0.2 2022-01-18
OpenJDK Runtime Environment (build 17.0.2+8-86)
OpenJDK 64-Bit Server VM (build 17.0.2+8-86, mixed mode, sharing)
- In case of build or run errors, please submit the error while running gradlew with --stacktrace and --info for better log traces (e.g. ./gradlew run --stacktrace --info) or attach the log file logs/grobid-service.log.