grobid/dataseer

By grobid

Updated over 2 years ago

Dataset spotting in scholarly articles with Machine Learning

grobid/dataseer repository overview

dataseer-ml is a Machine Learning tool that aims to identify implicit mentions of datasets in a scientific article and to classify the identified datasets in a hierarchy of dataset types directly derived from MeSH. Most datasets discussed in scientific articles are not named, yet these data are part of the disclosed scientific work and should be shared properly to meet the FAIR requirements.

The module can process a variety of scientific article formats: PDF, TEI, and mainstream publishers' native XML submission formats, including JATS/NLM, ScholarOne, BMJ, Elsevier staging format, OUP, PNAS, RSC, Sage, Wiley, etc.
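As an illustration of what handling several input formats involves, here is a minimal Python sketch that guesses the format of a document from its first bytes or its XML root element. It is not part of dataseer-ml: the function name `sniff_format` and the returned labels are hypothetical, and treating any `article` root as JATS/NLM is a simplifying assumption, since other publisher schemas also use that element name.

```python
import xml.etree.ElementTree as ET


def sniff_format(data: bytes) -> str:
    """Guess the input format of a document (hypothetical helper).

    Returns one of: 'pdf', 'tei', 'jats', 'other-xml', 'unknown'.
    """
    # PDF files start with the '%PDF-' header.
    if data.lstrip()[:5] == b"%PDF-":
        return "pdf"

    # Anything else is only recognized if it parses as XML.
    try:
        root = ET.fromstring(data)
    except ET.ParseError:
        return "unknown"

    # Drop a namespace prefix such as '{http://www.tei-c.org/ns/1.0}TEI'.
    local_name = root.tag.rsplit("}", 1)[-1]

    if local_name == "TEI":
        return "tei"
    if local_name == "article":
        # Assumption: an 'article' root is treated as JATS/NLM.
        return "jats"
    return "other-xml"


if __name__ == "__main__":
    print(sniff_format(b"%PDF-1.7\n"))
    print(sniff_format(b'<TEI xmlns="http://www.tei-c.org/ns/1.0"><text/></TEI>'))
    print(sniff_format(b"<article><front/></article>"))
```

A real pipeline would need per-publisher checks (for example on the DOCTYPE or schema location) to tell the native XML formats apart; the sketch only separates the broad families.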

Tag summary

Content type: Image
Digest: sha256:0e105936d
Size: 11.6 GB
Last updated: over 2 years ago

Requires Docker Desktop 4.37.1 or later.