Research infrastructure recognition #1085

Merged
kermitt2 merged 15 commits into master from research-infrastructures
Feb 11, 2024
@kermitt2 kermitt2 commented Feb 11, 2024

This PR adds explicit recognition of acknowledged research infrastructures to the funding-acknowledgement model, with dedicated features and gazetteer resources.

Training data for the funding-acknowledgement model has been extended with an infrastructure class as a refinement of "institution".

The extracted research infrastructures are then listed in an additional block <listOrg type="infrastructure"> (similar to the one for funding) in the back matter (/TEI/text/back/listOrg[@type="infrastructure"]):

            <listOrg type="infrastructure">
                <org type="infrastructure">
                    <orgName type="extracted">CINES</orgName>
                    <orgName type="full" lang="en">National Computer Center for Higher Education</orgName>
                    <orgName type="full" lang="fr">Centre informatique national de l'enseignement supérieur</orgName>
                </org>
                <org type="infrastructure">
                    <orgName type="extracted">GENCI</orgName>
                    <orgName type="full" lang="fr">Grand Équipement National de Calcul Intensif</orgName>
                </org>
            </listOrg>
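
A minimal sketch of how a downstream consumer might read the block above, using only the Python standard library. The function name `list_infrastructures` and the wrapper document in `SAMPLE_TEI` are hypothetical, and the sketch assumes the output is in the standard TEI namespace; the element names, attributes and path come from the example above.

```python
import xml.etree.ElementTree as ET

# Assumption: the TEI output uses the standard TEI namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical wrapper document around the listOrg block shown in this PR.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <back>
      <listOrg type="infrastructure">
        <org type="infrastructure">
          <orgName type="extracted">CINES</orgName>
          <orgName type="full" lang="en">National Computer Center for Higher Education</orgName>
          <orgName type="full" lang="fr">Centre informatique national de l'enseignement supérieur</orgName>
        </org>
        <org type="infrastructure">
          <orgName type="extracted">GENCI</orgName>
          <orgName type="full" lang="fr">Grand Équipement National de Calcul Intensif</orgName>
        </org>
      </listOrg>
    </back>
  </text>
</TEI>"""


def list_infrastructures(tei_xml: str) -> list:
    """Return one dict per <org> found under /TEI/text/back/listOrg[@type='infrastructure']."""
    root = ET.fromstring(tei_xml)  # root is the <TEI> element
    path = "./tei:text/tei:back/tei:listOrg[@type='infrastructure']/tei:org"
    results = []
    for org in root.findall(path, TEI_NS):
        # The surface form as it appeared in the acknowledgement text.
        extracted = org.findtext(
            "tei:orgName[@type='extracted']", default="", namespaces=TEI_NS
        )
        # Full names keyed by language code ("und" when no lang attribute is present).
        full_names = {
            name.get("lang", "und"): name.text
            for name in org.findall("tei:orgName[@type='full']", TEI_NS)
        }
        results.append({"extracted": extracted, "full": full_names})
    return results


if __name__ == "__main__":
    for infra in list_infrastructures(SAMPLE_TEI):
        print(infra["extracted"], infra["full"])
```

Keying the full names by language keeps the extracted surface form separate from the full names, which appear to come from the gazetteer resources.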

The refined mark-up is also visible in the acknowledgement and funding sections:

            <div type="acknowledgement">
                <div>
                    <head>Acknowledgment</head>
                    <p>This work was partially supported by the <rs type="funder">EIPHI Graduate School</rs> 
(contract "<rs type="grantNumber">ANR-17-EURE-0002</rs>"). This work was granted access to the AI resources 
of <rs type="institution" subtype="infrastructure">CINES</rs> under the allocation 
<rs type="grantNumber">AD010613582</rs> made by <rs type="institution" subtype="infrastructure">GENCI</rs> 
and also from the <rs type="institution">Mesocentre of Franche-Comté</rs>.
                    </p>
                </div>
            </div>
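
The inline annotations above can be grouped by label in a similar way. This is only a sketch: `group_mentions` is a hypothetical helper, the TEI namespace on the sample is an assumption, and the `type/subtype` key format is an arbitrary choice made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the TEI output uses the standard TEI namespace.
TEI = "http://www.tei-c.org/ns/1.0"

# The acknowledgement section from this PR, with a namespace declaration added.
SAMPLE_ACK = """<div xmlns="http://www.tei-c.org/ns/1.0" type="acknowledgement">
  <div>
    <head>Acknowledgment</head>
    <p>This work was partially supported by the <rs type="funder">EIPHI Graduate School</rs>
(contract "<rs type="grantNumber">ANR-17-EURE-0002</rs>"). This work was granted access to the AI resources
of <rs type="institution" subtype="infrastructure">CINES</rs> under the allocation
<rs type="grantNumber">AD010613582</rs> made by <rs type="institution" subtype="infrastructure">GENCI</rs>
and also from the <rs type="institution">Mesocentre of Franche-Comté</rs>.</p>
  </div>
</div>"""


def group_mentions(xml_fragment: str) -> dict:
    """Group the text of every <rs> element by its type, refined by subtype when present."""
    root = ET.fromstring(xml_fragment)
    groups = {}
    for rs in root.iter(f"{{{TEI}}}rs"):
        label = rs.get("type", "")
        subtype = rs.get("subtype")
        # An infrastructure is an institution with subtype="infrastructure".
        key = f"{label}/{subtype}" if subtype else label
        groups.setdefault(key, []).append("".join(rs.itertext()).strip())
    return groups


if __name__ == "__main__":
    for key, mentions in group_mentions(SAMPLE_ACK).items():
        print(key, mentions)
```

Because infrastructure is expressed as a subtype of "institution", a consumer that only looks at the `type` attribute still sees CINES and GENCI as institutions.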

Two open issues:

  • There is currently little training data for research infrastructures.
  • The definition of a research infrastructure is very loose, which makes human annotation difficult. Research infrastructures overlap with ordinary research institutions, funded projects and even funders (research infrastructures often fund competitive research experiments with application numbers, etc.). Should research infrastructures be limited to "federated" infrastructures used by various organizations, or should they also include smaller research platforms inside an institution that offer research services to several departments of that single organization?

coveralls commented Feb 11, 2024

Coverage Status

coverage: 39.954% (+0.07%) from 39.886%
when pulling 8751dcb on research-infrastructures
into cab0947 on master.

@kermitt2 kermitt2 merged commit 4daa2ce into master Feb 11, 2024
@lfoppiano lfoppiano added this to the 0.8.1 milestone Jun 9, 2024
@lfoppiano lfoppiano deleted the research-infrastructures branch March 21, 2026 20:57