Include abstract labels with PubMed MEDLINE xml import #12527

@ryan-carpenter

Is your suggestion for improvement related to a problem? Please describe.

Abstracts from PubMed include convenient section labels such as "Background", "Methods", and "Results" that make the content easier to read. In plain text records the labels are part of the abstract, so they are imported along with it. This is not the case for PubMed XML, where the labels are stored as attributes of the AbstractText tags and do not get imported.

Describe the solution you'd like

Add the section label from each Label= attribute (see example) to the related content when importing abstracts from PubMed XML.
Note that each label also has a standardized equivalent, given in the NlmCategory attribute. That category is also very useful, but it is not available in plain text PubMed records, so I suggest importing the labels rather than the categories.

Additional context

Example:

<Abstract>
    <AbstractText Label="BACKGROUND" NlmCategory="BACKGROUND">This is the background section.</AbstractText>
    <AbstractText Label="MATERIALS AND METHODS" NlmCategory="METHODS">The methods belong here.</AbstractText>
    <AbstractText Label="RESULTS" NlmCategory="RESULTS">The results showed something interesting.</AbstractText>
    <AbstractText Label="CONCLUSION" NlmCategory="CONCLUSIONS">These are the conclusions.</AbstractText>
</Abstract>

Suggested result of importing:

BACKGROUND: This is the background section. MATERIALS AND METHODS: The methods belong here. RESULTS: The results showed something interesting. CONCLUSION: These are the conclusions.
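
For illustration only, here is a minimal Python sketch of the requested behavior, using the standard library's xml.etree.ElementTree. The function name abstract_with_labels and the SAMPLE constant are hypothetical and are not taken from the importer's code. The sketch assumes that sections without a Label attribute should be kept as plain text.

```python
import xml.etree.ElementTree as ET

# The example abstract from this issue.
SAMPLE = """<Abstract>
    <AbstractText Label="BACKGROUND" NlmCategory="BACKGROUND">This is the background section.</AbstractText>
    <AbstractText Label="MATERIALS AND METHODS" NlmCategory="METHODS">The methods belong here.</AbstractText>
    <AbstractText Label="RESULTS" NlmCategory="RESULTS">The results showed something interesting.</AbstractText>
    <AbstractText Label="CONCLUSION" NlmCategory="CONCLUSIONS">These are the conclusions.</AbstractText>
</Abstract>"""


def abstract_with_labels(abstract_xml: str) -> str:
    """Join all AbstractText sections, prefixing each with its Label (hypothetical helper)."""
    root = ET.fromstring(abstract_xml)
    parts = []
    for section in root.iter("AbstractText"):
        # itertext() also picks up text nested in inline markup such as <i> or <sub>.
        text = "".join(section.itertext()).strip()
        # Use Label, not NlmCategory, to match the plain text records.
        label = section.get("Label")
        # Assumption: unlabeled sections are kept without a prefix.
        parts.append(f"{label}: {text}" if label else text)
    return " ".join(parts)


if __name__ == "__main__":
    print(abstract_with_labels(SAMPLE))
```

Run on the example above, this should produce the single-line result suggested in this issue.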

Status: Done
