Skip to content

Fix crossRef fetcher#14696

Merged
koppor merged 1 commit into
mainfrom
fix-crossref-fetcher
Dec 23, 2025
Merged

Fix crossRef fetcher#14696
koppor merged 1 commit into
mainfrom
fix-crossref-fetcher

Conversation

@koppor

@koppor koppor commented Dec 23, 2025

Copy link
Copy Markdown
Member

User description

Triggered by #14652

Follow-up to #14357

Still has issues with some references - but this is future work.

image

Steps to test

Patch this into #14652 and select "CrossRef"

Mandatory checks

  • I own the copyright of the code submitted and I license it under the MIT license
  • [/] I manually tested my changes in running JabRef (always required)
  • [/] I added JUnit tests for changes (if applicable)
  • [/] I added screenshots in the PR description (if change is visible to the user)
  • [/] I described the change in CHANGELOG.md in a way that is understandable for the average user (if change is visible to the user)
  • [/] I checked the user documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request updating file(s) in https://github.com/JabRef/user-documentation/tree/main/en.

PR Type

Bug fix


Description

  • Handle CrossRef responses missing unstructured reference text

  • Add null-safety checks for optional reference fields

  • Return fallback BibEntry when both DOI and unstructured text unavailable

  • Improve error logging for incomplete reference data


Diagram Walkthrough

flowchart LR
  A["CrossRef Response"] --> B{"Has DOI?"}
  B -->|Yes| C{"Has unstructured?"}
  B -->|No| D{"Has unstructured?"}
  C -->|Yes| E["Create from DOI + text"]
  C -->|No| F["Create from DOI only"]
  D -->|Yes| G["Parse unstructured text"]
  D -->|No| H["Return fallback entry"]
Loading

File Walkthrough

Relevant files
Bug fix
CrossRefCitationFetcher.java
Add null-safety for missing reference fields                         

jablib/src/main/java/org/jabref/logic/importer/fetcher/citation/crossref/CrossRefCitationFetcher.java

  • Changed unstructured from String to JsonNode to detect missing values
  • Added null-safety check for missing unstructured text before parsing
  • Return fallback BibEntry with error note when both DOI and
    unstructured text are missing
  • Updated getBibEntryFromDoi() to conditionally set NOTE field only when
    unstructured text exists
  • Added error logging when reference lacks both DOI and unstructured
    information
+18/-7   

@koppor koppor added the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Dec 23, 2025
@qodo-free-for-open-source-projects

Copy link
Copy Markdown
Contributor

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

🔴
Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status:
Sensitive Data Logging: The error log at line 93 outputs the entire reference JSON object which may contain
sensitive or internal system information that should not be logged.

Referred Code
LOGGER.error("Reference has neither DOI nor unstructured text: {}", reference);
return new BibEntry()

Learn more about managing compliance generic rules or creating your own custom rules

Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-free-for-open-source-projects

Copy link
Copy Markdown
Contributor

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
General
Refactor entry creation for clarity

Refactor BibEntry creation to avoid inefficiently setting the changed status to
false with withField and then immediately to true with withChanged(true). Use
setField and setChanged(true) for better clarity.

jablib/src/main/java/org/jabref/logic/importer/fetcher/citation/crossref/CrossRefCitationFetcher.java [94-96]

-return new BibEntry()
-        .withField(StandardField.NOTE, "Could not retrieve reference information from CrossRef")
-        .withChanged(true);
+BibEntry fallbackEntry = new BibEntry();
+fallbackEntry.setField(StandardField.NOTE, "Could not retrieve reference information from CrossRef");
+fallbackEntry.setChanged(true);
+return fallbackEntry;
  • Apply / Chat
Suggestion importance[1-10]: 4

__

Why: The suggestion correctly identifies that using withField followed by withChanged(true) is inefficient due to withField setting the changed status to false. The proposed refactoring using setField is clearer and more direct.

Low
Simplify BibEntry creation and modification

Simplify BibEntry creation by using setField for all fields and then calling
setChanged(true) once at the end, avoiding the inefficient use of withField
which sets the changed status to false.

jablib/src/main/java/org/jabref/logic/importer/fetcher/citation/crossref/CrossRefCitationFetcher.java [135-143]

 .orElseGet(() -> {
-    BibEntry bibEntry = new BibEntry()
-            .withField(StandardField.DOI, referenceDoi)
-            .withChanged(true);
+    BibEntry bibEntry = new BibEntry();
+    bibEntry.setField(StandardField.DOI, referenceDoi);
     if (!unstructured.isMissingNode()) {
         bibEntry.setField(StandardField.NOTE, unstructured.asText());
     }
+    bibEntry.setChanged(true);
     return bibEntry;
 });
  • Apply / Chat
Suggestion importance[1-10]: 4

__

Why: The suggestion correctly points out an inefficient sequence of operations where withField sets the changed status to false only for it to be overridden. The proposed change simplifies the logic by using setField and setting the changed status once at the end.

Low
  • More

if (unstructured.isMissingNode()) {
LOGGER.error("Reference has neither DOI nor unstructured text: {}", reference);
return new BibEntry()
.withField(StandardField.NOTE, "Could not retrieve reference information from CrossRef")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this is intentional

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - quick hack to transport that to the user somehow.

Future work: investigate the cases.

@koppor koppor added this pull request to the merge queue Dec 23, 2025
Merged via the queue into main with commit 40ccc59 Dec 23, 2025
76 of 91 checks passed
@koppor koppor deleted the fix-crossref-fetcher branch December 23, 2025 15:10
Siedlerchr added a commit that referenced this pull request Dec 23, 2025
* main:
  Update AI usage policy (#14698)
  Fix handling of DOIs (#14704)
  Handle ohter CrossRef response (#14696)
  Fix condition for processing closed issues/PRs
  Translate the English "change to Chinese(simplified)" to the Chinese in the warning dialog (#14690)
  More performance optimization (#14695)
  Add missing dot (and a link)
  Add link to PR template also if checklist is present, but not OK (#14694)
  Fix typo in IntelliJ code style instructions (#14693)
  Add import into new library to Welcome Tab (#14669)
  Add initial search requirements (#14633)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Review effort 2/5 status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants