Fix IllegalCharacterException when downloading pdfs with links having : #14977

Merged
Siedlerchr merged 4 commits into JabRef:main from subhramit:illega-paths on Jan 31, 2026

@subhramit (Member) commented Jan 31, 2026

User description

Fixes #14975

Steps to test

Try to add the entry

@InCollection{Vecino2026,
  author    = {Vecino, Sara and Acevedo-Diaz, Gloria and Fernandez-Lanvin, Daniel and Andres, Javier and Gonzalez-Rodriguez, Martin},
  booktitle = {Artificial Intelligence in Healthcare},
  publisher = {Springer},
  title     = {Early Objective ASD Screening System Based on Eye-Tracking and Machine Learning},
  year      = {2026},
  isbn      = {978-3-032-00651-6},
  month     = jan,
  abstract  = {The diagnosis of autism spectrum disorder (ASD) is still based on clinical observation, as there are no validated biomarkers for use in clinical practice. Although the first suspicions may appear as early as 12 months, the diagnosis may be delayed until 3 to 6 years. We propose an early screening procedure for autism starting at 9 months of age, which will allow paediatricians to objectively detect the presence of ASD risk indicators, facilitate immediate access to a specific preventive action program, and minimize the effects of the disorder. Our system utilizes a series of videos specifically designed to detect ASD risk indicators combined with machine learning classifiers to predict ASD risk. Using Random Forest, SVM, MLP, kNN, and AdaBoost, we obtained a 0.9005 ROC AUC, 75.2% sensitivity with the best classifier, which was SVM, when comparing typical development (TD) to ASD levels 1, 2, and 3. The results were up to a ROC AUC score of 0.9508 and a sensitivity of 87.64% with Random Forest, when comparing TD to ASD levels 2 and 3.},
  date      = {2026-01-01},
  doi       = {10.1007/978-3-032-00652-3_11},
  file      = {:http\://link.springer.com/openurl/pdf?id=doi\:10.1007/978-3-032-00652-3_11:PDF},
}

to a library.
It should add successfully now.

Before:
[image]
The entry would not be added (or be visible in the library).
Now:
[image]

Mandatory checks

  • I own the copyright of the code submitted and I license it under the MIT license
  • I manually tested my changes in running JabRef (always required)
  • I added JUnit tests for changes (if applicable)
  • I added screenshots in the PR description (if change is visible to the user)
  • I described the change in CHANGELOG.md in a way that is understandable for the average user (if change is visible to the user)
  • [/] I checked the user documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request updating file(s) in https://github.com/JabRef/user-documentation/tree/main/en.

PR Type

Bug fix


Description

  • Fix InvalidPathException ("Illegal char <:>") when processing URLs with colons

  • Extract filename from URL path before getting file extension

  • Add tests for online links and URL path handling

  • Update changelog with issue reference


Diagram Walkthrough

flowchart LR
  A["LinkedFile with URL"] --> B{"Is Online Link?"}
  B -->|Yes| C["Extract filename from URL"]
  C --> D["Get file extension"]
  B -->|No| E["Get extension directly"]
  D --> F["Determine file type"]
  E --> F
  F --> G["Return ExternalFileType"]
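
The flow in the diagram can be sketched in plain Java without JabRef on the classpath. This is a minimal illustration, not the merged code: `LinkExtensionSketch`, `fileNameFromUrl`, `extensionOf`, and `extensionForLink` are hypothetical names, and treating the last segment of the URI path as the file name is an assumption, not a description of what `FileUtil.getFileNameFromUrl()` does. The point it shows is that the URL branch never hands the raw link to a file-system path parser.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Optional;

final class LinkExtensionSketch {

    // Assumption: the "file name" of a URL is the last segment of its path,
    // with the query string and fragment ignored.
    static Optional<String> fileNameFromUrl(String url) {
        try {
            String path = new URI(url).getPath();
            if (path == null || path.isEmpty()) {
                return Optional.empty();
            }
            String name = path.substring(path.lastIndexOf('/') + 1);
            return name.isEmpty() ? Optional.empty() : Optional.of(name);
        } catch (URISyntaxException e) {
            return Optional.empty();
        }
    }

    // Pure string handling: the text after the last dot, lower-cased.
    static Optional<String> extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot <= 0 || dot == fileName.length() - 1) {
            return Optional.empty();
        }
        return Optional.of(fileName.substring(dot + 1).toLowerCase(Locale.ROOT));
    }

    static Optional<String> extensionForLink(String link, boolean isOnlineLink) {
        if (isOnlineLink) {
            // URL branch: reduce to a file name first, then read the extension.
            return fileNameFromUrl(link).flatMap(LinkExtensionSketch::extensionOf);
        }
        // Local branch: drop directories (either separator style), then read the extension.
        int lastSeparator = Math.max(link.lastIndexOf('/'), link.lastIndexOf('\\'));
        return extensionOf(link.substring(lastSeparator + 1));
    }

    public static void main(String[] args) {
        System.out.println(extensionForLink("https://example.org/files/a:b/paper.PDF?x=1:2", true));
        System.out.println(extensionForLink("C:\\docs\\paper.pdf", false));
    }
}
```

In this sketch the Springer link from the test steps yields no extension, because its last path segment is `pdf` with no dot, so the file type would have to come from another signal such as the MIME type.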

File Walkthrough

Relevant files

Bug fix
ExternalFileTypes.java (+15/-2): Handle URL paths separately for extension detection
jabgui/src/main/java/org/jabref/gui/externalfiletype/ExternalFileTypes.java

  • Added conditional logic to detect online links vs local files
  • For URLs, extract filename first using FileUtil.getFileNameFromUrl() to avoid InvalidPathException
  • For local files, get extension directly as before
  • Prevents colon characters in URLs from causing path validation errors

Tests
ExternalFileTypesTest.java (+16/-0): Add tests for online link file type detection
jabgui/src/test/java/org/jabref/gui/externalfiletype/ExternalFileTypesTest.java

  • Added test for online links with unknown file type
  • Added test for URLs containing colons in the path
  • Both tests verify correct PDF file type detection

Documentation
CHANGELOG.md (+1/-0): Document URL colon handling fix

Signed-off-by: subhramit <subhramit.bb@live.in>
@qodo-free-for-open-source-projects (Bot, Contributor) commented Jan 31, 2026

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified: no security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🟡
🎫 #14975
🟢 Fix the InvalidPathException (java.nio.file.InvalidPathException: Illegal char <:> at index 4) that occurs when auto-downloading linked PDFs with URLs containing colons
Handle the specific case where the file field contains a URL like 'http://link.springer.com/openurl/pdf?id=doi:10.1007/978-3-032-00652-3_11'
Allow the BibTeX entry with the problematic URL to be added successfully to the library
Prevent the exception from occurring in FileUtil.getFileExtension when processing URLs on Windows
Display the entry correctly in the library table without throwing exceptions
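
The "index 4" in the reported message is consistent with the first colon of `http://`. A rough, platform-independent mimic of the relevant Windows rule (a colon is accepted only directly after a drive letter) makes this visible. `ColonIndexSketch` and `firstIllegalColon` are hypothetical names, and the check is a deliberate simplification for illustration, not the JDK's path parser.

```java
final class ColonIndexSketch {

    // Simplified assumption: a colon is legal only at index 1, right after a letter ("C:").
    // Returns the index of the first colon that breaks this rule, or -1 if there is none.
    static int firstIllegalColon(String candidatePath) {
        for (int i = 0; i < candidatePath.length(); i++) {
            if (candidatePath.charAt(i) != ':') {
                continue;
            }
            boolean driveLetter = i == 1 && Character.isLetter(candidatePath.charAt(0));
            if (!driveLetter) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String link = "http://link.springer.com/openurl/pdf?id=doi:10.1007/978-3-032-00652-3_11";
        System.out.println(firstIllegalColon(link));
        System.out.println(firstIllegalColon("C:\\docs\\paper.pdf"));
    }
}
```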
Codebase Duplication Compliance
Codebase context is not defined

Custom Compliance
🟢
Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status: Passed

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed

Compliance status legend
🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-free-for-open-source-projects (Bot, Contributor)

PR Code Suggestions ✨

Explore these optional code suggestions:

Category: Learned best practice
Suggestion: Add input validation for link path

Add null/empty validation for linkPath before processing to prevent potential
NullPointerException or unexpected behavior when linkedFile.getLink() returns
null or empty string.

jabgui/src/main/java/org/jabref/gui/externalfiletype/ExternalFileTypes.java [121-133]

 String linkPath = linkedFile.getLink();
+if (linkPath == null || linkPath.isBlank()) {
+    return Optional.empty();
+}
 Optional<String> extensionOpt;
 
 if (linkedFile.isOnlineLink()) {
     // For URLs, extract filename from URL path first to avoid InvalidPathException
     // URLs contain ":" which is illegal in Windows file paths (e.g., "http://")
     // See https://github.com/JabRef/jabref/issues/14975
     extensionOpt = FileUtil.getFileNameFromUrl(linkPath)
                            .flatMap(FileUtil::getFileExtension);
 } else {
     // For local files, get extension directly
     extensionOpt = FileUtil.getFileExtension(linkPath);
 }
Suggestion importance [1-10]: 6

Why: Relevant best practice - Add input validation checks to methods that accept external data to ensure parameters are not null, empty, or invalid before processing, throwing appropriate exceptions when validation fails.

Impact: Low
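
The guard proposed in this suggestion can also be written as an `Optional` pipeline. This is a sketch with hypothetical names (`LinkGuardSketch`, `usableLink`); whether `linkedFile.getLink()` can actually return null or a blank string in JabRef is not established in this thread.

```java
import java.util.Optional;

final class LinkGuardSketch {

    // Null and blank links both collapse to Optional.empty(), so later steps
    // such as extension detection only ever see a non-blank string.
    static Optional<String> usableLink(String link) {
        return Optional.ofNullable(link).filter(value -> !value.isBlank());
    }

    public static void main(String[] args) {
        System.out.println(usableLink(null));
        System.out.println(usableLink("   "));
        System.out.println(usableLink("paper.pdf"));
    }
}
```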
Category: General
Suggestion: Simplify conditional logic for conciseness

Refactor the conditional logic to return directly from the if and else blocks,
eliminating the intermediate extensionOpt variable for conciseness.

jabgui/src/main/java/org/jabref/gui/externalfiletype/ExternalFileTypes.java [120-135]

 // No type could be found from mime type. Try based on the extension:
 String linkPath = linkedFile.getLink();
-Optional<String> extensionOpt;
 
 if (linkedFile.isOnlineLink()) {
     // For URLs, extract filename from URL path first to avoid InvalidPathException
     // URLs contain ":" which is illegal in Windows file paths (e.g., "http://")
     // See https://github.com/JabRef/jabref/issues/14975
-    extensionOpt = FileUtil.getFileNameFromUrl(linkPath)
-                           .flatMap(FileUtil::getFileExtension);
+    return FileUtil.getFileNameFromUrl(linkPath)
+                   .flatMap(FileUtil::getFileExtension)
+                   .flatMap(extension -> getExternalFileTypeByExt(extension, externalApplicationsPreferences));
 } else {
     // For local files, get extension directly
-    extensionOpt = FileUtil.getFileExtension(linkPath);
+    return FileUtil.getFileExtension(linkPath)
+                   .flatMap(extension -> getExternalFileTypeByExt(extension, externalApplicationsPreferences));
 }
 
-return extensionOpt.flatMap(extension -> getExternalFileTypeByExt(extension, externalApplicationsPreferences));
-
Suggestion importance [1-10]: 3

Why: The suggestion is a valid refactoring for conciseness, but it introduces code duplication by repeating the flatMap call, which goes against the DRY principle and may not improve readability.

Impact: Low

@subhramit added the "status: ready-for-review" label Jan 31, 2026
@subhramit changed the title from "Extract filename from url path" to "Fix IllegalCharacterException when downloading pdfs with links having :" Jan 31, 2026
@Siedlerchr enabled auto-merge January 31, 2026 21:43
@Siedlerchr added this pull request to the merge queue Jan 31, 2026
@github-actions (Bot) added the "status: to-be-merged" label Jan 31, 2026
Merged via the queue into JabRef:main with commit b243b20 Jan 31, 2026
67 of 82 checks passed
@Siedlerchr deleted the illega-paths branch January 31, 2026 22:19

Labels

  • component: external-files
  • Review effort 2/5
  • status: ready-for-review (Pull Requests that are ready to be reviewed by the maintainers)
  • status: to-be-merged (PRs which are accepted and should go into the merge-queue.)

Development

Successfully merging this pull request may close these issues.

Invalid path exception when auto-downloading linked pdf

2 participants