Add fulltext fetcher for Wiley via their TDM API #15388
- Implemented a new `WileyFetcher` fulltext fetcher
- Refactored the Download and isPDF execution flow to pass down headers from fetchers till the actual file download logic to support request headers
- Adjusted the `getMimeType()` method to include the set headers in the HEAD request
- Added another loop in the Preferences UI View (`WebSearchTabViewModel`) to include `CustomizableKeyFetcher`(s) which were not included as `SearchBasedFetchers`
- Implemented Unit tests for `WileyFetcher` and added one new test to `FulltextFetchersTest` for asserting that headers propagate down to download logic
- Wired `WileyTdmApiKey` preference through BuildInfo, build.properties, build.gradle.kts and JabRefCliPreferences
- Updated devdocs (`fetchers.md`)
- Updated `CHANGELOG.md`
Hey @hagerm98! 👋 Thank you for contributing to JabRef! We have automated checks in place, based on which you will soon get feedback if any of them are failing. We also use Qodo for review assistance. It will update your pull request description with a review help and offer suggestions to improve the pull request. After all automated checks pass, a maintainer will also review your contribution. Once that happens, you can go through their comments in the "Files changed" tab and act on them, or reply to the conversation if you have further inputs. You can read about the whole pull request process in our contribution guide. Please ensure that your pull request is in line with our AI Usage Policy and make necessary disclosures. |
Review Summary by Qodo
Add Wiley TDM API fulltext fetcher with authenticated download headers support
Walkthroughs
Description
• Added Wiley TDM API fulltext fetcher for PDF downloads from Wiley journals
• Extended fulltext fetcher infrastructure to support per-fetcher HTTP headers
• Refactored download flow to propagate authentication headers through validation and download steps
• Updated preferences UI to include customizable key fetchers in web search settings
• Added comprehensive unit tests for WileyFetcher and header propagation
Diagram
flowchart LR
A["BibEntry with DOI"] -->|WileyFetcher| B["TDM API URL"]
B -->|getDownloadHeaders| C["Auth Header Map"]
C -->|FetcherResult| D["URL + Headers"]
D -->|URLDownload| E["PDF Download"]
E -->|LinkedFile| F["Entry Attachment"]
File Changes
1. jablib/src/main/java/org/jabref/logic/importer/fetcher/WileyFetcher.java
Code Review by Qodo
1. download overload has boolean
Don't be overwhelmed by our bots, and sorry for the confusion. They are just meant for a first automated review, since we are only a small team maintaining JabRef. We will look into your changes ASAP.
I think @InAnYan should also take a look at this one. |
🚨 TestLens detected 12 failed tests 🚨
Test Summary
🏷️ Commit: a7e9b97
Test Failures (first 5 of 12)
ArXivFetcherTest > abstractIsCleanedUp() (:jablib:fetcherTest in Fetcher Tests / Fetcher tests)
ArXivFetcherTest > findFullTextByDOI() (:jablib:fetcherTest in Fetcher Tests / Fetcher tests)
ArXivFetcherTest > findFullTextByTitle() (:jablib:fetcherTest in Fetcher Tests / Fetcher tests)
ArXivFetcherTest > findFullTextByTitleWithCurlyBracket() (:jablib:fetcherTest in Fetcher Tests / Fetcher tests)
ArXivFetcherTest > findFullTextByTitleWithCurlyBracketAndPartOfAuthor() (:jablib:fetcherTest in Fetcher Tests / Fetcher tests)
Learn more about TestLens at testlens.app. |
InAnYan left a comment
Okay, very good! The code quality is high, and I especially like that you introduced a new structure that captures the headers.
But I haven't tested the code.
About the remark on the loop: I think it can be improved in a follow-up.
Thanks @InAnYan for your review. If you've got a moment, can you check the failing CI above (Auto-remove review request / guard-reviewers (pull_request_target))? It seems flaky, sometimes failing and sometimes passing, and I don't have access to retry it.
Tagging @koppor (run https://github.com/JabRef/jabref/actions/runs/23514594953/job/68491180540?pr=15388). PR LGTM as well, so letting this in. Thank you for leaving comments on your changes; it helped in reviewing.
Thanks @subhramit for your review, appreciate it! Regarding the user documentation update, I've opened a PR here, feel free to take a look when you get a chance |
* Add fulltext fetcher for Wiley via their TDM API
  - Implemented a new `WileyFetcher` fulltext fetcher
  - Refactored the Download and isPDF execution flow to pass down headers from fetchers till the actual file download logic to support request headers
  - Adjusted the `getMimeType()` method to include the set headers in the HEAD request
  - Added another loop in the Preferences UI View (`WebSearchTabViewModel`) to include `CustomizableKeyFetcher`(s) which were not included as `SearchBasedFetchers`
  - Implemented Unit tests for `WileyFetcher` and added one new test to `FulltextFetchersTest` for asserting that headers propagate down to download logic
  - Wired `WileyTdmApiKey` preference through BuildInfo, build.properties, build.gradle.kts and JabRefCliPreferences
  - Updated devdocs (`fetchers.md`)
  - Updated `CHANGELOG.md`
* Move `setDownloadHeaders()` to correct position to fix constructor order
* Exclude onlinelibrary.wiley.com from link checker since blocked by Cloudflare
* Update CHANGELOG.md to use end user focused language
* Switch FetcherResult to a record and add TODO comment to refactor Preferences UI loop
Related issues and pull requests
Closes #13404
PR Description
Wiley journals block direct PDF downloads from website links with Cloudflare, so "Get fulltext" always fails for Wiley DOIs.
This PR adds a fulltext fetcher that uses Wiley's official TDM REST API to retrieve PDFs, with the user's personal TDM token configured in Preferences UI > Web search.
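As a rough illustration of the idea (not the code from this PR), a fetcher of this kind can map a DOI to a TDM API URL and return it together with the auth header it needs. The class and record names below (`WileyTdmSketch`, `FetcherResultSketch`) are hypothetical, and the endpoint URL is an assumption that should be checked against Wiley's TDM documentation; only the `Wiley-TDM-Client-Token` header name comes from this PR.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a result that carries a URL plus per-fetcher headers.
record FetcherResultSketch(URI url, Map<String, String> headers) {
    FetcherResultSketch {
        headers = Map.copyOf(headers); // immutable defensive copy
    }
}

final class WileyTdmSketch {
    // Assumed endpoint; verify against Wiley's TDM documentation before use.
    static final String BASE_URL = "https://api.wiley.com/onlinelibrary/tdm/v1/articles/";
    // Header name as stated in the PR description.
    static final String TOKEN_HEADER = "Wiley-TDM-Client-Token";

    // Returns the download URL and auth header, or empty if DOI or token is missing.
    static Optional<FetcherResultSketch> resolve(String doi, String token) {
        if (doi == null || doi.isBlank() || token == null || token.isBlank()) {
            return Optional.empty();
        }
        // The DOI contains '/', so it is percent-encoded as a single path segment.
        String encodedDoi = URLEncoder.encode(doi.trim(), StandardCharsets.UTF_8);
        return Optional.of(new FetcherResultSketch(
                URI.create(BASE_URL + encodedDoi),
                Map.of(TOKEN_HEADER, token.trim())));
    }

    public static void main(String[] args) {
        System.out.println(resolve("10.1002/we.2952", "my-token"));
        System.out.println(resolve("10.1002/we.2952", ""));
    }
}
```

Returning an empty `Optional` when no token is configured lets the caller fall through to the next fetcher instead of issuing a request that is bound to be rejected.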
Since the TDM API requires an auth header (`Wiley-TDM-Client-Token`) on every request, including both the `isPDF` check (HEAD request) and the PDF download itself, I had to touch many files to extend the fulltext fetchers' download logic to support per-fetcher HTTP headers and to pass those headers all the way down to both the PDF validation and download steps. This explains the number of files touched in this PR.
TODO/In-Progress: Currently working on user documentation and will submit another PR against https://github.com/JabRef/user-documentation
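The header propagation described above can be sketched as follows. This is a minimal illustration using the JDK's `java.net.http` API rather than JabRef's own download classes, and `HeaderPropagationSketch` with its methods is a hypothetical name. The point is only that the same header map is applied to both the HEAD request used for the MIME type check and the GET request used for the download.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Map;

final class HeaderPropagationSketch {
    // Builds a request with the given method and copies every fetcher-supplied header onto it.
    static HttpRequest build(String method, URI url, Map<String, String> headers) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(url)
                .method(method, HttpRequest.BodyPublishers.noBody());
        headers.forEach(builder::header);
        return builder.build();
    }

    // HEAD request for the "is this a PDF?" check; it must carry the auth header too.
    static HttpRequest mimeTypeCheck(URI url, Map<String, String> headers) {
        return build("HEAD", url, headers);
    }

    // GET request for the actual file download, with the same headers.
    static HttpRequest download(URI url, Map<String, String> headers) {
        return build("GET", url, headers);
    }

    public static void main(String[] args) {
        URI url = URI.create("https://example.org/article.pdf");
        Map<String, String> headers = Map.of("Wiley-TDM-Client-Token", "my-token");
        System.out.println(mimeTypeCheck(url, headers).headers().map());
        System.out.println(download(url, headers).headers().map());
    }
}
```

Funnelling both requests through one builder method makes it hard to forget the headers on one of the two paths, which is the failure mode this PR had to guard against.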
Steps to test
At the end a video of my end-to-end test
add-wiley-tdm-fetcher-13404
10.1002/we.2952 / a Wiley Wind Energy article)
Without the API token configured, step 5 should return the error from the original issue, "403 Access Denied", returned by Cloudflare.
To verify existing fetchers still work, try the same with an arXiv entry (e.g., DOI `10.48550/arXiv.2301.00234`).
Below is an end-to-end video I recorded demonstrating all three scenarios (fail without key, succeed with key, arXiv still works/backward compatible):
wiley-fetcher-test-e2e.mp4
Here also is the screenshot asked for in the checklist below
Checklist
`CHANGELOG.md` in a way that can be understood by the average user (if change is visible to the user)