Skip to content

Web-based Search supports web-fetcher syntax#15249

Merged
calixtus merged 17 commits into
JabRef:mainfrom
faneeshh:fix-12637
Apr 4, 2026
Merged

Web-based Search supports web-fetcher syntax#15249
calixtus merged 17 commits into
JabRef:mainfrom
faneeshh:fix-12637

Conversation

@faneeshh

@faneeshh faneeshh commented Mar 2, 2026

Copy link
Copy Markdown
Collaborator

Related issues and pull requests

Closes #12637

PR Description

This covers mainly the fundamental back end part of the issue but not the UI mode display yet. Previously, entering a fetcher specific syntax like pica.tit = quantum into the web search bar would throw an "Invalid Query" error because the ANTLR parser didn't recognize it. Now DOI outputs are detected and routed to IdBasedFetcher and queries that fail that ANTLR parsing fall back to a raw search term instead of being rejected. I've also changed the UI validation from an error to a warning so it doesn't block the user from searching.

Steps to test

  1. Go to Web Search and select any fetcher like 'GVK'
image
  1. Type fetcher specific syntax like pica.tit=quantum or !term → should show a warning and the search should still go through to the fetcher as a raw string.

Before :
image

image image

After :
image

image
  1. Typing a normal query like quantum computing should work as before. DOI works as before too
image

Here's a rough sketch on how the flow works

image

Checklist

  • I own the copyright of the code submitted and I license it under the MIT license
  • I manually tested my changes in running JabRef (always required)
  • I added JUnit tests for changes (if applicable)
  • I added screenshots in the PR description (if change is visible to the user)
  • [/] I added a screenshot in the PR description showing a library with a single entry with me as author and as title the issue number
  • I described the change in CHANGELOG.md in a way that can be understood by the average user (if change is visible to the user)
  • I checked the user documentation for up to dateness and submitted a pull request to our user documentation repository

faneeshh added 2 commits March 1, 2026 17:06
…oute DOI queries to IDBasedFetcher bypassing ANTLR parsing
@qodo-free-for-open-source-projects

Copy link
Copy Markdown
Contributor

Review Summary by Qodo

Web search supports fetcher-specific syntax and DOI routing

🐞 Bug fix ✨ Enhancement

Grey Divider

Walkthroughs

Description
• Web search now accepts fetcher-specific syntax instead of rejecting it
• DOI queries automatically route to IdBasedFetcher bypassing ANTLR parsing
• Invalid queries fall back to raw search terms sent to fetchers
• UI validation changed from error to warning for non-standard syntax
Diagram
flowchart LR
  A["Search Query Input"] --> B{"Is DOI?"}
  B -->|Yes & IdBasedFetcher| C["Route to IdBasedFetcher"]
  B -->|No| D{"Valid ANTLR?"}
  D -->|Yes| E["Parse with SearchQueryVisitor"]
  D -->|No| F["Fall back to Raw Term"]
  E --> G["Perform Search"]
  C --> G
  F --> G
Loading

Grey Divider

File Changes

1. jabgui/src/main/java/org/jabref/gui/importer/fetcher/WebSearchPaneViewModel.java ✨ Enhancement +5/-2

UI validation changed to warnings

• Changed validation messages from error to warning for invalid query syntax
• Added explanatory comments about fallback behavior for non-standard syntax
• Allows users to proceed with searches despite syntax warnings

jabgui/src/main/java/org/jabref/gui/importer/fetcher/WebSearchPaneViewModel.java


2. jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java 🐞 Bug fix +25/-8

DOI routing and fallback to raw terms

• Added DOI detection and routing to IdBasedFetcher when available
• Implemented fallback to raw search terms for invalid ANTLR queries
• Added debug logging for query parsing failures and fallback behavior
• Removed exception throwing for unparseable queries

jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java


3. jablib/src/main/java/org/jabref/logic/importer/PagedSearchBasedFetcher.java 🐞 Bug fix +20/-5

Paged search fallback implementation

• Implemented fallback to raw search terms for paged search operations
• Added validity check before attempting ANTLR parsing
• Added debug logging for parsing failures and fallback behavior
• Removed exception throwing for unparseable queries

jablib/src/main/java/org/jabref/logic/importer/PagedSearchBasedFetcher.java


View more (2)
4. jablib/src/test/java/org/jabref/logic/importer/SearchBasedFetcherTest.java 🧪 Tests +156/-0

New test suite for search fetcher

• Added comprehensive test suite for SearchBasedFetcher behavior
• Tests cover empty/blank queries, valid queries, invalid syntax fallback
• Tests verify fetcher-specific syntax handling and DOI routing
• Tests ensure special characters are handled as raw terms

jablib/src/test/java/org/jabref/logic/importer/SearchBasedFetcherTest.java


5. CHANGELOG.md 📝 Documentation +1/-0

Changelog entry for web search fix

• Added entry documenting fix for web search rejecting non-standard syntax
• References issue #12637

CHANGELOG.md


Grey Divider

Qodo Logo

@qodo-free-for-open-source-projects

qodo-free-for-open-source-projects Bot commented Mar 2, 2026

Copy link
Copy Markdown
Contributor

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0) 🎨 UX Issues (0)

Grey Divider


Action required

1. DOI routing depends on fetcher📎 Requirement gap ≡ Correctness
Description
DOI input is only routed to ID search when the currently selected fetcher also implements
IdBasedFetcher, so selecting a non-ID fetcher can still send a DOI as a normal query. This
violates the requirement to route DOI searches through an identifier-based fetch independent of the
selected fetcher.
Code

jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java[R40-45]

+        // DOIs bypass ANTLR parsing because they contain characters ('/', '.') invalid in the search grammar
+        Optional<DOI> doi = DOI.parse(searchQuery);
+        if (doi.isPresent() && this instanceof IdBasedFetcher idBasedFetcher) {
+            return idBasedFetcher.performSearchById(searchQuery)
+                                 .map(List::of)
+                                 .orElse(List.of());
Evidence
PR Compliance ID 1 requires DOI detection to use an identifier-based fetch regardless of which
fetcher is selected. The new logic gates DOI routing on this instanceof IdBasedFetcher, meaning
the routing still depends on the selected fetcher type.

Web search detects DOI/identifier input and uses ID fetcher independent of selected fetcher
jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java[40-45]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
DOI routing is currently conditional on the selected fetcher implementing `IdBasedFetcher`, which violates the requirement that DOI searches must use an identifier-based fetch independent of fetcher selection.
## Issue Context
`SearchBasedFetcher.performSearch(String)` detects a DOI but only performs ID-based search when `this instanceof IdBasedFetcher`. If the UI-selected fetcher is not ID-capable, a DOI will fall back to raw-term search and be sent to that fetcher.
## Fix Focus Areas
- jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java[40-45]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Invalid queries run despite warning📎 Requirement gap ≡ Correctness
Description
Queries that fail parsing are downgraded from error to warning and are still executed via raw-term
fallback, which can silently change how an invalid query is handled. This violates the requirement
that truly invalid/unrecognized queries must produce a clear error rather than ambiguous execution.
Code

jabgui/src/main/java/org/jabref/gui/importer/fetcher/WebSearchPaneViewModel.java[R100-104]

+                            return ValidationMessage.warning(Localization.lang("Invalid query element '%0' at position %1", line, charPositionInLine));
          }
          // Fallback for other failing reasons
-                        return ValidationMessage.error(Localization.lang("Invalid query"));
+                        return ValidationMessage.warning(Localization.lang("Invalid query"));
Evidence
PR Compliance ID 5 requires a clear error state for truly invalid queries, not executing them in an
unintended mode. The UI now returns warnings instead of errors for parsing failures, and the logic
layer explicitly falls back to executing the raw query after parse failures/invalid syntax.

Invalid web search queries produce a clear error instead of silent modification or ambiguous behavior
jabgui/src/main/java/org/jabref/gui/importer/fetcher/WebSearchPaneViewModel.java[100-104]
jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java[51-62]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The UI validation now emits only warnings for invalid queries while the logic layer falls back to executing the query as a raw term. This can lead to ambiguous behavior where truly invalid input is still executed.
## Issue Context
Compliance requires that truly invalid/unrecognized queries show an explicit error state rather than being executed in an unintended mode.
## Fix Focus Areas
- jabgui/src/main/java/org/jabref/gui/importer/fetcher/WebSearchPaneViewModel.java[100-104]
- jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java[51-62]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. DOI routed to wrong id search🐞 Bug ≡ Correctness
Description
SearchBasedFetcher routes any DOI-like query to performSearchById for any fetcher implementing
IdBasedFetcher. For fetchers whose ID-search is not DOI-based (e.g., ArXivFetcher), DOI queries
return empty results and never reach the intended raw-term search fallback.
Code

jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java[R40-46]

+        // DOIs bypass ANTLR parsing because they contain characters ('/', '.') invalid in the search grammar
+        Optional<DOI> doi = DOI.parse(searchQuery);
+        if (doi.isPresent() && this instanceof IdBasedFetcher idBasedFetcher) {
+            return idBasedFetcher.performSearchById(searchQuery)
+                                 .map(List::of)
+                                 .orElse(List.of());
}
Evidence
The new default method routes DOI-looking inputs to IdBasedFetcher unconditionally. ArXivFetcher
implements both SearchBasedFetcher (via PagedSearchBasedFetcher) and IdBasedFetcher, but its id
search only accepts arXiv identifiers; non-arXiv identifiers (including DOIs) short-circuit to
empty, bypassing search-by-query behavior.

jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java[40-46]
jablib/src/main/java/org/jabref/logic/importer/fetcher/ArXivFetcher.java[71-72]
jablib/src/main/java/org/jabref/logic/importer/fetcher/ArXivFetcher.java[411-415]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`SearchBasedFetcher#performSearch(String)` routes *any* DOI-like input to `performSearchById` whenever the fetcher also implements `IdBasedFetcher`. This misroutes DOI queries for fetchers whose ID-search is not DOI-based (notably `ArXivFetcher`), returning empty results and skipping the raw-term fallback.
### Issue Context
- The routing is currently based solely on `DOI.parse(searchQuery).isPresent()` and `this instanceof IdBasedFetcher`.
- `ArXivFetcher` implements `PagedSearchBasedFetcher` (thus `SearchBasedFetcher`) **and** `IdBasedFetcher`, but its ID search is for arXiv identifiers.
### Fix Focus Areas
- jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java[40-46]
- jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java[48-65]
- jablib/src/main/java/org/jabref/logic/importer/fetcher/ArXivFetcher.java[71-72]
### Suggested change
- Change DOI routing to either:
- Only apply for DOI-capable fetchers (introduce/consult an explicit capability), **or**
- Attempt `performSearchById(...)` and if it yields `Optional.empty()` then continue with the existing parse/fallback logic (do not return early with an empty list).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


View more (1)
4. DOI URL breaks CrossRef lookup🐞 Bug ≡ Correctness
Description
When DOI.parse succeeds, SearchBasedFetcher routes to IdBasedFetcher but passes the original user
input string (including DOI URLs) into performSearchById instead of the canonical DOI. For DOI
id-based fetchers like CrossRef, passing a DOI URL produces an incorrect endpoint path and will fail
to resolve.
Code

jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java[R40-45]

+        // DOIs bypass ANTLR parsing because they contain characters ('/', '.') invalid in the search grammar
+        Optional<DOI> doi = DOI.parse(searchQuery);
+        if (doi.isPresent() && this instanceof IdBasedFetcher idBasedFetcher) {
+            return idBasedFetcher.performSearchById(searchQuery)
+                                 .map(List::of)
+                                 .orElse(List.of());
Evidence
SearchBasedFetcher uses DOI.parse only as a boolean and forwards the raw query string into
performSearchById. CrossRef’s id-based URL construction appends the identifier directly to
/works/{identifier}; supplying a full https://doi.org/... URL yields
/works/https%3A%2F%2Fdoi.org%2F..., which is not a DOI and will not work with the CrossRef API.
The codebase elsewhere normalizes to doi.asString() before delegating to DOI fetchers.

jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java[40-45]
jablib/src/main/java/org/jabref/logic/importer/fetcher/CrossRef.java[80-83]
jablib/src/main/java/org/jabref/logic/importer/fetcher/TitleFetcher.java[41-48]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
When DOI routing is triggered, the code forwards the raw `searchQuery` string into `performSearchById(...)`. If the user entered a DOI URL (e.g., `https://doi.org/...`), DOI-capable id fetchers like `CrossRef` will build an invalid `/works/{identifier}` URL and fail.
### Issue Context
- `DOI.parse(searchQuery)` already extracted/validated the DOI.
- Some id-based fetchers expect a plain DOI, not a full DOI URL.
### Fix Focus Areas
- jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java[40-45]
- jablib/src/test/java/org/jabref/logic/importer/SearchBasedFetcherTest.java[126-145]
- jablib/src/main/java/org/jabref/logic/importer/fetcher/CrossRef.java[80-83]
### Suggested change
- Replace `performSearchById(searchQuery)` with `performSearchById(doi.get().asString())`.
- Adjust tests accordingly (the DOI URL case should assert canonical DOI forwarding, or explicitly test that URL inputs are normalized before calling `performSearchById`).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

5. Tests use assertTrue on Optional📘 Rule violation ≡ Correctness
Description
New tests assert Optional emptiness via boolean conditions (assertTrue(optional.isEmpty()))
where direct expected/actual assertions would be clearer and more diagnostic. This weakens assertion
clarity compared to asserting the object contents directly.
Code

jablib/src/test/java/org/jabref/logic/importer/SearchBasedFetcherTest.java[R87-90]

+        SearchQueryNode searchNode = assertInstanceOf(SearchQueryNode.class, fetcher.getLastQueryNode());
+        assertEquals("quantum", searchNode.term());
+        assertTrue(searchNode.field().isEmpty(), "Unfielded term should have empty field");
+    }
Evidence
PR Compliance ID 29 prefers asserting expected object contents rather than boolean predicates when a
direct assertion is possible. The test uses assertTrue(searchNode.field().isEmpty(), ...) instead
of assertEquals(Optional.empty(), searchNode.field()) (or an equivalent direct assertion).

AGENTS.md
jablib/src/test/java/org/jabref/logic/importer/SearchBasedFetcherTest.java[87-90]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Tests use boolean-condition assertions (`assertTrue(optional.isEmpty())`) where direct value assertions would be clearer.
## Issue Context
JUnit assertion style in this repo prefers asserting object contents directly (expected vs actual) when feasible.
## Fix Focus Areas
- jablib/src/test/java/org/jabref/logic/importer/SearchBasedFetcherTest.java[87-90]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


6. Fallback still logs ERROR🐞 Bug ≡ Correctness
Description
The PR makes parse failures non-fatal by falling back to raw-term searching, but SearchQuery still
logs parse failures at ERROR level. With graceful fallback as the intended behavior, this produces
misleading/noisy error logs for expected user inputs.
Code

jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java[R48-63]

+        SearchQuery searchQueryObject = new SearchQuery(searchQuery);
BaseQueryNode queryNode;
-        SearchQueryVisitor visitor = new SearchQueryVisitor(searchQueryObject.getSearchFlags());
-        try {
-            queryNode = visitor.visitStart(searchQueryObject.getContext());
-        } catch (ParseCancellationException e) {
-            throw new FetcherException("A syntax error occurred during parsing of the query");
+
+        if (searchQueryObject.isValid()) {
+            SearchQueryVisitor visitor = new SearchQueryVisitor(searchQueryObject.getSearchFlags());
+            try {
+                queryNode = visitor.visitStart(searchQueryObject.getContext());
+            } catch (ParseCancellationException e) {
+                LoggerFactory.getLogger(SearchBasedFetcher.class).debug("Search query visitor failed for '{}', falling back to raw term search", searchQuery, e);
+                queryNode = new SearchQueryNode(Optional.empty(), searchQuery);
+            }
+        } else {
+            // Treat unparseable input as a raw unfielded term so fetchers pass it directly to their web API
+            LoggerFactory.getLogger(SearchBasedFetcher.class).debug("Search query '{}' is not valid ANTLR syntax, falling back to raw term search", searchQuery);
+            queryNode = new SearchQueryNode(Optional.empty(), searchQuery);
}
Evidence
SearchBasedFetcher now explicitly treats invalid queries as a normal case and falls back to raw
searching. However, SearchQuery logs parse failures at ERROR in its constructor whenever parsing
fails, even though the caller handles invalid syntax gracefully afterward.

jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java[51-63]
jablib/src/main/java/org/jabref/model/search/query/SearchQuery.java[36-44]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Parse failures are now handled as a normal fallback path (raw-term search), but `SearchQuery` still logs parse failures at **ERROR**, which is misleading and can clutter logs.
### Issue Context
- Fetcher code intentionally falls back when parsing fails.
- `SearchQuery` constructor logs `LOGGER.error(&amp;amp;amp;amp;amp;amp;amp;quot;Search query Parsing error&amp;amp;amp;amp;amp;amp;amp;quot;, ...)` on parse failure.
### Fix Focus Areas
- jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java[48-63]
- jablib/src/main/java/org/jabref/logic/importer/PagedSearchBasedFetcher.java[32-47]
- jablib/src/main/java/org/jabref/model/search/query/SearchQuery.java[36-44]
### Suggested change
- Prefer not instantiating `SearchQuery` in fetcher search paths; call `SearchQuery.getStartContext(...)` in a local try/catch and build the visitor/node directly.
- Alternatively, downgrade/adjust `SearchQuery` parse-failure logging to `debug`/`trace` (or add a non-logging validation method) when parsing failures are expected.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

ⓘ The new review experience is currently in Beta. Learn more

Grey Divider

Qodo Logo

@testlens-app

This comment has been minimized.

Comment thread jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java Outdated
Comment thread jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java Outdated
Comment thread jablib/src/main/java/org/jabref/logic/importer/SearchBasedFetcher.java Outdated
@faneeshh

faneeshh commented Mar 2, 2026

Copy link
Copy Markdown
Collaborator Author

I still have to implement the UI mode indicator and also I noticed (koppor/jabref#559) centralizes query interpretation into a FetcherDelegator class. Right now my changes duplicate the parse then fallback logic in both SearchBasedFetcher and PagedSearchBasedFetcher. Should this be extracted into a shared utility/delegator as part of this PR or is that a larger refactor that should be handled separately?

@github-actions github-actions Bot added the status: changes-required Pull requests that are not yet complete label Mar 2, 2026
@testlens-app

This comment has been minimized.

@github-actions github-actions Bot added status: no-bot-comments and removed status: changes-required Pull requests that are not yet complete labels Mar 2, 2026
@testlens-app

This comment has been minimized.

@github-actions github-actions Bot added status: changes-required Pull requests that are not yet complete and removed status: no-bot-comments labels Mar 3, 2026
@github-actions

github-actions Bot commented Mar 3, 2026

Copy link
Copy Markdown
Contributor

Your pull request conflicts with the target branch.

Please merge with your code. For a step-by-step guide to resolve merge conflicts, see https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/addressing-merge-conflicts/resolving-a-merge-conflict-using-the-command-line.

Comment thread jablib/src/main/java/org/jabref/logic/importer/PagedSearchBasedFetcher.java Outdated
@testlens-app

This comment has been minimized.

@github-actions github-actions Bot added status: no-bot-comments and removed status: changes-required Pull requests that are not yet complete labels Mar 9, 2026
@testlens-app

This comment has been minimized.

@github-actions github-actions Bot added status: changes-required Pull requests that are not yet complete and removed status: no-bot-comments labels Mar 9, 2026
@testlens-app

This comment has been minimized.

@testlens-app

This comment has been minimized.

@github-actions github-actions Bot added status: no-bot-comments status: changes-required Pull requests that are not yet complete and removed status: changes-required Pull requests that are not yet complete status: no-bot-comments labels Mar 22, 2026
modeIndicator.textProperty().bind(viewModel.searchModeIndicatorProperty());
modeIndicator.managedProperty().bind(modeIndicator.visibleProperty());
modeIndicator.visibleProperty().bind(modeIndicator.textProperty().isNotEmpty());
modeIndicator.setStyle("-fx-font-size: 0.85em; -fx-text-fill: -fx-mid-text-color;");

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No direct style in javafx. Put everything into base.css and use styleclasses instead

@testlens-app

This comment has been minimized.

@testlens-app

testlens-app Bot commented Mar 26, 2026

Copy link
Copy Markdown

✅ All tests passed ✅

🏷️ Commit: 99e09e4
▶️ Tests: 10210 executed
⚪️ Checks: 50/50 completed


Learn more about TestLens at testlens.app.

@github-actions github-actions Bot added status: no-bot-comments status: changes-required Pull requests that are not yet complete and removed status: changes-required Pull requests that are not yet complete status: no-bot-comments labels Mar 26, 2026
calixtus
calixtus previously approved these changes Apr 4, 2026
@calixtus calixtus enabled auto-merge April 4, 2026 07:39
@github-actions github-actions Bot added status: no-bot-comments and removed status: changes-required Pull requests that are not yet complete labels Apr 4, 2026
@calixtus calixtus added this pull request to the merge queue Apr 4, 2026
@github-actions github-actions Bot added the status: to-be-merged PRs which are accepted and should go into the merge-queue. label Apr 4, 2026
Merged via the queue into JabRef:main with commit a9c1ff2 Apr 4, 2026
53 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

status: no-bot-comments status: to-be-merged PRs which are accepted and should go into the merge-queue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Web-based Search should support web-fetcher syntax

3 participants