implement auto-link#14710
Conversation
| import org.jabref.model.entry.BibEntry; | ||
| import org.jabref.model.entry.LinkedFile; | ||
|
|
||
| import org.jetbrains.annotations.NotNull; |
| // Wait because there are several rounds in one auto-link operation | ||
| // The later round depends on the updated bibEntry of the previous round | ||
| UiTaskExecutor.runAndWaitInJavaFXThread(() -> entry.setFiles(newLinkedFiles)); | ||
| }; |
There was a problem hiding this comment.
The original onLinkedFile is not reasonable, considering that we not only add new linked files, but also update old ones, and multiple linked files may be updated at the same time. So it is changed to onLinkedFilesUpdated.
Also, onLinkedFile creates the undoable object by assigning the newVal using the only passed-in linked file, which does not make sense.
| * one trick: we accumulate Part B after Step 1, so Part A does not affect those broken linked files we use | ||
| * to query Part B | ||
| */ | ||
| private void doLinkAssociateFiles(BibEntry entry, BiConsumer<List<LinkedFile>, BibEntry> onAddLinkedFile, LinkFilesResult result) { |
There was a problem hiding this comment.
In this function, I have to call onAddLinkedFile.accept multiple times (with UiTaskExecutor.runAndWaitInJavaFXThread) because the updated BibEntry after the first auto-link step is needed for the second auto-link step.
This also affects the Undo Manager behavior that multiple undo actions may be added in the auto-link process for one entry.
Probably a better way is to isolate the BibEntry from the code here with another layer of POJO to do non-UI operations.
There was a problem hiding this comment.
I did not take the high-level AI-suggestion to update the code here.
| public Collection<LinkedFile> findAssociatedNotLinkedFiles(BibEntry entry) throws IOException { | ||
| List<LinkedFile> associatedNotLinkedFiles = new ArrayList<>(); | ||
|
|
||
| List<String> extensions = externalApplicationsPreferences.getExternalFileTypes().stream().map(ExternalFileType::getExtension).toList(); | ||
|
|
||
| LOGGER.debug("Searching for extensions {} in directories {}", extensions, directories); | ||
|
|
||
| // Run the search operation | ||
| FileFinder fileFinder = FileFinders.constructFromConfiguration(autoLinkPreferences); | ||
| List<Path> result = new ArrayList<>(fileFinder.findAssociatedFiles(entry, directories, extensions)); | ||
| result.addAll(findAssociatedFilesByBrokenLinkedFile(entry)); | ||
|
|
||
| // Collect the found files that are not yet linked | ||
| List<Path> linkedFiles = entry.getFiles().stream() | ||
| .map(file -> file.findIn(directories)) | ||
| .filter(Optional::isPresent) | ||
| .map(Optional::get) | ||
| .toList(); | ||
| for (Path foundFile : result) { | ||
| boolean fileAlreadyLinked = linkedFiles.stream() | ||
| .anyMatch(linked -> { | ||
| try { | ||
| return Files.isSameFile(linked, foundFile); | ||
| } catch (IOException e) { | ||
| LOGGER.debug("Unable to check file identity, assuming no identity", e); | ||
| return false; | ||
| } | ||
| }); | ||
|
|
||
| if (!fileAlreadyLinked) { | ||
| Optional<ExternalFileType> type = FileUtil.getFileExtension(foundFile) | ||
| .map(extension -> ExternalFileTypes.getExternalFileTypeByExt(extension, externalApplicationsPreferences)) | ||
| .orElse(Optional.of(new UnknownExternalFileType(""))); | ||
|
|
||
| String strType = type.map(ExternalFileType::getName).orElse(""); | ||
| Path relativeFilePath = FileUtil.relativize(foundFile, directories); | ||
| LinkedFile linkedFile = new LinkedFile("", relativeFilePath, strType); | ||
| associatedNotLinkedFiles.add(linkedFile); | ||
| LOGGER.debug("Found file {} for entry {}", linkedFile, entry.getCitationKey()); | ||
| } | ||
| } | ||
| // Find the associated files | ||
| List<String> extensions = getConfiguredExtensions(); | ||
| LOGGER.debug("Searching for associated not linked files with extensions {} in directories {}", extensions, directories); | ||
| return Stream.concat( | ||
| findAssociatedNotLinkedFilesWithFinder(entry, preConfiguredFileFinder, extensions).stream(), | ||
| findAssociatedNotLinkedFilesWithFinder(entry, brokenLinkedFileNameBasedFileFinder, extensions).stream()) | ||
| .distinct() | ||
| .toList(); | ||
| } |
There was a problem hiding this comment.
I wish it were much clearer by writing code like this to show that we have two sources of associated files.
And the new BrokenLinkedFileNameBasedFileFinder.java also helped this purpose.
| //****************************************************************************************************** | ||
| //*********************************** Test linkAssociatedFiles ***************************************** | ||
| //****************************************************************************************************** |
There was a problem hiding this comment.
The following test cases are grouped by commets like this
There was a problem hiding this comment.
You can use nested tests to structure your tests more easily.
https://docs.junit.org/6.0.1/writing-tests/nested-tests.html
There was a problem hiding this comment.
Thanks! I’ve already applied @Nested; with code folding, the test structure is clean and clear.
| // 1. it has the same name (case-insensitive) as one of the broken linked file names | ||
| // 2. its file extension matches one specified in extensions |
There was a problem hiding this comment.
Here we do not consider a strict file extension match as:
Entry Editor - Automatically search and show unlinked files in the entity editormay want to show those file with same name but different extension as possible candidateQuality - Automatically set file linkswill make sure only file with the same name and extention is auto linked. The logic is inautoLinkBrokenLinkedFiles
PR Compliance Guide 🔍Below is a summary of compliance checks for this PR:
Compliance status legend🟢 - Fully Compliant🟡 - Partial Compliant 🔴 - Not Compliant ⚪ - Requires Further Human Verification 🏷️ - Compliance label |
||||||||||||||||||||||||||||||||||||||
There was a problem hiding this comment.
Many file movals at test-support/src/manual-tests/ - I would suggest to add a README.md to the directory of how to test.
The "best" software engineering update would be to create a section in docs/requirements/files.md for this automatic re-link - and link the implementation and test case(s)
| /** | ||
| * Source of associated not linked files: | ||
| * Part A. match file name with CitationKey, (START, EXACT, REGEX) configured by user | ||
| * Part B. match file name with broken linked file names, currently silently happen | ||
| * | ||
| * The auto-link process: | ||
| * Prolog: we only consider the file with a unique name | ||
| * if a file's name is found multiple times in Part A, we do not consider it in Step 1 | ||
| * if a file's name is found multiple times in Part B, we do not consider it in Step 2 | ||
| * Step 1. try to auto link each broken linked file with Part A at first. For unlinked files left in Part A, if no | ||
| * linked file with same name exists, add them | ||
| * why `add`: Part A are found mainly by CitationKey, which has strong connection to the entry, so we are | ||
| * confident that we should add them automatically | ||
| * Step 2. try to auto link each broken linked file with Part B. | ||
| * how about `unlinked files left in Part B`: there should be no files left as they are found based on broken | ||
| * linked files and are used to fix them. `files left` hints a bug | ||
| * one trick: we accumulate Part B after Step 1, so Part A does not affect those broken linked files we use | ||
| * to query Part B | ||
| */ |
There was a problem hiding this comment.
Use markdown JavaDoc to keep formatting
There was a problem hiding this comment.
fixed. Used markdown JavaDoc instead
| dialogService.notify(String.format("%s %s\n%s", | ||
| Localization.lang("Finished automatically setting external links."), | ||
| Localization.lang("Changed %0 entries.", result.getChangedEntries().size()), | ||
| Localization.lang("Affected entries: %0", result.getChangedEntries().stream() |
There was a problem hiding this comment.
Please recheck our localiaztion howto - use entry(s) because there could be one, two, more, much more.
Also for the string the line above.
| */ | ||
| private void doLinkAssociatedFiles(BibEntry entry, BiConsumer<List<LinkedFile>, BibEntry> onAddLinkedFile, LinkFilesResult result) { | ||
| boolean entryUpdated = false; | ||
| // Step 1: try matched files based on CitationKey configured by user |
There was a problem hiding this comment.
empty line added
| // It is a bug if there are files left | ||
| if (!files.isEmpty()) { | ||
| LOGGER.error("Cannot auto-link all the files found based on broken linked file names. Files left: {} ", | ||
| files.keySet().stream().map(files::get).map(LinkedFile::getLink).collect(Collectors.joining("||"))); |
There was a problem hiding this comment.
Strange format, use
["{filename}", "{filename}", "{filename}"]
Easily can be achived with joining.
There was a problem hiding this comment.
This line has been removed. I confirmed that there are normal cases where some associated files may not be used to do auto-link.
@koppor
I have read through https://devdocs.jabref.org/requirements/ |
subhramit
left a comment
There was a problem hiding this comment.
Lgtm
Would like if @koppor could answer https://github.com/JabRef/jabref/pull/14710/changes#r2646673356
User description
Closes #9798
I implemented the feature - auto-link a file to a broken linked file in a BibEntry. The feature stands at the intersection of
Quality - Automatically set file linksandEntry Editor - Automatically search and show unlinked files in the entity editoras the two features share the code to find associated files using the functionfindAssociatedNotLinkedFilesinAutoSetFileLinksUtil.java.Before #13326, the associated files are only searched by three variants of citation-index-based strategy configured by user (see doc auto-linking-files). #13326 expands the search scope to also consider broken linked file names. It has a flaw that it only returns the first found file based on broken linked file names, which is fixed by #14700. #13326 does not fix the problem that the associated files are never used to auto-link to the broken linked file, but arbitrarily add them as new linked files, which will be addressed by this PR. For details, check the function
doLinkAssociateFilesinAutoSetFileLinksUtil.java.Steps to test
Many unit tests for the function
AutoSetFileLinksUtil.linkAssociatedFilesare created inAutoSetFileLinksUtilTest. They are categorized into the following groups. Most categories also consider cases where auto-link should not work.Required files for the three manual test cases are put in the following directory
The recording of the three manual tests is here:
Screen.Recording.2025-12-24.at.23.39.06.mov
Mandatory checks
CHANGELOG.mdin a way that is understandable for the average user (if change is visible to the user)auto-linkdocumentation has already been created earlier. The detail about finding associated files by broken linked file names should be added there.PR Type
Enhancement, Bug fix
Description
Auto-link broken linked files to their actual locations when files are moved
Implement two-step auto-linking: first by citation key, then by broken file names
Only auto-link files with matching extensions to prevent incorrect associations
Add comprehensive unit tests covering citation key and file name matching scenarios
Diagram Walkthrough
File Walkthrough
1 files
Add Apache Commons Lang3 dependency4 files
Update callback to handle multiple linked filesImplement two-step auto-linking with file extension validationNew file finder based on broken linked file namesAdd factory method for broken linked file finder3 files
Add comprehensive unit tests for auto-linking functionalityAdd manual test case for multiple entriesAdd manual test case for broken file name matching2 files
Document auto-link fix for moved filesAdd localization string for affected entries1 files