Fix repair --merge-axiom-annotations#1240
Merged
jamesaoverton merged 2 commits intoontodev:masterfrom Mar 5, 2025
Merged
Conversation
The RepairOperation::mergeAxiomAnnotations() method only ever looks at
annotated axioms. As a result, it misses the case where the ontology
contains both an unnanotated axiom and one (or more) annotated
version(s) of that same axiom. For example:
SubClassOf(UBERON:1 UBERON:2)
SubClassOf(Annotation(rdfs:comment "A comment") UBERON:1 UBERON:2)
SubClassOf(Annotation(rdfs:comment "Another one") UBERON:1 UBERON:2)
The last two axioms would be merged into a single one carrying the two
comments, but the first one would remain distinct:
SubClassOf(UBERON:1 UBERON:2)
SubClassOf(Annotation(rdfs:comment "A comment")
Annotation(rdfs:comment "Another one")
UBERON:1 UBERON:2)
We fix that by rewriting the method to forcefully look at _all_ axioms,
whether they are annotated or not.
Theoretically, this is sligthly inefficient as axioms that are not
annotated and that do not have any annotated duplicate would still be
removed and then added back. However I do not think that a more
elaborated approach would yield enough performance gain to be worth the
increased complexity.
Update the test case for `repair --merge-axiom-annotations` so that it includes an instance of an unannotated axiom associated with annotated duplicates. This requires changing the format of the test case file from RDF/XML to OFN, since RDF/XML does not allow to represent both an unannotated axiom and an annotated version of the same axiom.
matentzn
approved these changes
Feb 8, 2025
Contributor
matentzn
left a comment
There was a problem hiding this comment.
This implementation is not only more correct, but also 100 times more readable than my old implementation. Full approve!
| manager.removeAxioms(ontology, axiomsToMerge); | ||
| for (OWLAxiom axiom : origAxioms) { | ||
| annotsByAxiom | ||
| .computeIfAbsent(axiom.getAxiomWithoutAnnotations(), k -> new HashSet<>()) |
Contributor
There was a problem hiding this comment.
I have never seen this computeIfAbsent but it makes the code so much more compact!
This was referenced Mar 5, 2025
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Resolves [#1239]
docs/have been added/updatedmvn verifysays all tests passmvn sitesays all JavaDocs correctCHANGELOG.mdhas been updatedThe
RepairOperation::mergeAxiomAnnotations()method only ever looks at annotated axioms. As a result, it misses the case where the ontology contains both an unannotated axiom and one (or more) annotated version(s) of that same axiom. For example:The last two axioms would be merged into a single one carrying the two comments, but the first axiom would remain distinct:
This PR fixes the issue by rewriting the method to look at all axioms, whether they are annotated or not.
Theoretically, this is slightly inefficient as axioms that are not annotated and that do not have any annotated duplicate (which probably represent the majority of axioms in most ontologies) would still be forcibly removed and then added back. However I do not think that a more elaborated approach would yield enough performance gain to be worth the increased complexity. For what it’s worth, on my machine and with Uberon, the rewritten method is about 20% slower than the original (incorrect) one (running in ~5 seconds instead of ~4 seconds).
Importantly, the test file for this feature had to be converted from RDF/XML to OFN in order to add an instance of an unannotated axiom alongside annotated duplicates of the same axiom -- something the RDF/XML format does not allow to represent).