Fix for some Unicode characters in citation keys #6938
Merged
Conversation
calixtus
reviewed
Sep 25, 2020
calixtus
left a comment
Member
Thanks for your work here, one question remaining...
Siedlerchr
approved these changes
Sep 25, 2020
Siedlerchr
left a comment
Member
Thanks, I was not aware of the Normalizer/Unicode methods. I had never heard of them before.
Member
Author
@Siedlerchr in my opinion it is a mess best avoided if possible X)
Siedlerchr
added a commit
that referenced
this pull request
Sep 26, 2020
* upstream/master: (55 commits)
  Rename menus citation style in preview style (#6899)
  Fix for some Unicode characters in citation keys (#6938)
  Add missing authors
  Fix a fetcher test for the ShortDOIService (#6945)
  Fixes Shared Database: Changes filtering in CoarseChangeFilter to attribute property (#6868)
  Changed default value of "search and store files relative to bibtex file" to true (#6928)
  Replace comment by just a failure (#6943)
  Fix: in entry types editor selected field is not removed after first click (#6941)
  Fix remove actions for entry types in the editor (#6933)
  Always use Java 15 (#6929)
  Update DevDocs: workaround for issues with local openjfx maven libraries (#6931)
  Fixes bugs in the `regex` cite key pattern modifier (#6893)
  Add missing author
  Readability for citation key patterns (#6706)
  Add new author
  Reset to master and add default case to switch (#6847)
  Bump mockito-core from 3.5.10 to 3.5.11 (#6924)
  Bump byte-buddy-parent from 1.10.14 to 1.10.15 (#6923)
  Bump org.beryx.jlink from 2.21.4 to 2.22.0 (#6925)
  Bump xmpbox from 2.0.20 to 2.0.21 (#6926)
  ...

# Conflicts:
#	src/main/java/org/jabref/logic/util/DelayTaskThrottler.java
Fixes #6583. Some Unicode characters can be encoded in multiple ways, and the mapping that `StringUtil#replaceSpecialCharacters` relies on does not contain all cases. The proposed solution uses NFC to re-encode the input so that these characters can be found in the mapping. More information on Unicode normalization is available in the Java API.
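To illustrate the problem and the fix, here is a minimal sketch. It is not the JabRef implementation: the class `NfcLookupSketch`, the one-entry `CHAR_MAP` (a hypothetical stand-in for `UNICODE_CHAR_MAP`), and both helper methods are assumptions made for this example. Only `java.text.Normalizer` is a real JDK API.

```java
import java.text.Normalizer;
import java.util.Map;

public class NfcLookupSketch {
    // Hypothetical stand-in for UNICODE_CHAR_MAP: it only knows the
    // precomposed form of "é" (U+00E9).
    static final Map<String, String> CHAR_MAP = Map.of("\u00E9", "e");

    // Replaces every mapped character, looking at the input exactly as given.
    static String replaceWithoutNormalizing(String input) {
        String result = input;
        for (Map.Entry<String, String> entry : CHAR_MAP.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }

    // Re-encodes the input to NFC first, so "e" + U+0301 (combining acute)
    // becomes the precomposed U+00E9 that the map contains.
    static String replaceAfterNfc(String input) {
        String composed = Normalizer.normalize(input, Normalizer.Form.NFC);
        return replaceWithoutNormalizing(composed);
    }

    public static void main(String[] args) {
        String decomposed = "Cafe\u0301"; // renders like "Café"
        // The decomposed form is not in the map, so nothing is replaced.
        System.out.println(replaceWithoutNormalizing(decomposed).equals("Cafe")); // false
        // After NFC the lookup succeeds.
        System.out.println(replaceAfterNfc(decomposed).equals("Cafe")); // true
    }
}
```

Normalizing once before the lookup keeps the map small, since it only needs the composed form of each character.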
My subjective opinion is that most people expect Unicode to work similarly to NFC, i.e., if characters look the same, they are likely equivalent. Hence, if someone debugs issues in the `UNICODE_CHAR_MAP`, they will expect NFC. A more holistic approach should likely start with compatibility equivalence, which would require larger changes, and there do not seem to be any bugs/issues that require these larger changes.
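For readers unfamiliar with the distinction, the sketch below contrasts canonical composition (NFC) with compatibility composition (NFKC) using the JDK's `java.text.Normalizer`. The class name `NormalizationFormsSketch` and its helpers are hypothetical and exist only for this illustration. It shows why moving to compatibility equivalence would be a broader change: NFKC also rewrites characters that NFC leaves alone.

```java
import java.text.Normalizer;

public class NormalizationFormsSketch {
    // Canonical composition: only canonically equivalent sequences are unified.
    static String nfc(String input) {
        return Normalizer.normalize(input, Normalizer.Form.NFC);
    }

    // Compatibility composition: also folds compatibility variants such as ligatures.
    static String nfkc(String input) {
        return Normalizer.normalize(input, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        String ligature = "\uFB01"; // LATIN SMALL LIGATURE FI
        System.out.println(nfc(ligature).equals(ligature)); // true: NFC keeps the ligature
        System.out.println(nfkc(ligature).equals("fi"));    // true: NFKC expands it to "f" + "i"
    }
}
```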
- Change in CHANGELOG.md described (if applicable)
- Screenshots added in PR description (for UI changes)