Fix for some Unicode characters in citation keys #6938

Merged
koppor merged 4 commits into JabRef:master from k3KAW8Pnf7mkmdSMPHz27:fix-for-issue-6583
Sep 26, 2020

@k3KAW8Pnf7mkmdSMPHz27 k3KAW8Pnf7mkmdSMPHz27 commented Sep 23, 2020

Member

Fixes #6583. Some Unicode characters can be encoded in more than one way, and the mapping that StringUtil#replaceSpecialCharacters relies on does not contain every encoding. The proposed solution normalizes the input to NFC so that these characters can be found in the mapping.
More information on Unicode normalization is available in the Java API documentation.
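
For illustration only, here is a minimal sketch of the idea, not the code in this PR. The class `NfcLookupSketch`, the two-entry `CHAR_MAP`, and its replacement strings are hypothetical stand-ins for the real mapping; only `java.text.Normalizer` is the actual JDK API. It shows that a decomposed "ä" (a + U+0308) misses a map keyed by the precomposed U+00E4 unless the input is normalized to NFC first.

```java
import java.text.Normalizer;
import java.util.Map;

final class NfcLookupSketch {

    // Hypothetical stand-in for the real mapping: keys are precomposed characters only.
    static final Map<String, String> CHAR_MAP = Map.of("\u00E4", "ae", "\u00F6", "oe");

    // Normalize to NFC first, then replace each code point that has a mapping.
    static String replaceSpecialCharacters(String input) {
        String normalized = Normalizer.normalize(input, Normalizer.Form.NFC);
        StringBuilder result = new StringBuilder();
        normalized.codePoints().forEach(cp -> {
            String ch = new String(Character.toChars(cp));
            result.append(CHAR_MAP.getOrDefault(ch, ch));
        });
        return result.toString();
    }

    public static void main(String[] args) {
        String precomposed = "K\u00E4fer"; // single code point U+00E4
        String decomposed = "Ka\u0308fer"; // 'a' followed by combining diaeresis U+0308

        System.out.println(precomposed.equals(decomposed));         // false
        System.out.println(replaceSpecialCharacters(precomposed));  // Kaefer
        System.out.println(replaceSpecialCharacters(decomposed));   // Kaefer
    }
}
```

Without the `Normalizer.normalize` call, the decomposed input would pass through with the combining mark still attached, because neither `a` nor U+0308 is a key in the map.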

My subjective opinion is that most people expect Unicode to work similarly to NFC, i.e., if characters look the same, they are likely equivalent. Hence, anyone debugging issues in the UNICODE_CHAR_MAP will expect NFC.
A more holistic approach should likely start with compatibility equivalence, which would require larger changes, and there do not seem to be any bugs/issues that require these larger changes.
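
To make the difference between canonical and compatibility equivalence concrete, here is a small sketch (the class name `EquivalenceSketch` and its helper methods are hypothetical; `Normalizer.Form.NFC` and `Normalizer.Form.NFKC` are the real JDK constants). The "ﬁ" ligature (U+FB01) is only compatibility-equivalent to "fi", so NFC leaves it alone while NFKC expands it.

```java
import java.text.Normalizer;

final class EquivalenceSketch {

    // Canonical composition: only canonically equivalent sequences are unified.
    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Compatibility composition: also folds compatibility variants such as ligatures.
    static String nfkc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        String ligature = "\uFB01"; // LATIN SMALL LIGATURE FI

        System.out.println(nfc(ligature).equals("fi"));  // false
        System.out.println(nfkc(ligature).equals("fi")); // true
    }
}
```

Switching to NFKC would therefore change more inputs than the NFC approach does, which is consistent with treating it as a larger, separate change.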

  • Change in CHANGELOG.md described (if applicable)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (for UI changes)
  • Checked documentation: Is the information available and up to date? If not, created an issue at https://github.com/JabRef/user-documentation/issues or, even better, submitted a pull request to the documentation repository.

@k3KAW8Pnf7mkmdSMPHz27 k3KAW8Pnf7mkmdSMPHz27 changed the title from "[WIP] Fix for issue 6583" to "[WIP] Fix for some Unicode characters in citation keys" Sep 24, 2020
@k3KAW8Pnf7mkmdSMPHz27 k3KAW8Pnf7mkmdSMPHz27 changed the title from "[WIP] Fix for some Unicode characters in citation keys" to "Fix for some Unicode characters in citation keys" Sep 24, 2020
@k3KAW8Pnf7mkmdSMPHz27 k3KAW8Pnf7mkmdSMPHz27 marked this pull request as ready for review September 24, 2020 15:41

@calixtus calixtus left a comment

Member

Thanks for your work here, one question remaining...

Comment thread src/test/java/org/jabref/logic/citationkeypattern/CitationKeyGeneratorTest.java Outdated

@Siedlerchr Siedlerchr left a comment

Member

Thanks, I was not aware of the Normalizer/Unicode methods. I had never heard of them before.

@Siedlerchr Siedlerchr added the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Sep 25, 2020
@k3KAW8Pnf7mkmdSMPHz27

Member Author

@Siedlerchr in my opinion it is a mess best avoided if possible X)
I believe NFC is how most people expect Unicode to work (which is why I am using it here). I'll add some more details to the top part of the PR in case someone needs to patch this patch later.

@koppor koppor left a comment

Member

Thank you for the work! LGTM.

@koppor koppor merged commit 47edbbd into JabRef:master Sep 26, 2020
Siedlerchr added a commit that referenced this pull request Sep 26, 2020
* upstream/master: (55 commits)
  Rename menus citation style in preview style (#6899)
  Fix for some Unicode characters in citation keys (#6938)
  Add missing authors
  Fix a fetcher test for the ShortDOIService (#6945)
  Fixes Shared Database: Changes filtering in CoarseChangeFilter to attribute property (#6868)
  Changed default value of "search and store files relative to bibtex file" to true (#6928)
  Replace comment by just a failure (#6943)
  Fix: in entry types editor selected field is not removed after first click  (#6941)
  Fix remove actions for entry types in the editor (#6933)
  Always use Java 15 (#6929)
  Update DevDocs: workaround for issues with local openjfx maven libraries (#6931)
  Fixes bugs in the `regex` cite key pattern modifier (#6893)
  Add missing author
  Readability for citation key patterns (#6706)
  Add new author
  Reset to master and add default case to switch (#6847)
  Bump mockito-core from 3.5.10 to 3.5.11 (#6924)
  Bump byte-buddy-parent from 1.10.14 to 1.10.15 (#6923)
  Bump org.beryx.jlink from 2.21.4 to 2.22.0 (#6925)
  Bump xmpbox from 2.0.20 to 2.0.21 (#6926)
  ...

# Conflicts:
#	src/main/java/org/jabref/logic/util/DelayTaskThrottler.java

Labels

status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers

Development

Successfully merging this pull request may close these issues.

Bibtex citekey has non-ASCII letters

4 participants