
Non-Latin author names (Hindi/Arabic) parsed as namePrefix instead of familyName #15813

@shamka17


JabRef version

5.15 (latest release)

Operating system

Windows

Details on version and operating system

No response

Checked with the latest development build (copy version output from About dialog)

  • I made a backup of my libraries before testing the latest development version.
  • I have tested the latest development version and the problem persists

Steps to reproduce the behaviour

Description

While testing JabRef author parsing using parameterized JUnit tests, I observed inconsistent parsing behaviour for certain non-Latin scripts.

Chinese and Cyrillic author names are parsed correctly into familyName and givenName components, but Hindi and Arabic names are interpreted as namePrefix values instead of familyName.

Reproducible Example

Input: हिंदी, परीक्षण

Observed result:

  • namePrefix = हिंदी
  • familyName = null

Input: العربية, اختبار

Observed result:

  • namePrefix = العربية
  • familyName = null

Comparison

The following inputs parse correctly:

  • 王, 伟
  • Иванов, Алексей
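One plausible explanation, not confirmed in this report, is that the parser uses letter case to tell name-prefix particles (such as "von") apart from family names. Devanagari and Arabic letters have no case, so a check like Character.isUpperCase would return false for them. Han characters are caseless as well, yet 王 parses correctly, so if case is the trigger, Han would have to be handled separately; that is also an assumption. The sketch below only probes the JDK's view of the four family names from this report. ScriptCaseProbe is a hypothetical helper, not JabRef code.

```java
import java.util.List;

// Hypothetical helper, not part of JabRef: reports JDK character properties
// for the first code point of a name.
class ScriptCaseProbe {

    // True only if the first code point is an uppercase letter.
    static boolean startsUppercase(String name) {
        return Character.isUpperCase(name.codePointAt(0));
    }

    // True if the first code point is a letter with no case at all
    // (neither uppercase, lowercase, nor titlecase).
    static boolean startsCaseless(String name) {
        int cp = name.codePointAt(0);
        return Character.isLetter(cp)
                && !Character.isUpperCase(cp)
                && !Character.isLowerCase(cp)
                && !Character.isTitleCase(cp);
    }

    // Unicode script of the first code point.
    static Character.UnicodeScript scriptOf(String name) {
        return Character.UnicodeScript.of(name.codePointAt(0));
    }

    public static void main(String[] args) {
        // Family names taken from the inputs in this report.
        for (String name : List.of("हिंदी", "العربية", "王", "Иванов")) {
            System.out.printf("%s script=%s uppercase=%b caseless=%b%n",
                    name, scriptOf(name), startsUppercase(name), startsCaseless(name));
        }
    }
}
```

Only the Cyrillic name starts with an uppercase letter; the other three start with caseless letters, which would fit a case-based heuristic being the cause.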

Expected Behaviour

All comma-separated author names in the format "Lastname, Firstname" should be parsed consistently regardless of script.
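To make the expected outcome concrete, here is a minimal sketch of a script-agnostic split at the first comma, applied to the four inputs from this report. ExpectedNameSplit and its Name record are hypothetical names used for illustration only. This is not how AuthorList.parse() is implemented, and it ignores prefixes, suffixes, and multiple authors.

```java
// Hypothetical illustration of the expected result; not JabRef's implementation.
class ExpectedNameSplit {

    // Holder for the two components the report expects to be populated.
    record Name(String familyName, String givenName) { }

    // Splits "Lastname, Firstname" at the first comma without looking at
    // script or letter case.
    static Name split(String input) {
        int comma = input.indexOf(',');
        if (comma < 0) {
            // No comma: treat the whole input as the family name.
            return new Name(input.strip(), null);
        }
        return new Name(input.substring(0, comma).strip(),
                        input.substring(comma + 1).strip());
    }

    public static void main(String[] args) {
        String[] inputs = {"हिंदी, परीक्षण", "العربية, اختبار", "王, 伟", "Иванов, Алексей"};
        for (String input : inputs) {
            System.out.println(split(input));
        }
    }
}
```

Under this reading, every input yields a non-null familyName, which is what the Hindi and Arabic cases currently fail to produce.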

Environment

  • JabRef version: latest master branch
  • Java version: 21
  • Testing framework: JUnit 5 parameterized tests

Additional Notes

The issue was observed during automated parameterized testing of AuthorList.parse() behaviour.
