Description
Compiling a PDF that contains non-Latin text (in this case specifically Devanagari) can sometimes result in a strange text encoding, so the text is not properly selectable. Which part of the text is selectable differs across PDF viewers, and the selection sometimes contains characters that do not exist in the source.
Minimal Working Example:
#set text(font: "Siddhanta")
आ रु॒क्मैरा यु॒धा नर॑ ऋ॒ष्वा ऋ॒ष्टीर॑सृक्षत ।
अन्वे॑नाँ॒ अह॑ वि॒द्युतो॑ म॒रुतो॒ जज्झ॑तीरिव भा॒नुर॑र्त॒ त्मना॑ दि॒वः ॥
(The font is set explicitly for reproducibility.)
typst output
This results in the following chunk of selectable text in Ubuntu's Document Viewer:
अा
ैरा युध
ा नर॑ ऋ॒ ष्वा ऋ॒ ीर॑ सृक्षत ꠰
अन्वे॑नाँ॒ अह॑ व॒ ुताे॑ म॒ ताे॒ जज्झ॑ती रव भा॒नुर॑त॒ त्मना॑ द॒वः ꠱
In other PDF viewers the output may be different, e.g. Firefox gives:
अा ˳॒ƨैरा यु॒धा नर॑ ऋ॒ष्वा ऋ॒ʆीर॑सृक्षत ꠰
अन्वे॑ नाँ॒ अह॑ Vव॒ȭुताे॑ म॒˳ताे॒ जज्झ॑तीRरव भा॒नुर॑तA॒ त् मना॑ Tद॒वः ꠱
(Note the Latin characters introduced into the selection.)
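To make the difference between source and copied text easier to see, here is a small Python sketch that lists code points and finds the first mismatch. The helper names `describe` and `first_mismatch` are made up for this illustration, and the sample strings assume the viewer output really contains the sequences as they appear above (e.g. U+0905 followed by U+093E where the source has U+0906); I have not checked this against the PDF's internal encoding.

```python
import unicodedata


def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


def first_mismatch(expected, actual):
    """Return the index of the first differing code point after NFC
    normalization, or None if both strings are equal."""
    a = unicodedata.normalize("NFC", expected)
    b = unicodedata.normalize("NFC", actual)
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # No difference in the common prefix: equal, or one string is longer.
    return None if len(a) == len(b) else min(len(a), len(b))


if __name__ == "__main__":
    source = "\u0906"        # first letter as typed in the source (assumed)
    copied = "\u0905\u093e"  # what the viewer selection appears to contain (assumed)
    print(describe(source))
    print(describe(copied))
    print(first_mismatch(source, copied))
```

Pasting the full source line and the full copied line into `first_mismatch` should point at the first place where a viewer's selection diverges from the input.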
This does not seem to be an inherent limitation of PDF itself, since LuaLaTeX can generate properly functioning PDFs from the same text. A minimal working example for LaTeX is provided as well:
\documentclass[a4paper]{article}
\usepackage{fontspec}
\setromanfont{Siddhanta}
\begin{document}
आ रु॒क्मैरा यु॒धा नर॑ ऋ॒ष्वा ऋ॒ष्टीर॑सृक्षत ।
अन्वे॑नाँ॒ अह॑ वि॒द्युतो॑ म॒रुतो॒ जज्झ॑तीरिव भा॒नुर॑र्त॒ त्मना॑ दि॒वः ॥
\end{document}
This suggests that it is possible to write this text into a PDF in a way that keeps the text encoding intact.
Reproduction URL
No response
Operating system
Linux
Typst version
- I am using the latest version of Typst