Skip to content

Use locale.getpreferredencoding rather than locale.getlocale as latter can fail if Python doesn't know the locale#11384

Merged
michaelDCurran merged 3 commits into
nvaccess:masterfrom
lukaszgo1:I11155
Jul 20, 2020
Merged

Use locale.getpreferredencoding rather than locale.getlocale as latter can fail if Python doesn't know the locale#11384
michaelDCurran merged 3 commits into
nvaccess:masterfrom
lukaszgo1:I11155

Conversation

@lukaszgo1

Copy link
Copy Markdown
Contributor

Link to issue number:

Fixes #11155

Summary of the issue:

Whenever we needed to access the ANSI Windows code page we were using locale.getlocale. This was problematic for two reasons:

  1. `locale.getlocale`` returns the Python locale which we're changing when setting NVDA language so returned code page may not correspond to the user code page for example when someone's system is in Polish but NVDA in English.
  2. More importantly Python, or rather the win32 function which Python invokes under the hood, cannot determine what code page should be used for some locales e.g. Aragonese.

Description of how this pull request fixes the issue:

In places where locale.getlocale was formerly used I've switched to locale.getpreferredencoding as this method returns ANSI code page of the current Windows user.

Testing performed:

Switched NVDA language to Aragonese, navigated in Notepad, run dialog and cmd - previously navigation was not possible due to the fact that AN was not recognized as a valid locale by Python.

Known issues with pull request:

Change log entry:

Bug fixes

It is once again possible to navigate in various controls when NVDA is set to Aragonese.

@michaelDCurran

Copy link
Copy Markdown
Member

I'm not sure that locale.getpreferredencoding actually honours the locale NVDA is set to, rather it only honours the locale the Windows user account is configured for.
At least, for me:
On my system:
locale.getlocale returns:
('English_United Kingdom', '1252')
locale.getpreferredencoding returns:
'cp1252'
Which is fine. But:
Then if I call
locale.setlocale(locale.LC_ALL, 'zh')
I get:
locale.getlocale returns:
('zh_CN', 'eucCN')
locale.getpreferredencoding returns:
'cp1252'
This last one looks wrong to me.

Perhaps we can catch the exception from locale.getlocale and fall back to locale.getpreferredencoding?

@LeonarddeR

Copy link
Copy Markdown
Collaborator

I actually think this pr is right, and @michaelDCurran's concern doesn't apply here.

  1. First of all, the only reason why i decided to use locale.getlocale here was that it was used all over the place in the code pre textUtils module to deal with offset differences between Python 3 strings and Windows wide character strings with surrogate characters #9545, so it was merely copying existing behaviour. The reason why this fails in Python 3 versions of NVDA is that this code is used more widely now and that we're fetching the locale more often.
  2. locale.getlocale returns locale and codepage of NVDA while locale.getpreferredencoding returns the code page of the system. I believe that we really should use the code page of the system in this code, which can be fetched with locale.getpreferredencoding.

@LeonarddeR LeonarddeR left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this pr doesn't fix the issue that locale.getlocale fails when NVDA is set to Aragonese. I think that's still a thing that should be addressed in a follow up. May be we should file this against Python or something.

Comment thread source/textUtils.py Outdated
Comment thread source/textUtils.py Outdated
from logHandler import log

WCHAR_ENCODING = "utf_16_le"
USERANSICODEPAGE = locale.getpreferredencoding()

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please follow the style used for constants here:

Suggested change
USERANSICODEPAGE = locale.getpreferredencoding()
USER_ANSI_CODE_PAGE = locale.getpreferredencoding()

Or rather

Suggested change
USERANSICODEPAGE = locale.getpreferredencoding()
SYSTEM_PREFERRED_ENCODING = locale.getpreferredencoding()

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've renamed it to USER_ANSI_CODE_PAGE because it can be set per Windows user account and is not global to the system.

Comment thread source/textInfos/offsets.py Outdated
@josephsl

josephsl commented Jul 16, 2020 via email

Copy link
Copy Markdown
Contributor

@lukaszgo1

Copy link
Copy Markdown
Contributor Author

@LeonarddeR wrote:

Note that this pr doesn't fix the issue that locale.getlocale fails when NVDA is set to Aragonese. I think that's still a thing that should be addressed in a follow up. May be we should file this against Python or something.

I don't think there is much we can do about that - an as a locale code is simply not known to Windows that's why Python fails.

@lukaszgo1

Copy link
Copy Markdown
Contributor Author

@LeonarddeR All your commends are now addressed.

@michaelDCurran

michaelDCurran commented Jul 16, 2020 via email

Copy link
Copy Markdown
Member

@michaelDCurran michaelDCurran merged commit 41970ca into nvaccess:master Jul 20, 2020
@nvaccessAuto nvaccessAuto added this to the 2020.3 milestone Jul 20, 2020
michaelDCurran added a commit that referenced this pull request Jul 20, 2020
@lukaszgo1 lukaszgo1 deleted the I11155 branch July 20, 2020 09:00
@josephsl josephsl mentioned this pull request Sep 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Odd behaviour when NVDA is in aragonese

5 participants