Skip to content

Copy encoding fixes from loadFile() to reloadBuffer()#2633

Closed
vlakoff wants to merge 1 commit intonotepad-plus-plus:masterfrom
vlakoff:encoding-on-reload
Closed

Copy encoding fixes from loadFile() to reloadBuffer()#2633
vlakoff wants to merge 1 commit intonotepad-plus-plus:masterfrom
vlakoff:encoding-on-reload

Conversation

@vlakoff
Copy link
Copy Markdown
Contributor

@vlakoff vlakoff commented Dec 1, 2016

(noticed this while investigating #1874)

Copy some code from FileManager::loadFile to reloadBuffer, to fix two bugs also when reloading file from disk. I think the bug 1 is quite important, as it affects "reload from disk" in a very common scenario (ANSI-only content, but force-detected as UTF-8 without BOM).

Bug 1:

  1. The settings "New Document > UTF-8 + Apply to opened ANSI files" should be checked.
  2. Create a file with only ANSI content (e.g. "aaa"). Save this file in ANSI. Close the file.
  3. Open the file. It should be detected as UTF-8 (without BOM).
  4. Execute menu command "Reload from Disk". The file is now detected as ANSI. Expected: should be detected as UTF-8.

Bug 2:

  1. Create a file with this exact content:
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> ééé
  2. Save it in ANSI and with a .html extension. Close the file.
  3. Open the file. It should be detected as UTF-8 (and display garbage but that's "expected").
  4. Execute command "Reload from Disk". The file is now detected as Windows-1255. Expected: should be detected as UTF-8.


By the way, maybe we should drop the getHtmlXmlEncoding method. For instance look at the HTML regexes, they are outdated (no html5), not flexible and in most cases wouldn't match. Also, I think this method does more harm than good. These <meta> tags don't determine the file encoding, but the expected file encoding.

@vlakoff
Copy link
Copy Markdown
Contributor Author

vlakoff commented Dec 1, 2016

Of course, you might want to factorize this code from loadFile and reloadBuffer into a separate method.

@donho
Copy link
Copy Markdown
Member

donho commented Feb 19, 2017

@vlakoff Have you tested your PR?
It doesn't work for me.

@vlakoff
Copy link
Copy Markdown
Contributor Author

vlakoff commented Feb 20, 2017

I cannot compile, I just copied the existing code.

  • Can you reproduce the bugs, at least the first one?
  • What's your opinion about dropping getHtmlXmlEncoding as I described at the end of my first post?

@donho donho added reject and removed regression labels Feb 20, 2017
@donho
Copy link
Copy Markdown
Member

donho commented Feb 20, 2017

I can reproduce the bug. But your PR doesn't work.
Please submit ONLY pr that you have tested. Thank you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants