Rebased and cleaned up version of UTF-16/32 BE/LE PR#206

Merged: dan-blanchard merged 5 commits into master from jpz-improved_utf_detection on Jun 24, 2022

Conversation

@dan-blanchard
Member

This is just #109 with some rebasing/formatting changes.

@dan-blanchard dan-blanchard force-pushed the jpz-improved_utf_detection branch from d717c42 to b1f7335 Compare December 11, 2020 18:33
@yinyue200

Any news?

@HippocampusGirl

It would be really cool if this could be merged. Is there anything still left to do?

@dan-blanchard
Member Author

It's possibly unrelated, but I've had hypothesis test failures locally with this branch and I've been trying to get that squared away before merging.

jpz and others added 5 commits June 24, 2022 16:20
- Restored file suffix filter in test.py
- Added functionality to identify valid unicode, to enhance detection
- Generated some non-trivial unicode examples using supplementary plane 1
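For readers unfamiliar with the problem these commits address, here is a minimal sketch of one way to guess UTF-16/32 byte order when no BOM is present. It is not the code from this PR: the function name `guess_bomless_utf` and the `min_ratio` parameter are hypothetical, and the approach (look at where zero bytes fall, then confirm with a strict decode) is only an illustration of the general idea of checking for valid Unicode.

```python
from typing import List, Optional


def guess_bomless_utf(data: bytes, min_ratio: float = 0.9) -> Optional[str]:
    """Guess a UTF-16/32 variant for BOM-less input, or return None.

    Hypothetical helper for illustration only; not chardet's implementation.
    """
    if len(data) < 4:
        return None

    # Count zero bytes and total bytes at each offset modulo 4.
    zeros = [0, 0, 0, 0]
    totals = [0, 0, 0, 0]
    for i, byte in enumerate(data):
        totals[i % 4] += 1
        if byte == 0:
            zeros[i % 4] += 1

    def mostly_zero(*offsets: int) -> bool:
        return all(zeros[o] >= min_ratio * totals[o] for o in offsets)

    candidates: List[str] = []
    if len(data) % 4 == 0:
        # Code points never exceed 0x10FFFF, so the most significant
        # byte of every UTF-32 unit is zero.
        if mostly_zero(0):
            candidates.append("utf-32-be")
        if mostly_zero(3):
            candidates.append("utf-32-le")
    if len(data) % 2 == 0:
        # Mostly-Latin text in UTF-16 has a zero high byte in each unit.
        if mostly_zero(0, 2):
            candidates.append("utf-16-be")
        if mostly_zero(1, 3):
            candidates.append("utf-16-le")

    # Keep the first candidate that decodes as valid Unicode.
    for encoding in candidates:
        try:
            data.decode(encoding)
        except UnicodeDecodeError:
            continue
        return encoding
    return None
```

The strict decode is what separates look-alike cases: ASCII text in UTF-16-LE also has zeros at offset 3, but reading it as UTF-32-LE yields values above 0x10FFFF and is rejected. The sketch deliberately gives up on UTF-16 text with few zero bytes (for example, text made mostly of non-Latin or supplementary-plane characters), which needs stronger evidence than this heuristic provides.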
@dan-blanchard dan-blanchard force-pushed the jpz-improved_utf_detection branch from b1f7335 to c30a33f Compare June 24, 2022 20:22
@dan-blanchard dan-blanchard merged commit 57abbca into master Jun 24, 2022
@dan-blanchard dan-blanchard deleted the jpz-improved_utf_detection branch June 24, 2022 20:32
This was referenced Jun 24, 2022
hswong3i pushed a commit to alvistack/chardet-chardet that referenced this pull request Jan 4, 2025
* support for UTF-16 and UTF-32 detection missing BOMs

* Changes per PR comments

- Restored file suffix filter in test.py
- Added functionality to identify valid unicode, to enhance detection
- Generated some non-trivial unicode examples using supplementary plane 1

* clean up poorly written comments

* Run black on PR

* Fix some minor linting issues

Co-authored-by: Jason Zavaglia <jason.zavaglia@gmail.com>
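As a rough illustration of what "non-trivial unicode examples using supplementary plane 1" can look like, the sketch below builds a short string from plane 1 code points and encodes it in the four BOM-less variants. The helper name `plane1_sample` and the starting code point U+1D400 are assumptions chosen for the example, not the test data added in this PR.

```python
def plane1_sample(start: int = 0x1D400, count: int = 8) -> str:
    """Return `count` consecutive code points from plane 1 (U+10000..U+1FFFF).

    Hypothetical helper for illustration; the range is an arbitrary choice.
    """
    if not (0x10000 <= start and start + count - 1 <= 0x1FFFF):
        raise ValueError("range must lie inside supplementary plane 1")
    return "".join(chr(start + i) for i in range(count))


sample = plane1_sample()

# The encoding names with an explicit byte order never emit a BOM.
encoded = {
    name: sample.encode(name)
    for name in ("utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le")
}

for name, raw in encoded.items():
    # Every plane 1 character takes 4 bytes in both UTF-16 (a surrogate
    # pair) and UTF-32, and must round-trip through a strict decode.
    assert len(raw) == 4 * len(sample)
    assert raw.decode(name) == sample
```

Such text is a useful test case because its UTF-16 form consists entirely of surrogate pairs and contains almost no zero bytes, so a detector cannot rely on the zero-byte patterns that plain ASCII produces.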