stop reading file immediately when filetype known #103

Merged
dan-blanchard merged 1 commit into chardet:master from jpz:perf_improvement on Apr 10, 2017

Conversation

@jpz (Contributor) commented Apr 10, 2017

This should be a useful performance improvement. I have some 100 MB files that take several seconds to read from my local SSD, and the cost would be higher still for anyone reading across the network.

If the Unicode byte-order mark is found in the first line of the file, there is no reason to read the rest of the file from disk.
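A minimal sketch of the idea, not the code in this PR: `BomSniffer` and `sniff` are hypothetical names used only for illustration, and the sniffer looks at nothing but the byte-order mark, whereas a real detector does much more. The point is the `break` once the detector reports that it is done.

```python
import codecs
import io

# Longest BOMs first: the UTF-32 LE mark begins with the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32"),
    (codecs.BOM_UTF32_BE, "UTF-32"),
    (codecs.BOM_UTF8, "UTF-8-SIG"),
    (codecs.BOM_UTF16_LE, "UTF-16"),
    (codecs.BOM_UTF16_BE, "UTF-16"),
]


class BomSniffer:
    """Hypothetical stand-in for a streaming detector; not chardet's code."""

    def __init__(self):
        self.done = False
        self.encoding = None
        self._head = b""

    def feed(self, chunk):
        if self.done or len(self._head) >= 4:
            return  # the first four bytes have already been examined
        self._head = (self._head + chunk)[:4]
        if len(self._head) < 4:
            return  # wait for more bytes before deciding
        for bom, name in BOMS:
            if self._head.startswith(bom):
                self.encoding = name
                self.done = True
                return


def sniff(lines):
    """Feed lines until the sniffer is done; return (encoding, lines_read)."""
    sniffer = BomSniffer()
    lines_read = 0
    for line in lines:
        lines_read += 1
        sniffer.feed(line)
        if sniffer.done:
            break  # encoding is known, so skip the rest of the input
    return sniffer.encoding, lines_read


data = codecs.BOM_UTF8 + b"first line\n" + b"more\n" * 1000
print(sniff(io.BytesIO(data)))  # ('UTF-8-SIG', 1)
```

Without a BOM the sniffer never sets `done`, so the loop still reads every line, which matches the behaviour before the early exit.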

I fixed the unit tests in the previous PR because I wanted to assure myself that this change introduces no regression (it appears not to).

cheers

@dan-blanchard (Member) commented

Nice catch!

dan-blanchard merged commit 2979943 into chardet:master on Apr 10, 2017
dan-blanchard mentioned this pull request on Apr 11, 2017
dan-blanchard added a commit that referenced this pull request Mar 12, 2026
…arade history

- Add contributor names with GitHub profile links to all changelog entries
- Add PR/issue links where available across all versions
- Add missing entries: 6.0.0.post1, max_bytes fix (#314), Codespaces (#312),
  LGPLv2.1 update (#307), ISO-8859-15 (#222), LGPL classifier (#255),
  Hypothesis testing (#66), UTF-16/32 BOM fix (#73), early exit (#103)
- Include charade fork releases (1.0.0–1.0.3) interleaved chronologically
- Prefix pre-merger versions with "chardet" or "charade" for clarity
- Add sphinx-lint to prek pre-commit hooks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>