Fix high CPU usage from log writer spin-wait, regex recompilation, and byte-by-byte tunnel I/O #197
Merged
atauenis merged 6 commits into atauenis:dev on Feb 2, 2026
Replace spin-wait with proper lock for log file writes. The previous implementation used a busy-wait loop that consumed CPU cycles while waiting for LogStreamWriterReady, causing high CPU usage under load. Co-Authored-By: Claude <noreply@anthropic.com>
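The change described in this commit can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: LockedLogWriter and LockedLogDemo are hypothetical names, and the real LogAgent may be structured differently. The point is only that callers wait on a monitor via lock instead of running an unbounded busy loop on a ready flag.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical stand-in for a log writer; not WebOne's actual LogAgent.
public sealed class LockedLogWriter
{
    private readonly object _sync = new object();
    private readonly TextWriter _writer;

    public LockedLogWriter(TextWriter writer)
    {
        _writer = writer;
    }

    public void WriteLine(string message)
    {
        // A thread that cannot take the lock waits on the monitor
        // instead of looping on a "ready" flag indefinitely.
        lock (_sync)
        {
            _writer.WriteLine(message);
            _writer.Flush();
        }
    }
}

public static class LockedLogDemo
{
    // Writes 1000 lines from parallel callers and returns how many lines arrived.
    public static int Run()
    {
        var sink = new StringWriter();
        var log = new LockedLogWriter(sink);
        Parallel.For(0, 1000, i => log.WriteLine("request " + i));
        return sink.ToString()
            .Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Length;
    }
}
```

Because every write happens inside the lock, lines from concurrent callers are never interleaved, and no separate ready flag is needed.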
Cache compiled Regex instances using ConcurrentDictionary to avoid recompiling the same patterns on every request. Patterns are compiled with a 5-second timeout to prevent ReDoS attacks. This significantly reduces CPU usage when processing requests through Edit rules, as the ~140 default regex patterns no longer need to be recompiled for each request. Co-Authored-By: Claude <noreply@anthropic.com>
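A caching helper along these lines might look like the sketch below. RegexCache and its members are hypothetical names chosen for illustration, and the use of RegexOptions.None is an assumption; the actual change may pass different options. Note that the TimeSpan given to the Regex constructor is a match timeout: a match that runs longer throws RegexMatchTimeoutException.

```csharp
using System;
using System.Collections.Concurrent;
using System.Text.RegularExpressions;

// Hypothetical helper; names are illustrative, not taken from WebOne.
public static class RegexCache
{
    private static readonly ConcurrentDictionary<string, Regex> Cache =
        new ConcurrentDictionary<string, Regex>();

    // Upper bound on a single match operation, to limit ReDoS exposure.
    private static readonly TimeSpan MatchTimeout = TimeSpan.FromSeconds(5);

    public static Regex Get(string pattern)
    {
        // The factory runs only when the pattern is not cached yet.
        // Under a race it may run more than once, but only one instance is stored.
        // RegexOptions.None is an assumption; RegexOptions.Compiled is a separate choice.
        return Cache.GetOrAdd(pattern,
            p => new Regex(p, RegexOptions.None, MatchTimeout));
    }

    public static int Count
    {
        get { return Cache.Count; }
    }
}
```

A hot path would then call RegexCache.Get(pattern).IsMatch(input) instead of new Regex(pattern).IsMatch(input), so each distinct pattern is parsed once per process rather than once per request.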
Replace byte-by-byte I/O with 8KB buffered reads/writes in tunnel servers. The previous implementation read and wrote one byte at a time, causing excessive system call overhead and high CPU usage during tunnel connections.
Affected servers:
- HttpSecurePassthroughServer (CONNECT passthrough)
- HttpSecureNonHttpServer (non-HTTP SSL tunneling)
- HttpSecureNonHttpDecryptServer (decrypted non-HTTP tunneling)
Co-Authored-By: Claude <noreply@anthropic.com>
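The buffered approach can be illustrated with a small copy loop. This is a sketch under assumptions: TunnelCopy.Pump is a hypothetical helper, and the real tunnel servers pump data in both directions between client and remote streams, which is omitted here.

```csharp
using System;
using System.IO;

// Hypothetical helper; not the actual tunnel server code.
public static class TunnelCopy
{
    // Copies everything from source to destination in chunks of up to bufferSize bytes.
    public static long Pump(Stream source, Stream destination, int bufferSize = 8192)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        int read;
        // Read returns the number of bytes actually read, which may be less
        // than the buffer size; 0 signals the end of the stream.
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Write only the bytes that were read in this iteration.
            destination.Write(buffer, 0, read);
            total += read;
        }
        destination.Flush();
        return total;
    }
}
```

On a NetworkStream, Read returns whatever data is currently available, up to the buffer size, rather than waiting for the buffer to fill. A larger buffer therefore should not by itself delay short messages, while bulk transfers need far fewer calls than one per byte.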
Owner
Thank you for these important fixes! The high CPU load is a rare bug that I had not caught manually in the last ~18 major versions. The simple loop-based lock was not a good idea, although it caused no problems under low load and when writes to the log succeeded. When writing to the log did fail, it caused a hard freeze with 100%+ CPU load. Now that is gone. The one-byte buffer for non-HTTPS tunnels was originally introduced for protocols based on short, variable-length messages, such as IRC, FTP, or Telnet, to reduce latency. But IRC seems to still work correctly even with 8K buffers, so the larger buffer is not really too expensive.
I was experiencing extremely high CPU usage (900%+) after only a bit of use when running WebOne in a Docker container on an Apple Silicon Mac. The proxy would become unusable and the container would consume all available CPU cores.
After investigating, I identified three independent issues that compound under load:
1. Log writer spin-wait
LogAgent.WriteLine spawns a Task for each log message that busy-waits with while (!LogStreamWriterReady) { }. Under high connection volume, these Tasks pile up and spin, burning CPU cycles.
2. Regex patterns recompiled on every request
The default configuration has ~140 Edit rules with regex patterns. Each request recompiles these patterns via new Regex() calls in hot paths like HttpTransit.ProcessTransit(). Regex compilation is expensive.
3. Byte-by-byte tunnel I/O
The tunnel servers (HttpSecurePassthroughServer, HttpSecureNonHttpServer, HttpSecureNonHttpDecryptServer) read and write one byte at a time using BinaryReader.ReadByte(). This causes excessive system call overhead for every byte transferred.
Solution
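- Replace the spin-wait with proper lock() synchronization
- Cache compiled Regex instances in a ConcurrentDictionary with 5-second timeout
- Replace byte-by-byte tunnel I/O with 8KB buffered reads/writes
Testing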
Note
This fix was developed with assistance from Claude Opus 4.5. All changes were tested on my local environment before submission.