Skip to content

Efficiency: Let thread do other tasks as long encoder/decoder is busy#215

Merged
hyperxpro merged 1 commit into
hyperxpro:mainfrom
mkarg:yield-when-busy
Jun 9, 2025
Merged

Efficiency: Let thread do other tasks as long encoder/decoder is busy#215
hyperxpro merged 1 commit into
hyperxpro:mainfrom
mkarg:yield-when-busy

Conversation

@mkarg

@mkarg mkarg commented Jun 5, 2025

Copy link
Copy Markdown
Contributor

Motivation:

Busy loops are blocking valuable threads, so other tasks are blocked while the current task spins waiting. This reduces concurrency without a need, hence makes inefficient use of the existing hardware.

Modification:

Using Thread.yield() in all busy waiting loops.

Result:

Other tasks are allowed to run while encode / decoder is busy.

@mkarg mkarg force-pushed the yield-when-busy branch from 8598e44 to d667766 Compare June 5, 2025 17:03
@hyperxpro hyperxpro requested a review from MarcoLotz June 5, 2025 20:29
@hyperxpro

Copy link
Copy Markdown
Owner

@mkarg There are test failures.

@hyperxpro

Copy link
Copy Markdown
Owner

Thread#yield has been notorious, at least what I have observed working on multiple different CPU architectures. I wonder if we should rather use a synchronized block for this or a synchronized with Thread#yield combined.

@mkarg

mkarg commented Jun 6, 2025

Copy link
Copy Markdown
Contributor Author

Thread#yield has been notorious, at least what I have observed working on multiple different CPU architectures. I wonder if we should rather use a synchronized block for this or a synchronized with Thread#yield combined.

I cannot second that, and it should not be necessary to synchronize (and it would uselessly cost resources), as all what we want solely is exactly what Thread.yield() claims to do: Let the thread do other work (not block it). What is the actual problem facing here? If needed I can discuss with OpenJDK team (NB: I am an OpenJDK contributor) to see what the cause is.

Edit: Just thinking out loudly: Maybe we like to add LockSupport.parkNanos(1) to cool down the CPU?

@mkarg mkarg force-pushed the yield-when-busy branch from d667766 to b04611f Compare June 6, 2025 10:59
@mkarg

mkarg commented Jun 6, 2025

Copy link
Copy Markdown
Contributor Author

@mkarg There are test failures.

Sorry for this. There was a typo. Fixed that. Please re-run the tests. NB: In fact, it was not a test failure but a compilation failure.

@hyperxpro

Copy link
Copy Markdown
Owner

I'd suggest talking to someone in OpenJDK because ARM behaves differently with this. We can keep LockSupport.parkNanos(1) as a last resort or maybe even rewrite this in a true async way.

@mkarg

mkarg commented Jun 7, 2025

Copy link
Copy Markdown
Contributor Author

I'd suggest talking to someone in OpenJDK because ARM behaves differently with this. We can keep LockSupport.parkNanos(1) as a last resort or maybe even rewrite this in a true async way.

Can you post your concerns / experience with arm wrt yield? So we understand better what to talk about.

@mkarg

mkarg commented Jun 7, 2025

Copy link
Copy Markdown
Contributor Author

I'd suggest talking to someone in OpenJDK because ARM behaves differently with this. We can keep LockSupport.parkNanos(1) as a last resort or maybe even rewrite this in a true async way.

True async sounds great, but is this supported by the invoked classes? If so, it would be much better than busy wait (with or without yield / parkNanos)! 🤩

@hyperxpro

Copy link
Copy Markdown
Owner

JDK Thread#yield can be ignored by the OS scheduler as per the doc, and it will most likely happen with ARM v7, although most v7s have a PAUSE instruction set.

Sure, it can happen with x86, but ARM is something I never use, so people using Brotli4j on those CPUs are most likely to notice. OpenJDK contributors working with ARM will have the best answer to this. Although it's very much CPU-dependent as well.

True async is the way to go, it's a JNI call, so we have to wrap the blocking call behind the scenes. I'd prefer a new class in a dedicated "async" package for this.

But you get the idea. This way, it fits better for async libraries or applications relying on native async support from Brotli4j.

@mkarg

mkarg commented Jun 9, 2025

Copy link
Copy Markdown
Contributor Author

Sure, async is the best option.

Speaking of yield, I still do not see the problem. In x86 yield will yield, while on arm it will be a no-op. A no-op cannot be worse than what we have today. So my contribution should not improse any problems other than most likely be useless on arm.

@hyperxpro

Copy link
Copy Markdown
Owner

Agreed.

Curiously enough, what will the cost of yield be if there is a no-op on ARM inside the while loop? As per the Maven Central report, most people are on x86 architecture, so it's fine. For ARM or others with no-op, we can bring the async package and ask them to move if they suspect performance regression.

@mkarg

mkarg commented Jun 9, 2025

Copy link
Copy Markdown
Contributor Author

When yield is a no-op (like assumed on arm), the costs for the JVM is literally zero, as due to the missing implementation the HotSpot compiler simply removes the call completely.

Implementing async is a great idea, but should go in a separate PR.

@hyperxpro

Copy link
Copy Markdown
Owner

Indeed, async will go in another PR. For now, let's address this only.
JIT should also take care after a few misses on yield.

@hyperxpro hyperxpro merged commit e08da01 into hyperxpro:main Jun 9, 2025
28 checks passed
@hyperxpro

Copy link
Copy Markdown
Owner

Thanks a lot! :)

@mkarg mkarg deleted the yield-when-busy branch June 9, 2025 17:53
@mkarg

mkarg commented Jun 15, 2025

Copy link
Copy Markdown
Contributor Author

Indeed, async will go in another PR. For now, let's address this only. JIT should also take care after a few misses on yield.

Actually I have no found a way for async, as encoder.c does not contain any means to register for notification on buffer state change.

@hyperxpro

Copy link
Copy Markdown
Owner

Yes, native code does not support that. Our best bet is to go with an async wrapper because writing async from the ground up will be challenging. We can always later implement native async without breaking Java APIs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants