Optimize readLong_b to read 32bit at once from the serial buffer. #95

Merged
breakintoprogram merged 1 commit into breakintoprogram:main from astralaster:optimize_upload_speed
Sep 24, 2023

@astralaster
Contributor

This small change optimizes the upload speed from the ez80 to the VDP. The attached benchmark shows an upload speed of 227 kbit/s with the current code and 749 kbit/s with my change. You can also "feel" the difference if you start games like rokky.

agon-bench.zip

@stevesims
Contributor

most awesome

I've already tested this and can confirm that it does indeed deliver the big bandwidth increase

@breakintoprogram please merge ASAP 😁

@breakintoprogram
Owner

Thanks. That looks reasonable to me. I'll merge and test later.

@breakintoprogram added the enhancement label Sep 20, 2023

@theflynn49

Awesome. It's surprising (to me) at first glance that this change makes such a notable difference!

@stevesims
Contributor

> Awesome. It's surprising (to me) at first glance that this change makes such a notable difference!

The reason for this, I believe, is that the previous version was essentially calling readByte_b 4 times, and each call runs through a while loop waiting for a byte to become available. If the comms buffer were full (or rather just had those 4 bytes available, which is fairly likely), there would always be bytes available, so that's 3x the number of checking loops needed. Additionally, the readBytes call here would be a simple memory copy, so there's no shuffling of bytes around to form the 32-bit word.
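To make the difference concrete, here is a rough sketch of the two shapes being compared. It is not the actual VDP source: `MockSerial`, `readByteBlocking`, `readLongBytewise` and `readLongBulk` are hypothetical names invented for illustration, and the low-byte-first wire order is an assumption. The `polls` counter only exists to show how many availability checks each variant performs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the serial receive buffer (not the real API).
struct MockSerial {
    std::vector<uint8_t> data;  // bytes already received
    std::size_t pos = 0;        // read cursor
    unsigned polls = 0;         // number of availability checks made

    std::size_t available() { ++polls; return data.size() - pos; }

    uint8_t read() {
        assert(pos < data.size());
        return data[pos++];
    }

    std::size_t readBytes(uint8_t* dst, std::size_t n) {
        assert(n <= data.size() - pos);
        std::memcpy(dst, data.data() + pos, n);  // one plain memory copy
        pos += n;
        return n;
    }
};

// Blocking single-byte read: waits until at least one byte is buffered.
// On real hardware bytes arrive asynchronously; with the mock the data
// is preloaded, so the loop exits on the first check.
uint8_t readByteBlocking(MockSerial& s) {
    while (s.available() == 0) { /* wait */ }
    return s.read();
}

// Old shape: four blocking byte reads, then shift/OR into a 32-bit word
// (assumes the low byte is sent first).
uint32_t readLongBytewise(MockSerial& s) {
    uint32_t v = readByteBlocking(s);
    v |= static_cast<uint32_t>(readByteBlocking(s)) << 8;
    v |= static_cast<uint32_t>(readByteBlocking(s)) << 16;
    v |= static_cast<uint32_t>(readByteBlocking(s)) << 24;
    return v;
}

// New shape: wait once for four bytes, then copy them straight into the
// word. This matches the bytewise result only on a little-endian CPU.
uint32_t readLongBulk(MockSerial& s) {
    while (s.available() < sizeof(uint32_t)) { /* wait */ }
    uint32_t v = 0;
    s.readBytes(reinterpret_cast<uint8_t*>(&v), sizeof v);
    return v;
}
```

With four bytes already buffered, the bytewise variant performs four availability checks and three shift/OR steps, while the bulk variant performs one check and one copy.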

I kinda get why it's a fairly big improvement - and conversely am also still surprised by how big an improvement it is. 😁

This gives me some ideas for further improvements that can be made, but those will have to wait a little while.

@astralaster
Contributor Author

astralaster commented Sep 21, 2023

I was a bit surprised as well. But you have to remember this is a 32-bit CPU. It's happiest moving 32 bits around at a time. Fetching 32 bits at once can cost basically the same as fetching just 8 bits.
The emulator, for example, runs the old code at the same speed, too. It's just that the esp32 was the limiting factor here. And now I can show that the ez80 is the limit. If you remove the last wait state currently in place for the flash memory, you even get 800 kbit/s.

@breakintoprogram merged commit b2b01f2 into breakintoprogram:main Sep 24, 2023
@astralaster deleted the optimize_upload_speed branch October 3, 2023 21:09
Labels: enhancement
Project status: Released
Participants: 4