eio_linux: make read_dir faster #219

Merged: talex5 merged 1 commit into ocaml-multicore:main from talex5:faster-read-dir on Jun 1, 2022

@talex5 talex5 commented Jun 1, 2022

This further improves the performance of directory reads by using a larger buffer and by performing all reads in a single systhread, rather than creating a new one for each chunk.

  • Read 152448 items in 1.188156 seconds with the old code.
  • Read 152448 items in 0.217013 seconds with the new code (about 5.5 times faster).

(continues #218)

This improves performance of directory reads by using a larger buffer
and by performing all reads in a single systhread, rather than creating
a new one for each chunk.

- `Read 152448 items in 1.188156 seconds` with the old code.
- `Read 152448 items in 0.217013 seconds` with the new code.

It also ensures we can already read at least one item.
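
To illustrate why the buffer size matters, here is a minimal C sketch (not the code from this PR) that reads a directory with the raw `getdents64` system call and reports how many calls were needed. The function name `count_entries`, its parameters, and the local `struct linux_dirent64` declaration (copied from the layout in the `getdents64(2)` man page) are assumptions made for this example. The whole read loop lives in one function, so it could be handed to a single worker thread as one unit of work; the threading itself is not shown.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Record layout written by getdents64(2), as described in its man page. */
struct linux_dirent64 {
    uint64_t       d_ino;
    int64_t        d_off;
    unsigned short d_reclen;
    unsigned char  d_type;
    char           d_name[];
};

/* Hypothetical helper: count the entries in `path` (usually including "."
 * and ".."), reading with a buffer of `buf_size` bytes. The number of
 * getdents64 calls made is stored in `*calls`. Returns -1 on error. */
long count_entries(const char *path, size_t buf_size, long *calls)
{
    int fd = open(path, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    char *buf = malloc(buf_size);
    if (buf == NULL) {
        close(fd);
        return -1;
    }

    long count = 0;
    *calls = 0;
    for (;;) {
        /* Each call fills the buffer with as many whole records as fit. */
        long n = syscall(SYS_getdents64, fd, buf, buf_size);
        (*calls)++;
        if (n < 0) {            /* read error */
            count = -1;
            break;
        }
        if (n == 0)             /* end of directory */
            break;
        /* Walk the variable-length records in this chunk. */
        for (long pos = 0; pos < n; ) {
            struct linux_dirent64 *d = (struct linux_dirent64 *)(buf + pos);
            count++;
            pos += d->d_reclen;
        }
    }

    free(buf);
    close(fd);
    return count;
}
```

Calling `count_entries(dir, 512, &calls)` and then `count_entries(dir, 65536, &calls)` on the same large directory should return the same count, with the larger buffer needing fewer system calls.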
@talex5 talex5 force-pushed the faster-read-dir branch from cfc0d95 to df430b4 Compare June 1, 2022 12:07

@patricoferris patricoferris left a comment

Great! :)) Thanks for the fixes in #218 as well

@talex5 talex5 merged commit f6b97a2 into ocaml-multicore:main Jun 1, 2022
@talex5 talex5 deleted the faster-read-dir branch June 1, 2022 12:38

talex5 commented Jun 1, 2022

BTW, I found a few other interesting details in https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/readdir64.c;h=e876d84b0249d720d33fd01aa0299e23ff6a2546;hb=HEAD:

  • It treats ENOENT as end-of-directory, and so does musl.
  • It skips entries with d_ino = 0 (saying they're deleted). It looks like this code started off as generic Unix code and got specialised to Linux, so this may not be relevant. musl doesn't have that check, anyway.
  • It uses st_blksize as a hint about a sensible buffer size. musl doesn't do this.

We can probably leave it for now and see if it causes any trouble.
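
For reference, the three behaviours listed above could be sketched in C roughly as follows. This is an illustration only, not glibc's or Eio's code: the helper names `is_end_of_dir`, `is_live_entry`, and `pick_buf_size`, and the `MIN_DIR_BUF` / `MAX_DIR_BUF` bounds, are all hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical bounds for the read buffer, chosen for illustration only. */
#define MIN_DIR_BUF ((size_t)32768)
#define MAX_DIR_BUF ((size_t)1048576)

/* `n` is the return value of a getdents64-style read and `err` is the errno
 * it left behind. A zero-length read ends the directory, and ENOENT is
 * treated the same way. */
int is_end_of_dir(long n, int err)
{
    return n == 0 || (n < 0 && err == ENOENT);
}

/* Entries whose inode number is 0 are treated as deleted and skipped. */
int is_live_entry(uint64_t d_ino)
{
    return d_ino != 0;
}

/* Use st_blksize as a hint for the buffer size, clamped to the bounds above.
 * Falls back to MIN_DIR_BUF if fstat fails or reports no usable hint. */
size_t pick_buf_size(int fd)
{
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_blksize <= 0)
        return MIN_DIR_BUF;

    size_t hint = (size_t)st.st_blksize;
    if (hint < MIN_DIR_BUF)
        return MIN_DIR_BUF;
    if (hint > MAX_DIR_BUF)
        return MAX_DIR_BUF;
    return hint;
}
```

A read loop would call `pick_buf_size` once after opening the directory, check `is_end_of_dir` after each read, and filter records with `is_live_entry` before returning names to the caller.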

talex5 added a commit to talex5/opam-repository that referenced this pull request Jun 28, 2022
CHANGES:

API changes:

- `Net.accept_sub` is deprecated in favour of `accept_fork` (@talex5 ocaml-multicore/eio#240).
  `Fiber.fork_on_accept`, which it used internally, has been removed.

- Allow short writes in `Read_source_buffer` (@talex5 ocaml-multicore/eio#239).
  The reader is no longer required to consume all the data in one go.
  Also, add `Linux_eio.Low_level.writev_single` to expose this behaviour directly.

- `Eio.Unix_perm` is now `Eio.Dir.Unix_perm`.

New features:

- Add `Eio.Mutex` (@TheLortex @talex5 ocaml-multicore/eio#223).

- Add `Eio.Buf_write` (@talex5 ocaml-multicore/eio#235).
  This is a buffered writer for Eio sinks, based on Faraday.

- Add `Eio_mock` library for testing (@talex5 ocaml-multicore/eio#228).
  At the moment it has mock flows and networks.

- Add `Eio_mock.Backend` (@talex5 ocaml-multicore/eio#237 ocaml-multicore/eio#238).
  Allows running tests without needing a dependency on eio_main.
  Also, as it is single-threaded, it can detect deadlocks in test code instead of just hanging.

- Add `Buf_read.{of_buffer, of_string, parse_string{,_exn}, return}` (@talex5 ocaml-multicore/eio#225).

- Add `<*>` combinator to `Buf_read.Syntax` (@talex5 ocaml-multicore/eio#227).

- Add `Eio.Dir.read_dir` (@patricoferris @talex5 ocaml-multicore/eio#207 ocaml-multicore/eio#218 ocaml-multicore/eio#219).

Performance:

- Add `Buf_read` benchmark and optimise it a bit (@talex5 ocaml-multicore/eio#230).

- Inline `Buf_read.consume` to improve performance (@talex5 ocaml-multicore/eio#232).

Bug fixes / minor changes:

- Allow IO to happen even if a fiber keeps yielding (@TheLortex @talex5 ocaml-multicore/eio#213).

- Fallback for `traceln` without an effect handler (@talex5 ocaml-multicore/eio#226).
  `traceln` now works outside of an event loop too.

- Check for cancellation when creating a non-protected child context (@talex5 ocaml-multicore/eio#222).

- eio_linux: handle EINTR when calling `getrandom` (@bikallem ocaml-multicore/eio#212).

- Update to cmdliner.1.1.0 (@talex5 ocaml-multicore/eio#190).