Skip to content

core: Add C++ classes for read-only memory-mapped files and UNIX file descriptors; Remove Zstandard's dependency on Boost.#445

Merged
LinZhihao-723 merged 9 commits into
y-scope:mainfrom
LinZhihao-723:mmap
Jun 17, 2024

Conversation

@LinZhihao-723

@LinZhihao-723 LinZhihao-723 commented Jun 16, 2024

Copy link
Copy Markdown
Member

Description

Motivation:

As we plan to publish an npm package for the JavaScript clp-ffi library, we need to remove standard Decompressor's boost dependency to build the necessary components properly. To achieve this milestone, we should replace boost's memory-mapped file with our own implementation. This change is first introduced in PR #387

Implementation

This PR introduces two classes necessary to wrap mmap system call:

  • FileDescriptor: This is a wrapper class to a raw file descriptor. With RAII, it is easier to handle failures/exceptions where a raw file descriptor is required. To handle to potential failure in the destructor, we provide a customizable callback if users don't want to swallow the error.
  • MemoryMappedFileView: This is a wrapper class to mmap system call. It maintains the mapped memory region using RAII, and provides methods to access an immutable view of the file. The memory region is unmapped in the destructor.

With the above two classes, we can replace boost's mmap support with our implementation in standard Decompressor.

Validation performed

  1. Ensure the code can be successfully built for all the relevant build targets.
  2. Ensure all existing unit tests are passed. The current streaming compression unit tests will call the above implementation to decompress zstd format files.
  3. Add new unit tests to test memory-mapped file functionalities separately. It also tests the special case where the mapped file is empty.

Comment thread components/core/src/clp/FileDescriptor.hpp
Comment thread components/core/tests/test-MemoryMappedFile.cpp Outdated
junhaoliao
junhaoliao previously approved these changes Jun 16, 2024

@junhaoliao junhaoliao left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm thanks!

I believe the code changes are sufficient to get rid of the "boost" header dependencies for Emscripten (WASM) compilation for the Zstd decompressor. Looking at cleaning up the CLP repo, can we now also remove references to boost::iostreams in the CMakeList.txt files? (May we need to replace all memeory-mapped file implementations with ours. )

@LinZhihao-723

LinZhihao-723 commented Jun 16, 2024

Copy link
Copy Markdown
Member Author

lgtm thanks!

I believe the code changes are sufficient to get rid of the "boost" header dependencies for Emscripten (WASM) compilation for the Zstd decompressor. Looking at cleaning up the CLP repo, can we now also remove references to boost::iostreams in the CMakeList.txt files? (May we need to replace all memeory-mapped file implementations with ours. )

I agree that we can do more cleaning. However, this is already a non-trivial PR. I think we should delay the rest of the cleanup to the future PRs. Arguably speaking, this PR should only contain two new classes. We should delay the boost deprecation in zstd decompressor to the next PR. What do u think? @kirkrodrigues

Comment thread components/core/tests/test-MemoryMappedFile.cpp Outdated
Comment thread components/core/src/clp/FileDescriptor.cpp Outdated
Comment thread components/core/src/clp/FileDescriptor.cpp Outdated
Comment thread components/core/src/clp/FileDescriptor.hpp Outdated
Comment thread components/core/src/clp/FileDescriptor.hpp Outdated
Comment thread components/core/tests/test-MemoryMappedFile.cpp Outdated
Comment thread components/core/tests/test-MemoryMappedFile.cpp Outdated
Comment thread components/core/tests/test-MemoryMappedFile.cpp Outdated
Comment thread components/core/src/clp/MemoryMappedFileView.hpp Outdated
Comment thread components/core/src/clp/streaming_compression/zstd/Decompressor.cpp Outdated
Comment thread components/core/src/clp/streaming_compression/zstd/Decompressor.cpp Outdated

@kirkrodrigues kirkrodrigues left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few things to resolve.

@kirkrodrigues

Copy link
Copy Markdown
Member

However, this is already a non-trivial PR. I think we should delay the rest of the cleanup to the future PRs. Arguably speaking, this PR should only contain two new classes. We should delay the boost deprecation in zstd decompressor to the next PR. What do u think? @kirkrodrigues

Leaving further clean-up to another PR sounds good.

LinZhihao-723 and others added 2 commits June 17, 2024 10:37
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Comment thread components/core/tests/test-MemoryMappedFile.cpp
Comment thread components/core/src/clp/FileDescriptor.cpp Outdated
Comment thread components/core/src/clp/FileDescriptor.hpp Outdated
Comment thread components/core/src/clp/FileDescriptor.hpp Outdated
Comment thread components/core/tests/test-MemoryMappedFile.cpp Outdated
LinZhihao-723 and others added 3 commits June 17, 2024 16:03
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>

@kirkrodrigues kirkrodrigues left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the PR title, how about:

core: Add C++ classes for read-only memory-mapped files and UNIX file descriptors; Remove Zstandard's dependency on Boost.

@LinZhihao-723

Copy link
Copy Markdown
Member Author

For the PR title, how about:

core: Add C++ classes for read-only memory-mapped files and UNIX file descriptors; Remove Zstandard's dependency on Boost.

lgtm

@LinZhihao-723 LinZhihao-723 changed the title Add support for read-only memory mapped file; Remove zstandard Decompressor's dependency on boost. core: Add C++ classes for read-only memory-mapped files and UNIX file descriptors; Remove Zstandard's dependency on Boost. Jun 17, 2024
@LinZhihao-723 LinZhihao-723 merged commit b335c11 into y-scope:main Jun 17, 2024
jackluo923 pushed a commit to jackluo923/clp that referenced this pull request Dec 4, 2024
… descriptors; Remove Zstandard's dependency on Boost. (y-scope#445)

Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
junhaoliao pushed a commit to junhaoliao/clp that referenced this pull request May 17, 2026
… descriptors; Remove Zstandard's dependency on Boost. (y-scope#445)

Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants