Skip to content

os/bluestore: add BtreeAllocator#41828

Merged
tchaikov merged 3 commits intoceph:masterfrom
tchaikov:wip-btree-alloc
Jun 24, 2021
Merged

os/bluestore: add BtreeAllocator#41828
tchaikov merged 3 commits intoceph:masterfrom
tchaikov:wip-btree-alloc

Conversation

@tchaikov
Copy link
Contributor

@tchaikov tchaikov commented Jun 12, 2021

because each AVL tree node needs to keep track of its children, the
overhead of each extent in AvlAllocator is relatively higher than that
of bitmap allocator, per node sizes 80 bytes, per mempool stats provided
by unittest_alloc_bench. while btree has lower overhead, as it keeps multiple
entries in each node / block, the node size of abseil's btree implementation
defaults to 256, so each node is able to contain up to 256 values. this means
we have lower overhead than the binary-trees like AVL and red-black tree.

but because the overhead of the rebalance of the btree is higher than that of
binary tree, its takes more CPU cycles when performing alloc-release
than its AVL version. but its upside is that its memory layout is
more compact, under some use case, it might be a better alternative
than AvlAllocator.

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

@tchaikov
Copy link
Contributor Author

tchaikov commented Jun 14, 2021

per the btree.stats which is not exposed by its wrapper,

  • average_bytes_per_value: 22.7556: each extent takes 22.75 bytes.
  • overhead: 40: each b-tree node (which contains up to 256 values) has 40 bytes overhead.

this is much lower than that of AVL tree, where each node contains a single extent, and it takes 80 bytes. please note, each value (extent) only uses 16 bytes for a uint64_t start and a uint64_t end.

@tchaikov tchaikov force-pushed the wip-btree-alloc branch 2 times, most recently from 19465bc to 03a396a Compare June 14, 2021 10:09
@tchaikov
Copy link
Contributor Author

tchaikov commented Jun 14, 2021

test AVL btree
test_alloc_bench 15358 ms 16719 ms
test_alloc_bench_90_300 37853 ms 46812 ms
test_alloc_bench_50_300 37880 ms 43598 ms
test_alloc_bench_10_300 37824 40372 ms
test_alloc_bench_seq/2 577766 ms 827261 ms

@tchaikov
Copy link
Contributor Author

tchaikov commented Jun 14, 2021

[ RUN      ] Queue.SpawnAsyncRequest
../src/test/rgw/test_rgw_dmclock_scheduler.cc:425: Failure
Value of: context.stopped()
  Actual: false
Expected: true
[  FAILED  ] Queue.SpawnAsyncRequest (0 ms)

https://tracker.ceph.com/issues/42788

@tchaikov
Copy link
Contributor Author

jenkins test make check

@tchaikov
Copy link
Contributor Author

jenkins test make check

@aclamk
Copy link
Contributor

aclamk commented Jun 17, 2021

I added a memory tracking to aging test: aclamk@0a2279e

Summary:

allocator fragmented allocs #frags free_score time mempool tcmalloc
avl 0.727767% 2.00015 0.756594 15590.9ms 3281490 3283234
bitmap 55.3145% 2.60743 0.583628 4836.3ms 18708994 18710322
btree 0.674791% 2.00038 0.756787 8716.12ms 1844074 1845818
stupid 1.40833% 2.04938 0.749268 29840ms 893859 895612

btree really looks impressive, Kefu @tchaikov !

I guess it really should add btree to mempool accounting.

Copy link
Contributor

@aclamk aclamk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent reduction of used memory!
Must move btree to mempool::bluestore_alloc.

@tchaikov
Copy link
Contributor Author

tchaikov commented Jun 17, 2021

@aclamk Adam, thank you for profiling the BtreeAllocator! i reformatted the test result in your comment as a table. hope it's fine by you =)

the btree block / node is more of an implementation details which are not exposed as part of its public interface. so i thought it'd be difficult to inject the MEMPOOL_CLASS_HELPERS macro and to define MEMPOOL_DEFINE_OBJECT_FACTORY() in an appropriate way. but it seems we can just pass mempool::bluestore_alloc::pool_allocator<T> to the btree container templates as their allocator template parameter. just updated with this change.

could you take another look?

@tchaikov
Copy link
Contributor Author

@aclamk i reran the aging test with your patch and updated the table in your comment.

@tchaikov
Copy link
Contributor Author

rebased against master.

@tchaikov
Copy link
Contributor Author

jenkins test make check

1 similar comment
@tchaikov
Copy link
Contributor Author

jenkins test make check

@tchaikov
Copy link
Contributor Author

@aclamk hi Adam, is there anything else i can do to improve this changeset?

Copy link
Contributor

@aclamk aclamk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good work.
In my opinion btree allocator should replace avl in any case.

There should be done some work on replacing avl->btree for 'hybrid' strategy. If so, new thresholds for switching from interval to bitmap mode must be calculated.

@tchaikov tchaikov merged commit 827c663 into ceph:master Jun 24, 2021
@tchaikov
Copy link
Contributor Author

@aclamk yeah, will port the hybrid strategy to btree allocator later on.

@tchaikov tchaikov deleted the wip-btree-alloc branch June 24, 2021 11:11
NUABO added a commit to NUABO/ceph that referenced this pull request Apr 28, 2024
BtreeAllocator was added in the commit
ceph#41828.
Its performance has advantages in some scenarios,
but we cannot configure to use Btree mode.

Signed-off-by: tan changzhi <544463199@qq.com>
@NUABO NUABO mentioned this pull request Apr 29, 2024
14 tasks
NUABO added a commit to NUABO/ceph that referenced this pull request Apr 29, 2024
BtreeAllocator was added in the commit
ceph#41828.
Its performance has advantages in some scenarios,
but we cannot configure to use Btree mode.

Fixes: https://tracker.ceph.com/issues/65678
Signed-off-by: tan changzhi <544463199@qq.com>
NUABO added a commit to NUABO/ceph that referenced this pull request Apr 29, 2024
BtreeAllocator was added in the commit
ceph#41828.
Its performance has advantages in some scenarios,
but we cannot configure to use Btree mode.

Fixes: https://tracker.ceph.com/issues/65678
Signed-off-by: tan changzhi <544463199@qq.com>
NUABO added a commit to NUABO/ceph that referenced this pull request Apr 29, 2024
BtreeAllocator was added in the commit
ceph#41828.
Its performance has advantages in some scenarios,
but we cannot configure to use btree mode.

Fixes: https://tracker.ceph.com/issues/65678
Signed-off-by: tan changzhi <544463199@qq.com>
NUABO added a commit to NUABO/ceph that referenced this pull request Apr 29, 2024
BtreeAllocator was added in the commit
ceph#41828.
Its performance has advantages in some scenarios,
but we cannot configure to use btree mode.

Fixes: https://tracker.ceph.com/issues/65678
Signed-off-by: tan changzhi <544463199@qq.com>
baum pushed a commit to ceph/ceph-ci that referenced this pull request Aug 1, 2024
BtreeAllocator was added in the commit
ceph/ceph#41828.
Its performance has advantages in some scenarios,
but we cannot configure to use btree mode.

Fixes: https://tracker.ceph.com/issues/65678
Signed-off-by: tan changzhi <544463199@qq.com>
baum pushed a commit to ceph/ceph-ci that referenced this pull request Aug 1, 2024
BtreeAllocator was added in the commit
ceph/ceph#41828.
Its performance has advantages in some scenarios,
but we cannot configure to use btree mode.

Fixes: https://tracker.ceph.com/issues/65678
Signed-off-by: tan changzhi <544463199@qq.com>
pponnuvel pushed a commit to pponnuvel/ceph that referenced this pull request Aug 29, 2024
BtreeAllocator was added in the commit
ceph#41828.
Its performance has advantages in some scenarios,
but we cannot configure to use btree mode.

Fixes: https://tracker.ceph.com/issues/65678
Signed-off-by: tan changzhi <544463199@qq.com>
(cherry picked from commit 8d47912)
pponnuvel pushed a commit to pponnuvel/ceph that referenced this pull request Aug 29, 2024
BtreeAllocator was added in the commit
ceph#41828.
Its performance has advantages in some scenarios,
but we cannot configure to use btree mode.

Fixes: https://tracker.ceph.com/issues/65678
Signed-off-by: tan changzhi <544463199@qq.com>
(cherry picked from commit 8d47912)
pponnuvel pushed a commit to pponnuvel/ceph that referenced this pull request Aug 29, 2024
BtreeAllocator was added in the commit
ceph#41828.
Its performance has advantages in some scenarios,
but we cannot configure to use btree mode.

Fixes: https://tracker.ceph.com/issues/65678
Signed-off-by: tan changzhi <544463199@qq.com>
(cherry picked from commit 8d47912)
pponnuvel pushed a commit to pponnuvel/ceph that referenced this pull request Nov 26, 2024
BtreeAllocator was added in the commit
ceph#41828.
Its performance has advantages in some scenarios,
but we cannot configure to use btree mode.

Fixes: https://tracker.ceph.com/issues/65678
Signed-off-by: tan changzhi <544463199@qq.com>
(cherry picked from commit 8d47912)
pponnuvel pushed a commit to pponnuvel/ceph that referenced this pull request Jan 13, 2025
BtreeAllocator was added in the commit
ceph#41828.
Its performance has advantages in some scenarios,
but we cannot configure to use btree mode.

Fixes: https://tracker.ceph.com/issues/65678
Signed-off-by: tan changzhi <544463199@qq.com>
(cherry picked from commit 8d47912)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants