
neorados_pool_test: Call create_pool at the end of NeoRados delete_pool related test#61449

Merged
Matan-B merged 1 commit into ceph:main from mohit84:issue_69405 on Jan 28, 2025

Conversation

mohit84 (Contributor) commented Jan 20, 2025

Every test case calls the CoTearDown function at the end. CoTearDown calls delete_pool for each pool name in the local set <created_pools>; if a pool does not exist during CoTearDown, delete_pool throws an exception, so the test expects every created pool to still exist when the test case finishes.

Solution: Call create_pool to avoid an error during CoTearDown

Fixes: https://tracker.ceph.com/issues/69405

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

mohit84 (author) commented Jan 20, 2025

jenkins test make check

Matan-B (Contributor) left a comment:

This will work; it seems that CoTearDown will unconditionally try to delete any created pool (even if it was already deleted).
@mohit84, what do you think about updating this test's delete_pool to also remove the deleted pool from created_pools instead?
That way we can avoid the double deletion on teardown.

mohit84 (author) commented Jan 20, 2025

This will work; it seems that CoTearDown will unconditionally try to delete any created pool (even if it was already deleted). @mohit84, what do you think about updating this test's delete_pool to also remove the deleted pool from created_pools instead? That way we can avoid the double deletion on teardown.

We can, but I am not sure of the purpose of this; I think @adamemerson can confirm what the best way to resolve it is. Actually it is not unconditional: it only tries to delete a pool if the pool exists in the created_pools set. The issue is that the set is not kept in sync. Ideally we would erase the pool name during delete_pool and CoTearDown would then work, but when I tried that it did not work because of the asynchronous nature of co_await. We would have to use a mutex to synchronize the created_pools set.

Matan-B (Contributor) left a comment:

We can, but I am not sure of the purpose of this; I think @adamemerson can confirm what the best way to resolve it is. Actually it is not unconditional: it only tries to delete a pool if the pool exists in the created_pools set. The issue is that the set is not kept in sync. Ideally we would erase the pool name during delete_pool and CoTearDown would then work, but when I tried that it did not work because of the asynchronous nature of co_await. We would have to use a mutex to synchronize the created_pools set.

The purpose would be to not add an extra "create pool" call, which is unrelated to the test case, just to avoid triggering errors on teardown.

The test case of e.g. PoolDelete creates a pool and deletes it.
With the current patch, the test actually:

  1. creates a pool
  2. deletes it
  3. creates a pool again

Regarding the sync issues you mentioned, I'm not sure I understand. When using co_await we wait until the function's future is ready.

mohit84 (author) commented Jan 20, 2025

We can, but I am not sure of the purpose of this; I think @adamemerson can confirm what the best way to resolve it is. Actually it is not unconditional: it only tries to delete a pool if the pool exists in the created_pools set. The issue is that the set is not kept in sync. Ideally we would erase the pool name during delete_pool and CoTearDown would then work, but when I tried that it did not work because of the asynchronous nature of co_await. We would have to use a mutex to synchronize the created_pools set.

The purpose would be to not add an extra "create pool" call, which is unrelated to the test case, just to avoid triggering errors on teardown.

The test case of e.g. PoolDelete creates a pool and deletes it. With the current patch, the test actually:

  1. creates a pool
  2. deletes it
  3. creates a pool again

Regarding the sync issues you mentioned, I'm not sure I understand. When using co_await we wait until the function's future is ready.

Actually it was my understanding gap: after erasing a pool name, the test case crashed because CoTearDown was trying to iterate over the original set. After making it iterate over a local copy, I was able to resolve the crash.

mohit84 (author) commented Jan 20, 2025

jenkins test make check

Matan-B (Contributor) left a comment:

Actually it was my understanding gap: after erasing a pool name, the test case crashed because CoTearDown was trying to iterate over the original set. After making it iterate over a local copy, I was able to resolve the crash.

This looks much better, I'm not sure why the local copy is needed though. Can you please explain?

nit: PR title is outdated.

    /// \brief Delete pool used for testing
    boost::asio::awaitable<void> CoTearDown() override {
      for (const auto& name : created_pools) try {
      auto pools_to_delete = created_pools;
Contributor:

Why can't we iterate over created_pools?

mohit84 (author):

It is required because delete_pool erases the pool name from created_pools, and CoTearDown calls delete_pool while looping over the pools; iterating over created_pools directly would invalidate the loop. I observed that without this change a test case crashed.

mohit84 (author) commented Jan 21, 2025

Actually it was my understanding gap: after erasing a pool name, the test case crashed because CoTearDown was trying to iterate over the original set. After making it iterate over a local copy, I was able to resolve the crash.

This looks much better, I'm not sure why the local copy is needed though. Can you please explain?

nit: PR title is outdated.

It is required because delete_pool erases the pool name from created_pools, and CoTearDown calls delete_pool while looping over the pools; iterating over created_pools directly would invalidate the loop. I observed that without this change a test case crashed.

Every test case calls the CoTearDown function at the end, and that
function calls delete_pool for each pool name in the local set
<created_pools>. If a pool does not exist during CoTearDown,
delete_pool throws an exception, so the test expects every created
pool to still exist when the test case finishes.

Solution: During delete_pool, erase the pool name from the local set,
          and have CoTearDown iterate over a local copy instead of
          the original set.

Fixes: https://tracker.ceph.com/issues/69405
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
mohit84 (author) commented Jan 21, 2025

Actually it was my understanding gap: after erasing a pool name, the test case crashed because CoTearDown was trying to iterate over the original set. After making it iterate over a local copy, I was able to resolve the crash.

This looks much better, I'm not sure why the local copy is needed though. Can you please explain?

nit: PR title is outdated.

done

mohit84 (author) commented Jan 21, 2025

jenkins test make check

Matan-B (Contributor) left a comment:

I'm not sure about the local-copy iteration, but I prefer getting this one merged since it causes failures in every rados_api_tests job. We can improve this in a follow-up PR (cc @adamemerson).

Matan-B (Contributor) commented Jan 27, 2025

@mohit84, can you please run this one through the suite?
I think we can filter for rados_api_tests only, as it impacts only that suite.

Matan-B (Contributor) commented Jan 27, 2025

jenkins test make check arm64

adamemerson (Contributor) left a comment:

You could also catch and swallow the exception, I think. Though considering how few tests there were, the 'cleanup' phase is probably overkill and we could just get rid of it. I was expecting to be creating and listing large numbers of pools when I wrote it.

mohit84 (author) commented Jan 28, 2025

@mohit84, can you please run this one through the suite? I think we can filter for rados_api_tests only, as it impacts only that suite.

The test passed:
https://pulpito.ceph.com/moagrawa-2025-01-28_00:19:22-crimson-rados-issue_69405-distro-crimson-smithi/

Matan-B (Contributor) commented Jan 28, 2025

The test passed: https://pulpito.ceph.com/moagrawa-2025-01-28_00:19:22-crimson-rados-issue_69405-distro-crimson-smithi/

Classic run: https://pulpito.ceph.com/matan-2025-01-28_09:11:13-rados-issue_69405-distro-default-smithi/

2025-01-28T09:33:10.698 INFO:tasks.workunit.client.0.smithi102.stdout:                     pool: Running main() from gmock_main.cc
2025-01-28T09:33:10.698 INFO:tasks.workunit.client.0.smithi102.stdout:                     pool: [==========] Running 6 tests from 1 test suite.
2025-01-28T09:33:10.698 INFO:tasks.workunit.client.0.smithi102.stdout:                     pool: [----------] Global test environment set-up.
2025-01-28T09:33:10.698 INFO:tasks.workunit.client.0.smithi102.stdout:                     pool: [----------] 6 tests from NeoRadosPools
2025-01-28T09:33:10.699 INFO:tasks.workunit.client.0.smithi102.stdout:                     pool: [ RUN      ] NeoRadosPools.PoolList
2025-01-28T09:33:10.699 INFO:tasks.workunit.client.0.smithi102.stdout:                     pool: [       OK ] NeoRadosPools.PoolList (1635 ms)
2025-01-28T09:33:10.699 INFO:tasks.workunit.client.0.smithi102.stdout:                     pool: [ RUN      ] NeoRadosPools.PoolLookup
2025-01-28T09:33:10.699 INFO:tasks.workunit.client.0.smithi102.stdout:                     pool: [       OK ] NeoRadosPools.PoolLookup (2005 ms)
2025-01-28T09:33:10.699 INFO:tasks.workunit.client.0.smithi102.stdout:                     pool: [ RUN      ] NeoRadosPools.PoolLookupOtherInstance
2025-01-28T09:33:10.699 INFO:tasks.workunit.client.0.smithi102.stdout:                     pool: [       OK ] NeoRadosPools.PoolLookupOtherInstance (2030 ms)
2025-01-28T09:33:10.699 INFO:tasks.workunit.client.0.smithi102.stdout:                     pool: [ RUN      ] NeoRadosPools.PoolDelete
2025-01-28T09:33:10.699 INFO:tasks.workunit.client.0.smithi102.stdout:                     pool: [       OK ] NeoRadosPools.PoolDelete (2021 ms)
2025-01-28T09:33:10.699 INFO:tasks.workunit.client.0.smithi102.stdout:                     pool: [ RUN      ] NeoRadosPools.PoolCreateDelete
2025-01-28T09:33:10.699 INFO:tasks.workunit.client.0.smithi102.stdout:                     pool: [       OK ] NeoRadosPools.PoolCreateDelete (2000 ms)
2025-01-28T09:33:10.699 INFO:tasks.workunit.client.0.smithi102.stdout:                     pool: [ RUN      ] NeoRadosPools.PoolCreateWithCrushRule
2025-01-28T09:33:10.699 INFO:tasks.workunit.client.0.smithi102.stdout:                     pool: [       OK ] NeoRadosPools.PoolCreateWithCrushRule (2016 ms)
2025-01-28T09:33:10.699 INFO:tasks.workunit.client.0.smithi102.stdout:                     pool: [----------] 6 tests from NeoRadosPools (11707 ms total)
2025-01-28T09:33:10.700 INFO:tasks.workunit.client.0.smithi102.stdout:                     pool:

@Matan-B Matan-B merged commit e83b124 into ceph:main Jan 28, 2025
4 checks passed

Projects

Status: Merged (Pre Tentacle Freeze)
