Bug #71114

closed

rgw: bucket notification tests against different endpoints (squid)

Added by J. Eric Ivancich about 1 year ago. Updated about 1 year ago.

Status: Duplicate
Priority: Normal
Target version: -
% Done: 0%
Source:
Backport: squid, reef, tentacle
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Tags (freeform):
Merge Commit:
Fixed In:
Released In:
Upkeep Timestamp:

Description

Problematic run: https://qa-proxy.ceph.com/teuthology/hyelloji-2025-03-12_09:28:17-rgw-wip-hemanth-testing-2025-03-03-1505-squid-distro-default-smithi/8183652/teuthology.log

Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_teuthology_8e60109c2ce9ac81275a6501eacf5f84a082ec68/teuthology/contextutil.py", line 30, in nested
    vars.append(enter())
  File "/usr/lib/python3.10/contextlib.py", line 135, in __enter__
    return next(self.gen)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_b048a4a9022f27b92721b8006d592f2627fb0c62/qa/tasks/notification_tests.py", line 238, in run_tests
    remote.run(
  File "/home/teuthworker/src/git.ceph.com_teuthology_8e60109c2ce9ac81275a6501eacf5f84a082ec68/teuthology/orchestra/remote.py", line 535, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_teuthology_8e60109c2ce9ac81275a6501eacf5f84a082ec68/teuthology/orchestra/run.py", line 461, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_teuthology_8e60109c2ce9ac81275a6501eacf5f84a082ec68/teuthology/orchestra/run.py", line 161, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_teuthology_8e60109c2ce9ac81275a6501eacf5f84a082ec68/teuthology/orchestra/run.py", line 181, in _raise_for_status
    raise CommandFailedError(
teuthology.exceptions.CommandFailedError: Command failed ( bucket notification tests against different endpoints ) on smithi045 with status 1: 'BNTESTS_CONF=/home/ubuntu/cephtest/ceph/src/test/rgw/bucket_notification/bn-tests.client.0.conf /home/ubuntu/cephtest/ceph/src/test/rgw/bucket_notification/virtualenv/bin/python -m nose -s /home/ubuntu/cephtest/ceph/src/test/rgw/bucket_notification/test_bn.py -v -a kafka_test'
2025-03-17T14:30:02.095 INFO:tasks.notification_tests:Removing bn-tests.conf file...
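
For triage, the CommandFailedError line above can be split into its host, exit status, and command. This is a minimal sketch that assumes the message format seen in this one log; the function name and the regular expression are illustrative and are not part of teuthology.

```python
import re

# Pattern inferred from the single CommandFailedError line in this report.
# It is an assumption about the message format, not a teuthology API.
_FAILED_RE = re.compile(
    r"Command failed \((?P<label>.*?)\) on (?P<host>\S+) "
    r"with status (?P<status>\d+): '(?P<command>.*)'"
)


def parse_command_failed(line):
    """Return label, host, status and command from a failure line, or None."""
    match = _FAILED_RE.search(line)
    if match is None:
        return None
    return {
        "label": match.group("label").strip(),
        "host": match.group("host"),
        "status": int(match.group("status")),
        "command": match.group("command"),
    }
```

Applied to the line in this report, the sketch would report host smithi045, status 1, and the nose invocation that ran test_bn.py with the kafka_test attribute.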

Related issues 1 (0 open, 1 closed)

Is duplicate of rgw - Backport #67309: squid: test_bn.py -v -a kafka_test: Fatal glibc error: tpp.c:87 (__pthread_tpp_change_priority): assertion failed (Resolved, Yuval Lifshitz)

#1

Updated by J. Eric Ivancich about 1 year ago

  • Description updated (diff)

#2

Updated by J. Eric Ivancich about 1 year ago

  • Description updated (diff)

#3

Updated by Vinayak Tiwari about 1 year ago

After reviewing the failure logs, it appears that the primary issue stems from radosgw-admin cleanup commands failing due to pending garbage collection or incomplete user data purging. The Kafka notification test might contribute to this by leaving unprocessed notifications or orphaned objects. I can work on improving the test cleanup process and ensuring proper handling of GC before teardown. Please confirm if this approach is aligned.

#4

Updated by Vinayak Tiwari about 1 year ago

Vinayak Tiwari wrote in #note-3:

After reviewing the failure logs, it appears that the primary issue stems from radosgw-admin cleanup commands failing due to pending garbage collection or incomplete user data purging. The Kafka notification test might contribute to this by leaving unprocessed notifications or orphaned objects. I can work on improving the test cleanup process and ensuring proper handling of GC before teardown. Please confirm if this approach is aligned.

PR #63015 addresses cleanup issues in test_ps_s3_notification_kafka_idle_behaviour to prevent Teuthology failures. Let me know if any changes are needed.
https://github.com/ceph/ceph/pull/63015

#5

Updated by J. Eric Ivancich about 1 year ago

  • Pull request ID set to 63015

#6

Updated by J. Eric Ivancich about 1 year ago

  • Backport set to squid, reef, tentacle

#7

Updated by Yuval Lifshitz about 1 year ago

this looks like an RGW crash. no crash trace, but stdout log shows:

Fatal glibc error: tpp.c:87 (__pthread_tpp_change_priority): assertion failed: previous_prio == -1 || (previous_prio >= fifo_min_prio && previous_prio <= fifo_max_prio)
*** Caught signal (Segmentation fault) **
 in thread 7ff166120640 thread_name:io_context_pool

is it possible to check if this fix: https://github.com/ceph/ceph/pull/62337
was merged before or after the crash?
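
The crash signature quoted in this note can be searched for mechanically in a saved stdout log. The sketch below assumes only the two message shapes shown above (the glibc assertion line and the "Caught signal" line); the pattern list and the function name are illustrative.

```python
import re

# Signatures taken from the log excerpt in this note; other crashes may
# print different text, so treat this list as a starting point only.
CRASH_PATTERNS = (
    re.compile(r"Fatal glibc error: .*assertion failed"),
    re.compile(r"\*\*\* Caught signal \([^)]+\) \*\*"),
)


def find_crash_lines(lines):
    """Return (line_number, text) pairs for lines matching a crash signature."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        if any(pattern.search(line) for pattern in CRASH_PATTERNS):
            hits.append((lineno, line.rstrip("\n")))
    return hits
```

Passing an open log file as `lines` returns the matching lines with their line numbers, which helps locate the crash when no crash trace was collected.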

#8

Updated by Casey Bodley about 1 year ago

  • Is duplicate of Backport #67309: squid: test_bn.py -v -a kafka_test: Fatal glibc error: tpp.c:87 (__pthread_tpp_change_priority): assertion failed added

#9

Updated by Casey Bodley about 1 year ago

  • Status changed from New to Duplicate

Yuval Lifshitz wrote in #note-7:

this looks like an RGW crash. no crash trace, but stdout log shows:
[...]

is it possible to check if this fix: https://github.com/ceph/ceph/pull/62337
was merged before or after the crash?

from the teuthology log:

sha1: b048a4a9022f27b92721b8006d592f2627fb0c62

this corresponds to ceph-ci branch https://github.com/ceph/ceph-ci/commits/b048a4a9022f27b92721b8006d592f2627fb0c62

pull request 62337 adds a file common/async/yield_waiter.h that is not present on this branch: https://github.com/ceph/ceph-ci/tree/b048a4a9022f27b92721b8006d592f2627fb0c62/src/common/async

so it merged after. closing as a dup of 67309. thanks Yuval!
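
The check described in this note (does a file introduced by the fix exist at the tested sha1?) can be reproduced locally. This is a rough sketch that assumes a local clone containing the commit; the function name, the `run` parameter, and the clone path are hypothetical. It relies on `git cat-file -e <rev>:<path>`, which exits non-zero when the object does not exist.

```python
import subprocess


def file_exists_at_commit(repo_dir, sha1, path, run=subprocess.run):
    """Return True if `path` exists in the tree of commit `sha1`.

    `run` defaults to subprocess.run and is injectable for testing.
    """
    # `git cat-file -e <rev>:<path>` exits 0 only when the object exists.
    result = run(
        ["git", "-C", repo_dir, "cat-file", "-e", f"{sha1}:{path}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0
```

For example, `file_exists_at_commit("/path/to/ceph-ci", "b048a4a9022f27b92721b8006d592f2627fb0c62", "src/common/async/yield_waiter.h")` should return False if the file is absent as described above. A missing file only shows that the change is not in that tree; when the merge commit of the fix is known, `git merge-base --is-ancestor <merge-commit> <sha1>` is a more direct test.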
