mds: initialize epoch for quiescedb #57993

Merged
batrick merged 1 commit into ceph:main from batrick:i66449
Jun 23, 2024

@batrick batrick commented Jun 12, 2024

Fixes: https://tracker.ceph.com/issues/66449

Checklist

  • Tracker (select at least one)
    • References tracker ticket
  • Component impact
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • No doc update is appropriate
  • Tests (select at least one)
    • No tests

batrick commented Jun 12, 2024

jenkins test make check arm64

batrick commented Jun 13, 2024

This PR is under test in https://tracker.ceph.com/issues/66462.

batrick added a commit to batrick/ceph that referenced this pull request Jun 13, 2024
* refs/pull/57993/head:
	mds: initialize epoch for quiescedb

@leonid-s-usov leonid-s-usov left a comment

Thank you Patrick!

batrick commented Jun 15, 2024

It seems there is a follow-up issue but I'm not immediately sure why this wasn't also solved by this patch:

<error>
  <unique>0x42d</unique>
  <tid>29</tid>
  <threadname>quiesce_db_mgr</threadname>
  <kind>UninitCondition</kind>
  <what>Conditional jump or move depends on uninitialised value(s)</what>
  <stack>
    <frame>
      <ip>0x5E732A</ip>
      <obj>/usr/bin/ceph-mds</obj>
      <fn>UnknownInlinedFun</fn>
      <dir>/usr/src/debug/ceph-19.0.0-4302.g119deb3b.el9.x86_64/src/mds</dir>
      <file>QuiesceDb.h</file>
      <line>129</line>
    </frame>
    <frame>
      <ip>0x5E732A</ip>
      <obj>/usr/bin/ceph-mds</obj>
      <fn>QuiesceDbManager::leader_upkeep_db()</fn>
      <dir>/usr/src/debug/ceph-19.0.0-4302.g119deb3b.el9.x86_64/src/mds</dir>
      <file>QuiesceDbManager.cc</file>
      <line>889</line>
    </frame>
    <frame>
      <ip>0x5D923F</ip>
      <obj>/usr/bin/ceph-mds</obj>
      <fn>QuiesceDbManager::leader_upkeep(std::deque&lt;QuiesceDbPeerAck, std::allocator&lt;QuiesceDbPeerAck&gt; &gt;&amp;&amp;, std::deque&lt;QuiesceDbManager::RequestContext*, std::allocator&lt;QuiesceDbManager::RequestContext*&gt; &gt;&amp;&amp;)</fn>
      <dir>/usr/src/debug/ceph-19.0.0-4302.g119deb3b.el9.x86_64/src/mds</dir>
      <file>QuiesceDbManager.cc</file>
      <line>423</line>
    </frame>
    <frame>
      <ip>0x5DC856</ip>
      <obj>/usr/bin/ceph-mds</obj>
      <fn>QuiesceDbManager::quiesce_db_thread_main()</fn>
      <dir>/usr/src/debug/ceph-19.0.0-4302.g119deb3b.el9.x86_64/src/mds</dir>
      <file>QuiesceDbManager.cc</file>
      <line>111</line>
    </frame>
    <frame>
      <ip>0x210C80</ip>
      <obj>/usr/bin/ceph-mds</obj>
      <fn>QuiesceDbManager::QuiesceDbThread::entry()</fn>
      <dir>/usr/src/debug/ceph-19.0.0-4302.g119deb3b.el9.x86_64/src/mds</dir>
      <file>QuiesceDbManager.h</file>
      <line>217</line>
    </frame>
    <frame>
      <ip>0x4C89D14</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.2</obj>
      <fn>Thread::entry_wrapper()</fn>
      <dir>/usr/src/debug/ceph-19.0.0-4302.g119deb3b.el9.x86_64/src/common</dir>
      <file>Thread.cc</file>
      <line>87</line>
    </frame>
    <frame>
      <ip>0x4C89D30</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.2</obj>
      <fn>Thread::_entry_func(void*)</fn>
      <dir>/usr/src/debug/ceph-19.0.0-4302.g119deb3b.el9.x86_64/src/common</dir>
      <file>Thread.cc</file>
      <line>74</line>
    </frame>
    <frame>
      <ip>0x5952C01</ip>
      <obj>/usr/lib64/libc.so.6</obj>
      <fn>start_thread</fn>
    </frame>
    <frame>
      <ip>0x59D6F33</ip>
      <obj>/usr/lib64/libc.so.6</obj>
      <fn>clone</fn>
    </frame>
  </stack>
</error>

/teuthology/pdonnell-2024-06-13_18:50:03-fs-wip-pdonnell-testing-20240613.014923-debug-distro-default-smithi/7754213/remote/smithi119/log/valgrind/mds.b.log.gz

Or perhaps I stopped reproducing it locally for reasons unrelated to this fix, and the problem is somewhere else.
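
For context, the report above is an `UninitCondition`: a branch whose outcome depends on a value that was never written. Below is a minimal sketch of that class of bug and of the default-member-initializer pattern that avoids it. The names `VersionSketch` and `peer_is_behind` are hypothetical and are not taken from the Ceph source; this is not the code at `QuiesceDb.h:129`.

```cpp
#include <cstdint>
#include <tuple>

// Hypothetical stand-in for a versioned record; NOT the actual Ceph type.
struct VersionSketch {
  // Default member initializers: `VersionSketch v;` yields defined zeros
  // instead of indeterminate values.
  std::uint32_t epoch = 0;
  std::uint64_t set_version = 0;
};

// Lexicographic ordering: epoch first, then set_version.
inline bool operator<(const VersionSketch& a, const VersionSketch& b) {
  return std::tie(a.epoch, a.set_version) < std::tie(b.epoch, b.set_version);
}

// Hypothetical upkeep-style check that branches on the comparison.
// If either member were left uninitialized, this is the kind of
// conditional that Memcheck would flag.
inline bool peer_is_behind(const VersionSketch& local,
                           const VersionSketch& peer) {
  return peer < local;
}
```

Without the `= 0` initializers, a default-initialized `VersionSketch` with automatic storage would hold indeterminate values, and any comparison on it could trigger the same kind of Memcheck report.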

Fixes: https://tracker.ceph.com/issues/66449
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>

batrick commented Jun 19, 2024

jenkins test make check arm64

batrick commented Jun 22, 2024

This PR is under test in https://tracker.ceph.com/issues/66609.

batrick commented Jun 23, 2024

@batrick batrick merged commit 7b7a3ca into ceph:main Jun 23, 2024
@batrick batrick deleted the i66449 branch June 23, 2024 18:31