qa/tasks/cephfs/mount.py: defer deleting the netnses and bridge #35944
batrick merged 5 commits into ceph:master
Force-pushed from 3a2d705 to 6793049.
jenkins retest this please
qa/tasks/cephfs/mount.py
Outdated
| args = ["sudo", "bash", "-c", | ||
| "iptables -A FORWARD -o {0} -i ceph-brx -j ACCEPT".format(gw)] | ||
| args = ['sudo', 'iptables', '-A', 'FORWARD', '-o', '{0}'.format(gw), '-i', 'ceph-brx', '-j', 'ACCEPT'] | ||
| self.client_remote.run(args=args, timeout=(5*60), omit_sudo=False) |
While you're here fixing this, why not do:
self.run_shell_payload(f"""
sudo iptables ...
sudo iptables ...
""", omit_sudo=False)
Cool, I didn't notice this helper, will fix it.
Thanks.
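For illustration, a helper of this kind can be sketched as below. This is not teuthology's actual `run_shell_payload`; `build_payload_args` is a hypothetical name, `eth0` is an assumed gateway interface, and the only behaviour assumed is that the whole snippet is handed to `bash -c` as one script.

```python
import textwrap


def build_payload_args(payload, sudo=True):
    """Turn a multi-line shell snippet into one argv list for `bash -c`.

    Hypothetical helper for illustration; not the teuthology implementation.
    """
    # Remove the indentation that the triple-quoted string inherits
    # from the surrounding Python code.
    script = textwrap.dedent(payload).strip()
    args = ["bash", "-c", script]
    return ["sudo"] + args if sudo else args


gw = "eth0"  # assumed gateway interface name
args = build_payload_args(f"""
    set -e
    iptables -A FORWARD -o {gw} -i ceph-brx -j ACCEPT
    iptables -A FORWARD -i {gw} -o ceph-brx -j ACCEPT
""")
```

Grouping the commands into one payload keeps related rules together and means a single `set -e` covers all of them.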
jenkins retest this please
batrick
left a comment
flake8 run-test: commands[0] | flake8 --select=F,E9 --exclude=venv,.tox
./tasks/cephfs/mount.py:279:0: F541 f-string is missing placeholders
ERROR: InvocationError for command /home/jenkins-build/build/workspace/ceph-pull-requests/qa/.tox/flake8/bin/flake8 --select=F,E9 --exclude=venv,.tox (exited with code 1)
make check failure ^
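As a reminder of what F541 flags, here is a minimal sketch. The command strings are modelled on the ones in this PR, and `eth0` is an assumed interface name used only for illustration.

```python
gw = "eth0"  # assumed interface name, for illustration only

# An f-string with no {placeholders} is what pyflakes reports as F541:
#     cmd = f"sudo ip link set ceph-brx up"
# Dropping the f prefix fixes it without changing the value.
static_cmd = "sudo ip link set ceph-brx up"

# Keep the prefix only where something is actually interpolated.
dynamic_cmd = f"sudo iptables -A FORWARD -o {gw} -i ceph-brx -j ACCEPT"
```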
qa/tasks/cephfs/mount.py
Outdated
| # This will cleanup the stale netnses, which are from the | ||
| # last failed test cases. | ||
| def cleanup_stale_netnses_and_bridge(remote): |
Let's make this a static method of CephFSMount.
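A rough sketch of the shape being asked for, under stated assumptions: `FakeRemote` is a hypothetical stand-in that only records commands, and the method body is reduced to a single placeholder command rather than the real cleanup logic.

```python
class FakeRemote:
    """Hypothetical stand-in for a remote host; records commands only."""

    def __init__(self):
        self.commands = []

    def run(self, args):
        self.commands.append(args)


class CephFSMount:
    @staticmethod
    def cleanup_stale_netnses_and_bridge(remote):
        # No `self`: the cleanup only needs the remote host, so it can be
        # called before any mount object has been constructed.
        # Placeholder body; the real cleanup does more than this.
        remote.run(args=["sudo", "ip", "link", "delete", "ceph-brx"])


remote = FakeRemote()
CephFSMount.cleanup_stale_netnses_and_bridge(remote)
```

Keeping the function on the class namespaces it with the mount code while still allowing callers to invoke it without an instance.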
Force-pushed from 5f075a3 to 0af0bf7.
Fixed it. Thx.
jenkins test dashboard backend
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Done, thanks.
jenkins retest this please
Force-pushed from f1e2b98 to 5c3db71.
jenkins test dashboard backend
batrick
left a comment
Found this error in QA:
2020-07-28T04:11:10.187 INFO:teuthology.orchestra.run.smithi094:> (cd /home/ubuntu/cephtest/mnt.0 && exec sudo bash -c '
2020-07-28T04:11:10.188 INFO:teuthology.orchestra.run.smithi094:> set -e
2020-07-28T04:11:10.188 INFO:teuthology.orchestra.run.smithi094:> sudo ip link add name ceph-brx type bridge
2020-07-28T04:11:10.189 INFO:teuthology.orchestra.run.smithi094:> sudo ip addr flush dev ceph-brx
2020-07-28T04:11:10.189 INFO:teuthology.orchestra.run.smithi094:> sudo ip link set ceph-brx up
2020-07-28T04:11:10.190 INFO:teuthology.orchestra.run.smithi094:> sudo ip addr add 192.168.255.254/16 brd 192.168.255.255 dev ceph-brx
2020-07-28T04:11:10.191 INFO:teuthology.orchestra.run.smithi094:> ')
2020-07-28T04:11:10.232 INFO:teuthology.orchestra.run.smithi094.stderr:bash: line 0: cd: /home/ubuntu/cephtest/mnt.0: No such file or directory
From: /ceph/teuthology-archive/pdonnell-2020-07-28_03:46:25-fs-wip-pdonnell-testing-20200728.022107-distro-basic-smithi/5262950/teuthology.log
Xiubo, please run through teuthology when you've updated your PR so we can get this ready to merge ASAP.
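The log shows the payload wrapped in a `cd` into the mount point, which does not exist at that stage, so the bridge commands never run. A small sketch of the failure mode follows; `run_in` is a hypothetical function that imitates the `(cd DIR && exec ...)` wrapping seen in the log using plain `sh`, and the missing directory is a temporary path rather than the real one.

```python
import os
import shlex
import subprocess
import tempfile

# A path that is guaranteed not to exist, standing in for the mount point.
missing = os.path.join(tempfile.mkdtemp(), "mnt.0")


def run_in(cwd, script):
    """Imitate the `(cd DIR && exec sh -c SCRIPT)` wrapping from the log."""
    wrapped = f"cd {shlex.quote(cwd)} && exec sh -c {shlex.quote(script)}"
    return subprocess.run(["sh", "-c", wrapped],
                          capture_output=True, text=True)


failed = run_in(missing, "echo bridge-setup")  # cd fails, script never runs
ok = run_in("/", "echo bridge-setup")          # "/" always exists
```

Running network setup from a directory that always exists decouples it from the lifetime of the mount point.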
qa/tasks/cephfs/mount.py
Outdated
| sudo ip addr flush dev ceph-brx | ||
| sudo ip link set ceph-brx up | ||
| sudo ip addr add {ip}/{mask} brd {brd} dev ceph-brx | ||
| """, timeout=(5*60), omit_sudo=False) |
| """, timeout=(5*60), omit_sudo=False) | |
| """, timeout=(5*60), omit_sudo=False, cwd='/') |
qa/tasks/cephfs/mount.py
Outdated
| sudo iptables -A FORWARD -o {gw} -i ceph-brx -j ACCEPT | ||
| sudo iptables -A FORWARD -i {gw} -o ceph-brx -j ACCEPT | ||
| sudo iptables -t nat -A POSTROUTING -s {ip}/{mask} -o {gw} -j MASQUERADE | ||
| """, timeout=(5*60), omit_sudo=False) |
| """, timeout=(5*60), omit_sudo=False) | |
| """, timeout=(5*60), omit_sudo=False, cwd='/') |
qa/tasks/cephfs/mount.py
Outdated
| set -e | ||
| sudo ip netns add {self.netns_name} | ||
| sudo ip netns set {self.netns_name} {nsid} | ||
| """, timeout=(5*60), omit_sudo=False) |
| """, timeout=(5*60), omit_sudo=False) | |
| """, timeout=(5*60), omit_sudo=False, cwd='/') |
qa/tasks/cephfs/mount.py
Outdated
| sudo ip netns exec {self.netns_name} ip link set veth0 up | ||
| sudo ip netns exec {self.netns_name} ip link set lo up | ||
| sudo ip netns exec {self.netns_name} ip route add default via {brxip} | ||
| """, timeout=(5*60), omit_sudo=False) |
qa/tasks/cephfs/mount.py
Outdated
| set -e | ||
| sudo ip link set brx.{nsid} up | ||
| sudo ip link set dev brx.{nsid} master ceph-brx | ||
| """, timeout=(5*60), omit_sudo=False) |
qa/tasks/cephfs/mount.py
Outdated
| sudo ip link set brx.{self.nsid} down | ||
| sudo ip link delete dev brx.{self.nsid} | ||
| sudo ip netns delete {self.netns_name} | ||
| """, timeout=(5*60), omit_sudo=False) |
qa/tasks/cephfs/mount.py
Outdated
| set -e | ||
| sudo ip link set ceph-brx down | ||
| sudo ip link delete ceph-brx | ||
| """, timeout=(5*60), omit_sudo=False) |
qa/tasks/cephfs/mount.py
Outdated
| sudo iptables -D FORWARD -o {gw} -i ceph-brx -j ACCEPT | ||
| sudo iptables -D FORWARD -i {gw} -o ceph-brx -j ACCEPT | ||
| sudo iptables -t nat -D POSTROUTING -s {ip}/{mask} -o {gw} -j MASQUERADE | ||
| """, timeout=(5*60), omit_sudo=False) |
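The `-D` rules here have to mirror the `-A` rules added during setup exactly, otherwise stale rules are left behind. One way to keep them in sync is to generate both from a single template. The sketch below assumes that approach; `iptables_rules` is a hypothetical function and the interface and subnet values are placeholders.

```python
def iptables_rules(action, gw, ip, mask):
    """Return the forwarding/NAT rules with -A (add) or -D (delete)."""
    if action not in ("-A", "-D"):
        raise ValueError("action must be -A or -D")
    return [
        f"iptables {action} FORWARD -o {gw} -i ceph-brx -j ACCEPT",
        f"iptables {action} FORWARD -i {gw} -o ceph-brx -j ACCEPT",
        f"iptables -t nat {action} POSTROUTING -s {ip}/{mask} -o {gw} -j MASQUERADE",
    ]


# Placeholder values, chosen only for illustration.
setup = iptables_rules("-A", "eth0", "192.168.0.0", 16)
teardown = iptables_rules("-D", "eth0", "192.168.0.0", 16)
```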
Here's what I used to test:
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Once the test cases have run and the ceph-brx bridge has been set up,
its config is saved in "/etc/sysconfig/network-scripts/ifcfg-ceph-brx"
or somewhere else, and it is kept after the ceph-brx bridge is removed.
The next time the ceph-brx bridge is created, the saved config is read
back, so configuring the bridge again fails with an error like:
"RTNETLINK answers: File exists"
Flush the bridge's addresses before configuring it.
Fixes: https://tracker.ceph.com/issues/45817
Signed-off-by: Xiubo Li <xiubli@redhat.com>
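The ordering this commit describes can be sketched as follows, using the command sequence and address from the QA log earlier in the thread. `bridge_setup_script` is a hypothetical function, and deriving the broadcast address with the standard `ipaddress` module is an assumption for the example, not necessarily how mount.py computes it.

```python
import ipaddress


def bridge_setup_script(cidr, bridge="ceph-brx"):
    """Build the bridge setup payload, flushing addresses before adding one."""
    iface = ipaddress.ip_interface(cidr)
    brd = iface.network.broadcast_address
    return "\n".join([
        "set -e",
        f"ip link add name {bridge} type bridge",
        # Drop any address restored from a saved config so that the
        # following `ip addr add` does not hit "File exists".
        f"ip addr flush dev {bridge}",
        f"ip link set {bridge} up",
        f"ip addr add {iface.with_prefixlen} brd {brd} dev {bridge}",
    ])


script = bridge_setup_script("192.168.255.254/16")
```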
If previous test cases failed, their netnses and bridge are left behind.
Remove them when new test cases begin.
Fixes: https://tracker.ceph.com/issues/45806
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Sure, fixed them all and I am now running the test, the branch is Thanks.
jenkins test dashboard backend
Updated it with a small fix. Some test cases call mount_a.kill() first, which suspends the netns by bringing the network interface down, and only later call kill_cleanup() to do the unmount. If the netns is reused in the next test case it stays suspended, so we need to resume it.
The netnses may be created and deleted many times over the whole test run,
so defer cleaning them up until the last mountpoint is unmounted or the
test is exiting.
Fixes: https://tracker.ceph.com/issues/46282
Signed-off-by: Xiubo Li <xiubli@redhat.com>
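The deferral described in this commit amounts to tracking which mountpoints are still active and tearing down only when none remain. A minimal sketch of that idea, assuming a simple in-memory registry: `NetnsRegistry` and its method names are hypothetical and are not taken from mount.py.

```python
class NetnsRegistry:
    """Defer shared network cleanup until the last mountpoint goes away."""

    def __init__(self, destroy):
        self._destroy = destroy   # callback that removes netnses and bridge
        self._mounted = set()
        self._created = False

    def mount(self, mount_id):
        # The shared resources are considered set up from the first mount on.
        self._created = True
        self._mounted.add(mount_id)

    def umount(self, mount_id):
        self._mounted.discard(mount_id)
        if not self._mounted:
            self.close()

    def close(self):
        # Also meant to be called when the test is exiting.
        if self._created:
            self._destroy()
            self._created = False


calls = []
registry = NetnsRegistry(lambda: calls.append("cleanup"))
registry.mount("mnt.0")
registry.mount("mnt.1")
registry.umount("mnt.0")
after_first_umount = list(calls)  # still empty: mnt.1 is mounted
registry.umount("mnt.1")          # last mountpoint gone, cleanup runs once
```

Because `close()` checks `_created`, calling it again at test exit after the last unmount is harmless.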
Test done and passed, please see: https://pulpito.ceph.com/xiubli-2020-07-30_04:53:06-fs-wip-lxbsz-testing-20200730-0901-distro-basic-smithi/
https://pulpito.ceph.com/?branch=wip-pdonnell-testing-20200730.193838 Failures known/unrelated.
Fixes: https://tracker.ceph.com/issues/46282
Signed-off-by: Xiubo Li <xiubli@redhat.com>