selftests/bpf: merge most of test_btf into test_progs by kernel-patches-bot · Pull Request #59 · kernel-patches/bpf

kernel-patches-bot · 2020-09-15T01:48:33Z

Pull request for series with
subject: selftests/bpf: merge most of test_btf into test_progs
version: 2
url: https://patchwork.ozlabs.org/project/netdev/list/?series=201720

kernel-patches-bot · 2020-09-15T01:48:34Z

Master branch: bf74a37
series: https://patchwork.ozlabs.org/project/netdev/list/?series=201720
version: 2

patch https://patchwork.ozlabs.org/project/netdev/patch/20200915014341.2949692-1-andriin@fb.com/ applied successfully

…xercised regularly. Pretty-printing tests were left alone and renamed into test_btf_pprint because they are very slow and were not even executed by default with test_btf.

All the test_btf tests that were moved are modeled as proper sub-tests in the test_progs framework for ease of debugging and reporting.

No functional or behavioral changes were intended; I tried to preserve the original behavior as closely as possible. `test_progs -v` will activate the "always_log" flag to emit the BTF validation log.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
v1->v2:
 - pretty-print BTF tests were renamed test_btf -> test_btf_pprint, which allowed Git to detect that the majority of test_btf code was moved into prog_tests/btf.c, so the diff is much smaller.

 tools/testing/selftests/bpf/.gitignore | 2 +-
 .../bpf/{test_btf.c => prog_tests/btf.c} | 1069 +----------------
 tools/testing/selftests/bpf/test_btf_pprint.c | 969 +++++++++++++++
 3 files changed, 1033 insertions(+), 1007 deletions(-)
 rename tools/testing/selftests/bpf/{test_btf.c => prog_tests/btf.c} (85%)
 create mode 100644 tools/testing/selftests/bpf/test_btf_pprint.c
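To illustrate what "modeled as proper sub-tests" can look like, here is a minimal, self-contained sketch. It does not use the real test_progs headers; `struct raw_case`, `start_subtest()` and `run_raw_cases()` are hypothetical stand-ins for the table-driven BTF cases and for the `test__start_subtest()` helper that test_progs provides, and the name filter is an assumption about how a single sub-test might be selected.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-run bookkeeping; test_progs keeps its own counters. */
struct subtest_stats {
	int run;
	int failed;
};

/* Hypothetical table entry: one former test_btf case. */
struct raw_case {
	const char *descr;
	int expected_err;
	int actual_err;
};

/* Stand-in for test__start_subtest(): run the case unless a filter excludes it. */
static bool start_subtest(const char *name, const char *filter)
{
	return filter == NULL || strstr(name, filter) != NULL;
}

/* Each table entry becomes its own named sub-test, so failures are reported per case. */
static struct subtest_stats run_raw_cases(const struct raw_case *cases, int n,
					  const char *filter)
{
	struct subtest_stats st = { 0, 0 };
	int i;

	for (i = 0; i < n; i++) {
		if (!start_subtest(cases[i].descr, filter))
			continue;
		st.run++;
		if (cases[i].actual_err != cases[i].expected_err) {
			st.failed++;
			printf("%s: FAIL (got %d, expected %d)\n", cases[i].descr,
			       cases[i].actual_err, cases[i].expected_err);
		}
	}
	return st;
}
```

The point of the pattern is that a single failing case can be named and re-run on its own, which is the debugging benefit the commit message mentions.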

kernel-patches-bot · 2020-09-15T02:03:26Z

Master branch: d317b0a
series: https://patchwork.ozlabs.org/project/netdev/list/?series=201720
version: 2

patch https://patchwork.ozlabs.org/project/netdev/patch/20200915014341.2949692-1-andriin@fb.com/ applied successfully

In case of memory pressure the MPTCP xmit path keeps at most a single skb in the tx cache, eventually freeing additional ones.

The associated counter for forward memory is not updated accordingly, and that causes the following splat:

WARNING: CPU: 0 PID: 12 at net/core/stream.c:208 sk_stream_kill_queues+0x3ca/0x530 net/core/stream.c:208
Modules linked in:
CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.11.0-rc2 #59
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Workqueue: events mptcp_worker
RIP: 0010:sk_stream_kill_queues+0x3ca/0x530 net/core/stream.c:208
Code: 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e 63 01 00 00 8b ab 00 01 00 00 e9 60 ff ff ff e8 2f 24 d3 fe 0f 0b eb 97 e8 26 24 d3 fe <0f> 0b eb a0 e8 1d 24 d3 fe 0f 0b e9 a5 fe ff ff 4c 89 e7 e8 0e d0
RSP: 0018:ffffc900000c7bc8 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff88810030ac40 RSI: ffffffff8262ca4a RDI: 0000000000000003
RBP: 0000000000000d00 R08: 0000000000000000 R09: ffffffff85095aa7
R10: ffffffff8262c9ea R11: 0000000000000001 R12: ffff888108908100
R13: ffffffff85095aa0 R14: ffffc900000c7c48 R15: 1ffff92000018f85
FS: 0000000000000000(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa7444baef8 CR3: 0000000035ee9005 CR4: 0000000000170ef0
Call Trace:
 __mptcp_destroy_sock+0x4a7/0x6c0 net/mptcp/protocol.c:2547
 mptcp_worker+0x7dd/0x1610 net/mptcp/protocol.c:2272
 process_one_work+0x896/0x1170 kernel/workqueue.c:2275
 worker_thread+0x605/0x1350 kernel/workqueue.c:2421
 kthread+0x344/0x410 kernel/kthread.c:292
 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:296

At close time, as reported by syzkaller/Christoph.

This change addresses the issue by properly updating the fwd allocated memory counter in the error path.

Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: multipath-tcp/mptcp_net-next#136
Fixes: 724cfd2 ("mptcp: allocate TX skbs in msk context")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
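The accounting problem in the MPTCP message above can be pictured with a toy model. This is a sketch under loud assumptions, not the kernel code: `struct toy_sock`, `toy_trim_tx_cache()` and `toy_close()` are hypothetical, memory is counted in fixed-size units, and the `account` flag exists only to contrast the fixed and unfixed behavior. The one thing it mirrors from the message is that a non-zero forward-memory counter at close time is what triggers the warning.

```c
/* Hypothetical model of a socket with a small tx cache. */
struct toy_sock {
	int cached_skbs; /* skbs currently parked in the tx cache */
	int fwd_alloc;   /* memory still reserved on their behalf */
};

/*
 * Free cached skbs until at most `keep` remain. When `account` is non-zero
 * the reserved-memory counter is reduced for every freed skb; passing zero
 * models the unfixed path that frees the skb but leaves the counter alone.
 */
static int toy_trim_tx_cache(struct toy_sock *sk, int keep, int truesize,
			     int account)
{
	int freed = 0;

	while (sk->cached_skbs > keep) {
		sk->cached_skbs--;
		if (account)
			sk->fwd_alloc -= truesize;
		freed++;
	}
	return freed;
}

/* Close frees everything with correct accounting and reports what is left. */
static int toy_close(struct toy_sock *sk, int truesize)
{
	toy_trim_tx_cache(sk, 0, truesize, 1);
	return sk->fwd_alloc; /* non-zero here corresponds to the warning case */
}
```

With three cached skbs of 256 units each, trimming to one with accounting leaves the counter at 256 and close returns 0; trimming without accounting leaves 512 units stranded at close.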

The rtla osnoise tool is an interface for the osnoise tracer. The osnoise tracer dispatches a kernel thread per-cpu. These threads read the time in a loop with preemption, softirqs and IRQs enabled, thus allowing all the sources of osnoise during its execution. The osnoise threads take note of the entry and exit point of any source of interference, increasing a per-cpu interference counter. The osnoise tracer also saves an interference counter for each source of interference.

The rtla osnoise top mode displays information about the periodic summary from the osnoise tracer.

One example of rtla osnoise top output is:

[root@alien ~]# rtla osnoise top -c 0-3 -d 1m -q -r 900000 -P F:1
                                     Operating System Noise
duration:   0 00:01:00 | time is in us
CPU Period   Runtime   Noise  % CPU Aval  Max Noise  Max Single  HW  NMI    IRQ  Softirq  Thread
  0 #58     52200000    1031    99.99802         91          60   0    0  52285        0     101
  1 #59     53100000       5    99.99999          5           5   0    9  53122        0      18
  2 #59     53100000       7    99.99998          7           7   0    8  53115        0      18
  3 #59     53100000    8274    99.98441        277          23   0    9  53778        0     660

"rtla osnoise top --help" works and provides information about the available options.

Link: https://lkml.kernel.org/r/0d796993abf587ae5a170bb8415c49368d4999e1.1639158831.git.bristot@kernel.org

Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: linux-rt-users@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
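The "% CPU Aval" column in the sample output above appears to be the share of the runtime not lost to noise. The sketch below is an assumption inferred from the four sample rows, not taken from the rtla source: `cpu_avail_e5()` is a hypothetical helper that computes `(runtime - noise) / runtime` with integer arithmetic and truncates to five decimal places.

```c
#include <stdint.h>

/*
 * Available CPU as a percentage scaled by 100000 (so 9999802 means
 * 99.99802 %), truncated rather than rounded. Returns 0 for a zero
 * runtime or when noise exceeds runtime. The multiplication stays
 * within 64 bits for runtimes up to roughly 1.8e12 microseconds.
 */
static uint64_t cpu_avail_e5(uint64_t runtime_us, uint64_t noise_us)
{
	if (runtime_us == 0 || noise_us > runtime_us)
		return 0;
	return (runtime_us - noise_us) * 10000000ULL / runtime_us;
}
```

For CPU 2 in the sample, a runtime of 53100000 us with 7 us of noise gives 9999998, which matches the printed 99.99998; a rounded result would have been 99.99999, which is why truncation is assumed here.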

Add bpf trampoline support for arm64. Most of the logic is the same as x86.

fentry before bpf trampoline hooked:
 mov x9, x30
 nop

fentry after bpf trampoline hooked:
 mov x9, x30
 bl <bpf_trampoline>

Tested on qemu, result:
 #55 fentry_fexit:OK
 #56 fentry_test:OK
 #58 fexit_sleep:OK
 #59 fexit_stress:OK
 #60 fexit_test:OK
 #67 get_func_args_test:OK
 #68 get_func_ip_test:OK
 #101 modify_return:OK

Signed-off-by: Xu Kuohai <xukuohai@huawei.com>

Add bpf trampoline support for arm64. Most of the logic is the same as x86.

fentry before bpf trampoline hooked:
 mov x9, x30
 nop

fentry after bpf trampoline hooked:
 mov x9, x30
 bl <bpf_trampoline>

Tested on qemu, result:
 #55 fentry_fexit:OK
 #56 fentry_test:OK
 #58 fexit_sleep:OK
 #59 fexit_stress:OK
 #60 fexit_test:OK
 #67 get_func_args_test:OK
 #68 get_func_ip_test:OK
 #101 modify_return:OK

Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Acked-by: Song Liu <songliubraving@fb.com>

Add bpf trampoline support for arm64. Most of the logic is the same as x86.

fentry before bpf trampoline hooked:
 mov x9, x30
 nop

fentry after bpf trampoline hooked:
 mov x9, x30
 bl <bpf_trampoline>

Tested on qemu, result:
 #18 bpf_tcp_ca:OK
 #51 dummy_st_ops:OK
 #55 fentry_fexit:OK
 #56 fentry_test:OK
 #57 fexit_bpf2bpf:OK
 #58 fexit_sleep:OK
 #59 fexit_stress:OK
 #60 fexit_test:OK
 #67 get_func_args_test:OK
 #68 get_func_ip_test:OK
 #101 modify_return:OK
 #233 xdp_bpf2bpf:OK

Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Acked-by: Song Liu <songliubraving@fb.com>

Add bpf trampoline support for arm64. Most of the logic is the same as x86.

Tested on raspberry pi 4b and qemu with KASLR disabled (avoid long jump), result:
 #9 /1 bpf_cookie/kprobe:OK
 #9 /2 bpf_cookie/multi_kprobe_link_api:FAIL
 #9 /3 bpf_cookie/multi_kprobe_attach_api:FAIL
 #9 /4 bpf_cookie/uprobe:OK
 #9 /5 bpf_cookie/tracepoint:OK
 #9 /6 bpf_cookie/perf_event:OK
 #9 /7 bpf_cookie/trampoline:OK
 #9 /8 bpf_cookie/lsm:OK
 #9 bpf_cookie:FAIL
 #18 /1 bpf_tcp_ca/dctcp:OK
 #18 /2 bpf_tcp_ca/cubic:OK
 #18 /3 bpf_tcp_ca/invalid_license:OK
 #18 /4 bpf_tcp_ca/dctcp_fallback:OK
 #18 /5 bpf_tcp_ca/rel_setsockopt:OK
 #18 bpf_tcp_ca:OK
 #51 /1 dummy_st_ops/dummy_st_ops_attach:OK
 #51 /2 dummy_st_ops/dummy_init_ret_value:OK
 #51 /3 dummy_st_ops/dummy_init_ptr_arg:OK
 #51 /4 dummy_st_ops/dummy_multiple_args:OK
 #51 dummy_st_ops:OK
 #55 fentry_fexit:OK
 #56 fentry_test:OK
 #57 /1 fexit_bpf2bpf/target_no_callees:OK
 #57 /2 fexit_bpf2bpf/target_yes_callees:OK
 #57 /3 fexit_bpf2bpf/func_replace:OK
 #57 /4 fexit_bpf2bpf/func_replace_verify:OK
 #57 /5 fexit_bpf2bpf/func_sockmap_update:OK
 #57 /6 fexit_bpf2bpf/func_replace_return_code:OK
 #57 /7 fexit_bpf2bpf/func_map_prog_compatibility:OK
 #57 /8 fexit_bpf2bpf/func_replace_multi:OK
 #57 /9 fexit_bpf2bpf/fmod_ret_freplace:OK
 #57 fexit_bpf2bpf:OK
 #58 fexit_sleep:OK
 #59 fexit_stress:OK
 #60 fexit_test:OK
 #67 get_func_args_test:OK
 #68 get_func_ip_test:OK
 #104 modify_return:OK
 #237 xdp_bpf2bpf:OK

bpf_cookie/multi_kprobe_link_api and bpf_cookie/multi_kprobe_attach_api failed due to lack of multi_kprobe on arm64.

Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Acked-by: Song Liu <songliubraving@fb.com>
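The "avoid long jump" remark above relates to the reach of the arm64 `bl` instruction, which carries a 26-bit word offset and so covers roughly plus or minus 128 MiB. The following sketch shows how a patch site's `nop` could be replaced by an encoded `bl`; `a64_encode_bl()` and the macro names are hypothetical and this is not the kernel's instruction-patching code, only the architectural encoding.

```c
#include <stdint.h>

#define A64_NOP       0xd503201fu        /* arm64 NOP, the unhooked patch site */
#define A64_BL_OPCODE 0x94000000u        /* BL with an all-zero imm26 field */
#define A64_BL_RANGE  (INT64_C(1) << 27) /* BL reaches +/-128 MiB */

/*
 * Encode "bl target" for an instruction located at pc.
 * Returns 0 and stores the instruction word on success, or -1 when the
 * offset is not 4-byte aligned or lies outside the BL range (the case
 * that would need a long jump).
 */
static int a64_encode_bl(uint64_t pc, uint64_t target, uint32_t *insn)
{
	int64_t off = (int64_t)(target - pc);

	if (off & 3)
		return -1;
	if (off < -A64_BL_RANGE || off >= A64_BL_RANGE)
		return -1;

	/* imm26 holds bits 27:2 of the byte offset. */
	*insn = A64_BL_OPCODE | (uint32_t)(((uint64_t)off >> 2) & 0x03ffffffu);
	return 0;
}
```

A target one instruction ahead encodes as 0x94000001 and one instruction behind as 0x97ffffff; a target 128 MiB or more away is rejected, which is consistent with the message's note about testing with KASLR disabled.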

Christoph reported a splat hinting at a corrupted snd_una:

WARNING: CPU: 1 PID: 38 at net/mptcp/protocol.c:1005 __mptcp_clean_una+0x4b3/0x620 net/mptcp/protocol.c:1005
Modules linked in:
CPU: 1 PID: 38 Comm: kworker/1:1 Not tainted 6.9.0-rc1-gbbeac67456c9 #59
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
Workqueue: events mptcp_worker
RIP: 0010:__mptcp_clean_una+0x4b3/0x620 net/mptcp/protocol.c:1005
Code: be 06 01 00 00 bf 06 01 00 00 e8 a8 12 e7 fe e9 00 fe ff ff e8 8e 1a e7 fe 0f b7 ab 3e 02 00 00 e9 d3 fd ff ff e8 7d 1a e7 fe <0f> 0b 4c 8b bb e0 05 00 00 e9 74 fc ff ff e8 6a 1a e7 fe 0f 0b e9
RSP: 0018:ffffc9000013fd48 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff8881029bd280 RCX: ffffffff82382fe4
RDX: ffff8881003cbd00 RSI: ffffffff823833c3 RDI: 0000000000000001
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: fefefefefefefeff R12: ffff888138ba8000
R13: 0000000000000106 R14: ffff8881029bd908 R15: ffff888126560000
FS: 0000000000000000(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f604a5dae38 CR3: 0000000101dac002 CR4: 0000000000170ef0
Call Trace:
 <TASK>
 __mptcp_clean_una_wakeup net/mptcp/protocol.c:1055 [inline]
 mptcp_clean_una_wakeup net/mptcp/protocol.c:1062 [inline]
 __mptcp_retrans+0x7f/0x7e0 net/mptcp/protocol.c:2615
 mptcp_worker+0x434/0x740 net/mptcp/protocol.c:2767
 process_one_work+0x1e0/0x560 kernel/workqueue.c:3254
 process_scheduled_works kernel/workqueue.c:3335 [inline]
 worker_thread+0x3c7/0x640 kernel/workqueue.c:3416
 kthread+0x121/0x170 kernel/kthread.c:388
 ret_from_fork+0x44/0x50 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
 </TASK>

When fallback to TCP happens early on a client socket, snd_nxt is not yet initialized and any incoming ack will copy such value into snd_una. If the mptcp worker (dumbly) tries mptcp-level re-injection after such ack, that would unconditionally trigger a send buffer cleanup using 'bad' snd_una values.

We could easily disable re-injection for fallback sockets, but such dumb behavior has already helped catch a few subtle issues and has a very low to zero impact in practice.

Instead, address the issue by always initializing snd_nxt (and write_seq, for consistency) at connect time.

Fixes: 8fd7380 ("mptcp: fallback in case of simultaneous connect")
Cc: stable@vger.kernel.org
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: multipath-tcp/mptcp_net-next#485
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: NipaLocal <nipa@local>

Christoph reported a splat hinting at a corrupted snd_una:

WARNING: CPU: 1 PID: 38 at net/mptcp/protocol.c:1005 __mptcp_clean_una+0x4b3/0x620 net/mptcp/protocol.c:1005
Modules linked in:
CPU: 1 PID: 38 Comm: kworker/1:1 Not tainted 6.9.0-rc1-gbbeac67456c9 #59
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
Workqueue: events mptcp_worker
RIP: 0010:__mptcp_clean_una+0x4b3/0x620 net/mptcp/protocol.c:1005
Code: be 06 01 00 00 bf 06 01 00 00 e8 a8 12 e7 fe e9 00 fe ff ff e8 8e 1a e7 fe 0f b7 ab 3e 02 00 00 e9 d3 fd ff ff e8 7d 1a e7 fe <0f> 0b 4c 8b bb e0 05 00 00 e9 74 fc ff ff e8 6a 1a e7 fe 0f 0b e9
RSP: 0018:ffffc9000013fd48 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff8881029bd280 RCX: ffffffff82382fe4
RDX: ffff8881003cbd00 RSI: ffffffff823833c3 RDI: 0000000000000001
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: fefefefefefefeff R12: ffff888138ba8000
R13: 0000000000000106 R14: ffff8881029bd908 R15: ffff888126560000
FS: 0000000000000000(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f604a5dae38 CR3: 0000000101dac002 CR4: 0000000000170ef0
Call Trace:
 <TASK>
 __mptcp_clean_una_wakeup net/mptcp/protocol.c:1055 [inline]
 mptcp_clean_una_wakeup net/mptcp/protocol.c:1062 [inline]
 __mptcp_retrans+0x7f/0x7e0 net/mptcp/protocol.c:2615
 mptcp_worker+0x434/0x740 net/mptcp/protocol.c:2767
 process_one_work+0x1e0/0x560 kernel/workqueue.c:3254
 process_scheduled_works kernel/workqueue.c:3335 [inline]
 worker_thread+0x3c7/0x640 kernel/workqueue.c:3416
 kthread+0x121/0x170 kernel/kthread.c:388
 ret_from_fork+0x44/0x50 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
 </TASK>

When fallback to TCP happens early on a client socket, snd_nxt is not yet initialized and any incoming ack will copy such value into snd_una. If the mptcp worker (dumbly) tries mptcp-level re-injection after such ack, that would unconditionally trigger a send buffer cleanup using 'bad' snd_una values.

We could easily disable re-injection for fallback sockets, but such dumb behavior has already helped catch a few subtle issues and has a very low to zero impact in practice.

Instead, address the issue by always initializing snd_nxt (and write_seq, for consistency) at connect time.

Fixes: 8fd7380 ("mptcp: fallback in case of simultaneous connect")
Cc: stable@vger.kernel.org
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: multipath-tcp/mptcp_net-next#485
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240429-upstream-net-20240429-mptcp-snd_nxt-init-connect-v1-1-59ceac0a7dcb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

25216af ("PCI: Add managed pcim_intx()") moved the allocation step for pci_intx()'s device resource from pcim_enable_device() to pcim_intx(). As before, pcim_enable_device() sets pci_dev.is_managed to true; and it is never set to false again.

Due to the lifecycle of a struct pci_dev, it can happen that a second driver obtains the same pci_dev after a first driver ran. If one driver uses pcim_enable_device() and the other doesn't, this causes the other driver to run into managed pcim_intx(), which will try to allocate when called for the first time.

Allocations might sleep, so calling pci_intx() while holding spinlocks then becomes invalid, which causes lockdep warnings and could cause deadlocks:

========================================================
WARNING: possible irq lock inversion dependency detected
6.11.0-rc6+ #59 Tainted: G W
--------------------------------------------------------
CPU 0/KVM/1537 just changed the state of lock:
 ffffa0f0cff965f0 (&vdev->irqlock){-...}-{2:2}, at: vfio_intx_handler+0x21/0xd0 [vfio_pci_core]
but this lock took another, HARDIRQ-unsafe lock in the past:
 (fs_reclaim){+.+.}-{0:0}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               local_irq_disable();
                               lock(&vdev->irqlock);
                               lock(fs_reclaim);
  <Interrupt>
    lock(&vdev->irqlock);

 *** DEADLOCK ***

Have pcim_enable_device()'s release function, pcim_disable_device(), set pci_dev.is_managed to false so that subsequent drivers using the same struct pci_dev do not implicitly run into managed code.

Link: https://lore.kernel.org/r/20240905072556.11375-2-pstanner@redhat.com
Fixes: 25216af ("PCI: Add managed pcim_intx()")
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Closes: https://lore.kernel.org/all/20240903094431.63551744.alex.williamson@redhat.com/
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
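The driver hand-over described in the PCI message above can be sketched with a toy state machine. Everything here is hypothetical and prefixed `toy_`; it is not the PCI core, and the `clear_flag` parameter exists only to contrast behavior with and without the fix. It models one device object reused by a managed driver and then by an unmanaged one.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few struct pci_dev fields that matter here. */
struct toy_pci_dev {
	bool is_managed;   /* set by the managed enable path */
	bool enabled;
	int devres_allocs; /* allocations made by the managed intx path */
};

static void toy_pcim_enable_device(struct toy_pci_dev *d)
{
	d->enabled = true;
	d->is_managed = true;
}

static void toy_pci_enable_device(struct toy_pci_dev *d)
{
	d->enabled = true;
}

/* Release step run when the managed driver detaches. */
static void toy_pcim_disable_device(struct toy_pci_dev *d, bool clear_flag)
{
	d->enabled = false;
	if (clear_flag)
		d->is_managed = false; /* the behavior the fix introduces */
}

/* Returns true when the call took the managed path, which allocates and may sleep. */
static bool toy_pci_intx(struct toy_pci_dev *d)
{
	if (d->is_managed) {
		d->devres_allocs++;
		return true;
	}
	return false;
}
```

If the release step clears the flag, the second (unmanaged) driver's `toy_pci_intx()` call stays on the non-allocating path; if the flag is left stale, the same call allocates, which is the situation that becomes unsafe under a spinlock.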

This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() held; this causes a recursive mutex acquisition in netdev_notify_peers().

Fix it by skipping the acknowledgment of the announce in virtnet_config_changed_work() while probing. The announce will still get done when ndo_open() calls virtio_config_driver_enable().

We've observed a softlockup with Ubuntu 24.04, and it can be reproduced with QEMU sending announce_self rapidly while booting.

[ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds.
[ 494.167667] Not tainted 6.8.0-57-generic #59-Ubuntu
[ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000
[ 494.168260] Call Trace:
[ 494.168329] <TASK>
[ 494.168389] __schedule+0x27c/0x6b0
[ 494.168495] schedule+0x33/0x110
[ 494.168585] schedule_preempt_disabled+0x15/0x30
[ 494.168709] __mutex_lock.constprop.0+0x42f/0x740
[ 494.168835] __mutex_lock_slowpath+0x13/0x20
[ 494.168949] mutex_lock+0x3c/0x50
[ 494.169039] rtnl_lock+0x15/0x20
[ 494.169128] netdev_notify_peers+0x12/0x30
[ 494.169240] virtnet_config_changed_work+0x152/0x1a0
[ 494.169377] virtnet_probe+0xa48/0xe00
[ 494.169484] ? vp_get+0x4d/0x100
[ 494.169574] virtio_dev_probe+0x1e9/0x310
[ 494.169682] really_probe+0x1c7/0x410
[ 494.169783] __driver_probe_device+0x8c/0x180
[ 494.169901] driver_probe_device+0x24/0xd0
[ 494.170011] __driver_attach+0x10b/0x210
[ 494.170117] ? __pfx___driver_attach+0x10/0x10
[ 494.170237] bus_for_each_dev+0x8d/0xf0
[ 494.170341] driver_attach+0x1e/0x30
[ 494.170440] bus_add_driver+0x14e/0x290
[ 494.170548] driver_register+0x5e/0x130
[ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10
[ 494.170788] register_virtio_driver+0x20/0x40
[ 494.170905] virtio_net_driver_init+0x97/0xb0
[ 494.171022] do_one_initcall+0x5e/0x340
[ 494.171128] do_initcalls+0x107/0x230
[ 494.171228] ? __pfx_kernel_init+0x10/0x10
[ 494.171340] kernel_init_freeable+0x134/0x210
[ 494.171462] kernel_init+0x1b/0x200
[ 494.171560] ret_from_fork+0x47/0x70
[ 494.171659] ? __pfx_kernel_init+0x10/0x10
[ 494.171769] ret_from_fork_asm+0x1b/0x30
[ 494.171875] </TASK>

Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down")
Signed-off-by: Zigit Zo <zuozhijie@bytedance.com>
Signed-off-by: NipaLocal <nipa@local>

This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() held; this causes a recursive mutex acquisition in netdev_notify_peers().

Fix it by temporarily saving the announce status while probing; then, in virtnet_open(), if a delayed announce is pending, schedule virtnet_config_changed_work().

Another possible solution is to directly check rtnl_is_locked() and call __netdev_notify_peers(), but that would mean relying on netdev_queue to schedule the ARP packets after ndo_open(), which we found not very intuitive.

We've observed a softlockup with Ubuntu 24.04, and it can be reproduced with QEMU sending announce_self rapidly while booting.

[ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds.
[ 494.167667] Not tainted 6.8.0-57-generic #59-Ubuntu
[ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000
[ 494.168260] Call Trace:
[ 494.168329] <TASK>
[ 494.168389] __schedule+0x27c/0x6b0
[ 494.168495] schedule+0x33/0x110
[ 494.168585] schedule_preempt_disabled+0x15/0x30
[ 494.168709] __mutex_lock.constprop.0+0x42f/0x740
[ 494.168835] __mutex_lock_slowpath+0x13/0x20
[ 494.168949] mutex_lock+0x3c/0x50
[ 494.169039] rtnl_lock+0x15/0x20
[ 494.169128] netdev_notify_peers+0x12/0x30
[ 494.169240] virtnet_config_changed_work+0x152/0x1a0
[ 494.169377] virtnet_probe+0xa48/0xe00
[ 494.169484] ? vp_get+0x4d/0x100
[ 494.169574] virtio_dev_probe+0x1e9/0x310
[ 494.169682] really_probe+0x1c7/0x410
[ 494.169783] __driver_probe_device+0x8c/0x180
[ 494.169901] driver_probe_device+0x24/0xd0
[ 494.170011] __driver_attach+0x10b/0x210
[ 494.170117] ? __pfx___driver_attach+0x10/0x10
[ 494.170237] bus_for_each_dev+0x8d/0xf0
[ 494.170341] driver_attach+0x1e/0x30
[ 494.170440] bus_add_driver+0x14e/0x290
[ 494.170548] driver_register+0x5e/0x130
[ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10
[ 494.170788] register_virtio_driver+0x20/0x40
[ 494.170905] virtio_net_driver_init+0x97/0xb0
[ 494.171022] do_one_initcall+0x5e/0x340
[ 494.171128] do_initcalls+0x107/0x230
[ 494.171228] ? __pfx_kernel_init+0x10/0x10
[ 494.171340] kernel_init_freeable+0x134/0x210
[ 494.171462] kernel_init+0x1b/0x200
[ 494.171560] ret_from_fork+0x47/0x70
[ 494.171659] ? __pfx_kernel_init+0x10/0x10
[ 494.171769] ret_from_fork_asm+0x1b/0x30
[ 494.171875] </TASK>

Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down")
Signed-off-by: Zigit Zo <zuozhijie@bytedance.com>
Signed-off-by: NipaLocal <nipa@local>
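The deferral approach in the virtio-net message above (remember the announce during probe, replay it on open) can be sketched as a small state machine. This is a toy model under stated assumptions: `struct toy_virtnet` and the `toy_` functions are hypothetical, locking is not modeled at all, and "sending" an announce is reduced to incrementing a counter.

```c
#include <stdbool.h>

/* Hypothetical driver state; the real driver keeps this in its private struct. */
struct toy_virtnet {
	bool probing;          /* true while probe runs (rtnl held in the real driver) */
	bool announce_pending; /* an announce arrived too early and was saved */
	int announces_sent;    /* stands in for notifying peers */
};

/* Config-change handler: announcing during probe would re-take the held lock. */
static void toy_config_changed(struct toy_virtnet *vi, bool announce_bit)
{
	if (!announce_bit)
		return;
	if (vi->probing) {
		vi->announce_pending = true; /* defer instead of announcing now */
		return;
	}
	vi->announces_sent++;
}

/* Open path: probing is over, so a saved announce can be replayed safely. */
static void toy_open(struct toy_virtnet *vi)
{
	vi->probing = false;
	if (vi->announce_pending) {
		vi->announce_pending = false;
		vi->announces_sent++;
	}
}
```

An announce that arrives during probe is not lost, only delayed until open, while later announces are handled immediately.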

kernel-patches-bot added bpf-next new V2 labels Sep 15, 2020

kernel-patches-bot and others added 2 commits September 14, 2020 19:03

adding ci files

94a7039

kernel-patches-bot force-pushed the series/200694 branch from 49583d8 to 6be61d6 Compare September 15, 2020 02:03

kernel-patches-bot closed this Sep 15, 2020

kernel-patches-bot deleted the series/200694 branch September 15, 2020 17:49

kernel-patches-bot wants to merge 2 commits into bpf-next from series/200694

2 participants