ASoC: core: delete component->card_list in soc_remove_component only #1191
bardliao wants to merge 1 commit into thesofproject:topic/sof-dev
static void soc_cleanup_component(struct snd_soc_component *component)
{
	snd_soc_component_set_jack(component, NULL, NULL);
	list_del(&component->card_list);
@bardliao Did you actually get a crash from this?
@dbaluta Yes, I hit the issue when a codec driver returns -EPROBE_DEFER from its probe function.
@bardliao You could use INIT_LIST_HEAD early on before adding the component to the list. In that case list_del will work just fine and not cause crashes of any kind (doing list_del on an empty list_head is a no-op).
@paulstelian97 Yes, INIT_LIST_HEAD(&component->card_list) is called by snd_soc_component_initialize(), which runs before soc_probe_component(). And yes, calling list_del() once is fine, but calling it twice will crash. When the codec probe returns -EPROBE_DEFER, the kernel will try to probe it again, and it will crash when the second probe fails.
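For readers following the thread, here is a minimal userspace sketch of why the second list_del() is the one that crashes. It assumes list_del() unlinks the entry and then overwrites its pointers with poison values, as the kernel's list implementation does. Every name prefixed with sketch_ or SKETCH_ is hypothetical, the poison values are illustrative rather than the kernel's definitions, and the second delete is reported through a return value instead of actually dereferencing a poisoned pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Illustrative stand-ins for LIST_POISON1/LIST_POISON2 (assumed values). */
#define SKETCH_POISON1 ((struct list_head *)(uintptr_t)0x100)
#define SKETCH_POISON2 ((struct list_head *)(uintptr_t)0x122)

/* Same effect as INIT_LIST_HEAD(): the entry points at itself. */
static void sketch_init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool sketch_is_poisoned(const struct list_head *e)
{
	return e->next == SKETCH_POISON1 || e->prev == SKETCH_POISON2;
}

/*
 * Models list_del(): unlink the entry, then poison its pointers.
 * Returns false where the real list_del() would dereference poison.
 */
static bool sketch_list_del(struct list_head *e)
{
	if (sketch_is_poisoned(e))
		return false; /* the kernel would fault here */

	e->next->prev = e->prev;
	e->prev->next = e->next;
	e->next = SKETCH_POISON1;
	e->prev = SKETCH_POISON2;
	return true;
}

/* The sequence from the thread: init once, then two failed probes. */
static void sketch_deferred_probe_sequence(void)
{
	struct list_head card_list;

	sketch_init_list_head(&card_list); /* done once at initialization */

	/* First failed probe: deleting a never-added entry is harmless. */
	assert(sketch_list_del(&card_list));

	/* Re-probe fails again: the entry is already poisoned. */
	assert(!sketch_list_del(&card_list));
}
```

The first delete only "works" because the entry still points at itself; nothing re-initializes it before the deferred probe runs the same error path again.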
{
	snd_soc_component_set_jack(component, NULL, NULL);
	list_del(&component->card_list);
	snd_soc_dapm_free(snd_soc_component_get_dapm(component));
If you follow that logic, then this should also be done in remove_component rather than cleanup_component: when we hit err_probe in soc_probe_component(), we shouldn't deal with DAPM.
Can you review all changes in 'ASoC: soc-core: add soc_cleanup_component()' and resubmit with a Fixes: tag?
@plbossart The commit message of "ASoC: soc-core: add soc_cleanup_component()" says "And soc_probe_component() doesn't call snd_soc_dapm_free(), but it should". The error path is reached right after snd_soc_dapm_new_controls() has run successfully, and snd_soc_dapm_free() does nothing if snd_soc_dapm_new_controls() fails. So I think it is fine to keep snd_soc_dapm_free() in soc_cleanup_component().
@bardliao Did you push the wrong branch? There are conflicts and lots of old commits in there.
Sorry about that. I just pushed again. I only added a fix: tag in the commit header.
Something's wrong on my side; the commit looks the same?
@plbossart I didn't change the commit. I just added a fix: tag in the commit header.
@bardliao What Pierre meant was to add a Fixes: tag referencing the commit that introduced this issue.
We add component->card_list at the end of soc_probe_component(). In other words, component->card_list will not be added if there is an error in the soc_probe_component() function. So we can't delete component->card_list in the error handling of soc_probe_component().

Fixes: 22d1423 ("ASoC: soc-core: add soc_cleanup_component()")
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
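The split described in this commit message can be sketched roughly as follows. This is a heavily reduced model and not the actual soc-core code: the struct, the sketch_ functions, and the counter are hypothetical, and list membership is modelled with a flag rather than a real list. The only point it illustrates is that the list deletion lives in the remove path, so a probe error path that shares the cleanup helper never touches an entry that was never added.

```c
#include <assert.h>
#include <stdbool.h>

/* Errno value used by the kernel for deferred probing. */
#define SKETCH_EPROBE_DEFER 517

/* Hypothetical, reduced stand-in for struct snd_soc_component. */
struct sketch_component {
	bool on_card_list;  /* models membership via component->card_list */
	int list_del_calls; /* how often the list entry was deleted */
};

/* Shared teardown used by both paths; it does not touch the list. */
static void sketch_cleanup_component(struct sketch_component *c)
{
	(void)c; /* jack and DAPM teardown would go here */
}

/* Remove path: only reached for components whose probe succeeded. */
static void sketch_remove_component(struct sketch_component *c)
{
	assert(c->on_card_list);
	c->on_card_list = false;
	c->list_del_calls++;
	sketch_cleanup_component(c);
}

/* Probe path: the list add is the last step, after every failure point. */
static int sketch_probe_component(struct sketch_component *c, bool defer)
{
	if (defer) {
		sketch_cleanup_component(c); /* error path: no list delete */
		return -SKETCH_EPROBE_DEFER;
	}

	c->on_card_list = true;
	return 0;
}
```

In this model, any number of deferred probes leaves list_del_calls at zero, and a successful probe followed by a remove deletes the entry exactly once.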
Merged upstream with follow-up patches; this will be handled with the usual weekly upstream merge.
[ Upstream commit a47bd78 ]

Dave hit this splat during testing btrfs/078:

  ======================================================
  WARNING: possible circular locking dependency detected
  5.8.0-rc6-default+ #1191 Not tainted
  ------------------------------------------------------
  kswapd0/75 is trying to acquire lock:
  ffffa040e9d04ff8 (&delayed_node->mutex){+.+.}-{3:3}, at: __btrfs_release_delayed_node.part.0+0x3f/0x310 [btrfs]

  but task is already holding lock:
  ffffffff8b0c8040 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #2 (fs_reclaim){+.+.}-{0:0}:
	 __lock_acquire+0x56f/0xaa0
	 lock_acquire+0xa3/0x440
	 fs_reclaim_acquire.part.0+0x25/0x30
	 __kmalloc_track_caller+0x49/0x330
	 kstrdup+0x2e/0x60
	 __kernfs_new_node.constprop.0+0x44/0x250
	 kernfs_new_node+0x25/0x50
	 kernfs_create_link+0x34/0xa0
	 sysfs_do_create_link_sd+0x5e/0xd0
	 btrfs_sysfs_add_devices_dir+0x65/0x100 [btrfs]
	 btrfs_init_new_device+0x44c/0x12b0 [btrfs]
	 btrfs_ioctl+0xc3c/0x25c0 [btrfs]
	 ksys_ioctl+0x68/0xa0
	 __x64_sys_ioctl+0x16/0x20
	 do_syscall_64+0x50/0xe0
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #1 (&fs_info->chunk_mutex){+.+.}-{3:3}:
	 __lock_acquire+0x56f/0xaa0
	 lock_acquire+0xa3/0x440
	 __mutex_lock+0xa0/0xaf0
	 btrfs_chunk_alloc+0x137/0x3e0 [btrfs]
	 find_free_extent+0xb44/0xfb0 [btrfs]
	 btrfs_reserve_extent+0x9b/0x180 [btrfs]
	 btrfs_alloc_tree_block+0xc1/0x350 [btrfs]
	 alloc_tree_block_no_bg_flush+0x4a/0x60 [btrfs]
	 __btrfs_cow_block+0x143/0x7a0 [btrfs]
	 btrfs_cow_block+0x15f/0x310 [btrfs]
	 push_leaf_right+0x150/0x240 [btrfs]
	 split_leaf+0x3cd/0x6d0 [btrfs]
	 btrfs_search_slot+0xd14/0xf70 [btrfs]
	 btrfs_insert_empty_items+0x64/0xc0 [btrfs]
	 __btrfs_commit_inode_delayed_items+0xb2/0x840 [btrfs]
	 btrfs_async_run_delayed_root+0x10e/0x1d0 [btrfs]
	 btrfs_work_helper+0x2f9/0x650 [btrfs]
	 process_one_work+0x22c/0x600
	 worker_thread+0x50/0x3b0
	 kthread+0x137/0x150
	 ret_from_fork+0x1f/0x30

  -> #0 (&delayed_node->mutex){+.+.}-{3:3}:
	 check_prev_add+0x98/0xa20
	 validate_chain+0xa8c/0x2a00
	 __lock_acquire+0x56f/0xaa0
	 lock_acquire+0xa3/0x440
	 __mutex_lock+0xa0/0xaf0
	 __btrfs_release_delayed_node.part.0+0x3f/0x310 [btrfs]
	 btrfs_evict_inode+0x3bf/0x560 [btrfs]
	 evict+0xd6/0x1c0
	 dispose_list+0x48/0x70
	 prune_icache_sb+0x54/0x80
	 super_cache_scan+0x121/0x1a0
	 do_shrink_slab+0x175/0x420
	 shrink_slab+0xb1/0x2e0
	 shrink_node+0x192/0x600
	 balance_pgdat+0x31f/0x750
	 kswapd+0x206/0x510
	 kthread+0x137/0x150
	 ret_from_fork+0x1f/0x30

  other info that might help us debug this:

  Chain exists of:
    &delayed_node->mutex --> &fs_info->chunk_mutex --> fs_reclaim

   Possible unsafe locking scenario:

	 CPU0                    CPU1
	 ----                    ----
    lock(fs_reclaim);
				 lock(&fs_info->chunk_mutex);
				 lock(fs_reclaim);
    lock(&delayed_node->mutex);

   *** DEADLOCK ***

  3 locks held by kswapd0/75:
   #0: ffffffff8b0c8040 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30
   #1: ffffffff8b0b50b8 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x54/0x2e0
   #2: ffffa040e057c0e8 (&type->s_umount_key#26){++++}-{3:3}, at: trylock_super+0x16/0x50

  stack backtrace:
  CPU: 2 PID: 75 Comm: kswapd0 Not tainted 5.8.0-rc6-default+ #1191
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
  Call Trace:
   dump_stack+0x78/0xa0
   check_noncircular+0x16f/0x190
   check_prev_add+0x98/0xa20
   validate_chain+0xa8c/0x2a00
   __lock_acquire+0x56f/0xaa0
   lock_acquire+0xa3/0x440
   ? __btrfs_release_delayed_node.part.0+0x3f/0x310 [btrfs]
   __mutex_lock+0xa0/0xaf0
   ? __btrfs_release_delayed_node.part.0+0x3f/0x310 [btrfs]
   ? __lock_acquire+0x56f/0xaa0
   ? __btrfs_release_delayed_node.part.0+0x3f/0x310 [btrfs]
   ? lock_acquire+0xa3/0x440
   ? btrfs_evict_inode+0x138/0x560 [btrfs]
   ? btrfs_evict_inode+0x2fe/0x560 [btrfs]
   ? __btrfs_release_delayed_node.part.0+0x3f/0x310 [btrfs]
   __btrfs_release_delayed_node.part.0+0x3f/0x310 [btrfs]
   btrfs_evict_inode+0x3bf/0x560 [btrfs]
   evict+0xd6/0x1c0
   dispose_list+0x48/0x70
   prune_icache_sb+0x54/0x80
   super_cache_scan+0x121/0x1a0
   do_shrink_slab+0x175/0x420
   shrink_slab+0xb1/0x2e0
   shrink_node+0x192/0x600
   balance_pgdat+0x31f/0x750
   kswapd+0x206/0x510
   ? _raw_spin_unlock_irqrestore+0x3e/0x50
   ? finish_wait+0x90/0x90
   ? balance_pgdat+0x750/0x750
   kthread+0x137/0x150
   ? kthread_stop+0x2a0/0x2a0
   ret_from_fork+0x1f/0x30

This is because we're holding the chunk_mutex while adding this device and adding its sysfs entries. We actually hold different locks in different places when calling this function, the dev_replace semaphore for instance in dev replace, so instead of moving this call around simply wrap its operations in NOFS.

CC: stable@vger.kernel.org # 4.14+
Reported-by: David Sterba <dsterba@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
We add component->card_list at the end of soc_probe_component(). In
other words, component->card_list will not be added if there is an
error in the soc_probe_component() function. So we can't delete
component->card_list in the error handling of soc_probe_component().

Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>