sss_nss: hang when looking up a group with a stale cache entry and an LDAP provider #8194

@gollub

Description

With sssd 2.10.2 I noticed hangs when invoking sudo on a server I had just logged in to. On that system, which has an LDAP provider configured, the sudo process hangs in the NSS sss module while querying the "docker" (GID 492) group.
This was not seen with 2.9.6.

The nsswitch.conf holds:

group:      files [SUCCESS=merge] sss [SUCCESS=merge] systemd

The following config entry in /etc/sssd/sssd.conf may have contributed to this issue:

[domain/LDAP]
[...]

filter_groups = ...,docker,....

Reverting 32f5782 on top of 2.10.2 seems to resolve the issue in my testing.

I suspect that the introduction of cache_req_dp_contacted() in one of the conditions prevents a cache_req_global_ncache_add() call, resulting in a hang.

Logs

Output of tail -f on the SSSD log files while running:

sss_cache -E
getent group 492
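
Since the failure is a hang rather than an error, wrapping the lookup in a timeout makes the reproduction scriptable. This is only a minimal sketch in Python and not part of SSSD; the helper name `lookup_hangs` and the 10-second limit are arbitrary choices for illustration.

```python
import subprocess


def lookup_hangs(cmd, timeout_s=10.0):
    """Return True if cmd does not finish within timeout_s seconds.

    Hypothetical helper for illustration only; the timeout is an arbitrary
    threshold for "hung", not a value taken from SSSD.
    """
    try:
        # The exit status is ignored on purpose: getent exits non-zero for
        # "not found", which is the expected outcome for a filtered group.
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child process before re-raising.
        return True
    return False
```

After expiring the cache with `sss_cache -E` (as root), `lookup_hangs(["getent", "group", "492"])` would be expected to return True on the affected 2.10.2 setup and False on 2.9.6.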

sssd-2.10.2 (Bad) - hang

==> /var/log/sssd/sssd_nss.log <==
(2025-11-13 15:09:57): [nss] [sss_nss_getby_id] (0x0400): [CID#3] Input ID: 492 (looking up 'POSIX data')
(2025-11-13 15:09:57): [nss] [cache_req_search_send] (0x0400): [CID#3] CR #3: Object found, but needs to be refreshed.
(2025-11-13 15:09:57): [nss] [cache_req_search_dp] (0x0400): [CID#3] CR #3: Looking up [GID:492@LDAP] in data provider
(2025-11-13 15:09:57): [nss] [sss_dp_get_account_send] (0x0400): [CID#3] Creating request for [LDAP][0x2][BE_REQ_GROUP][idnumber=492:-]
(2025-11-13 15:09:57): [nss] [sss_nss_get_object_send] (0x0400): [CID#3] Client [0x5646f8ed7eb0][22]: sent cache request #3

==> /var/log/sssd/sssd_LDAP.log <==
(2025-11-13 15:09:57): [be[LDAP]] [sdap_print_server] (0x2000): [RID#9] Searching 192.0.2.10:389
(2025-11-13 15:09:57): [be[LDAP]] [sdap_get_generic_ext_step] (0x0400): [RID#9] calling ldap_search_ext with [(&(gidNumber=492)(objectClass=posixGroup)(cn=*)(&(gidNumber=*)(!(gidNumber=0))))][dc=example,dc=com].
(2025-11-13 15:09:58): [be[LDAP]] [sdap_parse_entry] (0x1000): [RID#9] OriginalDN: [cn=docker,ou=groups,ou=global,dc=example,dc=com].
(2025-11-13 15:09:58): [be[LDAP]] [sysdb_store_group] (0x0400): [RID#9] Group "docker@LDAP" has been stored
(2025-11-13 15:09:58): [be[LDAP]] [dp_req_done] (0x0400): [RID#9] DP Request [Account #9]: Request handler finished [0]: Success
(2025-11-13 15:09:58): [be[LDAP]] [sbus_issue_request_done] (0x0400): sssd.dataprovider.getAccountInfo on /sssd from sssd.nss: Success

==> /var/log/sssd/sssd_nss.log <==
(2025-11-13 15:09:58): [nss] [sbus_dispatch] (0x4000): Dispatching.
(2025-11-13 15:09:58): [nss] [cache_req_search_ncache_filter] (0x0400): [CID#3] CR #3: [docker@LDAP] filtered out! (negative cache)
(2025-11-13 15:09:58): [nss] [cache_req_select_domains] (0x0400): [CID#3] CR #3: Performing a multi-domain search
(2025-11-13 15:09:58): [nss] [cache_req_search_domains] (0x0400): [CID#3] CR #3: Search will bypass the cache and check the data provider
(2025-11-13 15:09:58): [nss] [cache_req_set_domain] (0x0400): [CID#3] CR #3: Using domain [LDAP]
(2025-11-13 15:09:58): [nss] [cache_req_search_dp] (0x0400): [CID#3] CR #3: Looking up [GID:492@LDAP] in data provider
(2025-11-13 15:09:58): [nss] [sss_dp_get_account_send] (0x0400): [CID#3] Creating request for [LDAP][0x2][BE_REQ_GROUP][idnumber=492:-]
(2025-11-13 15:09:58): [nss] [sbus_requests_add] (0x4000): [CID#3] Chaining request: -:0:sssd.domain_LDAP:sssd.dataprovider.getAccountInfo:/sssd:1:2:idnumber=492:LDAP:(null)

sssd-2.9.6 (Good) - no hang

==> /var/log/sssd/sssd_nss.log <==
(2025-11-13 15:41:08): [nss] [sss_nss_getby_id] (0x0400): [CID#5] Input ID: 492 (looking up 'POSIX data')
(2025-11-13 15:41:08): [nss] [cache_req_search_send] (0x0400): [CID#5] CR #18: Object found, but needs to be refreshed.
(2025-11-13 15:41:08): [nss] [cache_req_search_dp] (0x0400): [CID#5] CR #18: Looking up [GID:492@LDAP] in data provider
(2025-11-13 15:41:08): [nss] [sss_dp_get_account_send] (0x0400): [CID#5] Creating request for [LDAP][0x2][BE_REQ_GROUP][idnumber=492:-]
(2025-11-13 15:41:08): [nss] [sss_nss_get_object_send] (0x0400): [CID#5] Client [0x55b0ddf431e0][26]: sent cache request #18

==> /var/log/sssd/sssd_LDAP.log <==
(2025-11-13 15:41:08): [be[LDAP]] [sdap_print_server] (0x2000): [RID#13] Searching 192.0.2.11:389
(2025-11-13 15:41:08): [be[LDAP]] [sdap_get_generic_ext_step] (0x0400): [RID#13] calling ldap_search_ext with [(&(gidNumber=492)(objectClass=posixGroup)(cn=*)(&(gidNumber=*)(!(gidNumber=0))))][dc=example,dc=com].
(2025-11-13 15:41:08): [be[LDAP]] [sdap_parse_entry] (0x1000): [RID#13] OriginalDN: [cn=docker,ou=groups,ou=global,dc=example,dc=com].
(2025-11-13 15:41:08): [be[LDAP]] [sysdb_store_group] (0x0400): [RID#13] Group "docker@LDAP" has been stored
(2025-11-13 15:41:08): [be[LDAP]] [dp_req_done] (0x0400): [RID#13] DP Request [Account #13]: Request handler finished [0]: Success
(2025-11-13 15:41:08): [be[LDAP]] [sbus_issue_request_done] (0x0400): sssd.dataprovider.getAccountInfo: Success

==> /var/log/sssd/sssd_nss.log <==
(2025-11-13 15:41:08): [nss] [sbus_dispatch] (0x4000): Dispatching.
(2025-11-13 15:41:08): [nss] [cache_req_search_ncache_filter] (0x0400): [CID#5] CR #18: [docker@LDAP] filtered out! (negative cache)
(2025-11-13 15:41:08): [nss] [cache_req_global_ncache_add] (0x0400): [CID#5] CR #18: Adding [GID:492@LDAP] to global negative cache
(2025-11-13 15:41:08): [nss] [cache_req_process_result] (0x0400): [CID#5] CR #18: Finished: Not found
(2025-11-13 15:41:08): [nss] [sss_nss_protocol_done] (0x4000): [CID#5] Sending reply: not found
(2025-11-13 15:41:08): [nss] [client_recv] (0x0200): [CID#5] Client disconnected!
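
The difference between the two runs is that CR #18 reaches a "Finished" line in cache_req_process_result while CR #3 never does. To spot this in a longer sssd_nss.log, something along the following lines might help. It is a rough sketch that assumes the log format shown in the excerpts above and a debug level that emits the 0x0400 messages; the function name `unfinished_cache_requests` is made up for this example.

```python
import re

# Matches e.g. "[cache_req_search_dp] (0x0400): [CID#3] CR #3: Looking up ..."
LINE_RE = re.compile(
    r"\[(?P<func>cache_req_\w+)\] \(0x[0-9a-f]+\): "
    r"\[CID#\d+\] CR #(?P<cr>\d+): (?P<msg>.*)"
)


def unfinished_cache_requests(lines):
    """Return the sorted CR numbers that appear without a 'Finished' line.

    Assumption: a request is complete once cache_req_process_result logs
    "Finished: ...", as seen in the 2.9.6 excerpt.
    """
    seen, finished = set(), set()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a cache_req line
        cr = int(match.group("cr"))
        seen.add(cr)
        if (match.group("func") == "cache_req_process_result"
                and match.group("msg").startswith("Finished")):
            finished.add(cr)
    return sorted(seen - finished)
```

It can be fed an open file directly, for example `unfinished_cache_requests(open("/var/log/sssd/sssd_nss.log"))`. Requests still in flight at the end of the log are reported too, and CR numbers start over when sssd_nss restarts, so the result is only a hint about where to look.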
