SCIM with DENY ALL firewall policy delays group deletion #14818

@josegomezr

Describe the bug

SCIM delays (and in the worst case prevents) group deletion when the SCIM provider cannot complete its HTTP request because a firewall on the network silently drops the packets.

To Reproduce
Steps to reproduce the behavior:

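  1. Configure a SCIM application + provider that points to an IP address outside of the network where your instance is running, with a firewall that drops the outbound traffic to it.
  2. Delete a group.
  3. Observe that the request hangs and eventually fails (in our case with an HTTP 504 from nginx).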

Expected behavior
Group deletion should not error out (or be prevented from happening).

Screenshots

[Screenshot: inspector view of the failed request]

Logs

{"auth_via": "session", "domain_url": "sheldon01-dev.dmz-prg2.suse.org", "event": "Task published", "host": "sheldon01-dev.dmz-prg2.suse.org", "level": "info", "logger": "authentik.root.celery", "pid": 22932, "request_id": "9921b4f9ab394ce997281b8d1c83a97f", "schema_name": "public", "task_id": "81d00ecb794347c3a496e2dcc8a88179", "task_name": "authentik.providers.scim.tasks.scim_sync_direct", "timestamp": "2025-06-02T12:31:35.556212"}

{"domain_url": null, "event": "failed to get ServiceProviderConfig", "exc": "SCIMRequestException()", "level": "warning", "logger": "authentik.lib.sync.outgoing.base", "pid": 27028, "provider": "scim-test", "schema_name": "public", "task_id": "task-81d00ecb794347c3a496e2dcc8a88179", "timestamp": "2025-06-02T12:36:05.528477"}
{"domain_url": null, "event": "Task finished", "level": "info", "logger": "authentik.root.celery", "pid": 27028, "schema_name": "public", "state": "SUCCESS", "task_id": "81d00ecb794347c3a496e2dcc8a88179", "task_name": "scim_sync_direct", "timestamp": "2025-06-02T12:36:05.550913"}
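To put a number on the delay, the timestamps in the log lines above can be compared directly. The following is a minimal sketch: the log lines are trimmed to the `event` and `timestamp` fields, and the helper name `seconds_between` is made up for illustration. Note that the gap between "Task published" and the warning also includes any time the task spent queued, so it is only an upper bound on how long the worker waited on the SCIM endpoint.

```python
import json
from datetime import datetime

# Log lines from this report, trimmed to the fields needed here.
LOG_LINES = [
    '{"event": "Task published", "timestamp": "2025-06-02T12:31:35.556212"}',
    '{"event": "failed to get ServiceProviderConfig", "timestamp": "2025-06-02T12:36:05.528477"}',
    '{"event": "Task finished", "timestamp": "2025-06-02T12:36:05.550913"}',
]


def seconds_between(first: str, second: str) -> float:
    """Return the seconds elapsed between two JSON log lines (hypothetical helper)."""
    t0 = datetime.fromisoformat(json.loads(first)["timestamp"])
    t1 = datetime.fromisoformat(json.loads(second)["timestamp"])
    return (t1 - t0).total_seconds()


published, failed, finished = LOG_LINES
wait = seconds_between(published, failed)
print(f"published -> failed: {wait:.1f} s")
```

That is roughly four and a half minutes between publishing the task and the SCIM failure being logged, well past the point where a reverse proxy would typically give up on the original request.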

Version and Deployment (please complete the following information):

  • authentik version: 2025.2.1
  • Deployment: podman [matching the docker-compose structure]

Additional context

When the firewall rule blocking traffic is in place, curl cannot complete the request, but it does not fail immediately either:

curl -v https://ifconfig.me/ip
* Host ifconfig.me:443 was resolved.
* IPv6: 2600:1901:0:b2bd::
* IPv4: 34.160.111.145
*   Trying [2600:1901:0:b2bd::]:443...
*   Trying 34.160.111.145:443...
[cancelled after more than 30s] 
^C

Issuing the same request directly from Python inside the webserver container gives the following on the console:

>>> import httpx
>>> httpx.request("GET", "https://ifconfig.me/ip")
Traceback (most recent call last):
  File "/ak-root/venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 69, in map_httpcore_exceptions
    yield
  File "/ak-root/venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 233, in handle_request
    resp = self._pool.handle_request(req)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ak-root/venv/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 216, in handle_request
    raise exc from None
  File "/ak-root/venv/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 196, in handle_request
    response = connection.handle_request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ak-root/venv/lib/python3.12/site-packages/httpcore/_sync/connection.py", line 99, in handle_request
    raise exc
  File "/ak-root/venv/lib/python3.12/site-packages/httpcore/_sync/connection.py", line 76, in handle_request
    stream = self._connect(request)
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/ak-root/venv/lib/python3.12/site-packages/httpcore/_sync/connection.py", line 122, in _connect
    stream = self._network_backend.connect_tcp(**kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ak-root/venv/lib/python3.12/site-packages/httpcore/_backends/sync.py", line 205, in connect_tcp
    with map_exceptions(exc_map):
         ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "/ak-root/venv/lib/python3.12/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectTimeout: timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/ak-root/venv/lib/python3.12/site-packages/httpx/_api.py", line 106, in request
    return client.request(
           ^^^^^^^^^^^^^^^
  File "/ak-root/venv/lib/python3.12/site-packages/httpx/_client.py", line 827, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ak-root/venv/lib/python3.12/site-packages/httpx/_client.py", line 914, in send
    response = self._send_handling_auth(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ak-root/venv/lib/python3.12/site-packages/httpx/_client.py", line 942, in _send_handling_auth
    response = self._send_handling_redirects(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ak-root/venv/lib/python3.12/site-packages/httpx/_client.py", line 979, in _send_handling_redirects
    response = self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ak-root/venv/lib/python3.12/site-packages/httpx/_client.py", line 1015, in _send_single_request
    response = transport.handle_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ak-root/venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 232, in handle_request
    with map_httpcore_exceptions():
         ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "/ak-root/venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 86, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectTimeout: timed out

We've observed that in roughly 50% of the cases the group is not deleted. In the rest it is deleted, even though nginx has already cut the connection and replied with an HTTP 504 Gateway Timeout.

Once the SCIM application/provider pair was deleted from the instance, group deletion worked fine.

I tried to replicate this with docker + iptables but could not reproduce the behavior our firewall produces.

Our suspicion is that:

  • The way the firewall drops packets makes the network stack believe it is dealing with packet loss, so it keeps retrying instead of cutting the connection immediately (and thus the operation is delayed while it waits for the connection attempt to resolve [connected or refused])
  • nginx cuts the connection after *_timeout seconds (which explains the 504)
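The difference between a dropped and a rejected connection can be checked from inside the container with a small probe. This is a sketch using only the standard library, not anything authentik ships; the function name `probe` and the 3 second default are assumptions for illustration. A "refused" result returns almost immediately, while a firewall that drops packets leaves the caller waiting until the timeout fires.

```python
import socket
import time


def probe(host: str, port: int, timeout: float = 3.0) -> tuple[str, float]:
    """Attempt a TCP connect and return (outcome, seconds elapsed)."""
    start = time.monotonic()
    try:
        # create_connection applies `timeout` to the connect attempt itself.
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "connected"
    except socket.timeout:
        # No answer at all: typical when packets are silently dropped.
        outcome = "timed out"
    except ConnectionRefusedError:
        # The peer (or a rejecting firewall) answered with a reset.
        outcome = "refused"
    except OSError as exc:
        # Anything else, e.g. DNS failure or unreachable network.
        outcome = f"error: {exc}"
    return outcome, time.monotonic() - start
```

Calling `probe("ifconfig.me", 443)` from the affected host should report "timed out" after about 3 seconds if the suspicion above is right. Without an explicit timeout, the wait is governed by the operating system's TCP retry settings and can last much longer.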

We've worked around the issue by fixing our firewall policies and extending timeouts, but it is an interesting side effect of running under strict network conditions.

Labels: bug
