[Bug]: File descriptor leak in platform reconnect loop — adapter sqlite3 connections never closed on failed reconnect - gateway dead after ~12h uptime #37011

@LeanLabsAI

Bug Description

File descriptor leak in platform reconnect loop — adapter sqlite3 connections never closed on failed reconnect

Severity: High (causes gateway to exhaust 2560 fd limit → all platforms fail silently)

Versions affected: 0.15.0, 0.15.1, possibly earlier (introduced by 3b509da57 feat: auto-reconnect failed gateway platforms with exponential backoff)

Steps to Reproduce

Reproduction:

1. Configure a platform (e.g., API_SERVER) with an intentionally failing config (missing API_SERVER_KEY).
2. Start the gateway.
3. Wait for the reconnect loop to run.
4. Watch the fd count climb: lsof -p <gateway PID> | grep response_store | wc -l

After ~15 hours, the gateway hits 2560 fds and becomes non-functional.
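As a rough cross-check of the lsof measurement, the sketch below counts a process's own descriptors from Python and shows that an open sqlite3 connection in WAL mode holds fds until close() is called. It is only an illustration: open_fd_count and measure_sqlite_fds are hypothetical helpers, not part of Hermes, and reading /dev/fd is assumed to be available (macOS, Linux).

```python
import os
import sqlite3
import tempfile


def open_fd_count():
    """Count this process's open file descriptors via /dev/fd (macOS, Linux)."""
    return len(os.listdir("/dev/fd"))


def measure_sqlite_fds(db_path):
    """Return fd counts (before, while open, after close) for one WAL-mode connection."""
    before = open_fd_count()
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
    conn.commit()
    while_open = open_fd_count()
    conn.close()  # releases the descriptors held by the connection
    after = open_fd_count()
    return before, while_open, after


# Create the temp directory first so it does not affect the baseline count.
with tempfile.TemporaryDirectory() as tmp:
    before, while_open, after = measure_sqlite_fds(os.path.join(tmp, "demo.db"))
    print(f"before={before} open={while_open} after_close={after}")
```

The exact number of descriptors per connection depends on the platform and journal mode, so the sketch compares counts rather than asserting a fixed value.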

Expected Behavior

The gateway stays stable: the open fd count does not grow while a platform keeps failing to reconnect.

Actual Behavior

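The fd count grows while the reconnect loop keeps retrying. After ~15 hours, the gateway hits 2560 fds and becomes non-functional (gateway.log shows OSError: [Errno 24] Too many open files).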

Affected Component

Agent Core (conversation loop, context compression, memory)

Messaging Platform (if gateway-related)

No response

Debug Report

https://docs.google.com/document/d/1Hnh_qUiIpQK-XrQxkIorQqpWecbpFLmB8svACKMd07A/edit?usp=sharing

Hermes Debug log

Full report printed below — copy-paste it manually:
[hermes debug share: log content redacted at upload time. run with --no-redact to disable]
--- hermes dump ---
version:          0.15.1 (2026.5.29) [355af2c2]
os:               Darwin 21.6.0 x86_64
python:           3.11.14
openai_sdk:       2.24.0
profile:          default
hermes_home:      ~/.hermes
model:            deepseek/deepseek-v4-flash
provider:         openrouter
terminal:         local
api_keys:
  openrouter           set
  openai               not set
  anthropic            set
  anthropic_token      not set
  nous                 not set
  google/gemini        not set
  gemini               set
  glm/zai              not set
  zai                  not set
  kimi                 not set
  minimax              not set
  deepseek             not set
  dashscope            not set
  huggingface          not set
  nvidia               not set
  opencode_zen         not set
  opencode_go          not set
  kilocode             not set
  firecrawl            not set
  tavily               set
  browserbase          not set
  fal                  set
  elevenlabs           not set
  github               not set
features:
  toolsets:           hermes-cli
  mcp_servers:        0
  memory_provider:    supermemory
  gateway:            unknown
  platforms:          telegram
  cron_jobs:          6 active / 17 total
  skills:             128
config_overrides:
  agent.max_turns: 150
  compression.threshold: 0.8
  display.streaming: True
  fallback_providers: [{'provider': 'openrouter', 'model': 'minimax/minimax-m2.7'}]
--- end dump ---
--- agent.log (last 200 lines) ---
2026-06-01 15:12:04,330 INFO tools.terminal_tool: Cleaned up inactive environment for task: default
2026-06-01 15:12:05,013 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8 * * 1': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:12:05,013 WARNING cron.jobs: Cannot compute next run for cron schedule '0 */4 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:12:05,014 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:12:05,014 WARNING cron.jobs: Cannot compute next run for cron schedule '0 9 * * 1': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:12:05,014 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8,12,16 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:13:05,014 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8 * * 1': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:13:05,015 WARNING cron.jobs: Cannot compute next run for cron schedule '0 */4 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:13:05,015 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:13:05,015 WARNING cron.jobs: Cannot compute next run for cron schedule '0 9 * * 1': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:13:05,016 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8,12,16 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:13:59,762 INFO gateway.memory_monitor: [MEMORY] rss=131MB gc=(239, 0, 0) threads=7 uptime=1200s
2026-06-01 15:14:05,021 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8 * * 1': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:14:05,021 WARNING cron.jobs: Cannot compute next run for cron schedule '0 */4 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:14:05,022 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:14:05,022 WARNING cron.jobs: Cannot compute next run for cron schedule '0 9 * * 1': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:14:05,022 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8,12,16 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:15:05,097 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8 * * 1': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:15:05,097 WARNING cron.jobs: Cannot compute next run for cron schedule '0 */4 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:15:05,097 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:15:05,098 WARNING cron.jobs: Cannot compute next run for cron schedule '0 9 * * 1': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:15:05,098 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8,12,16 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:16:05,247 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8 * * 1': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:16:05,247 WARNING cron.jobs: Cannot compute next run for cron schedule '0 */4 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:16:05,248 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:16:05,248 WARNING cron.jobs: Cannot compute next run for cron schedule '0 9 * * 1': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:16:05,248 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8,12,16 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:16:34,902 INFO gateway.run: Reconnecting api_server (attempt 8)...
2026-06-01 15:16:34,905 ERROR gateway.platforms.api_server: [Api_Server] Refusing to start: API_SERVER_KEY is required for the API server, including loopback-only binds on 127.0.0.1.
2026-06-01 15:16:34,907 INFO gateway.run: Reconnect api_server failed, next retry in 300s
2026-06-01 15:17:05,392 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8 * * 1': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:17:05,393 WARNING cron.jobs: Cannot compute next run for cron schedule '0 */4 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:17:05,393 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:17:05,394 WARNING cron.jobs: Cannot compute next run for cron schedule '0 9 * * 1': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:17:05,394 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8,12,16 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:18:05,533 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8 * * 1': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:18:05,534 WARNING cron.jobs: Cannot compute next run for cron schedule '0 */4 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:18:05,534 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:18:05,534 WARNING cron.jobs: Cannot compute next run for cron schedule '0 9 * * 1': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:18:05,535 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8,12,16 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:18:59,767 INFO gateway.memory_monitor: [MEMORY] rss=131MB gc=(1296, 0, 0) threads=7 uptime=1500s
2026-06-01 15:19:05,539 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8 * * 1': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:19:05,539 WARNING cron.jobs: Cannot compute next run for cron schedule '0 */4 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:19:05,540 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:19:05,540 WARNING cron.jobs: Cannot compute next run for cron schedule '0 9 * * 1': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:19:05,541 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8,12,16 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:20:05,689 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8 * * 1': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:20:05,689 WARNING cron.jobs: Cannot compute next run for cron schedule '0 */4 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:20:05,690 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:20:05,690 WARNING cron.jobs: Cannot compute next run for cron schedule '0 9 * * 1': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:20:05,690 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8,12,16 * * *': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:20:17,801 INFO gateway.platforms.telegram: [Telegram] Flushing text batch agent:main:telegram:dm:8484962442 (1443 chars)
2026-06-01 15:20:17,804 INFO gateway.run: inbound message: platform=telegram user=John Hawkins chat=8484962442 msg='we need to get to generating revenue!! I think the best place to start is an ema'
2026-06-01 15:20:17,830 INFO run_agent: Loaded environment variables from /Users/leanlabsai/.hermes/.env
2026-06-01 15:20:19,516 INFO run_agent: OpenAI client created (agent_init, shared=True) thread=asyncio_0:123145410109440 provider=openai-codex base_url=https://chatgpt.com/backend-api/codex model=gpt-5.5
2026-06-01 15:20:19,540 WARNING agent.auxiliary_client: Auxiliary Nous client unavailable: no Nous authentication found (run: hermes auth).
2026-06-01 15:20:19,540 WARNING agent.auxiliary_client: Auxiliary: marking nous unhealthy for 60s (payment / credit error). Subsequent auxiliary calls will skip it until 15:21:19.
2026-06-01 15:20:19,549 WARNING agent.auxiliary_client: Auxiliary Nous client unavailable: no Nous authentication found (run: hermes auth).
2026-06-01 15:20:19,549 WARNING agent.auxiliary_client: Auxiliary: marking nous unhealthy for 60s (payment / credit error). Subsequent auxiliary calls will skip it until 15:21:19.
2026-06-01 15:20:19,739 WARNING agent.auxiliary_client: Auxiliary Nous client unavailable: no Nous authentication found (run: hermes auth).
2026-06-01 15:20:19,739 WARNING agent.auxiliary_client: Auxiliary: marking nous unhealthy for 60s (payment / credit error). Subsequent auxiliary calls will skip it until 15:21:19.
2026-06-01 15:20:19,749 WARNING agent.auxiliary_client: Auxiliary Nous client unavailable: no Nous authentication found (run: hermes auth).
2026-06-01 15:20:19,750 WARNING agent.auxiliary_client: Auxiliary: marking nous unhealthy for 60s (payment / credit error). Subsequent auxiliary calls will skip it until 15:21:19.
2026-06-01 15:20:19,872 INFO [20260531_132111_70dae7] agent.conversation_loop: conversation turn: session=20260531_132111_70dae7 model=gpt-5.5 provider=openai-codex platform=telegram history=67 msg='[Note: model was just switched from deepseek/deepseek-v4-flash to gpt-5.5 via Op...'
2026-06-01 15:20:19,912 INFO [20260531_132111_70dae7] agent.chat_completion_helpers: Disabling openai-codex no-byte TTFB watchdog for large request (context=~84,416 tokens >= 25000). Waiting for backend response instead. Set HERMES_CODEX_TTFB_STRICT=1 to force early reconnects.
2026-06-01 15:20:19,924 INFO run_agent: OpenAI client created (codex_stream_request, shared=False) thread=Thread-1 (_call):123145478340608 provider=openai-codex base_url=https://chatgpt.com/backend-api/codex model=gpt-5.5
2026-06-01 15:20:26,138 INFO run_agent: OpenAI client closed (request_complete, shared=False, tcp_force_closed=0) thread=Thread-1 (_call):123145478340608 provider=openai-codex base_url=https://chatgpt.com/backend-api/codex model=gpt-5.5
2026-06-01 15:20:26,139 INFO [20260531_132111_70dae7] agent.conversation_loop: API call #1: model=gpt-5.5 provider=openai-codex in=77811 out=89 total=77900 latency=6.3s
2026-06-01 15:20:26,242 INFO [20260531_132111_70dae7] agent.tool_executor: tool skill_view completed (0.10s, 26716 chars)
2026-06-01 15:20:26,265 INFO [20260531_132111_70dae7] agent.chat_completion_helpers: Disabling openai-codex no-byte TTFB watchdog for large request (context=~91,769 tokens >= 25000). Waiting for backend response instead. Set HERMES_CODEX_TTFB_STRICT=1 to force early reconnects.
2026-06-01 15:20:26,278 INFO run_agent: OpenAI client created (codex_stream_request, shared=False) thread=Thread-2 (_call):123145478340608 provider=openai-codex base_url=https://chatgpt.com/backend-api/codex model=gpt-5.5
2026-06-01 15:20:36,540 INFO run_agent: OpenAI client closed (request_complete, shared=False, tcp_force_closed=0) thread=Thread-2 (_call):123145478340608 provider=openai-codex base_url=https://chatgpt.com/backend-api/codex model=gpt-5.5
2026-06-01 15:20:36,540 INFO [20260531_132111_70dae7] agent.conversation_loop: API call #2: model=gpt-5.5 provider=openai-codex in=84547 out=228 total=84775 latency=10.3s cache=77312/84547 (91%)
2026-06-01 15:20:36,547 INFO tools.file_tools: Creating new local environment for task default...
2026-06-01 15:20:36,620 INFO tools.environments.base: Session snapshot created (session=c67bed304161, cwd=/Users/leanlabsai)
2026-06-01 15:20:36,622 INFO tools.file_tools: local environment ready for task default
2026-06-01 15:20:36,873 INFO agent.tool_executor: tool search_files completed (0.33s, 11601 chars)
2026-06-01 15:20:36,897 INFO agent.tool_executor: tool read_file completed (0.35s, 1784 chars)
2026-06-01 15:20:36,938 INFO agent.tool_executor: tool read_file completed (0.39s, 5515 chars)
2026-06-01 15:20:36,941 INFO agent.tool_executor: tool read_file completed (0.40s, 7924 chars)
2026-06-01 15:20:36,975 INFO [20260531_132111_70dae7] agent.chat_completion_helpers: Disabling openai-codex no-byte TTFB watchdog for large request (context=~100,791 tokens >= 25000). Waiting for backend response instead. Set HERMES_CODEX_TTFB_STRICT=1 to force early reconnects.
2026-06-01 15:20:36,988 INFO run_agent: OpenAI client created (codex_stream_request, shared=False) thread=Thread-20 (_call):123145478340608 provider=openai-codex base_url=https://chatgpt.com/backend-api/codex model=gpt-5.5
2026-06-01 15:20:43,425 INFO run_agent: OpenAI client closed (request_complete, shared=False, tcp_force_closed=0) thread=Thread-20 (_call):123145478340608 provider=openai-codex base_url=https://chatgpt.com/backend-api/codex model=gpt-5.5
2026-06-01 15:20:43,425 INFO [20260531_132111_70dae7] agent.conversation_loop: API call #3: model=gpt-5.5 provider=openai-codex in=93694 out=260 total=93954 latency=6.5s cache=84480/93694 (90%)
2026-06-01 15:20:43,693 INFO agent.tool_executor: tool read_file completed (0.26s, 31737 chars)
2026-06-01 15:20:43,699 INFO agent.tool_executor: tool read_file completed (0.27s, 10402 chars)
2026-06-01 15:20:43,705 INFO agent.tool_executor: tool read_file completed (0.28s, 21707 chars)
2026-06-01 15:20:43,708 INFO agent.tool_executor: tool read_file completed (0.28s, 3583 chars)
2026-06-01 15:20:43,741 INFO [20260531_132111_70dae7] agent.chat_completion_helpers: Disabling openai-codex no-byte TTFB watchdog for large request (context=~118,747 tokens >= 25000). Waiting for backend response instead. Set HERMES_CODEX_TTFB_STRICT=1 to force early reconnects.
2026-06-01 15:20:43,755 INFO run_agent: OpenAI client created (codex_stream_request, shared=False) thread=Thread-37 (_call):123145478340608 provider=openai-codex base_url=https://chatgpt.com/backend-api/codex model=gpt-5.5
2026-06-01 15:20:53,517 INFO run_agent: OpenAI client closed (request_complete, shared=False, tcp_force_closed=0) thread=Thread-37 (_call):123145478340608 provider=openai-codex base_url=https://chatgpt.com/backend-api/codex model=gpt-5.5
2026-06-01 15:20:53,518 INFO [20260531_132111_70dae7] agent.conversation_loop: API call #4: model=gpt-5.5 provider=openai-codex in=111054 out=364 total=111418 latency=9.8s cache=93184/111054 (84%)
2026-06-01 15:20:53,521 INFO [20260531_132111_70dae7] agent.tool_executor: tool todo completed (0.00s, 826 chars)
2026-06-01 15:20:53,549 INFO [20260531_132111_70dae7] agent.chat_completion_helpers: Disabling openai-codex no-byte TTFB watchdog for large request (context=~119,893 tokens >= 25000). Waiting for backend response instead. Set HERMES_CODEX_TTFB_STRICT=1 to force early reconnects.
2026-06-01 15:20:53,562 INFO run_agent: OpenAI client created (codex_stream_request, shared=False) thread=Thread-38 (_call):123145478340608 provider=openai-codex base_url=https://chatgpt.com/backend-api/codex model=gpt-5.5
2026-06-01 15:21:05,837 WARNING cron.jobs: Cannot compute next run for cron schedule '0 8 * * 1': 'croniter' is not installed. croniter is a core dependency as of v0.9.x; reinstall hermes-agent or run 'pip install croniter' in your runtime env.
2026-06-01 15:21:05,838 WARNING cron.jobs: Cannot compute next run for cron schedule '0 */4 
--- gateway.log (last 100 lines) ---
         ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/usr/local/Cellar/python@3.14/3.14.3_1/Frameworks/Python.framework/Versions/3.14/lib/python3.14/contextlib.py", line 141, in __enter__
  File "/Users/leanlabsai/.hermes/hermes-agent/hermes_cli/kanban_db.py", line 1111, in _cross_process_init_lock
    handle = lock_path.open("a+b")
  File "/usr/local/Cellar/python@3.14/3.14.3_1/Frameworks/Python.framework/Versions/3.14/lib/python3.14/pathlib/__init__.py", line 771, in open
OSError: [Errno 24] Too many open files: '/Users/leanlabsai/.hermes/kanban.db.init.lock'
2026-06-01 14:53:54,867 INFO gateway.run: Received SIGTERM — initiating shutdown
2026-06-01 14:53:54,867 WARNING gateway.run: Shutdown context: signal=SIGTERM under_systemd=yes parent_pid=1 parent_name=? loadavg_1m=1.51 parent_cmdline='(unknown)'
2026-06-01 14:53:54,868 INFO gateway.run: Stopping gateway...
2026-06-01 14:53:55,687 INFO gateway.run: Sent shutdown notification to home channel telegram:8484962442
2026-06-01 14:53:55,688 INFO gateway.run: Shutdown phase: notify_active_sessions done at +0.82s
2026-06-01 14:53:55,688 INFO gateway.run: Shutdown phase: drain done at +0.82s (drain took 0.00s, timed_out=False, active_at_start=0, active_now=0)
2026-06-01 14:53:56,104 INFO gateway.platforms.telegram: [Telegram] Disconnected from Telegram
2026-06-01 14:53:56,104 INFO gateway.run: ✓ telegram disconnected (0.42s)
2026-06-01 14:53:56,105 INFO gateway.run: Shutdown phase: all adapters disconnected at +1.24s
2026-06-01 14:53:56,106 INFO gateway.run: Shutdown phase: final-cleanup tool kill done at +1.24s
2026-06-01 14:53:56,107 INFO gateway.run: Shutdown phase: SessionDB close done at +1.24s
2026-06-01 14:53:56,109 INFO gateway.run: Gateway stopped (total teardown 1.24s)
2026-06-01 14:53:56,109 INFO gateway.run: Cron ticker stopped
2026-06-01 14:53:56,110 INFO gateway.memory_monitor: [MEMORY] shutdown rss=404MB gc=(1194, 0, 0) threads=7 uptime=93865s
2026-06-01 14:53:56,110 INFO gateway.memory_monitor: [MEMORY] Periodic memory monitoring stopped
2026-06-01 14:53:56,110 INFO gateway.run: Exiting with code 1 (signal-initiated shutdown without restart request) so systemd Restart=on-failure can revive the gateway.
┌─────────────────────────────────────────────────────────┐
│           ⚕ Hermes Gateway Starting...                 │
├─────────────────────────────────────────────────────────┤
│  Messaging platforms + cron scheduler                    │
│  Press Ctrl+C to stop                                   │
└─────────────────────────────────────────────────────────┘
📦 Preflight compression: ~252,801 tokens >= 217,600 threshold. This may take a moment.
🗜️ Compacting context — summarizing earlier conversation so I can continue...
⚠ Compression summary failed: Error code: 400 - {'detail': "The 'google/gemini-3-flash-preview' model is not supported when using Codex with a ChatGPT account."}. Inserted a fallback context marker.
2026-06-01 14:53:59,701 INFO gateway.memory_monitor: [MEMORY] baseline rss=82MB gc=(136, 0, 0) threads=1 uptime=0s
2026-06-01 14:53:59,701 INFO gateway.memory_monitor: [MEMORY] Periodic memory monitoring started (interval: 300s)
2026-06-01 14:54:00,133 INFO gateway.run: Starting Hermes Gateway...
2026-06-01 14:54:00,133 INFO gateway.run: Session storage: /Users/leanlabsai/.hermes/sessions
2026-06-01 14:54:00,135 INFO gateway.run: Agent budget: max_iterations=150 (agent.max_turns from config.yaml, or HERMES_MAX_ITERATIONS from .env, or default 90)
2026-06-01 14:54:00,135 INFO gateway.run: Secret redaction: ENABLED (tool output, logs, and chat responses are scrubbed before delivery)
2026-06-01 14:54:00,143 INFO gateway.run: Previous gateway exited cleanly — skipping session suspension
2026-06-01 14:54:00,486 INFO gateway.run: Connecting to telegram...
2026-06-01 14:54:00,699 INFO gateway.platforms.telegram: [Telegram] Auto-discovered Telegram fallback IPs: 149.154.166.110
2026-06-01 14:54:00,719 INFO gateway.platforms.telegram: [Telegram] Telegram fallback IPs active: 149.154.166.110
2026-06-01 14:54:02,035 INFO gateway.platforms.telegram: [Telegram] set_my_commands OK for scope BotCommandScopeDefault (30 cmds)
2026-06-01 14:54:02,304 INFO gateway.platforms.telegram: [Telegram] set_my_commands OK for scope BotCommandScopeAllPrivateChats (30 cmds)
2026-06-01 14:54:02,576 INFO gateway.platforms.telegram: [Telegram] set_my_commands OK for scope BotCommandScopeAllGroupChats (30 cmds)
2026-06-01 14:54:02,576 INFO gateway.platforms.telegram: [Telegram] Telegram menu: 30 commands registered, 142 hidden (over 30 limit). Use /commands for full list.
2026-06-01 14:54:02,579 INFO gateway.platforms.telegram: [Telegram] Connected to Telegram (polling mode)
2026-06-01 14:54:02,582 INFO gateway.run: ✓ telegram connected
2026-06-01 14:54:02,590 INFO gateway.run: Connecting to api_server...
2026-06-01 14:54:02,594 ERROR gateway.platforms.api_server: [Api_Server] Refusing to start: API_SERVER_KEY is required for the API server, including loopback-only binds on 127.0.0.1.
2026-06-01 14:54:02,594 WARNING gateway.run: ✗ api_server failed to connect
2026-06-01 14:54:02,596 INFO gateway.platforms.api_server: [Api_Server] API server stopped
2026-06-01 14:54:02,598 INFO gateway.run: Gateway running with 1 platform(s)
2026-06-01 14:54:02,603 INFO gateway.run: Channel directory built: 1 target(s)
2026-06-01 14:54:03,605 INFO gateway.run: Starting reconnection watcher for 1 failed platform(s): api_server
2026-06-01 14:54:03,606 INFO gateway.run: Press Ctrl+C to stop
2026-06-01 14:54:03,609 INFO gateway.run: Cron ticker started (interval=60s)
2026-06-01 14:54:08,610 INFO gateway.run: kanban dispatcher: embedded in gateway (interval=60.0s)
2026-06-01 14:54:33,627 INFO gateway.run: Reconnecting api_server (attempt 2)...
2026-06-01 14:54:33,630 ERROR gateway.platforms.api_server: [Api_Server] Refusing to start: API_SERVER_KEY is required for the API server, including loopback-only binds on 127.0.0.1.
2026-06-01 14:54:33,632 INFO gateway.run: Reconnect api_server failed, next retry in 60s
2026-06-01 14:55:33,676 INFO gateway.run: Reconnecting api_server (attempt 3)...
2026-06-01 14:55:33,678 ERROR gateway.platforms.api_server: [Api_Server] Refusing to start: API_SERVER_KEY is required for the API server, including loopback-only binds on 127.0.0.1.
2026-06-01 14:55:33,681 INFO gateway.run: Reconnect api_server failed, next retry in 120s
2026-06-01 14:57:33,792 INFO gateway.run: Reconnecting api_server (attempt 4)...
2026-06-01 14:57:33,795 ERROR gateway.platforms.api_server: [Api_Server] Refusing to start: API_SERVER_KEY is required for the API server, including loopback-only binds on 127.0.0.1.
2026-06-01 14:57:33,797 INFO gateway.run: Reconnect api_server failed, next retry in 240s
2026-06-01 14:58:59,757 INFO gateway.memory_monitor: [MEMORY] rss=128MB gc=(441, 0, 0) threads=7 uptime=300s
2026-06-01 15:01:34,026 INFO gateway.run: Reconnecting api_server (attempt 5)...
2026-06-01 15:01:34,029 ERROR gateway.platforms.api_server: [Api_Server] Refusing to start: API_SERVER_KEY is required for the API server, including loopback-only binds on 127.0.0.1.
2026-06-01 15:01:34,031 INFO gateway.run: Reconnect api_server failed, next retry in 300s
2026-06-01 15:03:59,741 INFO gateway.memory_monitor: [MEMORY] rss=130MB gc=(1464, 0, 0) threads=7 uptime=600s
2026-06-01 15:06:34,307 INFO gateway.run: Reconnecting api_server (attempt 6)...
2026-06-01 15:06:34,310 ERROR gateway.platforms.api_server: [Api_Server] Refusing to start: API_SERVER_KEY is required for the API server, including loopback-only binds on 127.0.0.1.
2026-06-01 15:06:34,312 INFO gateway.run: Reconnect api_server failed, next retry in 300s
2026-06-01 15:08:59,778 INFO gateway.memory_monitor: [MEMORY] rss=131MB gc=(753, 0, 0) threads=7 uptime=900s
2026-06-01 15:11:34,559 INFO gateway.run: Reconnecting api_server (attempt 7)...
2026-06-01 15:11:34,562 ERROR gateway.platforms.api_server: [Api_Server] Refusing to start: API_SERVER_KEY is required for the API server, including loopback-only binds on 127.0.0.1.
2026-06-01 15:11:34,564 INFO gateway.run: Reconnect api_server failed, next retry in 300s
2026-06-01 15:13:59,762 INFO gateway.memory_monitor: [MEMORY] rss=131MB gc=(239, 0, 0) threads=7 uptime=1200s
2026-06-01 15:16:34,902 INFO gateway.run: Reconnecting api_server (attempt 8)...
2026-06-01 15:16:34,905 ERROR gateway.platforms.api_server: [Api_Server] Refusing to start: API_SERVER_KEY is required for the API server, including loopback-only binds on 127.0.0.1.
2026-06-01 15:16:34,907 INFO gateway.run: Reconnect api_server failed, next retry in 300s
2026-06-01 15:18:59,767 INFO gateway.memory_monitor: [MEMORY] rss=131MB gc=(1296, 0, 0) threads=7 uptime=1500s
2026-06-01 15:20:17,801 INFO gateway.platforms.telegram: [Telegram] Flushing text batch agent:main:telegram:dm:8484962442 (1443 chars)
2026-06-01 15:20:17,804 INFO gateway.run: inbound message: platform=telegram user=John Hawkins chat=8484962442 msg='we need to get to generating revenue!! I think the best place to start is an ema'
2026-06-01 15:21:38,695 INFO gateway.run: Reconnecting api_server (attempt 9)...
2026-06-01 15:21:38,698 ERROR gateway.platforms.api_server: [Api_Server] Refusing to start: API_SERVER_KEY is required for the API server, including loopback-only binds on 127.0.0.1.
2026-06-01 15:21:38,701 INFO gateway.run: Reconnect api_server failed, next retry in 300s
2026-06-01 15:22:52,286 INFO gateway.run: response ready: platform=telegram chat=8484962442 time=154.5s api_calls=7 response=9980 chars
2026-06-01 15:22:52,301 INFO gateway.platforms.base: [Telegram] Sending response (9980 chars) to 8484962442
2026-06-01 15:23:59,751 INFO gateway.memory_monitor: [MEMORY] rss=257MB gc=(666, 0, 0) threads=9 uptime=1800s
2026-06-01 15:26:38,932 INFO gateway.run: Reconnecting api_server (attempt 10)...
2026-06-01 15:26:38,935 ERROR gateway.platforms.api_server: [Api_Server] Refusing to start: API_SERVER_KEY is required for the API server, including loopback-only binds on 127.0.0.1.
2026-06-01 15:26:38,938 INFO gateway.run: Reconnect api_server failed, next retry in 300s
2026-06-01 15:28:59,709 INFO gateway.memory_monitor: [MEMORY] rss=258MB gc=(1773, 0, 0) threads=9 uptime=2100s
2026-06-01 15:31:39,235 INFO gateway.run: Reconnecting api_server (attempt 11)...
2026-06-01 15:31:39,237 ERROR gateway.platforms.api_server: [Api_Server] Refusing to start: API_SERVER_KEY is required for the API server, including loopback-only binds on 127.0.0.1.
2026-06-01 15:31:39,239 INFO gateway.run: Reconnect api_server failed, next retry in 300s
2026-06-01 15:33:59,691 INFO gateway.memory_monitor: [MEMORY] rss=258MB gc=(1053, 0, 0) threads=9 uptime=2400s

Operating System

macOS 12.7.6

Python Version

3.11.14

Hermes Version

0.15.1

Additional Logs / Traceback (optional)

Root Cause Analysis (optional)

When a platform fails to connect and is queued for reconnect, the loop creates a fresh adapter on every retry attempt:

python
adapter = self._create_adapter(platform, platform_config)


For APIServerAdapter, __init__ creates a ResponseStore(), which calls sqlite3.connect() and holds 2 fds (the db file and its WAL file). When _connect_adapter_with_timeout(adapter, platform) fails (3 failure paths, lines 5983-6033), the new adapter is simply dropped: it is never disconnected, closed, or otherwise cleaned up. The old adapter's disconnect() was already called when it initially failed, so the leak comes entirely from the adapters that the retry loop creates and then discards.

Three paths leak:
1. Non-retryable error (line 5983): _connect_adapter_with_timeout returns False with has_fatal_error set. The adapter reference is lost on continue.
2. Retryable failure (lines 5995-6008): _connect_adapter_with_timeout returns False but the error is retryable. The adapter reference is lost; the next retry follows in 30s/60s/.../300s.
3. Exception during reconnect (lines 6017-6033): _connect_adapter_with_timeout raises. The adapter variable created at line 5946 is never referenced again.
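
The effect of never closing these stores can be reproduced outside the gateway. The sketch below is a minimal illustration, not the gateway's code: FakeResponseStore, open_fd_count, and measure are hypothetical names, and the store is assumed to open one WAL-mode sqlite3 connection as described above. The list that holds the stores stands in for whatever keeps the discarded adapters reachable in the real process, which this report does not identify.

```python
import os
import sqlite3
import tempfile


def open_fd_count() -> int:
    """Count this process's open file descriptors (Linux and macOS)."""
    fd_dir = "/dev/fd" if os.path.isdir("/dev/fd") else "/proc/self/fd"
    return len(os.listdir(fd_dir))


class FakeResponseStore:
    """Hypothetical stand-in for ResponseStore: one WAL-mode sqlite3 connection."""

    def __init__(self, path: str) -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute("PRAGMA journal_mode=WAL")
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS responses (id TEXT PRIMARY KEY)"
        )
        self._conn.commit()

    def close(self) -> None:
        self._conn.close()


def measure(attempts: int = 5) -> tuple:
    """Return fd counts (baseline, after failed attempts, after cleanup)."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "response_store.db")
        baseline = open_fd_count()
        # Each simulated reconnect attempt builds a store that nobody closes.
        # Keeping them in a list models adapters that stay reachable.
        stores = [FakeResponseStore(db_path) for _ in range(attempts)]
        leaked = open_fd_count()
        # Explicit close() is what the failure paths are missing.
        for store in stores:
            store.close()
        cleaned = open_fd_count()
    return baseline, leaked, cleaned


if __name__ == "__main__":
    baseline, leaked, cleaned = measure()
    print(f"baseline={baseline} after_attempts={leaked} after_close={cleaned}")
```

The exact number of descriptors per connection depends on the SQLite build and journal mode, so the sketch only compares counts before and after close() rather than asserting a fixed per-attempt figure.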

Impact: 2 leaked fds per reconnect attempt. At the 300s backoff cap that is 12 attempts/hour, so ~24 leaked fds/hour, or ~576 fds/day. On a 26-hour uptime this hit the 2560 fd limit, causing:
- OSError: [Errno 24] Too many open files on every new file operation
- Telegram connection silently drops — no error response to user, no error in chat
- Kanban dispatcher, channel directory, auth.json, terminal cleanup all fail with the same OSError
- The gateway process survives in a zombie state, unable to open any file
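
Because the failure is silent until the limit is reached, it may help to watch descriptor usage against the process limit while reproducing. The following is a small sketch using only the standard library; fd_headroom and fd_usage_warning are hypothetical helpers, not part of Hermes, and the 80% threshold is an arbitrary assumption.

```python
import os
import resource
from typing import Optional


def fd_headroom() -> tuple:
    """Return (open_fds, soft_limit) for the current process."""
    fd_dir = "/dev/fd" if os.path.isdir("/dev/fd") else "/proc/self/fd"
    open_fds = len(os.listdir(fd_dir))
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    return open_fds, soft_limit


def fd_usage_warning(threshold: float = 0.8) -> Optional[str]:
    """Return a warning string once usage reaches threshold * soft limit."""
    open_fds, soft_limit = fd_headroom()
    if soft_limit == resource.RLIM_INFINITY:
        return None  # no finite limit to compare against
    if open_fds >= threshold * soft_limit:
        return f"fd usage high: {open_fds}/{soft_limit} open"
    return None


if __name__ == "__main__":
    print(fd_headroom())
    print(fd_usage_warning())
```

A check like this could be logged next to the existing [MEMORY] lines so that a climbing fd count shows up long before Errno 24 does.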

Proposed Fix (optional)

Two fixes needed:

Fix 1 (reconnect loop): Close the newly created adapter in all failure paths before continuing:

python
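# After any failed reconnect, clean up the unused adapter.
async def _dispose_failed_adapter(adapter, platform):
    # Best-effort: a cleanup error must not break the reconnect loop.
    try:
        await adapter.disconnect()
    except Exception:
        pass
    # Close the store directly in case disconnect() did not (see Fix 2).
    # getattr also covers a store that disconnect() already set to None.
    store = getattr(adapter, '_response_store', None)
    if store is not None:
        try:
            store.close()
        except Exception:
            pass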


Then call await _dispose_failed_adapter(adapter, platform) at lines 5994, 6008, and 6030 (all three failure paths).
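
As an alternative to three separate call sites, the cleanup could be attached to the attempt itself with try/finally, so that any new failure path is covered automatically. This is only a sketch under assumptions: FakeAdapter, dispose_failed_adapter, and reconnect_once are hypothetical stand-ins, and the real loop's backoff, logging, and fatal-error handling are omitted.

```python
import asyncio


class FakeAdapter:
    """Hypothetical adapter whose connect() outcome is configurable."""

    def __init__(self, mode: str) -> None:
        self.mode = mode  # "ok", "fail", or "raise"
        self.closed = False

    async def connect(self) -> bool:
        if self.mode == "raise":
            raise RuntimeError("connect blew up")
        return self.mode == "ok"

    async def disconnect(self) -> None:
        self.closed = True


async def dispose_failed_adapter(adapter) -> None:
    """Best-effort cleanup that never raises into the reconnect loop."""
    try:
        await adapter.disconnect()
    except Exception:
        pass


async def reconnect_once(create_adapter):
    """Return a connected adapter, or None after disposing the failed one."""
    adapter = create_adapter()
    connected = False
    try:
        connected = await adapter.connect()
    except Exception:
        connected = False  # the real loop would log this and schedule a retry
    finally:
        # Runs for a False return and for an exception alike.
        if not connected:
            await dispose_failed_adapter(adapter)
    return adapter if connected else None


if __name__ == "__main__":
    result = asyncio.run(reconnect_once(lambda: FakeAdapter("fail")))
    print(result)  # None: the failed adapter was disposed
```

The trade-off is that the finally block treats every unsuccessful attempt the same way, so any per-path behaviour (fatal versus retryable) still has to be decided by the caller.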

Fix 2 (APIServerAdapter.disconnect): disconnect() should also close the ResponseStore. Currently it only stops the aiohttp web server and does not call self._response_store.close(), meaning even a clean shutdown leaks that connection.

python
# In APIServerAdapter.disconnect():
if self._response_store is not None:
    self._response_store.close()
    self._response_store = None
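
Since Fix 1 may call disconnect() on an adapter that is later disconnected again, it seems worth making the store cleanup idempotent. A minimal sketch follows, assuming the store wraps a single sqlite3 connection; ResponseStoreSketch and APIServerAdapterSketch are hypothetical stand-ins, and stopping the aiohttp web server is left out.

```python
import asyncio
import sqlite3


class ResponseStoreSketch:
    """Hypothetical store holding one sqlite3 connection."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)

    def close(self) -> None:
        self._conn.close()


class APIServerAdapterSketch:
    """Hypothetical adapter; only the store handling is modelled."""

    def __init__(self) -> None:
        self._response_store = ResponseStoreSketch()

    async def disconnect(self) -> None:
        # Detach first so a second call (or a failing close) cannot
        # touch the same store again.
        store, self._response_store = self._response_store, None
        if store is not None:
            store.close()


if __name__ == "__main__":
    adapter = APIServerAdapterSketch()
    asyncio.run(adapter.disconnect())
    asyncio.run(adapter.disconnect())  # second call is a no-op
    print(adapter._response_store)
```
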

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

Metadata

Assignees

No one assigned

    Labels

    P1: High, major feature broken, no workaround
    comp/gateway: Gateway runner, session dispatch, delivery
    type/bug: Something isn't working
