Issue: Memory Leak in Attach Web Page Function Due to Null Bytes in Postgres Embeddings #19867

@fgonzalez-glmc

Description

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

0.60.41

Ollama Version (if applicable)

No response

Operating System

Debian

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

The "attach web page" function should work correctly and not cause a memory leak. Null bytes should be handled correctly.

Actual Behavior

The "attach web page" function in Open WebUI creates a memory leak when attaching some web pages, though not all of them. When you attach a web page, Open WebUI tries to convert and embed the results, but sometimes the results contain null bytes. Postgres does not accept them, the insert fails, and a memory leak follows: RAM usage jumps, for example by about 100 MB.

Steps to Reproduce

Use the Open WebUI documentation page for environment variables as a test case: https://docs.openwebui.com/getting-started/env-configuration/

Actual Behavior

  • PostgreSQL rejects the data due to null bytes in the text content
  • A ValueError is raised: A string literal cannot contain NUL (0x00) characters.
  • Memory usage spikes by ~100 MB
  • The web page attachment fails
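
One possible mitigation for the behavior listed above is to strip NUL characters from each chunk before it is handed to the vector database. The following is only a minimal sketch and not Open WebUI's actual code: `strip_nul` and the sample `chunk` dict are hypothetical names, and the `text` / `vmetadata` keys are borrowed from the column names visible in the INSERT statement in the logs.

```python
from typing import Any

NUL = "\x00"


def strip_nul(value: Any) -> Any:
    """Recursively remove NUL (0x00) characters from every string in value.

    Hypothetical helper: strings are cleaned, dicts/lists/tuples are walked,
    and anything else (numbers, None, ...) is returned unchanged.
    """
    if isinstance(value, str):
        return value.replace(NUL, "")
    if isinstance(value, dict):
        return {strip_nul(k): strip_nul(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return type(value)(strip_nul(v) for v in value)
    return value


# Hypothetical chunk; the NUL characters are injected here for illustration.
chunk = {
    "text": "WEBUI_URL\x00 setting",
    "vmetadata": {"source": "env-configuration\x00", "tags": ["docs\x00"]},
}
clean = strip_nul(chunk)
```

Applying something like this right before the insert would keep the sanitization in one place, at the cost of silently dropping characters from the stored text.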

Related Issues

This is similar to an issue that was fixed recently for native web search in Open WebUI: some web search results contained null bytes in the embedding, and Postgres rejected them.

Steps to Reproduce

  1. Start a chat.
  2. Attach a web page such as https://docs.openwebui.com/getting-started/env-configuration/
  3. Send a message and observe the memory usage and the logs.
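
To confirm that NUL characters are present in the extracted text before it reaches Postgres, a small check along these lines could be run over the chunk texts. This is a sketch under assumptions: `find_nul_chunks` is a hypothetical helper and the sample strings are made up, not taken from the page above.

```python
def find_nul_chunks(texts: list[str]) -> list[tuple[int, int]]:
    """Return (chunk_index, nul_count) for every chunk that contains NUL."""
    return [(i, t.count("\x00")) for i, t in enumerate(texts) if "\x00" in t]


# Made-up sample data: only the second chunk carries NUL characters.
sample = ["clean chunk", "bad\x00chunk\x00", "also clean"]
hits = find_nul_chunks(sample)
print(hits)  # [(1, 2)]
```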

Logs & Screenshots

Error Logs

2025-12-10 15:49:17.688 | INFO     | open_webui.routers.retrieval:save_docs_to_vector_db:1426 - adding to collection 0e24a0680a3ccfb58e6382fcf4af45c17ceeae11e59914b2b106ce5b973e143
2025-12-10 15:49:18.845 | ERROR    | open_webui.retrieval.vector.dbs.pgvector:insert:358 - Error during insert: A string literal cannot contain NUL (0x00) characters.
Traceback (most recent call last):
 
  File "/usr/local/lib/python3.11/threading.py", line 1002, in _bootstrap
    self._bootstrap_inner()
    │    └ <function Thread._bootstrap_inner at 0xec98188c49a0>
    └ <WorkerThread(AnyIO worker thread, started 260136564093344)>
  File "/usr/local/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
    │    └ <function WorkerThread.run at 0xec97ee6efec0>
    └ <WorkerThread(AnyIO worker thread, started 260136564093344)>
  File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 986, in run
    result = context.run(func, *args)
             │       │   │      └ ()
             │       │   └ functools.partial(<function save_docs_to_vector_db at 0xec97f0ab4e00>, <starlette.requests.Request object at 0xec97c382c150>,...
             │       └ <method 'run' of '_contextvars.Context' objects>
             └ <_contextvars.Context object at 0xec97c849f300>
 
  File "/app/backend/open_webui/routers/retrieval.py", line 1427, in save_docs_to_vector_db
    VECTOR_DB_CLIENT.insert(
    │                └ <function PgvectorClient.insert at 0xec97f4f99d00>
    └ <open_webui.retrieval.vector.dbs.pgvector.PgvectorClient object at 0xec97f4f0aa10>
 
> File "/app/backend/open_webui/retrieval/vector/dbs/pgvector.py", line 351, in insert
    self.session.bulk_save_objects(new_items)
    │    │       │                 └ [<open_webui.retrieval.vector.dbs.pgvector.DocumentChunk object at 0xec97a787acd0>, <open_webui.retrieval.vector.dbs.pgvector...
    │    │       └ <function scoped_session.bulk_save_objects at 0xec97f6490400>
    │    └ <sqlalchemy.orm.scoping.scoped_session object at 0xec97f4f0f890>
    └ <open_webui.retrieval.vector.dbs.pgvector.PgvectorClient object at 0xec97f4f0aa10>
 
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/scoping.py", line 1359, in bulk_save_objects
    return self._proxied.bulk_save_objects(
           │    └ <property object at 0xec97f6439170>
           └ <sqlalchemy.orm.scoping.scoped_session object at 0xec97f4f0f890>
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 4586, in bulk_save_objects
    self._bulk_save_mappings(
    │    └ <function Session._bulk_save_mappings at 0xec97f6601580>
    └ <sqlalchemy.orm.session.Session object at 0xec97c203bad0>
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 4763, in _bulk_save_mappings
    with util.safe_reraise():
         │    └ <class 'sqlalchemy.util.langhelpers.safe_reraise'>
         └ <module 'sqlalchemy.util' from '/usr/local/lib/python3.11/site-packages/sqlalchemy/util/__init__.py'>
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/util/langhelpers.py", line 146, in __exit__
    raise exc_value.with_traceback(exc_tb)
          │         │              └ <traceback object at 0xec97c39bc3c0>
          │         └ <method 'with_traceback' of 'BaseException' objects>
          └ ValueError('A string literal cannot contain NUL (0x00) characters.')
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 4752, in _bulk_save_mappings
    bulk_persistence._bulk_insert(
    │                └ <function _bulk_insert at 0xec97f6578f40>
    └ <module 'sqlalchemy.orm.bulk_persistence' from '/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/bulk_persistence.py'>
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/bulk_persistence.py", line 222, in _bulk_insert
    result = persistence._emit_insert_statements(
             │           └ <function _emit_insert_statements at 0xec97f6578900>
             └ <module 'sqlalchemy.orm.persistence' from '/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/persistence.py'>
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/persistence.py", line 1048, in _emit_insert_statements
    result = connection.execute(
             │          └ <function Connection.execute at 0xec9815b3f7e0>
             └ <sqlalchemy.engine.base.Connection object at 0xec97a6898210>
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1416, in execute
    return meth(
           └ <bound method ClauseElement._execute_on_connection of <sqlalchemy.sql.dml.Insert object at 0xec97c94a1d10>>
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/sql/elements.py", line 516, in _execute_on_connection
    return connection._execute_clauseelement(
           │          └ <function Connection._execute_clauseelement at 0xec9815b3fb00>
           └ <sqlalchemy.engine.base.Connection object at 0xec97a6898210>
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1638, in _execute_clauseelement
    ret = self._execute_context(
          │    └ <function Connection._execute_context at 0xec9815b3fce0>
          └ <sqlalchemy.engine.base.Connection object at 0xec97a6898210>
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1841, in _execute_context
    return self._exec_insertmany_context(dialect, context)
           │    │                        │        └ <sqlalchemy.dialects.postgresql.psycopg2.PGExecutionContext_psycopg2 object at 0xec97a68c6250>
           │    │                        └ <sqlalchemy.dialects.postgresql.psycopg2.PGDialect_psycopg2 object at 0xec97f4f77ed0>
           │    └ <function Connection._exec_insertmany_context at 0xec9815b3fe20>
           └ <sqlalchemy.engine.base.Connection object at 0xec97a6898210>
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2123, in _exec_insertmany_context
    self._handle_dbapi_exception(
    │    └ <function Connection._handle_dbapi_exception at 0xec9815b38040>
    └ <sqlalchemy.engine.base.Connection object at 0xec97a6898210>
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2355, in _handle_dbapi_exception
    raise exc_info[1].with_traceback(exc_info[2])
          │                          └ (<class 'ValueError'>, ValueError('A string literal cannot contain NUL (0x00) characters.'), <traceback object at 0xec97c1ede...
          └ (<class 'ValueError'>, ValueError('A string literal cannot contain NUL (0x00) characters.'), <traceback object at 0xec97c1ede...
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2115, in _exec_insertmany_context
    dialect.do_execute(
    │       └ <function DefaultDialect.do_execute at 0xec981586fc40>
    └ <sqlalchemy.dialects.postgresql.psycopg2.PGDialect_psycopg2 object at 0xec97f4f77ed0>
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 942, in do_execute
    cursor.execute(statement, parameters)
    │      │       │          └ {'id__0': '7e45a657-fde5-4095-9b94-3fa6ccb0a302', 'collection_name__0': '0e24a0680a3ccfb58e6382fcf4af45c17ceeae11e59914b2b106...
    │      │       └ 'INSERT INTO document_chunk (id, vector, collection_name, text, vmetadata) VALUES (%(id__0)s, %(vector__0)s, %(collection_nam...
    │      └ <method 'execute' of 'psycopg2.extensions.cursor' objects>
    └ <cursor object at 0xec97c9ef71f0; closed: -1>
 
ValueError: A string literal cannot contain NUL (0x00) characters.
2025-12-10 15:49:18.956 | ERROR    | open_webui.routers.retrieval:save_docs_to_vector_db:1435 - A string literal cannot contain NUL (0x00) characters.
 
--- Duplicate traceback block (26 lines) ---
 
ValueError: A string literal cannot contain NUL (0x00) characters.
2025-12-10 15:49:19.068 | ERROR    | open_webui.routers.retrieval:process_web:1757 - A string literal cannot contain NUL (0x00) characters.
 
--- Duplicate traceback block (106 lines) ---
 
ValueError: A string literal cannot contain NUL (0x00) characters.
2025-12-10 15:49:19.182 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 15.157.52.23:0 - "POST /api/v1/retrieval/process/web HTTP/1.1" 400
2025-12-10 15:49:38.734 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 15.157.52.23:0 - "GET /_app/version.json HTTP/1.1" 200
2025-12-10 15:49:45.195 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 45.73.69.53:0 - "GET /_app/version.json HTTP/1.1" 200

Additional Information

No response

Metadata

Assignees

No one assigned

    Labels

    bug (Something isn't working), testing wanted (Testing from the community is needed)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
