
fix: preserve http_auth in _safe_deepcopy_config for OpenSearch (#3580)#4418

Merged
kartik-mem0 merged 1 commit into main from fix/safe-deepcopy-preserve-http-auth
Mar 19, 2026

@utkarsh240799 utkarsh240799 commented Mar 19, 2026

Description

_safe_deepcopy_config() used broad substring matching to sanitize sensitive fields during telemetry config cloning. The token "auth" matched http_auth, and "connection_class" was listed explicitly; both are runtime objects required by OpenSearch's AWS SigV4 authentication (AWSV4SignerAuth, RequestsHttpConnection). When deepcopy failed on these objects, which cannot be deep-copied, the fallback path set them to None, causing AuthorizationException(403, '') errors.

Root cause

sensitive_tokens = ("auth", "credential", "password", "token", "secret", "key", "connection_class")
for field_name in list(clone_dict.keys()):
    if any(token in field_name.lower() for token in sensitive_tokens):
        clone_dict[field_name] = None  # ← nullifies http_auth, connection_class
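
To make the over-matching concrete, here is a minimal sketch that wraps the check above in a helper and runs it against a few field names. The helper name `old_is_sensitive` and the sample list are assumptions for illustration, not code from the repository.

```python
# Token tuple copied from the snippet above.
SENSITIVE_TOKENS = ("auth", "credential", "password", "token", "secret", "key", "connection_class")


def old_is_sensitive(field_name: str) -> bool:
    """Hypothetical helper: True if any token appears anywhere in the name."""
    lowered = field_name.lower()
    return any(token in lowered for token in SENSITIVE_TOKENS)


# Sample names chosen for illustration; several appear in the test list of this PR.
field_names = ["http_auth", "connection_class", "primary_key", "tokenizer", "api_key", "collection_name"]

# Substring matching flags runtime objects and harmless names along with real secrets.
flagged = [name for name in field_names if old_is_sensitive(name)]
print(flagged)
```

Only `api_key` in this sample is an actual secret; the other flagged names are caught because they contain "auth", "key", or "token" as a substring, or are listed explicitly.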

Fix

Replaced the broad substring sanitizer with a 3-layer field matching system:

  1. Allowlist (_RUNTIME_FIELDS) — runtime objects like http_auth, auth, connection_class, ssl_context are always preserved (highest priority)
  2. Exact deny (_SENSITIVE_FIELDS_EXACT) — 19 known secret field names (api_key, password, secret_key, auth_client_secret, etc.)
  3. Suffix deny (_SENSITIVE_SUFFIXES) — catches patterns like db_password, client_secret, oauth_token

Also changed model_dump(mode="json") to model_dump() to preserve actual Python objects instead of relying on a PydanticSerializationError fallback to the __dict__ path.
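
The difference between the two dump modes can be seen in a small sketch, assuming Pydantic v2 is installed. `FakeSigner` and `DemoConfig` are hypothetical stand-ins, not the project's `OpenSearchConfig` or the real `AWSV4SignerAuth`.

```python
from typing import Any

from pydantic import BaseModel
from pydantic_core import PydanticSerializationError


class FakeSigner:
    """Hypothetical stand-in for a runtime auth object."""


class DemoConfig(BaseModel):
    """Hypothetical config model with an untyped runtime field."""

    host: str
    http_auth: Any = None


signer = FakeSigner()
config = DemoConfig(host="localhost", http_auth=signer)

# Python mode keeps the field value as the original object.
python_dump = config.model_dump()

# JSON mode has no serializer for FakeSigner, so the dump raises.
json_mode_failed = False
try:
    config.model_dump(mode="json")
except PydanticSerializationError:
    json_mode_failed = True

print(python_dump["http_auth"] is signer, json_mode_failed)
```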

Removed a dead _safe_deepcopy_config call in the sync Memory.__init__ whose result was immediately overwritten.

Fixes #3580

Type of change

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

Unit tests (106 total, all passing)

New test file tests/memory/test_safe_deepcopy_config.py — 94 tests covering:

  • _is_sensitive_field() allowlist, exact deny, suffix deny, case insensitivity, edge cases
  • Over-matching prevention: primary_key, partition_key, monkey, keyboard, tokenizer, authenticate, credentials_path are correctly NOT redacted
  • Real-world field names from 20+ vector store configs (OpenSearch, Weaviate, Databricks, Pinecone, Qdrant, Vertex AI, Azure, AWS, etc.)
  • _safe_deepcopy_config() integration with plain classes, Pydantic BaseModel, and dataclasses
  • End-to-end: real Pydantic model with mock AWSV4SignerAuth verifying result.http_auth is auth (actual object identity preserved)

Updated tests/vector_stores/test_opensearch.py — flipped assertions from is None to is not None for http_auth, auth, connection_class while keeping credentials is None

Manual verification

Reproduced the exact scenario from #3580 using real OpenSearchConfig with mock AWSV4SignerAuth (thread lock, raises on __deepcopy__). Verified both sync and async telemetry flows preserve auth objects end-to-end.

=== AFTER _safe_deepcopy_config ===
  http_auth:        PRESERVED (actual auth object)
  connection_class: PRESERVED (actual class reference)
  password:         REDACTED (None)
  api_key:          REDACTED (None)
  collection_name:  MUTABLE (can be overridden for telemetry)
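
A stripped-down version of this scenario can be reproduced with the standard library alone. Everything below is a hypothetical sketch: `FakeSignerAuth`, `FakeConfig`, `SECRET_FIELDS`, and `clone_for_telemetry` are illustrative names, and the fallback is a simplification of what `_safe_deepcopy_config` does.

```python
import copy
import threading


class FakeSignerAuth:
    """Hypothetical auth object: holds a thread lock and refuses deepcopy."""

    def __init__(self):
        self._lock = threading.Lock()

    def __deepcopy__(self, memo):
        raise TypeError("FakeSignerAuth cannot be deep-copied")


class FakeConfig:
    """Hypothetical plain config class."""

    def __init__(self, http_auth, password, collection_name):
        self.http_auth = http_auth
        self.password = password
        self.collection_name = collection_name


SECRET_FIELDS = frozenset({"password", "api_key"})  # assumed, abbreviated


def clone_for_telemetry(config):
    """Try deepcopy; on failure, shallow-copy and redact only known secrets."""
    try:
        return copy.deepcopy(config)
    except Exception:
        clone = copy.copy(config)  # attributes are shared by reference
        for name in list(vars(clone)):
            if name in SECRET_FIELDS:
                setattr(clone, name, None)
        return clone


auth = FakeSignerAuth()
original = FakeConfig(http_auth=auth, password="s3cret", collection_name="memories")
clone = clone_for_telemetry(original)
clone.collection_name = "telemetry"  # overriding the clone leaves the original alone
```

The shallow copy is what keeps `clone.http_auth` pointing at the same auth object, while redaction and overrides apply only to the clone's own attribute dictionary.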

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Maintainer Checklist

Replace broad substring sanitizer with layered allow/deny field matching
to prevent runtime auth objects (http_auth, connection_class) from being
nullified during telemetry config cloning. Also fix model_dump(mode="json")
to model_dump() to preserve actual Python objects instead of relying on
a PydanticSerializationError fallback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@kartik-mem0 kartik-mem0 left a comment

all of the changes lgtm!

@kartik-mem0 kartik-mem0 merged commit 46b4b2e into main Mar 19, 2026
8 checks passed
@kartik-mem0 kartik-mem0 deleted the fix/safe-deepcopy-preserve-http-auth branch March 19, 2026 11:08
jamebobob pushed a commit to jamebobob/mem0-vigil-recall that referenced this pull request Mar 29, 2026
…ai#3580) (mem0ai#4418)

Co-authored-by: utkarsh240799 <utkarsh240799@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

Bug: _safe_deepcopy_config removes http_auth from OpenSearch configs
