Update website loader/chunker to web page loader/chunker#2

Merged
taranjeet merged 1 commit into main from update-website-to-webpage
Jun 20, 2023
Conversation

@taranjeet
Member

This commit renames the website loader/chunker to web page, as it loads and chunks a single URL rather than the complete website.

@taranjeet taranjeet merged commit 80f7011 into main Jun 20, 2023
@taranjeet taranjeet deleted the update-website-to-webpage branch June 20, 2023 11:23
PranavPuranik added a commit to PranavPuranik/embedchain that referenced this pull request Aug 1, 2024
merlinfrombelgium pushed a commit to merlinfrombelgium/mem0 that referenced this pull request Jul 4, 2025
Update website loader/chunker to web page loader/chunker
xiangnuans added a commit to xiangnuans/mem0 that referenced this pull request Dec 5, 2025
utkarsh240799 added a commit that referenced this pull request Mar 13, 2026
- Add user identity to extraction preamble so memories are attributed to
  the correct user instead of cross-referencing cached patterns (OPE-6 #1)
- Skip mem0.add() when no user messages remain after noise filtering,
  avoiding wasted API calls on assistant-only payloads (OPE-6 #2)
- Raise auto-recall threshold to 0.6 (vs 0.5 for explicit search) and
  add dynamic thresholding that drops memories below 50% of the top
  result's score to reduce irrelevant context injection (OPE-6 #3)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
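The dynamic thresholding described in the commit above can be sketched as follows. This is a hypothetical helper (the function name, result shape, and parameter defaults are assumptions, not the repository's actual code): memories below an absolute score threshold are dropped first, then anything scoring under half of the top survivor's score is discarded.

```python
def filter_recalled_memories(results, base_threshold=0.6, relative_floor=0.5):
    """Sketch of two-stage recall filtering.

    1. Drop results below `base_threshold` (0.6 for auto-recall,
       vs 0.5 for explicit search, per the commit message).
    2. Drop results scoring less than `relative_floor` (50%) of the
       top surviving result's score.
    `results` is assumed to be a list of dicts with a "score" key.
    """
    kept = [r for r in results if r["score"] >= base_threshold]
    if not kept:
        return []
    top = max(r["score"] for r in kept)
    return [r for r in kept if r["score"] >= top * relative_floor]
```

With a lower absolute threshold, the relative floor does real work: given scores [0.9, 0.5, 0.3] and `base_threshold=0.2`, the 0.3 result survives the absolute cut but falls below 50% of 0.9 and is dropped.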
utkarsh240799 added a commit that referenced this pull request Mar 16, 2026
utkarsh240799 added a commit that referenced this pull request Mar 16, 2026
MDGreyMatter pushed a commit to Go-Grey-Matter/ai-memory-system that referenced this pull request Mar 31, 2026
## 307 Redirect Fixes
- memories.py: Added @router.get(""), @router.post(""), @router.delete("")
  aliases alongside "/" root routes so Next.js proxy never gets 307'd
- config.py: Added @router.get(""), @router.put(""), @router.patch("")
  aliases for config root routes
- (stats.py and apps.py were already fixed in Task mem0ai#1)
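The alias trick above works because frameworks that route strictly on path answer a request for `/api/v1/memories` with a 307 redirect to `/api/v1/memories/` unless both spellings are registered. A toy dispatcher (not FastAPI itself; `register_with_alias` and `dispatch` are illustrative names) makes the mechanism concrete:

```python
# Toy route table illustrating why both "" and "/" must be registered.
routes = {}

def register_with_alias(prefix, handler):
    # Mirrors the @router.get("") / @router.get("/") pairing from the
    # commit: the root route plus its no-trailing-slash alias.
    routes[prefix + "/"] = handler
    routes[prefix] = handler

def dispatch(path):
    # Without the alias, a request to "/api/v1/memories" would miss and
    # a real framework would answer 307 -> "/api/v1/memories/".
    handler = routes.get(path)
    return handler() if handler else 307

register_with_alias("/api/v1/memories", lambda: 200)
```

The redirect matters here because the Next.js proxy does not transparently follow the 307, so both path spellings must resolve directly.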

## Vector Encoding Pipeline Fix
- Root cause: Supabase unreachable from Replit via IPv6; SUPABASE_CONNECTION_STRING
  had [YOUR-PASSWORD] placeholder
- Enabled pgvector extension in Replit's built-in PostgreSQL
- Updated memory.py to prefer DATABASE_URL (pgvector) over SUPABASE_CONNECTION_STRING
- Added fastembed as free local embedder (no OpenAI quota needed):
  - _build_fastembed_embedder_config factory added
  - _EMBEDDER_DEFAULT_DIMS mapping for dynamic dim selection
  - _get_embedder_dims() helper function
  - Restructured get_default_memory_config() to detect embedder first
  - EMBEDDER_PROVIDER=fastembed env var set (shared)
- Patched mem0 FastEmbedEmbedding.embed() in main.py to return Python
  lists (.tolist()) instead of numpy arrays (psycopg2 compat fix)
- fastembed added to start.sh pip install
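The `.tolist()` patch mentioned above can be sketched as a wrapper around any `embed()` that may return a numpy-style array. `ensure_python_list` and `FakeArray` are illustrative names (the stand-in class replaces numpy so the sketch is self-contained); the real fix patches `FastEmbedEmbedding.embed()` at startup:

```python
def ensure_python_list(embed_fn):
    """Wrap an embed() so array-like results become plain Python lists,
    which psycopg2 can adapt for a pgvector column."""
    def wrapped(*args, **kwargs):
        result = embed_fn(*args, **kwargs)
        # numpy arrays expose .tolist(); plain lists pass through as-is.
        return result.tolist() if hasattr(result, "tolist") else result
    return wrapped

class FakeArray:
    """Stand-in for a numpy array, used here to keep the sketch
    dependency-free."""
    def __init__(self, data):
        self.data = data
    def tolist(self):
        return list(self.data)
```

Usage: `ensure_python_list(lambda text: FakeArray((0.1, 0.2)))("hi")` yields `[0.1, 0.2]` rather than an array object.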

## Model Updates
- Updated deprecated claude-3-5-sonnet-20240620 → claude-3-haiku-20240307
  in categorization.py, memory.py, config.py, default_config.json, config.json

## Verified End-to-End
- POST /api/v1/memories returns 200 with memory object (no 307)
- Memory appears in filter/list endpoints
- Backend logs: fastembed → pgvector (384-dim) → stored successfully
- Anthropic LLM connected (200 OK) for inference
MDGreyMatter pushed a commit to Go-Grey-Matter/ai-memory-system that referenced this pull request Mar 31, 2026
## 307 Redirect Fixes
- memories.py: Added @router.get(""), @router.post(""), @router.delete("")
  aliases alongside "/" root routes to prevent the Next.js proxy from getting 307'd
- config.py: Added @router.get(""), @router.put(""), @router.patch("") aliases
  (stats.py and apps.py were already fixed in Task mem0ai#1)

## Error Handling Fix
- create_memory now raises HTTPException(503) when the memory client is unavailable
  and HTTPException(500) when a vector store operation fails, instead of silently
  returning an {"error": ...} dict with a 200 status
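The error-handling change above can be sketched in a dependency-free form. The `HTTPException` class here is a stand-in for `fastapi.HTTPException`, and `create_memory`'s signature is an assumption for illustration; the point is that failures surface as 503/500 errors rather than a 200 response wrapping an error dict:

```python
class HTTPException(Exception):
    """Stand-in for fastapi.HTTPException in this sketch."""
    def __init__(self, status_code, detail=""):
        self.status_code = status_code
        self.detail = detail
        super().__init__(detail)

def create_memory(client, store_memory):
    # Fail loudly: 503 when the memory client is missing, 500 when the
    # vector store write fails -- never 200 with {"error": ...}.
    if client is None:
        raise HTTPException(503, "memory client unavailable")
    try:
        return store_memory(client)
    except Exception as exc:
        raise HTTPException(500, f"vector store operation failed: {exc}")
```

Raising instead of returning an error dict lets the framework's exception handler map the failure to the right HTTP status, so callers and proxies can distinguish success from failure without inspecting the body.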

## Vector Encoding Pipeline
- Supabase is unreachable from Replit (host resolves to IPv6 only)
- SUPABASE_CONNECTION_STRING had [YOUR-PASSWORD] placeholder
- Solution: enabled pgvector extension in Replit's built-in PostgreSQL,
  updated memory.py to prefer DATABASE_URL (pgvector) over Supabase
- Added fastembed as free local embedder (no API key needed):
  * _build_fastembed_embedder_config factory
  * _EMBEDDER_DEFAULT_DIMS dict for dynamic embedding dimension selection
  * _get_embedder_dims() helper
  * Restructured get_default_memory_config() to detect embedder first
  * EMBEDDER_PROVIDER=fastembed set as shared env var
- Patched mem0 FastEmbedEmbedding.embed() at startup to return Python lists
  instead of numpy arrays (required for psycopg2 pgvector compatibility)
- fastembed added to start.sh pip install line

## Verified End-to-End
- POST /api/v1/memories: 200 OK, memory stored in pgvector (384-dim)
- GET /api/v1/memories: 200 OK, returns stored memories
- POST /api/v1/memories/filter: 200 OK
- GET /api/v1/stats and GET /api/v1/apps: 200 OK
- No 307 redirects on any route
- LLM model name kept as original (claude-3-5-sonnet-20240620) per task scope
