
fix(miner): use token-boundary matching in detect_room #1004

Merged
igorls merged 2 commits into
MemPalace:develop from
coogie:coogie/fix/miner-routing
May 9, 2026


@coogie coogie commented Apr 18, 2026

Contributor

What does this PR do?

Substring checks in path/filename routing caused systemic misrouting in large monorepos — e.g., "views" ⊂ "interviews" sent every file under views/ to the interviews room. Switch to separator-bounded token matching (-, _, ., /) via a _name_matches helper, applied to priority 1 (path parts) and priority 2 (filename).
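The PR's actual `_name_matches` helper is not shown in this conversation, so the following is only a minimal sketch of separator-bounded token matching under stated assumptions. The names `split_tokens` and `name_matches_sketch` are hypothetical, and case-insensitive comparison is an assumption rather than something the description states. Only the separator set (-, _, ., /) comes from the description above.

```python
import re

# Separators named in the PR description: "-", "_", ".", "/".
_SEPARATORS = re.compile(r"[-_./]+")


def split_tokens(name: str) -> set[str]:
    """Split a path part or filename into lowercase tokens (hypothetical helper)."""
    return {tok for tok in _SEPARATORS.split(name.lower()) if tok}


def name_matches_sketch(name: str, candidate: str) -> bool:
    """Return True only if `candidate` appears as a whole token in `name`.

    Assumption: the candidate is a single token such as "views";
    multi-token candidates are not handled in this sketch.
    """
    return candidate.lower() in split_tokens(name)


# "views" is a substring of "interviews" but not a whole token of it.
print(name_matches_sketch("views", "interviews"))       # False
print(name_matches_sketch("user_views.py", "views"))    # True
print(name_matches_sketch("interviews", "views"))       # False
```

Matching on whole tokens trades some recall for precision: a room named "view" would no longer match a directory named "views" unless the real helper also normalizes plurals, which this sketch does not attempt.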

Checklist

  • Tests pass (python -m pytest tests/ -v)
  • No hardcoded paths
  • Linter passes (ruff check .)

Closes #1002

@igorls igorls added the bug (Something isn't working) and area/mining (File and conversation mining) labels Apr 24, 2026

igorls commented May 3, 2026

Member

Bumping — verified REAL on develop: mempalace/miner.py:342 uses any(part == c or c in part or part in c for c in candidates) — bidirectional substring, exactly the "views" ⊂ "interviews" misrouting described. Issue pulled into v3.3.5 via #1002.

Token-boundary matching with separator-aware tokenization is the right approach. Please confirm CI green on latest and we can land it.
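To make the failure mode concrete, here is a small reproduction of the bidirectional substring check quoted above. Only the `any(...)` expression is taken from the comment; the wrapper function `old_substring_match` and the sample inputs are hypothetical, and the surrounding code in `mempalace/miner.py` is not reproduced here.

```python
def old_substring_match(part: str, candidates: list[str]) -> bool:
    """Wrap the expression quoted from develop (hypothetical wrapper name)."""
    return any(part == c or c in part or part in c for c in candidates)


# A path part "views" is checked against a room candidate "interviews".
# "views" in "interviews" is True, so the `part in c` branch fires.
print(old_substring_match("views", ["interviews"]))   # True (misroute)

# The reverse direction also matches via the `c in part` branch.
print(old_substring_match("interviews", ["views"]))   # True

# An unrelated name still does not match.
print(old_substring_match("models", ["interviews"]))  # False
```

Because both `c in part` and `part in c` are tested, any short directory name that happens to be a substring of a room name (or the reverse) is treated as a match, which is why the misrouting affected every file under `views/`.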

@coogie coogie force-pushed the coogie/fix/miner-routing branch 2 times, most recently from bae13d7 to 9cfc4bc Compare May 5, 2026 11:34
coogie added a commit to coogie/mempalace that referenced this pull request May 5, 2026

coogie commented May 5, 2026

Contributor Author

@igorls please re-review at convenience

  • Rebased against latest develop
  • Added entry to CHANGELOG.md under 3.3.5 heading
  • Outcome of python -m pytest tests/ -v below
========================================================== warnings summary ===========================================================
tests/test_fact_checker.py::TestCLI::test_exits_nonzero_when_issues_found
  <frozen runpy>:128: RuntimeWarning: 'mempalace.fact_checker' found in sys.modules after import of package 'mempalace', but prior to execution of 'mempalace.fact_checker'; this may result in unpredictable behaviour

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
===================================== 1493 passed, 1 skipped, 106 deselected, 1 warning in 36.98s =====================================

@coogie coogie force-pushed the coogie/fix/miner-routing branch from 9cfc4bc to 40aaae1 Compare May 6, 2026 10:38
coogie added a commit to coogie/mempalace that referenced this pull request May 6, 2026
coogie added 2 commits May 7, 2026 21:44
Substring checks in path/filename routing caused systemic misrouting
in large monorepos — e.g., "views" ⊂ "interviews" sent every file
under views/ to the interviews room. Switch to separator-bounded
token matching (-, _, ., /) via a _name_matches helper, applied to
priority 1 (path parts) and priority 2 (filename).
@coogie coogie force-pushed the coogie/fix/miner-routing branch from 40aaae1 to 3d0d037 Compare May 7, 2026 20:47
@igorls igorls added this to the v3.3.5 milestone May 9, 2026
@igorls igorls merged commit 2fc47a5 into MemPalace:develop May 9, 2026
6 checks passed
@igorls igorls mentioned this pull request May 10, 2026
3 tasks

Labels

area/mining (File and conversation mining), bug (Something isn't working)

Development

Successfully merging this pull request may close these issues.

miner.detect_room substring matching causes systemic misrouting

2 participants