Skip to content

Scope AI to comment spam + native-search assistant#298

Merged
AllTerrainDeveloper merged 2 commits into
trunkfrom
ai/comment-spam-only-native-search
Jun 6, 2026
Merged

Scope AI to comment spam + native-search assistant#298
AllTerrainDeveloper merged 2 commits into
trunkfrom
ai/comment-spam-only-native-search

Conversation

@AllTerrainDeveloper

@AllTerrainDeveloper AllTerrainDeveloper commented Jun 6, 2026

Copy link
Copy Markdown
Collaborator

Why

Enabling the AI Copilot used to behave like a site-wide indexing obligation: it auto-analyzed every post, page, and taxonomy term on save (sending content to OpenAI), and the assistant could only find content that had already been analyzed. There was also an orphaned POST /desktop-mode/v1/ai/reindex endpoint that swept all published posts/pages/comments in one unbounded pass (posts_per_page => -1, comment number => 0).

This narrows AI to the one analysis that earns its keep — comment spam scoring (it feeds the comments-window spam score) — and makes the assistant find content the way WordPress already does: native keyword search.

What changed

1. Removed the bulk reindex endpoint

Deleted includes/ai-copilot/reindex.php, its require_once in bootstrap.php, and the route row in includes/rest/README.md. POST /ai/reindex now 404s (it had no built-in caller).

2. Removed post & taxonomy-term auto-analysis; kept comment

  • Stripped the save_post / created_term / edited_term hooks, the analyze_post / analyze_term jobs, and messages_for_post / messages_for_term / schema_content.
  • The comment path is untouched — analyze_comment still runs on comment insert/edit and stores the spam / harmful verdict the comments-window score reads.

3. Assistant search → native WordPress queries

search_posts / search_pages / search_comments / search_comments_by_post now take a query string and run WP_Query s= / get_comments search=. The _desktop_mode_ai_analysis meta_query dependency is gone, so all published posts/pages and approved comments are findable — returning real titles + excerpts. System prompt + tool descriptions rewritten to match.

4. Upgrade cleanup

Migration v2 (includes/migrations.php) calls wp_unschedule_hook() for the now-dead desktop_mode_ai_analyze_post / _analyze_term cron events. Existing post/term meta is left in place (hidden, harmless).

5. Docs + version

hooks-reference.md, javascript-reference.md, register-ai-provider.md, README.md, readme.txt (+ changelog) synced; new docs/migration-ai-comment-only.md indexed in docs/README.md. Version bumped 0.10.0 → 0.11.0.

6. Tests

  • aiNativeSearch.php — content with no analysis meta is found via keyword search; keyword filters the result set; entity builder works without meta.
  • aiCommentOnlyMigration.php — migration v2 unschedules post/term jobs, leaves the comment job alone.
  • commentsAiSpamScore.phpspam floors the score at 75, harmful adds 20, no-meta is a no-op.

⚠️ Breaking change

Removes the post/term analysis hooks documented as Stable/Experimental: desktop_mode_ai_post_prompt, desktop_mode_ai_term_prompt, desktop_mode_ai_post_analyzed, desktop_mode_ai_term_analyzed, desktop_mode_ai_supported_post_types, desktop_mode_ai_supported_taxonomies, desktop_mode_ai_supported_types, desktop_mode_ai_schema_content. The comment-side equivalents remain. See docs/migration-ai-comment-only.md.

Testing

  • npm run build — clean
  • npm run typecheck — clean
  • npm run test:js1519 passed
  • PHPUnit full suite — 914 / 915 pass; the 1 failure is the pre-existing network test (wp_update_plugins() reaching WordPress.org from the offline container), untouched by this change
  • php -l on every changed PHP file — clean

🤖 Generated with Claude Code

Open WordPress Playground Preview

Narrow the AI Copilot to the one analysis that earns its keep — comment
spam scoring — and make the assistant find content with WordPress's
native keyword search instead of pre-analyzed summaries.

- Remove the bulk /ai/reindex endpoint (orphaned, unbounded full sweep)
- Remove post & taxonomy-term auto-analysis (jobs, save_post/term hooks,
  schema_content, messages_for_post/term) and their Stable hooks
- Keep comment analysis only (feeds the comments-window spam score)
- Rewrite search_posts/pages/comments[_by_post] to take a `query` and run
  WP_Query s= / get_comments search=; drop the _desktop_mode_ai_analysis
  meta dependency so all published content is findable
- Migration v2 unschedules leftover analyze_post/analyze_term cron events
- Docs/readme/version synced (0.10.0 -> 0.11.0) + migration note
- Tests: native search, comment-only migration, comment spam score

Breaking: removes the post/term analysis hooks (desktop_mode_ai_post_prompt,
_term_prompt, _post_analyzed, _term_analyzed, _supported_post_types,
_supported_taxonomies, _schema_content). See docs/migration-ai-comment-only.md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@AllTerrainDeveloper

Copy link
Copy Markdown
Collaborator Author

This is a prior step to integrate with WordPress AI providers shipped on 7.0

@AllTerrainDeveloper

Copy link
Copy Markdown
Collaborator Author

@claude how you feel about leaving here a review? :)

@claude

claude Bot commented Jun 6, 2026

Copy link
Copy Markdown

Claude finished @AllTerrainDeveloper's task in 2m 40s —— View job


Code Review — PR #298

  • Gather context and understand the PR
  • Read changed PHP files (analysis.php, hooks.php, jobs.php, search.php, migrations.php)
  • Read new test files
  • Read migration doc
  • Analyze and write review

Overall verdict: looks solid. The architectural simplification is well-motivated and executed cleanly. The core loop (remove bulk reindex → remove post/term analysis → switch to native WP search → migrate cron cruft) is a tight diff with no surprises. A few things worth fixing before merge, and a few observations.


Bugs / correctness

1. Stale file header docblock in search.php (line 10)

The top-of-file docblock still reads:

* Each tool fetches 10 entities with their _desktop_mode_ai_analysis meta so the model
* can compare AI-generated summaries to the user's description.

That's the old behaviour. The tools now return real titles + excerpts via WP_Query s= / get_comments search= with no dependency on the meta. Fix this →

2. analysis.php — dead 'post'/'term' branches in public helpers

desktop_mode_ai_save_meta() (line 159) and desktop_mode_ai_get_meta() (line 189) still have 'post' and 'term' switch cases. Since the hooks that called them are gone these branches are dead within the plugin itself. They also carry a misleading docstring: 'post' | 'term' | 'comment'. If the intent is to keep them as a public escape-hatch for third-party plugins that might store arbitrary meta under the key, the docblock should say so explicitly; if they're truly dead, drop them. Right now they imply the plugin still does post/term analysis.


Performance

3. N+1 queries in search_fetch_comments() (line 454)

For every comment in the batch a get_post() call fetches the parent post title:

foreach ( $comments as $comment ) {
    $parent_post  = get_post( $comment->comment_post_ID );

With batch size 10 that's up to 10 extra queries per tool call. Comments on the same parent post (common) pay the penalty multiple times since WP's object cache is not seeded. A single get_posts(['post__in' => array_unique($post_ids)]) before the loop eliminates this. (Same pattern in search_fetch_comments_by_post, though there the parent is fetched once already, so that one is fine.) Fix this →


Code quality / minor

4. @since 0.14.0 tags throughout files that ship at 0.11.0

analysis.php, hooks.php, jobs.php, and search.php all carry @since 0.14.0 on their functions. The PR bumps the plugin to 0.11.0, not 0.14.0. These were presumably leftovers from a previous version-numbering plan. Not a runtime issue, but it undermines the accuracy of the inline docs.

5. desktop_mode_ai_run_search() — docblock stub left in place (line 650)

There is a dangling orphan doc comment immediately before desktop_mode_ai_progress_message():

/**
 * Runs the agentic content-search loop.
 * …
 * @return array|WP_Error
 */
/**
 * Returns a friendly progress message for a tool name…

The first docblock belongs to desktop_mode_ai_run_search() but desktop_mode_ai_progress_message() sits between it and the function it describes. The diff (removed a duplicate dispatch-tool docblock) shows awareness of the duplication pattern — this one slipped through. Fix this →


Security

No new issues. Good handling throughout:

  • sanitize_text_field() on the query arg before it reaches WP_Query / get_comments.
  • get_php_error_log capability check (manage_options) sits in the dispatcher before desktop_mode_ai_fetch_error_log() is ever invoked.
  • The streaming endpoint validates nonce + is_user_logged_in() + current_user_can('read') before touching any input.
  • follow_up.result is JSON-encoded and bounded to 4 KB (desktop_mode_ai_followup_outcome_max_chars) before being forwarded to OpenAI, preventing unbounded token inflation from a misbehaving plugin.

Tests

The three new test files are well-structured and cover the right things:

  • aiNativeSearch.php — proves content with no analysis meta is findable and that the keyword actually filters. The assertion assertArrayNotHasKey('ai_summary', $match) is a nice explicit guard against the old shape creeping back.
  • aiCommentOnlyMigration.php — covers the happy path, the non-interference case (comment job left alone), and the full runner dispatch + stamp.
  • commentsAiSpamScore.php — the assertSame(50, $score) in test_harmful_verdict_adds_20 is tight; consider whether that test should assert >= 50 in case the filter stacks with other contributors, or document that the fixture comment is designed to start with exactly score 30.

One gap worth adding: test_search_comments_by_post with a post_id = 0 input to confirm the error-path return is well-formed (the early-return at line 506 is there but untested).


Migration

Migration v2 (desktop_mode_migrate_unschedule_post_term_ai) is correct and minimal — wp_unschedule_hook clears all queued events for the hook regardless of argument signature, which is exactly right here since we want a full sweep. Leaving existing post/term _desktop_mode_ai_analysis meta in place is the right call (harmless, avoids a potentially large table scan, matches the documented behaviour in migration-ai-comment-only.md).

The migration doc itself is clear and the hook-removal table is the right level of detail for plugin authors.


Summary: Three items worth addressing before merge — the stale header docblock (#1), clarifying or removing the dead post/term branches in the meta helpers (#2), and the misplaced run_search docblock (#5). The N+1 query (#3) is a real efficiency win for comment-heavy sites and easy to fix. The @since tag noise (#4) is low priority but would be nice to clean up in the same pass.

… docblock

- search.php: fix the file header docblock that still described the old
  meta-based behaviour (now: native WP search returning title + excerpt)
- analysis.php: drop the dead 'post'/'term' branches in save_meta/get_meta;
  they're comment-only now, documented as such (param kept for signature
  stability, non-comment types are a no-op)
- search.php: kill the N+1 in search_comments by priming parent-post caches
  once via _prime_post_caches() before the per-comment loop
- search.php: re-attach the run_search() docblock to its function (it had
  been left stranded above progress_message())

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@AllTerrainDeveloper AllTerrainDeveloper merged commit 9b147f7 into trunk Jun 6, 2026
5 checks passed
@AllTerrainDeveloper AllTerrainDeveloper deleted the ai/comment-spam-only-native-search branch June 6, 2026 10:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant