Skip to content

Peer Cards#180

Merged
VVoruganti merged 47 commits into
mainfrom
ben/peer-cards
Aug 12, 2025
Merged

Peer Cards#180
VVoruganti merged 47 commits into
mainfrom
ben/peer-cards

Conversation

@dr-frmr

@dr-frmr dr-frmr commented Aug 6, 2025

Copy link
Copy Markdown
Contributor

Fixes DEV-1032, DEV-1043

Summary by CodeRabbit

  • New Features

    • Automatic "peer cards" (concise bios) are generated/updated and used to personalize chats, prompts, and representation reasoning.
    • Peer-card benchmarking CLI to evaluate LLM provider/model candidates.
  • Improvements

    • Richer working-representation handling with formatted outputs and per-peer/target bio context included in chat flows.
    • LLM client: GPT‑5 compatibility plus optional reasoning-effort and verbosity hints.
    • Async bench runner for concurrent test execution.
  • Performance/Telemetry

    • Per-workspace summary metrics and refined timing formatting.
  • Chores

    • Dependency bumps and new deriver configuration/env options exposed.
  • Documentation/Tests

    • New bench fixtures and docs updated (Python 3.11+).

dr-frmr and others added 24 commits August 4, 2025 14:35
chore: update tests
chore: bump version, changelog
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
…rbitrary filters

- Remove unused config variables
- Added arbitrary filters to all search endpoints.
- Pluralize `filters` everywhere in SDKs for consistency
- Updated documentation and changelog to reflect these changes.
* expose core client from sdks

* align text

* fix: resolve get_effective_observe me race condition, default peer config (#176)

* fix: resolve get_effective_observe me race condition, default peer config

* fix: preserve custom config even after leaving

* chore: test cases, enqueue types

* Update sdks/typescript/package.json

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

---------

Co-authored-by: doria <93405247+dr-frmr@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@coderabbitai

coderabbitai Bot commented Aug 6, 2025

Copy link
Copy Markdown
Contributor

Caution

Review failed

The pull request is closed.

Walkthrough

Adds peer-card creation/management and session-scoped working-representation storage; refactors deriver from a class to top-level task handlers; updates prompts, dialectic/chat, LLM client, embedding store, CRUD exports, queue handling, tests, bench tooling, configs, and adds a new LLMError type. Dependencies bumped.

Changes

Cohort / File(s) Summary
Dependencies
pyproject.toml
Bumped mirascope >=1.25.5 and openai >=1.99.7.
Config & env
src/config.py, .env.template, config.toml.example, docs/.../configuration.mdx
Add DERIVER peer-card and working-representation settings and limits; rename/add DERIVER_ env vars; docs and example config updated.
CRUD: exports & representation
src/crud/__init__.py, src/crud/representation.py
Rework exports; add get_peer_card/set_peer_card; add session-scoped working-representation get/set, formatting helpers, metadata-key construction and legacy-key fallback.
Deriver orchestration & prompts
src/deriver/consumer.py, src/deriver/deriver.py, src/deriver/prompts.py, src/deriver/queue_payload.py, src/deriver/queue_manager.py
Remove Deriver class; add top-level handlers (process_webhook, process_summary_task, process_representation_task); add peer_card_call and peer-card update flow; update critical-analysis prompt inputs to accept peer_card/working_representation; remove old union payload types; queue_manager handles LLMError and revises finalization.
Dialectic chat & prompts
src/dialectic/chat.py, src/dialectic/prompts.py
Thread peer_card/target_peer_card through chat and prompt generation; fetch cards from CRUD; use global representation collection constant.
LLM client & shared models
src/utils/clients.py, src/utils/shared_models.py
Add OpenAI GPT‑5 shim; extend honcho_llm_call with reasoning_effort and verbosity; GPT‑5 max_completion_tokens handling; add PeerCardQuery Pydantic model.
Embedding & formatting
src/utils/embedding_store.py, src/utils/formatting.py
Tighten save_unified_observations signature (message_id:int, session_name, message_created_at required, add fallback_level); timestamp typing stricter; find_new_observations return type narrowed.
Logging & summarizer
src/utils/logging.py, src/utils/summarizer.py
Millisecond metric formatting change; summary metrics renamed to per-workspace names (observability-only).
Queue/exception types & models
src/exceptions.py, src/models.py
Add new LLMError exception with normalized data; add QueueItem.__repr__.
Bench tools & data
tests/bench/README.md, tests/bench/peer_card_bench.py, tests/bench/peer_card_tests/*, tests/bench/run_tests.py, tests/bench/tests/summary_and_query.json
Add peer-card bench CLI, many JSON bench fixtures, async bench runner refactor and README updates.
Tests cleanup/updates
tests/deriver/README.md, tests/deriver/conftest.py, tests/deriver/test_deriver_processing.py, tests/deriver/test_queue_processing.py, tests/test_llm_mock.py, tests/deriver/test_representation_crud.py
Remove Deriver-based fixture/tests, update mocks to new critical_analysis_call args, add comprehensive representation/peer-card CRUD tests.

Sequence Diagram(s)

sequenceDiagram
  participant Q as Queue/Worker
  participant C as consumer.process_item
  participant DB as DB (tracked_db)
  participant D as process_representation_task
  participant CRUD as crud.representation
  participant LLM as LLM (honcho_llm_call)

  Q->>C: representation task(payload)
  C->>DB: open session
  C->>D: process_representation_task(db, payload)
  D->>CRUD: get_working_representation_data(...)
  alt no or stale working representation
    D->>LLM: critical_analysis_call(peer_card?, working_representation?, history, new_turn)
    LLM-->>D: reasoning + thinking
    D->>CRUD: get_peer_card(...)
    alt peer-card should change
      D->>LLM: peer_card_call(old_peer_card, new_observations)
      LLM-->>D: PeerCardQuery
      D->>CRUD: set_peer_card(...)
    end
    D->>CRUD: set_working_representation(..., final_observations)
  else existing representation
    D->>LLM: critical_analysis_call(...)
    LLM-->>D: reasoning
    D->>CRUD: set_working_representation(...)
  end
  D-->>C: done
  C-->>DB: commit/close
Loading
sequenceDiagram
  participant API as dialectic.chat()
  participant DB as DB (tracked_db)
  participant CRUD as crud.get_peer_card
  participant PR as dialectic_prompt
  participant LLM as LLM (stream/call)

  API->>DB: open session
  API->>CRUD: get_peer_card(peer_name)
  CRUD-->>API: peer_card
  alt target provided
    API->>CRUD: get_peer_card(target_name)
    CRUD-->>API: target_peer_card
  end
  API->>PR: dialectic_prompt(..., peer_card, target_peer_card)
  PR-->>API: prompt
  API->>LLM: dialectic_stream/call(prompt)
  LLM-->>API: response
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~75 minutes

Assessment against linked issues

Objective Addressed Explanation
Peer cards stored in a peer's internal metadata and managed by the deriver (DEV-1032)
Automatically populate/update a “card” with basic biographical data via deriver flows (DEV-1043)

Assessment against linked issues: Out-of-scope changes

Code Change Explanation
Overhaul of EmbeddingStore.save_unified_observations (src/utils/embedding_store.py) Changes embedding persistence signature/contract unrelated to peer-card storage objectives.
Summary metric renames to per-workspace (src/utils/summarizer.py) Observability-only metric naming; not required by peer-card feature.
Millisecond formatting change in log_performance_metrics (src/utils/logging.py) Logging-format tweak; unrelated to peer-card feature.

Possibly related PRs

  • Honcho 2.1.0 "ROTE" deriver #160 — Strong overlap: deriver orchestration, representation/peer-card CRUD, prompts and top-level flow changes.
  • Typing #137 — Related changes to AI client surface and mirascope/openai integration.
  • feat: webhooks #168 — Related queue/consumer/processing refactors and error-handling adjustments.

Suggested reviewers

  • VVoruganti

Poem

"I nibble prompts and hop through rows,
I stitch the cards where memory grows.
I nudge the LLM, then hop away —
Benchmarks hum and tests at play.
A twitch of whiskers, code that glows. 🐇"


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1648989 and 3f4148e.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (11)
  • .env.template (1 hunks)
  • config.toml.example (1 hunks)
  • pyproject.toml (1 hunks)
  • src/config.py (1 hunks)
  • src/deriver/deriver.py (10 hunks)
  • src/dialectic/chat.py (9 hunks)
  • src/exceptions.py (2 hunks)
  • src/utils/embedding_store.py (3 hunks)
  • src/utils/formatting.py (4 hunks)
  • src/utils/summarizer.py (2 hunks)
  • tests/bench/run_tests.py (26 hunks)
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch ben/peer-cards

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

- Updated `peer_card_prompt` to join `old_peer_card` list elements with newlines for better readability.
- Removed outdated comment in `dialectic_prompt` regarding handling of non-existent cards.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

🔭 Outside diff range comments (2)
tests/bench/run_tests.py (2)

173-183: Improve error handling with more specific timeout information.

The error message lacks details about which session timed out during polling.

Enhance the error logging to include session context:

 try:
     await honcho_client.poll_deriver_status(
         session_id=session_id,
         timeout=float(self.timeout_seconds)
         if self.timeout_seconds
         else 10000.0,
     )
     return True
 except Exception as e:
-    self.logger.warning(f"Error polling deriver status: {e}")
+    self.logger.warning(
+        f"Error polling deriver status for session '{session_id}': {e}"
+    )
     return False

730-731: Pass timeout parameter to TestRunner constructor.

The --timeout argument is parsed but never passed to the TestRunner constructor.

 # Create test runner
 runner = TestRunner(
-    honcho_url=args.honcho_url, anthropic_api_key=args.anthropic_api_key
+    honcho_url=args.honcho_url,
+    anthropic_api_key=args.anthropic_api_key,
+    timeout_seconds=args.timeout
 )
♻️ Duplicate comments (9)
src/config.py (1)

197-203: Peer-card model and token bounds: align with earlier review and provider limits

Current bounds (gt=1000, le=10_000, default=4000) are at odds with earlier review guidance and typical provider limits for GPT‑5 Nano; also peer cards are intentionally short.

Update defaults and bounds to be flexible for short outputs and match model capacity:

-    PEER_CARD_MODEL: str = "gpt-5-nano-2025-08-07"
-    # Note: peer cards should be very short, but GPT-5 models need output tokens for thinking which cannot be turned off...
-    PEER_CARD_MAX_OUTPUT_TOKENS: Annotated[
-        int, Field(default=4000, gt=1000, le=10_000)
-    ] = 4000
+    PEER_CARD_MODEL: str = "gpt-5-nano-2025-08-07"  # verify availability in envs
+    # Peer cards are concise. Allow small budgets; upper bound reflects model capacity.
+    PEER_CARD_MAX_OUTPUT_TOKENS: Annotated[
+        int, Field(default=2000, ge=128, le=128_000)
+    ] = 2000  # includes any provider reasoning tokens

Optionally add a validator to ensure the correct API key is set for the provider (based on prior feedback). I can draft it if you want it in this PR.

src/dialectic/chat.py (1)

306-318: Add explicit error handling for peer card retrieval.

The current implementation doesn't handle potential errors from crud.get_peer_card. While tracked_db may handle some errors, consider what happens if the database operation fails unexpectedly.

Apply this diff to add proper error handling:

    # 4. Peer card(s) ----------------------------------------------------------
    async with tracked_db("chat.get_peer_card") as db:
-       peer_card = await crud.get_peer_card(db, workspace_name, peer_name)
-       if target_name:
-           target_peer_card = await crud.get_peer_card(db, workspace_name, target_name)
-       else:
-           target_peer_card = None
+       try:
+           peer_card = await crud.get_peer_card(db, workspace_name, peer_name)
+       except Exception as e:
+           logger.error("Failed to retrieve peer_card for %s in workspace %s: %s",
+                        peer_name, workspace_name, e)
+           peer_card = None  # Use None as fallback
+       
+       if target_name:
+           try:
+               target_peer_card = await crud.get_peer_card(db, workspace_name, target_name)
+           except Exception as e:
+               logger.warning("Could not fetch target_peer_card for %s: %s", target_name, e)
+               target_peer_card = None
+       else:
+           target_peer_card = None
src/crud/representation.py (1)

48-73: Add 25-line validation and improve error handling for peer cards.

According to DEV-1032 requirements, peer cards should be limited to 25 lines. Also, consider removing the "peer_card" key when setting to None instead of storing null values.

Apply this diff to address both issues:

 async def set_peer_card(
     db: AsyncSession, workspace_name: str, peer_name: str, peer_card: list[str] | None
 ) -> None:
     """
     Set peer card for a peer.
 
     Raises:
         ResourceNotFoundException: If the peer does not exist
+        ValidationException: If the peer card exceeds 25 lines
     """
+    # Validate 25-line limit
+    if peer_card is not None and len(peer_card) > 25:
+        raise exceptions.ValidationException(
+            f"Peer card exceeds 25 line limit: {len(peer_card)} lines"
+        )
+
+    # Build update statement based on whether we're setting or clearing
+    if peer_card is None:
+        # Remove the key entirely when clearing
+        stmt = (
+            update(models.Peer)
+            .where(models.Peer.workspace_name == workspace_name)
+            .where(models.Peer.name == peer_name)
+            .values(
+                internal_metadata=models.Peer.internal_metadata.op("-")("peer_card")
+            )
+        )
+    else:
         stmt = (
             update(models.Peer)
             .where(models.Peer.workspace_name == workspace_name)
             .where(models.Peer.name == peer_name)
             .values(
                 internal_metadata=models.Peer.internal_metadata.op("||")(
                     {"peer_card": peer_card}
                 )
             )
         )
     result = await db.execute(stmt)
     if result.rowcount == 0:
-        raise exceptions.ResourceNotFoundException(
-            f"Peer {peer_name} not found in workspace {workspace_name}"
-        )
+        # Log context, then raise
+        logger.warning(
+            "Peer not found when setting peer card (workspace=%s, peer=%s)",
+            workspace_name,
+            peer_name,
+        )
+        raise exceptions.ResourceNotFoundException()
     await db.commit()
src/deriver/deriver.py (6)

41-44: Use absolute imports for consistency.

Replace relative imports with absolute ones to comply with project conventions.

-from .prompts import critical_analysis_prompt, peer_card_prompt
-from .queue_payload import (
-    RepresentationPayload,
-)
+from src.deriver.prompts import critical_analysis_prompt, peer_card_prompt
+from src.deriver.queue_payload import RepresentationPayload

91-98: Add return type annotation for public function.

The public function peer_card_call is missing a return type annotation, which is inconsistent with other public functions in the codebase.

 async def peer_card_call(
     old_peer_card: list[str] | None,
     new_observations: list[str],
-):
+) -> PeerCardQuery:

101-109: Expand docstring to follow Google style conventions.

The docstring should include Args and Returns sections for better documentation.

     """
     Process a representation task by extracting insights and updating working representations.
+    
+    Args:
+        db: Database session
+        payload: Representation task payload containing workspace, session, and message details
+    
+    Returns:
+        None
     """

555-558: Improve error handling for peer card updates.

Use logger.exception to preserve stack trace and consider whether this should be a fatal error.

         except Exception as e:
             if settings.SENTRY.ENABLED:
                 sentry_sdk.capture_exception(e)
-            logger.error("Error updating peer card! Skipping... %s", e)
+            logger.exception("Error updating peer card for %s! Skipping...", self.ctx.sender_name)

612-612: Use timezone-aware datetime for consistency.

Using datetime.now() without timezone can cause issues in distributed systems or when comparing timestamps across different timezones.

     working_rep_data = {
         "final_observations": final_obs_dict,
         "message_id": payload.message_id,
-        "created_at": datetime.datetime.now().isoformat(),
+        "created_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
     }

221-231: Type inconsistency: peer card is list[str] but used as if it's a string.

The sender_peer_card variable is a list[str] | None from crud.get_peer_card, but the downstream reasoner.reason method and prompts expect a string representation.

Convert the list to a string before passing downstream:

     if payload.sender_name == payload.target_name:
         sender_peer_card: list[str] | None = await crud.get_peer_card(
             db, payload.workspace_name, payload.sender_name
         )
         if sender_peer_card is None:
             logger.warning("No peer card found for %s", payload.sender_name)
+            sender_peer_card_str = None
         else:
-            logger.info("Using peer card: %s", sender_peer_card)
+            sender_peer_card_str = "\n".join(sender_peer_card)
+            logger.info("Using peer card:\n%s", sender_peer_card_str)
     else:
         logger.info("No peer card used for directional representation derivation")
-        sender_peer_card = None
+        sender_peer_card_str = None

     # Run single-pass reasoning
     final_observations = await reasoner.reason(
         db,
         working_representation,
         formatted_history,
-        sender_peer_card,
+        sender_peer_card_str,
     )

Then update the reason method signature and related methods to accept str | None instead of list[str] | None.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f409b67 and ceaecfb.

📒 Files selected for processing (13)
  • .env.template (1 hunks)
  • config.toml.example (1 hunks)
  • docs/v2/contributing/configuration.mdx (1 hunks)
  • src/config.py (1 hunks)
  • src/crud/representation.py (7 hunks)
  • src/deriver/deriver.py (10 hunks)
  • src/deriver/queue_manager.py (3 hunks)
  • src/dialectic/chat.py (9 hunks)
  • src/exceptions.py (2 hunks)
  • src/models.py (1 hunks)
  • tests/bench/run_tests.py (25 hunks)
  • tests/bench/tests/summary_and_query.json (1 hunks)
  • tests/deriver/test_representation_crud.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
{src,tests}/**/*.py

📄 CodeRabbit Inference Engine (CLAUDE.md)

{src,tests}/**/*.py: Follow isort conventions with absolute imports preferred in Python code
Use snake_case for variables/functions and PascalCase for classes in Python
Enforce line length of 88 characters (Black compatible)
Write docstrings using Google style

Files:

  • src/models.py
  • tests/deriver/test_representation_crud.py
  • src/exceptions.py
  • src/deriver/queue_manager.py
  • src/config.py
  • src/dialectic/chat.py
  • tests/bench/run_tests.py
  • src/deriver/deriver.py
  • src/crud/representation.py
src/models.py

📄 CodeRabbit Inference Engine (CLAUDE.md)

src/models.py: Use explicit type hints with SQLAlchemy mapped_column annotations in ORM models
All database tables use text IDs (nanoid format) as primary keys
Use composite foreign keys to enforce multi-tenant relationships
Include feature flags at workspace, peer, and session levels in the data model
Track token counts on messages for usage tracking in the database
Use JSONB metadata fields for extensibility in database tables
Create HNSW indexes for vector similarity search where applicable

Files:

  • src/models.py
src/**/*.py

📄 CodeRabbit Inference Engine (CLAUDE.md)

src/**/*.py: Use explicit error handling with appropriate exception types
Use specific exception types (e.g., ResourceNotFoundException, ValidationException) when raising errors
Use proper logging with context instead of print statements

Files:

  • src/models.py
  • src/exceptions.py
  • src/deriver/queue_manager.py
  • src/config.py
  • src/dialectic/chat.py
  • src/deriver/deriver.py
  • src/crud/representation.py
src/exceptions.py

📄 CodeRabbit Inference Engine (CLAUDE.md)

Define custom exceptions in src/exceptions.py

Files:

  • src/exceptions.py
🧠 Learnings (3)
📚 Learning: 2025-08-11T17:17:14.646Z
Learnt from: CR
PR: plastic-labs/honcho#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-11T17:17:14.646Z
Learning: Applies to src/crud.py : Maintain database CRUD operations in src/crud.py

Applied to files:

  • src/dialectic/chat.py
📚 Learning: 2025-06-18T20:42:06.458Z
Learnt from: dr-frmr
PR: plastic-labs/honcho#131
File: src/routers/sessions.py:206-213
Timestamp: 2025-06-18T20:42:06.458Z
Learning: The `get_or_create_session` function in this codebase is designed to handle both session creation and adding peers to existing sessions. When called with peers, it will add those peers to an existing session rather than creating a duplicate session.

Applied to files:

  • tests/bench/run_tests.py
📚 Learning: 2025-08-11T17:17:14.646Z
Learnt from: CR
PR: plastic-labs/honcho#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-11T17:17:14.646Z
Learning: Applies to {src,tests}/**/*.py : Follow isort conventions with absolute imports preferred in Python code

Applied to files:

  • src/deriver/deriver.py
🧬 Code Graph Analysis (6)
tests/deriver/test_representation_crud.py (5)
src/crud/representation.py (5)
  • get_peer_card (27-45)
  • get_working_representation (75-119)
  • get_working_representation_data (122-164)
  • set_peer_card (48-72)
  • set_working_representation (256-329)
src/exceptions.py (1)
  • ResourceNotFoundException (22-26)
tests/conftest.py (2)
  • db_session (159-164)
  • sample_data (235-253)
src/schemas.py (1)
  • SessionCreate (190-199)
src/models.py (1)
  • SessionPeer (421-431)
src/deriver/queue_manager.py (1)
src/exceptions.py (1)
  • LLMError (95-127)
src/dialectic/chat.py (2)
src/dependencies.py (1)
  • tracked_db (32-61)
src/crud/representation.py (1)
  • get_peer_card (27-45)
tests/bench/run_tests.py (2)
sdks/python/src/honcho/async_client/client.py (1)
  • AsyncHoncho (22-485)
src/schemas.py (1)
  • SessionPeerConfig (179-187)
src/deriver/deriver.py (11)
src/utils/shared_models.py (7)
  • PeerCardQuery (196-205)
  • ReasoningResponse (91-101)
  • ReasoningResponseWithThinking (104-108)
  • UnifiedObservation (51-88)
  • DeductiveObservation (41-48)
  • from_string (84-88)
  • ObservationContext (111-176)
src/deriver/prompts.py (2)
  • critical_analysis_prompt (15-99)
  • peer_card_prompt (103-159)
src/utils/clients.py (6)
  • honcho_llm_call (119-134)
  • honcho_llm_call (139-154)
  • honcho_llm_call (159-175)
  • honcho_llm_call (180-196)
  • honcho_llm_call (201-216)
  • honcho_llm_call (219-392)
src/utils/logging.py (2)
  • conditional_observe (24-41)
  • format_reasoning_response_as_markdown (70-113)
src/deriver/queue_payload.py (1)
  • RepresentationPayload (13-23)
src/utils/summarizer.py (1)
  • get_session_context_formatted (634-676)
src/crud/representation.py (1)
  • construct_collection_name (332-333)
src/crud/collection.py (1)
  • get_or_create_collection (50-78)
src/utils/embedding_store.py (2)
  • EmbeddingStore (27-397)
  • save_unified_observations (38-132)
src/utils/formatting.py (3)
  • format_new_turn_with_timestamp (94-109)
  • format_context_for_prompt (112-164)
  • find_new_observations (177-209)
src/exceptions.py (1)
  • LLMError (95-127)
src/crud/representation.py (5)
src/crud/peer.py (1)
  • get_peer (89-122)
src/utils/shared_models.py (2)
  • ObservationDict (187-193)
  • Observation (30-38)
src/schemas.py (1)
  • PeerCreate (61-69)
src/exceptions.py (1)
  • ResourceNotFoundException (22-26)
src/models.py (1)
  • SessionPeer (421-431)
🪛 Ruff (0.12.2)
tests/deriver/test_representation_crud.py

22-22: Missing return type annotation for public function test_peer_card_get_set_roundtrip

Add return type annotation: None

(ANN201)


43-43: Missing return type annotation for public function test_set_peer_card_missing_peer_raises

Add return type annotation: None

(ANN201)


99-99: Missing return type annotation for public function test_working_representation_self_merge_and_trim

Add return type annotation: None

(ANN201)


193-193: Missing return type annotation for public function test_working_representation_directional_merge_and_keys

Add return type annotation: None

(ANN201)


246-246: Missing return type annotation for public function test_wr_string_roundtrip_then_structured_overrides

Add return type annotation: None

(ANN201)


288-288: Missing return type annotation for public function test_wr_missing_levels_and_empty_new_lists

Add return type annotation: None

(ANN201)


342-342: Missing return type annotation for public function test_wr_trim_boundary_exact_25_plus_one

Add return type annotation: None

(ANN201)


392-392: Missing return type annotation for public function test_wr_formatting_rules_mixed_observation_types

Add return type annotation: None

(ANN201)


441-441: Missing return type annotation for public function test_wr_legacy_key_fallback_for_self_representation

Add return type annotation: None

(ANN201)


478-478: Missing return type annotation for public function test_wr_empty_both_levels_formats_empty_string

Add return type annotation: None

(ANN201)


504-504: Missing return type annotation for public function test_wr_existing_dict_without_final_observations

Add return type annotation: None

(ANN201)


543-543: Missing return type annotation for public function test_wr_deductive_trim_when_no_new_items

Add return type annotation: None

(ANN201)

src/exceptions.py

112-112: Dynamically typed expressions (typing.Any) are disallowed in *args

(ANN401)


112-112: Dynamically typed expressions (typing.Any) are disallowed in **kwargs

(ANN401)


121-121: Dynamically typed expressions (typing.Any) are disallowed in value

(ANN401)

src/deriver/queue_manager.py

251-253: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


252-252: Logging statement uses f-string

(G004)


258-258: Logging .exception(...) should be used instead of .error(..., exc_info=True)

(G201)


259-259: Logging statement uses f-string

(G004)


259-259: Use explicit conversion flag

Replace with conversion flag

(RUF010)


278-278: Logging statement uses f-string

(G004)

tests/bench/run_tests.py

124-124: Value being cast to int is already an integer

Remove unnecessary int call

(RUF046)


502-502: Logging statement uses f-string

(G004)


504-504: Logging statement uses f-string

(G004)

src/deriver/deriver.py

91-91: Missing return type annotation for public function peer_card_call

(ANN201)


555-555: Do not catch blind exception: Exception

(BLE001)


558-558: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


612-612: datetime.datetime.now() called without a tz argument

(DTZ005)

src/crud/representation.py

69-71: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (15)
tests/bench/tests/summary_and_query.json (1)

172-174: LGTM: session-scoped queries aligned with defined session

Both queries now explicitly attach to "session1", which exists under "sessions". This should make the bench harness evaluate within the intended context.

Also applies to: 176-179

src/config.py (1)

209-214: LGTM: working-representation cap is reasonable and bounded

The limit and bounds (default 100, gt=0, le=500) give headroom while preventing unbounded growth.

src/deriver/queue_manager.py (2)

16-16: LGTM: explicit import of exceptions module

Using absolute import and namespaced reference (exceptions.LLMError) keeps it clear and avoids deep coupling.


265-267: Good: safeguard against head-of-line blocking

Marking malformed messages as processed avoids indefinite stalling. With the LLMError change above (removing continue), this will also cover that path.

tests/bench/run_tests.py (2)

225-225: Verify the Claude Sonnet 4 model identifier
Please confirm that “claude-sonnet-4-20250514” is the correct Anthropic API model name for Claude Sonnet 4 (it formally launched on May 22, 2025, and IDs often include the exact release date). If the date or name is off, update it consistently across the codebase.

Occurrences to review:

  • config.toml.example: MODEL = "claude-sonnet-4-20250514"
  • src/config.py (line 220): MODEL: str = "claude-sonnet-4-20250514"
  • tests/bench/run_tests.py (line 225) & peer_card_bench.py (lines 275, 392, 415)
  • docs/v2/contributing/configuration.mdx (lines 234, 331, 377)
  • docs/docs.json (entry at line 16)

109-129: Redundant time recalculation logic.

The duration formatting logic includes redundant recalculations when seconds round up to 60.

Simplify the formatting logic:

 def _format_duration(self, total_seconds: float) -> str:
     """Format a duration in seconds into a human-readable string.
 
     If the duration is at least one minute, this returns a string in the
     form "XmYYs" with zero-padded seconds. Otherwise, it returns the
     duration in seconds with two decimal places, e.g., "12.34s".
 
     Args:
         total_seconds: The duration in seconds.
 
     Returns:
         A formatted duration string.
     """
-    minutes = int(total_seconds // 60)
+    total_minutes = int(total_seconds / 60)
-    if minutes > 0:
-        seconds_rounded = int(round(total_seconds - minutes * 60))
-        if seconds_rounded == 60:
-            minutes += 1
-            seconds_rounded = 0
-        return f"{minutes}m{seconds_rounded:02d}s"
+    if total_minutes > 0:
+        seconds = int(total_seconds % 60)
+        return f"{total_minutes}m{seconds:02d}s"
     return f"{total_seconds:.2f}s"

Likely an incorrect or invalid review comment.

tests/deriver/test_representation_crud.py (1)

108-111: LGTM! Clever use of monkeypatch for testing configuration limits.

The test effectively validates the trimming behavior by patching the configuration value. This approach makes the test independent of the actual configuration values.

src/dialectic/chat.py (2)

20-20: LGTM! Good centralization of collection name constant.

Moving from hard-coded string to a centralized constant improves maintainability and reduces the risk of typos across the codebase.


51-59: Function signature properly extended for peer card support.

The addition of peer_card and target_peer_card parameters is consistent with the PR objectives to support peer cards in dialectic calls.

src/crud/representation.py (3)

27-45: LGTM! Clean implementation of peer card retrieval.

The function properly handles the case when a peer doesn't exist by catching ResourceNotFoundException and returning None, which aligns with the documented behavior.


207-254: Well-designed working representation merge logic.

The merge function properly handles FIFO trimming of observations to maintain a bounded size, preserves metadata, and handles both explicit and deductive observation types correctly.


289-308: Consider adding error handling for the merge operation.

The current implementation catches merge exceptions but doesn't rollback the transaction. If the merge fails, you might end up storing corrupted data.

     merged_value: str | dict[str, Any] = representation
     if isinstance(representation, dict):
         try:
             existing = await get_working_representation_data(
                 db=db,
                 workspace_name=workspace_name,
                 observer_name=observer_name,
                 observed_name=observed_name,
                 session_name=session_name,
             )
             merged_value = _merge_working_representation(
                 existing,
                 representation,
             )
         except Exception:
             logger.exception(
                 "Failed to merge working representation; storing as provided"
             )
+            # Use the new representation as-is on merge failure
             merged_value = representation

Likely an incorrect or invalid review comment.

src/deriver/deriver.py (3)

329-330: Critical analysis call already expects list[str] for peer_card.

I see that critical_analysis_call expects list[str] | None for the peer_card parameter. The issue is that we need to maintain consistency - either pass the list directly and have the prompt function handle the joining, or join it before passing.

Looking at the critical_analysis_prompt function in the relevant code snippets, it already handles joining the list with chr(10).join(peer_card). So the current implementation is actually correct.


452-456: Good observation flattening logic.

The nested list comprehension correctly flattens observations from all levels. This is clear and concise.


561-585: Well-structured conversion function.

The observation_context_to_reasoning_response function properly handles the conversion between data structures, maintaining all necessary fields and handling optional metadata correctly.

Comment thread .env.template
Comment thread config.toml.example
Comment thread docs/v2/contributing/configuration.mdx
Comment thread src/deriver/deriver.py
Comment thread src/deriver/queue_manager.py
Comment thread src/exceptions.py
Comment thread src/models.py
Comment thread tests/bench/run_tests.py
Comment thread tests/deriver/test_representation_crud.py
Comment thread tests/deriver/test_representation_crud.py

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

♻️ Duplicate comments (4)
src/dialectic/prompts.py (3)

39-41: Typos fixed (“user’s”): LGTM

Thanks for correcting “users's” to “user's” in both branches.

Also applies to: 48-49


77-78: Wrap {query_target} in tags for consistency with other blocks

Matches the style of <recent_conversation_history>, , etc., and eases parsing/debugging.

Apply this diff:

-{query_target}
+<query_target>
+{query_target}
+</query_target>

13-16: Default peer_card to None in dialectic_prompt signature

Mirror the existing optional target_peer_card default so adding this parameter remains non-breaking. All current call sites in src/dialectic/chat.py already pass peer_card positionally, so this change won’t affect them.

• File: src/dialectic/prompts.py, adjust lines 13–16

     peer_name: str,
-    peer_card: list[str] | None,
+    peer_card: list[str] | None = None,
     target_name: str | None = None,
     target_peer_card: list[str] | None = None,
src/deriver/prompts.py (1)

111-120: Render old card only when present and use '\n' join

Avoid showing an empty “Current user biographical card” section; prefer "\n" over chr(10).

Apply this diff:

-    old_peer_card_section = (
-        f"""
+    old_peer_card_section = (
+        f"""
 Current user biographical card:
-{chr(10).join(old_peer_card)}
+{\n.join(old_peer_card)}
     """
-        if old_peer_card is not None
+        if old_peer_card
         else """
 User does not have a card. Create one with any key observations.
     """
     )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ceaecfb and 9f70222.

📒 Files selected for processing (2)
  • src/deriver/prompts.py (3 hunks)
  • src/dialectic/prompts.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
{src,tests}/**/*.py

📄 CodeRabbit Inference Engine (CLAUDE.md)

{src,tests}/**/*.py: Follow isort conventions with absolute imports preferred in Python code
Use snake_case for variables/functions and PascalCase for classes in Python
Enforce line length of 88 characters (Black compatible)
Write docstrings using Google style

Files:

  • src/deriver/prompts.py
  • src/dialectic/prompts.py
src/**/*.py

📄 CodeRabbit Inference Engine (CLAUDE.md)

src/**/*.py: Use explicit error handling with appropriate exception types
Use specific exception types (e.g., ResourceNotFoundException, ValidationException) when raising errors
Use proper logging with context instead of print statements

Files:

  • src/deriver/prompts.py
  • src/dialectic/prompts.py

Comment thread src/deriver/prompts.py
Comment thread src/deriver/prompts.py
Comment thread src/deriver/prompts.py
Comment thread src/deriver/prompts.py
Comment thread src/deriver/prompts.py
Comment thread src/deriver/prompts.py
Comment thread src/dialectic/prompts.py

@VVoruganti VVoruganti left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would just go through and resolve code rabbit suggestions on styling, but otherwise all looks great

…e manager

- Updated the docstring in `critical_analysis_prompt` to provide detailed type annotations for parameters.
- Improved logging in `chat` to differentiate between single and multiple retrieved peer cards.
- Adjusted logging format in `QueueManager` to use a more structured approach for shutdown messages.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🔭 Outside diff range comments (1)
src/deriver/queue_manager.py (1)

16-21: Prefer absolute imports; unify models import for consistency

You switched to an absolute import for exceptions but still use a relative import for models. Per guidelines, prefer absolute imports throughout.

Apply this diff:

-from .. import models
+from src import models
♻️ Duplicate comments (8)
src/deriver/prompts.py (5)

35-45: Skip empty peer-card blocks; use "\n" and filter blank lines

Avoid emitting empty <peer_card> tags and prefer "\n" over chr(10). Also filter out empty lines for cleaner formatting.

-    # Format the peer card as a string with newlines
-    peer_card_section = (
-        f"""
-        The user's known biographical information:
-        <peer_card>
-        {chr(10).join(peer_card)}
-        </peer_card>
-        """
-        if peer_card is not None
-        else ""
-    )
+    # Format the peer card as a string with newlines, skipping empties
+    peer_card_lines = [line for line in (peer_card or []) if line and line.strip()]
+    peer_card_section = (
+        f"""
+The user's known biographical information:
+<peer_card>
+{'\n'.join(peer_card_lines)}
+</peer_card>
+"""
+        if peer_card_lines
+        else ""
+    )

47-56: Skip empty current-context blocks to avoid blank tags

Test truthiness so empty strings don’t produce empty-tag sections.

-    working_representation_section = (
+    working_representation_section = (
         f"""
 The current user understanding:
 <current_context>
 {working_representation}
 </current_context>
 """
-        if working_representation is not None
+        if working_representation
         else ""
     )

111-120: Render old_peer_card as readable text and fall back if empty

Join lines with "\n" and treat empty lists as “no card”.

-    old_peer_card_section = (
-        f"""
-Current user biographical card:
-{chr(10).join(old_peer_card)}
-    """
-        if old_peer_card is not None
-        else """
-User does not have a card. Create one with any key observations.
-    """
-    )
+    old_peer_card_lines = [line for line in (old_peer_card or []) if line and line.strip()]
+    old_peer_card_section = (
+        f"""
+Current user biographical card:
+{'\n'.join(old_peer_card_lines)}
+    """
+        if old_peer_card_lines
+        else """
+User does not have a card. Create one with any key observations.
+    """
+    )

154-156: Format new observations as readable lines and wrap in tags

Avoid Python list repr; join lines and add a clear tag.

-New observations:
-{chr(10).join(new_observations)}
+New observations:
+<new_observations>
+{'\n'.join(obs for obs in new_observations if obs and obs.strip())}
+</new_observations>

157-158: Remove reference to undefined “notes” field

The prompt mentions a “notes field,” but no output schema includes it. This can confuse the model.

-If there's no new key info, you should return an empty card. **NEVER** include notes or temporary information in the card itself, instead use the notes field.
+If there's no new key info, return an empty card. **NEVER** include notes or temporary information in the card.
src/deriver/queue_manager.py (2)

276-281: Good: parameterized logging on shutdown path

The switch to parameterized logging resolves prior lint (G004) and avoids f-string expansion.


250-256: Fix head-of-line blocking and avoid PII leakage in LLMError path

  • continue will re-select the same unprocessed message and can loop indefinitely, stalling the work unit.
  • f-string logs the full QueueItem repr, which likely includes payload (PII risk).
  • Use logger.exception (stack trace) with parameterized logging, capture to Sentry, and fall through so processed=True is applied.

Apply this diff:

-                    except exceptions.LLMError as e:
-                        logger.error(
-                            f"LLM returned bad JSON for message {message}, re-queueing",
-                        )
-                        if settings.SENTRY.ENABLED:
-                            sentry_sdk.capture_exception(e)
-                        continue
+                    except exceptions.LLMError as e:
+                        logger.exception(
+                            "LLM returned malformed JSON for queue item id=%s; "
+                            "marking processed to prevent stalling",
+                            message.id,
+                        )
+                        if settings.SENTRY.ENABLED:
+                            sentry_sdk.capture_exception(e)
+                        # fall through to mark processed and commit
src/dialectic/chat.py (1)

306-317: Handle DB errors and avoid PII leakage when logging peer cards; consider truncation

  • get_peer_card may return None and can still raise (e.g., DB issues). Wrap calls in try/except and default to a safe empty list to keep types consistent.
  • Logging full peer cards at INFO risks leaking PII. Log counts at INFO and full content only at DEBUG.
  • Optional: bound the size of cards (by tokens/lines) before passing to the prompt to prevent context overflow.

Apply this diff:

-    async with tracked_db("chat.get_peer_card") as db:
-        peer_card = await crud.get_peer_card(db, workspace_name, peer_name)
-        if target_name:
-            target_peer_card = await crud.get_peer_card(db, workspace_name, target_name)
-        else:
-            target_peer_card = None
-
-    if target_peer_card:
-        logger.info("Retrieved peer cards:\n%s\n%s", peer_card, target_peer_card)
-    else:
-        logger.info("Retrieved peer card:\n%s", peer_card)
+    async with tracked_db("chat.get_peer_card") as db:
+        try:
+            peer_card = await crud.get_peer_card(db, workspace_name, peer_name)
+        except Exception as e:
+            logger.error(
+                "Failed to retrieve peer_card for peer=%s in workspace=%s: %s",
+                peer_name,
+                workspace_name,
+                e,
+            )
+            peer_card = []
+        if target_name:
+            try:
+                target_peer_card = await crud.get_peer_card(
+                    db, workspace_name, target_name
+                )
+            except Exception as e:
+                logger.warning(
+                    "Failed to retrieve target_peer_card for target=%s in workspace=%s: %s",
+                    target_name,
+                    workspace_name,
+                    e,
+                )
+                target_peer_card = []
+        else:
+            target_peer_card = None
+
+    # Avoid PII leakage: log only counts at INFO; full content gated to DEBUG
+    if logger.isEnabledFor(logging.DEBUG):
+        logger.debug(
+            "Retrieved peer cards content (DEBUG). peer=%s, target=%s\npeer_card=%s\ntarget_peer_card=%s",
+            peer_name,
+            target_name,
+            peer_card,
+            target_peer_card,
+        )
+    if target_peer_card:
+        logger.info(
+            "Retrieved peer cards: peer=%s lines=%d, target=%s lines=%d",
+            peer_name,
+            len(peer_card or []),
+            target_name,
+            len(target_peer_card or []),
+        )
+    else:
+        logger.info(
+            "Retrieved peer card: peer=%s lines=%d",
+            peer_name,
+            len(peer_card or []),
+        )

Optional truncation helper to keep cards within a safe token budget (place near the top of this file, alongside other utils), then call it here before logging:

def truncate_peer_card(
    card: list[str] | None,
    tokenizer: tiktoken.Encoding,
    max_tokens: int = 1000,
) -> list[str]:
    if not card:
        return []
    tokens = tokenizer.encode("\n".join(card))
    tokens = tokens[:max_tokens]
    text = tokenizer.decode(tokens)
    return text.splitlines()

Then invoke:

peer_card = truncate_peer_card(peer_card, tokenizer)
if target_peer_card is not None:
    target_peer_card = truncate_peer_card(target_peer_card, tokenizer)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9f70222 and 1648989.

📒 Files selected for processing (3)
  • src/deriver/prompts.py (3 hunks)
  • src/deriver/queue_manager.py (3 hunks)
  • src/dialectic/chat.py (9 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
{src,tests}/**/*.py

📄 CodeRabbit Inference Engine (CLAUDE.md)

{src,tests}/**/*.py: Follow isort conventions with absolute imports preferred in Python code
Use snake_case for variables/functions and PascalCase for classes in Python
Enforce line length of 88 characters (Black compatible)
Write docstrings using Google style

Files:

  • src/deriver/queue_manager.py
  • src/dialectic/chat.py
  • src/deriver/prompts.py
src/**/*.py

📄 CodeRabbit Inference Engine (CLAUDE.md)

src/**/*.py: Use explicit error handling with appropriate exception types
Use specific exception types (e.g., ResourceNotFoundException, ValidationException) when raising errors
Use proper logging with context instead of print statements

Files:

  • src/deriver/queue_manager.py
  • src/dialectic/chat.py
  • src/deriver/prompts.py
🧠 Learnings (1)
📚 Learning: 2025-08-11T17:17:14.646Z
Learnt from: CR
PR: plastic-labs/honcho#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-11T17:17:14.646Z
Learning: Applies to src/crud.py : Maintain database CRUD operations in src/crud.py

Applied to files:

  • src/dialectic/chat.py
🧬 Code Graph Analysis (2)
src/deriver/queue_manager.py (1)
src/exceptions.py (1)
  • LLMError (95-127)
src/dialectic/chat.py (2)
src/dependencies.py (1)
  • tracked_db (32-61)
src/crud/representation.py (1)
  • get_peer_card (27-45)
🪛 Ruff (0.12.2)
src/deriver/queue_manager.py

251-253: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


252-252: Logging statement uses f-string

(G004)


258-258: Logging .exception(...) should be used instead of .error(..., exc_info=True)

(G201)


259-259: Logging statement uses f-string

(G004)


259-259: Use explicit conversion flag

Replace with conversion flag

(RUF010)

🔇 Additional comments (6)
src/deriver/prompts.py (2)

16-21: Signature change aligns with the new peer-card workflow

Switching to peer_card: list[str] | None and adding working_representation: str | None is consistent with the PR objectives and past feedback. No issues with typing or ordering here.


16-21: All critical_analysis_prompt calls include working_representation
The only invocation in src/deriver/deriver.py already passes working_representation=working_representation, and no other call sites exist. No further updates needed.

src/deriver/queue_manager.py (1)

265-267: Confirm intent: marking messages processed even on exceptions

In src/deriver/queue_manager.py at line 266, message.processed = True is set unconditionally after catching any exception—this will drop items even on transient failures.

  • If avoiding head-of-line blocking is your priority, this is acceptable.
  • If you’d rather retry failed items, consider a bounded retry strategy:
    • Add an attempts counter to the message payload
    • On failure, if attempts < MAX_RETRIES, increment and re-enqueue with exponential backoff
    • Otherwise, mark as processed and route to a dead-letter queue or alert

Let me know if you’d like me to draft a detailed retry/backoff design.

src/dialectic/chat.py (3)

247-250: LGTM: correct collection selection for global vs targeted embeddings

Using GLOBAL_REPRESENTATION_COLLECTION_NAME for self/global queries and construct_collection_name for targeted queries is consistent with the cross-peer representation design.


328-331: LGTM: passing peer cards into dialectic calls

The streaming and non-streaming paths consistently pass peer_card and target_peer_card through. With the above error handling and logging tweaks, this should be robust.

Also applies to: 339-342


80-83: No action needed: dialectic_prompt already supports peer_card and target_peer_card
The signature in src/dialectic/prompts.py defines:

def dialectic_prompt(
    query: str,
    working_representation: str | None,
    recent_conversation_history: str | None,
    additional_context: str | None,
    peer_name: str,
    peer_card: list[str] | None,
    target_peer_card: list[str] | None = None,
    …
):
    …
    {chr(10).join(peer_card) if peer_card else "(none)"}
    {chr(10).join(target_peer_card) if target_peer_card else "(none)"}
    …

Both parameters are declared with the correct list[str] | None types, and the implementation safely handles None.

Comment thread src/deriver/prompts.py
Comment on lines 25 to 34
Args:
peer_name: The name of the user being analyzed
message_created_at: Timestamp of the message being analyzed
context: Current user understanding context
history: Recent conversation history
new_turn: New conversation turn to analyze
peer_card (list[str] | None): The bio card of the user being analyzed.
message_created_at (datetime.datetime): Timestamp of the message.
working_representation (str | None): Current user understanding context.
history (str): Recent conversation history.
new_turn (str): New conversation turn to analyze.

Returns:
Formatted prompt string for critical analysis
"""

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Docstring: include explicit return type per Google style

Add the return type to comply with the repo’s docstring guidelines.

     Returns:
-        Formatted prompt string for critical analysis
+        str: Formatted prompt string for critical analysis.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Args:
peer_name: The name of the user being analyzed
message_created_at: Timestamp of the message being analyzed
context: Current user understanding context
history: Recent conversation history
new_turn: New conversation turn to analyze
peer_card (list[str] | None): The bio card of the user being analyzed.
message_created_at (datetime.datetime): Timestamp of the message.
working_representation (str | None): Current user understanding context.
history (str): Recent conversation history.
new_turn (str): New conversation turn to analyze.
Returns:
Formatted prompt string for critical analysis
"""
Args:
peer_card (list[str] | None): The bio card of the user being analyzed.
message_created_at (datetime.datetime): Timestamp of the message.
working_representation (str | None): Current user understanding context.
history (str): Recent conversation history.
new_turn (str): New conversation turn to analyze.
Returns:
str: Formatted prompt string for critical analysis.
"""
🤖 Prompt for AI Agents
In src/deriver/prompts.py around lines 25 to 34 the docstring "Returns:" section
lacks an explicit return type per Google style; update the Returns block to
include the return type (e.g., "str:") followed by the existing description so
it reads like "Returns: str: Formatted prompt string for critical analysis" (or
the exact return type used by the function), ensuring the type and description
are on the same Returns line or on the next indented line as per Google
docstring conventions.

Comment thread src/deriver/prompts.py
Comment on lines +102 to +110
@prompt_template()
def peer_card_prompt(
old_peer_card: list[str] | None,
new_observations: list[str],
) -> str:
"""
Generate the peer card prompt for the deriver.
Currently optimized for GPT-5 mini/nano.
"""

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Add Google-style Args/Returns to peer_card_prompt docstring

Document parameters and return type to match the repo standards.

 @prompt_template()
 def peer_card_prompt(
     old_peer_card: list[str] | None,
     new_observations: list[str],
 ) -> str:
     """
-    Generate the peer card prompt for the deriver.
-    Currently optimized for GPT-5 mini/nano.
+    Generate the peer card prompt for the deriver.
+
+    Currently optimized for GPT-5 mini/nano.
+
+    Args:
+        old_peer_card (list[str] | None): Existing biographical card, if any.
+        new_observations (list[str]): New observations to incorporate.
+
+    Returns:
+        str: A formatted prompt instructing the model to produce a concise card.
     """
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@prompt_template()
def peer_card_prompt(
old_peer_card: list[str] | None,
new_observations: list[str],
) -> str:
"""
Generate the peer card prompt for the deriver.
Currently optimized for GPT-5 mini/nano.
"""
@prompt_template()
def peer_card_prompt(
old_peer_card: list[str] | None,
new_observations: list[str],
) -> str:
"""
Generate the peer card prompt for the deriver.
Currently optimized for GPT-5 mini/nano.
Args:
old_peer_card (list[str] | None): Existing biographical card, if any.
new_observations (list[str]): New observations to incorporate.
Returns:
str: A formatted prompt instructing the model to produce a concise card.
"""
🤖 Prompt for AI Agents
In src/deriver/prompts.py around lines 102 to 110, the peer_card_prompt
docstring is missing Google-style Args and Returns sections; add an "Args:"
section documenting old_peer_card (list[str] | None) with meaning and
optionality and new_observations (list[str]) with meaning, and add a "Returns:"
line stating it returns a str (the generated prompt). Keep descriptions brief
and consistent with repo docstring style and placement directly under the
existing one-line description.

Comment on lines 257 to 261
except Exception as e:
logger.error(
f"Error processing queue item for task type {message.task_type} with id {message.id}: {str(e)}",
exc_info=True,
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Use logging.exception and parameterized logging for generic exceptions

Switch from error(..., exc_info=True) and f-strings to exception(...) with placeholders. This satisfies Ruff (G004, G201, RUF010) and avoids leaking PII via f-strings.

Apply this diff:

-                    except Exception as e:
-                        logger.error(
-                            f"Error processing queue item for task type {message.task_type} with id {message.id}: {str(e)}",
-                            exc_info=True,
-                        )
+                    except Exception as e:
+                        logger.exception(
+                            "Error processing queue item task_type=%s id=%s",
+                            message.task_type,
+                            message.id,
+                        )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
except Exception as e:
logger.error(
f"Error processing queue item for task type {message.task_type} with id {message.id}: {str(e)}",
exc_info=True,
)
except Exception as e:
logger.exception(
"Error processing queue item task_type=%s id=%s",
message.task_type,
message.id,
)
🧰 Tools
🪛 Ruff (0.12.2)

258-258: Logging .exception(...) should be used instead of .error(..., exc_info=True)

(G201)


259-259: Logging statement uses f-string

(G004)


259-259: Use explicit conversion flag

Replace with conversion flag

(RUF010)

🤖 Prompt for AI Agents
In src/deriver/queue_manager.py around lines 257 to 261, replace the f-string +
logger.error(..., exc_info=True) with a call to logger.exception using
parameterized logging; specifically, remove the f-string, use a format string
with %s placeholders for task_type and id, and pass message.task_type and
message.id as separate arguments so the exception info is automatically logged
and PII is not embedded in the message.

Comment thread src/dialectic/chat.py

from src import crud
from src.config import settings
from src.crud.representation import GLOBAL_REPRESENTATION_COLLECTION_NAME

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Prefer importing constants via src.crud public API to avoid brittle deep imports

Re-export GLOBAL_REPRESENTATION_COLLECTION_NAME from src/crud/init.py and import it from src.crud. This keeps import paths stable and aligned with isort grouping.

Apply this diff here:

-from src.crud.representation import GLOBAL_REPRESENTATION_COLLECTION_NAME
+from src.crud import GLOBAL_REPRESENTATION_COLLECTION_NAME

Add this to src/crud/init.py (if not already present):

from .representation import GLOBAL_REPRESENTATION_COLLECTION_NAME

__all__ = [
    # ...
    "GLOBAL_REPRESENTATION_COLLECTION_NAME",
]
🤖 Prompt for AI Agents
In src/dialectic/chat.py around line 20, the code imports
GLOBAL_REPRESENTATION_COLLECTION_NAME via a deep import from
src.crud.representation which is brittle; instead re-export the constant from
src.crud and update the import. Add from .representation import
GLOBAL_REPRESENTATION_COLLECTION_NAME to src/crud/__init__.py and include
"GLOBAL_REPRESENTATION_COLLECTION_NAME" in that module's __all__, then change
the import in src/dialectic/chat.py to import it from src.crud (i.e., from
src.crud import GLOBAL_REPRESENTATION_COLLECTION_NAME) so imports remain stable
and align with isort.

Comment thread src/dialectic/chat.py
Comment on lines +57 to 60
peer_card: list[str] | None,
target_name: str | None = None,
target_peer_card: list[str] | None = None,
):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Update docstrings to include the new peer_card/target_peer_card params (Google style)

Both functions added parameters but the docstrings don’t document them. Please update to keep docs accurate per our guidelines.

Example for dialectic_call (apply similar to dialectic_stream):

async def dialectic_call(...):
    """
    Make a direct call to the dialectic model for context synthesis.

    Args:
        query: The user query.
        working_representation: Current session conclusions (short-term context).
        recent_conversation_history: Recent messages/summaries for the session.
        additional_context: Historical context from semantic search.
        peer_name: Name of the user/peer initiating the query.
        peer_card: Optional biography card for the peer (lines of text).
        target_name: Optional name of the peer being queried about.
        target_peer_card: Optional biography card for the target peer (lines of text).

    Returns:
        Model prompt/messages to be sent by the LLM client wrapper.
    """

Also applies to: 117-120

🤖 Prompt for AI Agents
In src/dialectic/chat.py around lines 57-60 and 117-120, the docstrings for the
functions were not updated to document the newly added peer_card and
target_peer_card parameters; update each function's Google-style docstring Args
section to list peer_card as "Optional biography card for the peer (lines of
text)" and target_peer_card as "Optional biography card for the target peer
(lines of text)" (also keep peer_name/target_name descriptions consistent with
existing style), and ensure the Returns section remains accurate — mirror the
example provided in the review to keep formatting and wording consistent across
both dialectic_call and dialectic_stream.

@VVoruganti VVoruganti merged commit 7a9746c into main Aug 12, 2025
10 of 11 checks passed
@VVoruganti VVoruganti deleted the ben/peer-cards branch August 12, 2025 21:39
ranc1 pushed a commit to ranc1/honcho that referenced this pull request Apr 16, 2026
* chore: fill out missing metadata inputs in python sdk

* feat: add get_peer_config to python sdk, thoroughly document ts sdk and remove bad client usage

* feat: zod
chore: update tests
chore: bump version, changelog

* chore: python sdk version bump and changelog

* [WIP] feat: combine search methods and rework endpoint to include limit param

* chore: test new stainless config with library

* nits: coderabbit

* Merge branch 'ben/sdk-improvements' into ben/search-rrf

* chore: pre-commit hooks cleanup

* feat: thoroughly document observation config

* Update sdks/python/src/honcho/peer.py

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* chore: v1.3.0

* feat: update version to 2.2.0 and enhance search functionality with arbitrary filters

- Remove unused config variables
- Added arbitrary filters to all search endpoints.
- Pluralize `filters` everywhere in SDKs for consistency
- Updated documentation and changelog to reflect these changes.

* expose core client in TS and Python SDKs (plastic-labs#150)

* expose core client from sdks

* align text

* fix: resolve get_effective_observe me race condition, default peer config (plastic-labs#176)

* fix: resolve get_effective_observe me race condition, default peer config

* fix: preserve custom config even after leaving

* chore: test cases, enqueue types

* Update sdks/typescript/package.json

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

---------

Co-authored-by: doria <93405247+dr-frmr@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* chore: formatting

* chore: revert undesired changes to v1 spec, clean up docs, coderabbit

* feat: better search docs, fix worker.ts

* fix: correctly make ts params optional in cases, update docs

* chore: coderabbit

* chore: remove spurious package-lock

* feat: [WIP] introduce peer cards

* chore: remove search.mdx

* chore: clean up deriver

* feat: peer cards working in deriver

* feat: add basic peer_card_bench

* feat: refine peer card prompt, add mini-benchmark, switch to gpt-5-nano

* refactor: update peer card handling in dialectic functions and improve error handling

- Enhanced `get_peer_card` function to handle `ResourceNotFoundException`.
- Updated `dialectic_call` and `dialectic_stream` to accept `peer_card` and `target_peer_card` parameters.
- Modified prompt generation to include peer card information.
- Cleaned up whitespace in several files for consistency.

* chore: update mirascope dependency version in configuration files

- Bumped mirascope version from 1.25.1 to 1.25.5 in pyproject.toml and uv.lock.
- Added a note in config.py regarding peer card output token handling.
- Removed unnecessary comments in clients.py for clarity.

* fix: [coderabbit] improve error handling in set_peer_card and enhance logging

- Added a check in `set_peer_card` to raise `ResourceNotFoundException` if the peer does not exist.
- Updated logging in `CertaintyReasoner` to capture exceptions with Sentry when enabled.
- Refined logging messages for clarity and consistency across various functions.
- Cleaned up whitespace and formatting in several files for improved readability.

* refactor: update working representation handling and improve metadata key usage

- Introduced constants for representation collection names to enhance clarity and maintainability.
- Updated function signatures in `get_working_representation` and `set_working_representation` to require `session_name`.
- Simplified metadata key determination logic by using constants instead of hardcoded strings.
- Removed legacy fallback logic for working representation data retrieval.
- Refactored `save_working_representation_to_peer` to utilize the new `set_working_representation` function for improved code reuse.

* chore: update configuration files and enhance working representation settings

- Added new peer card settings and context token limits to `.env.template`, `config.toml.example`, and documentation.
- Introduced `WORKING_REPRESENTATION_MAX_OBSERVATIONS` to `DeriverSettings` for better control over observation storage.
- Updated `set_working_representation` to merge new observations while respecting the maximum limit.
- Improved docstrings for clarity and consistency across functions.

* feat: introduce LLMError exception and enhance error handling in deriver

- Added LLMError exception to handle failures in LLM calls, normalizing inputs into a JSON-serializable format.
- Updated CertaintyReasoner to raise LLMError on exceptions during LLM function calls.
- Enhanced QueueManager to log LLMError occurrences and re-queue messages appropriately.
- Modified test runner to support asynchronous operations and improved output formatting for test results.
- Updated test cases to include session information for better context.

* feat: add __repr__ method to QueueItem for improved string representation

- Implemented a __repr__ method in the QueueItem class to provide a clear and informative string representation of its attributes.
- Updated timeout handling in TestRunner to default to 10000.0 seconds when timeout_seconds is not set, enhancing robustness in polling operations.

* refactor: update peer card data structure and improve handling in related functions

- Changed return type of `get_peer_card` and `set_peer_card` to use `list[str]` instead of `str | None`.
- Updated `peer_card_call` and related functions to accommodate the new list structure for peer cards.
- Introduced `PeerCardQuery` model to standardize responses from peer card queries.
- Adjusted prompt generation in `peer_card_prompt` to reflect the new data structure.
- Modified benchmark tests to align with the updated peer card handling.

* refactor: adjust peer card output token settings and update related functions

- Increased `PEER_CARD_MAX_OUTPUT_TOKENS` from 2000 to 4000 in `DeriverSettings`.
- Updated `critical_analysis_call` to use `json_mode` and removed unused parameters.
- Modified benchmark tests to utilize the new `PEER_CARD_MAX_OUTPUT_TOKENS` setting.
- Removed obsolete `add_dislike.json` test file.

* refactor: update peer card handling in critical analysis and dialectic prompts

- Changed `peer_card` parameter type from `str | None` to `list[str] | None` in `critical_analysis_call` and related functions.
- Simplified error handling in `process_representation_task` by removing redundant try-except block.
- Updated prompt generation in `critical_analysis_prompt` and `dialectic_prompt` to format `peer_card` as a string with newlines.
- Adjusted benchmark tests to reflect changes in peer card structure and output formatting.

* refactor: update peer card test cases to use list structure

- Modified test cases in `test_representation_crud.py` to reflect the change in `peer_card` parameter type from `str` to `list[str]`.
- Updated assertions to accommodate the new list format for setting and retrieving peer cards.
- Ensured that tests for missing peers correctly handle the list input format.

* fix: improve formatting of peer card output in prompts

- Updated `peer_card_prompt` to join `old_peer_card` list elements with newlines for better readability.
- Removed outdated comment in `dialectic_prompt` regarding handling of non-existent cards.

* chore: [coderabbit] enhance docstring and logging in prompts and queue manager

- Updated the docstring in `critical_analysis_prompt` to provide detailed type annotations for parameters.
- Improved logging in `chat` to differentiate between single and multiple retrieved peer cards.
- Adjusted logging format in `QueueManager` to use a more structured approach for shutdown messages.

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Rajat Ahuja <rahuja445@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants