Skip to content

Add postgresql to container build#5

Merged
crivetimihai merged 1 commit intomainfrom
postgresql-dependency
May 27, 2025
Merged

Add postgresql to container build#5
crivetimihai merged 1 commit intomainfrom
postgresql-dependency

Conversation

@crivetimihai
Copy link
Copy Markdown
Member

No description provided.

@crivetimihai crivetimihai merged commit 973ed20 into main May 27, 2025
0 of 3 checks passed
@crivetimihai crivetimihai deleted the postgresql-dependency branch June 2, 2025 06:14
vk-playground pushed a commit to vk-playground/mcp-context-forge that referenced this pull request Sep 14, 2025
Add postgresql to container build

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
vk-playground pushed a commit to vk-playground/mcp-context-forge that referenced this pull request Sep 14, 2025
Add postgresql to container build
vk-playground pushed a commit to vk-playground/mcp-context-forge that referenced this pull request Sep 16, 2025
Add postgresql to container build
Signed-off-by: Vicky Kuo <vicky.kuo@ibm.com>
hughhennelly added a commit to hughhennelly/mcp-context-forge that referenced this pull request Feb 12, 2026
- Fix database import: from ..database to from ..db
- Fix unified_pdp imports: from plugins.unified_pdp (Issue #1)
- Remove scratch header files (Issue IBM#5)

Related to PR feedback on Issue IBM#2226

Signed-off-by: hughhennelly <hughhennelly06@gmail.com>
hughhennelly added a commit to hughhennelly/mcp-context-forge that referenced this pull request Feb 12, 2026
1. Fix broken imports (Issue #1):
   - Change from ..database to ..db
   - Fix unified_pdp imports to use plugins.unified_pdp
   - Update in routes, services, schemas, and tests

2. Register sandbox router in main.py (Issue IBM#2):
   - Add import and app.include_router call

3. Fix XSS vulnerability (Issue IBM#3):
   - Replace f-string HTML with Jinja2 template
   - Create sandbox_simulate_results.html template
   - Add Request parameter for template access

4. Add authentication (Issue IBM#4):
   - Add Depends(get_current_user) to simulate endpoint

5. Remove scratch files (Issue IBM#5):
   - Delete sandbox_header.txt and sandbox_new_header.txt

6. Resolve schemas conflict (Issue IBM#6):
   - Merge schemas/sandbox.py into schemas.py
   - Remove conflicting schemas/ directory
   - Update imports in routes and services

All changes tested and ready for review.

Related to IBM#2226

Signed-off-by: hughhennelly <hughhennelly06@gmail.com>
yiannis2804 added a commit to yiannis2804/mcp-context-forge that referenced this pull request Feb 19, 2026
Address code review feedback from @jonpspri:

Problem: PR description claimed 'Full audit trail of all access decisions'
but _log_decision() only logged to Python logger with TODO comment.

Solution:
- Updated docstring to clarify DB audit logging is Phase 2
- Changed logging level from INFO to DEBUG (reduce production noise)
- AccessDecisionLog table is created and ready for Phase 2
- Honest about current state vs future implementation

Result:
- Clear documentation that DB audit is Phase 2 scaffolding
- Application logging still captures all decisions
- Table structure ready for Phase 2 implementation
- No misleading claims in PR description

Related: PR IBM#2682 Phase 1 Code Review Item IBM#5
Signed-off-by: yiannis2804 <yiannis2804@gmail.com>
crivetimihai pushed a commit that referenced this pull request Feb 24, 2026
Address code review feedback from @jonpspri:

Problem: PR description claimed 'Full audit trail of all access decisions'
but _log_decision() only logged to Python logger with TODO comment.

Solution:
- Updated docstring to clarify DB audit logging is Phase 2
- Changed logging level from INFO to DEBUG (reduce production noise)
- AccessDecisionLog table is created and ready for Phase 2
- Honest about current state vs future implementation

Result:
- Clear documentation that DB audit is Phase 2 scaffolding
- Application logging still captures all decisions
- Table structure ready for Phase 2 implementation
- No misleading claims in PR description

Related: PR #2682 Phase 1 Code Review Item #5
Signed-off-by: yiannis2804 <yiannis2804@gmail.com>
crivetimihai pushed a commit that referenced this pull request Feb 24, 2026
- Fix database import: from ..database to from ..db
- Fix unified_pdp imports: from plugins.unified_pdp (Issue #1)
- Remove scratch header files (Issue #5)

Related to PR feedback on Issue #2226

Signed-off-by: hughhennelly <hughhennelly06@gmail.com>
crivetimihai pushed a commit that referenced this pull request Feb 24, 2026
1. Fix broken imports (Issue #1):
   - Change from ..database to ..db
   - Fix unified_pdp imports to use plugins.unified_pdp
   - Update in routes, services, schemas, and tests

2. Register sandbox router in main.py (Issue #2):
   - Add import and app.include_router call

3. Fix XSS vulnerability (Issue #3):
   - Replace f-string HTML with Jinja2 template
   - Create sandbox_simulate_results.html template
   - Add Request parameter for template access

4. Add authentication (Issue #4):
   - Add Depends(get_current_user) to simulate endpoint

5. Remove scratch files (Issue #5):
   - Delete sandbox_header.txt and sandbox_new_header.txt

6. Resolve schemas conflict (Issue #6):
   - Merge schemas/sandbox.py into schemas.py
   - Remove conflicting schemas/ directory
   - Update imports in routes and services

All changes tested and ready for review.

Related to #2226

Signed-off-by: hughhennelly <hughhennelly06@gmail.com>
MohanLaksh added a commit that referenced this pull request Apr 8, 2026
This commit addresses valid concerns raised in PR review:

1. Fix class docstring showing old API (madhu-mohan-jaishankar)
   - Updated ObservabilityService class docstring to show correct API
   - Removed db parameter from examples (old: start_trace(db, name))
   - Added clarification about write vs query methods
   - Examples now demonstrate correct usage with proper parameters

2. Fix context manager atomicity (msureshkumar88 issue #5)
   - Added optional obs_db parameter to add_event() method
   - Context managers now pass their session to add_event() in error paths
   - Ensures atomic commit of span updates and error events together
   - Prevents orphaned events when span update fails
   - Updated all three context managers:
     * trace_span() at line 644
     * trace_tool_invocation() at line 757
     * trace_a2a_request() at line 1112

3. Document connection pool sizing (msureshkumar88 issue #7)
   - Added Connection Pool Sizing section to AGENTS.md
   - Explains 4-6 sessions per traced request pattern
   - Documents default pool (200+10 = 210 connections)
   - Supports ~35 concurrent traced requests by default
   - Guidance for high-traffic deployments (>50 req/sec)
   - Note about SQLite 50-connection cap

Implementation details:
- add_event() now accepts optional obs_db: Optional[Session] = None
- When obs_db is provided, uses caller's session (owned = False)
- When obs_db is None, creates independent session (owned = True)
- Only closes session if owned is True
- All existing tests pass (87 observability service tests, 23 middleware tests)

Tests: All 87 observability service tests passing
       All 23 observability middleware tests passing
       100% differential coverage maintained

Addresses review feedback from:
- @madhu-mohan-jaishankar (approved with comments)
- @msureshkumar88 (issues #5 and #7 from detailed review)

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
brian-hussey pushed a commit that referenced this pull request Apr 8, 2026
This commit addresses valid concerns raised in PR review:

1. Fix class docstring showing old API (madhu-mohan-jaishankar)
   - Updated ObservabilityService class docstring to show correct API
   - Removed db parameter from examples (old: start_trace(db, name))
   - Added clarification about write vs query methods
   - Examples now demonstrate correct usage with proper parameters

2. Fix context manager atomicity (msureshkumar88 issue #5)
   - Added optional obs_db parameter to add_event() method
   - Context managers now pass their session to add_event() in error paths
   - Ensures atomic commit of span updates and error events together
   - Prevents orphaned events when span update fails
   - Updated all three context managers:
     * trace_span() at line 644
     * trace_tool_invocation() at line 757
     * trace_a2a_request() at line 1112

3. Document connection pool sizing (msureshkumar88 issue #7)
   - Added Connection Pool Sizing section to AGENTS.md
   - Explains 4-6 sessions per traced request pattern
   - Documents default pool (200+10 = 210 connections)
   - Supports ~35 concurrent traced requests by default
   - Guidance for high-traffic deployments (>50 req/sec)
   - Note about SQLite 50-connection cap

Implementation details:
- add_event() now accepts optional obs_db: Optional[Session] = None
- When obs_db is provided, uses caller's session (owned = False)
- When obs_db is None, creates independent session (owned = True)
- Only closes session if owned is True
- All existing tests pass (87 observability service tests, 23 middleware tests)

Tests: All 87 observability service tests passing
       All 23 observability middleware tests passing
       100% differential coverage maintained

Addresses review feedback from:
- @madhu-mohan-jaishankar (approved with comments)
- @msureshkumar88 (issues #5 and #7 from detailed review)

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
brian-hussey added a commit that referenced this pull request Apr 8, 2026
…ons (#4050)

* feat: implement separate observability session pattern (best-effort)

Observability write operations now use independent database sessions that
commit immediately on a best-effort basis, separate from main request
transactions. This ensures observability data persists even when the main
request fails, providing visibility into partial failures.

Implementation details:
- Observability write methods (start_trace, end_trace, start_span, end_span,
  add_event, record_metric, record_token_usage, delete_old_traces) now create
  their own independent sessions using SessionLocal() instead of accepting a
  db parameter from the caller
- Context managers (trace_span, trace_tool_invocation, trace_a2a_request)
  create and manage a single session for the full span lifecycle
- Query methods still accept db: Session parameter for RBAC/token scoping
- ObservabilityMiddleware no longer creates or manages request.state.db
- Each observability operation creates its own short-lived session that
  commits immediately in a try/except/finally block
- Updated all callers across resource_service, prompt_service, tool_service,
  and instrumentation/sqlalchemy to use new API (removed db parameters)
- Added comprehensive test coverage verifying separate session behavior
- Fixed pre-existing bugs in add_event() and record_metric() where db.refresh()
  was called after commit failure

Benefits:
- Observability data persists independently of main request transaction success
- Provides visibility into partial failures and error states
- Follows pattern established in instrumentation/sqlalchemy.py (lines 58-87)
- Reduces transaction contention between observability and business logic

Trade-offs:
- NOT atomic with main request transaction (intentional design choice)
- Traces may show "in progress" or partial states for failed requests
- Provides eventual consistency rather than strict consistency

Documentation:
- Added "Observability Transaction Behavior" section to AGENTS.md explaining
  the separate session pattern, implementation details, and security notes
- Updated module docstrings with session management details

Tests: 96 observability tests passing, full suite 16,673 tests passing
Pylint: 10.00/10 rating maintained

Closes #3883

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* fix: add type annotations to observability context managers

Add proper return type annotations to context manager methods:
- trace_span() -> Generator[str, None, None]
- trace_tool_invocation() -> Generator[Tuple[Optional[str], Dict[str, Any]], None, None]
- trace_a2a_request() -> Generator[Tuple[Optional[str], Dict[str, Any]], None, None]

These annotations improve type safety and make the return values explicit.
The trace_span context manager yields just the span_id string, while
trace_tool_invocation and trace_a2a_request yield a tuple of (span_id, result_dict)
for capturing operation results.

Tests: 96 observability tests passing
Pylint: 10.00/10 rating maintained
Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* test: Add coverage tests for observability session close exception handling

Add 11 new test functions to achieve 100% coverage of exception handling
paths in observability_service.py. These tests verify that all write
methods and context managers handle session close failures gracefully:

- test_start_trace_session_close_failure
- test_end_trace_session_close_failure
- test_start_span_session_close_failure
- test_start_span_with_commit_true
- test_end_span_session_close_failure
- test_add_event_session_close_failure
- test_record_metric_session_close_failure
- test_delete_old_traces_session_close_failure
- test_trace_span_context_manager_session_close_failure
- test_trace_tool_invocation_context_manager_session_close_failure
- test_trace_a2a_request_context_manager_session_close_failure

Each test mocks the session close() method to raise an exception and
verifies that the operation completes successfully despite the close
failure. This covers the previously uncovered lines in finally blocks
where session close exceptions are caught and logged.

Resolves CI/CD coverage requirement for observability_service.py.

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* fix: complete separate session migration for delete_old_traces and record_transport_activity

- Fix runtime TypeError in cleanup_old_traces router: delete_old_traces()
  no longer accepts db parameter (missed call site from #3883)
- Remove unused db parameter from record_transport_activity() for API
  consistency with other write methods
- Update docstring and doctest for delete_old_traces call in router
- Update test calls to match new record_transport_activity signature
- Add 3 tests for 100% differential coverage: error-path exception
  handlers in trace_tool_invocation, trace_a2a_request, and session
  close failure in record_token_usage

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix: address review feedback for separate observability session pattern

This commit addresses valid concerns raised in PR review:

1. Fix class docstring showing old API (madhu-mohan-jaishankar)
   - Updated ObservabilityService class docstring to show correct API
   - Removed db parameter from examples (old: start_trace(db, name))
   - Added clarification about write vs query methods
   - Examples now demonstrate correct usage with proper parameters

2. Fix context manager atomicity (msureshkumar88 issue #5)
   - Added optional obs_db parameter to add_event() method
   - Context managers now pass their session to add_event() in error paths
   - Ensures atomic commit of span updates and error events together
   - Prevents orphaned events when span update fails
   - Updated all three context managers:
     * trace_span() at line 644
     * trace_tool_invocation() at line 757
     * trace_a2a_request() at line 1112

3. Document connection pool sizing (msureshkumar88 issue #7)
   - Added Connection Pool Sizing section to AGENTS.md
   - Explains 4-6 sessions per traced request pattern
   - Documents default pool (200+10 = 210 connections)
   - Supports ~35 concurrent traced requests by default
   - Guidance for high-traffic deployments (>50 req/sec)
   - Note about SQLite 50-connection cap

Implementation details:
- add_event() now accepts optional obs_db: Optional[Session] = None
- When obs_db is provided, uses caller's session (owned = False)
- When obs_db is None, creates independent session (owned = True)
- Only closes session if owned is True
- All existing tests pass (87 observability service tests, 23 middleware tests)

Tests: All 87 observability service tests passing
       All 23 observability middleware tests passing
       100% differential coverage maintained

Addresses review feedback from:
- @madhu-mohan-jaishankar (approved with comments)
- @msureshkumar88 (issues #5 and #7 from detailed review)

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* fix: address remaining review concerns with safety improvements

This commit addresses all remaining valid reviewer concerns:

1. Document dead code guard design choice (madhu-mohan-jaishankar issue #1)
   - Added Note section to _get_or_create_observability_session() docstring
   - Explains tuple return pattern enables future session reuse optimization
   - Guards remain for forward compatibility (maintainer approved pattern)

2. Move session creation inside try blocks (msureshkumar88 issue #1)
   - Initialize obs_db = None and owned = False before try blocks
   - Create session inside try to prevent theoretical resource leak
   - Update finally blocks to check `if owned and obs_db is not None:`
   - Applied to all 8 write methods:
     * start_trace() (line 317)
     * end_trace() (line 385)
     * trace_span() context manager (line 619)
     * trace_tool_invocation() context manager (line 719)
     * add_event() (line 836)
     * record_token_usage() (line 926)
     * trace_a2a_request() context manager (line 1087)
     * record_metric() (line 1306)
     * delete_old_traces() (line 1718)

3. Add header sanitization (msureshkumar88 issue #10)
   - Added sanitize_header_for_storage() helper function
   - Removes control characters (prevents log injection)
   - Truncates to max_length (prevents DoS via large headers)
   - Applied to user_agent (max 500 chars) and http_url (max 2000 chars)
   - Defense-in-depth security improvement

Implementation details:
- sanitize_header_for_storage() removes non-printable chars except space/tab
- Returns "unknown" for None/empty values
- All 87 observability service tests passing
- All 23 observability middleware tests passing

Deferred to future PR (with justification):
- Code duplication refactoring (issue #9): Too complex and risky for this PR.
  The three context managers share patterns but have significant domain-specific
  differences (attributes, yield signatures, sanitization). Refactoring would
  require extensive testing and carries risk of subtle bugs. Current code is
  correct and well-tested. Recommend separate focused PR for refactoring.

Tests: All 110 observability tests passing (87 service + 23 middleware)
       100% differential coverage maintained

Addresses remaining review feedback from:
- @madhu-mohan-jaishankar (issue #1 - dead code guards)
- @msureshkumar88 (issues #1, #10 - resource leak, header sanitization)

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* fix: correct doctest examples in sanitize_header_for_storage

The function removes control characters completely (doesn't replace with spaces)
and truncates without adding ellipsis. Updated doctest examples to match
actual behavior:
- Control chars removed: 'Evil\x00\nInjection' -> 'EvilInjection'
- Truncation verified by length check instead of expecting '...'
- Added test for None input returning 'unknown'

All doctests now passing.

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* Update secrets following rebase

Signed-off-by: Brian Hussey <brian.hussey@ie.ibm.com>

---------

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Signed-off-by: Brian Hussey <brian.hussey@ie.ibm.com>
Co-authored-by: Mihai Criveti <crivetimihai@gmail.com>
Co-authored-by: Brian Hussey <brian.hussey@ie.ibm.com>
jonpspri pushed a commit that referenced this pull request Apr 10, 2026
…ons (#4050)

* feat: implement separate observability session pattern (best-effort)

Observability write operations now use independent database sessions that
commit immediately on a best-effort basis, separate from main request
transactions. This ensures observability data persists even when the main
request fails, providing visibility into partial failures.

Implementation details:
- Observability write methods (start_trace, end_trace, start_span, end_span,
  add_event, record_metric, record_token_usage, delete_old_traces) now create
  their own independent sessions using SessionLocal() instead of accepting a
  db parameter from the caller
- Context managers (trace_span, trace_tool_invocation, trace_a2a_request)
  create and manage a single session for the full span lifecycle
- Query methods still accept db: Session parameter for RBAC/token scoping
- ObservabilityMiddleware no longer creates or manages request.state.db
- Each observability operation creates its own short-lived session that
  commits immediately in a try/except/finally block
- Updated all callers across resource_service, prompt_service, tool_service,
  and instrumentation/sqlalchemy to use new API (removed db parameters)
- Added comprehensive test coverage verifying separate session behavior
- Fixed pre-existing bugs in add_event() and record_metric() where db.refresh()
  was called after commit failure

Benefits:
- Observability data persists independently of main request transaction success
- Provides visibility into partial failures and error states
- Follows pattern established in instrumentation/sqlalchemy.py (lines 58-87)
- Reduces transaction contention between observability and business logic

Trade-offs:
- NOT atomic with main request transaction (intentional design choice)
- Traces may show "in progress" or partial states for failed requests
- Provides eventual consistency rather than strict consistency

Documentation:
- Added "Observability Transaction Behavior" section to AGENTS.md explaining
  the separate session pattern, implementation details, and security notes
- Updated module docstrings with session management details

Tests: 96 observability tests passing, full suite 16,673 tests passing
Pylint: 10.00/10 rating maintained

Closes #3883

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* fix: add type annotations to observability context managers

Add proper return type annotations to context manager methods:
- trace_span() -> Generator[str, None, None]
- trace_tool_invocation() -> Generator[Tuple[Optional[str], Dict[str, Any]], None, None]
- trace_a2a_request() -> Generator[Tuple[Optional[str], Dict[str, Any]], None, None]

These annotations improve type safety and make the return values explicit.
The trace_span context manager yields just the span_id string, while
trace_tool_invocation and trace_a2a_request yield a tuple of (span_id, result_dict)
for capturing operation results.

Tests: 96 observability tests passing
Pylint: 10.00/10 rating maintained
Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* test: Add coverage tests for observability session close exception handling

Add 11 new test functions to achieve 100% coverage of exception handling
paths in observability_service.py. These tests verify that all write
methods and context managers handle session close failures gracefully:

- test_start_trace_session_close_failure
- test_end_trace_session_close_failure
- test_start_span_session_close_failure
- test_start_span_with_commit_true
- test_end_span_session_close_failure
- test_add_event_session_close_failure
- test_record_metric_session_close_failure
- test_delete_old_traces_session_close_failure
- test_trace_span_context_manager_session_close_failure
- test_trace_tool_invocation_context_manager_session_close_failure
- test_trace_a2a_request_context_manager_session_close_failure

Each test mocks the session close() method to raise an exception and
verifies that the operation completes successfully despite the close
failure. This covers the previously uncovered lines in finally blocks
where session close exceptions are caught and logged.

Resolves CI/CD coverage requirement for observability_service.py.

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* fix: complete separate session migration for delete_old_traces and record_transport_activity

- Fix runtime TypeError in cleanup_old_traces router: delete_old_traces()
  no longer accepts db parameter (missed call site from #3883)
- Remove unused db parameter from record_transport_activity() for API
  consistency with other write methods
- Update docstring and doctest for delete_old_traces call in router
- Update test calls to match new record_transport_activity signature
- Add 3 tests for 100% differential coverage: error-path exception
  handlers in trace_tool_invocation, trace_a2a_request, and session
  close failure in record_token_usage

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix: address review feedback for separate observability session pattern

This commit addresses valid concerns raised in PR review:

1. Fix class docstring showing old API (madhu-mohan-jaishankar)
   - Updated ObservabilityService class docstring to show correct API
   - Removed db parameter from examples (old: start_trace(db, name))
   - Added clarification about write vs query methods
   - Examples now demonstrate correct usage with proper parameters

2. Fix context manager atomicity (msureshkumar88 issue #5)
   - Added optional obs_db parameter to add_event() method
   - Context managers now pass their session to add_event() in error paths
   - Ensures atomic commit of span updates and error events together
   - Prevents orphaned events when span update fails
   - Updated all three context managers:
     * trace_span() at line 644
     * trace_tool_invocation() at line 757
     * trace_a2a_request() at line 1112

3. Document connection pool sizing (msureshkumar88 issue #7)
   - Added Connection Pool Sizing section to AGENTS.md
   - Explains 4-6 sessions per traced request pattern
   - Documents default pool (200+10 = 210 connections)
   - Supports ~35 concurrent traced requests by default
   - Guidance for high-traffic deployments (>50 req/sec)
   - Note about SQLite 50-connection cap

Implementation details:
- add_event() now accepts optional obs_db: Optional[Session] = None
- When obs_db is provided, uses caller's session (owned = False)
- When obs_db is None, creates independent session (owned = True)
- Only closes session if owned is True
- All existing tests pass (87 observability service tests, 23 middleware tests)

Tests: All 87 observability service tests passing
       All 23 observability middleware tests passing
       100% differential coverage maintained

Addresses review feedback from:
- @madhu-mohan-jaishankar (approved with comments)
- @msureshkumar88 (issues #5 and #7 from detailed review)

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* fix: address remaining review concerns with safety improvements

This commit addresses all remaining valid reviewer concerns:

1. Document dead code guard design choice (madhu-mohan-jaishankar issue #1)
   - Added Note section to _get_or_create_observability_session() docstring
   - Explains tuple return pattern enables future session reuse optimization
   - Guards remain for forward compatibility (maintainer approved pattern)

2. Move session creation inside try blocks (msureshkumar88 issue #1)
   - Initialize obs_db = None and owned = False before try blocks
   - Create session inside try to prevent theoretical resource leak
   - Update finally blocks to check `if owned and obs_db is not None:`
   - Applied to all 8 write methods:
     * start_trace() (line 317)
     * end_trace() (line 385)
     * trace_span() context manager (line 619)
     * trace_tool_invocation() context manager (line 719)
     * add_event() (line 836)
     * record_token_usage() (line 926)
     * trace_a2a_request() context manager (line 1087)
     * record_metric() (line 1306)
     * delete_old_traces() (line 1718)

3. Add header sanitization (msureshkumar88 issue #10)
   - Added sanitize_header_for_storage() helper function
   - Removes control characters (prevents log injection)
   - Truncates to max_length (prevents DoS via large headers)
   - Applied to user_agent (max 500 chars) and http_url (max 2000 chars)
   - Defense-in-depth security improvement

Implementation details:
- sanitize_header_for_storage() removes non-printable chars except space/tab
- Returns "unknown" for None/empty values
- All 87 observability service tests passing
- All 23 observability middleware tests passing

Deferred to future PR (with justification):
- Code duplication refactoring (issue #9): Too complex and risky for this PR.
  The three context managers share patterns but have significant domain-specific
  differences (attributes, yield signatures, sanitization). Refactoring would
  require extensive testing and carries risk of subtle bugs. Current code is
  correct and well-tested. Recommend separate focused PR for refactoring.

Tests: All 110 observability tests passing (87 service + 23 middleware)
       100% differential coverage maintained

Addresses remaining review feedback from:
- @madhu-mohan-jaishankar (issue #1 - dead code guards)
- @msureshkumar88 (issues #1, #10 - resource leak, header sanitization)

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* fix: correct doctest examples in sanitize_header_for_storage

The function removes control characters completely (doesn't replace with spaces)
and truncates without adding ellipsis. Updated doctest examples to match
actual behavior:
- Control chars removed: 'Evil\x00\nInjection' -> 'EvilInjection'
- Truncation verified by length check instead of expecting '...'
- Added test for None input returning 'unknown'

All doctests now passing.

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* Update secrets following rebase

Signed-off-by: Brian Hussey <brian.hussey@ie.ibm.com>

---------

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Signed-off-by: Brian Hussey <brian.hussey@ie.ibm.com>
Co-authored-by: Mihai Criveti <crivetimihai@gmail.com>
Co-authored-by: Brian Hussey <brian.hussey@ie.ibm.com>
MohanLaksh added a commit that referenced this pull request Apr 11, 2026
This commit fixes 5 issues identified in code review:

**Issue #1 - Default config inconsistency (CRITICAL):**
- Changed SSRF_ALLOW_LOCALHOST default from false to true
- Fixes immediate failure on fresh installs where backend_rpc_url defaults to 127.0.0.1
- Updated README.md to reflect new defaults and clarify production vs development modes
- Location: tools_rust/mcp_runtime/src/config.rs:208

**Issue #2 - SSRF bypass in two functions (CRITICAL):**
- Added URL validation to backend_authenticate_url() before HTTP call
- Added URL validation to backend_tools_call_resolve_url() before HTTP call
- Both functions now use url_validator.validate_url() with SSRF protection
- Locations: tools_rust/mcp_runtime/src/lib.rs:2564, 8158

**Issue #3 - Malformed CIDR fails open (HIGH):**
- Changed CIDR parsing from fail-open (warn + continue) to fail-closed (return error)
- Invalid CIDR in SSRF_BLOCKED_NETWORKS now fails runtime startup
- Invalid CIDR in SSRF_ALLOWED_NETWORKS now fails runtime startup
- Location: tools_rust/mcp_runtime/src/url_validator.rs:186-210

**Issue #4 - DNS re-resolution overhead (MEDIUM):**
- Implemented DNS result caching with 5-minute TTL
- Added Arc<RwLock<HashMap<String, (Vec<IpAddr>, Instant)>>> cache
- Reduces DNS lookups on hot paths while maintaining security
- Cache prevents DNS rebinding attacks with short TTL
- Location: tools_rust/mcp_runtime/src/url_validator.rs:58-68, 486-530

**Issue #5 - Missing body-size limit test (LOW):**
- Added integration test: request_body_size_limit_rejects_large_payloads()
- Verifies 413 Payload Too Large response for >10MB request bodies
- Tests the DefaultBodyLimit middleware enforcement
- Location: tools_rust/mcp_runtime/src/lib.rs:13309-13338

All changes maintain backward compatibility except for stricter CIDR validation (fail-closed is more secure).

Addresses reviewer feedback from lucarlig on PR #4111

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
MohanLaksh added a commit that referenced this pull request Apr 11, 2026
This commit fixes 5 issues identified in code review:

**Issue #1 - Default config inconsistency (CRITICAL):**
- Changed SSRF_ALLOW_LOCALHOST default from false to true
- Fixes immediate failure on fresh installs where backend_rpc_url defaults to 127.0.0.1
- Updated README.md to reflect new defaults and clarify production vs development modes
- Location: tools_rust/mcp_runtime/src/config.rs:208

**Issue #2 - SSRF bypass in two functions (CRITICAL):**
- Added URL validation to backend_authenticate_url() before HTTP call
- Added URL validation to backend_tools_call_resolve_url() before HTTP call
- Both functions now use url_validator.validate_url() with SSRF protection
- Locations: tools_rust/mcp_runtime/src/lib.rs:2564, 8158

**Issue #3 - Malformed CIDR fails open (HIGH):**
- Changed CIDR parsing from fail-open (warn + continue) to fail-closed (return error)
- Invalid CIDR in SSRF_BLOCKED_NETWORKS now fails runtime startup
- Invalid CIDR in SSRF_ALLOWED_NETWORKS now fails runtime startup
- Location: tools_rust/mcp_runtime/src/url_validator.rs:186-210

**Issue #4 - DNS re-resolution overhead (MEDIUM):**
- Implemented DNS result caching with 5-minute TTL
- Added Arc<RwLock<HashMap<String, (Vec<IpAddr>, Instant)>>> cache
- Reduces DNS lookups on hot paths while maintaining security
- Cache prevents DNS rebinding attacks with short TTL
- Location: tools_rust/mcp_runtime/src/url_validator.rs:58-68, 486-530

**Issue #5 - Missing body-size limit test (LOW):**
- Added integration test: request_body_size_limit_rejects_large_payloads()
- Verifies 413 Payload Too Large response for >10MB request bodies
- Tests the DefaultBodyLimit middleware enforcement
- Location: tools_rust/mcp_runtime/src/lib.rs:13309-13338

All changes maintain backward compatibility except for stricter CIDR validation (fail-closed is more secure).

Addresses reviewer feedback from lucarlig on PR #4111

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
claudia-gray pushed a commit that referenced this pull request Apr 13, 2026
…ons (#4050)

* feat: implement separate observability session pattern (best-effort)

Observability write operations now use independent database sessions that
commit immediately on a best-effort basis, separate from main request
transactions. This ensures observability data persists even when the main
request fails, providing visibility into partial failures.

Implementation details:
- Observability write methods (start_trace, end_trace, start_span, end_span,
  add_event, record_metric, record_token_usage, delete_old_traces) now create
  their own independent sessions using SessionLocal() instead of accepting a
  db parameter from the caller
- Context managers (trace_span, trace_tool_invocation, trace_a2a_request)
  create and manage a single session for the full span lifecycle
- Query methods still accept db: Session parameter for RBAC/token scoping
- ObservabilityMiddleware no longer creates or manages request.state.db
- Each observability operation creates its own short-lived session that
  commits immediately in a try/except/finally block
- Updated all callers across resource_service, prompt_service, tool_service,
  and instrumentation/sqlalchemy to use new API (removed db parameters)
- Added comprehensive test coverage verifying separate session behavior
- Fixed pre-existing bugs in add_event() and record_metric() where db.refresh()
  was called after commit failure

Benefits:
- Observability data persists independently of main request transaction success
- Provides visibility into partial failures and error states
- Follows pattern established in instrumentation/sqlalchemy.py (lines 58-87)
- Reduces transaction contention between observability and business logic

Trade-offs:
- NOT atomic with main request transaction (intentional design choice)
- Traces may show "in progress" or partial states for failed requests
- Provides eventual consistency rather than strict consistency

Documentation:
- Added "Observability Transaction Behavior" section to AGENTS.md explaining
  the separate session pattern, implementation details, and security notes
- Updated module docstrings with session management details

Tests: 96 observability tests passing, full suite 16,673 tests passing
Pylint: 10.00/10 rating maintained

Closes #3883

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* fix: add type annotations to observability context managers

Add proper return type annotations to context manager methods:
- trace_span() -> Generator[str, None, None]
- trace_tool_invocation() -> Generator[Tuple[Optional[str], Dict[str, Any]], None, None]
- trace_a2a_request() -> Generator[Tuple[Optional[str], Dict[str, Any]], None, None]

These annotations improve type safety and make the return values explicit.
The trace_span context manager yields just the span_id string, while
trace_tool_invocation and trace_a2a_request yield a tuple of (span_id, result_dict)
for capturing operation results.

Tests: 96 observability tests passing
Pylint: 10.00/10 rating maintained
Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* test: Add coverage tests for observability session close exception handling

Add 11 new test functions to achieve 100% coverage of exception handling
paths in observability_service.py. These tests verify that all write
methods and context managers handle session close failures gracefully:

- test_start_trace_session_close_failure
- test_end_trace_session_close_failure
- test_start_span_session_close_failure
- test_start_span_with_commit_true
- test_end_span_session_close_failure
- test_add_event_session_close_failure
- test_record_metric_session_close_failure
- test_delete_old_traces_session_close_failure
- test_trace_span_context_manager_session_close_failure
- test_trace_tool_invocation_context_manager_session_close_failure
- test_trace_a2a_request_context_manager_session_close_failure

Each test mocks the session close() method to raise an exception and
verifies that the operation completes successfully despite the close
failure. This covers the previously uncovered lines in finally blocks
where session close exceptions are caught and logged.

Resolves CI/CD coverage requirement for observability_service.py.

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* fix: complete separate session migration for delete_old_traces and record_transport_activity

- Fix runtime TypeError in cleanup_old_traces router: delete_old_traces()
  no longer accepts db parameter (missed call site from #3883)
- Remove unused db parameter from record_transport_activity() for API
  consistency with other write methods
- Update docstring and doctest for delete_old_traces call in router
- Update test calls to match new record_transport_activity signature
- Add 3 tests for 100% differential coverage: error-path exception
  handlers in trace_tool_invocation, trace_a2a_request, and session
  close failure in record_token_usage

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix: address review feedback for separate observability session pattern

This commit addresses valid concerns raised in PR review:

1. Fix class docstring showing old API (madhu-mohan-jaishankar)
   - Updated ObservabilityService class docstring to show correct API
   - Removed db parameter from examples (old: start_trace(db, name))
   - Added clarification about write vs query methods
   - Examples now demonstrate correct usage with proper parameters

2. Fix context manager atomicity (msureshkumar88 issue #5)
   - Added optional obs_db parameter to add_event() method
   - Context managers now pass their session to add_event() in error paths
   - Ensures atomic commit of span updates and error events together
   - Prevents orphaned events when span update fails
   - Updated all three context managers:
     * trace_span() at line 644
     * trace_tool_invocation() at line 757
     * trace_a2a_request() at line 1112

3. Document connection pool sizing (msureshkumar88 issue #7)
   - Added Connection Pool Sizing section to AGENTS.md
   - Explains 4-6 sessions per traced request pattern
   - Documents default pool (200+10 = 210 connections)
   - Supports ~35 concurrent traced requests by default
   - Guidance for high-traffic deployments (>50 req/sec)
   - Note about SQLite 50-connection cap

Implementation details:
- add_event() now accepts optional obs_db: Optional[Session] = None
- When obs_db is provided, uses caller's session (owned = False)
- When obs_db is None, creates independent session (owned = True)
- Only closes session if owned is True
- All existing tests pass (87 observability service tests, 23 middleware tests)

Tests: All 87 observability service tests passing
       All 23 observability middleware tests passing
       100% differential coverage maintained

Addresses review feedback from:
- @madhu-mohan-jaishankar (approved with comments)
- @msureshkumar88 (issues #5 and #7 from detailed review)

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* fix: address remaining review concerns with safety improvements

This commit addresses all remaining valid reviewer concerns:

1. Document dead code guard design choice (madhu-mohan-jaishankar issue #1)
   - Added Note section to _get_or_create_observability_session() docstring
   - Explains tuple return pattern enables future session reuse optimization
   - Guards remain for forward compatibility (maintainer approved pattern)

2. Move session creation inside try blocks (msureshkumar88 issue #1)
   - Initialize obs_db = None and owned = False before try blocks
   - Create session inside try to prevent theoretical resource leak
   - Update finally blocks to check `if owned and obs_db is not None:`
   - Applied to all 8 write methods:
     * start_trace() (line 317)
     * end_trace() (line 385)
     * trace_span() context manager (line 619)
     * trace_tool_invocation() context manager (line 719)
     * add_event() (line 836)
     * record_token_usage() (line 926)
     * trace_a2a_request() context manager (line 1087)
     * record_metric() (line 1306)
     * delete_old_traces() (line 1718)

3. Add header sanitization (msureshkumar88 issue #10)
   - Added sanitize_header_for_storage() helper function
   - Removes control characters (prevents log injection)
   - Truncates to max_length (prevents DoS via large headers)
   - Applied to user_agent (max 500 chars) and http_url (max 2000 chars)
   - Defense-in-depth security improvement

Implementation details:
- sanitize_header_for_storage() removes non-printable chars except space/tab
- Returns "unknown" for None/empty values
- All 87 observability service tests passing
- All 23 observability middleware tests passing

Deferred to future PR (with justification):
- Code duplication refactoring (issue #9): Too complex and risky for this PR.
  The three context managers share patterns but have significant domain-specific
  differences (attributes, yield signatures, sanitization). Refactoring would
  require extensive testing and carries risk of subtle bugs. Current code is
  correct and well-tested. Recommend separate focused PR for refactoring.

Tests: All 110 observability tests passing (87 service + 23 middleware)
       100% differential coverage maintained

Addresses remaining review feedback from:
- @madhu-mohan-jaishankar (issue #1 - dead code guards)
- @msureshkumar88 (issues #1, #10 - resource leak, header sanitization)

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* fix: correct doctest examples in sanitize_header_for_storage

The function removes control characters completely (doesn't replace with spaces)
and truncates without adding ellipsis. Updated doctest examples to match
actual behavior:
- Control chars removed: 'Evil\x00\nInjection' -> 'EvilInjection'
- Truncation verified by length check instead of expecting '...'
- Added test for None input returning 'unknown'

All doctests now passing.

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>

* Update secrets following rebase

Signed-off-by: Brian Hussey <brian.hussey@ie.ibm.com>

---------

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Signed-off-by: Brian Hussey <brian.hussey@ie.ibm.com>
Co-authored-by: Mihai Criveti <crivetimihai@gmail.com>
Co-authored-by: Brian Hussey <brian.hussey@ie.ibm.com>
MohanLaksh added a commit that referenced this pull request Apr 14, 2026
This commit fixes 5 issues identified in code review:

**Issue #1 - Default config inconsistency (CRITICAL):**
- Changed SSRF_ALLOW_LOCALHOST default from false to true
- Fixes immediate failure on fresh installs where backend_rpc_url defaults to 127.0.0.1
- Updated README.md to reflect new defaults and clarify production vs development modes
- Location: tools_rust/mcp_runtime/src/config.rs:208

**Issue #2 - SSRF bypass in two functions (CRITICAL):**
- Added URL validation to backend_authenticate_url() before HTTP call
- Added URL validation to backend_tools_call_resolve_url() before HTTP call
- Both functions now use url_validator.validate_url() with SSRF protection
- Locations: tools_rust/mcp_runtime/src/lib.rs:2564, 8158

**Issue #3 - Malformed CIDR fails open (HIGH):**
- Changed CIDR parsing from fail-open (warn + continue) to fail-closed (return error)
- Invalid CIDR in SSRF_BLOCKED_NETWORKS now fails runtime startup
- Invalid CIDR in SSRF_ALLOWED_NETWORKS now fails runtime startup
- Location: tools_rust/mcp_runtime/src/url_validator.rs:186-210

**Issue #4 - DNS re-resolution overhead (MEDIUM):**
- Implemented DNS result caching with 5-minute TTL
- Added Arc<RwLock<HashMap<String, (Vec<IpAddr>, Instant)>>> cache
- Reduces DNS lookups on hot paths while maintaining security
- Cache prevents DNS rebinding attacks with short TTL
- Location: tools_rust/mcp_runtime/src/url_validator.rs:58-68, 486-530

**Issue #5 - Missing body-size limit test (LOW):**
- Added integration test: request_body_size_limit_rejects_large_payloads()
- Verifies 413 Payload Too Large response for >10MB request bodies
- Tests the DefaultBodyLimit middleware enforcement
- Location: tools_rust/mcp_runtime/src/lib.rs:13309-13338

All changes maintain backward compatibility except for stricter CIDR validation (fail-closed is more secure).

Addresses reviewer feedback from lucarlig on PR #4111

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
MohanLaksh added a commit that referenced this pull request Apr 21, 2026
This commit fixes 5 issues identified in code review:

**Issue #1 - Default config inconsistency (CRITICAL):**
- Changed SSRF_ALLOW_LOCALHOST default from false to true
- Fixes immediate failure on fresh installs where backend_rpc_url defaults to 127.0.0.1
- Updated README.md to reflect new defaults and clarify production vs development modes
- Location: tools_rust/mcp_runtime/src/config.rs:208

**Issue #2 - SSRF bypass in two functions (CRITICAL):**
- Added URL validation to backend_authenticate_url() before HTTP call
- Added URL validation to backend_tools_call_resolve_url() before HTTP call
- Both functions now use url_validator.validate_url() with SSRF protection
- Locations: tools_rust/mcp_runtime/src/lib.rs:2564, 8158

**Issue #3 - Malformed CIDR fails open (HIGH):**
- Changed CIDR parsing from fail-open (warn + continue) to fail-closed (return error)
- Invalid CIDR in SSRF_BLOCKED_NETWORKS now fails runtime startup
- Invalid CIDR in SSRF_ALLOWED_NETWORKS now fails runtime startup
- Location: tools_rust/mcp_runtime/src/url_validator.rs:186-210

**Issue #4 - DNS re-resolution overhead (MEDIUM):**
- Implemented DNS result caching with 5-minute TTL
- Added Arc<RwLock<HashMap<String, (Vec<IpAddr>, Instant)>>> cache
- Reduces DNS lookups on hot paths while maintaining security
- Cache prevents DNS rebinding attacks with short TTL
- Location: tools_rust/mcp_runtime/src/url_validator.rs:58-68, 486-530

**Issue #5 - Missing body-size limit test (LOW):**
- Added integration test: request_body_size_limit_rejects_large_payloads()
- Verifies 413 Payload Too Large response for >10MB request bodies
- Tests the DefaultBodyLimit middleware enforcement
- Location: tools_rust/mcp_runtime/src/lib.rs:13309-13338

All changes maintain backward compatibility except for stricter CIDR validation (fail-closed is more secure).

Addresses reviewer feedback from lucarlig on PR #4111

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
MohanLaksh added a commit that referenced this pull request Apr 21, 2026
This commit fixes 5 issues identified in code review:

**Issue #1 - Default config inconsistency (CRITICAL):**
- Changed SSRF_ALLOW_LOCALHOST default from false to true
- Fixes immediate failure on fresh installs where backend_rpc_url defaults to 127.0.0.1
- Updated README.md to reflect new defaults and clarify production vs development modes
- Location: tools_rust/mcp_runtime/src/config.rs:208

**Issue #2 - SSRF bypass in two functions (CRITICAL):**
- Added URL validation to backend_authenticate_url() before HTTP call
- Added URL validation to backend_tools_call_resolve_url() before HTTP call
- Both functions now use url_validator.validate_url() with SSRF protection
- Locations: tools_rust/mcp_runtime/src/lib.rs:2564, 8158

**Issue #3 - Malformed CIDR fails open (HIGH):**
- Changed CIDR parsing from fail-open (warn + continue) to fail-closed (return error)
- Invalid CIDR in SSRF_BLOCKED_NETWORKS now fails runtime startup
- Invalid CIDR in SSRF_ALLOWED_NETWORKS now fails runtime startup
- Location: tools_rust/mcp_runtime/src/url_validator.rs:186-210

**Issue #4 - DNS re-resolution overhead (MEDIUM):**
- Implemented DNS result caching with 5-minute TTL
- Added Arc<RwLock<HashMap<String, (Vec<IpAddr>, Instant)>>> cache
- Reduces DNS lookups on hot paths while maintaining security
- Cache prevents DNS rebinding attacks with short TTL
- Location: tools_rust/mcp_runtime/src/url_validator.rs:58-68, 486-530

**Issue #5 - Missing body-size limit test (LOW):**
- Added integration test: request_body_size_limit_rejects_large_payloads()
- Verifies 413 Payload Too Large response for >10MB request bodies
- Tests the DefaultBodyLimit middleware enforcement
- Location: tools_rust/mcp_runtime/src/lib.rs:13309-13338

All changes maintain backward compatibility except for stricter CIDR validation (fail-closed is more secure).

Addresses reviewer feedback from lucarlig on PR #4111

Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant