Skip to content

fix: TrustSec audit findings - Low severity batch#905

Merged
neithanmo merged 10 commits intomainfrom
fix/audit_fixes
Jan 21, 2026
Merged

fix: TrustSec audit findings - Low severity batch#905
neithanmo merged 10 commits intomainfrom
fix/audit_fixes

Conversation

@neithanmo
Copy link
Copy Markdown
Collaborator

This PR addresses the remaining low-severity findings from the TrustSec security audit (Nov-Dec 2025).

Summary

Changes

Configuration Validation (indexer-config)

  • TRST-L-1/L-12: Fixed float-to-u128 conversion that caused threshold warnings to never trigger. Now uses proper wei constants
  • TRST-L-2: Added explicit validation to reject zero-duration values for critical timeout/interval fields
  • TRST-L-3: Added warning when auth tokens are configured with non-HTTPS URLs (localhost exempted for dev)

TAP Agent (indexer-tap-agent)

  • TRST-L-9: Fixed silent V2 receipt misrouting - 20-byte addresses are now properly converted to CollectionId instead of falling back to Legacy
  • TRST-L-11: Added comprehensive metrics cleanup on actor shutdown to prevent stale gauges and unbounded label cardinality growth

Indexer Service (indexer-service-rs)

  • TRST-L-10: Added validation for graph-node responses before forwarding - empty/invalid JSON now returns BAD_GATEWAY instead of causing client panics

Documentation

Bellow fixes that required simple documentation to make clear intent and logic

  • TRST-L-5: Documented why is_horizon_active() always returns true (intentional post-migration) and marked as deprecated
  • TRST-L-8: Documented why paid query responses always return HTTP 200 (TAP protocol requirement), this is not bug, but design choice in other components like Gateway

Prevent DoS attacks via unbounded request buffering by adding a
configurable maximum request body size to the tap_context middleware.

Previously, `to_bytes(body, usize::MAX)` allowed attackers to send
arbitrarily large request bodies, potentially causing memory exhaustion.

Changes:
- Add `max_request_body_size` config option to ServiceConfig (default: 2MB)
- Introduce TapContextState to pass the limit to context_middleware
- Update middleware to enforce the configured body size limit
- Add test for oversized request rejection
…ation (TRST-L-1)

The validation warnings for low trigger values and max_amount_willing_to_lose
were never triggering due to incorrect float-to-u128 conversion.

Previously, floating-point literals like `0.1` were converted to u128 via
`.to_u128()`, which truncates to 0 since u128 cannot represent fractions.
This meant comparisons like `trigger_value < 0` were always false.

Fix:
- Replace `0.1f64.to_u128()` with proper wei constant (0.1 GRT = 10^17 wei)
- Replace `0.001f64.to_u128()` with proper wei constant (0.001 GRT = 10^15 wei)
- Remove unused `FromStr` import

This only affects startup warnings, not runtime behavior. Operators with
misconfigured low values will now correctly receive warnings.
Add explicit validation to reject zero values for Duration configuration
fields that would cause incorrect behavior or system failures:

- subgraphs.escrow.syncing_interval_secs
- subgraphs.network.syncing_interval_secs
- tap.rav_request.request_timeout_secs
- tap.sender_timeout_secs
- subgraphs.network.recently_closed_allocation_buffer_secs

Note: timestamp_buffer_secs intentionally not validated as zero is a valid
(though not recommended) value meaning "no buffer".

Addresses: TRST-L-2
Add validation that warns when authentication tokens are configured
with non-HTTPS URLs, which could expose credentials to interception
via man-in-the-middle attacks.

The warning is suppressed for localhost/127.0.0.1/::1 to allow
local development workflows without noise.

Affected configurations:
- subgraphs.network.query_auth_token with non-HTTPS query_url
- subgraphs.escrow.query_auth_token with non-HTTPS query_url

Addresses: TRST-L-3
Add documentation explaining why is_horizon_active() always returns true:
during migration, zero PaymentsEscrow accounts does not indicate Horizon
is inactive - contracts may be deployed before any deposits are made.

Mark the function as deprecated since the network has fully migrated to
Horizon and this detection is no longer necessary.

Addresses: TRST-L-5
Add documentation explaining why paid query responses always return
HTTP 200 OK regardless of graph-node's status code. This is intentional
for TAP protocol compatibility:

- Gateway expects 200 OK with wrapped IndexerResponsePayload format
- GraphQL errors are conveyed in the response body, not HTTP status
- Maintains compatibility with TAP payment flow

This clarifies the design decision that auditors flagged as potentially
problematic (TRST-L-8), confirming it is intentional behavior.

Addresses: TRST-L-8
Fix silent allocation identifier downgrade that caused V2 (Horizon)
receipts to be incorrectly routed to V1 (Legacy) sender allocation
actors when collection_id parsing failed.

Previously, if CollectionId::from_str failed (e.g., for 20-byte addresses
during migration), the code fell back to AllocationId::Legacy, misrouting
the receipt. Now:

- If collection_id is a 20-byte address (40 hex chars), convert it to
  CollectionId using the standard left-padding conversion
- All V2 receipts stay on the Horizon path, never downgrade to Legacy
- Fallback uses zero CollectionId but maintains Horizon routing

This is consistent with get_pending_sender_allocation_id_v2() which
already handles the 20-byte address migration case correctly.

Addresses: TRST-L-9
…us endpoint (TRST-L-10)

The /status endpoint was forwarding responses from graph-node without
validation, which could cause client panics when graph-node returned
empty or malformed JSON responses.

Changes:
- Add validation to reject empty response bodies with BAD_GATEWAY error
- Validate response is valid JSON before forwarding to clients
- Add tests for empty and invalid JSON response handling
…TRST-L-11)

Metrics with allocation-level labels ([sender, allocation]) were not
being cleaned up when SenderAccount or SenderAllocation actors shut down,
causing stale gauge values and unbounded label cardinality growth.

Changes:
- Add PENDING_RAV cleanup in ActorTerminated handler (was missing despite
  other allocation metrics being cleaned there)
- Add allocation-level metrics cleanup in post_stop for all tracked
  allocations (UNAGGREGATED_FEES, UNAGGREGATED_FEES_BY_VERSION,
  INVALID_RECEIPT_FEES, PENDING_RAV)
- Clean up PENDING_RAV for allocations in rav_tracker (closed allocations
  with pending RAVs not in allocation_ids)

Note: ALLOCATION_RECONCILIATION_RUNS counter is intentionally not cleaned
as removing counter labels can cause rate() calculation issues.
@neithanmo neithanmo marked this pull request as ready for review January 21, 2026 04:01
@neithanmo neithanmo merged commit 4e80ec0 into main Jan 21, 2026
12 checks passed
@neithanmo neithanmo deleted the fix/audit_fixes branch January 21, 2026 13:37
@github-actions github-actions bot mentioned this pull request Jan 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants