Skip to content

fix: security hardening for user-input-facing code paths#251

Merged
tjgreen42 merged 3 commits intomainfrom
security-audit-fixes
Mar 3, 2026
Merged

fix: security hardening for user-input-facing code paths#251
tjgreen42 merged 3 commits intomainfrom
security-audit-fixes

Conversation

@tjgreen42
Copy link
Copy Markdown
Collaborator

Summary

Security audit and hardening of code paths reachable from user input (queries, binary protocol, corrupted stored data). Fixes 9 distinct vulnerability classes:

  • Binary protocol validation (tpvector_recv): Added bounds checks on total_size, index_name_len, and entry_count from untrusted binary input. Previously, a malicious client could trigger integer underflow, undersized allocations, or OOB writes.

  • Stored data validation (tpvector_out): Added bounds checks on entry iteration to prevent reading past varlena allocation. Validates lexeme_len >= 0 before alloca to prevent stack overflow with corrupted data.

  • Input parsing (tpvector_in): Replaced atoi() with pg_strtoint32() for proper overflow detection and error reporting on user-provided frequency values.

  • IDF cache OOB read (query.c): Fixed off-by-one where num_terms++ sentinel (to suppress repeated warnings) caused lookup_cached_idf to iterate one element past the terms[] array bounds. Uses a separate bool flag instead.

  • Stack overflow (tp_gettuple): Converted recursive call to a loop. Previously, up to 100K recursive frames could exhaust the stack when results reference invalid blocks.

  • Assert-only bounds checks (segment.c, scan.c, merge.c): Converted Assert(doc_id < num_docs) to runtime checks in tp_segment_lookup_ctid and cached CTID array accesses. Asserts are compiled out in release builds, leaving no protection against corrupted segment data causing OOB heap reads.

  • Unchecked term length from disk (5 locations): Added TP_MAX_TERM_LENGTH (1MB) validation for string_entry.length read from on-disk segments before allocating buffers. Previously, corrupted segments could trigger unbounded allocations or buffer overflows via uint32 wrap-around.

  • Integer overflow (posting.c): Added overflow check before capacity * growth_factor in posting list expansion. Cast to Size for palloc argument to prevent 32-bit truncation.

  • Use-after-free/close (state.c): Fixed three instances: metap->magic accessed after pfree(metap), index_rel->rd_index accessed after index_close, and hash entry accessed after HASH_REMOVE.

Testing

  • All 49 SQL regression tests pass
  • All shell tests pass (concurrency, recovery, segment, CIC)
  • make format-check passes

Address multiple security issues found during audit, focusing on
vulnerabilities triggerable by end-user input (queries, binary
protocol, corrupted index data):

- tpvector_recv: validate total_size to prevent undersized/huge
  allocations and validate deserialized fields (index_name_len,
  entry_count) from untrusted binary input
- tpvector_out: add bounds checks on entry iteration to prevent
  walking past varlena bounds; validate lexeme_len >= 0 before
  alloca to prevent stack overflow
- tpvector_in: replace atoi() with pg_strtoint32() for proper
  overflow/error handling on user-provided frequency values
- query.c IDF cache: fix off-by-one OOB read where num_terms++
  sentinel caused lookup_cached_idf to iterate past array bounds;
  use separate bool flag instead
- tp_gettuple: convert recursive call to loop to prevent stack
  overflow (up to 100K frames) when results have invalid blocks
- tp_segment_lookup_ctid: convert Assert-only doc_id bounds check
  to runtime check (Assert compiled out in release builds)
- segment/scan.c, merge.c: convert Assert-only doc_id bounds
  checks in cached CTID array accesses to runtime checks
- Validate string_entry.length from on-disk segments against
  TP_MAX_TERM_LENGTH (1MB) before allocating buffers, across 5
  locations in scan.c, segment.c, and merge.c
- posting.c: add overflow check before capacity * growth_factor
  multiplication; cast to Size for palloc argument
- state.c: fix use-after-free (metap->magic after pfree), fix
  use-after-close (index_rel after index_close), fix access to
  hash entry after HASH_REMOVE
Cast total_size to Size before comparing with MaxAllocSize to avoid
signed/unsigned comparison warning that GCC treats as error with
-Werror=sign-compare. The cast is safe because the preceding check
already rejects negative values.
@tjgreen42 tjgreen42 marked this pull request as ready for review March 3, 2026 20:30
@tjgreen42 tjgreen42 merged commit 8e17584 into main Mar 3, 2026
15 checks passed
@tjgreen42 tjgreen42 deleted the security-audit-fixes branch March 3, 2026 20:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant