Skip to content

[FEATURE][SECURITY]: Implement database-level security constraints and SQL injection prevention #342

@crivetimihai

Description

@crivetimihai

[FEATURE][SECURITY]: Database-Level Security Constraints and SQL Injection Prevention

Goal

Add database-level CHECK constraints and validation to ensure data integrity and provide defense-in-depth against SQL injection attacks. While SQLAlchemy ORM provides parameterized queries, these additional constraints prevent malicious content storage even if application-layer validation is bypassed.

Why Now?

  1. Defense-in-Depth: Multiple validation layers reduce attack surface
  2. Data Integrity: Ensure only valid data formats stored in database
  3. Compliance: Security audits require database-level protections
  4. Path Traversal Prevention: Block ../ patterns in URIs
  5. XSS Prevention: Block script tags in user-controlled fields

📖 User Stories

US-1: Security Engineer - Prevent Malicious Data Storage

As a security engineer
I want CHECK constraints to reject malicious input
So that even bypassed validation can't store harmful data

Acceptance Criteria:

Scenario: Reject HTML in prompt names
  Given a database with CHECK constraints
  When inserting a prompt with name "<script>alert(1)</script>"
  Then the database should reject the insert
  And raise an IntegrityError

Scenario: Reject path traversal in resource URIs
  Given a database with CHECK constraints
  When inserting a resource with uri "../../../etc/passwd"
  Then the database should reject the insert
  And log the attempt

Scenario: Reject JavaScript URLs in tools
  Given a database with CHECK constraints
  When inserting a tool with url "javascript:alert(1)"
  Then the database should reject the insert
  And the error should indicate invalid URL format

Technical Requirements:

  • CHECK constraints on prompts, tools, resources, gateways, servers tables
  • Pattern matching for HTML tags, JavaScript URLs, path traversal
  • SQLite and PostgreSQL compatible regex patterns
US-2: Developer - Safe Query Construction

As a developer
I want query sanitization utilities
So that I can safely build dynamic queries

Acceptance Criteria:

Scenario: Sanitize LIKE patterns
  Given a search term "test%drop"
  When sanitizing for LIKE query
  Then the result should be "test\\%drop"
  And special characters should be escaped

Scenario: Validate column names
  Given a sort column from user input
  When validating against whitelist ["name", "created_at"]
  Then unknown columns should raise ValueError
  And valid columns should pass through

Scenario: Build safe IN clauses
  Given a list of IDs [1, 2, 3]
  When building an IN clause
  Then the result should use parameterized placeholders
  And no raw values should be in the SQL

Technical Requirements:

  • QuerySanitizer class with utility methods
  • Column whitelist validation
  • LIKE pattern escaping
  • Parameterized IN clause builder
US-3: Operator - Database Security Configuration

As a platform operator
I want configurable database security settings
So that I can tune security for my environment

Acceptance Criteria:

Scenario: Configure statement timeout
  Given DB_STATEMENT_TIMEOUT=30000
  When a query runs for 35 seconds
  Then the query should be cancelled
  And an error should be logged

Scenario: Configure max connections
  Given DB_MAX_CONNECTIONS=100
  When 101 connections are attempted
  Then the 101st should be rejected
  And connection pool should remain stable

Scenario: Enable query logging in debug mode
  Given LOG_LEVEL=DEBUG
  And DB_ENABLE_SQL_LOG=true
  When a query executes
  Then the SQL should be logged

Technical Requirements:

  • DB_STATEMENT_TIMEOUT setting
  • DB_MAX_CONNECTIONS setting
  • DB_ENABLE_SQL_LOG setting
  • Connection timeout configuration
US-4: Security Admin - Model-Level Validation

As a security administrator
I want SQLAlchemy models to validate data before insert
So that invalid data is caught at the application layer

Acceptance Criteria:

Scenario: Validate tool name format
  Given a Tool model instance
  When setting name to "invalid name with spaces!"
  Then a ValueError should be raised
  And the message should indicate valid format

Scenario: Validate URL scheme
  Given a Tool model instance
  When setting url to "file:///etc/passwd"
  Then a ValueError should be raised
  And only http/https/ws/wss should be allowed

Scenario: Validate name length
  Given a Tool model instance
  When setting name longer than 255 characters
  Then a ValueError should be raised

Technical Requirements:

  • @validates decorators on models
  • Name format regex validation
  • URL scheme validation
  • Length validation

🏗 Architecture

Validation Layers

flowchart TD
    A[API Request] --> B[Pydantic Validation]
    B --> C[SQLAlchemy @validates]
    C --> D[Database CHECK Constraints]
    D --> E[Data Stored]

    B -->|Invalid| F[HTTP 422]
    C -->|Invalid| G[ValueError]
    D -->|Invalid| H[IntegrityError]
Loading

CHECK Constraints by Table

erDiagram
    PROMPTS {
        string name "CHECK: no HTML, length <= 255"
        string template "CHECK: no script tags"
    }
    TOOLS {
        string name "CHECK: alphanumeric_underscore"
        string url "CHECK: http(s)/ws(s) scheme"
    }
    RESOURCES {
        string uri "CHECK: no path traversal"
        string mime_type "CHECK: valid MIME format"
    }
    GATEWAYS {
        string transport "CHECK: IN (SSE, STREAMABLEHTTP, STDIO)"
        string url "CHECK: http(s)/ws(s) scheme"
    }
    SERVERS {
        string name "CHECK: safe characters only"
        string icon "CHECK: https or data:image"
    }
Loading

📋 Implementation Tasks

Phase 1: CHECK Constraints Migration

  • Create alembic migration for CHECK constraints
  • Add prompts table constraints (name format, template safety)
  • Add tools table constraints (name format, URL scheme)
  • Add resources table constraints (URI format, no traversal)
  • Add gateways table constraints (transport enum, URL scheme)
  • Add servers table constraints (name format, icon URL)
  • Make migration idempotent (check before adding)

Phase 2: Model-Level Validation

  • Add @validates to Tool model (name, url)
  • Add @validates to Prompt model (name)
  • Add @validates to Resource model (uri)
  • Add @validates to Gateway model (transport, url)
  • Add @validates to Server model (name, icon)
  • Add table-level CheckConstraint definitions

Phase 3: Query Sanitization Utilities

  • Create QuerySanitizer class
  • Implement sanitize_like_pattern()
  • Implement validate_column_name()
  • Implement validate_sort_order()
  • Implement build_safe_in_clause()

Phase 4: Database Connection Security

  • Add connection security settings to config
  • Implement statement timeout
  • Implement connection pooling limits
  • Add SQLite PRAGMA settings
  • Add application_name for monitoring

Phase 5: Testing

  • Unit tests for CHECK constraint violations
  • Unit tests for model validation
  • Unit tests for query sanitization
  • Integration tests for SQL injection attempts
  • Performance tests for constraint overhead

Phase 6: Data Cleanup Script

  • Create scripts/sanitize_database.py
  • Fix existing HTML in names
  • Fix invalid URLs
  • Log all changes made

⚙️ Configuration Example

# Database Security Settings
DB_STATEMENT_TIMEOUT=30000       # 30 second query timeout
DB_MAX_CONNECTIONS=100           # Maximum connection pool size
DB_CONNECTION_TIMEOUT=10         # Connection acquisition timeout
DB_ENABLE_SQL_LOG=false          # Only enable in debug mode

# Query Limits
MAX_QUERY_RESULTS=1000           # Maximum results per query
MAX_BATCH_SIZE=100               # Maximum batch operation size

# Security Flags
ENABLE_QUERY_SANITIZATION=true   # Enable query sanitization utilities

✅ Success Criteria

  • All tables have appropriate CHECK constraints
  • Model-level validation catches invalid data
  • SQL injection attempts are blocked
  • Path traversal attempts are blocked
  • XSS vectors in data are rejected
  • Query sanitization utilities working
  • Database connection security configured
  • Performance impact < 5% overhead

🏁 Definition of Done

  • Alembic migration created and tested
  • CHECK constraints on all relevant tables
  • @validates decorators on models
  • QuerySanitizer class implemented
  • Database security settings in config
  • Unit tests pass
  • Integration tests pass
  • Data cleanup script created
  • Code passes make verify
  • Documentation complete

🔗 Related Issues

Metadata

Metadata

Assignees

Labels

MUSTP1: Non-negotiable, critical requirements without which the product is non-functional or unsafechoreLinting, formatting, dependency hygiene, or project maintenance chorescicdIssue with CI/CD process (GitHub Actions, scaffolding)devopsDevOps activities (containers, automation, deployment, makefiles, etc)enhancementNew feature or requestreadyValidated, ready-to-work-on itemssecurityImproves security

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions