-
Notifications
You must be signed in to change notification settings - Fork 613
[FEATURE][SECURITY]: Implement database-level security constraints and SQL injection prevention #342
Description
[FEATURE][SECURITY]: Database-Level Security Constraints and SQL Injection Prevention
Goal
Add database-level CHECK constraints and validation to ensure data integrity and provide defense-in-depth against SQL injection attacks. While SQLAlchemy ORM provides parameterized queries, these additional constraints prevent malicious content storage even if application-layer validation is bypassed.
Why Now?
- Defense-in-Depth: Multiple validation layers reduce attack surface
- Data Integrity: Ensure only valid data formats stored in database
- Compliance: Security audits require database-level protections
- Path Traversal Prevention: Block
../patterns in URIs - XSS Prevention: Block script tags in user-controlled fields
📖 User Stories
US-1: Security Engineer - Prevent Malicious Data Storage
As a security engineer
I want CHECK constraints to reject malicious input
So that even bypassed validation can't store harmful data
Acceptance Criteria:
Scenario: Reject HTML in prompt names
Given a database with CHECK constraints
When inserting a prompt with name "<script>alert(1)</script>"
Then the database should reject the insert
And raise an IntegrityError
Scenario: Reject path traversal in resource URIs
Given a database with CHECK constraints
When inserting a resource with uri "../../../etc/passwd"
Then the database should reject the insert
And log the attempt
Scenario: Reject JavaScript URLs in tools
Given a database with CHECK constraints
When inserting a tool with url "javascript:alert(1)"
Then the database should reject the insert
And the error should indicate invalid URL formatTechnical Requirements:
- CHECK constraints on prompts, tools, resources, gateways, servers tables
- Pattern matching for HTML tags, JavaScript URLs, path traversal
- SQLite and PostgreSQL compatible regex patterns
US-2: Developer - Safe Query Construction
As a developer
I want query sanitization utilities
So that I can safely build dynamic queries
Acceptance Criteria:
Scenario: Sanitize LIKE patterns
Given a search term "test%drop"
When sanitizing for LIKE query
Then the result should be "test\\%drop"
And special characters should be escaped
Scenario: Validate column names
Given a sort column from user input
When validating against whitelist ["name", "created_at"]
Then unknown columns should raise ValueError
And valid columns should pass through
Scenario: Build safe IN clauses
Given a list of IDs [1, 2, 3]
When building an IN clause
Then the result should use parameterized placeholders
And no raw values should be in the SQLTechnical Requirements:
- QuerySanitizer class with utility methods
- Column whitelist validation
- LIKE pattern escaping
- Parameterized IN clause builder
US-3: Operator - Database Security Configuration
As a platform operator
I want configurable database security settings
So that I can tune security for my environment
Acceptance Criteria:
Scenario: Configure statement timeout
Given DB_STATEMENT_TIMEOUT=30000
When a query runs for 35 seconds
Then the query should be cancelled
And an error should be logged
Scenario: Configure max connections
Given DB_MAX_CONNECTIONS=100
When 101 connections are attempted
Then the 101st should be rejected
And connection pool should remain stable
Scenario: Enable query logging in debug mode
Given LOG_LEVEL=DEBUG
And DB_ENABLE_SQL_LOG=true
When a query executes
Then the SQL should be loggedTechnical Requirements:
- DB_STATEMENT_TIMEOUT setting
- DB_MAX_CONNECTIONS setting
- DB_ENABLE_SQL_LOG setting
- Connection timeout configuration
US-4: Security Admin - Model-Level Validation
As a security administrator
I want SQLAlchemy models to validate data before insert
So that invalid data is caught at the application layer
Acceptance Criteria:
Scenario: Validate tool name format
Given a Tool model instance
When setting name to "invalid name with spaces!"
Then a ValueError should be raised
And the message should indicate valid format
Scenario: Validate URL scheme
Given a Tool model instance
When setting url to "file:///etc/passwd"
Then a ValueError should be raised
And only http/https/ws/wss should be allowed
Scenario: Validate name length
Given a Tool model instance
When setting name longer than 255 characters
Then a ValueError should be raisedTechnical Requirements:
- @validates decorators on models
- Name format regex validation
- URL scheme validation
- Length validation
🏗 Architecture
Validation Layers
flowchart TD
A[API Request] --> B[Pydantic Validation]
B --> C[SQLAlchemy @validates]
C --> D[Database CHECK Constraints]
D --> E[Data Stored]
B -->|Invalid| F[HTTP 422]
C -->|Invalid| G[ValueError]
D -->|Invalid| H[IntegrityError]
CHECK Constraints by Table
erDiagram
PROMPTS {
string name "CHECK: no HTML, length <= 255"
string template "CHECK: no script tags"
}
TOOLS {
string name "CHECK: alphanumeric_underscore"
string url "CHECK: http(s)/ws(s) scheme"
}
RESOURCES {
string uri "CHECK: no path traversal"
string mime_type "CHECK: valid MIME format"
}
GATEWAYS {
string transport "CHECK: IN (SSE, STREAMABLEHTTP, STDIO)"
string url "CHECK: http(s)/ws(s) scheme"
}
SERVERS {
string name "CHECK: safe characters only"
string icon "CHECK: https or data:image"
}
📋 Implementation Tasks
Phase 1: CHECK Constraints Migration
- Create alembic migration for CHECK constraints
- Add prompts table constraints (name format, template safety)
- Add tools table constraints (name format, URL scheme)
- Add resources table constraints (URI format, no traversal)
- Add gateways table constraints (transport enum, URL scheme)
- Add servers table constraints (name format, icon URL)
- Make migration idempotent (check before adding)
Phase 2: Model-Level Validation
- Add @validates to Tool model (name, url)
- Add @validates to Prompt model (name)
- Add @validates to Resource model (uri)
- Add @validates to Gateway model (transport, url)
- Add @validates to Server model (name, icon)
- Add table-level CheckConstraint definitions
Phase 3: Query Sanitization Utilities
- Create QuerySanitizer class
- Implement sanitize_like_pattern()
- Implement validate_column_name()
- Implement validate_sort_order()
- Implement build_safe_in_clause()
Phase 4: Database Connection Security
- Add connection security settings to config
- Implement statement timeout
- Implement connection pooling limits
- Add SQLite PRAGMA settings
- Add application_name for monitoring
Phase 5: Testing
- Unit tests for CHECK constraint violations
- Unit tests for model validation
- Unit tests for query sanitization
- Integration tests for SQL injection attempts
- Performance tests for constraint overhead
Phase 6: Data Cleanup Script
- Create scripts/sanitize_database.py
- Fix existing HTML in names
- Fix invalid URLs
- Log all changes made
⚙️ Configuration Example
# Database Security Settings
DB_STATEMENT_TIMEOUT=30000 # 30 second query timeout
DB_MAX_CONNECTIONS=100 # Maximum connection pool size
DB_CONNECTION_TIMEOUT=10 # Connection acquisition timeout
DB_ENABLE_SQL_LOG=false # Only enable in debug mode
# Query Limits
MAX_QUERY_RESULTS=1000 # Maximum results per query
MAX_BATCH_SIZE=100 # Maximum batch operation size
# Security Flags
ENABLE_QUERY_SANITIZATION=true # Enable query sanitization utilities✅ Success Criteria
- All tables have appropriate CHECK constraints
- Model-level validation catches invalid data
- SQL injection attempts are blocked
- Path traversal attempts are blocked
- XSS vectors in data are rejected
- Query sanitization utilities working
- Database connection security configured
- Performance impact < 5% overhead
🏁 Definition of Done
- Alembic migration created and tested
- CHECK constraints on all relevant tables
- @validates decorators on models
- QuerySanitizer class implemented
- Database security settings in config
- Unit tests pass
- Integration tests pass
- Data cleanup script created
- Code passes
make verify - Documentation complete
🔗 Related Issues
- Related: [FEATURE][SECURITY]: Audit logging system #535 (Audit Logging - logs rejected attempts)
- Related: [FEATURE][SECURITY]: Add security configuration validation and startup checks #534 (Security Config Validation)
- Reference: OWASP SQL Injection Prevention Cheat Sheet