-
Notifications
You must be signed in to change notification settings - Fork 56
📋 Validate and Correct DATABASE_VIEW_INTELLIGENCE_CATALOG.md Against Actual Schema #7866
Description
🎯 Objective
Validate and correct the DATABASE_VIEW_INTELLIGENCE_CATALOG.md documentation by cross-referencing documented views with the actual database schema, identifying discrepancies, and ensuring all view descriptions are accurate and complete.
Recommended Agent: @intelligence-operative
Depends on: Issue #7865 (SQL validation script to provide baseline data)
📋 Background
The DATABASE_VIEW_INTELLIGENCE_CATALOG.md is a comprehensive ~29KB document describing 80+ database views used for political intelligence analysis. However, there may be discrepancies between documented views and actual database schema:
- Views documented but not in database
- Views in database but not documented
- Incorrect column descriptions or SQL examples
- Outdated view definitions after schema changes
- Missing performance characteristics
📊 Current State
Documentation: DATABASE_VIEW_INTELLIGENCE_CATALOG.md contains:
- Executive summary claiming "80+ database views"
- Detailed descriptions of major view categories
- SQL examples for each view
- Column definitions and usage patterns
- Performance characteristics
Known Issues:
- No validation that documented views exist
- SQL examples may not run against current schema
- Column lists may be incomplete or outdated
- Row count estimates may be inaccurate
From refresh-all-views.sql:
- 28 materialized views listed for refresh
- These should all be documented
✅ Acceptance Criteria
- All views from actual database schema documented
- All documented views verified to exist in database
- Column lists validated against actual view definitions
- SQL examples tested and confirmed working
- Row count estimates updated with actual data
- Missing views added to documentation
- Non-existent views removed from documentation
- Inconsistencies corrected with evidence
- Change log added documenting corrections made
🛠️ Implementation Guidance
Step 1: Extract View Lists
From Database (using Issue #7865 script):
psql -U postgres -d cia_dev -c "
SELECT viewname FROM pg_views WHERE schemaname = 'public'
UNION
SELECT matviewname FROM pg_matviews WHERE schemaname = 'public'
ORDER BY 1
" > actual_views.txtFrom Documentation:
# Extract view names from DATABASE_VIEW_INTELLIGENCE_CATALOG.md
grep -E "^### view_" DATABASE_VIEW_INTELLIGENCE_CATALOG.md | sed 's/^### //g' | cut -d' ' -f1 > documented_views.txtStep 2: Compare Lists
# Views in DB but not documented
comm -23 <(sort actual_views.txt) <(sort documented_views.txt) > missing_in_docs.txt
# Views documented but not in DB
comm -13 <(sort actual_views.txt) <(sort documented_views.txt) > missing_in_db.txt
# Views in both (should validate details)
comm -12 <(sort actual_views.txt) <(sort documented_views.txt) > views_to_validate.txtStep 3: Validate View Details
For each view in views_to_validate.txt:
-- Get actual column list
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'public' AND table_name = 'view_riksdagen_politician'
ORDER BY ordinal_position;
-- Get actual row count
SELECT COUNT(*) FROM view_riksdagen_politician;
-- Test documented SQL examples
-- (Copy SQL from documentation and run against database)Step 4: Document Findings
Create a correction report:
## DATABASE_VIEW_INTELLIGENCE_CATALOG.md Validation Report
### Date: [DATE]
### Validator: [NAME]
### Summary
- Total views in database: [N]
- Total views documented: [M]
- Matching views: [K]
- Missing from documentation: [N-K]
- Documented but not in DB: [M-K]
- Views with incorrect details: [X]
### Views Missing from Documentation
1. view_example_1 - [brief description of what it does]
2. view_example_2 - [brief description]
### Views Documented but Not in Database
1. view_old_deprecated_1 - [reason for removal]
2. view_renamed_to_new_name - [migration path]
### Views with Incorrect Documentation
1. view_riksdagen_politician
- Documented columns: [list]
- Actual columns: [list]
- Missing columns: [list]
- Extra documented columns: [list]
- SQL example status: [WORKING/BROKEN with details]
### Corrections Applied
1. Added documentation for: [list of views]
2. Removed documentation for: [list of views]
3. Updated column lists for: [list of views]
4. Fixed SQL examples for: [list of views]
5. Updated row counts for: [list of views]Step 5: Apply Corrections
Update DATABASE_VIEW_INTELLIGENCE_CATALOG.md:
-
Add missing views:
- Follow existing documentation pattern
- Include: Purpose, Key Columns, Example Queries, Performance Characteristics
- Assign intelligence value rating (⭐⭐⭐⭐⭐ to ⭐⭐)
-
Remove non-existent views:
- Add note in changelog section
- Preserve historical information if relevant
-
Fix existing view documentation:
- Update column lists to match actual schema
- Test and correct SQL examples
- Update row count estimates with actual data
- Correct performance characteristics
-
Add validation metadata:
- Last validated date
- Validation method used
- Known limitations
Example Corrections
Before:
### view_riksdagen_politician ⭐⭐⭐⭐⭐
| Column | Type | Description |
|--------|------|-------------|
| person_id | VARCHAR(255) | Unique identifier |
| name | VARCHAR(500) | Full name |After (if validation reveals issues):
### view_riksdagen_politician ⭐⭐⭐⭐⭐
| Column | Type | Description |
|--------|------|-------------|
| person_id | VARCHAR(255) | Unique identifier |
| first_name | VARCHAR(255) | First name |
| last_name | VARCHAR(255) | Last name |
| party | VARCHAR(50) | Current party affiliation |
**Note**: Column list validated against database on 2025-11-18.
~~Removed `name` column (deprecated in v1.25)~~
Added `party` column (added in v1.28)📊 Validation Checklist
For each documented view, verify:
- View exists in database
- Column names match actual view definition
- Column types match actual view definition
- SQL examples execute without errors
- SQL examples return expected results
- Row count estimate is accurate (±20%)
- Intelligence value rating is appropriate
- Related risk rules are correctly referenced
- Performance characteristics are accurate
🔗 Related Files
- DATABASE_VIEW_INTELLIGENCE_CATALOG.md - File to validate and correct
- full_schema.sql - Source of truth for schema
- refresh-all-views.sql - List of materialized views
- Issue 🗄️ Create SQL Script for Database Schema Validation and Statistics #7865 - SQL validation script (prerequisite)
📊 Success Metrics
- 100% of database views documented (or intentionally excluded with reason)
- 0 views documented that don't exist in database
- All SQL examples tested and working
- Column lists accurate for all views
- Row counts updated with real data
- Validation report produced with evidence
- Documentation tagged with validation date
Estimated Effort: 6-8 hours
Priority: High
Domain: documentation, intelligence, database