Skip to content

📋 Validate and Correct DATABASE_VIEW_INTELLIGENCE_CATALOG.md Against Actual Schema #7866

@pethers

Description

@pethers

🎯 Objective

Validate and correct the DATABASE_VIEW_INTELLIGENCE_CATALOG.md documentation by cross-referencing documented views with the actual database schema, identifying discrepancies, and ensuring all view descriptions are accurate and complete.

Recommended Agent: @intelligence-operative

Depends on: Issue #7865 (SQL validation script to provide baseline data)

📋 Background

The DATABASE_VIEW_INTELLIGENCE_CATALOG.md is a comprehensive ~29KB document describing 80+ database views used for political intelligence analysis. However, there may be discrepancies between documented views and actual database schema:

  • Views documented but not in database
  • Views in database but not documented
  • Incorrect column descriptions or SQL examples
  • Outdated view definitions after schema changes
  • Missing performance characteristics

📊 Current State

Documentation: DATABASE_VIEW_INTELLIGENCE_CATALOG.md contains:

  • Executive summary claiming "80+ database views"
  • Detailed descriptions of major view categories
  • SQL examples for each view
  • Column definitions and usage patterns
  • Performance characteristics

Known Issues:

  • No validation that documented views exist
  • SQL examples may not run against current schema
  • Column lists may be incomplete or outdated
  • Row count estimates may be inaccurate

From refresh-all-views.sql:

  • 28 materialized views listed for refresh
  • These should all be documented

✅ Acceptance Criteria

  • All views from actual database schema documented
  • All documented views verified to exist in database
  • Column lists validated against actual view definitions
  • SQL examples tested and confirmed working
  • Row count estimates updated with actual data
  • Missing views added to documentation
  • Non-existent views removed from documentation
  • Inconsistencies corrected with evidence
  • Change log added documenting corrections made

🛠️ Implementation Guidance

Step 1: Extract View Lists

From Database (using Issue #7865 script):

psql -U postgres -d cia_dev -c "
  SELECT viewname FROM pg_views WHERE schemaname = 'public'
  UNION
  SELECT matviewname FROM pg_matviews WHERE schemaname = 'public'
  ORDER BY 1
" > actual_views.txt

From Documentation:

# Extract view names from DATABASE_VIEW_INTELLIGENCE_CATALOG.md
grep -E "^### view_" DATABASE_VIEW_INTELLIGENCE_CATALOG.md | sed 's/^### //g' | cut -d' ' -f1 > documented_views.txt

Step 2: Compare Lists

# Views in DB but not documented
comm -23 <(sort actual_views.txt) <(sort documented_views.txt) > missing_in_docs.txt

# Views documented but not in DB
comm -13 <(sort actual_views.txt) <(sort documented_views.txt) > missing_in_db.txt

# Views in both (should validate details)
comm -12 <(sort actual_views.txt) <(sort documented_views.txt) > views_to_validate.txt

Step 3: Validate View Details

For each view in views_to_validate.txt:

-- Get actual column list
SELECT column_name, data_type 
FROM information_schema.columns 
WHERE table_schema = 'public' AND table_name = 'view_riksdagen_politician'
ORDER BY ordinal_position;

-- Get actual row count
SELECT COUNT(*) FROM view_riksdagen_politician;

-- Test documented SQL examples
-- (Copy SQL from documentation and run against database)

Step 4: Document Findings

Create a correction report:

## DATABASE_VIEW_INTELLIGENCE_CATALOG.md Validation Report

### Date: [DATE]
### Validator: [NAME]

### Summary
- Total views in database: [N]
- Total views documented: [M]
- Matching views: [K]
- Missing from documentation: [N-K]
- Documented but not in DB: [M-K]
- Views with incorrect details: [X]

### Views Missing from Documentation
1. view_example_1 - [brief description of what it does]
2. view_example_2 - [brief description]

### Views Documented but Not in Database
1. view_old_deprecated_1 - [reason for removal]
2. view_renamed_to_new_name - [migration path]

### Views with Incorrect Documentation
1. view_riksdagen_politician
   - Documented columns: [list]
   - Actual columns: [list]
   - Missing columns: [list]
   - Extra documented columns: [list]
   - SQL example status: [WORKING/BROKEN with details]

### Corrections Applied
1. Added documentation for: [list of views]
2. Removed documentation for: [list of views]
3. Updated column lists for: [list of views]
4. Fixed SQL examples for: [list of views]
5. Updated row counts for: [list of views]

Step 5: Apply Corrections

Update DATABASE_VIEW_INTELLIGENCE_CATALOG.md:

  1. Add missing views:

    • Follow existing documentation pattern
    • Include: Purpose, Key Columns, Example Queries, Performance Characteristics
    • Assign intelligence value rating (⭐⭐⭐⭐⭐ to ⭐⭐)
  2. Remove non-existent views:

    • Add note in changelog section
    • Preserve historical information if relevant
  3. Fix existing view documentation:

    • Update column lists to match actual schema
    • Test and correct SQL examples
    • Update row count estimates with actual data
    • Correct performance characteristics
  4. Add validation metadata:

    • Last validated date
    • Validation method used
    • Known limitations

Example Corrections

Before:

### view_riksdagen_politician ⭐⭐⭐⭐⭐

| Column | Type | Description |
|--------|------|-------------|
| person_id | VARCHAR(255) | Unique identifier |
| name | VARCHAR(500) | Full name |

After (if validation reveals issues):

### view_riksdagen_politician ⭐⭐⭐⭐⭐

| Column | Type | Description |
|--------|------|-------------|
| person_id | VARCHAR(255) | Unique identifier |
| first_name | VARCHAR(255) | First name |
| last_name | VARCHAR(255) | Last name |
| party | VARCHAR(50) | Current party affiliation |

**Note**: Column list validated against database on 2025-11-18.
~~Removed `name` column (deprecated in v1.25)~~
Added `party` column (added in v1.28)

📊 Validation Checklist

For each documented view, verify:

  • View exists in database
  • Column names match actual view definition
  • Column types match actual view definition
  • SQL examples execute without errors
  • SQL examples return expected results
  • Row count estimate is accurate (±20%)
  • Intelligence value rating is appropriate
  • Related risk rules are correctly referenced
  • Performance characteristics are accurate

🔗 Related Files

📊 Success Metrics

  • 100% of database views documented (or intentionally excluded with reason)
  • 0 views documented that don't exist in database
  • All SQL examples tested and working
  • Column lists accurate for all views
  • Row counts updated with real data
  • Validation report produced with evidence
  • Documentation tagged with validation date

Estimated Effort: 6-8 hours
Priority: High
Domain: documentation, intelligence, database

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions