# [EPIC][TESTING][UI]: Comprehensive Playwright E2E Test Suite for MCP Gateway Admin UI (#2519)

## Goal

Implement and verify complete end-to-end Playwright test coverage for the MCP Gateway Admin UI, so that all user workflows, CRUD operations, HTMX interactions, and administrative features are validated automatically before each release.

The framework (and many of the tests) listed below should already exist, but they have not yet been refined, verified, expanded, or optimized. The goal of this epic is to make the suite complete, comprehensive, and reliably working for all UI testing needs.

**Prerequisites**: #2136 (auth fix) must be resolved first.

**Related Issues**:
- [BUG][TESTING]: Playwright tests not updated to use admin email/password login credentials #2136
- ~~[CHORE]: #255 (basic automation) — closed as duplicate of this epic~~

---

## Why Now?

The Admin UI is the primary interface for administrators managing MCP servers, tools, and agents. Automated E2E coverage is needed to:

1. **Prevent UI regressions** across releases
2. **Validate HTMX/Alpine.js interactions** that are difficult to unit test
3. **Ensure authentication and RBAC work correctly** from the user's perspective
4. **Catch integration issues** between frontend and backend
5. **Enable confident refactoring** of the UI codebase

---

## Environment Configuration

### Feature Flags That Control UI Visibility

Tests must detect and adapt to these feature flags. The recommended approach is to check element visibility rather than fail when features are disabled.
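
The visibility-based detection described above could look roughly like the sketch below. It assumes the Playwright sync API (`page.locator`, `Locator.count`, `Locator.is_visible`). The feature names and tab selectors in `FEATURE_TAB_SELECTORS` are hypothetical placeholders; the real IDs have to be read from `mcpgateway/templates/admin.html`.

```python
from typing import Dict, Set

# Hypothetical selectors for illustration only; take the real tab IDs
# from admin.html before using this in conftest.py.
FEATURE_TAB_SELECTORS: Dict[str, str] = {
    "a2a": "#tab-a2a-agents",
    "grpc": "#tab-grpc-services",
    "plugins": "#tab-plugins",
}


def detect_enabled_features(page, selectors=FEATURE_TAB_SELECTORS) -> Set[str]:
    """Return the feature names whose tab is present in the DOM and visible."""
    enabled = set()
    for feature, selector in selectors.items():
        locator = page.locator(selector)
        # count() returns immediately, so an absent tab does not cost a timeout.
        if locator.count() > 0 and locator.first.is_visible():
            enabled.add(feature)
    return enabled
```

A fixture built on this function would take `authenticated_page` as its argument, since the tabs only render after login.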

**Important**: Settings defaults (in `config.py`) differ from `.env.example` values:

| Feature Flag | Settings Default | .env.example | UI Tab/Section Affected |
|-------------|------------------|--------------|------------------------|
| `MCPGATEWAY_UI_ENABLED` | `false` | `true` | Entire Admin UI |
| `MCPGATEWAY_ADMIN_API_ENABLED` | `false` | `true` | Admin API endpoints |
| `EMAIL_AUTH_ENABLED` | `true` | `true` | Organization section (Teams, Users, Tokens) |
| `MCPGATEWAY_A2A_ENABLED` | `true` | `true` | Agents -> A2A Agents tab |
| `MCPGATEWAY_GRPC_ENABLED` | `false` | `false` | Agents -> gRPC Services tab |
| `PLUGINS_ENABLED` | `false` | `false` | Extensions -> Plugins tab |
| `LLMCHAT_ENABLED` | `false` | `false` | LLM -> LLM Chat, LLM Settings tabs |
| `TOOLOPS_ENABLED` | `false` | `false` | MCP -> ToolOps tab |
| `OBSERVABILITY_ENABLED` | `false` | `false` | Monitoring -> Observability tab |
| `MCPGATEWAY_PERFORMANCE_TRACKING` | `false` | `false` | Monitoring -> Performance tab |
| `STRUCTURED_LOGGING_DATABASE_ENABLED` | `false` | `false` | System -> Logs tab (required for log viewing) |

> **Note**: For testing, ensure `.env` explicitly sets `MCPGATEWAY_UI_ENABLED=true` and `MCPGATEWAY_ADMIN_API_ENABLED=true` since the Settings defaults are `false`.
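
A pre-flight check can catch a missing `.env` override before any browser starts. The sketch below copies the Settings defaults from the table above; `parse_bool` is an assumption about how boolean strings are interpreted and may not match the application's own parsing exactly.

```python
# Settings defaults as listed in the feature-flag table above.
SETTINGS_DEFAULTS = {
    "MCPGATEWAY_UI_ENABLED": False,
    "MCPGATEWAY_ADMIN_API_ENABLED": False,
    "EMAIL_AUTH_ENABLED": True,
    "MCPGATEWAY_A2A_ENABLED": True,
    "MCPGATEWAY_GRPC_ENABLED": False,
    "PLUGINS_ENABLED": False,
    "LLMCHAT_ENABLED": False,
    "TOOLOPS_ENABLED": False,
    "OBSERVABILITY_ENABLED": False,
    "MCPGATEWAY_PERFORMANCE_TRACKING": False,
    "STRUCTURED_LOGGING_DATABASE_ENABLED": False,
}

REQUIRED_FOR_UI_TESTS = ("MCPGATEWAY_UI_ENABLED", "MCPGATEWAY_ADMIN_API_ENABLED")


def parse_bool(value: str) -> bool:
    """Assumed truthy spellings; the gateway's own parser may differ."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def resolve_flags(env: dict) -> dict:
    """Overlay explicit environment values on the Settings defaults."""
    return {
        name: parse_bool(env[name]) if name in env else default
        for name, default in SETTINGS_DEFAULTS.items()
    }


def missing_required_flags(env: dict) -> list:
    """List required flags that resolve to False for this environment."""
    flags = resolve_flags(env)
    return [name for name in REQUIRED_FOR_UI_TESTS if not flags[name]]
```

Calling `missing_required_flags(dict(os.environ))` from a session-scoped fixture would let the suite fail early with a clear message instead of timing out on a disabled UI.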

### Recommended Test Environment `.env`

```bash
# Core (required)
MCPGATEWAY_UI_ENABLED=true
MCPGATEWAY_ADMIN_API_ENABLED=true
EMAIL_AUTH_ENABLED=true
MCPGATEWAY_UI_AIRGAPPED=true # Use local assets to avoid CDN failures

# Test credentials (NEW - replaces BASIC_AUTH_USER/PASSWORD)
PLATFORM_ADMIN_EMAIL=admin@example.com
PLATFORM_ADMIN_PASSWORD=changeme

# Enable ALL optional features for full coverage
MCPGATEWAY_A2A_ENABLED=true
MCPGATEWAY_GRPC_ENABLED=true
PLUGINS_ENABLED=true
LLMCHAT_ENABLED=true
TOOLOPS_ENABLED=true
OBSERVABILITY_ENABLED=true
MCPGATEWAY_PERFORMANCE_TRACKING=true
STRUCTURED_LOGGING_DATABASE_ENABLED=true

# Disable password change requirement for tests
PASSWORD_CHANGE_ENFORCEMENT_ENABLED=false
ADMIN_REQUIRE_PASSWORD_CHANGE_ON_BOOTSTRAP=false
```

### Key Test Infrastructure (conftest.py)

The following fixtures and utilities should be implemented:

| Fixture/Class | Purpose |
|--------------|---------|
| `authenticated_page` | Returns a Page logged into admin UI via form POST |
| `enabled_features` | Detects which feature tabs are visible (depends on `authenticated_page`) |
| `skip_if_feature_disabled` | Decorator to skip tests with warning when feature disabled |
| `ConsoleErrorCollector` | Collects browser console errors, auto-asserts no errors at test end |
| `NetworkRequestCollector` | Monitors network requests, flags 4xx/5xx failures (with allowlist support) |
| `VisualComparator` | Screenshot comparison with 1% pixel threshold (requires `pixelmatch`, `Pillow`) |
| `PerformanceTimer` | Measures page load times, asserts < 3s threshold |
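
As an illustration of the collector pattern in the table, here is a minimal sketch of `ConsoleErrorCollector`. It relies on Playwright's `console` and `pageerror` page events and on the `type` and `text` properties of console messages. The method names `attach` and `assert_no_errors` are assumptions, not the existing implementation.

```python
class ConsoleErrorCollector:
    """Collect browser console errors and uncaught page errors for one test."""

    def __init__(self):
        self.errors = []

    def attach(self, page):
        # Both events are emitted by Playwright's Page object.
        page.on("console", self._on_console)
        page.on("pageerror", self._on_page_error)

    def _on_console(self, msg):
        # Only console.error() output counts; logs and warnings are ignored.
        if msg.type == "error":
            self.errors.append(msg.text)

    def _on_page_error(self, error):
        # Uncaught JavaScript exceptions arrive here.
        self.errors.append(str(error))

    def assert_no_errors(self):
        assert not self.errors, "Browser console errors:\n" + "\n".join(self.errors)
```

A `console_collector` fixture could create the collector, attach it to the page, `yield` it, and call `assert_no_errors()` during teardown so that every test gets the check automatically.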

> **Important**: Do NOT use `wait_until='networkidle'` for admin pages - the SSE connection (`/admin/events`) keeps a connection open indefinitely. Use `domcontentloaded` + selector waits instead.
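
A navigation helper that follows this rule might look like the sketch below. The `/admin` path and the default ready selector are assumptions chosen for illustration; `goto(..., wait_until="domcontentloaded")` and `wait_for_selector` are standard Playwright page methods.

```python
def open_admin(page, base_url, ready_selector='[data-testid="servers-tab"]',
               timeout_ms=10_000):
    """Load the admin UI without waiting for the network to go idle."""
    # 'networkidle' would never fire while the SSE stream stays open.
    page.goto(f"{base_url.rstrip('/')}/admin", wait_until="domcontentloaded")
    # Wait for a concrete element instead, which proves the UI has rendered.
    page.wait_for_selector(ready_selector, state="visible", timeout=timeout_ms)
```

The same pattern applies after HTMX swaps: wait for the selector of the swapped-in content rather than for network quiescence.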

---

## User Stories

<details>
<summary>US-1: QA Engineer - Automated Regression Coverage</summary>

As a QA Engineer, I want automated E2E tests that cover all Admin UI workflows so that every release is validated without manual effort.

**Acceptance Criteria:**
- All CRUD operations tested for each entity type
- Tab navigation and pagination tested
- Form validation errors tested
- Tests run in CI on every PR
- Tests gracefully skip when optional features are disabled
</details>

<details>
<summary>US-2: Developer - Confident Refactoring</summary>

As a Developer, I want comprehensive UI tests so I can refactor frontend code without fear of breaking user workflows.

**Acceptance Criteria:**
- Tests cover all critical user paths
- Tests fail fast on breaking changes
- Clear error messages indicate what broke
</details>

<details>
<summary>US-3: Security Engineer - Authentication Validation</summary>

As a Security Engineer, I want tests that validate authentication flows and access control from the UI perspective.

**Acceptance Criteria:**
- Email/password login tested (not HTTP Basic Auth)
- Password change flow tested
- Admin-only pages properly protected
- Session expiration handled correctly
</details>

---

## Test Strategy

### Test Organization

```
tests/playwright/
├── conftest.py # Shared fixtures, feature detection, console/network monitors
├── pages/ # Page Object Model (25 page objects)
├── entities/ # CRUD tests (always-on features)
├── entities_optional/ # CRUD tests (feature-flagged)
├── features/ # Feature tests (auth, observability, plugins, etc.)
├── interactions/ # UI interaction tests (navigation, htmx, modals, forms)
├── accessibility/ # WCAG AA compliance (required)
├── chaos/ # Multi-tab and stress testing (lower priority)
├── network/ # Network condition simulation (local only)
├── performance/ # Performance and memory tests
├── visual/ # Visual regression (1% threshold)
├── cross_browser/ # Browser-specific tests
└── realtime/ # SSE and real-time tests
```

### Test Markers

```toml
# pyproject.toml - key markers
[tool.pytest.ini_options]
markers = [
 "smoke: Critical path tests (< 2 min total)",
 "crud: Entity CRUD operations",
 "a11y: Accessibility tests (WCAG AA required)",
 "chaos: Multi-tab and stress tests (lower priority)",
 "network: Network condition simulation (local only)",
 "perf: Performance threshold tests",
 # Placeholder: pytest has no wildcard markers, so register each requires_<feature> marker by name.
 "requires_*: Feature flag requirements",
]
```

---

## Implementation Tasks

### Phase 1: Foundation & Fixtures

#### 1.1 Core Fixtures
- [ ] **FX-1**: Update `conftest.py` with `PLATFORM_ADMIN_EMAIL/PASSWORD` (#2136)
- [ ] **FX-2**: Create `enabled_features` fixture for feature detection
- [ ] **FX-3**: Create `skip_if_feature_disabled` decorator
- [ ] **FX-4**: Create entity factory fixtures (test data generators with UUID)
- [ ] **FX-5**: Create cleanup fixtures (teardown after tests)
- [ ] **FX-6**: Create authenticated admin vs non-admin user fixtures
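
One possible shape for the FX-3 decorator is sketched below. It assumes the decorated test requests the `enabled_features` fixture as an argument and that the fixture yields a set of feature names. It raises `unittest.SkipTest` to avoid a hard dependency in the example; inside the real suite, `pytest.skip()` would be the natural call.

```python
import functools
import unittest
import warnings


def skip_if_feature_disabled(feature):
    """Skip the decorated test, with a warning, when `feature` is not enabled."""

    def decorator(test_func):
        @functools.wraps(test_func)  # keeps the signature visible for fixture injection
        def wrapper(*args, **kwargs):
            enabled = kwargs.get("enabled_features") or set()
            if feature not in enabled:
                reason = f"{test_func.__name__}: feature '{feature}' is disabled"
                warnings.warn(f"Skipping {reason}")
                raise unittest.SkipTest(reason)
            return test_func(*args, **kwargs)

        return wrapper

    return decorator
```

Usage would then read `@skip_if_feature_disabled("plugins")` above a test whose parameters include `enabled_features`.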

#### 1.2 Page Objects (Always-On Features)
- [ ] **PO-1**: `LoginPage` - form login, error handling, SSO buttons
- [ ] **PO-2**: `AdminPage` - sidebar, tab navigation, dark mode toggle
- [ ] **PO-3**: `ServersPage` (Virtual Servers/Catalog) - full CRUD
- [ ] **PO-4**: `GatewaysPage` (MCP Servers) - CRUD + connectivity test
- [ ] **PO-5**: `ToolsPage` - CRUD + schema display
- [ ] **PO-6**: `ResourcesPage` - CRUD + URI handling
- [ ] **PO-7**: `PromptsPage` - CRUD + arguments
- [ ] **PO-8**: `RootsPage` - CRUD
- [ ] **PO-9**: `McpRegistryPage` - browse, search, register individual servers
- [ ] **PO-10**: `MetricsPage` - admin dashboard stats
- [ ] **PO-11**: `ExportImportPage` - export/import workflows
- [ ] **PO-12**: `LogsPage` - view/stream logs
- [ ] **PO-13**: `MaintenancePage` - cleanup, rollup operations
- [ ] **PO-14**: `VersionInfoPage` - version info, services status, support bundle

#### 1.3 Page Objects (Optional Features)
- [ ] **PO-15**: `A2AAgentsPage` (requires `MCPGATEWAY_A2A_ENABLED`)
- [ ] **PO-16**: `GrpcServicesPage` (requires `MCPGATEWAY_GRPC_ENABLED`)
- [ ] **PO-17**: `LlmChatPage` (requires `LLMCHAT_ENABLED`)
- [ ] **PO-18**: `LlmSettingsPage` (requires `LLMCHAT_ENABLED`)
- [ ] **PO-19**: `ToolOpsPage` (requires `TOOLOPS_ENABLED`)
- [ ] **PO-20**: `ObservabilityPage` (requires `OBSERVABILITY_ENABLED`)
- [ ] **PO-21**: `PerformancePage` (requires `MCPGATEWAY_PERFORMANCE_TRACKING`)
- [ ] **PO-22**: `PluginsPage` (requires `PLUGINS_ENABLED`)
- [ ] **PO-23**: `TeamsPage` (requires `EMAIL_AUTH_ENABLED`)
- [ ] **PO-24**: `UsersPage` (requires `EMAIL_AUTH_ENABLED` + admin)
- [ ] **PO-25**: `TokensPage` (requires `EMAIL_AUTH_ENABLED`)

### Phase 2: Authentication Tests

> **Note**: To test AUTH-6/AUTH-7 (password change flow), you'll need a separate test profile with `PASSWORD_CHANGE_ENFORCEMENT_ENABLED=true`.

- [ ] **AUTH-1**: Email/password login success via form POST
- [ ] **AUTH-2**: Login with invalid credentials (error message display)
- [ ] **AUTH-3**: Login with missing fields (validation)
- [ ] **AUTH-4**: Logout functionality (cookie cleared)
- [ ] **AUTH-5**: Session expiration handling (redirect to login)
- [ ] **AUTH-6**: Password change required flow ⚠️ *requires enforcement enabled*
- [ ] **AUTH-7**: Password validation errors ⚠️ *requires enforcement enabled*
- [ ] **AUTH-8**: SSO provider buttons display (when SSO providers enabled)
- [ ] **AUTH-9**: Admin-only page protection (non-admin gets restricted view)
- [ ] **AUTH-10**: JWT cookie httpOnly flag (always), Secure flag (HTTPS only)
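
AUTH-10 can be checked against the cookie list returned by Playwright's `context.cookies()`, where each cookie is a dict with `name`, `httpOnly`, and `secure` keys. The helper below is a sketch; the cookie name passed in is something the test must supply, since this issue does not state it.

```python
def auth_cookie_problems(cookies, cookie_name, https):
    """Return a list of flag problems for the named auth cookie (empty if fine)."""
    match = next((c for c in cookies if c.get("name") == cookie_name), None)
    if match is None:
        return [f"cookie '{cookie_name}' not set"]
    problems = []
    if not match.get("httpOnly"):
        problems.append("missing HttpOnly flag")
    # Secure is only expected when the UI is served over HTTPS.
    if https and not match.get("secure"):
        problems.append("missing Secure flag on HTTPS")
    return problems
```

A test would call it as `auth_cookie_problems(context.cookies(), name, https=base_url.startswith("https"))` and assert that the result is empty.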

### Phase 3: Core Entity CRUD Tests (Always-On)

#### 3.1 Virtual Servers (Catalog) `#catalog-panel`
- [ ] **SRV-1**: List servers with pagination
- [ ] **SRV-2**: Create new server via inline form `#add-server-form`
- [ ] **SRV-3**: Edit existing server (via edit modal)
- [ ] **SRV-4**: Delete server with confirmation dialog
- [ ] **SRV-5**: Activate/deactivate server toggle
- [ ] **SRV-6**: Associate tools/resources/prompts with server
- [ ] **SRV-7**: Server search functionality (`#catalog-search-input`)
- [ ] **SRV-8**: View server details via view modal
- [ ] **SRV-9**: Filter by team (when `EMAIL_AUTH_ENABLED`)
- [ ] **SRV-10**: Show inactive toggle

#### 3.2 MCP Servers (Gateways) `#gateways-panel`
- [ ] **GW-1**: List gateways with pagination
- [ ] **GW-2**: Create new gateway (name, URL, transport type)
- [ ] **GW-3**: Edit gateway configuration
- [ ] **GW-4**: Delete gateway
- [ ] **GW-5**: Activate/deactivate gateway
- [ ] **GW-6**: Test gateway connectivity button (`#gateway-test-modal`)
- [ ] **GW-7**: View associated tools/resources/prompts counts
- [ ] **GW-8**: OAuth configuration (when OAuth enabled)
- [ ] **GW-9**: Passthrough headers configuration
- [ ] **GW-10**: Fetch Tools from MCP Server (`fetchToolsForGateway`)

#### 3.3 Tools `#tools-panel`
- [ ] **TOOL-1**: List tools with pagination
- [ ] **TOOL-2**: Create new REST tool via `#add-tool-form`
- [ ] **TOOL-3**: Create MCP tool (from gateway discovery)
- [ ] **TOOL-4**: Edit tool configuration
- [ ] **TOOL-5**: Delete tool
- [ ] **TOOL-6**: Activate/deactivate tool
- [ ] **TOOL-7**: Tool search and filtering
- [ ] **TOOL-8**: View tool details (input schema, annotations)
- [ ] **TOOL-9**: Test tool execution via `#tool-test-modal`
- [ ] **TOOL-10**: Tool visibility settings (public/team/private)
- [ ] **TOOL-11**: Bulk import dropdown - JSON array paste or file upload
- [ ] **TOOL-12**: Test case generation modal (`#testcase-gen-modal`)
- [ ] **TOOL-13**: Bulk test case generation (`#bulk-testcase-gen-modal`)

#### 3.4 Resources `#resources-panel`
- [ ] **RES-1**: List resources with pagination
- [ ] **RES-2**: Create new resource
- [ ] **RES-3**: Edit resource configuration
- [ ] **RES-4**: Delete resource
- [ ] **RES-5**: Activate/deactivate resource
- [ ] **RES-6**: Resource URI validation
- [ ] **RES-7**: Resource search functionality
- [ ] **RES-8**: Resource template handling
- [ ] **RES-9**: Test resource via `#resource-test-modal` (`runResourceTest()`)
- [ ] **RES-10**: View resource details

#### 3.5 Prompts `#prompts-panel`
- [ ] **PRMT-1**: List prompts with pagination
- [ ] **PRMT-2**: Create new prompt
- [ ] **PRMT-3**: Edit prompt details
- [ ] **PRMT-4**: Delete prompt
- [ ] **PRMT-5**: Activate/deactivate prompt
- [ ] **PRMT-6**: Prompt arguments handling
- [ ] **PRMT-7**: Prompt search functionality
- [ ] **PRMT-8**: Test prompt via `#prompt-test-modal` (`runPromptTest()`)

#### 3.6 Roots `#roots-panel`
- [ ] **ROOT-1**: List roots
- [ ] **ROOT-2**: Add new root URI
- [ ] **ROOT-3**: Delete root
- [ ] **ROOT-4**: Root path validation
- [ ] **ROOT-5**: Export root configuration (`exportRoot()`)

#### 3.7 MCP Registry `#mcp-registry-panel`
- [ ] **REG-1**: Browse registry servers
- [ ] **REG-2**: Search registry
- [ ] **REG-3**: Register individual server (per-server "Add" button)
- [ ] **REG-4**: Check server status

> **Note**: Bulk registration is not currently implemented. Only per-server registration is available.

### Phase 4: Optional Entity CRUD Tests

#### 4.1 A2A Agents (requires `MCPGATEWAY_A2A_ENABLED`)
- [ ] **A2A-1** through **A2A-7**: Full CRUD + test connectivity + view skills

#### 4.2 gRPC Services (requires `MCPGATEWAY_GRPC_ENABLED`)
- [ ] **GRPC-1** through **GRPC-7**: Full CRUD + reflection + get methods

#### 4.3 Users (requires `EMAIL_AUTH_ENABLED` + admin)
- [ ] **USR-1** through **USR-8**: Full CRUD + admin toggle + force password change

#### 4.4 Teams (requires `EMAIL_AUTH_ENABLED`)
- [ ] **TEAM-1** through **TEAM-7**: Full CRUD + member management

#### 4.5 API Tokens (requires `EMAIL_AUTH_ENABLED`)
- [ ] **TKN-1** through **TKN-5**: Generate, view, revoke, copy tokens

### Phase 5: UI Interaction Tests

#### 5.1 Navigation & Layout
- [ ] **NAV-1** through **NAV-7**: Tab switching, sidebar, hash navigation, dark mode

#### 5.2 Tables & Pagination
- [ ] **TBL-1** through **TBL-8**: Button-based pagination (no text input), per-page size, row actions

> **Note**: Pagination is button-based only (no direct page number text input).

#### 5.3 Modals & Forms
- [ ] **MDL-1** through **MDL-8**: Open/close, backdrop, escape key, validation, nested modals

#### 5.4 HTMX Interactions
- [ ] **HTMX-1** through **HTMX-7**: Tab loading, form submission, partial refresh, indicators

#### 5.5 Search & Filter
- [ ] **SRCH-1** through **SRCH-5**: Client-side search, status filter, team filter, reset

> **Note**: Entity search is client-side only and does NOT update URL params.

### Phase 6: Optional Feature Tests

#### 6.1 Observability (requires `OBSERVABILITY_ENABLED`)
- [ ] **OBS-1** through **OBS-9**: Traces, filters, saved queries, sub-tabs, auto-polling

> **Prerequisite**: Tests require trace data or should handle empty states gracefully.

#### 6.2 Performance (requires `MCPGATEWAY_PERFORMANCE_TRACKING`)
- [ ] **PERF-1** through **PERF-5**: Metrics, charts, latency percentiles

#### 6.3 LLM Chat (requires `LLMCHAT_ENABLED`)
- [ ] **LLM-1** through **LLM-4**: Chat interface, model selection, error handling

#### 6.4 LLM Settings (requires `LLMCHAT_ENABLED`)
- [ ] **LLMS-1** through **LLMS-5**: Provider CRUD, model availability

#### 6.5 ToolOps (requires `TOOLOPS_ENABLED`)
- [ ] **TOP-1** through **TOP-2**: Panel loads, configuration display

#### 6.6 Plugins (requires `PLUGINS_ENABLED`, admin only)
- [ ] **PLG-1** through **PLG-5**: List, enable/disable, details, refresh

### Phase 7: Admin-Only Features

#### 7.1 Metrics `#metrics-panel`
- [ ] **MET-1** through **MET-3**: Dashboard cards, key stats, cache statistics

#### 7.2 Export/Import `#export-import-panel`
- [ ] **EXP-1** through **IMP-5**: Export, selective export, import, validation, progress

#### 7.3 System Logs `#logs-panel`
> **Prerequisite**: Requires `STRUCTURED_LOGGING_DATABASE_ENABLED=true`

- [ ] **LOG-1** through **LOG-4**: View, filter, search, download logs

#### 7.4 Maintenance `#maintenance-panel`
- [ ] **MNT-1** through **MNT-5**: Cleanup, rollup operations

#### 7.5 Version Info `#version-info-panel`
- [ ] **VER-1** through **VER-6**: App info, platform info, services status, support bundle

### Phase 8: Visual & Cross-Browser

#### 8.1 Visual Regression (1% Pixel Threshold)
- [ ] **VIS-1** through **VIS-8**: Page baselines, dark mode, mobile, per-browser baselines
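
The 1% threshold reduces to a ratio of changed pixels to total pixels. The sketch below computes that ratio over raw RGBA byte buffers in pure Python. It is a dependency-free stand-in for illustration, not the `pixelmatch`/`Pillow` comparison that `VisualComparator` is expected to use, and treating exactly 1% as a pass is an assumption.

```python
def diff_ratio(baseline: bytes, current: bytes, channels: int = 4) -> float:
    """Fraction of pixels that differ between two equally sized raw images."""
    if len(baseline) != len(current):
        raise ValueError("images must have identical dimensions")
    if len(baseline) % channels:
        raise ValueError("buffer length is not a multiple of the channel count")
    total_pixels = len(baseline) // channels
    if total_pixels == 0:
        return 0.0
    changed = sum(
        1
        for i in range(0, len(baseline), channels)
        if baseline[i:i + channels] != current[i:i + channels]
    )
    return changed / total_pixels


def within_threshold(baseline: bytes, current: bytes, threshold: float = 0.01) -> bool:
    """True when at most `threshold` of the pixels changed."""
    return diff_ratio(baseline, current) <= threshold
```

With Pillow, the raw buffers could come from `Image.open(path).convert("RGBA").tobytes()`.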

#### 8.2 Responsive Design
- [ ] **RESP-1** through **RESP-6**: Mobile/tablet/desktop viewports, sidebar behavior

#### 8.3 Cross-Browser Matrix
- [ ] **XBROW-1** through **XBROW-5**: Chromium, Firefox, WebKit full suites

### Phase 9: CI/CD Integration

- [ ] **CI-1** through **CI-7**: GitHub Actions, parallel execution, artifacts, reports, nightly runs

### Phase 10: Accessibility Testing (WCAG AA Required)

> **⚠️ UI Changes Required**: Many a11y tests will fail on current UI without product changes.
> **Related Issues**: #2480 (manual testing), #2275 (keyboard navigation epic)
> **Recommendation**: Mark tests as `@pytest.mark.xfail` until UI changes are implemented.

#### 10.1 Core Accessibility (axe-core)
- [ ] **A11Y-1** through **A11Y-6**: axe-core scans on all pages, WCAG AA violations
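
An axe-core scan from Playwright usually means injecting the library and evaluating `axe.run` in the page. The sketch below assumes the caller supplies the axe-core source as a string (for example, read from a vendored `axe.min.js`) and restricts the run to the `wcag2a` and `wcag2aa` rule tags. Function names are illustrative.

```python
# JavaScript evaluated in the page; axe.run returns a promise that
# Playwright's page.evaluate resolves before returning.
AXE_RUN_SCRIPT = (
    "() => axe.run(document, "
    "{runOnly: {type: 'tag', values: ['wcag2a', 'wcag2aa']}})"
)


def run_axe(page, axe_source: str) -> list:
    """Inject axe-core into the page and return its list of violations."""
    page.add_script_tag(content=axe_source)
    results = page.evaluate(AXE_RUN_SCRIPT)
    return results.get("violations", [])


def summarize(violations: list) -> list:
    """One readable line per violated rule, for assertion messages."""
    return [
        f"{v['id']} ({v.get('impact')}): {len(v.get('nodes', []))} node(s)"
        for v in violations
    ]
```

A test can then assert `not violations` with `"\n".join(summarize(violations))` as the failure message, which keeps `xfail` reports readable while the UI changes are pending.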

#### 10.2 Keyboard Navigation
- [ ] **A11Y-7** through **A11Y-13**: Tab order, focus indicators, modal trap, arrow keys

#### 10.3 Screen Reader Support
- [ ] **A11Y-14** through **A11Y-19**: Labels, ARIA, live regions, landmarks, headings

#### 10.4 Visual Accessibility
- [ ] **A11Y-20** through **A11Y-23**: Color contrast, dark mode, focus visibility

### Phase 11: Performance & Memory Tests

#### 11.1 Page Load (< 3s threshold)
- [ ] **PERF-T1** through **PERF-T5**: Login, dashboard, tab switch, modal, search
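
The < 3s assertion fits naturally into a context manager. The sketch below is one way `PerformanceTimer` might be written; the constructor arguments, including the injectable `clock`, are assumptions made so the timer can be tested without real delays.

```python
import time


class PerformanceTimer:
    """Time a block and fail if it takes longer than the threshold."""

    def __init__(self, label, threshold_s=3.0, clock=time.perf_counter):
        self.label = label
        self.threshold_s = threshold_s
        self.elapsed = None
        self._clock = clock

    def __enter__(self):
        self._start = self._clock()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.elapsed = self._clock() - self._start
        # Only judge timing when the block itself succeeded.
        if exc_type is None:
            assert self.elapsed < self.threshold_s, (
                f"{self.label} took {self.elapsed:.2f}s "
                f"(threshold {self.threshold_s:.2f}s)"
            )
        return False
```

In a test, the timed block would contain the navigation plus the selector wait, so the measurement reflects when the page is actually usable.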

#### 11.2 Memory Leak Detection
- [ ] **MEM-1** through **MEM-5**: Tab switching, modal cycles, HTMX loads, Chart.js cleanup

#### 11.3 Resource Cleanup
- [ ] **MEM-6** through **MEM-8**: XHR cancellation, SSE close, intervals cleared

### Phase 12: Chaos & Multi-Tab Testing (Lower Priority)

#### 12.1 Multi-Tab Scenarios
- [ ] **CHAOS-1** through **CHAOS-5**: Multiple tabs, concurrent edits, cross-tab logout

#### 12.2 Rapid Interaction
- [ ] **CHAOS-6** through **CHAOS-9**: Rapid navigation, double-click prevention

#### 12.3 State Consistency
- [ ] **CHAOS-10** through **CHAOS-13**: LocalStorage, dark mode, tab state persistence

### Phase 13: Network Simulation (Local Only)

#### 13.1 Slow Network (Slow 3G: 500kbps, 400ms latency)
- [ ] **NET-1** through **NET-4**: Loading indicators, navigation, timeout errors
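
In Chromium, the Slow 3G profile can be applied through a CDP session using `Network.emulateNetworkConditions`, which expects throughput in bytes per second. The sketch below converts 500 kbps accordingly (500 × 1000 / 8 = 62,500 bytes/s). Applying the same rate to uploads is an assumption, and this approach does not work in Firefox or WebKit.

```python
def slow_3g_conditions(kbps: int = 500, latency_ms: int = 400) -> dict:
    """Build CDP network conditions; throughput is in bytes per second."""
    bytes_per_second = kbps * 1000 / 8
    return {
        "offline": False,
        "latency": latency_ms,
        "downloadThroughput": bytes_per_second,
        "uploadThroughput": bytes_per_second,  # assumed equal to download
    }


def apply_slow_network(context, page, conditions=None):
    """Throttle `page` via the Chrome DevTools Protocol (Chromium only)."""
    cdp = context.new_cdp_session(page)
    cdp.send("Network.enable")
    cdp.send("Network.emulateNetworkConditions", conditions or slow_3g_conditions())
    return cdp
```

Passing `{"offline": True, "latency": 0, "downloadThroughput": 0, "uploadThroughput": 0}` as `conditions` would cover the offline scenarios as well, although Playwright's `context.set_offline(True)` is the simpler route for those.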

#### 13.2 Offline Mode
- [ ] **NET-5** through **NET-7**: Graceful errors, retry button

#### 13.3 Request Failures
- [ ] **NET-8** through **NET-12**: 500/401/403/404/timeout handling
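
Failure responses can be simulated without touching the backend by intercepting requests with `page.route` and answering them with `route.fulfill`. The helper below is a sketch; the response body is a made-up payload, and the URL pattern a test passes in depends on the endpoints actually used by the UI.

```python
def fail_requests(page, url_pattern, status=500,
                  body='{"detail": "simulated failure"}'):
    """Answer every request matching `url_pattern` with an error response."""

    def handler(route):
        # The request never reaches the server; the UI sees only this reply.
        route.fulfill(status=status, content_type="application/json", body=body)

    page.route(url_pattern, handler)
```

Timeouts can be approximated in a similar way by calling `route.abort("timedout")` in the handler instead of `route.fulfill`.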

### Phase 14: Negative & Edge Case Tests

#### 14.1 Input Validation
- [ ] **VAL-1** through **VAL-7**: XSS, SQL injection, long input, unicode, whitespace

#### 14.2 Error Handling
- [ ] **ERR-1** through **ERR-5**: Field errors, duplicates, dependencies, session expiry

#### 14.3 Data Integrity
- [ ] **DATA-1** through **DATA-5**: Immediate list updates after CRUD

### Phase 15: Real-Time & SSE Tests

#### 15.1 SSE Connection
- [ ] **SSE-1** through **SSE-4**: Connect, reconnect, UI updates, logout close

#### 15.2 Live Updates
- [ ] **SSE-5** through **SSE-7**: Log streaming, observability updates, no duplicates

---

## Makefile Targets

| Target | Browsers | Purpose | Time |
|--------|----------|---------|------|
| `test-ui-smoke` | Chromium | Quick sanity check | ~2 min |
| `test-ui-smoke-all` | All 3 | Smoke across browsers | ~5 min |
| `test-ui` / `test-ui-lite` | Chromium | Full suite, single browser | ~10 min |
| `test-ui-full` | All 3 | Complete cross-browser suite | ~25 min |
| `test-ui-a11y` | Chromium | WCAG AA compliance | ~3 min |
| `test-ui-visual` | Chromium | Visual regression (1% threshold) | ~5 min |
| `test-ui-perf` | Chromium | Performance thresholds | ~2 min |
| `test-ui-chaos` | Chromium | Multi-tab stability | ~3 min |
| `test-ui-slow-network` | Chromium | Slow network (local only) | ~5 min |
| `test-ui-ci` | All 3 | CI pipeline target | ~25 min |

---

## Definition of Done

### Per Test
- [ ] Uses appropriate `data-testid` selectors where available
- [ ] Uses `skip_if_feature_disabled` for optional features
- [ ] Is deterministic (no flakiness, no `time.sleep`)
- [ ] Cleans up created entities
- [ ] Uses `console_collector` fixture to detect JS errors

### Epic Complete
- [ ] All phases completed (1-15)
- [ ] CI pipeline runs full suite on PRs
- [ ] Total execution < 15 minutes (parallel, single browser)
- [ ] Visual regression baselines established (1% threshold)
- [ ] Cross-browser tests passing (Chromium, Firefox, WebKit)
- [ ] Accessibility tests pass WCAG AA
- [ ] Performance tests pass (< 3s page load)
- [ ] Feature detection works correctly (warns but doesn't fail)

---

## Test Count Estimate

| Category | Test Count |
|----------|------------|
| **Core Functionality** | ~180 |
| **Quality & Robustness** | ~99 |
| **Total Unique Tests** | **~279** |
| Full (all 3 browsers) | ~837 |

---

## Success Criteria

| Metric | Target |
|--------|--------|
| Entity CRUD coverage | 100% |
| Authentication flows | 100% |
| CI execution time | < 15 min (single browser) |
| Flaky test rate | < 5% |
| **Accessibility (WCAG AA)** | **100% compliance** |
| Page load time | < 3 seconds |
| Visual regression | 1% pixel threshold |
| Console errors | 0 (auto-detected) |

---

## Available `data-testid` Selectors

| Selector | Element |
|----------|---------|
| `[data-testid="overview-tab"]` | Overview tab link |
| `[data-testid="gateways-tab"]` | MCP Servers tab link |
| `[data-testid="servers-tab"]` | Virtual Servers tab link |
| `[data-testid="tools-tab"]` | Tools tab link |
| `[data-testid="search-input"]` | Search input field |
| *(and more...)* | |

> **⚠️ Important**: The `data-testid` inventory is **incomplete**. Many tabs/inputs do not have test IDs.
> **Fallback Strategy**: `[data-testid]` → `#tab-*` / `#*-panel` → `#element-id` → semantic selectors
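
The fallback chain can be encoded as an ordered list of candidate selectors, taking the first one that matches. This is a minimal sketch using `page.locator(...).count()`; the helper name is an assumption, and any concrete `#tab-*` ID passed to it should be confirmed against the template.

```python
def first_matching_selector(page, candidates):
    """Return the first selector in `candidates` that matches an element."""
    for selector in candidates:
        if page.locator(selector).count() > 0:
            return selector
    raise LookupError(f"none of the selectors matched: {list(candidates)}")
```

Ordering the candidates from `data-testid` down to semantic selectors keeps tests stable today and lets them pick up new test IDs automatically as they are added.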

---

## References

- **Prerequisites**: #2136 (~~#255 closed as duplicate of this epic~~)
- **Template reference**: #2387 (RBAC epic)
- **Related Accessibility Issues**: #2480, #2275
- **Existing tests**: `tests/playwright/`
- **Admin template**: `mcpgateway/templates/admin.html`
- **Playwright docs**: https://playwright.dev/python/
- **axe-core (a11y)**: https://github.com/dequelabs/axe-core

---

## Important Notes

1. **Feature Detection**: Tests MUST gracefully skip disabled features with a warning.

2. **Test Data Isolation**: Use UUID-suffixed names for created entities.

3. **HTMX Waits**: **Do NOT use `networkidle`** for admin pages - SSE keeps connection open. Use `domcontentloaded` + selector waits.

4. **Authentication**: Uses `PLATFORM_ADMIN_EMAIL/PASSWORD` (form login), NOT `BASIC_AUTH_USER/PASSWORD`.

5. **Console Error Detection**: All tests should use `console_collector` fixture.

6. **Accessibility is Required**: WCAG AA compliance is a requirement, not optional.

7. **Browser Matrix**: `-lite` targets = Chromium only; full targets = all 3 browsers.

8. **Network Simulation**: Local testing only, not run in CI.

9. **Visual Regression**: 1% pixel threshold, per-browser baselines in `baselines/{browser}/`.
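
For the test data isolation rule in note 2, a small naming helper is usually enough. The sketch below uses the standard `uuid` module; the prefix format and the 8-character suffix length are assumptions.

```python
import uuid


def unique_name(prefix: str) -> str:
    """Return `prefix` plus a short random suffix, e.g. for a test server name."""
    return f"{prefix}-{uuid.uuid4().hex[:8]}"


def is_test_entity(name: str, prefix: str) -> bool:
    """Recognize names produced by unique_name(), e.g. for cleanup sweeps."""
    head, _, suffix = name.rpartition("-")
    return head == prefix and len(suffix) == 8 and all(
        ch in "0123456789abcdef" for ch in suffix
    )
```

Cleanup fixtures can use `is_test_entity` to remove leftovers from interrupted runs without touching entities that tests did not create.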

---

## Open Questions (Resolved)

1. **data-testid**: Add incrementally as tests are written
2. **Stub Services**: Use skip-with-warning; optionally add stub fixtures
3. **A11y vs UI Changes**: Mark as `xfail` until UI changes implemented (#2275, #2480)
4. **Multi-Browser Scope**: Full matrix for smoke/visual; single browser for full suite
5. **SSE Fixtures**: Create fixtures that call APIs to generate observable events
6. **Password Change Tests**: Separate `test-auth-enforcement.env` profile
7. **Feature Detection Auth**: `enabled_features` depends on `authenticated_page` fixture
