epic: Python Security Testing & Fuzzing Initiative for PowerPoint Skill #1012

## Overview

Establish a comprehensive security testing and fuzzing strategy for the PowerPoint skill Python codebase (~4,700 lines across 14 modules under `.github/skills/experimental/powerpoint/scripts/`).

This initiative implements **Scenario D (Hypothesis + pip-audit)** from the ClusterFuzzLite research evaluation and relies on the existing CodeQL Python analysis for SAST coverage. Scenario D was selected because it offers the highest value-to-effort ratio given the codebase's size and structured inputs.

## Background

A thorough security-focused analysis of the codebase identified five security findings (1 CRITICAL, 3 HIGH, 1 MODERATE), a complete absence of property-based or fuzz testing (all 300+ tests are deterministic), and no dependency CVE scanning. ClusterFuzzLite was evaluated but rejected as the primary approach for three reasons: a mismatch with the codebase's structured inputs, incompatibility between the Python >=3.11 requirement and the default base image, and disproportionate setup complexity.

## Three-Phase Implementation

### Phase 1: Hypothesis Property Tests (High Priority)
Add `hypothesis>=6.100` to dev dependencies and write property tests targeting priority modules:
- `validate_slides.py` / `validate_deck.py` — input validation robustness
- `build_deck.py` — element builder dispatch with arbitrary element definitions
- `pptx_colors.py` — hex color parsing edge cases
- `pptx_tables.py` — merge bounds and out-of-range handling

### Phase 2: pip-audit Dependency CVE Scanning (High Priority)
Add a `pip-audit` CI step that scans the `pyproject.toml` dependencies (python-pptx, pyyaml, pymupdf, lxml) for known CVEs using open vulnerability databases (PyPI Advisory Database, OSV).
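
If the CI step emits a machine-readable report (for example via `pip-audit --format json`), a small helper could turn it into a readable summary for the job log. The sketch below assumes the report is an object with a `dependencies` list whose entries carry `name`, `version`, and `vulns`; that shape should be checked against the pip-audit version actually pinned. `summarize_audit` and the sample data are hypothetical, and the advisory ID is a placeholder rather than a real CVE.

```python
import json


def summarize_audit(report_json: str) -> list[str]:
    """Return one summary line per vulnerability in a pip-audit style JSON report."""
    report = json.loads(report_json)
    findings = []
    for dep in report.get("dependencies", []):
        for vuln in dep.get("vulns", []):
            fixes = ", ".join(vuln.get("fix_versions", [])) or "none listed"
            findings.append(
                f"{dep['name']} {dep.get('version', '?')}: {vuln['id']} (fix: {fixes})"
            )
    return findings


# Made-up sample in the assumed report shape; package names and ID are placeholders.
SAMPLE_REPORT = json.dumps(
    {
        "dependencies": [
            {
                "name": "examplepkg",
                "version": "1.0.0",
                "vulns": [{"id": "EXAMPLE-0001", "fix_versions": ["1.0.1"]}],
            },
            {"name": "cleanpkg", "version": "2.0.0", "vulns": []},
        ]
    }
)

for line in summarize_audit(SAMPLE_REPORT):
    print(line)
```

A CI job could fail the build whenever the returned list is non-empty, which keeps the pass/fail policy in one reviewable place.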

### Phase 3: OSSF Scorecard Fuzzing Compliance (Medium Priority)
Add a thin Atheris wrapper using the polyglot pattern so that `import atheris` is detectable by OSSF Scorecard's Fuzzing check. Hypothesis alone scores 0/10 since Scorecard only recognizes `import atheris` for Python.
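
One possible shape for such a thin wrapper is sketched below. `parse_hex_color` is again a hypothetical target standing in for a real function from the scripts directory. `atheris.Setup` and `atheris.Fuzz` are the standard Atheris entry points; the import is placed inside `main()` here only so the module loads without Atheris installed, and whether Scorecard's check matches an import in that position is an assumption worth verifying.

```python
import re
import sys

_HEX_RE = re.compile(r"#?[0-9a-fA-F]{6}")


def parse_hex_color(value: str) -> str:
    """Hypothetical fuzz target; stands in for a real parser in the skill."""
    if _HEX_RE.fullmatch(value) is None:
        raise ValueError(f"invalid hex color: {value!r}")
    return value.lstrip("#").lower()


def TestOneInput(data: bytes) -> None:
    """Fuzzer entry point: ValueError is an expected rejection, anything else is a finding."""
    text = data.decode("utf-8", errors="replace")
    try:
        parse_hex_color(text)
    except ValueError:
        pass


def main() -> None:
    # Imported lazily so the module still loads where Atheris is not installed.
    import atheris

    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()
```

A real harness file would call `main()` under an `if __name__ == "__main__":` guard. Hypothesis also exposes `fuzz_one_input` on `@given`-decorated tests, so `TestOneInput` could delegate to an existing property test instead of duplicating the target logic.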

## Security Findings to Address

| Severity | Finding | Location |
|----------|---------|----------|
| CRITICAL | Arbitrary code execution via `importlib` | `build_deck.py` |
| HIGH | XML parsing (XXE vector) via `lxml.etree.fromstring()` | `extract_content.py` |
| HIGH | Untrusted binary blob writes | `extract_content.py` |
| HIGH | PyMuPDF C extension attack surface | `export_slides.py`, `render_pdf_images.py` |
| MODERATE | Recursive processing without depth limits | Multiple modules |
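
For the CRITICAL finding, one common mitigation is to restrict dynamic imports to an explicit allowlist. The sketch below is only an illustration of that idea: how `build_deck.py` actually uses `importlib` is not described in this issue, and both `ALLOWED_MODULES` and `load_allowed_module` are hypothetical names.

```python
import importlib
from types import ModuleType

# Hypothetical allowlist; the real set of permitted modules is not given in this issue.
ALLOWED_MODULES = frozenset({"json", "math"})


def load_allowed_module(name: str) -> ModuleType:
    """Import `name` only if it is on the allowlist; reject everything else."""
    if name not in ALLOWED_MODULES:
        raise PermissionError(f"module {name!r} is not on the allowlist")
    return importlib.import_module(name)
```

Rejecting before calling `importlib.import_module` matters because importing a module already executes its top-level code. If the current code loads modules from file paths rather than by name, the equivalent check would need to validate the resolved path against a trusted directory instead.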

## Existing Security Coverage

- **CodeQL**: `security-extended,security-and-quality` query suites for `actions` and `python`; runs on every PR, on demand, and weekly
- **OpenSSF Scorecard**: Weekly runs on Sundays + push to main
- **gitleaks**: Secret scanning (devcontainer-only)

## Sub-Issues

This epic tracks the following work items:
- [ ] Hypothesis property tests for priority modules
- [ ] pip-audit dependency CVE scanning
- [ ] Atheris wrapper for Scorecard compliance
- [ ] importlib code execution remediation (CRITICAL)
- [ ] lxml XXE vector remediation (HIGH)
- [ ] Untrusted blob writes remediation (HIGH)
- [ ] PyMuPDF attack surface assessment (HIGH)
- [ ] Recursive processing depth limits (MODERATE)
- [ ] Contribute Hypothesis detection to ossf/scorecard (upstream)

## Acceptance Criteria

- [ ] All three implementation phases have corresponding issues with clear task-* RPI deliverables
- [ ] All five security findings have corresponding issues with severity labels and remediation guidance
- [ ] Sub-issue relationships are established for dependency tracking
- [ ] Labels are applied consistently across all child issues
