feat: workspace PDF and DOCX parser plugins

## Problem
The workspace knowledge base only indexes text, markdown, and code files. PDF and DOCX are common document formats, but they are not indexed today.

## Research
- **PrivateGPT**: PDFReader (llama-index), DocxReader. Supports PDF, DOCX, EPUB, PPTX, images (OCR), IPython notebooks
- **Khoj**: PyMuPDF (fitz) for PDF with pdfplumber fallback, python-docx for DOCX
- **LocalGPT**: PDFMinerLoader or UnstructuredPDFLoader

## Proposed approach
Add two new parser plugins following the existing workspace plugin contract:

### PDF parser: `plugins/workspace/parsers/pdf/`
- Use PyMuPDF (fitz) as primary, pdfplumber as fallback
- Extract text per page with page markers
- Optional dep: `pip install pymupdf` (plus `pip install pdfplumber` if the fallback is wanted)
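
A minimal sketch of the per-page extraction described above. The function names (`format_pages`, `read_pdf_pages`, `parse_pdf`) and the `[page N]` marker format are hypothetical and not part of the existing plugin contract. It also assumes that "fallback" means using pdfplumber when PyMuPDF is not installed; falling back on extraction errors would need extra handling.

```python
from typing import Iterable, List


def format_pages(pages: Iterable[str]) -> str:
    """Join per-page text, prefixing each page with a marker line.

    The "[page N]" marker format is an assumption for illustration.
    """
    chunks = []
    for number, text in enumerate(pages, start=1):
        chunks.append(f"[page {number}]\n{text.strip()}")
    return "\n\n".join(chunks)


def read_pdf_pages(path: str) -> List[str]:
    """Return one text string per page, trying PyMuPDF first, then pdfplumber."""
    try:
        import fitz  # PyMuPDF, optional dependency
    except ImportError:
        fitz = None

    if fitz is not None:
        with fitz.open(path) as doc:
            return [page.get_text() for page in doc]

    try:
        import pdfplumber  # optional fallback dependency
    except ImportError as exc:
        raise RuntimeError(
            "PDF parsing needs either pymupdf or pdfplumber installed"
        ) from exc

    with pdfplumber.open(path) as pdf:
        # extract_text() may return None for pages with no text layer.
        return [page.extract_text() or "" for page in pdf.pages]


def parse_pdf(path: str) -> str:
    """Extract the text of a PDF with a marker before each page."""
    return format_pages(read_pdf_pages(path))
```

Keeping the marker formatting separate from the library calls means the output format can be tested without either optional dependency installed.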

### DOCX parser: `plugins/workspace/parsers/docx/`
- Use python-docx library
- Extract paragraph text
- Optional dep: `pip install python-docx`
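
A minimal sketch of paragraph extraction with python-docx. The names `join_paragraphs` and `parse_docx` are hypothetical, and dropping empty paragraphs is an assumption rather than something this issue specifies.

```python
from typing import Iterable


def join_paragraphs(paragraphs: Iterable[str]) -> str:
    """Join non-empty paragraphs with a blank line between them."""
    kept = [p.strip() for p in paragraphs if p.strip()]
    return "\n\n".join(kept)


def parse_docx(path: str) -> str:
    """Extract paragraph text from a DOCX file."""
    try:
        from docx import Document  # python-docx, optional dependency
    except ImportError as exc:
        raise RuntimeError("DOCX parsing needs python-docx installed") from exc

    document = Document(path)
    return join_paragraphs(p.text for p in document.paragraphs)
```

Note that `document.paragraphs` covers body paragraphs only; text inside tables would need to be walked separately if it should be indexed.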

Update `BINARY_SUFFIXES` so it no longer contains `.pdf` and `.docx` (binary but parseable). Add a `workspace-docs` optional dependency group in `pyproject.toml`.
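
The suffix change could look roughly like the sketch below. The set contents and the `should_skip` helper are hypothetical; the real `BINARY_SUFFIXES` in the workspace code may hold different entries and be consulted differently.

```python
from pathlib import Path

# Hypothetical contents for illustration; ".pdf" and ".docx" are
# deliberately absent so the new parser plugins can pick those files up.
BINARY_SUFFIXES = frozenset({".png", ".jpg", ".zip", ".exe"})


def should_skip(path: str) -> bool:
    """Return True if the file is treated as unparseable binary content."""
    return Path(path).suffix.lower() in BINARY_SUFFIXES
```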

## Related
Part of workspace foundation (#5840)


feat: workspace PDF and DOCX parser plugins #5850
