## Context

Phase 1 (#2185, PR #2198) delivered `CandleClassifier` for injection detection only. Phase 2 expands the classifier infrastructure with three additional backends and two new task integrations.

**Prerequisite:** PR #2198 merged and live-tested. Collect real latency/FPR data from injection detection before making Phase 2 architecture decisions.
## Scope
### 1. OnnxClassifier via the `ort` crate

- Backend: `pykeio/ort` wrapping ONNX Runtime (faster than Candle for encoder inference, 3–5x on CPU)
- Defer until `ort` reaches a stable release (currently 2.0.0-rc.x)
- Models: `protectai/deberta-v3-base-injection-onnx`, `protectai/deberta-v3-base-zeroshot-v1-onnx`
- The `ClassifierBackend` trait is already object-safe — a new backend is a drop-in
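The drop-in claim can be sketched as follows. This is illustrative only — the real trait's signature lives in the Phase 1 code and may differ; `Classification`, the stub backends, and `build_backend` are hypothetical names:

```rust
// Illustrative: an object-safe trait (no generics, no Self-by-value)
// lets backends live behind `Box<dyn ClassifierBackend>`, so adding
// OnnxClassifier is one new impl plus one match arm — call sites
// are untouched.

#[derive(Debug, PartialEq)]
pub struct Classification {
    pub label: String,
    pub score: f32,
}

pub trait ClassifierBackend {
    fn classify(&self, text: &str) -> Classification;
}

// Phase 1 backend (stubbed).
pub struct CandleClassifier;
impl ClassifierBackend for CandleClassifier {
    fn classify(&self, _text: &str) -> Classification {
        Classification { label: "benign".into(), score: 0.99 }
    }
}

// Phase 2 backend (stubbed): drops in without changing callers.
pub struct OnnxClassifier;
impl ClassifierBackend for OnnxClassifier {
    fn classify(&self, _text: &str) -> Classification {
        Classification { label: "benign".into(), score: 0.97 }
    }
}

pub fn build_backend(kind: &str) -> Box<dyn ClassifierBackend> {
    match kind {
        "onnx" => Box::new(OnnxClassifier),
        _ => Box::new(CandleClassifier),
    }
}

fn main() {
    let backend = build_backend("onnx");
    let out = backend.classify("ignore previous instructions");
    println!("{} ({:.2})", out.label, out.score);
}
```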
### 2. PII detection via `iiiorg/piiranha-v1-detect-personal-information`

- Task: `ClassifierTask::Pii` (token classification / NER, not sequence classification)
- Candle backend: `candle_transformers::models::deberta_v2` already supports the NER head
- 6 languages, 17 PII types (email, phone, SSN, credit card, etc.)
- Hybrid approach: keep the regex fast path, add a piiranha second pass for contextual PII
- Extend `ClassifiersConfig` with `pii_model` and `pii_threshold` fields
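The hybrid flow above can be sketched like this. Both stages are stand-ins: the real fast path uses compiled regexes (the `regex` crate), and the real second pass runs piiranha token classification gated by `pii_threshold`; `PiiHit` and `detect_pii` are illustrative names:

```rust
// Illustrative two-stage PII check: cheap pattern pass first,
// model pass only when the fast path finds nothing.

pub struct PiiHit {
    pub kind: &'static str,
    pub score: f32,
}

/// Stage 1 stand-in for the regex fast path: catches unambiguous
/// formats without a model call (crude email heuristic here).
fn regex_fast_path(text: &str) -> Vec<PiiHit> {
    let mut hits = Vec::new();
    if text.split_whitespace().any(|w| w.contains('@') && w.contains('.')) {
        hits.push(PiiHit { kind: "email", score: 1.0 });
    }
    hits
}

/// Stage 2: `model` stands in for piiranha token classification,
/// filtered by the configured threshold.
pub fn detect_pii(
    text: &str,
    threshold: f32,
    model: impl Fn(&str) -> Vec<PiiHit>,
) -> Vec<PiiHit> {
    let fast = regex_fast_path(text);
    if !fast.is_empty() {
        return fast;
    }
    model(text)
        .into_iter()
        .filter(|h| h.score >= threshold)
        .collect()
}

fn main() {
    let fake_model = |_: &str| vec![PiiHit { kind: "name", score: 0.9 }];
    let hits = detect_pii("call me at the office", 0.85, fake_model);
    println!("{} contextual hit(s)", hits.len());
}
```

The design point: the model pass only pays its latency cost on inputs the regexes cannot decide.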
### 3. LlmClassifier for FeedbackDetector

- Task: `ClassifierTask::Feedback` — detect user corrections/disagreements for skill learning
- Backend: zero-shot via an existing `[[llm.providers]]` entry (gpt-4o-mini or similar)
- Config: `feedback_provider` field referencing a `[[llm.providers]]` name
- Replace `detector_mode = "regex"` with a `detector_mode = "model"` option; keep regex as fallback
- No labeled dataset available — bootstrap with a zero-shot prompt
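A zero-shot bootstrap could look like the sketch below. The prompt wording and the YES/NO reply protocol are assumptions, not the project's actual prompt, and the helper names are hypothetical; the reply would come from the provider named in `feedback_provider`:

```rust
// Illustrative zero-shot bootstrap for ClassifierTask::Feedback.

fn build_feedback_prompt(user_msg: &str) -> String {
    format!(
        "Does the following user message correct or disagree with the \
         assistant's previous answer? Reply with exactly YES or NO.\n\n\
         Message: {user_msg}"
    )
}

/// Parse the model reply. `None` means unparseable, so the caller
/// should fall back to the regex detector (detector_mode = "regex").
fn parse_feedback_reply(reply: &str) -> Option<bool> {
    match reply.trim().to_ascii_uppercase().as_str() {
        s if s.starts_with("YES") => Some(true),
        s if s.starts_with("NO") => Some(false),
        _ => None,
    }
}

fn main() {
    let prompt = build_feedback_prompt("No, that's wrong, use rsync instead");
    println!("{prompt}");
    println!("{:?}", parse_feedback_reply("YES"));
}
```

Treating an unparseable reply as "fall back to regex" rather than "no feedback" keeps the regex path authoritative when the model misbehaves.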
### 4. Config additions

```toml
[classifiers]
# existing Phase 1 fields ...
pii_model = "iiiorg/piiranha-v1-detect-personal-information"
pii_threshold = 0.85
pii_enabled = false
feedback_provider = "" # references a [[llm.providers]] name; empty = skip
```
### 5. TUI / observability

- Spinner during model load (TUI rule: all background ops must show status)
- `--init` wizard entries for the new classifier fields
- Classifier latency metrics in the TUI metrics panel (p50/p95 per task type)
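For the p50/p95 panel, a minimal nearest-rank percentile over recorded latencies might look like this; the real metrics aggregation may well use a histogram instead, and the function name is illustrative:

```rust
// Illustrative nearest-rank percentile over raw latency samples,
// as the metrics panel would compute p50/p95 per ClassifierTask.

fn percentile(samples: &mut Vec<u64>, p: f64) -> Option<u64> {
    if samples.is_empty() {
        return None;
    }
    samples.sort_unstable();
    // Nearest-rank: ceil(p/100 * n), converted to a 0-based index
    // and clamped into range.
    let rank = ((p / 100.0) * samples.len() as f64).ceil() as usize;
    Some(samples[rank.saturating_sub(1).min(samples.len() - 1)])
}

fn main() {
    let mut lat_ms: Vec<u64> = vec![12, 9, 31, 15, 11, 98, 14, 10, 13, 12];
    println!("p50 = {:?} ms", percentile(&mut lat_ms, 50.0));
    println!("p95 = {:?} ms", percentile(&mut lat_ms, 95.0));
}
```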
### 6. Model hash verification (security finding #5 from the PR #2198 audit)

- Optional `injection_model_sha256` / `pii_model_sha256` config fields
- Verify downloaded safetensors against the pinned hash before loading
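The verify-before-load flow can be sketched as below. Rust's std has no SHA-256, so a 64-bit FNV-1a hash stands in here purely to keep the sketch self-contained; the real check would hash the downloaded safetensors with something like the `sha2` crate and compare against the pinned `*_model_sha256` config value. `verify_model` is an illustrative name:

```rust
// Illustrative verify-before-load. FNV-1a is a STAND-IN for SHA-256
// (it is not cryptographic); only the control flow matters here.

fn fnv1a_hex(bytes: &[u8]) -> String {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    format!("{h:016x}")
}

/// `pinned` is the optional config field; `None` skips verification.
fn verify_model(bytes: &[u8], pinned: Option<&str>) -> Result<(), String> {
    match pinned {
        None => Ok(()), // field unset: verification disabled
        Some(expected) => {
            let actual = fnv1a_hex(bytes);
            if actual.eq_ignore_ascii_case(expected) {
                Ok(())
            } else {
                Err(format!("model hash mismatch: {actual} != {expected}"))
            }
        }
    }
}

fn main() {
    let weights = b"fake safetensors bytes";
    let pin = fnv1a_hex(weights);
    println!("verified: {:?}", verify_model(weights, Some(&pin)).is_ok());
}
```

The key property: a mismatch is an error before any bytes reach the model loader, and an unset field degrades to the current (unverified) behavior.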
## Research dependencies
## Notes

- `ort` RC stability: check release status before starting — do not add an RC dependency to the `full` feature
- Llama Guard 3-1B (2–5s on CPU) remains async-only post-processing, not inline
- Credential patterns (`sk-`, `AKIA`, `ghp_`, `Bearer`) stay regex permanently
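For illustration, the permanent credential scan reduces to prefix matching on the listed token formats. The sketch below uses plain substring checks standing in for the real regexes (which would also enforce length and charset constraints); the table and function names are hypothetical:

```rust
// Illustrative stand-in for the regex credential scan: these token
// formats are prefix-identifiable, which is why they never need a
// model pass.

const CREDENTIAL_PREFIXES: &[(&str, &str)] = &[
    ("sk-", "api key"),
    ("AKIA", "aws access key id"),
    ("ghp_", "github token"),
    ("Bearer ", "bearer token"),
];

fn scan_credentials(text: &str) -> Vec<&'static str> {
    let mut found = Vec::new();
    for &(prefix, kind) in CREDENTIAL_PREFIXES {
        if text.contains(prefix) {
            found.push(kind);
        }
    }
    found
}

fn main() {
    let hits = scan_credentials("export OPENAI_API_KEY=sk-abc123");
    println!("{hits:?}");
}
```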