`/review` produces a high false-positive rate when applied to mature frameworks (Django) #1539

# Description

Hi! Thanks for gstack. It has been a core part of my Claude Code workflow
for two consecutive sprints. I wanted to share data on the `/review`
false-positive rate that might be useful for tuning the prompt.

## Context

- Project: Django + DRF + PostgreSQL full-stack app (PoC for construction
  site management)
- Sprint 2 (`/ship` adversarial review): 1+ false positive (Finding #8)
- Sprint 2.5 (`/review`): 4 false positives out of 8 findings (50%)

Both runs raised concerns that were resolvable in under 5 minutes by
inspecting the relevant code or model definitions, suggesting the
adversarial review prompt may be raising hypotheses without first applying
basic self-verification.

## Specific examples from Sprint 2.5

The four false positives all fit a pattern: **"resolvable in <5 minutes by
viewing the actual code or running a simple grep"**.

| # | Concern raised | Resolution |
|---|---|---|
| FP-1 | `dict.get()` might be None-unsafe | Django form's `cleaned_data` is `{}`-initialized — visible by reading the form code |
| FP-2 | `rental.save()` might lose fields | Standard Django ORM INSERT behavior for unsaved instances |
| FP-3 | `update_fields` might miss `updated_at` | Field doesn't exist on the model — `grep` resolves immediately |
| FP-4 | 3 classes might be asymmetric | Mechanical comparison shows they're symmetric |
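
To make "resolvable by a simple grep" concrete, here is a minimal sketch of the kind of mechanical check that settles FP-3 and FP-4. Everything in it is an assumption for illustration: `MODEL_SOURCE`, the field names, and the three `*Flow` classes are hypothetical stand-ins, not the project's real code, and "symmetric" is taken here to mean "same public method names".

```python
import ast

# Hypothetical model source, standing in for the project's real models.py.
MODEL_SOURCE = """
class Rental:
    site = "ForeignKey"
    status = "CharField"
    created_at = "DateTimeField"
"""


def class_attributes(source, class_name):
    """Return the names assigned directly in the body of `class_name`."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            names = set()
            for stmt in node.body:
                if isinstance(stmt, ast.Assign):
                    names.update(
                        t.id for t in stmt.targets if isinstance(t, ast.Name)
                    )
            return names
    raise LookupError(f"class {class_name!r} not found")


def public_methods(cls):
    """Names of public callables defined directly on `cls`."""
    return {
        name
        for name, value in vars(cls).items()
        if callable(value) and not name.startswith("_")
    }


# Hypothetical trio of classes, standing in for the three compared in FP-4.
class CreateFlow:
    def validate(self): ...
    def apply(self): ...


class UpdateFlow:
    def validate(self): ...
    def apply(self): ...


class CancelFlow:
    def validate(self): ...
    def apply(self): ...


# FP-3: does the model define `updated_at`? This answers it immediately.
has_updated_at = "updated_at" in class_attributes(MODEL_SOURCE, "Rental")

# FP-4: do the three classes expose the same public method names?
symmetric = (
    public_methods(CreateFlow)
    == public_methods(UpdateFlow)
    == public_methods(CancelFlow)
)

print(has_updated_at, symmetric)  # prints: False True
```

In practice a plain `grep updated_at models.py` does the same job for FP-3. The point is only that both questions have a cheap, deterministic answer.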

## What was correctly identified (true positives)

The same `/review` run also correctly identified:

- F-1: Test calling `rental.save()` directly bypasses `save_model` mutation
  testing (real coverage gap)
- F-3: Django 4.1+ `_post_clean()` calls `validate_constraints()` before
  `save_model`, causing the UniqueConstraint to fire before the ServiceLayer
  can resolve the conflict (real bug, surfaced only via browser QA)

These were valuable findings — they required cross-layer reasoning
(test/admin/form/DB) and were not resolvable by simple code inspection.
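
The coverage gap in F-1 can be shown with a toy sketch. The classes below are plain-Python stand-ins, not Django's real `Model` or `ModelAdmin`, and the `status` mutation is hypothetical. Only the `save_model(request, obj, form, change)` signature mirrors Django's admin hook.

```python
class Rental:
    """Stand-in for a model instance (not a real Django model)."""

    def __init__(self):
        self.status = "draft"
        self.saved = False

    def save(self):
        # Persists whatever state the object has right now.
        self.saved = True


class RentalAdmin:
    """Stand-in for a ModelAdmin whose save_model mutates before saving."""

    def save_model(self, request, obj, form, change):
        obj.status = "confirmed"  # hypothetical mutation the test should cover
        obj.save()


# A test that calls save() directly never runs the admin-level mutation.
direct = Rental()
direct.save()

# Going through save_model exercises the mutation as the admin would.
via_admin = Rental()
RentalAdmin().save_model(None, via_admin, None, False)

print(direct.status, via_admin.status)  # prints: draft confirmed
```

Both objects end up saved, so a test asserting only on persistence passes either way. That is why this kind of gap is hard to see from a single file.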

## Suggested improvement

What worked for me was adding a self-check **before** reporting findings:

> Before reporting a finding, ask:
> 1. Can I resolve this by `view`-ing 1-2 files in under 5 minutes?
>    - Yes → resolve it, don't report it
>    - No → report it with Y-10 evidence
> 2. Is this reproducible by existing pytest tests?
>    - Yes → likely already covered, re-check before flagging
>    - No → likely a real-environment issue, worth flagging
> 3. Is the answer in the framework's surface-level documentation?
>    - Yes → skip (or just cite the docs)
>    - No → genuine internal-behavior concern, worth flagging

I added this as a project-level rule in my `CLAUDE.md`. I will measure
false-positive rates from Sprint 3 onward to validate it.
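
The self-check above can also be sketched as a small decision function. This is an illustration only: the `Finding` fields, the returned labels, and the choice to evaluate the three questions in order are my assumptions, not part of gstack or of the rule text, and the flag values in the usage lines are illustrative guesses.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    title: str
    resolvable_by_reading_code: bool      # question 1
    reproducible_by_existing_tests: bool  # question 2
    answered_by_framework_docs: bool      # question 3


def gate(finding):
    """Decide what to do with a candidate finding before it is reported."""
    if finding.resolvable_by_reading_code:
        return "resolve"   # check it yourself, do not report it
    if finding.reproducible_by_existing_tests:
        return "recheck"   # likely already covered, re-check before flagging
    if finding.answered_by_framework_docs:
        return "skip"      # or just cite the docs
    return "report"        # genuine concern, report with evidence


# Illustrative flag values for two findings from this issue.
fp3 = Finding("update_fields might miss updated_at", True, False, False)
f3 = Finding("UniqueConstraint fires before ServiceLayer", False, False, False)

print(gate(fp3), gate(f3))  # prints: resolve report
```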

## Question

Would gstack be open to:
- (a) Adding a "self-verification gate" to the `/review` prompt before
  finding generation, or
- (b) Documenting this pattern in gstack docs (e.g., as a known caveat
  with Django/mature frameworks)?

Happy to contribute either way. Let me know if you'd like to see the full
context (sprint retrospective notes are public in my repo).

Happy hacking 🙏

