SEP-2484: Require Conformance Tests for Standards Track SEPs to Reach Final Status #2484
**What's required:**

- A conformance scenario tagged with the SEP number, targeting the `draft` spec version
- A structured traceability file (`sep-NNNN.yaml`) mapping each MUST/MUST NOT in the SEP's Specification section to either a check ID or a documented exclusion with a linked tracking issue
Should SHOULD/SHOULD NOT be mapped to a "warning" (not "failure") status? (Maybe a bigger question for the conformance suite at large.)
Yeah, that is how we've been representing them in conformance. I initially left out mapping them as a requirement here to keep the burden small, but on reflection it makes sense to require mapping them too. They place a burden on implementors, so it makes sense for SEP writers to carry the burden of testing them.
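The keyword-to-status mapping discussed in this thread can be sketched as follows. This is a minimal illustration, not the conformance suite's actual API: the function name `status_for` and the status strings are assumptions, and only the RFC 2119 keyword groupings are taken as given.

```python
# Hypothetical mapping from RFC 2119 keywords to reported check status.
# Names and status strings are illustrative, not the conformance suite's API.
FAILURE_KEYWORDS = {"MUST", "MUST NOT", "SHALL", "SHALL NOT", "REQUIRED"}
WARNING_KEYWORDS = {"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED"}


def status_for(keyword: str, passed: bool) -> str:
    """Return the status to report for one check result."""
    if passed:
        return "SUCCESS"
    # Normalize case and internal whitespace before the lookup.
    normalized = " ".join(keyword.upper().split())
    if normalized in FAILURE_KEYWORDS:
        return "FAILURE"
    if normalized in WARNING_KEYWORDS:
        return "WARNING"
    # MAY/OPTIONAL and unknown words have no status in this sketch.
    raise ValueError(f"not a testable RFC 2119 keyword: {keyword!r}")
```

Under this sketch, a violated SHOULD surfaces in the report without failing the run, while a violated MUST fails it.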
Exclusions come in two flavors. **Framework gaps** (the behavior is observable but the framework can't express it yet) should link a tracking `issue`. **Not protocol-observable** (the requirement governs client rendering, implementation internals, or similar) needs only the `excluded` reason. A SEP whose requirements are all the second kind is exempt and doesn't need a scenario at all.
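The two exclusion flavors could be checked mechanically along these lines. This is a sketch under stated assumptions: the `check`, `excluded`, and `issue` keys come from the text quoted above, while the `framework_gap` flag and both function names are hypothetical and not part of the proposed file format.

```python
from typing import Any


def validate_row(row: dict[str, Any]) -> list[str]:
    """Return the problems found in one traceability row (empty if valid)."""
    problems = []
    has_check = bool(row.get("check"))
    has_exclusion = bool(row.get("excluded"))
    if has_check == has_exclusion:
        problems.append("row needs exactly one of 'check' or 'excluded'")
    # 'framework_gap' is a hypothetical flag marking the first exclusion flavor.
    if has_exclusion and row.get("framework_gap") and not row.get("issue"):
        problems.append("framework-gap exclusion should link a tracking 'issue'")
    return problems


def needs_scenario(rows: list[dict[str, Any]]) -> bool:
    """A SEP is exempt only if every row is a not-protocol-observable exclusion."""
    return any(row.get("check") or row.get("framework_gap") for row in rows)
```

A row excluded as not protocol-observable passes with only its `excluded` reason, whereas a framework-gap row without an `issue` is flagged.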
The sponsor verifies the traceability file is complete: every MUST and MUST NOT (and RFC 2119 equivalents: SHALL, REQUIRED) in the SEP's Specification section has a row. SHOULD and MAY requirements do not need rows, although it is recommended to include SHOULD's a WARNING check. The sponsor does not review test code; that is the conformance repository's normal PR review. What counts as a normative requirement is the sponsor's call.
The sponsor verifies the traceability file is complete: every MUST and MUST NOT (and RFC 2119 equivalents: SHALL, REQUIRED) in the SEP's Specification section has a row. SHOULD and MAY requirements do not need rows, although it is recommended to include SHOULDs as a WARNING check. The sponsor does not review test code; that is the conformance repository's normal PR review. What counts as a normative requirement is the sponsor's call.
I'd be in favor of upgrading "recommended to include SHOULDs" to "required", since many protocol behaviors end up being SHOULDs. But I am OK leaving this up to the discretion of the author or sponsor.
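As a first pass before the sponsor's completeness review, the uppercase keywords that require rows could be listed with a small helper. This is only a rough aid under stated assumptions: the function name is hypothetical, the keyword list covers the MUST-level terms named above, and what counts as a normative requirement remains the sponsor's call.

```python
import re

# Longer alternatives come first so "MUST NOT" is not matched as plain "MUST".
ROW_REQUIRED = re.compile(r"\b(MUST NOT|SHALL NOT|MUST|SHALL|REQUIRED)\b")


def find_row_keywords(spec_text: str) -> list[str]:
    """List, in order, the uppercase keywords that would each need a row."""
    return ROW_REQUIRED.findall(spec_text)
```

The number of keywords found gives a quick figure to compare against the number of rows in the traceability file; if SHOULD rows become required, the pattern would need the SHOULD-level terms as well.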
|
I'm supportive of this! It will be a great way to systematize the spec, and introducing it into the SEP process is the right place to start.
State Transition: proposal → draft
This SEP has been transitioned from proposal to draft. @pcarleton has been assigned as the sponsor for this SEP. This is an automated message from the SEP lifecycle bot.
|
I am in favor of this. My main worry, as with reference implementations, is tracking that SEPs have it.
mikekistler
left a comment
I think this would be a great improvement to our spec evolution process.
**What's required:**

- A conformance scenario tagged with the SEP number, targeting the `draft` spec version
There was some discussion about this in the transports-wg last week, and there was some concern about using "draft" as the spec version, since it will mean different things at different times. The suggestion there was to use something like "2026-06-XX-draft", which more specifically means "the draft that should eventually become 2026-06-XX".
- A conformance scenario tagged with the SEP number, targeting the `draft` spec version
- A structured traceability file (`sep-NNNN.yaml`) mapping each MUST/MUST NOT in the SEP's Specification section to either a check ID or a documented exclusion with a linked tracking issue
- The scenario passes against the SEP's reference implementation
I've recently been trying to adopt this proposal for SEP-2243, and I found that to get this to work fully both the SDK and the conformance tests need to treat "draft" (or whatever we wind up calling the next protocol version) as an official and latest protocol version. This is needed to get the client tests to use this version and thus be subject to the requirements of this protocol version.
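The point about treating the draft as a negotiable version can be illustrated with a minimal negotiation sketch. It assumes the usual behavior of echoing a supported requested version and otherwise offering the newest supported one; `DRAFT_VERSION` is a placeholder because the exact draft tag is still under discussion, and the dated strings are example values only.

```python
# DRAFT_VERSION is a placeholder; the real draft tag is not settled here.
DRAFT_VERSION = "DRAFT-PLACEHOLDER"
# Example list, newest first. The draft must be present and listed first so
# that client tests negotiate it and fall under its requirements.
SUPPORTED_VERSIONS = [DRAFT_VERSION, "2025-06-18", "2025-03-26"]


def negotiate(requested: str, supported: list[str] = SUPPORTED_VERSIONS) -> str:
    """Echo the requested version if supported, else offer the newest one."""
    return requested if requested in supported else supported[0]
```

If the draft tag were missing from the list, a client requesting it would be handed an older version and the draft-only checks would never apply.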
|
This was accepted by Core Maintainer vote with 10/10 voting for Accept.
Addresses review feedback: traceability files must now cover SHOULD/SHOULD NOT (reported as warnings), scenarios target a YYYY-MM-draft tag rather than a bare draft, and the harness/SDK must recognize that tag as a negotiable protocol version.
|
New commits were pushed — removed the |
…ion wording
- Drop 'co-located' (yamls live in a single conformance directory now)
- YAML example: `check:` key first on check rows, blank line before `excluded`
- Replace YYYY-MM-draft references with 'the conformance repository's draft spec-version tag' so the conformance repo owns the exact string
- sep-guidelines: tracking issue only required for framework-gap exclusions, not all exclusions
Regenerated docs/seps/index.mdx to resolve the conflict from new SEPs landing on main.
|
/lgtm
Adds a conformance test requirement to the `Accepted → Final` transition for Standards Track SEPs.
Summary
Before a Standards Track SEP that changes observable protocol behavior can be marked `Final`:
- A traceability file (`sep-NNNN.yaml`) mapping each MUST/MUST NOT to a check or a documented exclusion
Process and Informational SEPs are exempt, as are Standards Track SEPs with no observable protocol behavior.
Why
SEP-1730's SDK tiering depends on conformance tests, but nothing keeps the suite synchronized with the spec. This ties test creation to the SEP lifecycle so the suite grows exactly as fast as the spec does.
Supersedes
SEP-1627 (Conformance Testing)
Changes
- `seps/0000-*.md`: the SEP itself (number will be updated to match this PR)
- `docs/community/sep-guidelines.mdx`: adds conformance test gate to the `Accepted → Final` workflow, updates flowchart and status table