docs: add Effective Tokens (ET) specification #24111
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/4a382732-c858-4cd8-ba39-dd968329980f
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Pull request overview
Adds a new W3C-style reference specification defining Effective Tokens (ET) as a normalized metric for LLM token usage across token classes, model multipliers, and multi-invocation execution graphs.
Changes:
- Introduces normative ET accounting formulas (per-invocation and aggregated across execution graphs).
- Defines required execution-graph node schema and reporting contract (`summary` + `invocations`).
- Adds conformance levels and a compliance testing matrix with `T-ET-NNN` test IDs.
| ### 3.2 Model Multiplier | ||
| The **Copilot Multiplier** (`m`) is a scalar representing the relative computational intensity of a model versus a defined baseline. Its value is model-specific and MUST be disclosed by the implementation. |
Section 3.2 defines the multiplier as the “Copilot Multiplier”, but elsewhere the document describes a generic “Model Multiplier” (and the concept is model-relative, not Copilot-specific). This terminology mismatch can confuse implementers and API consumers; consider standardizing on “Model Multiplier” (and, if needed, note that the JSON field name remains `copilot_multiplier`).
| The **Copilot Multiplier** (`m`) is a scalar representing the relative computational intensity of a model versus a defined baseline. Its value is model-specific and MUST be disclosed by the implementation. | |
| The **Model Multiplier** (`m`, represented as `copilot_multiplier` in JSON payloads) is a scalar representing the relative computational intensity of a model versus a defined baseline. Its value is model-specific and MUST be disclosed by the implementation. |
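A minimal sketch of how a consumer might apply the suggested naming: the JSON field stays `copilot_multiplier`, while code refers to the value as the Model Multiplier `m`. The helper name `model_multiplier` is hypothetical and not defined by the spec.

```python
import json


def model_multiplier(model: dict) -> float:
    """Return the Model Multiplier `m` from a `model` object.

    Hypothetical helper: it only maps the JSON field name
    `copilot_multiplier` onto the spec's symbol `m`.
    """
    # Raises KeyError if the multiplier was not disclosed.
    return float(model["copilot_multiplier"])


# `model` object copied from the worked example in the spec.
model = json.loads('{ "name": "model-a", "copilot_multiplier": 2.0 }')
m = model_multiplier(model)
```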
| ### 5.2 Total Raw Tokens | ||
| ``` | ||
| raw_total_tokens = Σ (I_i + C_i + O_i + R_i) | ||
| ``` |
`raw_total_tokens` is defined in Section 4.1 as a per-invocation value, but Section 5.2 reuses the same identifier for the aggregated total across invocations. Reusing the same symbol for two different quantities makes the spec ambiguous; consider renaming one (e.g., `raw_total_tokens_i` vs `raw_total_tokens_total`) or explicitly distinguishing per-invocation vs aggregated totals in the formulas.
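To illustrate the distinction, here is a small sketch that keeps the two quantities under separate names. The function names follow the renaming suggested above and are not part of the spec; `child_usage` is made-up data, while `root_usage` comes from the worked example.

```python
def raw_total_tokens_i(usage: dict) -> int:
    """Per-invocation raw total: I_i + C_i + O_i + R_i."""
    return (
        usage["input_tokens"]
        + usage["cached_input_tokens"]
        + usage["output_tokens"]
        + usage["reasoning_tokens"]
    )


def raw_total_tokens_total(usages: list) -> int:
    """Aggregated raw total: the sum of the per-invocation totals."""
    return sum(raw_total_tokens_i(u) for u in usages)


# Usage of the `root` invocation from the worked example.
root_usage = {
    "input_tokens": 500,
    "cached_input_tokens": 200,
    "output_tokens": 150,
    "reasoning_tokens": 0,
}
# Hypothetical second invocation, for illustration only.
child_usage = {
    "input_tokens": 100,
    "cached_input_tokens": 0,
    "output_tokens": 50,
    "reasoning_tokens": 0,
}
```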
| A conforming response MUST include a `summary` object alongside the `invocations` array: | ||
| ```json | ||
| { | ||
| "summary": { | ||
| "total_invocations": number, | ||
| "raw_total_tokens": number, | ||
| "base_weighted_tokens": number, | ||
| "effective_tokens": number | ||
| }, |
The `summary` object requires a `base_weighted_tokens` field, but the spec never defines how to aggregate base-weighted tokens across multiple invocations (Section 5 defines `ET_total` and `raw_total_tokens`, but not `base_weighted_tokens_total`). Add a normative definition (e.g., `base_weighted_tokens_total = Σ base_weighted_tokens_i`) so implementations can produce consistent summaries.
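A sketch of what that aggregation could look like, assuming the proposed definition (a plain sum over invocations) and assuming each node already carries its `derived` values. The `summarize` helper and the `child-1` node are hypothetical; the `root` values are hand-computed from the worked example with the default weights, so the totals below do not correspond to Appendix A.

```python
USAGE_FIELDS = (
    "input_tokens",
    "cached_input_tokens",
    "output_tokens",
    "reasoning_tokens",
)


def summarize(invocations: list) -> dict:
    """Build a `summary` object by summing per-invocation values."""
    return {
        "total_invocations": len(invocations),
        "raw_total_tokens": sum(
            n["usage"][f] for n in invocations for f in USAGE_FIELDS
        ),
        # Proposed: base_weighted_tokens_total = Σ base_weighted_tokens_i
        "base_weighted_tokens": sum(
            n["derived"]["base_weighted_tokens"] for n in invocations
        ),
        "effective_tokens": sum(
            n["derived"]["effective_tokens"] for n in invocations
        ),
    }


invocations = [
    {   # `root` from the worked example; derived = 500 + 0.1*200 + 4*150, then x2.0
        "id": "root",
        "usage": {"input_tokens": 500, "cached_input_tokens": 200,
                  "output_tokens": 150, "reasoning_tokens": 0},
        "derived": {"base_weighted_tokens": 1120, "effective_tokens": 2240},
    },
    {   # Hypothetical child node with multiplier 1.0, for illustration only.
        "id": "child-1",
        "usage": {"input_tokens": 100, "cached_input_tokens": 0,
                  "output_tokens": 50, "reasoning_tokens": 0},
        "derived": {"base_weighted_tokens": 300, "effective_tokens": 300},
    },
]
summary = summarize(invocations)
```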
| ```json | ||
| { | ||
| "invocations": [ | ||
| { | ||
| "id": "root", | ||
| "parent_id": null, | ||
| "model": { "name": "model-a", "copilot_multiplier": 2.0 }, | ||
| "usage": { | ||
| "input_tokens": 500, | ||
| "cached_input_tokens": 200, | ||
| "output_tokens": 150, | ||
| "reasoning_tokens": 0 | ||
| } | ||
| }, |
Section 6.1 states each invocation node MUST include a `derived` object with `base_weighted_tokens` and `effective_tokens`, but the worked example input data omits `derived` for every invocation. Either update the example to include the required `derived` fields, or relax the schema (e.g., make `derived` OPTIONAL at lower conformance levels / when only raw usage is provided).
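If the example is updated, the `derived` values can be computed mechanically from the fields it already contains. The sketch below assumes the default weights from the PR description (`ET = m × (I + 0.1C + 4O + 4R)`); the helper name `with_derived` is hypothetical.

```python
import copy

# Default token-class weights, as given in the PR description.
DEFAULT_WEIGHTS = {
    "input_tokens": 1.0,
    "cached_input_tokens": 0.1,
    "output_tokens": 4.0,
    "reasoning_tokens": 4.0,
}


def with_derived(node: dict, weights: dict = DEFAULT_WEIGHTS) -> dict:
    """Return a copy of `node` with a `derived` object attached."""
    usage = node["usage"]
    base = sum(w * usage[field] for field, w in weights.items())
    out = copy.deepcopy(node)  # leave the input node untouched
    out["derived"] = {
        "base_weighted_tokens": base,
        "effective_tokens": node["model"]["copilot_multiplier"] * base,
    }
    return out


# The `root` invocation from the worked example.
root = {
    "id": "root",
    "parent_id": None,
    "model": {"name": "model-a", "copilot_multiplier": 2.0},
    "usage": {
        "input_tokens": 500,
        "cached_input_tokens": 200,
        "output_tokens": 150,
        "reasoning_tokens": 0,
    },
}
root_with_derived = with_derived(root)
```

For `root` this gives a base-weighted value of 500 + 20 + 600 = 1120 and, with multiplier 2.0, 2240 effective tokens.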
| ```json | ||
| { | ||
| "summary": { | ||
| "total_invocations": 3, | ||
| "raw_total_tokens": 1800, | ||
| "base_weighted_tokens": 3030, | ||
| "effective_tokens": 5360 | ||
| } | ||
| } |
Section 7 says a conforming response MUST include both a `summary` object and an `invocations` array, but Appendix A.4’s output example only shows `summary`. If the example is meant to be conforming, include the `invocations` array (or explicitly label it as a partial/minimal example and align that with the conformance requirements).
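A check along these lines would catch the mismatch. This is only a sketch of the top-level requirement quoted above; the function name is hypothetical and it does not validate the nested fields.

```python
REQUIRED_TOP_LEVEL = ("summary", "invocations")


def missing_top_level_fields(response: dict) -> list:
    """List required top-level keys that are absent from a response."""
    return [key for key in REQUIRED_TOP_LEVEL if key not in response]


# The Appendix A.4 output example as currently written.
appendix_a4 = {
    "summary": {
        "total_invocations": 3,
        "raw_total_tokens": 1800,
        "base_weighted_tokens": 3030,
        "effective_tokens": 5360,
    }
}
missing = missing_top_level_fields(appendix_a4)
```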
| ### 8.4 Partial Visibility | ||
| When sub-agents are not fully observable, implementations MUST still report aggregate totals. Invocation nodes with incomplete data SHOULD be flagged to indicate missing information. | ||
Section 8.4 says invocation nodes with incomplete data “SHOULD be flagged”, but the node schema doesn’t define any field/mechanism for representing this flag (and Section 9 extensions don’t specify a recommended shape). Consider adding a standard field (e.g., `derived.incomplete`, `flags`, or `missing_fields`) so different implementations don’t invent incompatible representations.
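As one possible shape, the sketch below uses a top-level `missing_fields` list on the node. Both the field name and the rule for deciding what counts as missing (absent or `null` usage values) are assumptions, not something the spec currently defines.

```python
USAGE_FIELDS = (
    "input_tokens",
    "cached_input_tokens",
    "output_tokens",
    "reasoning_tokens",
)


def flag_missing(node: dict) -> dict:
    """Return a shallow copy of `node`, adding `missing_fields` if needed.

    A usage field counts as missing when it is absent or None.
    """
    usage = node.get("usage") or {}
    missing = [f for f in USAGE_FIELDS if usage.get(f) is None]
    flagged = dict(node)
    if missing:
        flagged["missing_fields"] = missing
    return flagged


# Hypothetical partially observable sub-agent node.
partial = {
    "id": "sub-agent",
    "parent_id": "root",
    "usage": {"input_tokens": 120, "output_tokens": None},
}
flagged = flag_missing(partial)
```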
| With default weights: | ||
| ``` | ||
| ET_total = Σ [ m_i × (I_i + 0.1 C_i + 4 O_i + 4 R_i) ] |
In Appendix B’s default-weights formula, terms like `0.1 C_i` / `4 O_i` rely on implied multiplication, which is easy to misread in a normative formula section. Consider using explicit multiplication (e.g., `0.1 × C_i`, `4 × O_i`, `4 × R_i`) to match the earlier formula style and reduce ambiguity.
| ET_total = Σ [ m_i × (I_i + 0.1 C_i + 4 O_i + 4 R_i) ] | |
| ET_total = Σ [ m_i × (I_i + 0.1 × C_i + 4 × O_i + 4 × R_i) ] |
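For reference, the same default-weights formula with explicit multiplication, typeset in LaTeX. This only restates the suggested line; the summation index `i` ranging over all invocations in the execution graph is assumed from the aggregation description.

```latex
\[
  ET_{\text{total}} = \sum_{i} m_i \times \left( I_i + 0.1 \times C_i + 4 \times O_i + 4 \times R_i \right)
\]
```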
Adds a W3C-style formal specification for Effective Tokens (ET) — a normalized scalar metric for LLM token usage that accounts for token class weights and per-model computational multipliers across multi-agent execution graphs.
What's included
- `ET = m × (I + 0.1C + 4O + 4R)` with default weights and override disclosure requirements
- `ET_total = Σ [m_i × base_weighted_tokens_i]` across full execution graphs including sub-agents and tool-triggered calls
- `id`, `parent_id`, `model`, `usage`, and `derived` fields
- `summary` object with `total_invocations`, `raw_total_tokens`, `base_weighted_tokens`, `effective_tokens`
- `T-ET-NNN` test IDs across accounting, aggregation, graph, and reporting categories

Follows the same structure and `sidebar.order` as the other four specs in `docs/src/content/docs/reference/`.