Discussion: Skill composition without context bloat #11
The problem
When skills are chained together (extracting data, transforming it, then generating a report), intermediate results often flow through the model context. Each step introduces friction:
- Higher cost: large payloads are repeatedly serialized as tokens
- Lower reliability: the model acts as a lossy clipboard, where fields get dropped, values are paraphrased, and numbers are rounded
- Worse context quality: intermediate data crowds out the information needed for later reasoning
This is a data transport problem: the model is being used to move data between steps when that data could flow directly and deterministically.
Direction to explore
Consider whether the spec should support composable skills, so that multi-step workflows can execute programmatically rather than passing data through the model context at every step.
Two ideas seem foundational.
1. Well-defined inputs and outputs
For skills to be pluggable, they need to declare what they accept and what they return. This may already be partially modeled in the spec, but reliable composition requires these contracts to be explicit and machine-readable so that outputs from one skill can safely become inputs to another.
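As a sketch of what "explicit and machine-readable" could mean, the snippet below models a skill contract and a compatibility check. All names here (`SkillContract`, `inputSchema`, `outputSchema`, `canChain`) are illustrative assumptions, not part of the spec or a proposed API:

```typescript
// Hypothetical shape for a machine-readable skill contract.
// A real spec would likely use a richer schema language (e.g. JSON Schema).
interface SkillContract {
  name: string;
  inputSchema: Record<string, "string" | "number" | "object" | "array">;
  outputSchema: Record<string, "string" | "number" | "object" | "array">;
}

// A runtime could verify, before executing a chain, that every field the
// consumer requires is produced by the producer with a matching type.
function canChain(producer: SkillContract, consumer: SkillContract): boolean {
  return Object.entries(consumer.inputSchema).every(
    ([field, type]) => producer.outputSchema[field] === type
  );
}

const fetchSales: SkillContract = {
  name: "sales-data-skill",
  inputSchema: { region: "string", year: "number" },
  outputSchema: { rows: "array" },
};

const analyze: SkillContract = {
  name: "analysis-skill",
  inputSchema: { rows: "array" },
  outputSchema: { summary: "object", topInsight: "string" },
};

console.log(canChain(fetchSales, analyze)); // prints true: rows flows directly
```

With contracts in this shape, chaining mistakes become static validation errors rather than silent data corruption inside the model context.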
2. Programmatic composition
A composition model could let the calling agent express a small amount of orchestration logic that:
- Defines a sequence of skill invocations
- Validates that the chaining respects input/output contracts (type safety)
- Applies lightweight transformations to pass data between skills and control what returns to the model
The model remains responsible for planning and decision-making. The runtime handles data movement and execution. Intermediate results stay out of context unless explicitly requested.
This shifts chained workflows from "LLM as transport layer" to a cleaner separation between the reasoning plane and the data/execution plane.
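The runtime side of that split could be sketched as follows: a loop that executes skill steps, keeps every intermediate result in runtime memory, and returns only an explicit projection to the model. All names here (`runPipeline`, `Step`, `mapInput`, `project`) are hypothetical:

```typescript
// Hypothetical runtime sketch. Intermediate results live in `results`
// (runtime memory) and are never serialized into the model context.
type Skill = (input: unknown) => Promise<unknown>;

interface Step {
  name: string;
  skill: Skill;
  // Lightweight transformation: maps prior results into this skill's input.
  mapInput: (results: Record<string, unknown>) => unknown;
}

async function runPipeline(
  steps: Step[],
  // Only the projection is returned to the model.
  project: (results: Record<string, unknown>) => unknown
): Promise<unknown> {
  const results: Record<string, unknown> = {};
  for (const step of steps) {
    results[step.name] = await step.skill(step.mapInput(results));
  }
  return project(results);
}
```

The key design point is the `project` function: the model opts in to exactly the fields it needs for its next reasoning step, and everything else stays on the execution plane.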
Illustrative example
The following TypeScript snippet is not a proposed API. It is only meant to illustrate the concept of skill composition where intermediate data stays out of context.
```typescript
// The model writes this composition script.
// The runtime executes it. Intermediate data never enters the model context.
import { run as fetchSalesData } from "sales-data-skill";
import { run as analyzeTransactions } from "analysis-skill";
import { run as generateChart } from "chart-skill";

// Step 1: Fetch raw sales data (could be thousands of rows)
const sales = await fetchSalesData({ region: "EMEA", year: 2024 });

// Step 2: Analyze transactions (model never sees the raw rows)
const analysis = await analyzeTransactions({ transactions: sales.data.rows });

// Step 3: Generate chart from analysis
const chart = await generateChart({ summary: analysis.data.summary });

// Only return what the model needs for the next reasoning step
return {
  chartUrl: chart.data.url,
  insight: analysis.data.topInsight,
  recordCount: sales.data.rows.length,
};
```

In this example, thousands of transaction rows flow between skills but never enter the model context. The model receives only a chart URL, a single insight, and a count. The runtime handled all the data movement.
Related work
- Cloudflare Code Mode: model writes glue code, runtime executes, intermediates stay out of context https://blog.cloudflare.com/code-mode/
- Anthropic Code execution with MCP: perform work in an execution environment, keep context lean https://www.anthropic.com/engineering/code-execution-with-mcp
Questions for maintainers
- Is skill composition in scope for this spec, or better handled entirely by host runtimes?
- What constraints matter most? (portability, sandboxing, observability, determinism)
Happy to follow up with a concrete proposal or prototype once there is alignment on direction.