Feature Request: /compress-fast non-AI assisted context reduction #4264

@fantasyz

What would you like to be added?

A very fast context compression process that only trims unwanted data from the chat, without using an AI/LLM.
Optionally, offer the user a multiple-choice prompt to select what to remove.

For example:

Choose what to remove:
1. Tool calls + thinking.
2. Everything except the AI's last response.
3. Cancel

Basically, from my observation, the AI context consists of:

  1. user input
  2. thinking
  3. tool calls
  4. final reply to user

The final reply generally summarizes the results of the thinking and tool calls. For a less accurate summary, simply pruning those two is enough in some use cases. It helps speed things up when compressing manually.
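To make the idea concrete, here is a minimal sketch of the two pruning options in Python. It is only an illustration: the `Message` type, the `kind` labels, and the function names are all hypothetical and do not correspond to the project's actual chat history format or code.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical message kinds, mirroring the four parts of the context listed above.
USER, THINKING, TOOL_CALL, REPLY = "user", "thinking", "tool_call", "reply"


@dataclass
class Message:
    kind: str  # one of USER, THINKING, TOOL_CALL, REPLY (assumed labels)
    text: str


def prune_tools_and_thinking(history: List[Message]) -> List[Message]:
    """Option 1: drop tool calls and thinking, keep user input and final replies."""
    return [m for m in history if m.kind in (USER, REPLY)]


def keep_last_reply(history: List[Message]) -> List[Message]:
    """Option 2: keep only the most recent final reply, if there is one."""
    for m in reversed(history):
        if m.kind == REPLY:
            return [m]
    return []


def compress_fast(history: List[Message], choice: int) -> List[Message]:
    """Apply the selected option; any other choice (e.g. 3, Cancel) changes nothing."""
    if choice == 1:
        return prune_tools_and_thinking(history)
    if choice == 2:
        return keep_last_reply(history)
    return list(history)
```

Both options are a single pass over the history with no model call, which is where the speedup would come from. A real implementation would probably also need to respect whatever message ordering or pairing rules the model API expects.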

Why is this needed?

The current compression is too slow because it uses AI to summarize the chat. Most of the time, I find I don't really need a high-quality summary to continue the work.

Sometimes I just need the conclusion, which resides in the final response in the chat, to start the next phase. In this case, I can just use option 2 suggested above to start a new round of work.

Other times I just want a little more context window for a little more work. Instead of waiting a few minutes for /compress to summarize, I can use option 1 to start right away.

Additional context

It is especially helpful for locally hosted model setups, as local models generally run slower than cloud services. Giving users the choice of a faster but less accurate summary would be very welcome.
