Checklist
Motivation
Description
It would be beneficial to introduce model hooks that allow users to access and modify model activations. This feature would enable greater flexibility for tasks such as visualization, debugging, and custom processing of intermediate representations.
Use case
- Extract intermediate outputs for interpretability analysis, such as LogitLens-style investigations.
- Expose internal activations so users can cache them and dynamically edit, remove, or replace them during inference, for example for representation engineering.
While this may introduce some performance overhead, it would enhance interpretability research and enable efficient model editing.
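To make the request concrete, here is a minimal pure-Python sketch of what such a hook mechanism could look like. All names (`Layer`, `register_hook`, the hook signature) are hypothetical illustrations of the proposed API, not an existing interface; a real implementation would attach hooks to the model's actual layer objects.

```python
class Layer:
    """Toy layer: multiplies its input by a scale and runs registered hooks."""

    def __init__(self, scale):
        self.scale = scale
        self._hooks = []

    def register_hook(self, fn):
        # fn(module, activation) -> None (observe only) or a new activation
        self._hooks.append(fn)

    def __call__(self, x):
        out = [v * self.scale for v in x]  # the intermediate "activation"
        for hook in self._hooks:
            result = hook(self, out)
            if result is not None:  # a hook may replace the activation
                out = result
        return out


cache = {}


def cache_hook(module, activation):
    # Interpretability use case: save a copy of the activation for analysis.
    cache["act"] = list(activation)
    # Returning None leaves the activation unchanged.


def edit_hook(module, activation):
    # Editing use case: replace the activation in-flight during inference.
    return [v + 1 for v in activation]


layer = Layer(2)
layer.register_hook(cache_hook)
layer.register_hook(edit_hook)

out = layer([1, 2, 3])
# out == [3, 5, 7]; cache["act"] == [2, 4, 6]
```

Hooks run in registration order, so the caching hook sees the original activation before the editing hook rewrites it; this ordering is a design choice the real API would need to document.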
Related resources
model hook resources
related issues and use case