26 Comments
Alejandro Aboy:

Amazing, Jenny! Some extras you might find interesting: RTK for token compression, and the claude-hud plugin to track usage and other artifacts in the terminal. They are really cool tools!

Jenny Ouyang:

Great to learn about those, Alejandro! Thanks for sharing them!

Bianca Schulz:

Have you ever thought about using a local model on strong hardware, or open-source frameworks for AI agents with a cheaper model subscription? I guess there must be a reason why you stay with Claude. What have your thoughts been?

Jenny Ouyang:

I did try different models, especially locally hosted ones. My argument to myself was that if I want to stay on top of the latest changes and really take advantage of Claude’s instruction-following capabilities, this is the price I’m willing to pay.

Of course, now I’m really paying the price 😂

Patrick Schaber:

Wow, Jenny! There were a few in here that I did not know about. Thank you so much for putting this together. I think you saved me a ton of money!

Jenny Ouyang:

Glad it helps, Patrick!

Abby Keul:

This is really helpful, Jenny. I have been burning quickly through my Claude Pro usage. I fed your advice into Claude and it helped me customize my token reduction strategy even further.

Jenny Ouyang:

I’m so happy to hear that, Abby! Glad it’s of practical use to you :)

My other self:

Thank you, thank you, thank you!

Jenny Ouyang:

You are very welcome!

Hope this saves you something :)

Karen Spinner:

All great tips! Thank you for sharing the surprise bill from hell and explaining how to avoid it!

Jenny Ouyang:

Thank you for reading it! I hope no one gets the same kind of bill :)

Karen Spinner:

💯

Julia | Taking you global:

Great article, Jenny!

Jenny Ouyang:

Thank you, Julia 🤗

AI Meets Girlboss:

Nothing teaches prompt discipline quite like a $1600 bill hitting my inbox. 🫣Loved how practical this was, especially the token-saving tips.🩷🦩

Jenny Ouyang:

I learned this the hard way, and I hope anyone reading this won’t get hit by it the same way 😅

Thank you Pinkie!

David Richard:

I just started using (discovered?) the /advisor setting for my coding work. Have you explored it?

Jenny Ouyang:

I haven’t used it before; it looks like a very useful setting. I’ve been eagerly exploring it now :)

David Richard:

This just happened on my end:

⏺ Now I have everything I need. Let me get the advisor's perspective before writing a complex new file.

⏺ Advising using Opus 4.7

⎿  ✔ Advisor has reviewed the conversation and will apply the feedback

So I guess Claude is getting a second opinion to check what Sonnet came up with. Kinda cool.

Peter Simmons:

That is a context-memory manager designed to let you save everything important without filling up the actual context, retrieving only what you need.

Jenny Ouyang:

Great work Peter!

Luc B. Perussault-Diallo:

The tool output accumulation insight is underrated. It's not just that the output is big. It's that a 2,000-line raw test log dumped into context doesn't actually help the model reason about the failure any better than a 20-line structured summary would. Volume and usefulness are different things, and most optimization advice only addresses the first one.

Subagent delegation helps. But the other option is giving the model pre-analyzed context (what calls what, what's in scope for this change) so it doesn't need to discover relationships through expensive tool calls in the first place. That's the angle I'm exploring with Sense (https://luuuc.github.io/sense).
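The summary-over-dump idea from the comment above can be sketched concretely. This is a minimal, hypothetical example (the function name, regex patterns, and log format are my own assumptions, not from the article): it condenses a raw test log into a few structured lines before anything reaches the model's context.

```python
import re
from collections import Counter

def summarize_test_log(raw_log: str, max_failures: int = 5) -> str:
    """Condense a raw test log into a short structured summary.

    Instead of dumping thousands of raw log lines into context,
    keep only pass/fail counts and the first few failure lines.
    NOTE: the PASS/FAIL patterns below are illustrative; adapt
    them to your actual test runner's output.
    """
    lines = raw_log.splitlines()
    counts = Counter()
    failures = []
    for line in lines:
        if re.search(r"\bPASS(ED)?\b", line):
            counts["passed"] += 1
        elif re.search(r"\bFAIL(ED)?\b", line):
            counts["failed"] += 1
            if len(failures) < max_failures:
                failures.append(line.strip())
    summary = [
        f"{counts['passed']} passed, {counts['failed']} failed "
        f"({len(lines)} raw log lines)"
    ]
    summary += [f"  - {f}" for f in failures]
    return "\n".join(summary)
```

A thousand-line log collapses to at most `max_failures + 1` lines, which is usually all the model needs to reason about what broke.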