Field notes from AI security: five things I wish you knew
Foundation #1: Your data, their systems.
Quick question: do you know what happens to everything you type into an AI?
Yeah, me neither. Not really.
I’m launching the Field Notes series today with someone who does.
I asked ToxSec to teach me five things from his area of expertise, AI security.
I got ten.
That generosity is exactly why we’re starting here.
So, who is ToxSec?
I discovered ToxSec on Substack very early on, while deep in the mystery of trying to figure out how this platform even worked. I read a post about AI security that made immediate sense - clear language, no tech-bro speak, just practical wisdom about real risks. I understood the core message and thought I might be seeing a connection to something I’d observed in my own work, although I wasn’t sure. I hesitated before commenting. Was I intruding in a domain that wasn’t mine? Would it be obvious I came from a completely different lane?
I hit post on my question anyway.
ToxSec responded with thoughtful, kind insight. I was on the right track. The message I’d understood was exactly what he’d meant. That was the first of many exchanges. ToxSec uses his platform to teach, to share key AI developments and research, and to engage generously with readers who come from different worlds. There’s no gatekeeping, just genuinely useful advice for real-world issues.
When I got the idea to launch Field Notes, ToxSec immediately came to mind. How practical to lay the first foundation stone as one that covers core AI security principles we all need to grasp, no matter where we are on this journey. How perfect to learn from someone who’s proven they can make complicated concepts accessible.
I reached out. Got an immediate yes! I was elated.
Today, you’re getting the first five things. The remaining five will appear soon - they’re too good to leave gathering dust.
There’s something here for everyone, regardless of where you’re starting from.
Take it away, ToxSec!
ToxSec is currently working as a Cybersecurity Engineer at Amazon, specialising in AI Security. He has also done cybersecurity work at the NSA and worked as a software developer for U.S. defence contractors. He holds a Master’s in Cybersecurity Engineering (UW) and an active CISSP.
1. The Therapist Trap
Many people use ChatGPT as their therapist. I get it. It’s available, it’s patient, it says validating things, and it never judges you.
But here’s what’s actually happening: the model is pattern-matching your language and reflecting it back in a way that sounds supportive. It’s a mirror, not a mind. It doesn’t remember your breakthrough from last month. It doesn’t track your growth. It doesn’t actually know you.
This is the part people don’t think about: everything you type is going somewhere.
When you pour your heart out to ChatGPT about your marriage, your anxiety, your workplace conflict, your health scare... that’s data now. OpenAI’s privacy policy lets them use your conversations for training unless you specifically opt out. That means your 2am vulnerability spiral might be helping teach the next model what “supportive” sounds like. And opting out of training doesn’t mean they stop collecting your conversations. Everything you say is theirs to keep.
Maybe you’re fine with that. But most people don’t even know they’re making that trade. They think they’re in a private conversation. They’re not. They’re in a product.
This isn’t me saying “don’t use it for emotional processing.” Plenty of people find it genuinely helpful for journaling, reframing thoughts, or just venting. Just know what you’re trading.
A good therapist holds your history, is bound by confidentiality laws, and sometimes tells you things you don’t want to hear. AI will keep validating you forever, and your data goes into the machine.
Use it as a tool. Just know what tool it actually is, and who else is in the room.
2. The Model Isn’t Reading Your Mind
When you leave things ambiguous, the model fills in the gaps. The problem is it fills them with its assumptions, not yours.
“Write me a blog post about productivity.”
Okay. What length? What tone? What audience? What angle? The model will pick something for all of these. And it might be completely wrong for what you actually wanted.
You’ll get output that technically matches your request but misses the mark entirely. Then you’ll spend three rounds going “no, not like that” when you could have just been specific upfront.
This isn’t the model’s fault. Ambiguity is hard. Even humans struggle with underspecified requests.
Here’s the security angle most people miss: when you try to fix the ambiguity problem, you often overshare. “Help me respond to this email” becomes pasting in your colleague’s full message, their name, your company’s internal project details, the context of your disagreement... You’re being specific (good!) but you’re also handing over information you’d never post publicly. Before you paste, ask yourself: would I put this on Twitter? If not, maybe sanitise the names and details first.
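If you want to make “sanitise first” a habit, it can even be semi-automated. Here’s a minimal sketch in Python: the names, project label, and patterns are entirely hypothetical, and this is a quick illustration of the idea, not a complete anonymisation tool.

```python
import re

# Hypothetical identifiers you'd want scrubbed before pasting text
# into a chatbot. In practice you'd maintain your own list.
REPLACEMENTS = {
    r"\bJordan Smith\b": "[COLLEAGUE]",
    r"\bProject Falcon\b": "[PROJECT]",
}
# Rough pattern for email addresses (illustrative, not exhaustive).
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def sanitise(text: str) -> str:
    """Replace known names and any email addresses with placeholders."""
    for pattern, placeholder in REPLACEMENTS.items():
        text = re.sub(pattern, placeholder, text)
    return EMAIL_PATTERN.sub("[EMAIL]", text)

draft = "Jordan Smith (jordan.smith@example.com) disagrees about Project Falcon."
print(sanitise(draft))
# "[COLLEAGUE] ([EMAIL]) disagrees about [PROJECT]."
```

The model still gets the context it needs to help with the email; the company behind it doesn’t get your colleague’s name.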
The fix: Before you prompt, ask yourself what decisions you’re leaving to the AI. If those decisions matter, make them yourself and include them in the prompt. Length. Tone. Audience. Format. Angle. The more you specify, the less the model has to guess.
3. The Context Window is a Goldfish
Here’s something that changes how you should use these tools: AI doesn’t have memory the way you think it does.
When you’re in a long conversation, the model isn’t “learning” you. It’s holding text in what’s called a context window. Think of it like a container. New stuff goes in, and when it fills up, old stuff falls out the back. Silently. Without warning.
That brilliant setup you wrote 40 messages ago? The nuanced preferences you explained at the start? Probably gone. The model isn’t ignoring you. It literally can’t see that text anymore.
This is why long conversations get weird. The AI seems to “forget” things you told it. It starts contradicting earlier answers. It’s not broken. It’s just a goldfish with a text limit.
Here’s the security twist people miss: just because the model forgot doesn’t mean the platform did. Your conversation history is still stored on their servers. The AI can’t see your messages from 50 turns ago, but OpenAI, Anthropic, or Google definitely can. You’re getting the worst of both worlds. The tool forgets, but the company remembers everything.
What to do instead: For important context, restate it periodically. Or start fresh conversations with your key context pasted at the top. Don’t assume continuity that isn’t there.
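The goldfish behaviour is easy to see in a toy model. This Python sketch is purely illustrative (no real provider works in whole messages like this; real windows are measured in tokens), but the failure mode is the same: when the buffer fills, the oldest context silently falls out.

```python
from collections import deque

class ContextWindow:
    """Toy context window: a fixed-size buffer of messages."""

    def __init__(self, max_messages: int):
        # deque with maxlen silently drops the oldest item when full
        self.buffer = deque(maxlen=max_messages)

    def add(self, message: str) -> None:
        self.buffer.append(message)

    def visible(self) -> list[str]:
        """What the 'model' can still see."""
        return list(self.buffer)

window = ContextWindow(max_messages=3)
for msg in ["my key setup", "question 1", "question 2", "question 3"]:
    window.add(msg)

print(window.visible())
# ['question 1', 'question 2', 'question 3']
```

Notice the “key setup” message is simply gone. Nothing warned you, and nothing downstream knows it ever existed. That’s why restating important context works: it puts the information back inside the window.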
4. The Yes-Man Problem
AI wants to help you. Desperately. Pathologically. It will bend over backwards to give you something, even when the honest answer is “I don’t know.”
This is called confabulation, and it’s one of the most dangerous failure modes. The model will invent plausible-sounding facts, cite papers that don’t exist, make up statistics, and deliver all of it with the same confident tone it uses for things it actually knows.
Confidence is not correlated with accuracy. Read that again.
This is where it gets darker: bad actors know this too. AI-generated phishing emails sound more legitimate because the model writes with such conviction. Fake “experts” cite fake “studies” that sound completely real. The confidence problem isn’t just about you getting bad info. It’s about other people weaponising that same confidence against you. That eerily well-written email from your “bank”? It might have had help.
The fix isn’t to stop using AI for research or facts. The fix is to verify anything that matters. Treat confident-sounding claims the same way you’d treat a Wikipedia article: useful starting point, not gospel truth.
If you ask something obscure and get a super detailed answer... that’s actually a red flag. The more specific and confident it sounds on niche topics, the more you should double-check.
5. The Plugin/Integration Trap
Every time you connect AI to another service, you’re handing over keys.
“Let ChatGPT read my Google Drive.” “Give Claude access to my calendar.” “Connect this to my email so it can draft responses.” It sounds so convenient. And it is! That’s the problem.
When you authorise these integrations, you’re usually granting broad permissions. Not “read this one document I’m working on.” More like “access every document you’ve ever created, plus new ones, plus shared folders, plus that random file from 2019 you forgot existed.”
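The difference between broad and narrow access often comes down to a single scope string in the authorisation request. As a sketch, here are two real Google Drive OAuth scopes plugged into a consent URL; the client ID and redirect URI are made up, and this is an illustration of the pattern, not a working integration.

```python
from urllib.parse import urlencode

# Two real Google Drive scopes with very different blast radii:
BROAD_SCOPE = "https://www.googleapis.com/auth/drive"        # every file you have
NARROW_SCOPE = "https://www.googleapis.com/auth/drive.file"  # only files the app opens or creates

def consent_url(scope: str) -> str:
    """Build a (hypothetical) OAuth consent URL for the given scope."""
    params = {
        "client_id": "example-client-id",            # made up
        "redirect_uri": "https://example.com/callback",  # made up
        "response_type": "code",
        "scope": scope,
    }
    return "https://accounts.google.com/o/oauth2/v2/auth?" + urlencode(params)

print(consent_url(NARROW_SCOPE))
```

To you, the two consent screens look almost identical. To the plugin, one is “this document” and the other is “everything, forever”. If a tool insists on the broad scope when the narrow one would do, that itself is a signal.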
Most people click through the permissions screen without reading it. OAuth requests are designed to be frictionless. That’s a feature for usability and a bug for security.
Here’s the part that gets messy: these permissions often persist. You try a plugin once, forget about it, and six months later it still has access to your entire email history. Most people never audit what they’ve connected. They don’t even remember what they’ve connected.
Then there’s the supply chain problem. You’re not just trusting the AI company. You’re trusting whoever built the plugin. Some random developer’s Chrome extension now sits between you and your most sensitive documents. What’s their security like? What’s their privacy policy? Did you check?
The fix: periodically audit your connected apps. Google, Microsoft, and most platforms have a “third-party access” or “connected apps” settings page. Go look at it. Revoke anything you don’t actively use. Be stingy with new connections.
Convenience is a hell of a drug, but so is not having your entire digital life accessible through some plugin you tried once and forgot about.
All images generated by ToxSec.
This is exactly the kind of teaching I was hoping for when I launched Field Notes. Clear and practical, with insights that welcome curiosity.
Thank you, ToxSec!
ToxSec has a lot more where this came from. If you’re thinking “I need to understand this better” and “this is super fascinating stuff!”, his Substack is where you want to be.
Three recent posts to start with:
The remaining five items from this conversation? Coming very soon. They’re too good to wait long.
Thoughts? Surprises? Already knew everything? Things you’re now slightly freaked out about?
I’ve just reviewed my connected apps.