Hwee-Boon Yar

My Cloudflare Tunnel Config Is My Local Dev Directory

2026-05-01T06:02:00+00:00

I saw Greg Raiz’s local.vibe post on Hacker News. The problem is familiar: once you have enough local projects, remembering localhost:5173 vs localhost:3001 vs whatever the browser extension dev server picked becomes annoying.

My setup is less ambitious. I already use Cloudflare Tunnel for most products I build, so my tunnel config became the directory of local dev services.

It is a boring YAML file, but it has the answer I need most often: what runs where?

The File

My ~/.cloudflared/config.yml has the exposed dev apps:

ingress:
  - hostname: dev.example.com
    service: http://localhost:5174
  - hostname: dev-backend.example.com
    service: http://localhost:4002
  - hostname: dev-myog.example.com
    service: http://localhost:5273
  - hostname: dev-myog-backend.example.com
    service: http://localhost:4010
  - hostname: dev-stacknaut.example.com
    service: http://localhost:5375
  - hostname: dev-stacknaut-backend.example.com
    service: http://localhost:3005
  - service: http_status:404

In my actual config, those are real hostnames. They go through a Cloudflare Tunnel to my machine. I use the dev domains as the normal URLs for those apps, including OAuth, webhooks, mobile callbacks, and testing from other devices.

For local-only things that should not be exposed through the tunnel, I add comments to the same file:

# Local-only dev ports reserved outside Cloudflare tunnels:
# - 3017: MyOG browser-extension WXT dev server
# - 3006: DevSnoop browser-extension WXT dev server

That’s it. Exposed services are normal ingress rules. Local-only services are comments at the bottom.
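Because the file has such a regular shape, it can be read as a registry with very little code. The following is a minimal sketch, assuming the two-line hostname/service layout and the "# - PORT: description" comment format shown above. The names parsePortRegistry and PortEntry are hypothetical, and this is a line matcher, not a general YAML parser.

```typescript
// Sketch: read a cloudflared-style config as a port registry.
// Assumes the exact layout shown in this post; not a general YAML parser.

interface PortEntry {
  port: number
  name: string // hostname for tunneled apps, description for local-only ones
  exposed: boolean
}

function parsePortRegistry(configText: string): PortEntry[] {
  const entries: PortEntry[] = []
  let pendingHostname: string | null = null

  for (const rawLine of configText.split("\n")) {
    const line = rawLine.trim()

    // "- hostname: dev.example.com"
    const hostnameMatch = line.match(/^-?\s*hostname:\s*(\S+)$/)
    if (hostnameMatch) {
      pendingHostname = hostnameMatch[1]
      continue
    }

    // "service: http://localhost:5174" (the http_status:404 catch-all is skipped)
    const serviceMatch = line.match(/^-?\s*service:\s*https?:\/\/localhost:(\d+)$/)
    if (serviceMatch && pendingHostname) {
      entries.push({ port: Number(serviceMatch[1]), name: pendingHostname, exposed: true })
      pendingHostname = null
      continue
    }

    // "# - 3017: MyOG browser-extension WXT dev server"
    const commentMatch = line.match(/^#\s*-\s*(\d+):\s*(.+)$/)
    if (commentMatch) {
      entries.push({ port: Number(commentMatch[1]), name: commentMatch[2], exposed: false })
    }
  }

  return entries
}
```

The point is only that one file answers both questions: which ports are tunneled, and which are reserved locally.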

Why I Like This Better Than Another Dashboard

I don’t need a dashboard for this. I need one file to check.

Most of the time I don’t browse a launcher to find a project. I am already in the project, in tmux, or talking to a coding agent. What I need is for me and the agent to agree on which port and hostname belong to which thing.

The Cloudflare config already has to exist. It already maps names to ports. It has the exact shape I care about:

hostname -> localhost port

Adding another file just for agents would make the system worse. Now I have two places to update, and eventually one of them lies.

The Agent Angle

This ended up working well with coding agents.

My AGENTS.md tells coding agents to check ~/.cloudflared/config.yml before choosing or fixing dev ports. It also tells them that https://*.example.com domains are Cloudflare tunnels to my local machine, not deployed servers.

So when an agent needs to test a frontend, it does not guess:

It checks the tunnel config.
It sees the hostname and port.
It uses the dev domain.
If it adds a new exposed app, it updates the ingress list.
If it reserves a local-only port, it adds a comment at the bottom.

Browser extension dev servers, local helper tools, one-off WXT servers — these often do not belong on the public internet, even behind a hard-to-guess dev subdomain. But they still need a port. Putting them in comments keeps the list complete without pretending every local service should be tunneled.

The Config Is Also Documentation

I like documentation that is also configuration. It stays accurate because something depends on it.

A README that says “frontend runs on 5173” gets stale. A tunnel config that routes dev-myog.example.com to localhost:5273 gets fixed when it breaks.

The comments are the only weak part because they are not executable. But they are in the same file the agent already has to read, right next to the executable mappings. That is good enough in practice.

I also put a command note in the file:

# Adding new subdomains below requires running this command ("dev" = my tunnel name, don't change):
# cloudflared tunnel route dns dev dev-.example.com

This saves a question. When I ask an agent to add a new tunneled dev app, it has the naming convention and the command in front of it.

Why Not Auto-Assign Ports?

Auto-assigned ports are nice for tools that fully own process lifecycle. If a tool starts the app, injects $PORT, proxies the hostname, watches the process, and shuts it down, auto-assignment makes sense.

That is not my workflow.

I usually have projects already running in tmux. Some are frontend apps. Some are Fastify backends. Some are browser extension dev servers. Some are old projects with old assumptions. I want stable ports because stable ports make everything else boring:

OAuth callback URLs stay fixed.
Mobile apps can keep the same backend URL.
Browser bookmarks work.
Agents can test without asking me which app is running where.
Cloudflare Tunnel can expose selected services with real HTTPS.

The cost is that ports still need to be reserved. That’s fine. The agent can pick one, but it has to write the choice down in the same file.
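As a rough illustration of "pick one, then write it down", here is a small sketch. The helper names pickFreePort and formatLocalOnlyComment are hypothetical, and the check is only against ports already listed in the file; it does not probe whether something unlisted is actually listening.

```typescript
// Hypothetical helper: pick the first port in a range that is not already reserved.
function pickFreePort(reservedPorts: Iterable<number>, rangeStart: number, rangeEnd: number): number {
  const taken = new Set(reservedPorts)
  for (let port = rangeStart; port <= rangeEnd; port++) {
    if (!taken.has(port)) return port
  }
  throw new Error(`No free port between ${rangeStart} and ${rangeEnd}`)
}

// Format the comment line to append for a local-only service,
// matching the "# - PORT: description" style used in the config.
function formatLocalOnlyComment(port: number, description: string): string {
  return `# - ${port}: ${description}`
}
```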

What I Tell Agents

The key instruction in my global agent setup is simple:

### Dev Ports

Other local-only dev port usage (not Cloudflare tunnels, e.g. WXT dev servers)
is documented in comments at the end of ~/.cloudflared/config.yml.
Check/update there before choosing/fixing ports.

### Dev Domains

https://*.example.com are Cloudflare tunnels to the local machine,
not deployed servers. Config in ~/.cloudflared/config.yml

The file is useful to me, but it is more useful because the agents know it exists.

Without that instruction, agents guess. They search package files, find default Vite ports, try raw ports, maybe start another server, maybe collide with something already running.

With it, they check the registry first.

The Small Setup Works

I like tools like local.vibe. It solves more of the lifecycle problem: hostnames, HTTPS, process management, setup instructions for agents. If I wanted a self-contained local app launcher, I would look at it seriously.

But for my setup, Cloudflare Tunnel already covers the part I care about most. I get stable HTTPS dev domains, remote-device testing, webhook testing, and a readable mapping of services to ports.

The only extra habit I needed was treating ~/.cloudflared/config.yml as the local dev directory, not just tunnel config. Agents read it, use it, and update it.

If You Vibe Code an App for Work, Put the Backend in Charge

2026-05-01T03:29:00+00:00

Someone on Reddit asked about deploying a custom vibe-coded app for work, installed on a local server. They could not code their way through problems, but figured Claude could fix things when they broke.

I have been programming for 30 years. Beyond the obvious “does it work?”, these are the two things I would check first:

the backend must not blindly trust the frontend
secrets must not leak into the frontend

The backend must not trust the frontend

Assume a website frontend talking to a server-side backend. Same idea applies to mobile apps, desktop apps, browser extensions, internal dashboards, whatever. If there is a client and a server, the client is not trusted.

It is easy to build as if only your frontend will ever talk to your backend. Maybe the button is hidden. Maybe the form validates the email address. Maybe the frontend only sends role: "user".

But none of that matters if the backend accepts bad requests directly.

Anyone who can reach your backend can call it directly:

curl -X POST https://your-app.example.com/api/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt":"do expensive AI work for me"}'

They do not need to use your UI.

They can try:

login attempts
account creation
password reset flows
free trial limits
free AI calls
alt text generation
file uploads
admin-looking payloads
object IDs that belong to other users
prices changed to 0
isAdmin: true
plan: "enterprise"

If the backend accepts it, it happened.

Frontend validation is for user experience

Frontend validation is useful. It makes the app feel better. It catches mistakes before a request goes over the network.

But it is not security.

If your frontend checks that a file is smaller than 10 MB, the backend still has to check that the file is smaller than 10 MB.

If your frontend hides the “delete project” button from non-admins, the backend still has to check that the current user is allowed to delete the project.

If your frontend disables the “generate” button after 10 free AI calls, the backend still has to count the calls.

The frontend can help honest users avoid mistakes. The backend decides what is allowed.
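A minimal sketch of what "the backend decides" can look like, assuming a profile-update route and an upload route. The field name displayName, the 100-character cap, and both function names are illustrative, not taken from any real app.

```typescript
// Sketch: the backend builds the update from an allow-list and ignores
// everything else the client sent. Field names here are illustrative.

interface ProfileUpdate {
  displayName?: string
}

const MAX_UPLOAD_BYTES = 10 * 1024 * 1024 // 10 MB, matching the example above

function sanitizeProfileUpdate(body: unknown): ProfileUpdate {
  const update: ProfileUpdate = {}
  if (typeof body !== "object" || body === null) return update

  const input = body as Record<string, unknown>
  if (typeof input.displayName === "string" && input.displayName.length <= 100) {
    update.displayName = input.displayName
  }
  // isAdmin, plan, price and similar server-owned fields are never copied over.
  return update
}

// The same size rule the frontend shows, enforced again on the server.
function isUploadSizeAllowed(sizeInBytes: number): boolean {
  return Number.isInteger(sizeInBytes) && sizeInBytes > 0 && sizeInBytes <= MAX_UPLOAD_BYTES
}
```

A request that sends isAdmin: true alongside displayName ends up with only displayName applied.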

Ask the agent to check this directly

If I were vibe-coding an internal app and did not trust myself to catch this, I would ask the agent to review the backend directly.

Something like:

Review this app for places where the backend trusts the frontend too much.

Check all API routes. For each route, verify:
- authentication is required where needed
- authorization checks happen on the backend
- users can only access their own records
- request body fields cannot override server-owned fields
- paid or limited features are enforced server-side
- rate limits exist for expensive operations
- file upload limits are enforced server-side

Return concrete file paths and fixes.

The exact prompt does not matter. The important part is asking about the class of bug directly.

Coding agents are good at fixing things when you point them at the right problem. “Is this app secure?” is too vague.

You still have to decide whether the agent’s answer is good enough. For a work app with real data, real users, or real money involved, I would get a human programmer to check it too.

Rate limit anything expensive

AI calls make this worse because the attack can cost you money immediately.

If your app has a free feature that calls OpenAI, Anthropic, Gemini, or any other paid API, assume someone will try to call it directly.

Even on a local server, ask what “local” means.

Is it only bound to localhost? Is it exposed on the office Wi-Fi? Is it behind a tunnel? Is it reachable through VPN?

Rate limit these endpoints:

login
signup
password reset
email sending
AI generation
file uploads
anything that hits a paid third-party API

Also set billing limits on the provider side. Do not rely only on your own app code for this.

For an internal tool, the rate limits can be simple. You do not need a giant abuse prevention system on day one. But you need something.
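To show how simple "simple" can be, here is a sketch of a fixed-window limiter kept in memory. The class name is hypothetical, and the design is an assumption for a single-process internal tool: counts reset on restart and are not shared across servers.

```typescript
// Sketch: in-memory fixed-window rate limiter keyed by user ID or IP.
// Suitable only for a single process; counts are lost on restart.
class FixedWindowRateLimiter {
  private readonly hits = new Map<string, { windowStart: number; count: number }>()
  private readonly limit: number
  private readonly windowMs: number

  constructor(limit: number, windowMs: number) {
    this.limit = limit
    this.windowMs = windowMs
  }

  // Returns true and counts the call if allowed; false if the key is over its limit.
  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(key)
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // First call, or the previous window has ended: start a new window.
      this.hits.set(key, { windowStart: now, count: 1 })
      return true
    }
    if (entry.count >= this.limit) return false
    entry.count += 1
    return true
  }
}
```

A route handler would call allow() with the authenticated user ID (or the client IP for login and signup) and return HTTP 429 when it gets false.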

Do not put secrets in the frontend

The second thing I mentioned was leaking API keys or secrets in the frontend.

Frontend code is sent to users. For web apps, that is literal JavaScript in the browser. For mobile and desktop apps, it is still code running on a device you do not control.

Assume an attacker can inspect it.

At minimum, they can search the downloaded code for strings that look like secrets:

sk-...
eyJ...
AKIA...
-----BEGIN PRIVATE KEY-----

Some keys are meant to be public. Analytics keys are commonly public. Stripe publishable keys are public. Supabase anon keys can be used from the frontend if your row-level security is correct.

But secrets belong on the backend:

OpenAI API keys
Anthropic API keys
Stripe secret keys
database URLs
private signing keys
webhook secrets
admin tokens
service account credentials

If the frontend needs to do something that requires a secret, it should call your backend. The backend uses the secret. The frontend gets the result.

Environment variables do not automatically make secrets safe

One common mistake: putting a secret in an environment variable and assuming that makes it safe.

It depends where that environment variable is used.

In many frontend frameworks, variables with certain prefixes are intentionally bundled into the client. For example, PUBLIC_, NEXT_PUBLIC_, VITE_, or similar names usually mean “make this available to browser code.”

That is fine for public values. It is wrong for private secrets.

I would ask the agent to check this too:

Review all environment variables and config usage.

Identify any secrets that are imported or referenced by frontend code.
Check framework-specific public env prefixes.
List which variables are safe to expose and which must move server-side.

Then verify the built frontend bundle if the app matters. Search the output. Search the network requests. Search the browser source.

Do not just trust the .env file layout.
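Searching the built output can be automated in a rough way. This is a sketch, assuming the bundle has already been read into a string; the patterns mirror the prefixes listed earlier and are deliberately loose, so expect false positives and treat a hit as a prompt to look, not as proof. The function name is hypothetical.

```typescript
// Sketch: flag strings in built frontend output that look like secrets.
// Patterns are heuristics; they will miss some secrets and flag some non-secrets.
const SECRET_PATTERNS: { name: string; pattern: RegExp }[] = [
  { name: "sk- style API key", pattern: /sk-[A-Za-z0-9_-]{20,}/ },
  { name: "JWT-like token", pattern: /eyJ[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}/ },
  { name: "AWS access key ID", pattern: /AKIA[0-9A-Z]{16}/ },
  { name: "PEM private key", pattern: /-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----/ },
]

// Returns the names of every pattern that matches somewhere in the text.
function findSecretLikeStrings(bundleText: string): string[] {
  return SECRET_PATTERNS
    .filter(({ pattern }) => pattern.test(bundleText))
    .map(({ name }) => name)
}
```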

Local server does not mean safe

“Installed on a local server” can mean many things.

It might mean a machine under your desk only reachable from your laptop. It might mean an office server reachable by everyone on Wi-Fi. It might mean a NAS. It might mean a Cloudflare tunnel. It might mean “temporarily exposed for testing” that stays exposed forever.

I would still treat it as a real deployed app:

require login if the data matters
use HTTPS if it crosses a network
restrict network access where possible
keep secrets on the server
back up important data
log errors without logging secrets
set billing limits for paid APIs

Internal tools are still tools. They still delete data, send emails, upload files, and call paid APIs.

My minimum checklist

For a small vibe-coded work app, my checklist would be:

Can an unauthenticated user call any API route?
Can one user read or modify another user’s data by changing an ID?
Are admin actions checked on the backend?
Are usage limits enforced on the backend?
Are expensive operations rate limited?
Are all secrets server-only?
Are API billing limits configured?
Are backups configured if the data matters?
Is the app reachable only by the people who should reach it?
Can I restore it if Claude “fixes” it into a worse state?

That last one matters.

If you cannot code your way out of a bad change, make sure the project is in git and committed before asking an agent to make changes. Then you can get back to a known working version.

git status
git add .
git commit -m "Working version before changes"

Use the agent. Let it help. But give yourself a way back.

I am not against vibe-coded internal tools. I use coding agents heavily. They are useful, and small custom tools can save a lot of time.

But if the app has a backend, the backend is the authority. The frontend is just a client.

And if a value is secret, it does not go into the frontend.

Good luck building.

Magic Link Sign Up and Login for SaaS

2026-04-30T10:04:00+00:00

No passwords. No separate registration form. No “confirm your email” step after sign up.

The user enters an email address, gets a link, clicks it, and they are in. If the account exists, I sign them in. If it does not, I create it.

I use this Magic Link flow across my products. MyOG.social is the example here because it has the cleanest version of the implementation.

I also support Google Sign In because it is the fastest path for Gmail users. But Magic Link is the one I rely on. It works for every email address, including non-Google accounts, company domains, and people who do not want another OAuth prompt.

I don’t ask the user whether they want to sign up or sign in.

That distinction is useful to the app, not the user. The user just wants access.

So the backend does this:

verify the email through a Magic Link
look up the user by normalized email
create the user if one does not exist
return the same session shape either way

In MyOG.social, that looks like this:

let user = await dbService
  .db()
  .select()
  .from(users)
  .where(eq(users.email, email))
  .limit(1)
  .then((rows) => rows[0])

if (!user) {
  const trialExpiresAt = new Date(
    Date.now() + FREE_TRIAL_DAYS * 24 * 60 * 60 * 1000
  )

  const newUser = await dbService
    .db()
    .insert(users)
    .values({
      email,
      emailSource: "magicLink",
      trialCreditsRemaining: FREE_TRIAL_CREDITS,
      trialExpiresAt,
    })
    .returning()

  user = newUser[0]
}

That one branch removes a surprising amount of product surface area. No “create account” screen. No “already have an account?” switch. No duplicate route that does the same thing with slightly different copy.

The frontend can still say “Sign up” or “Sign in” depending on context. The backend does not care.

What the Table Stores

The Magic Link table stores only what the login flow needs.

import { boolean, index, pgTable, serial, timestamp, varchar } from "drizzle-orm/pg-core"

export const magicLinks = pgTable(
  "magic_links",
  {
    id: serial("id").primaryKey(),
    email: varchar("email").notNull(),
    token: varchar("token").notNull(),
    code: varchar("code").notNull(),
    used: boolean("used").notNull().default(false),
    expiresAt: timestamp("expires_at").notNull(),
    createdAt: timestamp("created_at").notNull().defaultNow(),
  },
  (table) => {
    return {
      tokenIndex: index().on(table.token),
      codeIndex: index().on(table.code),
      emailIndex: index().on(table.email),
    }
  }
)

The key fields:

token for the link in the email
code for manual entry
used so the link can only be used once
expiresAt so old links stop working

I also track how the email first came in:

export const emailSourceEnum = pgEnum("email_source", [
  "googleLogin",
  "magicLink",
])

This is not required for authentication. I keep it because it helps later. I can tell whether a user came from Google Sign In or Magic Link, and I can use that when debugging support issues or looking at conversion.

Sending the Magic Link

app.post(
  "/auth/magic-link/send",
  {
    preHandler: [zodValidateBody(sendMagicLinkSchema)],
  },
  sendMagicLink
)

The schema only needs an email:

const sendMagicLinkSchema = z.object({
  email: z.string().email(),
})

The controller normalizes the email, creates a random token, creates a 6-digit code, and stores both with a 15-minute expiry.

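import crypto from "node:crypto"

function generateToken(): string {
  // 32 random bytes, hex-encoded: a 64-character URL-safe token
  return crypto.randomBytes(32).toString("hex")
}

function generateCode(): string {
  // randomInt's upper bound is exclusive, so this always yields 6 digits
  return crypto.randomInt(100000, 1000000).toString()
}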

const email = normalizeEmail(rawEmail)
const token = generateToken()
const code = generateCode()
const expiresAt = new Date(Date.now() + 15 * 60 * 1000)

await dbService.db().insert(magicLinks).values({
  email,
  token,
  code,
  expiresAt,
})
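The normalizeEmail helper is not shown in this post. A minimal sketch of what such a helper might do, assuming normalization means trimming whitespace and lowercasing; the real implementation may do more or less than this.

```typescript
// Assumed behavior: trim surrounding whitespace and lowercase the address,
// so "Alice@Example.com " and "alice@example.com" map to the same user row.
function normalizeEmail(rawEmail: string): string {
  return rawEmail.trim().toLowerCase()
}
```

Whatever the exact rules are, the same function has to be used when sending the link, verifying it, and looking up the user, or the find-or-create step can produce duplicate accounts.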

The link uses the configured frontend hostname:

const magicLinkURL = `${env.FRONTEND_HOSTNAME}/magic-link-verify?token=${token}`

I don’t hardcode production URLs in the auth code. Local dev, staging, and production all need to send different links.

The email includes both the link and the code:

Click the link below to sign in to myog.social:

https://myog.social/magic-link-verify?token=...

Or enter this verification code on the sign-in page:

123456

This link and code will expire in 15 minutes.

The 6-digit code looks like a small detail, but it matters.

Some people open email on their phone and the app on their laptop. Some corporate email tools visit links before the user sees them. Some browsers get weird with logged-in state across profiles. A code gives the user another path without adding another auth system.

Verifying the Link

The verify endpoint accepts either a token or a code.

const verifyMagicLinkSchema = z
  .object({
    token: z.string().optional(),
    code: z.string().optional(),
  })
  .refine((data) => data.token || data.code, {
    message: "Either token or code must be provided",
  })

For a token, I look up a matching record that has not been used and has not expired.

const results = await dbService
  .db()
  .select()
  .from(magicLinks)
  .where(
    and(
      eq(magicLinks.token, token),
      eq(magicLinks.used, false),
      gt(magicLinks.expiresAt, now)
    )
  )
  .limit(1)

const magicLink = results[0]

The code path is the same shape, just eq(magicLinks.code, code) instead of the token check.

If there is no match, the answer is deliberately vague:

return reply.status(400).send({ error: "Invalid or expired link/code" })

No need to tell the caller whether the token existed, expired, or was already used.

When there is a match, mark it used.

await dbService
  .db()
  .update(magicLinks)
  .set({ used: true })
  .where(eq(magicLinks.id, magicLink.id))

I would wrap this in a transaction if I were rebuilding it today. The practical behavior is still fine for my current products, but the stricter version is better: find the row, mark it used, create or fetch the user, all as one unit.
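The intended semantics can be sketched without a database. This is an in-memory stand-in, not the production code: the names MagicLinkRecord and claimMagicLink are hypothetical, and in a real database the same idea would be a transaction or a single conditional update that only succeeds when used is still false.

```typescript
// Sketch: "find and mark used" as one step, so a link can be claimed only once.
interface MagicLinkRecord {
  id: number
  email: string
  token: string
  used: boolean
  expiresAt: Date
}

// Returns the record if it was unused and unexpired, marking it used before
// returning. Returns null for every failure so callers cannot tell which check failed.
function claimMagicLink(
  records: MagicLinkRecord[],
  token: string,
  now: Date
): MagicLinkRecord | null {
  const record = records.find(
    (r) => r.token === token && !r.used && r.expiresAt.getTime() > now.getTime()
  )
  if (!record) return null
  record.used = true
  return record
}
```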

Then create the session.

const jwtToken = await reply.jwtSign({ email })
const creditsInfo = calculateCreditsInfo(user)

return reply.send({
  token: jwtToken,
  user: {
    id: user.id,
    email: user.email,
    customerID: user.customerID,
    accountHint,
    hasPaidSubscription,
  },
  credits: creditsInfo,
})

I keep the JWT payload small. The frontend gets the user object in the response, but the token only needs enough identity for authenticated API requests.

The Frontend Has Two States

The Vue page has two states.

First: enter email.

<input
  id="email"
  v-model="email"
  type="email"
  placeholder="you@example.com"
  @keyup.enter="sendMagicLink"
/>

<button @click="sendMagicLink">
  Send Magic Link
</button>

After the email is sent, it switches to the code state.

<input
  v-for="(digit, index) in codeDigits"
  :key="index"
  v-model="codeDigits[index]"
  type="text"
  inputmode="numeric"
  pattern="[0-9]*"
  maxlength="1"
  @paste="handleCodePaste"
/>

The paste handler strips non-digits and verifies automatically when it gets 6 digits.

const pastedData = event.clipboardData?.getData("text") || ""
const digits = pastedData.replace(/\D/g, "").slice(0, 6)

for (let i = 0; i < 6; i++) {
  codeDigits.value[i] = digits[i] || ""
}

if (digits.length === 6) {
  await verifyCode()
}

Nothing fancy, but it removes friction. People paste codes.

The email link goes to a separate verify page:

onMounted(async () => {
  const token = route.query.token as string

  if (!token) {
    errorMessage.value = "Invalid magic link"
    isVerifying.value = false
    return
  }

  const result = await appStore.verifyMagicLink({ token })
  if (result.success) {
    void router.push("/")
  } else {
    errorMessage.value = result.error || "Failed to verify magic link."
    isVerifying.value = false
  }
})

Click link, verify token, store session, go to the app. That’s it.

Pinia Owns the Session

The frontend store has the usual auth state:

const user = ref<User | null>(null)
const jwtToken = ref<string | null>(null)
const credits = ref<CreditsInfo | null>(null)

const isAuthenticated = computed(() => !!user.value && !!jwtToken.value)

Sending the Magic Link is just a POST to /auth/magic-link/send.

Verifying stores the returned JWT and user:

jwtToken.value = data.token
user.value = data.user
credits.value = data.credits || null

localStorage.setItem(JWT_STORAGE_KEY, data.token)
localStorage.setItem(USER_STORAGE_KEY, JSON.stringify(data.user))

The rest of the app only asks whether the store has a session.

if (!appStore.isAuthenticated) {
  await appStore.restoreSession()
}

Protected routes do not need to know whether the user came through Magic Link or Google.
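The restoreSession step is not shown here. A rough sketch of the storage-reading part, assuming it mirrors the two localStorage writes above. The key values, the KeyValueStorage interface, and readStoredSession are all hypothetical; the storage is passed in so the function also runs outside a browser.

```typescript
// Sketch: read back what the verify step stored. Key values are assumptions.
const JWT_STORAGE_KEY = "jwt"
const USER_STORAGE_KEY = "user"

// Minimal subset of the browser Storage API that this sketch needs.
interface KeyValueStorage {
  getItem(key: string): string | null
  removeItem(key: string): void
}

interface StoredSession<User> {
  token: string
  user: User
}

function readStoredSession<User>(storage: KeyValueStorage): StoredSession<User> | null {
  const token = storage.getItem(JWT_STORAGE_KEY)
  const rawUser = storage.getItem(USER_STORAGE_KEY)
  if (!token || !rawUser) return null

  try {
    return { token, user: JSON.parse(rawUser) as User }
  } catch {
    // Corrupt JSON: clear both keys so the app falls back to signed-out.
    storage.removeItem(JWT_STORAGE_KEY)
    storage.removeItem(USER_STORAGE_KEY)
    return null
  }
}
```

A restored token is only a claim until the backend accepts it on the next authenticated request.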

Google Sign In sits beside Magic Link in the dialog.

<div id="googleSignInButton"></div>

<Button @click="goToMagicLink()" variant="outline">
  Sign in with Magic Link
</Button>

The frontend loads Google Identity Services, renders Google’s button, receives an ID token, and sends it to the backend.

const success = await store.loginWithGoogle(response.credential)

The backend verifies the ID token with Google, extracts the email, and then follows the same find-or-create-user shape.

const ticket = await client.verifyIdToken({
  idToken,
  audience: env.GOOGLE_CLIENT_ID,
})

const payload = ticket.getPayload()
const email = normalizeEmail(payload.email)

I like this split:

Google is fast for people with Google accounts
Magic Link works for everyone else
both return the same app session

I don’t want password auth unless I have a specific reason to add it. Passwords mean reset flows, breach concerns, password manager weirdness, and another thing for users to maintain. Email-based auth is enough for the products I build.

This auth flow is part of Stacknaut. I extracted it from the products I actually run.

DevSnoop — Browser Access for Coding Agents

2026-04-23T10:22:00+00:00

I use coding agents — Claude Code, Codex, Cursor — for most of my development. They’re good at reading and writing code. They’re not good at seeing what’s happening in the browser. The agent generates frontend code and has no way to check if a button is in the right place, if a form submits correctly, or if there’s a console error after the page loads.

The existing options — Chrome’s DevTools MCP, its CLI, Browser Tools MCP — all work. But they load dozens of tools into the agent’s context, return data the agent has to translate before it can act, and cost more tokens than I’d like for simple tasks. I don’t need all the tools they offer. I wanted a small, reliable subset.

So I built DevSnoop.

What it does

DevSnoop is a Chrome extension that gives coding agents direct access to the browser. The agent sends an HTTP request to localhost:9400, the extension executes on the live page, and structured JSON comes back.

curl -s -X POST http://127.0.0.1:9400/ \
  -H 'Content-Type: application/json' \
  -d '{"command":"page_summary","params":{"depth":3}}'

One HTTP call, structured JSON back. No MCP server, no DevTools protocol.
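The same call from a script might look like this. A minimal sketch, assuming only the request shape visible in the curl example; the helper name buildDevSnoopRequest is hypothetical, and nothing here describes the response format.

```typescript
// Sketch: build the same POST the curl example sends, for use with fetch().
interface DevSnoopRequest {
  url: string
  init: { method: "POST"; headers: Record<string, string>; body: string }
}

function buildDevSnoopRequest(
  command: string,
  params: Record<string, unknown> = {}
): DevSnoopRequest {
  return {
    url: "http://127.0.0.1:9400/",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ command, params }),
    },
  }
}
```

With the extension and native host running, the request would be sent with fetch(url, init) after destructuring the result of buildDevSnoopRequest("page_summary", { depth: 3 }).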

18 commands — page inspection, click/fill/scroll/hover, console logs, network requests, screenshots, DOM diffing, tech stack detection, accessibility audits. I kept the surface area small on purpose.

Why not the existing tools?

Chrome DevTools MCP loads a large tool definition into the agent’s context window. DevSnoop’s skill file is a single markdown document with curl examples. The agent learns the full API in one read.

I ran a comparison on a real page — the admin panel of one of my projects. DevSnoop’s setup cost was under 2,000 tokens. Chrome DevTools CLI/MCP was closer to 5,000. For the actual task (understanding the page and its interactive elements), DevSnoop returned a compact summary with 30 interactive targets. Chrome DevTools returned a deeper accessibility tree, but at higher token cost.

Then there’s actionability. DevSnoop’s page_summary returns each interactive element with a CSS selector the agent can pass directly to click or fill:

label: "Refresh Followers" → selector: "body > div > button:nth-of-type(2)"

Chrome DevTools returns accessibility-tree nodes. The agent gets a label, but needs an extra step to map it back to something it can interact with. DevSnoop skips that step — the response is already in the shape the agent needs for the next action.

The tradeoff: Chrome DevTools gives richer semantic understanding of page structure. DevSnoop gives the agent what it needs to act. For my workflow — build something, verify it in the browser, fix what’s wrong — acting is what matters.

How it works

Three pieces:

Chrome extension — runs on the page, executes commands, captures logs and network requests via Chrome’s debugger API
Native host — a compiled binary (no Node/Bun/npm needed on your machine) that bridges HTTP to Chrome’s native messaging
Skill file — a markdown doc that teaches the agent all 18 commands with examples

Install is one line:

curl -fsSL https://devsnoop.com/install.sh | bash

Works with any coding agent that can make HTTP requests — Claude Code, Cursor, Windsurf, Cline, Aider. No special integration needed. Chrome-based, so it works with Arc, Brave, and Edge too.

What I use it for

Mostly verification. I make changes, the agent checks the browser to see if they worked. A typical flow:

page_summary to understand the current state
click or fill to interact with something
get_logs to check for errors
screenshot to see the result
diff to track what changed after a hot reload

The diff command is useful — first call takes a baseline, second call compares against it. The agent can make code changes, wait for hot reload, then check what actually changed in the DOM without re-reading the full page.

Pricing

$29, one-time. Launch pricing. No subscription, no usage limits. Works on macOS and Linux (Windows planned).

DevSnoop is on the Chrome Web Store today.

Writing Coding-Agent Skills for External Services

2026-04-23T06:26:00+00:00

Most of my coding-agent skills so far have been for workflow automation — committing, deploying, reviewing. But I recently wrote two skills for external services — betterstack-log-export and postmark-setup — and they turned out to work differently.

Write skills during the task, not before

betterstack-log-export started when I was tracking down why a weekly email report was failing. I needed to filter Better Stack logs by time range and fields like from and level. I figured out the right approach — both the manual UI export and the curl/ClickHouse query — and set an in-session reminder to turn it into a skill.

postmark-setup was even more direct. I was deploying DevSnoop and needed Postmark servers. I gave the agent my account token, told it to follow the pattern from my existing betterstack-source-setup skill, and the new skill was written and used in the same session.

I wrote both during or right after the actual task, which meant I already knew what to encode instead of guessing at it.

What to put in the skill

Most of the time goes into deciding what belongs in the skill.

For betterstack-log-export, what mattered:

UI path for one-offs: Telemetry → filter view → gear → Download → NDJSON
curl path for repeatable queries: Integrations → Connect ClickHouse HTTP client
EU/US cluster split — which of my projects are on which cluster
Collection names — t337893_theblue_logs, t337893_theblue_s3, etc.
Credential locations: ~/.config/betterstack/eu.env and ~/.config/betterstack/us.env
Query pattern: UNION ALL over hot + archived logs — not obvious from the docs
JSONExtract patterns for nested fields (JSONExtractRaw for nested objects)
Concurrency limit: Standard tier allows 4 concurrent log queries, so retry on failures

None of this is in the Better Stack docs as a package. It’s my account topology, my file layout, and the things I discovered by doing it once.

For postmark-setup, different knowledge but the same idea:

Create both dev and prod servers, not just one
Naming convention: dev and prod
The returned Server API token works as both SMTP username and password — not obvious
Update .env for dev, .env.kamal for prod
Remind about sender signature/domain verification in the Postmark UI

That last one is easy to forget. Without it in the skill, the agent finishes the API work, everything looks fine, and then email silently fails in production.

Include both the UI path and the API path

betterstack-log-export covers two ways to get at logs:

Manual/one-off: filter in the UI and download NDJSON from the gear menu
Repeatable/scripted: query the ClickHouse HTTP API with curl

I discovered the UI download first, before reaching for the API. It’s faster when you just need to eyeball what happened. The curl path is for reproducible filters, row counts, or anything you might run again.

An agent that only knows the API will always reach for curl. Sometimes the right answer is downloading 200 rows from the UI and piping them through jq. Encoding both paths means the agent can pick the right one.

Keep credentials outside the skill

Both skills store credentials under ~/.config// and source them at runtime. Nothing in the skill body, nothing in the repo.

source ~/.config/betterstack/eu.env

Skills are text files — they can end up in shared directories, version control, or another agent’s context. Keeping credentials out means you don’t have to think about it when you share or reorganize skills.

The pattern came from betterstack-source-setup, which predates the log-export skill. When I wrote postmark-setup, I followed the same layout: ~/.config/postmark/api.env, sourced at runtime.
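The skills themselves use the shell's source. For a script that needs the same values without a shell, a small parser is enough. This is a sketch, assuming plain KEY=value lines with optional export prefixes and optional surrounding quotes; parseEnvFile is a hypothetical name, and it does not handle shell expansion or multi-line values.

```typescript
// Sketch: parse simple KEY=value env files without invoking a shell.
function parseEnvFile(text: string): Record<string, string> {
  const values: Record<string, string> = {}

  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim()
    if (line === "" || line.startsWith("#")) continue // skip blanks and comments

    const match = line.match(/^(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)=(.*)$/)
    if (!match) continue

    let value = match[2].trim()
    // Strip one pair of matching surrounding quotes, if present.
    const first = value[0]
    if (value.length >= 2 && (first === '"' || first === "'") && value[value.length - 1] === first) {
      value = value.slice(1, -1)
    }
    values[match[1]] = value
  }

  return values
}
```

The file contents would come from reading the credentials file at runtime, so the secret still never appears in the skill or the repo.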

Trim it down

The first version of betterstack-log-export was much longer than the current one. More explanation, more examples than the agent would ever need. Most of that was useful while I was figuring things out, but the agent at runtime just needs to know which cluster to hit, where the credentials are, and what SQL pattern to use.

I trimmed it in the same session. The current version has one section each for the quick path, clusters, credentials, query pattern, and guardrails.

What the agent can’t get from docs

The agent can read Better Stack’s API docs or Postmark’s SMTP reference on its own. What it can’t figure out is which cluster my project is on, what naming convention I use for servers, or that the returned API token doubles as the SMTP password. That’s the knowledge that gets lost between sessions and re-discovered every time — unless a skill holds it.

Use the skill immediately

postmark-setup was used in the same session it was written: I set up DevSnoop’s dev and prod servers with it right away. If the skill had been wrong or incomplete, I’d have found out on the spot.

betterstack-log-export was used within a few sessions, running real queries against real log data. The JSONExtract patterns were refined once during actual use. The first version is a starting point — it gets better when you exercise it.

Claude Code vs Cursor vs Codex: Which AI Agent Should You Use?

2026-04-16T05:22:00+00:00

I use Claude Code, Droid, and Codex daily across all my projects. I also ship a SaaS starter kit — Stacknaut — that comes with an AGENTS.md pre-configured for coding agents.

These are practical daily-use observations from working with all three agents on production codebases with skills, project instructions, and defined conventions.

Claude Code

Claude Code is the most capable agent I use for working with a structured codebase. It reads CLAUDE.md (which I point to AGENTS.md via @AGENTS.md) at session start and follows instructions consistently.

What it does well:

Follows AGENTS.md conventions reliably — coding style, commit format, tool preferences
Reads and understands project structure quickly. Greps for patterns, reads relevant files, builds a mental map
Handles multi-step tasks well — “add a new API endpoint with tests, types, and a frontend page” works in a single prompt
Git integration is solid — commits, branches, diffs
Skills work naturally. Trigger a skill, the prompt gets injected, the agent follows it

Where it struggles:

Context window fills up fast on large codebases. Long sessions degrade. I start fresh sessions often
Sometimes over-reads files — pulls in more context than needed, burning through the window
Can be cautious about running commands, asking for approval when I’d prefer it just goes ahead. I use --dangerously-skip-permissions for trusted projects. I used to route Claude Max through Droid via CLIProxyAPI partly because Droid has a stronger permission system — but Anthropic has since restricted Claude Max use with third-party tools

With a structured codebase: Claude Code handles Stacknaut’s monorepo structure (frontend/backend/shared) naturally. It understands the path aliases, knows to run type-check in both packages, and follows the Drizzle ORM patterns without me repeating the rules. The pre-configured AGENTS.md means the first session on a new project based on Stacknaut is already productive — no warmup needed.

Codex

Codex is OpenAI’s open source agent. I use it for reviewing code that Claude Code wrote, bug fixing, and tackling tasks where I want a different perspective.

What it does well:

Fast for targeted tasks. “Review this file for issues” or “refactor this function” — Codex is snappy
Good at catching things other agents missed. Different model, different blind spots
Reads AGENTS.md and follows basic conventions
The sandbox is a nice safety net — commands run in an isolated environment by default
Open source, so I can see exactly what it’s doing

Where it struggles:

Less capable at multi-step agentic workflows than Claude Code. It handles simpler task chains better than complex ones
Skills work but less polished than Droid and Claude Code — I share skills across all three, though Codex sometimes needs more nudging to follow them
The sandbox, while safe, sometimes prevents it from doing things I want — accessing the network, running the dev server, interacting with Docker

With a structured codebase: Codex works well for targeted edits within Stacknaut — fixing a bug, adding a field, updating a component. For bigger tasks like “add a new billing plan with Stripe integration across frontend, backend, and shared types,” I reach for Claude Code. Codex tends to need more prompting to coordinate across a monorepo.

Cursor

Cursor is the best IDE-embedded agent, but I keep coming back to terminal agents.

What it does well:

Tab completion is genuinely good for small, predictive edits while you’re actively writing code
Inline diffs are nice to review — you see the changes in context without switching tools
Reads project rules (.cursor/rules, AGENTS.md) and follows conventions
The Composer/Agent mode handles multi-file edits within the IDE
Background agents and Bugbot for automated tasks

Where it struggles:

Primarily an editor experience. Cursor has a CLI and background agents now, but the core workflow is still VS Code. I use WebStorm and Neovim — Cursor means giving those up
Parallel sessions are less natural. Background agents help, but with terminal agents I run 3-5 in tmux and coordinate between them. That’s harder to replicate in an editor
Project rules work for conventions, but there’s no skills system like Claude Code or Droid have — small, portable prompts I can trigger on demand and share across agents

With a structured codebase: Cursor handles a starter kit fine for single-file edits. Where it falls short is the agentic workflow I actually use — having an agent autonomously implement a feature across the monorepo, run tests, check types, fix errors, and commit. That workflow needs a terminal agent that can loop independently.

Droid

Droid was my primary agent. It reads both AGENTS.md and CLAUDE.md, supports skills and custom droids, and has good context management.

What it does well:

Model-agnostic — I can use different models for different tasks
Skills and custom droids work well for repeatable workflows. I have droids for review, exploration, and specific project tasks
Spec mode lets me plan before coding — useful for complex features where I want to review the approach before the agent starts writing
Sub-agents via the Task tool — delegate subtasks to separate instances
Good at following AGENTS.md conventions, especially with project-specific custom droids

Where it struggles:

Newer than Claude Code, so the ecosystem and documentation are still growing
Some rough edges in session management compared to Claude Code

With a structured codebase: Droid works particularly well with Stacknaut because I can create project-specific droids that know the codebase deeply. A custom droid configured for “add a new API endpoint” knows the exact file patterns, the route structure, the type definitions, and the test setup. It goes beyond what AGENTS.md alone provides.

How I Actually Use Them Together

I don’t pick one agent. I use all three:

Claude Code for primary development — implementing features, working through complex tasks, using skills for commit/review/deploy workflows
Droid for an alternative perspective and when I want spec mode or custom droids for specific workflows
Codex for review — I have it check what the other agents wrote. Different model catches different issues

The shared AGENTS.md means all three agents follow the same conventions. The code they produce is consistent regardless of which agent wrote it. That’s the whole point of having project instructions — it normalizes the output across agents.

Which Should You Pick?

If you’re working with a structured codebase — starter kit or not — start with Claude Code. It’s the most capable, most polished, and the AGENTS.md support is mature.

Add Codex as a reviewer. Having a second agent review the first agent’s work is one of the most reliable quality improvements I’ve found.

If you want skills and custom agents, try Droid. Project-specific droids that know your exact patterns go beyond what AGENTS.md alone provides.

Cursor is fine if you prefer staying in VS Code. It’ll follow your project conventions. But you’ll miss the composability and parallel sessions of terminal agents.

Don’t Let the LLM Verify. Make It Build the Verifier.

2026-04-11T03:30:00+00:00

Someone posted about asking Claude to generate an HTML report from a JSON file, then spawning 4 agents to test it. All 4 reported success. Manual testing showed 60%+ failure — hallucinated selectors, fake IDs, wrong values.

When you ask an LLM to “check if this is correct,” it predicts what a correct-sounding check looks like. That’s not the same as actually checking.

What I do instead

I tell the agent to write a script that performs the check, then run the script. Not through the LLM — just execute it.

The obvious examples: lint and formatting. I don’t ask Claude “is this formatted correctly?” I have it run eslint and prettier. The tool tells me if it passes or not.

This works for ad-hoc checks too. Say I need to verify an HTML report pulls the right values from a JSON source. I tell Claude to write a script that parses the JSON, queries the DOM with a real parser, compares expected vs. actual, and prints mismatches. Then run it. Same input, same output every time.
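A verifier of that kind might look like the sketch below. It assumes the report marks each value with a `data-field` attribute whose name matches a key in the JSON, and that each marked element contains only text. Both are conventions invented for this example; a real report would need its own selectors.

```python
import json
from html.parser import HTMLParser

class FieldCollector(HTMLParser):
    """Collect the text of elements that carry a data-field attribute."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        field = dict(attrs).get("data-field")
        if field is not None:
            self._current = field
            self.fields[field] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.fields[self._current] += data

    def handle_endtag(self, tag):
        # Assumes marked elements hold text only, so any end tag closes the field
        self._current = None

def find_mismatches(source_json, report_html):
    """Return (key, expected, actual) for every JSON value the report gets wrong."""
    expected = json.loads(source_json)
    collector = FieldCollector()
    collector.feed(report_html)
    collector.close()
    mismatches = []
    for key, value in expected.items():
        actual = collector.fields.get(key)
        if actual is None:
            mismatches.append((key, str(value), "<missing>"))
        elif actual.strip() != str(value):
            mismatches.append((key, str(value), actual.strip()))
    return mismatches

source = '{"total": 42, "owner": "alice", "region": "eu"}'
report = '<table><tr><td data-field="total">42</td><td data-field="owner">bob</td></tr></table>'

for key, expected, actual in find_mismatches(source, report):
    print(f"{key}: expected {expected!r}, got {actual!r}")
```

The parser either finds the element or it does not. There is nothing for the model to predict once the script is running.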

The pattern applies anywhere there’s a ground truth to check against — data validation, math, DOM structure, spelling, broken links. All of these have real tools or can be checked with a short script. The LLM writes the script. The script does the verification.

It’s also cheaper. The script runs without burning tokens, and once you have it, you can improve it and rerun it as many times as you want. Having the LLM re-check means paying for a fresh prediction every time, and the answer can change from one run to the next.

Why more agents don’t help

Four agents running the same prediction process give you four predictions. If the model hallucinates a selector, it hallucinates it four times. More agents just means synchronized fiction.

Agents help when each one runs a real tool and reports the output. The agent orchestrates. The tool verifies.

Two Ways to Direct Coding Agents

2026-04-11T03:21:00+00:00

I work with coding agents in two modes. For larger features, I write a detailed spec before any code. For smaller tasks, I skip the spec and set guardrails instead. Both work — they solve different problems.

Full Spec

For new features or anything with non-obvious architectural decisions, I write everything out first — data flow, DB schema, API shape, edge cases. I have a spec skill that interviews me about all of this before any code gets written. What comes out is a document like:

## Data Model
- users table with email, hashed_password, created_at
- sessions table with user_id, token, expires_at
- No soft deletes — hard delete on account removal

## API
- POST /auth/register — validate, hash, insert, return session token
- POST /auth/login — verify, create session, return token
- Auth middleware checks session token on protected routes

## Edge Cases
- Duplicate email returns 409, not a generic error
- Expired sessions return 401, frontend sends user to login

The agent follows this. It doesn’t get to suggest alternatives or rethink things during execution. This eliminates drift — there’s no room to wander.

Guardrails Instead of a Spec

For smaller tasks — bug fixes, refactors, wiring up something that follows an existing pattern — I skip the full spec. I describe the problem and add constraints. A real prompt looks like:

The /auth/login endpoint returns 500 when the email doesn't exist.
It should return 401 with { error: "invalid_credentials" }.

Constraints:
- Do not modify the database schema
- Do not change more than 20 lines
- Keep changes in src/routes/auth.ts
- Run the existing auth tests after fixing

The constraints keep the agent from over-engineering it. “Do not modify the database schema” prevents the agent from deciding it needs a login_attempts table. “Keep changes in src/routes/auth.ts” stops it from refactoring the error handling across three files.

I couple these with validation — linting, type checks, tests — to catch anything that goes outside the boundaries.
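The boundaries themselves can be checked mechanically as well. A minimal sketch, assuming the input is the text printed by `git diff --numstat` (tab-separated added count, deleted count, and path per file) and that "changed lines" means added plus deleted. The function name and the second path in the usage are hypothetical.

```python
def check_guardrails(numstat, allowed_paths, max_changed_lines):
    """Return a list of guardrail violations for `git diff --numstat` output."""
    violations = []
    total = 0
    for line in numstat.splitlines():
        if not line.strip():
            continue
        added, deleted, path = line.split("\t", 2)
        if path not in allowed_paths:
            violations.append(f"touched file outside allowed paths: {path}")
        # Binary files report "-" for both counts; count them as zero lines here
        total += int(added) if added != "-" else 0
        total += int(deleted) if deleted != "-" else 0
    if total > max_changed_lines:
        violations.append(f"changed {total} lines, limit is {max_changed_lines}")
    return violations

# Hypothetical diff: the agent also edited a schema file
numstat = "15\t9\tsrc/routes/auth.ts\n4\t0\tsrc/db/schema.ts\n"
for problem in check_guardrails(numstat, {"src/routes/auth.ts"}, 20):
    print(problem)
```

An empty list means the change stayed inside the constraints from the prompt. Anything else is a reason to look at the diff before accepting it.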

When I Use Which

New system, multiple services, design decisions that affect the whole project? Full spec. A bug in a well-understood module, or a refactor that follows an existing pattern? Guardrails and go.

The gray area is medium-sized tasks — adding a feature to an existing system where the pattern is clear but there are a few decisions to make. I usually start with guardrails and tighten them if the agent drifts. If I find myself adding more than four or five constraints, that’s a sign I should write a spec instead.

Why I Self-Host My SaaS Apps

2026-04-07T04:16:00+00:00

I run four apps — MyOG.social, Stacknaut.com, AltCaption.com, and ToLocaltime.com — on a single ~$30/month Hetzner ARM server. TheBlue.social is older and still runs on Render. I used Heroku years ago, then Render for a couple of years. Both were fine for deployment. I switched to self-hosting for cost, control, and performance.

Cost

~$30/month for four apps with PostgreSQL, background workers, and the whole stack on one server.

The cost stays flat. More traffic doesn’t change my bill — I’m paying for the server, not per-request or per-seat. If I need more capacity, I bump to the next server tier. Still cheaper than any PaaS.

Control

On my server, I pick the OS, the Docker version, the PostgreSQL config, the backup schedule, the firewall rules. And I make those decisions once — they’re encoded in Terraform and Kamal configs. Spinning up a new project is the same setup, same commands. I’m not dependent on a platform’s roadmap or pricing decisions.

Performance

My app and database run on the same machine. Database queries are sub-millisecond — no network hop. On most PaaS setups, the database is a separate service with network overhead on every query.

The ~$30 server gives me dedicated vCPUs and RAM.

DevOps in 2026

Setting up a server in 2015 meant configuring Nginx, managing SSL certificates by hand, writing systemd service files, and hoping you didn’t miss a security update.

Now it’s Docker and Kamal. I define my app in a Dockerfile, my infrastructure in Terraform, and my deployment in a Kamal config. One command provisions the server. One command deploys the app. SSL is automatic via Let’s Encrypt through kamal-proxy.

I spend maybe 30 minutes a month on maintenance — OS updates, checking disk space, reviewing logs. My Telegram bot pings me if anything needs attention.

Scaling

A single Hetzner ARM server handles far more traffic than most indie SaaS apps will see. PostgreSQL on one server can do thousands of queries per second.

If I outgrow one server — a great problem to have — Kamal supports multi-server deployments. Move the database to its own box, add a second web server, and kamal-proxy load-balances between them.

What I Self-Host (and Don’t)

MyOG.social, Stacknaut.com, AltCaption.com, ToLocaltime.com — Vue frontends, Fastify APIs, PostgreSQL, background workers
Automated daily PostgreSQL dumps to object storage
Monitoring via Uptime Robot and PostHog

What I don’t self-host:

This blog — Jekyll on GitHub Pages, because it’s static and free
Email — transactional email goes through a service. Self-hosting email is a deliverability nightmare
CDN/edge — Cloudflare sits in front of the server for caching and DDoS protection

Getting Started

I wrote about the specific tools in detail:

Terraform setup — provision a Hetzner server with four files
Kamal deployment — zero-downtime deploys with one command

The whole thing took about two days the first time. Subsequent projects take an hour or two — and most of that time is setting up API keys for external services, not the hosting itself. After that, every deploy is kamal deploy.

Terraform for Indie Hackers: Just Enough Infrastructure as Code

2026-04-06T07:09:00+00:00

I use Terraform and Kamal 2 to provision and deploy my SaaS apps. The main reason is cost control — hosting one app on a PaaS like Render or Vercel can be fine, but it gets tough when I’m experimenting with a few of them and they don’t all make money yet.

Why Bother

With Terraform, the entire server setup is a file. Run terraform apply, get the exact same server. Same config, same firewall rules, same SSH keys.

The Minimal Setup

My Terraform config for a single Hetzner server running a SaaS app:

infra/
  main.tf          # provider and server resources
  variables.tf     # configurable values
  outputs.tf       # values to display after apply
  terraform.tfvars # actual values (not committed)

Four files. That’s it.

Provider Setup

terraform {
  required_providers {
    hcloud = {
      source  = "hetznercloud/hcloud"
      version = "~> 1.45"
    }
  }
}

provider "hcloud" {
  token = var.hcloud_token
}

The Server

resource "hcloud_server" "app" {
  name        = "myapp-prod"
  image       = "ubuntu-24.04"
  server_type = "cax21"
  location    = "fsn1"
  ssh_keys    = [hcloud_ssh_key.default.id]

  user_data = file("cloud-init.yml")
}

resource "hcloud_ssh_key" "default" {
  name       = "default"
  public_key = file("~/.ssh/id_ed25519.pub")
}

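cax21 is the ARM instance: 4 vCPU, 8GB RAM, ~€8/month. user_data loads cloud-init.yml, a cloud-init script that runs on first boot. The relative path resolves against the directory you run terraform from, so the file lives in infra/ next to the Terraform files.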

Cloud-Init: Server Bootstrap

#cloud-config
packages:
  - docker.io
  - docker-compose-v2 # Ubuntu's package name; docker-compose-plugin ships only in Docker's own apt repo
  - fail2ban
  - ufw

runcmd:
  - systemctl enable docker
  - systemctl start docker
  - ufw allow 22/tcp
  - ufw allow 80/tcp
  - ufw allow 443/tcp
  - ufw --force enable
  - fallocate -l 2G /swapfile
  - chmod 600 /swapfile
  - mkswap /swapfile
  - swapon /swapfile
  - echo '/swapfile none swap sw 0 0' >> /etc/fstab

Installs Docker, sets up the firewall (SSH, HTTP, HTTPS only), enables fail2ban, and creates swap. When the server boots, it’s ready for Kamal to deploy to.

Variables

variable "hcloud_token" {
  description = "Hetzner API token"
  type        = string
  sensitive   = true
}

The actual token goes in terraform.tfvars (never committed):

hcloud_token = "your-token-here"

Outputs

output "server_ip" {
  value = hcloud_server.app.ipv4_address
}

After terraform apply, it prints the server IP. I copy that into Kamal’s deploy.yml and deploy.
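The copy step can also be scripted. A minimal sketch, assuming the input is the text printed by `terraform output -json`, which wraps each output in an object with a `value` key. The function name and the sample address (from a documentation IP range) are illustrative only.

```python
import json

def server_ip_from_outputs(outputs_json, name="server_ip"):
    """Extract one output value from `terraform output -json` text."""
    outputs = json.loads(outputs_json)
    if name not in outputs:
        raise KeyError(f"output {name!r} not found; has terraform apply been run?")
    return outputs[name]["value"]

# Shape of the JSON that terraform prints for the output block above
sample = '{"server_ip": {"sensitive": false, "type": "string", "value": "203.0.113.10"}}'
print(server_ip_from_outputs(sample))
```

For a single server, copying the address by hand is just as reasonable. The script only starts to pay off once there are several projects to keep in sync.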

The Workflow

cd infra
terraform init      # first time only
terraform plan      # preview what will change
terraform apply     # create/update the server

terraform plan shows exactly what will be created, changed, or destroyed before you confirm.

For a fresh project:

terraform apply — creates the server
Copy the server IP to Kamal’s deploy.yml
kamal setup — first deploy, sets up kamal-proxy and containers
kamal deploy — subsequent deploys

DNS Records

I manage DNS through Cloudflare and add those records to Terraform too:

resource "cloudflare_dns_record" "app" {
  zone_id = var.cloudflare_zone_id
  name    = "myapp.com"
  content = hcloud_server.app.ipv4_address
  type    = "A"
  ttl     = 1 # 1 = automatic TTL
  proxied = true
}

resource "cloudflare_dns_record" "www" {
  zone_id = var.cloudflare_zone_id
  name    = "www"
  content = "myapp.com"
  type    = "CNAME"
  ttl     = 1 # 1 = automatic TTL
  proxied = true
}

terraform apply creates the server and points the domain at it. One command.

Firewall Rules

Hetzner has cloud firewalls, also manageable via Terraform:

resource "hcloud_firewall" "app" {
  name = "app-firewall"

  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "22"
    source_ips = ["0.0.0.0/0", "::/0"]
  }

  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "80"
    source_ips = ["0.0.0.0/0", "::/0"]
  }

  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "443"
    source_ips = ["0.0.0.0/0", "::/0"]
  }
}

resource "hcloud_firewall_attachment" "app" {
  firewall_id = hcloud_firewall.app.id
  server_ids  = [hcloud_server.app.id]
}

Defense in depth: the cloud firewall blocks traffic before it reaches the server, and UFW on the server is the second layer.

State Management

Terraform tracks what it created in a state file (terraform.tfstate). This maps your config to real resources — it knows hcloud_server.app is server ID 12345678 on Hetzner.

For one or two servers, the local state file is fine. Keep it out of version control (it contains sensitive data) and back it up. Lose the state file and Terraform doesn’t know what it created — you’d have to import resources manually or start fresh.
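To make the mapping concrete, here is a sketch that lists resource addresses and their provider IDs from a state file. It assumes the version 4 JSON layout (`resources`, each with `instances` and `attributes`), which is an internal format and can change; `terraform state list` and `terraform show -json` are the supported ways to inspect state. The sample is a heavily trimmed, made-up state.

```python
import json

def resource_ids(state_json):
    """Map resource addresses (type.name) to provider IDs found in a state file."""
    state = json.loads(state_json)
    ids = {}
    for resource in state.get("resources", []):
        if resource.get("mode") != "managed":
            continue  # skip data sources
        address = f'{resource["type"]}.{resource["name"]}'
        for instance in resource.get("instances", []):
            # With count/for_each the last instance wins; fine for single-instance resources
            ids[address] = instance.get("attributes", {}).get("id")
    return ids

sample_state = json.dumps({
    "version": 4,
    "resources": [
        {
            "mode": "managed",
            "type": "hcloud_server",
            "name": "app",
            "instances": [{"attributes": {"id": "12345678", "ipv4_address": "203.0.113.10"}}],
        }
    ],
})

print(resource_ids(sample_state))
```

This is the information that disappears with a lost state file, which is why the backup matters more than the size of the setup suggests.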

For remote state shared across machines, Terraform supports S3-compatible backends. Cloudflare R2 works:

terraform {
  backend "s3" {
    bucket = "terraform-state"
    key    = "myapp/terraform.tfstate"
    region = "auto"
    endpoints = {
      s3 = "https://ACCOUNT_ID.r2.cloudflarestorage.com"
    }
    skip_credentials_validation = true
    skip_metadata_api_check     = true
    skip_requesting_account_id  = true
    skip_region_validation      = true
    skip_s3_checksum            = true
    use_path_style              = true
  }
}

I keep the state file locally and back it up. Remote state is more complexity than I need.

What I Don’t Use Terraform For

Application deployment — that’s Kamal’s job
Database management — PostgreSQL runs in a Docker container managed by Kamal
SSL certificates — Let’s Encrypt via kamal-proxy, also Kamal
Monitoring — I stream logs to BetterStack

Terraform provisions the box. Everything that runs on it is managed by other tools.

Getting Started

Install Terraform (brew install terraform on macOS)
Get a Hetzner API token from the Cloud Console
Create the four files above, adjusted for your server type and SSH key
Run terraform init && terraform apply
Point your domain at the server IP
Deploy your app with Kamal

That’s it. One server, one config, one command. Add complexity later if you need it — but for a SaaS serving hundreds or thousands of users, this is more than enough.
