I build Personal AI Operating Systems: memory → identity → agents → evaluation → public proof.
If you are deciding whether to follow or work with me, the short version is:
I turn AI-agent ideas into inspectable systems: captured knowledge, explicit identity, real tool use, eval gates, and public proof that compounds over time.
| Visitor question | What this profile should prove |
|---|---|
| Can he build? | Public repos, working demos, diagrams, tests, and writeups instead of claims. |
| Does he understand agents deeply? | Agent architecture research, runtime plumbing, and trace-first evaluation. |
| Is there a coherent direction? | Every project rolls up into one Personal AI OS thesis, not random side projects. |
| Should I follow? | Follow if you care about AI agents becoming reliable personal/work infrastructure, not just chat UI demos. |
point: individual tools and experiments
line: memory → identity → agents → evals → proof
surface: repos, writing, demos, resume, and operating loops reinforce each other
body: Steven as one inspectable AI-native builder entity
| Layer | Repository / Surface | Role in the system | Public conversion asset |
|---|---|---|---|
| Memory | knowledge-harness (local hardening before public release) | Routes agents through an Obsidian-backed knowledge base without mixing content and runtime state. | Architecture note + public-safe demo pending. |
| Identity | digital-twin | File-first operating layer for making agents inherit style, judgment, memory, and reusable workflows. | README, docs site, operating-layer essay. |
| Agent runtime | Hermes / OpenClaw contributions | Real assistant plumbing: messaging, Feishu threads, gateway/runtime debugging, OAuth, CLI backends, tools. | Issue/PR writeups and debugging notes pending. |
| Model + tool infra | CLIProxyAPI (public surface pending cleanup) | Normalizes model access and CLI-compatible API infrastructure. | Cleanup checklist + launch note pending. |
| Evaluation | agent-scorecard | Trace-first quality gate for deciding whether agents deserve more tokens, permissions, and autonomy. | Runnable examples + reports. |
| Public proof | stevenchouai.github.io · resume system | Converts repos, essays, demos, and job-market artifacts into one navigable proof chain. | Homepage, blog, proof-chain page. |
- Agent Scorecard — deterministic checks for tool use, verification, durable artifacts, side-effect safety, and anti-busywork behavior.
- Claude Code Sourcemap — source-map-based architecture guide for AI coding agents.
- Hermes / OpenClaw work — practical runtime and gateway fixes across real assistant stacks, not toy demos.
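To make "deterministic checks" concrete, here is a minimal sketch of what a trace-first gate could look like. It is not Agent Scorecard's actual API or trace format: the event types (`tool_call`, `verification`, `artifact_written`, `side_effect`), the field names, and the `max_calls_per_artifact` threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    name: str
    passed: bool


def score_trace(events, max_calls_per_artifact=10):
    """Run deterministic checks over a list of trace events.

    Hypothetical schema: each event is a dict with a "type" key.
    The event names and fields are illustrative assumptions only.
    """
    tool_calls = [e for e in events if e["type"] == "tool_call"]
    verifications = [e for e in events if e["type"] == "verification"]
    artifacts = [e for e in events if e["type"] == "artifact_written"]
    side_effects = [e for e in events if e["type"] == "side_effect"]

    return [
        # The agent actually used a tool instead of only talking.
        CheckResult("tool_use", len(tool_calls) > 0),
        # At least one verification step ran and passed.
        CheckResult(
            "verification",
            any(e.get("passed") is True for e in verifications),
        ),
        # Something durable (file, report, commit) was produced.
        CheckResult("durable_artifact", len(artifacts) > 0),
        # Every side effect was explicitly approved (vacuously true if none).
        CheckResult(
            "side_effect_safety",
            all(e.get("approved") is True for e in side_effects),
        ),
        # Tool activity must be proportionate to what was produced.
        CheckResult(
            "anti_busywork",
            len(tool_calls) <= max_calls_per_artifact * len(artifacts),
        ),
    ]


def passes_gate(results):
    """The gate is strict: every check must pass."""
    return all(r.passed for r in results)


example_trace = [
    {"type": "tool_call", "name": "run_tests"},
    {"type": "verification", "passed": True},
    {"type": "artifact_written", "path": "report.md"},
]

print(passes_gate(score_trace(example_trace)))  # True
```

Because every check is a pure function of the recorded trace, the same trace always yields the same verdict, which is what lets a gate like this decide whether an agent gets more tokens, permissions, or autonomy.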
| Project | Contribution | Status | Why it matters |
|---|---|---|---|
| NousResearch/hermes-agent | #21254 `fix(update): migrate config in non-interactive updates` (salvaged from my original #19221) | Merged · merge commit 8cef149 | Makes detached / gateway / non-interactive update flows safer by migrating config before restart. |
| NousResearch/hermes-agent | #17895 `fix(feishu): preserve threaded replies` | Open | Preserves Feishu/Lark threaded reply routing for real agent gateway conversations. |
| openclaw/openclaw | #75024 `fix(feishu): preserve threads without root_id` | Open · CI green-ish | Handles Feishu/Lark thread fallback behavior in another production-style agent runtime. |
- Digital Twin — a personal agent operating layer built around explicit files, skills, memory, and durable outputs.
- knowledge-harness (local hardening before public release) — CLI/runtime wrapper around an Obsidian LLM wiki.
- Input Copilot iOS (local proof) — capture → profile signal → radar → Obsidian export loop.
- Resume system (public writeup pending) — dual-track PM / Engineer resume workflow with local JD matching, AI tailoring, ATS review, and reproducible PDF output.
- ManageUp (archive / visibility pending) — MCP server + skill library for manager-facing reporting.
I do not expect followers to come from a prettier README alone. The loop has to be:
- Build useful primitives — agent memory, identity, runtime, eval, and proof-chain tools.
- Publish one concrete artifact per week — a repo improvement, demo, architecture note, benchmark, debugging case, or before/after workflow.
- Package each artifact into a small distribution unit — GitHub README update + blog note + X/LinkedIn thread + one clear screenshot/diagram.
- Route readers back to the proof chain — every post should answer: “what can I inspect or reuse now?”
- Measure retention signals — stars, follows, profile clicks, repo clones, comments, inbound DMs, and which pages cause people to continue reading.
| Week | Ship | Why it should help conversion |
|---|---|---|
| 1 | Add clear visitor CTA, proof map, and follower thesis to this profile. | People understand the category and why to follow within 10 seconds. |
| 2 | Harden one local proof repo into a public artifact or public writeup. | Converts “interesting private system” into inspectable evidence. |
| 3 | Publish one agent-eval case study using Agent Scorecard. | Shows judgment: not just building agents, but deciding when they deserve trust. |
| 4 | Turn one runtime/debugging win into a practical architecture note. | Attracts expert builders who follow for hard-earned implementation detail. |
Success metric: each shipped artifact should create a visible next click — repo → demo/report → essay → follow/contact. If it cannot create a next click, it is probably internal value, not public conversion value yet.
- Trace over vibes — logs, tests, files, reports, screenshots, and commits beat confident claims.
- Coherence over volume — every project should explain its upstream and downstream role.
- Small tools before dashboards — build the proof loop before polishing the surface.
- AI as leverage, not theater — an agent earns autonomy only by producing verified artifacts.
- Claude Code source deep dive
- Digital Twin operating layer
- AI reshapes engineering SDLC
- Public proof chain
- Website: stevenchouai.github.io
- GitHub: github.com/stevenchouai
- Follow for: AI agents · evaluation · personal knowledge systems · product engineering · public proof chains


