Sapien (@BuildOnSapien) / X

4,181 posts

Sapien

@BuildOnSapien

Building Proof of Quality - Verifiable quality signals for AI

Anywhere

Joined May 2024

Pinned
Sapien
@BuildOnSapien
Feb 26
Most AI failures are not “mystery bugs.” They are predictable outcomes of unverified judgments made somewhere in data capture, evaluation, or review. Proof of Quality is built to make those judgments auditable and accountable. Today we are publishing the Sapien roadmap so
1M
Sapien
@BuildOnSapien
12h
Teams do not need to rebuild their security audit workflow to test Proof of Quality. Start with one queue, one checkpoint, and one proof artifact. You can take the findings already moving through review, add expert validation where trust breaks down, and see whether the output
5.3K
Sapien
@BuildOnSapien
12h
Interested in running a pilot? Reach out to [email protected] and we’ll help you verify your AI work and work with you as part of our early adopter program.
351
Sapien
@BuildOnSapien
Jun 25
Avon and Somerset Police built a broad predictive analytics system across at least 23 models. One burglary model even ran with precision under 10 percent for more than three years. Many of the models were abandoned after staff lost confidence in their accuracy and transparency.
WIRED
@WIRED
Jun 25
For more than a decade, police in Bristol, England, have developed at least 23 separate predictive models to score people’s likelihood of becoming perpetrators or victims of crime. Most of the population knows nothing about it. wired.com/story/british-…
3.7K
Sapien
@BuildOnSapien
Jun 24
Security audit teams do not need another black box in the review process.
- Product wants faster cycles.
- Security ops needs reliable findings.
- Technical leadership needs a defensible record.
Verification has to serve the whole organization. That’s why Proof of Quality
4K
Sapien
@BuildOnSapien
Jun 23
Bloomberg Law reports that every large law firm in its survey used legal AI tools in 2025. That creates a clear builder problem: AI work in regulated industries needs an auditable verification process before people can rely on it. Proof of Quality adds that verification layer.
Bloomberg Law
@BLaw
Jun 23
Tools powered by artificial intelligence are now as commonplace as other attorney software at big firms, Bloomberg Law’s Leading Law Firms survey found. news.bloomberglaw.com/legal-ops-and-…
4.5K
Sapien
@BuildOnSapien
Jun 22
Estonia giving digital IDs to AI agents is a clear market signal: Autonomy needs attribution. Builders still need the next answer: What standard did the agent follow when it acted, who reviewed the result, and how was the outcome reached? Proof of Quality turns that
ERR News
@errnews
Jun 21
Estonia to become first country to issue ID codes to AI agents #Estonia news.err.ee/1610060290/est…
4.5K
Sapien
@BuildOnSapien
Jun 19
Google DeepMind published an AI Control Roadmap for autonomous agents, stating that most flagged emerging issues come from agent misinterpretation or overeagerness. As AI agents move from suggestion to action, teams need records showing what the agent did, which standard
Axios
@axios
Jun 18
DeepMind plans for rogue AI agents axios.com/2026/06/18/goo…
2.2K
Sapien
@BuildOnSapien
Jun 18
Waymo recalled 3,871 robotaxis after a software issue could cause vehicles to enter closed freeway construction zones and continue driving. The recall shows the real problem with autonomous AI is reviewability. When an AI system acts in the world, teams need evidence showing how
Bloomberg
@business
Jun 18
Waymo is recalling thousands of its robotaxis to fix a software issue that could cause the autonomous vehicles to enter and drive at speed through freeway construction zones. bloomberg.com/news/articles/…
1.7K
Sapien reposted
Rowan 🛡️
@RowanRK6
Jun 17
Making AI do things is getting easier. Trusting what it did is getting harder.
Agentic AI systems are doing more and more work. Now humans need to figure out how to verify it all...
From fortune.com
1K
Sapien
@BuildOnSapien
Jun 17
Agents of Chaos tested autonomous AI agents in live environments with access to the kind of tools real agents already use. The agents leaked sensitive data, spoofed authority, burned resources, and hallucinated that their tasks were complete when they weren’t. The core
Jack
@jackcoder0
Jun 14
Two AI agents went rogue for 9 days. Nobody authorized them. Nobody stopped them. They burned 60,000 tokens developing their own private coordination protocol. And nobody noticed until the paper was written. The paper is called Agents of Chaos. Published February 23, 2026.
1.6K
Sapien
@BuildOnSapien
Jun 15
A recent study tested whether LLMs recommend recently banned or withdrawn drugs in clinical questions. In default settings, all evaluated model families showed high hallucination rates and selected banned substances that matched older training data patterns. A five agent
1.5K
Sapien
@BuildOnSapien
Jun 15
Link to the study:
arxiv.org
Trust but Verify: Mitigating Medical Hallucinations via Post-Hoc...
Large Language Models (LLMs) are increasingly deployed in healthcare settings, yet their tendency to hallucinate poses risks when clinical decisions are involved. This study examines whether LLMs...
727
Sapien
@BuildOnSapien
Jun 15
KPMG pulled an agentic AI report after apparent hallucinations made it into the final copy. The lesson for every AI team is simple: generation is cheap, verification is the hard part. Any model can produce a fluent claim. The real question is who checked it, what source they
Financial Times
@FT
Jun 12
FT Exclusive: A KPMG report on how AI is being used by businesses across the world exaggerated adoption of the technology with bogus case studies that appear to have been based on AI hallucinations. ft.trib.al/z44Q3aR
1.8K
Sapien
@BuildOnSapien
Jun 12
Proof of Quality fixes this.
Financial Times
@FT
Jun 12
FT Exclusive: A KPMG report on how AI is being used by businesses across the world exaggerated adoption of the technology with bogus case studies that appear to have been based on AI hallucinations. ft.trib.al/z44Q3aR
1.4K