Sentient (@SentientAGI) / X

Sentient

2,101 posts

@SentientAGI

To ensure that Artificial General Intelligence is open-source and not controlled by any single entity. @SentientEco @OpenAGISummit

San Francisco, CA

Joined February 2024

Pinned
Sentient
@SentientAGI
May 20
Arena Challenge 0 is now public! 🏆 $6,000 in prizes + MiniMax credits 🗓️ May 20 - June 22, 2026 Built on @databricks' enterprise OfficeQA benchmark, the Grounded Reasoning Challenge is now open for everyone.
47K
Sentient
@SentientAGI
9h
Replying to @SentientAGI
Watch the interview on YouTube ↓
3.4K
Sentient
@SentientAGI
9h
From predictive tennis models to building AI agents in the Arena 🎾 Here's why Alfie joined Challenge 0 to connect with other builders ↓
5.9K
Sentient reposted
Sentient Foundation
@sentient_found
11h
Open-source AI makes transparency the default, so no single monolith can dictate access, research, or innovation. Say no to the black box. That’s how everyone wins.
ClaudeDevs
@ClaudeDevs
17h
We’re rolling out changes to make Fable 5’s safeguards for frontier LLM development visible. Starting this week, flagged requests will visibly fall back to Opus 4.8—the same as our safeguards for cyber and bio. You will see this every time it happens. On the API, any flagged
2.5K
Sentient
@SentientAGI
Jun 10
Replying to @SentientAGI
3.5K
Sentient
@SentientAGI
Jun 10
Replying to @SentientAGI
Dive into the full technical breakdown ↓
Sentient
@SentientAGI
Apr 22
Article
How Open-Source Agents Matched Frontier AI at 1/30th the Cost
Over three weeks, 147 builders drawn from 1,200+ applicants took on Grounded Reasoning: the inaugural Arena challenge, built on @databricks' OfficeQA benchmark. Every team had to use the same...
3.9K
Sentient
@SentientAGI
Jun 10
Replying to @SentientAGI
3/ Build agents that push deep into the hard questions no one else has cracked. There's one big cliff in the data, and it sits between the top 6 teams and everyone else: • Medium: top 6 solved 86-97% vs. teams ranked 7-15 solved 57-80% • Hard: top 6 solved 83-92% vs. teams ranked 7-15 solved 17-75%
2K
Sentient
@SentientAGI
Jun 10
Replying to @SentientAGI
2/ Build agents that know when to think harder. Successful agent runs reason significantly more than failed runs on the same task, and the gap scales with difficulty: • 10% more on easy tasks • 25% more on medium tasks • 35% more on hard tasks Of
2.4K
Sentient
@SentientAGI
Jun 10
Replying to @SentientAGI
1/ Build agents that know when to stop. Cohort 0 burned ~$3,300 on inference. 43% of that paid for traces that returned wrong answers. Agents can't tell when they're off the rails, so they keep generating. Nearly half of every inference dollar went to failure, but better
3.6K
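As a rough illustration of the arithmetic in the post above, the sketch below splits the quoted spend into failed and successful traces. Both inputs are the approximate figures from the post (~$3,300 and 43%); the function and variable names are hypothetical and not part of any Sentient tooling. Under those figures, roughly $1,419 went to traces that returned wrong answers.

```python
# Approximate figures quoted in the post ("~$3,300", "43%"); treat the
# results as estimates, not exact accounting.
total_inference_spend_usd = 3300
failed_share = 0.43


def split_spend(total, failed_fraction):
    """Return (spend on failed traces, spend on successful traces)."""
    failed = total * failed_fraction
    return failed, total - failed


failed_usd, successful_usd = split_spend(total_inference_spend_usd, failed_share)
print(f"failed: ~${failed_usd:,.0f}, successful: ~${successful_usd:,.0f}")
```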
Sentient
@SentientAGI
Jun 10
We analyzed data from our first batch of Arena builders to see what separates the top teams from everyone else. Here are the three open-source AI insights that stood out ↓
11K
Sentient
@SentientAGI
Jun 8
Replying to @SentientAGI
Read @MSFTResearch’s paper ↓
arxiv.org
SkillOpt: Executive Strategy for Self-Evolving Agent Skills
Agent skills today are hand-crafted, generated one-shot, or evolved through loosely controlled self-revision, none of which behaves like a deep-learning optimizer for the skill, and none of which...
6.2K
Sentient
@SentientAGI
Jun 8
In Microsoft Research's new SkillOpt paper, EvoSkill is named the “strongest harness-side competitor” tested, and the closest system to their own method when run inside Codex and Claude Code agent loops. The biggest labs in AI are paying attention, and @salahalzubi401 and the
22K
Sentient
@SentientAGI
Jun 8
This is why Big Tech can't ignore open-source AI.
Sentient Foundation
@sentient_found
Jun 8
Why is @SentientAGI Product Lead @oleg_golev excited about open-source AI? The future he imagines is worth paying attention to ↓
10K