Releases: future-agi/future-agi

v0.5.4 — 2026-05-07

07 May 13:43
eba7670

Release notes — 2026-05-07

Bugs/Improvements

  • Improved Reliability for Voice Observability Evals: Traces, replays, and evals for voice calls now stay fully accessible long after a call ends. Vapi and Retell recording URLs rotate and expire on their own schedules, which caused playback to break silently on older calls. FutureAGI now stores a durable copy of every external recording at ingestion time, so your observability data and eval runs no longer depend on provider URL availability.

  • Error Feed Now Works for Voice Simulation: Eval-source clusters for Vapi and simulation projects were not rendering correctly. The Pattern Summary, Trends KPIs, and trace drawer all needed updates to support these project types. All three are now fixed, and clicking a voice trace opens the voice call drawer as expected.

  • Datasets: Select-All State Resets When Switching Datasets: Switching datasets or tabs was preserving the previous selection state, causing incorrect behavior in delete, duplicate, and copy actions. Selection now resets cleanly on every dataset switch.

  • Trace Attribute Drawer: Long Values Are Expandable and Rows Are Easier to Scan: Long string values in the span attributes drawer were clipped with no way to see the full content. Values are now click-to-expand, and dividers between rows make it easier to tell where one attribute ends and the next begins.

  • Eval List Shows Correct Default Version: The evals list now correctly shows the current default version for each template instead of always showing V1.

  • Zero Eval Scores Now Render: Eval score rendering was treating a score of 0 as empty. Dataset grids, eval logs, and datapoint drawers now correctly display zero scores.

  • j/k Navigation Shortcuts No Longer Swallow Text Input: The j and k row navigation shortcuts were intercepting keystrokes globally, blocking you from typing those letters into comment fields and text inputs in the detail panel. These shortcuts now correctly yield to focused text inputs.

  • Traces from SDK-Ingested Projects Can Now Be Added to Annotation Queues: Traces belonging to projects created via SDK or OTLP ingestion were sometimes blocked from being added to annotation queues. All traces are now resolved correctly and can be queued for annotation regardless of project type or how they were added.

  • Workspace Invite Fixed for Existing Users: In a few cases, existing org members invited to a new workspace did not receive the invitation email and could not see the new workspace in their list. The invite flow now sends the email and grants access consistently.

  • Eval "Created By" Now Shows Organization Name for Legacy Evals: Evals without creator metadata were showing "User" in the Created By column. They now fall back to the organization display name, and filtering by creator also matches on organization name.
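
The zero-score fix above is an instance of a common pitfall: a truthiness check cannot tell a score of 0 apart from a missing score. The sketch below illustrates the pattern in Python under that assumption. The function names and the "-" placeholder are hypothetical, and this is not the actual FutureAGI rendering code.

```python
from typing import Optional


def format_score_buggy(score: Optional[float]) -> str:
    # Truthiness check: 0 and 0.0 are falsy, so a real zero score renders as empty.
    return f"{score:.2f}" if score else "-"


def format_score(score: Optional[float]) -> str:
    # Only a missing value (None) counts as empty; zero is a valid score.
    return "-" if score is None else f"{score:.2f}"


print(format_score_buggy(0))  # zero is hidden behind the placeholder
print(format_score(0))        # zero is rendered as a score
print(format_score(None))     # a missing score still shows the placeholder
```

Checking explicitly for the missing value keeps the placeholder for absent scores while letting zero through.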

v0.5.3 — 2026-05-05

05 May 12:12
98b1c5f

Release notes — 2026-05-05

Bug fixes

  • Inaccessible audio recording URLs now surface a clear error. When an audio recording URL returns 403 or is otherwise unreachable, evaluators previously fell back to treating the URL as plain text and produced meaningless results. The system now raises a user-facing error pointing to the inaccessible recording. (#225 / #216)
  • Eval template list now reflects the actual default version. The list view was hardcoding V1 for every template, so promoting a non-V1 version to default never showed up in the outer evals list. Added bulk version-metadata lookup so the list reflects the real default. (#229)
  • Eval-task usage reasons are no longer truncated. The per-log eval explanation in /tracer/eval-task/get_usage/ was being capped at 200 characters with a trailing …, making the full reason unrecoverable in the UI. The full string is now returned. (#229)
  • Test detail drawer no longer flashes the wrong width on reload. Reloading with the test detail drawer open briefly rendered a 90vw skeleton before collapsing to the 50vw voice drawer. The drawer now waits for store data to populate before sliding in, so it opens at the correct width with real content. (#210)
  • Long task labels no longer overflow. Replaced the overflowing label rendering with a custom tooltip. (#213)
  • Beta tag removed from chat simulation. (#226)
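
The recording-URL fix above amounts to failing loudly instead of falling back to plain text. The following is a minimal sketch of that idea, not the project's implementation: `RecordingInaccessibleError`, `ensure_recording_accessible`, and the injected `fetch_status` callable are all hypothetical names introduced for illustration.

```python
from typing import Callable


class RecordingInaccessibleError(Exception):
    """Hypothetical user-facing error for a recording URL that cannot be fetched."""


def ensure_recording_accessible(url: str, fetch_status: Callable[[str], int]) -> str:
    """Return the URL if it is reachable; raise instead of treating it as text.

    `fetch_status` is an assumption of this sketch: a callable that returns the
    HTTP status code for the URL, or raises OSError on a network failure.
    """
    try:
        status = fetch_status(url)
    except OSError as exc:
        raise RecordingInaccessibleError(
            f"Recording at {url} is unreachable: {exc}"
        ) from exc
    if not 200 <= status < 300:
        raise RecordingInaccessibleError(
            f"Recording at {url} is not accessible (HTTP {status})"
        )
    return url


# Usage: a stubbed fetcher that simulates an expired URL returning 403.
try:
    ensure_recording_accessible("https://example.com/call.wav", lambda _url: 403)
except RecordingInaccessibleError as err:
    print(err)
```

Injecting the status lookup keeps the check testable without network access; a real caller would pass a function that issues the HTTP request.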

Performance & reliability

  • Fixed worker OOMs on high-volume eval tasks. The eval-task dispatcher was hydrating full ObservationSpan instances, including large attribute and I/O payloads, before enqueuing evaluations. Span IDs are now fetched via .only("id") / values_list("id", flat=True), sampling reuses the IDs returned by the random-sample query, and the cnt cap is pushed to Postgres via slicing. Behavior is unchanged; the same set of span IDs is enqueued. (#207)
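
The memory fix above rests on two ideas: select only the ID column, and let the database apply the cap. The sketch below shows both using the standard-library sqlite3 module as a stand-in for the Django ORM and Postgres mentioned in the note. The table name, column names, and `sample_span_ids` helper are hypothetical.

```python
import sqlite3
from typing import List


def sample_span_ids(conn: sqlite3.Connection, cnt: int) -> List[str]:
    # Select only the id column and let the database apply the random order
    # and the cap, so large payload columns never reach the worker process.
    rows = conn.execute(
        "SELECT id FROM observation_span ORDER BY RANDOM() LIMIT ?", (cnt,)
    ).fetchall()
    return [row[0] for row in rows]


# In-memory table with a large payload column to mimic heavy span rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observation_span (id TEXT PRIMARY KEY, attributes TEXT)")
conn.executemany(
    "INSERT INTO observation_span VALUES (?, ?)",
    [(f"span-{i}", "x" * 10_000) for i in range(100)],
)

ids = sample_span_ids(conn, 10)
print(len(ids))  # number of sampled IDs, capped by the LIMIT
```

Because the sampled IDs are returned directly by the query, the caller can enqueue them without a second lookup, which matches the note's point that sampling reuses the IDs from the random-sample query.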