DeepTutor v1.4.2 Release Notes

Release Date: 2026.05.28

v1.4.2 is a stability and polish release on top of v1.4.1.
It unblocks Gemini 2.5+ across Visualize and the chat agent, fixes a
ContextVar regression that silently routed authenticated requests to the
admin workspace, hardens the chat protocol for reasoning models with
native tool calling, ships smooth-streaming UX across every chat
surface, and adds support for the Lemonade local provider.

Gemini 2.5+ Reasoning Default-Off

Gemini 2.5 / 3 ship with thinking enabled by default and burn the entire
max_tokens budget on reasoning unless reasoning_effort: "none" is
sent on the request. v1.4.2 centralizes that logic in
reasoning_params.default_reasoning_effort_for, the single source of
truth used by all three execution paths (the OpenAI SDK, the aiohttp
fallback, and the reasoning-kwargs builder). Visualize, Chat, Solve,
and the agentic loop all stop returning empty bodies when configured
against gemini-2.5-pro / gemini-2.5-flash / gemini-3-*.
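
As a rough illustration of the centralized default, the sketch below maps a model name to a default reasoning effort and applies it while building a request body. Only the name default_reasoning_effort_for and the "none" value come from these notes; the signature, the prefix matching, and the build_request helper are assumptions, not DeepTutor's actual code.

```python
from typing import Optional


def default_reasoning_effort_for(model: str) -> Optional[str]:
    """Return a default reasoning effort for models that think by default.

    Assumed matching rule: Gemini 2.5 and Gemini 3 model names.
    """
    name = model.lower().rsplit("/", 1)[-1]  # tolerate "models/..." prefixes
    if name.startswith("gemini-2.5") or name.startswith("gemini-3-"):
        return "none"
    return None


def build_request(model: str, messages: list, max_tokens: int,
                  reasoning_effort: Optional[str] = None) -> dict:
    """Hypothetical request builder; an explicit effort wins over the default."""
    body = {"model": model, "messages": messages, "max_tokens": max_tokens}
    effort = reasoning_effort or default_reasoning_effort_for(model)
    if effort is not None:
        body["reasoning_effort"] = effort
    return body
```

Keeping the lookup in one function means every execution path can call the same helper instead of re-deriving the rule.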

Visualize Pipeline Hardening

Three independent failure modes are fixed:

Per-capability max_tokens defaults — Visualize now has its own
entry in agents.yaml (16k tokens) seeded from
DEFAULT_AGENTS_SETTINGS, so existing users with a stale
data/user/settings/agents.yaml pick up the higher cap automatically
without hand-editing.
SVG / HTML root trim — when a model wraps its output with prose
("Here you go: <svg>…") or emits a closing fence on the same line
as the closing tag, the generator agent now trims to the outermost
<svg>…</svg> / <!doctype>…</html> so the renderer always receives
a clean root.
Review-step JSON-mode crash → graceful fallback — large or
complex SVGs occasionally trip JSON-mode escaping inside the review
step. Instead of crashing the turn, Visualize now logs the failure
and ships the unreviewed draft so the user still sees a rendered
result.
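
The root trim can be pictured with a minimal sketch like the one below. The function name trim_to_svg_root and the regexes are hypothetical; the generator agent's real implementation may differ, and the HTML case would follow the same pattern.

```python
import re

_OPEN = re.compile(r"<svg\b", re.IGNORECASE)
_CLOSE = re.compile(r"</svg\s*>", re.IGNORECASE)


def trim_to_svg_root(raw: str) -> str:
    """Return the span from the first <svg to the last </svg>.

    If no well-ordered pair is found, the input is returned unchanged.
    """
    first = _OPEN.search(raw)
    closes = list(_CLOSE.finditer(raw))
    if first is None or not closes or closes[-1].start() < first.start():
        return raw
    return raw[first.start():closes[-1].end()]
```

Taking the first opening tag and the last closing tag keeps nested svg elements inside the root intact.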

Authenticated Requests Land In The Right Workspace (#485)

In v1.4.1, require_auth was a sync FastAPI dependency. FastAPI
dispatches sync dependencies via anyio.to_thread.run_sync, which
runs them in a worker thread under a copy of the request context —
so the set_current_user(...) call inside the dependency installed
the user on the thread's context, which was discarded when the
thread returned. The endpoint then read the unset default and fell
back to the admin workspace, silently routing every authenticated
user's reads/writes through the local admin's data.

require_auth and require_admin are now async def, so they
execute in the same asyncio task as the endpoint and the
ContextVar is visible everywhere downstream. HTTP and WebSocket
entry points now share a single _install_current_user helper so
the user object resolved from a token payload is identical across
transports.
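
The failure mode can be reproduced with the standard library alone. In this sketch, asyncio.to_thread stands in for the way sync dependencies are dispatched (it also runs the callable under a copy of the current context); the current_user variable and the handler names are hypothetical.

```python
import asyncio
import contextvars

# Hypothetical stand-in for the per-request user ContextVar.
current_user = contextvars.ContextVar("current_user", default="admin")


def sync_dependency() -> None:
    # Runs in a worker thread under a copy of the caller's context.
    current_user.set("alice")


async def async_dependency() -> None:
    # Runs in the caller's own task, so the set() stays visible.
    current_user.set("alice")


async def handle_with_sync_dep() -> str:
    await asyncio.to_thread(sync_dependency)
    return current_user.get()  # the thread's copy was discarded


async def handle_with_async_dep() -> str:
    await async_dependency()
    return current_user.get()
```

Running handle_with_sync_dep yields the "admin" default, while handle_with_async_dep yields "alice", which mirrors the behavior change described above.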

Reasoning Models + Native Tool Calling: Label Protocol Fixed

v1.4.1 tried to be clever with reasoning models that have native
tool-calling support — it told them to ignore the TOOL/THINK/
FINISH/PAUSE labels and rely on reasoning_content plus
tool_calls alone, and inside run_labeled_step it treated
<think> preludes and any incoming tool-call delta as implicit
label resolutions. In practice both shortcuts hurt: when a tool
call leaked into the content stream as JSON instead of a real
tool_calls delta, there was no label to repair against, and the
loop happily treated the JSON-as-answer as a FINISH. Multi-turn
reasoning + tool workflows would either burn iterations on repair
retries or silently terminate early.

In v1.4.2:

Reasoning + native-tools system prompt tells the model that
reasoning is displayed in a separate trace area, but the formal
content stream must still start with exactly one of
FINISH/TOOL/THINK/PAUSE.
run_labeled_step no longer treats tool-call deltas as
authoritative for label resolution, and implicit_think_label is
ignored (kept for API compatibility). A missing label always falls
to LABEL_UNKNOWN, so the chat pipeline's protocol-repair path
catches it instead of silently mis-routing the turn.
Inline <think>...</think> preludes are streamed live into the
reasoning sub-trace and stripped from the formal text returned
to the loop — so the answer area no longer leaks raw provider
markers.
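
A minimal sketch of the "labels are always required" rule follows. The label names and LABEL_UNKNOWN come from these notes; the helper names, the value of LABEL_UNKNOWN, and the exact wire format of a label (here, the label word followed by a colon or whitespace) are assumptions.

```python
import re

LABELS = ("FINISH", "TOOL", "THINK", "PAUSE")
LABEL_UNKNOWN = "UNKNOWN"  # assumed value
_THINK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str):
    """Separate inline <think> blocks from the formal text."""
    reasoning = [chunk.strip() for chunk in _THINK.findall(text)]
    formal = _THINK.sub("", text).strip()
    return reasoning, formal


def resolve_label(formal: str):
    """Return (label, body); a missing label is never guessed."""
    for label in LABELS:
        if formal.startswith(label):
            rest = formal[len(label):]
            if not rest or not rest[0].isalnum():
                return label, rest.lstrip(" :\n")
    return LABEL_UNKNOWN, formal
```

A tool call that leaks into the content stream as JSON has no leading label, so it resolves to LABEL_UNKNOWN and can be handed to a repair path rather than treated as an answer.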

Smooth Streaming Across Every Chat Surface

The rAF typewriter (useSmoothStreamText) introduced last week for
the main chat is now wired through AssistantResponse, so the
book chat panel, quiz follow-up tab, and any other surface that
renders an assistant message all get the same frame-aligned cadence
during streaming and a no-op pass-through for completed messages.

Companion fixes:

Book chat panel and quiz follow-up tab moved their autoscroll to
useLayoutEffect and stopped using scrollIntoView({behavior: "smooth"}).
The smooth animation races against the next-frame layout update during
fast streams and produces visible jitter. They now do a single
scrollTop = scrollHeight pin in layout phase, matching what
useChatAutoScroll does on the main chat.
Book chat panel marks its scroller with data-chat-scroll-root so
the global overflow-anchor: none rule applies (the browser's
built-in scroll anchoring fights manual pinning when code blocks
reflow above the cursor).
AssistantResponse is now memoized — completed bubbles stop
re-parsing markdown when an unrelated streaming sibling updates the
parent.

Sidebar Redesign

The expanded sidebar's chat-session list moved into its own
collapsible Recents region with an independent scroll viewport, so
long histories no longer push secondary nav off-screen. The "New chat"
button is gone (clicking Chat in the nav already starts a new
session), and a Docs link to deeptutor.info sits next to the GitHub
link in the footer.

Each session now renders with a deterministic, friendly Lucide icon —
sparkles, leaf, feather, cloud, droplet, sun, moon, flame, star, etc.
— so the sidebar feels varied at a glance without shuffling on
re-render. Running sessions add a gentle wiggle animation; idle ones
stay still.

Lemonade Local Provider

New lemonade provider binding for the AMD Ryzen AI / NPU runtime
(default base URL http://localhost:13305/api/v1). It is auto-detected by
port 13305, requires no API key, and is listed in the README Docker
host-gateway section and in the provider configuration docs alongside
Ollama / LM Studio / llama.cpp / vLLM.

Models-Endpoint Probe Honors `DISABLE_SSL_VERIFY`

The context-window auto-detection now passes
aiohttp.TCPConnector(ssl=False) when DISABLE_SSL_VERIFY is set,
matching the behavior of the rest of the HTTP layer. Self-signed local
inference servers no longer fall back to the default context window
just because the probe couldn't verify their cert.

Tests

tests/api/test_auth_contextvar.py — pins the regression from #485:
a sync require_auth would lose the ContextVar; the async version
preserves it across the dependency boundary.
tests/services/llm/test_reasoning_params.py — covers the
centralized default_reasoning_effort_for mapping.
tests/core/test_labeled_step_think_prelude.py — updated to reflect
the new "labels are always required" semantics.
tests/agents/chat/test_agentic_parallel_tools.py — verifies the
reasoning + native-tools path still resolves multi-tool turns.
tests/services/config/test_context_window_detection.py — the
models-endpoint probe honors DISABLE_SSL_VERIFY and passes a
TCPConnector(ssl=False) to the aiohttp session.

Upgrade Notes

Drop-in from v1.4.1: pip install -U deeptutor; Docker users pull
ghcr.io/hkuds/deeptutor:latest.
If you previously hand-edited data/user/settings/agents.yaml to
bump Visualize's max_tokens, that value still wins. The new 16k
default only seeds users whose agents.yaml doesn't mention
Visualize at all.
If you wired a Gemini 2.5+ model and saw empty or truncated outputs,
no configuration change is needed — the default-off behavior now
applies automatically.

What's Changed

fix(auth): make require_auth async so the user ContextVar reaches the endpoint by @truffle-dev in #485
fix(visualize): unblock Gemini 2.5+ and harden Visualize pipeline by @skinred78 in #490

New Contributors

@skinred78 made their first contribution in #490

Full Changelog: v1.4.1...v1.4.2

DeepTutor v1.4.1 Release Notes

Release Date: 2026.05.27

v1.4.1 is a security and stability patch on v1.4.0. It locks down
the TutorBot tool sandbox, isolates per-user resources, fixes a v1.4.0 chat
regression, adds an HTTP API for talking to a specific TutorBot, and ships a
multimodal image-fallback fix for providers DeepTutor has no vision entry for.

What's New

TutorBot Tool Sandbox Is Opt-In

The shell exec tool is no longer registered unless an admin sets
allow_shell_exec, and all filesystem + shell access is confined to the bot
workspace by default. Command deny-lists were re-anchored at command
boundaries, and allow_shell_exec can't be flipped on via the update payload.

Per-User Resource Isolation

Book roots, session databases, turn-runtime stores, and TutorBot directories
are scoped per user, and web/API conversations are keyed per session — so
cross-user requests can't reach each other's data.

HTTP / SSE API For A TutorBot

New POST /{bot_id}/chat and /chat/execute-stream (SSE) endpoints with
auto-start and persistent per-session context, for multi-turn conversations
with a specific bot from external clients.

Multimodal Image Fallback

Images are sent optimistically to every provider; if a request carrying images
fails and the model isn't in the known-vision allowlist, the turn retries
text-only. Fixes silently dropped images on Doubao / VolcEngine and other
multimodal models that lack a capability entry.
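
The optimistic-send-then-retry flow might look like the sketch below. The send callable, the known_vision set, and the function name are hypothetical stand-ins; how DeepTutor actually detects an image-related failure is not described here.

```python
def send_with_image_fallback(send, messages, images, model, known_vision):
    """Send images optimistically; retry text-only for models not known to see.

    `send(messages, images)` is a hypothetical provider call.
    """
    if not images:
        return send(messages, [])
    try:
        return send(messages, images)
    except Exception:
        if model in known_vision:
            raise  # a real failure, not a capability gap
        return send(messages, [])  # single text-only retry
```

Models on the allowlist still surface their errors, so the fallback only masks failures where missing vision support is a plausible cause.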

Safe ZIP Upload + Network Settings

.zip knowledge uploads expand member-by-member through the document validator
with size / count / compression-ratio bounds and path-escape guards; the
archive itself is never indexed. A new /settings/network page surfaces ports,
public API base, and CORS origins (normalized to tolerate host:port and
trailing slashes), plus a "fetch models" action listing model IDs from an
OpenAI-compatible endpoint.
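
To make the ZIP guards concrete, here is a small sketch using the standard zipfile module. The limits, the function name, and the error messages are illustrative assumptions; the notes do not state the real bounds or how members are passed on to the document validator.

```python
import zipfile
from pathlib import PurePosixPath

# Illustrative limits; the real bounds are not stated in these notes.
MAX_MEMBERS = 500
MAX_TOTAL_BYTES = 200 * 1024 * 1024
MAX_RATIO = 100


def validated_members(archive: zipfile.ZipFile) -> list:
    """Return file members only if every one passes the bounds and path checks."""
    infos = [info for info in archive.infolist() if not info.is_dir()]
    if len(infos) > MAX_MEMBERS:
        raise ValueError("too many members")
    total = 0
    for info in infos:
        path = PurePosixPath(info.filename.replace("\\", "/"))
        if path.is_absolute() or ".." in path.parts:
            raise ValueError(f"path escapes the upload root: {info.filename}")
        total += info.file_size
        if total > MAX_TOTAL_BYTES:
            raise ValueError("archive expands past the size cap")
        if info.file_size > MAX_RATIO * max(info.compress_size, 1):
            raise ValueError(f"suspicious compression ratio: {info.filename}")
    return infos
```

Validating every member before reading any of them means a single bad entry rejects the whole upload.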

Community Fixes & Changes

Security — closed TutorBot RCE via the shell tool (#518), path traversal in
the filesystem tool (#517), cross-bot file-management authz bypass (#516),
cross-session turn-regeneration authz bypass (#515), book-confirmation authz
bypass (#514), and ExecTool executing LLM shell commands over chat (#506,
first hardened in PR #507).

Bug fixes — chat input disabled after the first turn (v1.4.0 regression,
#520), knowledge-base embedding failure on long documents (#521 / PR #509),
new users unable to create a profile under Docker (#512 / PR #513), Qwen
reasoning models failing native tool calling (#527 / PR #528), and the GPT-5
init-wizard token parameter (PR #508).

Merged / reworked PRs — native tool calling for reasoning models (#528),
oversized session-event truncation (#524), empty-state profile button (#513),
chunking-pipeline fix (#509), GPT-5 probe (#508), ExecTool hardening (#507).
Contributions #522 (zip upload) and #523 (model fetching + notebook lookup)
were reimplemented locally and ship here.

Feature request — HTTP API for multi-turn chat with a specific TutorBot
(#511).

Upgrade Notes

Drop-in from v1.4.0: pip install -U deeptutor; Docker users pull
ghcr.io/hkuds/deeptutor:latest.
TutorBot shell exec is now disabled by default. If you relied on it, set
allow_shell_exec on the bot; tool access stays confined to the workspace.
For cross-site HTTPS auth, set explicit CORS origins and cookie_secure=true.

What's Changed

fix(settings): add profile button when no profiles exist by @wedone in #513
Fix GPT-5 token limit parameter in init wizard probe by @lezhimiffyliu in #508
fix(security): harden ExecTool defaults against command injection (#506) by @kagura-agent in #507
fix: prevent Document nodes from bypassing chunking pipeline by @washi4 in #509
feat: truncate oversized event payloads in session response by @xiongjnu in #524
fix(chat): enable native tool calling for reasoning models via OpenAI-compatible endpoints by @wedone in #528

New Contributors

@lezhimiffyliu made their first contribution in #508
@washi4 made their first contribution in #509
@xiongjnu made their first contribution in #524

Full Changelog: v1.4.0...v1.4.1

DeepTutor v1.4.0 Release Notes

Release Date: 2026.05.22

v1.4.0 is the GA cut of the v1.4 line. It carries the full v1.4.0-beta scope
(Auto Mode, three-layer Memory workbench, agentic Deep Research / Deep Solve /
Question, LlamaIndex chat refactor, unified capabilities infrastructure +
i18n, Visualize/Animator merge, new chat tools, settings split, multi-user
isolation hardening, and the deeptutor start / deeptutor init launcher)
and adds a focused set of agent-engine and runtime hardening fixes on top.
For the full beta scope see v1.4.0-beta.

What's New Since v1.4.0-beta

Reasoning Effort + Provider Thinking Flags

The agentic engine now normalizes "reasoning effort" across providers in one
place, instead of each pipeline re-deriving it.

build_provider_extra_kwargs in deeptutor/core/agentic/client.py
resolves reasoning_effort against provider_registry, maps minimal /
minimum to the right wire shape (DashScope uses minimum), and routes
the on/off intent through provider thinking-style toggles —
thinking_type (DeepSeek), enable_thinking (Qwen-style), or
reasoning_split. Known reasoning models default to high; the
deepseek-v4-flash shape gets thinking disabled by default.
LLMClientConfig.reasoning_effort is plumbed end-to-end. All four
pipelines — agentic chat, question, research, solve — pass the binding +
effort into build_completion_kwargs, so a single user-level setting
controls reasoning behavior across capabilities.

Tool-Schema Fallback For Strict Providers

Some OpenAI-compatible providers reject native function-calling schemas with
a non-standard error. run_labeled_step now detects that case, strips
tools / tool_choice from the request, retries once, and emits a warning
progress event so the user knows the call ran in tool-less prose mode.
Behavior is unchanged for providers that accept tool schemas normally.
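
The retry-once behavior can be sketched as follows. ToolSchemaRejected, send, and warn are hypothetical names; how run_labeled_step recognizes the provider's non-standard error is not specified in these notes.

```python
class ToolSchemaRejected(Exception):
    """Hypothetical error for a provider that refuses tool schemas."""


def complete_with_tool_fallback(send, request: dict, warn):
    """Call send(request); on a schema rejection, retry once without tools."""
    try:
        return send(request)
    except ToolSchemaRejected:
        stripped = {
            key: value
            for key, value in request.items()
            if key not in ("tools", "tool_choice")
        }
        warn("Provider rejected tool schemas; retrying in tool-less mode.")
        return send(stripped)  # a second failure propagates to the caller
```

Providers that accept tool schemas never reach the except branch, so their requests go out unchanged.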

Restart-Safe Turn Runtime

A server or container restart used to leave the database row for an
in-flight turn stuck on running, blocking the next message in that
session and leaving the UI hanging on an event stream that would never
complete.

Orphan detection — TurnRuntimeManager now checks whether this
process still owns the turn's in-memory runner. If not, the persisted
turn is marked failed with "Turn interrupted by server restart. Please retry your message."
start_turn sweeps stale active turns for the session before
creating a new turn, so the user can immediately retry after a restart.
subscribe_turn synthesizes a terminal error + done event for
the orphaned turn so the frontend cleanly closes its streaming state
instead of waiting indefinitely.
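
A simplified picture of the orphan check is shown below. The failure message is quoted from these notes; the dict-shaped turn record, the live_runner_ids set, and both function names are assumptions made for illustration.

```python
ORPHAN_MESSAGE = "Turn interrupted by server restart. Please retry your message."


def reconcile_turn(turn: dict, live_runner_ids: set) -> dict:
    """Mark a persisted 'running' turn as failed if no in-memory runner owns it."""
    owned = turn["id"] in live_runner_ids
    if turn["status"] == "running" and not owned:
        return {**turn, "status": "failed", "error": ORPHAN_MESSAGE}
    return turn


def terminal_events(turn: dict) -> list:
    """Events a subscriber needs in order to close its stream for a failed turn."""
    if turn["status"] != "failed":
        return []
    return [{"type": "error", "message": turn.get("error", "")}, {"type": "done"}]
```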

Frontend API Base — Docker Placeholder Hardening

web/lib/api.ts switched from exact-match to substring-token detection of
the NEXT_PUBLIC_API_BASE placeholder, and exposes isApiBasePlaceholder()
for reuse. This survives small changes to the Docker placeholder shape that
previously slipped through and produced a blank Settings page with no
visible error. The unused auto-generated web/.env.local shipped with the
beta is removed.

LLM Config Probe — Clearer Status Copy

The /settings/llm probe now emits an explicit "Basic LLM completion
succeeded. Chat additionally validates streaming and provider tool
compatibility at runtime." line after the smoke completion, so users know
the probe deliberately doesn't cover the streaming / tool-call paths that
chat exercises.

CLI Chat REPL — Config Surface Aligned With Docs

The deeptutor chat REPL surface was tightened so it matches the public
docs at deeptutor.info.

deeptutor chat --config key=value and --config-json '<json>' seed
the initial config without entering the REPL first.
/config set key value and /config set key=value are both accepted;
JSON values ([…], {…}) are parsed correctly via shlex.
Backslash-continuation lets you send multi-line prompts in one turn.
/refs prints a structured state snapshot (session, capability, tools,
KB, history, notebooks, language, config) instead of a single dim line.
deeptutor plugin info <name> now also returns cli_aliases and the
capability availability block.
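
One way the two /config set forms could be parsed is sketched below, assuming shlex handles tokenization and json handles value decoding. The function name and the fall-back-to-string rule are assumptions, not the REPL's actual code.

```python
import json
import shlex


def parse_config_set(line: str):
    """Parse '/config set key value' or '/config set key=value'."""
    tokens = shlex.split(line)
    if tokens[:2] != ["/config", "set"] or len(tokens) < 3:
        raise ValueError("usage: /config set key value")
    rest = tokens[2:]
    if "=" in rest[0]:
        key, first = rest[0].split("=", 1)
        raw = " ".join([first] + rest[1:])
    elif len(rest) >= 2:
        key, raw = rest[0], " ".join(rest[1:])
    else:
        raise ValueError("missing value")
    try:
        value = json.loads(raw)  # numbers, booleans, [...], {...}
    except json.JSONDecodeError:
        value = raw  # plain strings stay as typed
    return key, value
```

In this sketch, JSON values containing double quotes need to be single-quoted on the command line, because shlex removes unescaped quotes during tokenization.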

Repo & Docs Hygiene

New issue template for the public docs site (.github/ISSUE_TEMPLATE/docs.yml).
AGENTS.md modernized to reflect Auto Mode, tool gating, and the
current capability stages.
.gitignore covers .playwright-cli/; stale .playwright-cli/
captures, .env.example_CN, and DeepTutor.code-workspace removed.

Tests

New tests/core/test_agentic_client_provider_kwargs.py covering the
reasoning-effort + provider thinking-style matrix.
New tests/core/test_labeled_step_tool_fallback.py exercising the
tool-schema retry path end-to-end against a scripted client.
tests/services/session/test_turn_runtime_subscribe.py adds coverage
for orphan-running-turn recovery on both subscribe and start_turn.
tests/cli/test_chat_cli.py adds REPL config/backslash tests and a
plugin-info contract check.
New tests/cli/test_docs_contract.py keeps the public site docs in
sync with the CLI surface (links + deeptutor … examples).
tests/scripts/test_docker_compose.py + web/tests/api-resolve-base.test.ts
cover the new placeholder detection.

Upgrade Notes

From v1.4.0-beta: drop-in. PyPI installs with pip install -U deeptutor
(under PEP 440 ordering, 1.4.0 sorts after the 1.4.0b0 pre-release, so pip
upgrades to it). Docker users pull ghcr.io/hkuds/deeptutor:latest.
Reasoning effort is now read from LLMConfig.reasoning_effort. If
you previously set provider thinking flags manually in extra_body,
the agent will still respect them, but the recommended path is to set
reasoning_effort and let the engine map it.
Restart recovery is automatic. After upgrading, any existing
database rows stuck on running from previous crashes will be
finalized the next time their session is opened.
Beta artifact cleanup: web/.env.local was auto-generated by the
legacy launcher and is no longer needed. If you rebuilt the frontend
manually with that file, delete it before the next build.

Full Changelog: v1.4.0-beta...v1.4.0

DeepTutor v1.4.0-beta Release Notes

Release Date: 2026.05.21

v1.4.0-beta is the largest release since the agent-native rewrite. It layers an
end-to-end Auto Mode on top of the existing capabilities, ships a
three-layer memory subsystem (L1/L2/L3) with a dedicated workbench, rebuilds
Deep Research / Deep Solve / Question on the same agentic engine as Chat,
re-architects the chat capability + LlamaIndex RAG pipeline around a
session-cumulative source inventory, unifies the Capabilities infrastructure
and i18n, merges the Animator menu into Visualize, and reorganises
Settings, environment, and the local launcher. Several new chat tools
(ask_user, web_fetch, write_note, list_notebook, github_query) plus a
delete-chat-turn flow, quiz follow-up chat, and a GeoGebra viewer round out the
release.

Highlights

Auto Mode — Agentic Capability Router

A new auto capability sits on top of the existing modes and chooses the right
one for each request, instead of forcing the user to pick a mode up front.

Three-stage agent loop — ANALYZING (single LLM call, streamed as
thinking) → DELEGATING (up to max_iterations of router calls that emit
delegate_to_<cap> tool calls or atomic tool calls) → SYNTHESIZING (final
inline answer, either passed through from the loop or assembled by a closing
LLM call).
Routes to real capabilities — deep_solve, deep_question,
deep_research, math_animator, visualize, plus the chat-level atomic
tools (web_search, web_fetch, rag, …) live behind the same router so
the LLM can mix retrieval and full sub-capability runs in one turn.
Bounded retries and quotas — independent retry budgets for router-LLM
errors, per-delegation failures, and arg-validation feedback; a configurable
max_same_capability_calls quota keeps the loop from spinning on one mode.
Clean conversation history — sub-capability events flow through a
forward_events shim that tags every content event with a call_id, so the
conversation turn-runtime filter keeps only Auto's own final synthesis in
saved history. Sub-runs are still streamed live to the UI.
answer_now fast-path — when the user asks to "answer now" the pipeline
skips analysis + delegation and produces an immediate inline reply.

Three-Layer Memory Subsystem (Memory v2)

The previous flat memory page is replaced by a structured three-layer store
with an explicit consolidation pipeline and a dedicated workbench.

L1 / L2 / L3 layout — L1 captures raw run traces, L2 holds normalised
document records, L3 holds curated slots per surface (chat, notebook, book,
TutorBot). Per-user paths flow through PathService so multi-user
deployments stay isolated.
Consolidator pipeline — modular consolidator/ modules (chunker, guards,
parse, references, runs, modes, line-doc, meta) turn run traces into
versioned line-oriented documents with stable ids, references between
layers, and a snapshot history.
Memory Workbench UI — new /memory routes (graph, l1, l2, l3,
resolve) ship as standalone pages with workbench, hub, graph viewer, run
panel, and an archived-state banner. A reusable MemorySection component is
embedded where the legacy memory panel used to live.
First-class chat tools — read_memory and write_memory are exposed
as agent tools (with i18n hints) so chat / Auto can recall and update memory
inside a turn instead of needing a separate save step.
Settings integration — Memory now has its own page under
/settings/memory with run controls, mode toggles, and storage status.

Deep Research, Deep Solve, and Question on the Agentic Engine

The three multi-agent pipelines have been rewritten as orchestrators on top of
the shared agentic-engine primitives, deleting hundreds of bespoke prompt
files and per-agent classes.

Deep Research → agents/research/pipeline.py — four phases (Rephrase,
Decompose, Research blocks, Reporting) implemented as labeled steps
(THINK / TOOL / APPEND / OUTLINE / SECTION / FINISH). The dynamic
topic queue and CitationManager are preserved; the new APPEND label lets
research blocks add follow-up topics to the queue without leaving the loop.
ask_user v2 drives up to three rephrase rounds with multi-question cards.
Deep Solve → agents/solve/pipeline.py — Pre-retrieve (KB-only),
Plan, Solve (per-step THINK / TOOL / FINISH / REPLAN loop with a
back-edge from solve to plan), and a final Synthesize step. Each step's
FINISH flows into the next step's prompt context so the answer reads as
one continuous narrative.
Question / Quiz — coordinator + pipeline replace the old generator /
idea_agent / models modules; the old prompt directories have been
removed entirely.
All three drop the legacy agents/ and prompts/ directories for their
respective modes, leaving one pipeline file and shared labeled-step prompts.

Chat Capability & LlamaIndex RAG Refactor

The agentic chat pipeline has been rebuilt around a session-cumulative
"Attached Sources" manifest and a cleaner LlamaIndex pipeline.

Branch-isolated source inventory — services/session/source_inventory.py
materialises every source attached on the active branch's ancestor chain.
Fresh sources from the current turn show a full preview; historical sources
show a one-line row with id, name, kind, size, and the turn ordinal where
they first appeared. The LLM calls read_source(id) to expand the full
text on demand. Sibling branches never leak sources into each other.
LlamaIndex pipeline split-out — dedicated config.py, ingestion.py,
retrievers.py, and document_loader.py replace the previous monolithic
pipeline module. Storage stays backward-compatible with v1.3 versioned
indexes.
Lean agentic chat prompt — agentic_chat.yaml (EN/ZH) was rewritten to
match the new tool surface and the source-inventory contract; the old
parallel-tool prompt scaffolding is gone.
Builtin tools registry — tools/builtin/__init__.py is the single place
where chat-mounted tools, hint prompts, and arg-augmentation wrappers are
registered.

Capabilities Infrastructure Unification

Every capability now goes through one shared envelope, one status-i18n loader,
and one cost-tracking surface.

emit_capability_result helper — every capability emits its final
result through one helper that fills the result envelope (label, summary,
payload, render hints) and the trailing usage-tracker totals consistently.
StatusI18n — capability status copy lives in
capabilities/prompts/{en,zh}/<name>.yaml and is loaded via a shared
StatusI18n accessor. Hard-coded English status strings have been removed
from the pipelines.
UsageTracker cost surface — token usage and cost are tracked through
one tracker per capability run, exposed to the result envelope, and shown
on the new /settings/capabilities admin page (live list, defaults,
per-capability override toggles).
Deprecated main.yaml keys removed — the legacy main.yaml capability
copy has been deleted in favor of per-capability prompt files.

Visualize: Animator Folded Into One Capability

The standalone Animator menu has been merged into Visualize so the user picks a
visualization once and the system chooses the renderer.

render_type discriminator — AnalysisAgent picks one of six render
types — svg, chartjs, mermaid, html (text-emitting, three-stage
pipeline) or manim_video / manim_image (Manim subprocess pipeline). The
result envelope carries render_type so the frontend delegates to the
right viewer.
Single sidebar entry — the old Animator menu entry is gone; users now
go through Visualize for both static charts and Manim videos. The
fullscreen viewer / config panel handle all render types.

New Chat Tools

ask_user — packages 1–3 structured questions into a single payload that
pauses the same turn until the user answers. The frontend renders a card
letting the user navigate questions and submit answers in one batch; the
pipeline resumes the turn with the answers wired back as the tool result.
Used by Deep Research's Rephrase phase and available to chat / Auto.
web_fetch — URL fetch with readable-content extraction, strict scheme
/ private-IP / size guards (applied both pre-flight and post-redirect),
and …[truncated] markers when output exceeds the cap.
write_note — replaces the old save_to_notebook tool. Two modes:
append creates a new record (default body is the rendered transcript,
optional agent-authored body) and edit updates an existing record by
record_id.
list_notebook — read-only index / drill-down listing of the active
user's notebooks and records. Only mounted when the user actually has
notebooks, so empty runs are impossible by construction.
github_query — read-only gh CLI wrapper covering pr, issue,
run, repo, and a GET-only api fallback. No mutation verbs are
reachable through the tool surface. Returns a clean "tool unavailable"
outcome when gh is not installed.
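
For the web_fetch guards in particular, a minimal sketch of the scheme and private-IP checks plus the truncation marker is given below. The function names are hypothetical, and the resolved address is passed in as a parameter so the sketch stays free of DNS lookups; the same check would be repeated for each redirect hop.

```python
import ipaddress
from urllib.parse import urlsplit


def check_fetch_target(url: str, resolved_ip: str) -> None:
    """Raise ValueError unless the scheme is http(s) and the address is public."""
    if urlsplit(url).scheme not in ("http", "https"):
        raise ValueError(f"blocked scheme: {url}")
    ip = ipaddress.ip_address(resolved_ip)
    if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
        raise ValueError(f"blocked address: {resolved_ip}")


def cap_output(text: str, limit: int) -> str:
    """Cut text at the cap and append the truncation marker."""
    if len(text) <= limit:
        return text
    return text[:limit] + "…[truncated]"
```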

Chat Surface Features

Delete chat turn (#443) — message items now carry a stable id, the
session API exposes deleteMessage, the chat reducer adds a DELETE_TURN
action, and a 409 vs 404 check rejects deletion of a still-running turn.
Optimistic temp ids are resolved before deletion to avoid orphaned UI rows.
Quiz follow-up chat composer — FollowupChatComposer and
QuizFollowupContext let the user start a chat thread directly from a quiz
question. The composer reuses the ma...

DeepTutor v1.3.10 Release Notes

Release Date: 2026.05.10

v1.3.10 is a focused reliability release for the issues reported after v1.3.9.
It restores smoother remote Docker access, makes self-signed LLM endpoints work
consistently across SDK-backed providers, protects code snippets from citation
rewrites, and splits Matrix E2EE into an explicit opt-in dependency.

Highlights

Remote Docker and CORS Recovery

Remote single-user Docker works out of the box again - when
AUTH_ENABLED=false, DeepTutor now accepts browser origins over HTTP/HTTPS so
LAN or remote-server frontends no longer hit the v1.3.8/v1.3.9 CORS
regression reported in #463.
Authenticated deployments stay explicit - when AUTH_ENABLED=true, CORS
still requires a concrete allowlist through CORS_ORIGIN or CORS_ORIGINS,
preserving the credentialed-auth safety boundary.
Multiple deployment origins are supported - CORS_ORIGINS accepts comma- or
newline-separated values, and both Docker Compose files pass the setting
through to the backend container.
Settings no longer drop network flags - CORS_ORIGIN, CORS_ORIGINS, and
DISABLE_SSL_VERIFY are part of the canonical .env write order.
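
The CORS rules above can be summarized in a short sketch. Both function names are hypothetical, and the trailing-slash trimming and de-duplication are assumptions added for illustration rather than documented behavior of this release.

```python
import re
from urllib.parse import urlsplit


def parse_cors_origins(raw: str) -> list:
    """Split a comma- or newline-separated CORS_ORIGINS value."""
    origins = []
    for item in re.split(r"[,\n]", raw):
        origin = item.strip().rstrip("/")
        if origin and origin not in origins:
            origins.append(origin)
    return origins


def is_origin_allowed(origin: str, auth_enabled: bool, allowlist: list) -> bool:
    """With auth on, require an explicit allowlist entry; otherwise allow http(s)."""
    if auth_enabled:
        return origin.rstrip("/") in allowlist
    return urlsplit(origin).scheme in ("http", "https")
```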

Provider TLS and Rendering Fixes

DISABLE_SSL_VERIFY now reaches OpenAI SDK paths - OpenAI-compatible,
Azure OpenAI, executor, TutorBot, and legacy embedding SDK clients all receive
a shared httpx.AsyncClient(verify=False) when the flag is enabled, fixing
self-signed HTTPS LLM endpoints reported in #464.
Production still blocks unsafe TLS bypasses - ENVIRONMENT=prod or
ENVIRONMENT=production rejects DISABLE_SSL_VERIFY, with a single warning
logged in non-production use.
Code blocks keep array indexes intact - Markdown citation linkification now
masks fenced and inline code before rewriting references, so values[0] stays
code instead of becoming a #references citation link (#468).
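
The mask-then-rewrite idea behind the citation fix can be sketched as follows. The [n](#references) link shape and the function name are assumptions; only the approach of masking fenced and inline code before linkifying comes from these notes.

```python
import re

FENCE = "`" * 3
# Fenced blocks first, then inline code spans.
_CODE = re.compile(FENCE + r".*?" + FENCE + r"|`[^`\n]+`", re.DOTALL)
_CITATION = re.compile(r"\[(\d+)\]")
_PLACEHOLDER = re.compile(r"\x00(\d+)\x00")


def linkify_citations(markdown: str) -> str:
    """Turn [n] into a references link, leaving code spans untouched."""
    stash = []

    def mask(match):
        stash.append(match.group(0))
        return f"\x00{len(stash) - 1}\x00"

    masked = _CODE.sub(mask, markdown)
    linked = _CITATION.sub(r"[\1](#references)", masked)
    return _PLACEHOLDER.sub(lambda m: stash[int(m.group(1))], linked)
```

Because code spans are swapped for placeholders before the citation pattern runs, an index expression such as values[0] is restored verbatim afterwards.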

Matrix Install Compatibility

Matrix no longer installs E2EE by default - the standard matrix extra and
requirements/matrix.txt now use plain matrix-nio, avoiding the
python-olm / libolm build failures seen on macOS Python 3.14 and Apple
Clang 21 (#462).
Encrypted rooms are an explicit add-on - install deeptutor[matrix-e2e]
or requirements/matrix-e2e.txt when E2EE support is needed and libolm is
available.
Runtime failures are clearer - Matrix defaults to non-E2EE mode, and
enabling E2EE without crypto dependencies now raises an actionable install
message instead of failing at import time.

Multi-User Runtime Compatibility

Default workspace paths stay stable outside user scope - when no current
multi-user context is active, path resolution falls back to the default data
workspace rather than forcing an admin scope.
Legacy test and monkeypatch hooks remain available - session and settings
routers keep compatibility shims used by tests and older integrations.
Local agent artifacts are ignored - .claude/ is now excluded from Git so
local worktrees and agent metadata do not accidentally enter releases.

Tests

Added CORS setting tests for unauthenticated remote origins and authenticated
explicit allowlists.
Added shared OpenAI SDK HTTP-client tests across provider-core, Azure,
executors, TutorBot, and embedding adapters.
Added Markdown display tests for prose citations, fenced code, inline code,
and explicit backticked citations.
Added Matrix dependency split tests to keep default installs free of
matrix-nio[e2e].
Re-ran targeted Python tests, web node tests, Ruff checks, and diff whitespace
validation for the release patch.

Upgrade Notes

If you run remote Docker with AUTH_ENABLED=false, no extra CORS setting is
required for normal HTTP/HTTPS browser origins.
If you run a shared or authenticated deployment with AUTH_ENABLED=true, set
CORS_ORIGIN or CORS_ORIGINS to the exact frontend origin(s), for example
https://learn.example.com.
Use DISABLE_SSL_VERIFY=true only for local, self-signed, or air-gapped test
LLM endpoints. It remains blocked in ENVIRONMENT=prod and
ENVIRONMENT=production.
Matrix installs are now non-E2EE by default. For encrypted Matrix rooms,
install .[matrix-e2e] or requirements/matrix-e2e.txt, ensure libolm is
present, and set e2ee_enabled=true in the Matrix channel config.
If you previously installed .[matrix] only to get non-encrypted Matrix
messaging, reinstalling after this release should no longer require native
libolm build tooling.

Full Changelog: v1.3.9...v1.3.10

DeepTutor v1.3.9 Release Notes

Release Date: 2026.05.09

v1.3.9 builds on the v1.3.8 multi-user foundation with broader TutorBot
deployment options, safer provider routing for thinking models, and a smoother
web onboarding path. It adds Zulip and NVIDIA NIM support, improves startup
ergonomics, and folds in the main issue fixes reported after the last release.

Highlights

TutorBot Channel and Provider Expansion

Zulip is now a TutorBot channel - bots can listen to private messages and
stream topics, enforce allow_from, choose mention-only or open stream
replies, and bridge Zulip's event queue into the async TutorBot bus.
Math and files work better in Zulip - LaTeX is converted to Zulip-friendly
KaTeX markup, upload/download calls use configurable retry behavior, and
attachment filenames include upload-path digests to avoid collisions.
Zulip topics keep conversations separated - stream topics now become part
of the chat/session key, with a stable (no topic) fallback for empty topics.
TutorBot supports NVIDIA NIM - nvidia_nim is available in TutorBot
provider config and registry detection, including NIM's streaming behavior
that omits unsupported stream_options.
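
To illustrate how a stream topic can become part of the session key, here is a minimal sketch. The key layout (`zulip:<stream>:<topic>`) and the function name are hypothetical; only the `(no topic)` fallback comes from the notes above.

```python
NO_TOPIC = "(no topic)"  # stable fallback named in the release notes


def zulip_session_key(stream, topic):
    """Build a per-topic session key; the key format itself is an assumption."""
    normalized = (topic or "").strip() or NO_TOPIC
    return f"zulip:{stream}:{normalized}"
```

With a scheme like this, two topics in the same stream map to different sessions, while messages with an empty topic always share one.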

Model and Runtime Reliability

Configured context windows are respected - the safety ceiling is raised to
1,000,000 tokens while the large-model fallback remains 65,536, so explicit
128K-style model settings are no longer silently clamped.
Qwen vision detection is fixed - Qwen VL models are treated as
vision-capable across DashScope, OpenAI-compatible, and custom bindings.
Minimal thinking mode is provider-safe - DeepSeek, DashScope, VolcEngine,
BytePlus, and MiniMax no longer receive a rejected top-level
reasoning_effort=minimal; DeepTutor sends the provider-specific disable
signal instead.
DeepSeek v4 costs are tracked - research token accounting includes
deepseek-v4-flash and deepseek-v4-pro pricing entries.
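
The context-window rule can be sketched as follows. The two numeric limits are taken from the note above; the function name and the treatment of a missing or non-positive setting are assumptions, and the real resolver may apply the fallback under different conditions.

```python
CONTEXT_WINDOW_CEILING = 1_000_000  # safety ceiling from the notes
LARGE_MODEL_FALLBACK = 65_536       # fallback when nothing explicit is configured


def resolve_context_window(configured):
    """Honor an explicit setting up to the ceiling; otherwise use the fallback."""
    if configured is None or configured <= 0:
        return LARGE_MODEL_FALLBACK
    return min(configured, CONTEXT_WINDOW_CEILING)
```

Under this sketch a 128K setting (131,072 tokens) passes through unchanged, which is the behavior the fix describes.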

Web and CLI Polish

deeptutor start launches the full web stack - the CLI now delegates to
scripts/start_web.py so backend and frontend can be started from one
command, and launcher failures propagate through the CLI exit code.
Sidebar onboarding is clearer - primary navigation icons now expose
scoped, localized tooltips with descriptions and keyboard focus support.
Multi-line user messages stay readable - chat message rendering preserves
Shift+Enter line breaks, fixing code blocks and structured prompts that were
previously collapsed into one line.
Assigned resources are easier to understand - model-selection summaries
and read-only knowledge-base actions now present clearer labels for
non-admin, grant-scoped sessions.

Multi-User and Session Store Parity

Assigned model options match the selector contract - non-admin LLM choices
now return profile names, model names, labels, and active/default metadata in
the same shape expected by the web model selector.
PocketBase sessions support more chat flows - message metadata can be
persisted, last-message lookup is available, and message deletion works with
PocketBase string IDs as well as SQLite integer IDs.
Regenerate remains storage-neutral - turn retry logic can remove the last
assistant message without assuming the backing session store uses integer
primary keys.

Tests

Added Zulip channel coverage for config parsing, permission checks, duplicate
filtering, mentions, stream topic scoping, attachment extraction, retry
behavior, LaTeX conversion, typing status, sending, uploads, and startup
failures.
Added TutorBot NVIDIA NIM provider tests for registry detection, schema
acceptance, and streaming request compatibility.
Added LLM regression tests for Qwen vision capability, explicit context-window
budgets, and minimal-thinking provider kwargs.
Added CLI coverage so deeptutor start propagates the launcher exit code.
Added research token-pricing coverage for the DeepSeek v4 model entries.

Upgrade Notes

Install or refresh the .[tutorbot] extra, or requirements/tutorbot.txt, to
include the new zulip>=0.8.0,<1.0.0 dependency before enabling Zulip bots.
Configure Zulip bots with site, email, apiKey, allowFrom, and
groupPolicy; use mention for safer stream deployments and open only
when every stream message should reach the bot.
If you use LLM_REASONING_EFFORT=minimal with DeepSeek, DashScope,
VolcEngine, BytePlus, or MiniMax, keep the setting as-is; v1.3.9 translates it
to the correct provider-specific disable payload.
Large configured context windows may now be honored instead of capped at
65,536 tokens, so verify provider limits and expected prompt-cost behavior.
Optional PocketBase deployments should ensure the messages collection has a
metadata_json JSON field before relying on regenerate/session metadata
parity.

What's Changed

fix: raise context_window ceiling and add qwen vision support by @wedone in #442
fix: add deepseek-v4-flash and deepseek-v4-pro to model pricing table by @Starfie1d1272 in #447
fix(llm): stop sending reasoning_effort=minimal as top-level param to providers that reject it by @Starfie1d1272 in #453
feat: add deeptutor start command to launch backend and frontend together by @Starfie1d1272 in #445
fix(web): preserve newlines in user chat messages by @kagura-agent in #449
feat(tutorbot): add Zulip channel support by @wedone in #452
feat: tooltips for sidebar by @philliplagoc in #457
fix: add TutorBot NVIDIA NIM provider support by @Bortlesboat in #455

New Contributors

@philliplagoc made their first contribution in #457
@Bortlesboat made their first contribution in #455

Full Changelog: v1.3.8...v1.3.9

DeepTutor v1.3.8 Release Notes

Release Date: 2026.05.08

v1.3.8 brings DeepTutor's optional multi-user mode into the main release line.
It keeps local single-user installs unchanged while adding authenticated shared
deployments with isolated user workspaces, admin-managed access, and clearer
deployment guidance.

Highlights

Multi-User Workspaces

Authentication can gate shared deployments - enabling AUTH_ENABLED
adds login, registration, JWT sessions, and a first-user admin flow.
Each user gets isolated data - ordinary users work under
multi-user/<uid>/ with separate chat history, memory, notebooks, and
knowledge bases, while admins keep the main workspace.
Admin grants control access - /admin/users lets admins create users and
assign allowed model profiles, knowledge bases, skills, and copied spaces
without exposing API keys.
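
A rough sketch of the workspace split, assuming the main workspace lives under `data/` and that user IDs are simple directory names. The helper name and the ID validation are illustrative, not the project's actual resolver.

```python
from pathlib import Path


def workspace_root(base, uid, is_admin):
    """Admins keep the main workspace; other users get multi-user/<uid>/."""
    base = Path(base)
    if is_admin:
        return base / "data"  # assumed location of the main workspace
    # Reject IDs that could escape the per-user directory.
    if not uid or "/" in uid or "\\" in uid or uid in {".", ".."}:
        raise ValueError(f"invalid user id: {uid!r}")
    return base / "multi-user" / uid
```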

Safer Runtime Boundaries

Knowledge and RAG stay scoped - assigned knowledge bases are visible with
badges, and non-admin RAG calls no longer fall back silently to admin data.
Model routing honors grants - non-admin chat turns use an assigned model
profile and fail early if no LLM is available.
Settings are redacted for users - non-admin settings show theme, language,
and model summaries, while provider secrets and endpoints remain admin-only.

Deployment and UI

Frontend auth routes are included - /login, /register, auth-aware
middleware, logout controls, and admin navigation are wired into the web app.
Multi-user docs are now first-class - README and translated READMEs
document setup, workspace layout, audit logs, env vars, and production
caveats.
Optional PocketBase remains documented - PocketBase can still be used as a
sidecar path, but true multi-user deployments should leave POCKETBASE_URL
unset and use the built-in JSON/SQLite backend.

Tests

Added multi-user tests for identity migration, first-admin registration,
grants, settings restrictions, scoped interface preferences, skill access, and
RAG fallback prevention.
Added status-redaction coverage so non-admin users do not receive provider
model or search endpoint details.

Upgrade Notes

Existing local installs stay in single-user mode unless AUTH_ENABLED=true.
For real multi-user deployments, set AUTH_ENABLED=true, keep
POCKETBASE_URL blank, create the first admin through /register, and assign
models before ordinary users start chat turns.
New deployment state is stored under multi-user/; back up both data/ and
multi-user/ before upgrading shared instances.
Multi-worker deployments should bootstrap the first admin carefully because
first-user promotion is protected by an in-process lock, which does not
coordinate across separate worker processes.
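
The multi-worker caveat follows from how an in-process lock behaves. The sketch below uses a hypothetical `UserRegistry` class with in-memory storage; it is not DeepTutor's user store, only an illustration of the pattern.

```python
import threading


class UserRegistry:
    """Toy registry: the first registered user is promoted to admin."""

    def __init__(self):
        self._lock = threading.Lock()  # guards threads in this process only
        self._users = {}

    def register(self, username):
        with self._lock:
            # Check-and-insert is atomic for threads sharing this object.
            role = "admin" if not self._users else "user"
            self._users[username] = role
            return role
```

Each worker process would hold its own lock object, so two simultaneous first registrations on different workers are not serialized by it. Creating the first admin before scaling out to several workers avoids that race.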

Full Changelog: v1.3.7...v1.3.8

DeepTutor v1.3.7 Release Notes

Release Date: 2026.05.04

v1.3.7 focuses on thinking-model compatibility, clearer knowledge-base index
history, and safer Co-Writer editing. It keeps provider-specific reasoning
output under control while making index activity easier to understand in the UI.

Highlights

Thinking-Model and Gateway Compatibility

Reasoning output stays separate - OpenAI-compatible and TutorBot providers
keep reasoning_content out of visible answer text, and streaming avoids
replaying internal scratchpad as final content.
DeepSeek thinking can be configured from .env - LLM_REASONING_EFFORT
is documented and applied through the resolver path. Use minimal to disable
DeepSeek thinking, or high / max to enable it.
Custom gateway headers are preserved - chat and explicit LLM calls inherit
profile extra_headers, fixing gateways that require custom headers such as
a User-Agent override.
Structured generation is more tolerant - book blocks and question ideation
now handle fenced, repaired, list-shaped, or otherwise imperfect JSON outputs
more reliably.
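
Keeping reasoning separate from the visible answer can be sketched as a small accumulator over streamed deltas. The delta shape (plain dicts with optional `content` and `reasoning_content` keys) is an assumption modeled on OpenAI-compatible streaming; the function name is hypothetical.

```python
def split_stream(deltas):
    """Collect visible answer text and reasoning text into separate buffers."""
    answer, reasoning = [], []
    for delta in deltas:
        if delta.get("reasoning_content"):
            reasoning.append(delta["reasoning_content"])
        if delta.get("content"):
            answer.append(delta["content"])
    # Only the first element should ever be shown as the final answer.
    return "".join(answer), "".join(reasoning)
```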

Knowledge Index Visibility

Index activity is recorded - create, upload, and re-index flows now store
last_indexed_at, indexed document count, and the index action in knowledge
metadata.
Progress payloads describe real index changes - backend status updates can
distinguish metadata-only completion from an actual vector-index update.
The Knowledge UI shows index history - detail, settings, and index-version
panels display the latest index time and document count when available.
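
A minimal sketch of recording index activity only when the vector index really changed. The metadata keys match the names listed in this release's upgrade notes; the function name, its signature, and the ISO timestamp format are assumptions.

```python
from datetime import datetime, timezone


def record_index_activity(metadata, action, document_count, index_changed, now=None):
    """Return updated metadata, or the original if the index did not change."""
    if not index_changed:
        return metadata  # metadata-only completion leaves the history untouched
    updated = dict(metadata)  # copy so the caller's dict is not mutated
    updated["last_indexed_at"] = (now or datetime.now(timezone.utc)).isoformat()
    updated["last_indexed_count"] = document_count
    updated["last_indexed_action"] = action
    return updated
```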

Co-Writer Editing Safety

Clear and template actions ask first - replacing a non-empty draft now
opens a confirmation dialog before the editor is cleared or overwritten.
Undo is more dependable - pending typing snapshots are committed before
toolbar edits, and editor shortcuts support Ctrl/Cmd+Z, Shift+Cmd+Z, and
Ctrl/Cmd+Y.
Toolbar controls are clearer - destructive and template actions now have
distinct tones, focus states, labels, and accessible tooltips.

Tests

Added OpenAI-compatible provider tests to keep reasoning_content separate
from visible response content in both service and TutorBot paths.
Expanded LLM factory tests for inherited extra_headers, inherited
reasoning_effort, and reasoning-only streaming behavior.
Added knowledge manager coverage for recording last_indexed_* metadata only
when the index actually changes.

Upgrade Notes

Set LLM_REASONING_EFFORT in .env if you need global thinking control.
Leave it empty to let DeepTutor auto-detect behavior from the active model.
Knowledge-base metadata may now include last_indexed_at,
last_indexed_count, and last_indexed_action.
Co-Writer clear/template actions are recoverable through undo until the user
leaves the current draft.

Full Changelog: v1.3.6...v1.3.7

DeepTutor v1.3.6 Release Notes

Release Date: 2026.05.03

v1.3.6 focuses on making model routing explicit across DeepTutor. Users can
choose configured LLM profiles from chat and TutorBot flows, runtime services
resolve those choices without leaking provider secrets, and RAG/knowledge-base
index handling is more defensive when persisted embeddings are invalid.

Highlights

Catalog-Based Model Selection

Chat can target a configured model - unified chat turns now carry a
profile_id and model_id selection through the WebSocket payload, session
preferences, turn snapshots, and regenerate flows.
Settings exposes safe LLM options - the new settings options endpoint
returns display-ready provider/model choices while omitting credentials and
connection secrets from the response.
Runtime model overrides are scoped per turn - selected profiles are
resolved through the provider catalog for the active request without writing
temporary choices back to disk or changing global defaults.
Model-selector UI is shared - chat and TutorBot screens use the same
configured-model selector, with localized labels and system-default handling.

TutorBot Model Control

Bots can persist model selections - TutorBot create/update flows now accept
llm_selection, validate it against the configured catalog, and store it with
each bot.
Running bots can reload their LLM - changing a bot's model updates the
active agent loop instead of requiring a full bot restart.
Recent bot history is steadier - TutorBot history assembly now sorts by
message timestamp with stable tie-breaking before taking the latest context.
Bot chat route changes are cleaner - the web chat page cancels in-flight
bot requests and resets transient reasoning state when switching bots.
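
The history ordering can be sketched like this, assuming each message is a dict with a comparable `timestamp` field. Using the original position as the tie-breaker is an illustrative choice; the notes only say the tie-breaking is stable.

```python
def latest_history(messages, limit):
    """Return the newest `limit` messages in chronological order."""
    if limit <= 0:
        return []
    indexed = list(enumerate(messages))
    # Python's sort is already stable; the explicit index makes the
    # tie-break rule for equal timestamps visible.
    indexed.sort(key=lambda pair: (pair[1]["timestamp"], pair[0]))
    return [message for _, message in indexed[-limit:]]
```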

RAG and Knowledge Reliability

Invalid vectors trigger rebuilds - re-indexing no longer treats a matching
document signature as reusable when the existing vector store fails embedding
validation.
Full rebuilds use fresh version directories - complete knowledge-base
rebuilds write to a new flat index version while leaving failed old storage
available for inspection.
RAG tool logs can stream to clients - retrieval runs can forward captured
INFO-level process logs as raw tool events when an event sink is available.
Knowledge health checks recognize bad embeddings - invalid persisted
vectors are surfaced earlier instead of producing opaque search failures.

Provider and Launch Fixes

OpenAI Responses token limits are normalized - Responses API calls now map
chat-style max_completion_tokens and max_tokens to max_output_tokens,
fixing the SDK error reported for newer OpenAI models in #437.
Azure and OpenAI-compatible paths share the mapping - both streaming and
non-streaming Responses API routes use the same conversion helper.
Launch ports come from .env and environment variables - setup and launch
helpers now keep backend/frontend port behavior aligned around the project
.env file instead of the older runtime settings JSON.
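
The token-limit mapping might look like the sketch below. The parameter names come from the note above, but the helper name and the precedence order (an explicit max_output_tokens first, then max_completion_tokens, then max_tokens) are assumptions, not the project's confirmed behavior.

```python
# Checked in this order; the first non-None value wins (assumed precedence).
TOKEN_LIMIT_ALIASES = ("max_output_tokens", "max_completion_tokens", "max_tokens")


def to_responses_kwargs(kwargs):
    """Return a new dict with chat-style token limits mapped to max_output_tokens."""
    converted = {k: v for k, v in kwargs.items() if k not in TOKEN_LIMIT_ALIASES}
    for name in TOKEN_LIMIT_ALIASES:
        value = kwargs.get(name)
        if value is not None:  # None-valued aliases are dropped
            converted["max_output_tokens"] = value
            break
    return converted
```

Building a fresh dict rather than editing `kwargs` in place means the caller's arguments are left unchanged.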

Web UX Polish

Skill names validate before save - the Skills editor slugifies names,
flags invalid input inline, and prevents silent API failures for uppercase
letters, spaces, underscores, or other unsupported characters.
Skill editor modals are opaque across themes - the editor now uses the
page background token, avoiding text bleed-through in translucent themes.
Space navigation is easier to scan - Space mini-navigation, notebook,
question-bank, skills, and session-list spacing were tightened with clearer
card and divider treatment.

Tests

Added model-selection service tests for safe option listing, active markers,
invalid profile/model rejection, and non-mutating catalog overrides.
Added unified WebSocket turn-runtime tests for persisted LLM selections,
invalid selections, model switching, snapshots, and regenerate behavior.
Added TutorBot API and manager tests for llm_selection persistence,
validation, runtime reload, and default-model behavior.
Added settings, provider-runtime, and LLM-config tests for scoped catalog
selection and per-turn config precedence.
Added RAG and knowledge-router tests for invalid vector stores, re-index
rebuild decisions, and storage version resolution.
Added OpenAI Responses converter tests for token-limit aliases, precedence,
None filtering, and input immutability.
Added frontend slug tests for skill-name normalization and validation.

Upgrade Notes

Chat and TutorBot clients that want explicit model routing should send
llm_selection as { "profile_id": "...", "model_id": "..." }. Omitting it
continues to use the configured system default.
TutorBot configuration files may now contain llm_selection. Existing bot
configs without that field continue to load, and legacy model values remain
usable as model-name overrides.
Launch ports should be configured in .env or process environment variables
(BACKEND_PORT / FRONTEND_PORT). The old data/user/settings/env.json
port block is no longer used as a launch-port source.
Knowledge bases with stale or invalid persisted vectors may rebuild on the
next re-index even when document signatures have not changed.
Skill names are now normalized and validated as lowercase slugs of up to 64
characters using letters, numbers, and hyphens.
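
The skill-name rule can be approximated as follows. This is a sketch under assumptions: the notes give the character set and the 64-character limit, while the exact pattern (for example, no leading, trailing, or doubled hyphens) and both helper names are illustrative.

```python
import re

MAX_SLUG_LENGTH = 64
# Lowercase letters and digits, separated by single hyphens (assumed shape).
SLUG_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def slugify(name):
    """Normalize free text into a lowercase hyphenated slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.strip().lower()).strip("-")
    return slug[:MAX_SLUG_LENGTH].strip("-")


def is_valid_slug(name):
    """Check a name against the assumed slug pattern and length limit."""
    return len(name) <= MAX_SLUG_LENGTH and SLUG_PATTERN.fullmatch(name) is not None
```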

Full Changelog: v1.3.5...v1.3.6

DeepTutor v1.3.5 Release Notes

Release Date: 2026.05.02

v1.3.5 focuses on making local setup and knowledge-base chat more reliable. The
launcher now follows the same runtime settings users configure in the web app,
RAG tool calls are stricter about real search queries, and local embedding
servers no longer receive placeholder auth headers.

Highlights

Smoother Local Launch

Setup Tour writes launch ports - the guided installer now records backend
and frontend ports in data/user/settings/env.json, so later launches can use
the same choices.
start_web.py reads runtime settings first - backend/frontend ports and UI
language come from web settings when available, with .env kept as fallback.
Cleaner process handling - the launcher records started processes, detects
port conflicts, waits for readiness, and exposes scripts/stop_web.py for
cleaning up recorded backend/frontend processes.
Setup requirements are clearer - README and environment examples now align
around Node.js 20.9+, install profiles, complete embedding endpoint URLs, and
optional attachment storage.
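
The port precedence described for this release (runtime settings first, .env as fallback) can be sketched as below. The helper names are hypothetical, and ignoring invalid values in favor of the next source is an assumption about how invalid ports are handled.

```python
def _parse_port(value):
    """Return a valid TCP port as int, or None when the value is unusable."""
    try:
        port = int(str(value).strip())
    except (TypeError, ValueError):
        return None
    return port if 1 <= port <= 65535 else None


def resolve_port(runtime_value, env_value, default):
    """Prefer the runtime setting, then the .env value, then the default."""
    for candidate in (runtime_value, env_value):
        port = _parse_port(candidate)
        if port is not None:
            return port
    return default
```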

More Reliable RAG Tool Calls

RAG queries must be non-empty - tool schemas, prompts, and built-in checks
now reject blank queries early instead of passing empty input into retrieval.
Chat-side fallback is safer - when a model omits the RAG query, the agentic
pipeline can reuse the user's actual question as the retrieval query.
ReAct calls accept simple string input - rag actions that provide a
string are normalized to {"query": ...}, reducing fragile tool-call failures.
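
Taken together, the three rules above could be combined in one normalizer. This is a sketch: the function name and the error message are hypothetical, and the real pipeline may apply the fallback at a different layer.

```python
def normalize_rag_args(action_input, user_question=""):
    """Normalize a rag tool call into a dict with a non-empty query."""
    if isinstance(action_input, str):
        args = {"query": action_input}  # plain string input becomes {"query": ...}
    else:
        args = dict(action_input or {})
    query = str(args.get("query") or "").strip()
    if not query:
        query = user_question.strip()  # reuse the user's actual question
    if not query:
        raise ValueError("RAG query must be non-empty")
    args["query"] = query
    return args
```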

Local Embedding Compatibility

No fake API key for local embedding providers - runtime config no longer
injects sk-no-key-required for local embedding servers.
Placeholder keys are not sent as auth headers - OpenAI-compatible
embedding requests suppress Authorization and api-key when the configured
key is the local placeholder, which helps LM Studio, Ollama, vLLM, and similar
servers.
Embedding examples are easier to follow - English and Chinese sample env
files now explain that EMBEDDING_HOST is the exact endpoint DeepTutor calls.
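
Suppressing auth headers for the local placeholder key could look roughly like this. The placeholder string comes from the notes; the function name and the `azure_style` switch between the two header names are assumptions for illustration.

```python
PLACEHOLDER_KEY = "sk-no-key-required"


def embedding_auth_headers(api_key, azure_style=False):
    """Return auth headers, or none for an empty or placeholder key."""
    key = (api_key or "").strip()
    if not key or key == PLACEHOLDER_KEY:
        return {}  # local servers receive no Authorization or api-key header
    if azure_style:
        return {"api-key": key}
    return {"Authorization": f"Bearer {key}"}
```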

Web UX Polish

Dark-mode provider dropdown is readable - the Settings provider selector
now uses the theme background token, fixing the white native dropdown popover
reported on Edge/Chromium.
Settings controls are more consistent - select fields and setup tour
spotlight behavior were tightened for a steadier settings experience.
Book reference payloads are normalized more defensively - selected book
references keep the same behavior with cleaner filtering and deduplication.

Tests

Added launch settings tests for runtime settings precedence, .env fallback,
and invalid-port handling.
Added start_web.py tests for translation, state persistence, and recorded
process matching.
Added Setup Tour coverage for dependency profiles, Math Animator selection,
Node.js version validation, and saved launch ports.
Added RAG/tool tests for non-empty query schemas, blank-query rejection, and
fallback query behavior.
Added embedding runtime and adapter tests for local providers, placeholder API
keys, and auth header suppression.

Upgrade Notes

Local web installs now require Node.js 20.9 or newer.
start_web.py and setup helpers prefer data/user/settings/env.json and
interface.json over .env; edit the web Settings page or rerun
start_tour.py when changing launch ports.
Local OpenAI-compatible embedding servers should use an empty API key unless a
real key is required. Avoid relying on sk-no-key-required as a transmitted
credential.
Custom RAG callers should always provide a non-empty query; blank queries now
fail fast by design.

Full Changelog: v1.3.4...v1.3.5

Releases: HKUDS/DeepTutor

Models-Endpoint Probe Honors `DISABLE_SSL_VERIFY`