Releases: HKUDS/DeepTutor
v1.4.2
DeepTutor v1.4.2 Release Notes
Release Date: 2026.05.28
v1.4.2 is a stability and polish release on top of v1.4.1.
It unblocks Gemini 2.5+ across Visualize and the chat agent, fixes a
ContextVar regression that silently routed authenticated requests to the
admin workspace, hardens the chat protocol for reasoning models with
native tool calling, ships smooth-streaming UX across every chat
surface, and adds support for the Lemonade local provider.
Gemini 2.5+ Reasoning Default-Off
Gemini 2.5 / 3 ship with thinking enabled by default and burn the entire
max_tokens budget on reasoning unless reasoning_effort: "none" is
sent on the request. v1.4.2 centralizes that logic in
reasoning_params.default_reasoning_effort_for, the single source of
truth used by all three execution paths (the OpenAI SDK, the aiohttp
fallback, and the reasoning-kwargs builder). Visualize, Chat, Solve,
and the agentic loop all stop returning empty bodies when configured
against gemini-2.5-pro / gemini-2.5-flash / gemini-3-*.
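The centralized default can be pictured as one small lookup. The sketch below is a hypothetical stand-in for default_reasoning_effort_for, not the project's code: the function name, the regex, and the convention that None means "send nothing" are assumptions. Only the 2.5+ cutoff and the "none" value come from the notes above.

```python
import re
from typing import Optional

# Matches "gemini-2.5-pro", "models/gemini-3-pro", etc. and captures the version.
_GEMINI_VERSION = re.compile(r"gemini-(\d+)(?:\.(\d+))?", re.IGNORECASE)


def sketch_default_reasoning_effort(model: str) -> Optional[str]:
    """Return "none" for Gemini 2.5+ so thinking is off by default, else None."""
    match = _GEMINI_VERSION.search(model)
    if match is None:
        return None  # not a versioned Gemini model: leave the request untouched
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    return "none" if (major, minor) >= (2, 5) else None
```

Keeping the decision in one function means every execution path can call it and merge the result into the request, rather than each path re-deriving the rule.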
Visualize Pipeline Hardening
Three independent failure modes are fixed:
- Per-capability max_tokens defaults: Visualize now has its own
entry in agents.yaml (16k tokens) seeded from
DEFAULT_AGENTS_SETTINGS, so existing users with a stale
data/user/settings/agents.yaml pick up the higher cap automatically
without hand-editing.
- SVG / HTML root trim: when a model wraps its output with prose
("Here you go: <svg>…") or emits a closing fence on the same line
as the closing tag, the generator agent now trims to the outermost
<svg>…</svg> / <!doctype>…</html> so the renderer always receives
a clean root.
- Review-step JSON-mode crash → graceful fallback: large or
complex SVGs occasionally trip JSON-mode escaping inside the review
step. Instead of crashing the turn, Visualize now logs the failure
and ships the unreviewed draft so the user still sees a rendered
result.
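The root trim described above can be sketched with two pairs of regular expressions. This is an illustration only: trim_to_root and the patterns are hypothetical names, and the generator agent's real implementation may differ.

```python
import re

_HTML_OPEN = re.compile(r"<!doctype\b|<html\b", re.IGNORECASE)
_HTML_CLOSE = re.compile(r"</html\s*>", re.IGNORECASE)
_SVG_OPEN = re.compile(r"<svg\b", re.IGNORECASE)
_SVG_CLOSE = re.compile(r"</svg\s*>", re.IGNORECASE)


def trim_to_root(raw: str) -> str:
    """Cut leading prose and trailing fences around the outermost root element."""
    # HTML is tried first because an HTML document may itself contain <svg>.
    for open_re, close_re in ((_HTML_OPEN, _HTML_CLOSE), (_SVG_OPEN, _SVG_CLOSE)):
        start = open_re.search(raw)
        closes = list(close_re.finditer(raw))
        if start and closes and closes[-1].end() > start.start():
            return raw[start.start():closes[-1].end()]
    return raw.strip()  # no recognizable root: hand the text back unchanged
```

Taking the first opening tag and the last closing tag keeps nested elements intact while dropping anything the model wrote before or after the markup.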
Authenticated Requests Land In The Right Workspace (#485)
In v1.4.1, require_auth was a sync FastAPI dependency. FastAPI
dispatches sync dependencies via anyio.to_thread.run_sync, which
runs them in a worker thread under a copy of the request context —
so the set_current_user(...) call inside the dependency installed
the user on the thread's context, which was discarded when the
thread returned. The endpoint then read the unset default and fell
back to the admin workspace, silently routing every authenticated
user's reads/writes through the local admin's data.
require_auth and require_admin are now async def, so they
execute in the same asyncio task as the endpoint and the
ContextVar is visible everywhere downstream. HTTP and WebSocket
entry points now share a single _install_current_user helper so
the user object resolved from a token payload is identical across
transports.
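The mechanism behind the regression can be reproduced with the standard library alone. The sketch below uses asyncio.to_thread as a stand-in for the thread dispatch described above (it also runs the callable under a copy of the caller's context); the variable and function names are hypothetical.

```python
import asyncio
import contextvars

# Default mirrors the fallback described in the notes: no user set means "admin".
current_user = contextvars.ContextVar("current_user", default="admin")


def sync_dependency(user: str) -> None:
    current_user.set(user)  # lands in the worker thread's copied context


async def async_dependency(user: str) -> None:
    current_user.set(user)  # lands in the calling task's own context


async def endpoint_with_sync_dependency() -> str:
    await asyncio.to_thread(sync_dependency, "alice")
    return current_user.get()


async def endpoint_with_async_dependency() -> str:
    await async_dependency("alice")
    return current_user.get()


sync_result = asyncio.run(endpoint_with_sync_dependency())
async_result = asyncio.run(endpoint_with_async_dependency())
print(sync_result, async_result)
```

The sync variant reads back the default because the write happened in a context copy that is thrown away, while the awaited coroutine shares the endpoint's context.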
Reasoning Models + Native Tool Calling: Label Protocol Fixed
v1.4.1 tried to be clever with reasoning models that have native
tool-calling support — it told them to ignore the TOOL/THINK/
FINISH/PAUSE labels and rely on reasoning_content plus
tool_calls alone, and inside run_labeled_step it treated
<think> preludes and any incoming tool-call delta as implicit
label resolutions. In practice both shortcuts hurt: when a tool
call leaked into the content stream as JSON instead of a real
tool_calls delta, there was no label to repair against, and the
loop happily treated the JSON-as-answer as a FINISH. Multi-turn
reasoning + tool workflows would either burn iterations on repair
retries or silently terminate early.
In v1.4.2:
- Reasoning + native-tools system prompt tells the model that
reasoning is displayed in a separate trace area, but the formal
content stream must still start with exactly one of
FINISH/TOOL/THINK/PAUSE.
- run_labeled_step no longer treats tool-call deltas as
authoritative for label resolution, and implicit_think_label is
ignored (kept for API compatibility). A missing label always falls
to LABEL_UNKNOWN, so the chat pipeline's protocol-repair path
catches it instead of silently mis-routing the turn.
- Inline <think>...</think> preludes are streamed live into the
reasoning sub-trace and stripped from the formal text returned
to the loop, so the answer area no longer leaks raw provider
markers.
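A minimal sketch of the "labels are always required" rule follows. It assumes the label is the first token of the content stream, optionally followed by a colon; that wire shape, the resolve_label name, and the return tuple are assumptions for illustration, not the behavior of run_labeled_step itself.

```python
import re
from typing import List, Tuple

LABELS = ("FINISH", "TOOL", "THINK", "PAUSE")
LABEL_UNKNOWN = "UNKNOWN"  # placeholder value for the sketch
_THINK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def resolve_label(content: str) -> Tuple[str, str, List[str]]:
    """Split content into (label, formal text, reasoning preludes)."""
    reasoning = [chunk.strip() for chunk in _THINK.findall(content)]
    text = _THINK.sub("", content).lstrip()  # preludes never reach the answer area
    for label in LABELS:
        rest = text[len(label):]
        # Require a word boundary so "FINISHED" is not read as FINISH.
        if text.startswith(label) and (not rest or not rest[0].isalnum()):
            return label, rest.lstrip(": \n"), reasoning
    # No implicit resolution: the caller's repair path decides what to do.
    return LABEL_UNKNOWN, text, reasoning
```

A tool call that leaks into the content stream as bare JSON now resolves to the unknown label instead of being mistaken for a final answer.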
Smooth Streaming Across Every Chat Surface
The rAF typewriter (useSmoothStreamText) introduced last week for
the main chat is now wired through AssistantResponse, so the
book chat panel, quiz follow-up tab, and any other surface that
renders an assistant message all get the same frame-aligned cadence
during streaming and a no-op pass-through for completed messages.
Companion fixes:
- Book chat panel and quiz follow-up tab moved their autoscroll to
useLayoutEffect and stopped using
scrollIntoView({behavior: "smooth"}), because the smooth animation
races against the next-frame layout update during fast streams and
produces visible jitter. They now do a single
scrollTop = scrollHeight pin in layout phase, matching what
useChatAutoScroll does on the main chat.
- Book chat panel marks its scroller with data-chat-scroll-root so
the global overflow-anchor: none rule applies (the browser's
built-in scroll anchoring fights manual pinning when code blocks
reflow above the cursor).
- AssistantResponse is now memoized: completed bubbles stop
re-parsing markdown when an unrelated streaming sibling updates the
parent.
Sidebar Redesign
The expanded sidebar's chat-session list moved into its own
collapsible Recents region with an independent scroll viewport, so
long histories no longer push secondary nav off-screen. The "New chat"
button is gone (clicking Chat in the nav already starts a new
session), and a Docs link to deeptutor.info
sits next to the GitHub link in the footer.
Each session now renders with a deterministic, friendly Lucide icon —
sparkles, leaf, feather, cloud, droplet, sun, moon, flame, star, etc.
— so the sidebar feels varied at a glance without shuffling on
re-render. Running sessions add a gentle wiggle animation; idle ones
stay still.
Lemonade Local Provider
New lemonade provider binding for the AMD Ryzen AI / NPU runtime
(default base URL http://localhost:13305/api/v1). Auto-detected by
port 13305, no API key required, listed in the README Docker host-
gateway section and in the provider configuration docs alongside
Ollama / LM Studio / llama.cpp / vLLM.
Models-Endpoint Probe Honors DISABLE_SSL_VERIFY
The context-window auto-detection now passes
aiohttp.TCPConnector(ssl=False) when DISABLE_SSL_VERIFY is set,
matching the behavior of the rest of the HTTP layer. Self-signed local
inference servers no longer fall back to the default context window
just because the probe couldn't verify their cert.
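As a rough sketch, the probe's connector arguments could be derived from the environment as below. The helper name and the accepted truthy spellings are assumptions; only the ssl=False argument to aiohttp.TCPConnector comes from the notes.

```python
from typing import Mapping

_TRUTHY = {"1", "true", "yes", "on"}  # assumed spellings of an enabled flag


def probe_connector_kwargs(env: Mapping[str, str]) -> dict:
    """Keyword arguments for the connector used by the models-endpoint probe."""
    if env.get("DISABLE_SSL_VERIFY", "").strip().lower() in _TRUTHY:
        return {"ssl": False}  # skip certificate verification
    return {}  # keep the HTTP client's default verification
```

With aiohttp installed, the result would be passed as aiohttp.TCPConnector(**probe_connector_kwargs(os.environ)).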
Tests
- tests/api/test_auth_contextvar.py pins the regression from #485:
a sync require_auth would lose the ContextVar; the async version
preserves it across the dependency boundary.
- tests/services/llm/test_reasoning_params.py covers the
centralized default_reasoning_effort_for mapping.
- tests/core/test_labeled_step_think_prelude.py is updated to reflect
the new "labels are always required" semantics.
- tests/agents/chat/test_agentic_parallel_tools.py verifies the
reasoning + native-tools path still resolves multi-tool turns.
- tests/services/config/test_context_window_detection.py checks that the
models-endpoint probe honors DISABLE_SSL_VERIFY and passes a
TCPConnector(ssl=False) to the aiohttp session.
Upgrade Notes
- Drop-in from v1.4.1: pip install -U deeptutor; Docker users pull
ghcr.io/hkuds/deeptutor:latest.
- If you previously hand-edited data/user/settings/agents.yaml to
bump Visualize's max_tokens, that value still wins. The new 16k
default only seeds users whose agents.yaml doesn't mention
Visualize at all.
- If you wired a Gemini 2.5+ model and saw empty or truncated outputs,
no configuration change is needed; the default-off behavior now
applies automatically.
What's Changed
- fix(auth): make require_auth async so the user ContextVar reaches the endpoint by @truffle-dev in #485
- fix(visualize): unblock Gemini 2.5+ and harden Visualize pipeline by @skinred78 in #490
New Contributors
- @skinred78 made their first contribution in #490
Full Changelog: v1.4.1...v1.4.2
v1.4.1
DeepTutor v1.4.1 Release Notes
Release Date: 2026.05.27
v1.4.1 is a security and stability patch on v1.4.0. It locks down
the TutorBot tool sandbox, isolates per-user resources, fixes a v1.4.0 chat
regression, adds an HTTP API for talking to a specific TutorBot, and ships a
multimodal image-fallback fix for providers DeepTutor has no vision entry for.
What's New
TutorBot Tool Sandbox Is Opt-In
The shell exec tool is no longer registered unless an admin sets
allow_shell_exec, and all filesystem + shell access is confined to the bot
workspace by default. Command deny-lists were re-anchored at command
boundaries, and allow_shell_exec can't be flipped on via the update payload.
Per-User Resource Isolation
Book roots, session databases, turn-runtime stores, and TutorBot directories
are scoped per user, and web/API conversations are keyed per session — so
cross-user requests can't reach each other's data.
HTTP / SSE API For A TutorBot
New POST /{bot_id}/chat and /chat/execute-stream (SSE) endpoints with
auto-start and persistent per-session context, for multi-turn conversations
with a specific bot from external clients.
Multimodal Image Fallback
Images are sent optimistically to every provider; if a request carrying images
fails and the model isn't in the known-vision allowlist, the turn retries
text-only. Fixes silently dropped images on Doubao / VolcEngine and other
multimodal models that lack a capability entry.
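The optimistic-send behavior can be sketched as a single retry wrapper. Everything named here (the send callable, its signature, and the allowlist contents) is hypothetical; the sketch only mirrors the rule stated above.

```python
from typing import Callable, Sequence

# Hypothetical allowlist; the project keeps its own capability table.
KNOWN_VISION_MODELS = {"example-vision-model"}

SendFn = Callable[[str, str, Sequence[bytes]], str]


def complete_with_image_fallback(
    send: SendFn, model: str, text: str, images: Sequence[bytes]
) -> str:
    """Send images optimistically; retry text-only for models not known to see."""
    try:
        return send(model, text, images)
    except Exception:
        # Only unknown models get the retry: a known vision model failing
        # with images is a real error and should surface.
        if images and model not in KNOWN_VISION_MODELS:
            return send(model, text, [])
        raise
```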
Safe ZIP Upload + Network Settings
.zip knowledge uploads expand member-by-member through the document validator
with size / count / compression-ratio bounds and path-escape guards; the
archive itself is never indexed. A new /settings/network page surfaces ports,
public API base, and CORS origins (normalized to tolerate host:port and
trailing slashes), plus a "fetch models" action listing model IDs from an
OpenAI-compatible endpoint.
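For the ZIP side, a member-by-member pre-check might look like the following sketch. The limits are illustrative values and validated_members is a hypothetical name; the notes do not state the real bounds or the validator's interface.

```python
import zipfile
from pathlib import PurePosixPath
from typing import List

# Illustrative bounds; the project's real limits are not stated in the notes.
MAX_MEMBERS = 200
MAX_TOTAL_BYTES = 100 * 1024 * 1024
MAX_COMPRESSION_RATIO = 100


def validated_members(archive: zipfile.ZipFile) -> List[zipfile.ZipInfo]:
    """Return file members that pass count, size, ratio, and path-escape checks."""
    members = [info for info in archive.infolist() if not info.is_dir()]
    if len(members) > MAX_MEMBERS:
        raise ValueError("too many members")
    total = 0
    for info in members:
        path = PurePosixPath(info.filename.replace("\\", "/"))
        if path.is_absolute() or ".." in path.parts:
            raise ValueError(f"path escapes the extraction root: {info.filename}")
        total += info.file_size  # declared uncompressed size
        if total > MAX_TOTAL_BYTES:
            raise ValueError("uncompressed size exceeds the cap")
        if info.compress_size and info.file_size / info.compress_size > MAX_COMPRESSION_RATIO:
            raise ValueError(f"suspicious compression ratio: {info.filename}")
    return members
```

Each surviving member would then be handed to the document validator individually, so the archive itself never needs to be indexed.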
Community Fixes & Changes
Security — closed TutorBot RCE via the shell tool (#518), path traversal in
the filesystem tool (#517), cross-bot file-management authz bypass (#516),
cross-session turn-regeneration authz bypass (#515), book-confirmation authz
bypass (#514), and ExecTool executing LLM shell commands over chat (#506,
first hardened in PR #507).
Bug fixes — chat input disabled after the first turn (v1.4.0 regression,
#520), knowledge-base embedding failure on long documents (#521 / PR #509),
new users unable to create a profile under Docker (#512 / PR #513), Qwen
reasoning models failing native tool calling (#527 / PR #528), and the GPT-5
init-wizard token parameter (PR #508).
Merged / reworked PRs — native tool calling for reasoning models (#528),
oversized session-event truncation (#524), empty-state profile button (#513),
chunking-pipeline fix (#509), GPT-5 probe (#508), ExecTool hardening (#507).
Contributions #522 (zip upload) and #523 (model fetching + notebook lookup)
were reimplemented locally and ship here.
Feature request — HTTP API for multi-turn chat with a specific TutorBot
(#511).
Upgrade Notes
- Drop-in from v1.4.0: pip install -U deeptutor; Docker users pull
ghcr.io/hkuds/deeptutor:latest.
- TutorBot shell exec is now disabled by default. If you relied on it, set
allow_shell_exec on the bot; tool access stays confined to the workspace.
- For cross-site HTTPS auth, set explicit CORS origins and
cookie_secure=true.
What's Changed
- fix(settings): add profile button when no profiles exist by @wedone in #513
- Fix GPT-5 token limit parameter in init wizard probe by @lezhimiffyliu in #508
- fix(security): harden ExecTool defaults against command injection (#506) by @kagura-agent in #507
- fix: prevent Document nodes from bypassing chunking pipeline by @washi4 in #509
- feat: truncate oversized event payloads in session response by @xiongjnu in #524
- fix(chat): enable native tool calling for reasoning models via OpenAI-compatible endpoints by @wedone in #528
New Contributors
- @lezhimiffyliu made their first contribution in #508
- @washi4 made their first contribution in #509
- @xiongjnu made their first contribution in #524
Full Changelog: v1.4.0...v1.4.1
v1.4.0
DeepTutor v1.4.0 Release Notes
Release Date: 2026.05.22
v1.4.0 is the GA cut of the v1.4 line. It carries the full v1.4.0-beta scope
(Auto Mode, three-layer Memory workbench, agentic Deep Research / Deep Solve /
Question, LlamaIndex chat refactor, unified capabilities infrastructure +
i18n, Visualize/Animator merge, new chat tools, settings split, multi-user
isolation hardening, and the deeptutor start / deeptutor init launcher)
and adds a focused set of agent-engine and runtime hardening fixes on top.
For the full beta scope see v1.4.0-beta.
What's New Since v1.4.0-beta
Reasoning Effort + Provider Thinking Flags
The agentic engine now normalizes "reasoning effort" across providers in one
place, instead of each pipeline re-deriving it.
- build_provider_extra_kwargs in deeptutor/core/agentic/client.py
resolves reasoning_effort against provider_registry, maps minimal /
minimum to the right wire shape (DashScope uses minimum), and routes
the on/off intent through provider thinking-style toggles:
thinking_type (DeepSeek), enable_thinking (Qwen-style), or
reasoning_split. Known reasoning models default to high; the
deepseek-v4-flash shape gets thinking disabled by default.
- LLMClientConfig.reasoning_effort is plumbed end-to-end. All four
pipelines (agentic chat, question, research, solve) pass the binding +
effort into build_completion_kwargs, so a single user-level setting
controls reasoning behavior across capabilities.
Tool-Schema Fallback For Strict Providers
Some OpenAI-compatible providers reject native function-calling schemas with
a non-standard error. run_labeled_step now detects that case, strips
tools / tool_choice from the request, retries once, and emits a warning
progress event so the user knows the call ran in tool-less prose mode.
Behavior is unchanged for providers that accept tool schemas normally.
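The retry-once behavior can be sketched as follows. ToolSchemaRejected and the call signature are hypothetical: the notes do not say how run_labeled_step recognizes the provider's non-standard error, so detection is reduced to an exception type here.

```python
from typing import Any, Callable, Dict, List


class ToolSchemaRejected(Exception):
    """Hypothetical marker for a provider rejecting function-calling schemas."""


def call_with_tool_fallback(
    call: Callable[[Dict[str, Any]], str],
    request: Dict[str, Any],
    warnings: List[str],
) -> str:
    """Try the request as-is; on schema rejection, retry once without tools."""
    try:
        return call(request)
    except ToolSchemaRejected:
        stripped = {k: v for k, v in request.items() if k not in ("tools", "tool_choice")}
        warnings.append("Provider rejected tool schemas; retrying in tool-less prose mode.")
        return call(stripped)  # single retry: a second failure propagates
```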
Restart-Safe Turn Runtime
A server or container restart used to leave the database row for an
in-flight turn stuck on running, blocking the next message in that
session and leaving the UI hanging on an event stream that would never
complete.
- Orphan detection: TurnRuntimeManager now checks whether this
process still owns the turn's in-memory runner. If not, the persisted
turn is marked failed with "Turn interrupted by server restart. Please
retry your message."
- start_turn sweeps stale active turns for the session before
creating a new turn, so the user can immediately retry after a restart.
- subscribe_turn synthesizes a terminal error + done event for
the orphaned turn so the frontend cleanly closes its streaming state
instead of waiting indefinitely.
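The orphan sweep above amounts to comparing persisted rows against the runners this process still holds in memory. The sketch below uses a hypothetical TurnRow record and sweep_orphans helper; only the status values and the error message are taken from the notes.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set

INTERRUPTED = "Turn interrupted by server restart. Please retry your message."


@dataclass
class TurnRow:
    turn_id: str
    session_id: str
    status: str = "running"
    error: str = ""


def sweep_orphans(rows: Iterable[TurnRow], session_id: str, live_turn_ids: Set[str]) -> List[str]:
    """Fail running turns in this session that no in-memory runner owns."""
    failed = []
    for row in rows:
        if row.session_id != session_id or row.status != "running":
            continue
        if row.turn_id not in live_turn_ids:  # runner died with the old process
            row.status = "failed"
            row.error = INTERRUPTED
            failed.append(row.turn_id)
    return failed
```

Running this before a new turn is created is what lets the user retry immediately after a restart.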
Frontend API Base — Docker Placeholder Hardening
web/lib/api.ts switched from exact-match to substring-token detection of
the NEXT_PUBLIC_API_BASE placeholder, and exposes isApiBasePlaceholder()
for reuse. This survives small changes to the Docker placeholder shape that
previously slipped through and produced a blank Settings page with no
visible error. The unused auto-generated web/.env.local shipped with the
beta is removed.
LLM Config Probe — Clearer Status Copy
The /settings/llm probe now emits an explicit "Basic LLM completion
succeeded. Chat additionally validates streaming and provider tool
compatibility at runtime." line after the smoke completion, so users know
the probe deliberately doesn't cover the streaming / tool-call paths that
chat exercises.
CLI Chat REPL — Config Surface Aligned With Docs
The deeptutor chat REPL surface was tightened so it matches the public
docs at deeptutor.info.
- deeptutor chat --config key=value and --config-json '<json>' seed
the initial config without entering the REPL first.
- /config set key value and /config set key=value are both accepted;
JSON values ([…], {…}) are parsed correctly via shlex.
- Backslash-continuation lets you send multi-line prompts in one turn.
- /refs prints a structured state snapshot (session, capability, tools,
KB, history, notebooks, language, config) instead of a single dim line.
- deeptutor plugin info <name> now also returns cli_aliases and the
capability availability block.
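One plausible way to accept both /config set forms is sketched below. parse_config_set is a hypothetical helper, and the assumption that JSON values arrive shell-quoted is mine; the REPL's actual parsing rules may differ.

```python
import json
import shlex
from typing import Any, Tuple


def parse_config_set(argline: str) -> Tuple[str, Any]:
    """Parse the text after "/config set" into (key, value)."""
    tokens = shlex.split(argline)  # honors quotes, so quoted JSON stays one token
    if len(tokens) == 1 and "=" in tokens[0]:
        key, raw = tokens[0].split("=", 1)
    elif len(tokens) >= 2:
        key, raw = tokens[0], " ".join(tokens[1:])
    else:
        raise ValueError("expected 'key value' or 'key=value'")
    try:
        value = json.loads(raw)  # numbers, booleans, lists, objects
    except json.JSONDecodeError:
        value = raw  # anything else is kept as a plain string
    return key, value
```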
Repo & Docs Hygiene
- New issue template for the public docs site
(.github/ISSUE_TEMPLATE/docs.yml).
- AGENTS.md modernized to reflect Auto Mode, tool gating, and the
current capability stages.
- .gitignore covers .playwright-cli/; stale .playwright-cli/
captures, .env.example_CN, and DeepTutor.code-workspace removed.
Tests
- New tests/core/test_agentic_client_provider_kwargs.py covering the
reasoning-effort + provider thinking-style matrix.
- New tests/core/test_labeled_step_tool_fallback.py exercising the
tool-schema retry path end-to-end against a scripted client.
- tests/services/session/test_turn_runtime_subscribe.py adds coverage
for orphan-running-turn recovery on both subscribe and start_turn.
- tests/cli/test_chat_cli.py adds REPL config/backslash tests and a
plugin-info contract check.
- New tests/cli/test_docs_contract.py keeps the public site docs in
sync with the CLI surface (links + deeptutor … examples).
- tests/scripts/test_docker_compose.py + web/tests/api-resolve-base.test.ts
cover the new placeholder detection.
Upgrade Notes
- From v1.4.0-beta: drop-in. PyPI installs with pip install -U deeptutor
(PyPI normalizes 1.4.0 over 1.4.0b0). Docker users pull
ghcr.io/hkuds/deeptutor:latest.
- Reasoning effort is now read from LLMConfig.reasoning_effort. If
you previously set provider thinking flags manually in extra_body,
the agent will still respect them, but the recommended path is to set
reasoning_effort and let the engine map it.
- Restart recovery is automatic. After upgrading, any existing
database rows stuck on running from previous crashes will be
finalized the next time their session is opened.
- Beta artifact cleanup: web/.env.local was auto-generated by the
legacy launcher and is no longer needed. If you rebuilt the frontend
manually with that file, delete it before the next build.
Full Changelog: v1.4.0-beta...v1.4.0
DeepTutor-v1.4.0-beta
DeepTutor v1.4.0-beta Release Notes
Release Date: 2026.05.21
v1.4.0-beta is the largest release since the agent-native rewrite. It folds an
end-to-end Auto Mode on top of the existing capabilities, ships a
three-layer memory subsystem (L1/L2/L3) with a dedicated workbench, rebuilds
Deep Research / Deep Solve / Question on the same agentic engine as Chat,
re-architects the chat capability + LlamaIndex RAG pipeline around a
session-cumulative source inventory, unifies the Capabilities infrastructure
and i18n, merges the Animator menu into Visualize, and reorganises
Settings, environment, and the local launcher. Several new chat tools
(ask_user, web_fetch, write_note, list_notebook, github_query) plus a
delete-chat-turn flow, quiz follow-up chat, and a GeoGebra viewer round out the
release.
Highlights
Auto Mode — Agentic Capability Router
A new auto capability sits on top of the existing modes and chooses the right
one for each request, instead of forcing the user to pick a mode up front.
- Three-stage agent loop: ANALYZING (single LLM call, streamed as
thinking) → DELEGATING (up to max_iterations of router calls that emit
delegate_to_<cap> tool calls or atomic tool calls) → SYNTHESIZING (final
inline answer, either passed through from the loop or assembled by a closing
LLM call).
- Routes to real capabilities: deep_solve, deep_question,
deep_research, math_animator, visualize, plus the chat-level atomic
tools (web_search, web_fetch, rag, …) live behind the same router so
the LLM can mix retrieval and full sub-capability runs in one turn.
- Bounded retries and quotas: independent retry budgets for router-LLM
errors, per-delegation failures, and arg-validation feedback; a configurable
max_same_capability_calls quota keeps the loop from spinning on one mode.
- Clean conversation history: sub-capability events flow through a
forward_events shim that tags every content event with a call_id, so the
conversation turn-runtime filter keeps only Auto's own final synthesis in
saved history. Sub-runs are still streamed live to the UI.
- answer_now fast-path: when the user asks to "answer now" the pipeline
skips analysis + delegation and produces an immediate inline reply.
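The DELEGATING stage with its iteration bound and per-capability quota can be sketched as a plain loop. The router interface, the default limits, and the history strings below are all assumptions made for illustration; retry budgets and event forwarding are left out.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple

# A router step returns ("delegate", capability_name) or ("finish", answer).
RouterStep = Tuple[str, str]


def run_delegation(
    router: Callable[[List[str]], RouterStep],
    max_iterations: int = 6,
    max_same_capability_calls: int = 2,
) -> Tuple[Optional[str], List[str]]:
    """Run router calls until it finishes or the iteration budget is spent."""
    calls: Counter = Counter()
    history: List[str] = []
    for _ in range(max_iterations):
        kind, value = router(history)
        if kind == "finish":
            return value, history  # answer passed through from the loop
        if calls[value] >= max_same_capability_calls:
            history.append(f"quota:{value}")  # feed back instead of re-running
            continue
        calls[value] += 1
        history.append(f"ran:{value}")  # a real loop would run the capability here
    return None, history  # caller falls back to a closing synthesis call
```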
Three-Layer Memory Subsystem (Memory v2)
The previous flat memory page is replaced by a structured three-layer store
with an explicit consolidation pipeline and a dedicated workbench.
- L1 / L2 / L3 layout: L1 captures raw run traces, L2 holds normalised
document records, L3 holds curated slots per surface (chat, notebook, book,
TutorBot). Per-user paths flow through PathService so multi-user
deployments stay isolated.
- Consolidator pipeline: modular consolidator/ modules (chunker, guards,
parse, references, runs, modes, line-doc, meta) turn run traces into
versioned line-oriented documents with stable ids, references between
layers, and a snapshot history.
- Memory Workbench UI: new /memory routes (graph, l1, l2, l3,
resolve) ship as standalone pages with workbench, hub, graph viewer, run
panel, and an archived-state banner. A reusable MemorySection component is
embedded where the legacy memory panel used to live.
- First-class chat tools: read_memory and write_memory are exposed
as agent tools (with i18n hints) so chat / Auto can recall and update memory
inside a turn instead of needing a separate save step.
- Settings integration: Memory now has its own page under
/settings/memory with run controls, mode toggles, and storage status.
Deep Research, Deep Solve, and Question on the Agentic Engine
The three multi-agent pipelines have been rewritten as orchestrators on top of
the shared agentic-engine primitives, deleting hundreds of bespoke prompt
files and per-agent classes.
- Deep Research → agents/research/pipeline.py: four phases (Rephrase,
Decompose, Research blocks, Reporting) implemented as labeled steps
(THINK/TOOL/APPEND/OUTLINE/SECTION/FINISH). The dynamic
topic queue and CitationManager are preserved; the new APPEND label lets
research blocks add follow-up topics to the queue without leaving the loop.
ask_user v2 drives up to three rephrase rounds with multi-question cards.
- Deep Solve → agents/solve/pipeline.py: Pre-retrieve (KB-only),
Plan, Solve (per-step THINK/TOOL/FINISH/REPLAN loop with a
back-edge from solve to plan), and a final Synthesize step. Each step's
FINISH flows into the next step's prompt context so the answer reads as
one continuous narrative.
- Question / Quiz: coordinator + pipeline replace the old generator /
idea_agent / models modules; the old prompt directories have been
removed entirely.
- All three drop the legacy agents/ and prompts/ directories for their
respective modes, leaving one pipeline file and shared labeled-step prompts.
Chat Capability & LlamaIndex RAG Refactor
The agentic chat pipeline has been rebuilt around a session-cumulative
"Attached Sources" manifest and a cleaner LlamaIndex pipeline.
- Branch-isolated source inventory: services/session/source_inventory.py
materialises every source attached on the active branch's ancestor chain.
Fresh sources from the current turn show a full preview; historical sources
show a one-line row with id, name, kind, size, and the turn ordinal where
they first appeared. The LLM calls read_source(id) to expand the full
text on demand. Sibling branches never leak sources into each other.
- LlamaIndex pipeline split-out: dedicated config.py, ingestion.py,
retrievers.py, and document_loader.py replace the previous monolithic
pipeline module. Storage stays backward-compatible with v1.3 versioned
indexes.
- Lean agentic chat prompt: agentic_chat.yaml (EN/ZH) was rewritten to
match the new tool surface and the source-inventory contract; the old
parallel-tool prompt scaffolding is gone.
- Builtin tools registry: tools/builtin/__init__.py is the single place
where chat-mounted tools, hint prompts, and arg-augmentation wrappers are
registered.
Capabilities Infrastructure Unification
Every capability now goes through one shared envelope, one status-i18n loader,
and one cost-tracking surface.
- emit_capability_result helper: every capability emits its final
result through one helper that fills the result envelope (label, summary,
payload, render hints) and the trailing usage-tracker totals consistently.
- StatusI18n: capability status copy lives in
capabilities/prompts/{en,zh}/<name>.yaml and is loaded via a shared
StatusI18n accessor. Hard-coded English status strings have been removed
from the pipelines.
- UsageTracker cost surface: token usage and cost are tracked through
one tracker per capability run, exposed to the result envelope, and shown
on the new /settings/capabilities admin page (live list, defaults,
per-capability override toggles).
- Deprecated main.yaml keys removed: the legacy main.yaml capability
copy has been deleted in favor of per-capability prompt files.
Visualize: Animator Folded Into One Capability
The standalone Animator menu has been merged into Visualize so the user picks a
visualization once and the system chooses the renderer.
- render_type discriminator: AnalysisAgent picks one of six render
types, namely svg, chartjs, mermaid, html (text-emitting, three-stage
pipeline) or manim_video / manim_image (Manim subprocess pipeline). The
result envelope carries render_type so the frontend delegates to the
right viewer.
- Single sidebar entry: the old Animator menu entry is gone; users now
go through Visualize for both static charts and Manim videos. The
fullscreen viewer / config panel handle all render types.
New Chat Tools
- ask_user: packages 1–3 structured questions into a single payload that
pauses the same turn until the user answers. The frontend renders a card
letting the user navigate questions and submit answers in one batch; the
pipeline resumes the turn with the answers wired back as the tool result.
Used by Deep Research's Rephrase phase and available to chat / Auto.
- web_fetch: URL fetch with readable-content extraction, strict scheme
/ private-IP / size guards (applied both pre-flight and post-redirect),
and …[truncated] markers when output exceeds the cap.
- write_note: replaces the old save_to_notebook tool. Two modes:
append creates a new record (default body is the rendered transcript,
optional agent-authored body) and edit updates an existing record by
record_id.
- list_notebook: read-only index / drill-down listing of the active
user's notebooks and records. Only mounted when the user actually has
notebooks, so empty runs are impossible by construction.
- github_query: read-only gh CLI wrapper covering pr, issue,
run, repo, and a GET-only api fallback. No mutation verbs are
reachable through the tool surface. Returns a clean "tool unavailable"
outcome when gh is not installed.
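The scheme and private-IP guards on web_fetch can be approximated with the standard library. This is a simplified sketch under stated assumptions: is_fetch_allowed is a hypothetical name, only literal IP hosts are checked, and DNS names are let through without resolution, which a real guard would not do.

```python
import ipaddress
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_fetch_allowed(url: str) -> bool:
    """Reject non-HTTP schemes and hosts that are literal non-public addresses."""
    parts = urlsplit(url)
    host = parts.hostname
    if parts.scheme not in ALLOWED_SCHEMES or not host:
        return False
    if host == "localhost":
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name: a production guard would resolve it and check every address.
        return True
    return address.is_global  # excludes loopback, private, and link-local ranges
```

Since the notes say the guards run both pre-flight and post-redirect, a check like this would be repeated on every redirect target rather than only on the URL the model supplied.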
Chat Surface Features
- Delete chat turn (#443): message items now carry a stable id, the
session API exposes deleteMessage, the chat reducer adds a DELETE_TURN
action, and a 409 vs 404 check rejects deletion of a still-running turn.
Optimistic temp ids are resolved before deletion to avoid orphaned UI rows.
- Quiz follow-up chat composer: FollowupChatComposer and
QuizFollowupContext let the user start a chat thread directly from a quiz
question. The composer reuses the ma...
DeepTutor-v1.3.10
DeepTutor v1.3.10 Release Notes
Release Date: 2026.05.10
v1.3.10 is a focused reliability release for the issues reported after v1.3.9.
It restores smoother remote Docker access, makes self-signed LLM endpoints work
consistently across SDK-backed providers, protects code snippets from citation
rewrites, and splits Matrix E2EE into an explicit opt-in dependency.
Highlights
Remote Docker and CORS Recovery
- Remote single-user Docker works out of the box again - when
AUTH_ENABLED=false, DeepTutor now accepts browser origins over HTTP/HTTPS so
LAN or remote-server frontends no longer hit the v1.3.8/v1.3.9 CORS
regression reported in #463.
- Authenticated deployments stay explicit - when AUTH_ENABLED=true, CORS
still requires a concrete allowlist through CORS_ORIGIN or CORS_ORIGINS,
preserving the credentialed-auth safety boundary.
- Multiple deployment origins are supported - CORS_ORIGINS accepts comma
or newline separated values, and both Docker Compose files pass the setting
through to the backend container.
- Settings no longer drop network flags - CORS_ORIGIN, CORS_ORIGINS, and
DISABLE_SSL_VERIFY are part of the canonical .env write order.
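Splitting CORS_ORIGINS on commas or newlines is simple enough to sketch directly. The helper name and the de-duplication step are assumptions; the notes only state which separators are accepted.

```python
import re
from typing import List


def parse_cors_origins(raw: str) -> List[str]:
    """Split a CORS_ORIGINS value on commas or newlines, dropping blanks."""
    origins: List[str] = []
    for item in re.split(r"[,\n]", raw):
        origin = item.strip()  # also removes a stray carriage return
        if origin and origin not in origins:
            origins.append(origin)
    return origins
```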
Provider TLS and Rendering Fixes
- DISABLE_SSL_VERIFY now reaches OpenAI SDK paths - OpenAI-compatible,
Azure OpenAI, executor, TutorBot, and legacy embedding SDK clients all receive
a shared httpx.AsyncClient(verify=False) when the flag is enabled, fixing
self-signed HTTPS LLM endpoints reported in #464.
- Production still blocks unsafe TLS bypasses - ENVIRONMENT=prod or
ENVIRONMENT=production rejects DISABLE_SSL_VERIFY, with a single warning
logged in non-production use.
- Code blocks keep array indexes intact - Markdown citation linkification now
masks fenced and inline code before rewriting references, so values[0] stays
code instead of becoming a #references citation link (#468).
Matrix Install Compatibility
- Matrix no longer installs E2EE by default - the standard matrix extra and
requirements/matrix.txt now use plain matrix-nio, avoiding the
python-olm / libolm build failures seen on macOS Python 3.14 and Apple
Clang 21 (#462).
- Encrypted rooms are an explicit add-on - install deeptutor[matrix-e2e]
or requirements/matrix-e2e.txt when E2EE support is needed and libolm is
available.
- Runtime failures are clearer - Matrix defaults to non-E2EE mode, and
enabling E2EE without crypto dependencies now raises an actionable install
message instead of failing at import time.
Multi-User Runtime Compatibility
- Default workspace paths stay stable outside user scope - when no current
multi-user context is active, path resolution falls back to the default data
workspace rather than forcing an admin scope.
- Legacy test and monkeypatch hooks remain available - session and settings
routers keep compatibility shims used by tests and older integrations.
- Local agent artifacts are ignored - .claude/ is now excluded from Git so
local worktrees and agent metadata do not accidentally enter releases.
Tests
- Added CORS setting tests for unauthenticated remote origins and authenticated
explicit allowlists.
- Added shared OpenAI SDK HTTP-client tests across provider-core, Azure,
executors, TutorBot, and embedding adapters.
- Added Markdown display tests for prose citations, fenced code, inline code,
and explicit backticked citations.
- Added Matrix dependency split tests to keep default installs free of
matrix-nio[e2e].
- Re-ran targeted Python tests, web node tests, Ruff checks, and diff whitespace
validation for the release patch.
Upgrade Notes
- If you run remote Docker with AUTH_ENABLED=false, no extra CORS setting is
required for normal HTTP/HTTPS browser origins.
- If you run a shared or authenticated deployment with AUTH_ENABLED=true, set
CORS_ORIGIN or CORS_ORIGINS to the exact frontend origin(s), for example
https://learn.example.com.
- Use DISABLE_SSL_VERIFY=true only for local, self-signed, or air-gapped test
LLM endpoints. It remains blocked in ENVIRONMENT=prod and
ENVIRONMENT=production.
- Matrix installs are now non-E2EE by default. For encrypted Matrix rooms,
install .[matrix-e2e] or requirements/matrix-e2e.txt, ensure libolm is
present, and set e2ee_enabled=true in the Matrix channel config.
- If you previously installed .[matrix] only to get non-encrypted Matrix
messaging, reinstalling after this release should no longer require native
libolm build tooling.
Full Changelog: v1.3.9...v1.3.10
DeepTutor-v1.3.9
DeepTutor v1.3.9 Release Notes
Release Date: 2026.05.09
v1.3.9 builds on the v1.3.8 multi-user foundation with broader TutorBot
deployment options, safer provider routing for thinking models, and a smoother
web onboarding path. It adds Zulip and NVIDIA NIM support, improves startup
ergonomics, and folds in the main issue fixes reported after the last release.
Highlights
TutorBot Channel and Provider Expansion
- Zulip is now a TutorBot channel - bots can listen to private messages and
stream topics, enforce allow_from, choose mention-only or open stream
replies, and bridge Zulip's event queue into the async TutorBot bus.
- Math and files work better in Zulip - LaTeX is converted to Zulip-friendly
KaTeX markup, upload/download calls use configurable retry behavior, and
attachment filenames include upload-path digests to avoid collisions.
- Zulip topics keep conversations separated - stream topics now become part
of the chat/session key, with a stable (no topic) fallback for empty topics.
- TutorBot supports NVIDIA NIM - nvidia_nim is available in TutorBot
provider config and registry detection, including NIM's streaming behavior
that omits unsupported stream_options.
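As a rough illustration of the topic-scoped session key described above, the sketch below builds one key per stream and topic. Only the (no topic) fallback comes from these notes; the function name, the zulip:<stream>:<topic> layout, and the whitespace handling are assumptions made for the example, not DeepTutor's actual implementation.

```python
from typing import Optional

NO_TOPIC_FALLBACK = "(no topic)"  # fallback label named in the release notes


def zulip_session_key(stream: str, topic: Optional[str]) -> str:
    """Build a per-topic session key (hypothetical key layout)."""
    # Treat None, empty, and whitespace-only topics as "no topic" so that
    # such messages always map to the same stable session.
    cleaned = (topic or "").strip()
    return f"zulip:{stream}:{cleaned or NO_TOPIC_FALLBACK}"
```

With a key like this, two topics in the same stream never share chat history, while messages without a topic still land in one predictable session.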
Model and Runtime Reliability
- Configured context windows are respected - the safety ceiling is raised to
1,000,000 tokens while the large-model fallback remains 65,536, so explicit
128K-style model settings are no longer silently clamped.
- Qwen vision detection is fixed - Qwen VL models are treated as
vision-capable across DashScope, OpenAI-compatible, and custom bindings.
- Minimal thinking mode is provider-safe - DeepSeek, DashScope, VolcEngine,
BytePlus, and MiniMax no longer receive a rejected top-level
reasoning_effort=minimal; DeepTutor sends the provider-specific disable
signal instead.
- DeepSeek v4 costs are tracked - research token accounting includes
deepseek-v4-flash and deepseek-v4-pro pricing entries.
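The context-window behavior above can be pictured with a small sketch. The two constants come from these notes; the function name and the rule that the fallback applies only when no positive value is configured are assumptions for illustration.

```python
from typing import Optional

SAFETY_CEILING = 1_000_000     # upper bound stated in the release notes
LARGE_MODEL_FALLBACK = 65_536  # assumed to apply only when nothing is configured


def effective_context_window(configured: Optional[int]) -> int:
    """Resolve the context window to use (hypothetical helper)."""
    if configured is None or configured <= 0:
        return LARGE_MODEL_FALLBACK
    # An explicit setting is honored up to the ceiling instead of being
    # clamped down to the fallback value.
    return min(configured, SAFETY_CEILING)
```

Under these assumptions, an explicit 131,072-token setting is used as configured, and only values above the ceiling are reduced.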
Web and CLI Polish
- deeptutor start launches the full web stack - the CLI now delegates to
scripts/start_web.py so backend and frontend can be started from one
command, and launcher failures propagate through the CLI exit code.
- Sidebar onboarding is clearer - primary navigation icons now expose
scoped, localized tooltips with descriptions and keyboard focus support.
- Multi-line user messages stay readable - chat message rendering preserves
Shift+Enter line breaks, fixing code blocks and structured prompts that were
previously collapsed into one line.
- Assigned resources are easier to understand - model-selection summaries
and read-only knowledge-base actions now present clearer labels for
non-admin, grant-scoped sessions.
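One way a CLI command can delegate to a launcher script and pass its exit code through is sketched below. The script path scripts/start_web.py is taken from these notes; run_launcher and start_web_stack are hypothetical names, and the real CLI may be wired differently.

```python
import subprocess
import sys


def run_launcher(argv: list[str]) -> int:
    """Run a child process and return its exit code unchanged."""
    # check=False: a non-zero exit is reported to the caller rather than
    # raised as an exception, so the caller decides how to exit.
    completed = subprocess.run(argv, check=False)
    return completed.returncode


def start_web_stack(script: str = "scripts/start_web.py") -> int:
    """Launch the web stack with the current interpreter (hypothetical wrapper)."""
    return run_launcher([sys.executable, script])
```

A CLI entry point would then end with something like sys.exit(start_web_stack()), so a failed launch is visible to shells and CI jobs.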
Multi-User and Session Store Parity
- Assigned model options match the selector contract - non-admin LLM choices
now return profile names, model names, labels, and active/default metadata in
the same shape expected by the web model selector.
- PocketBase sessions support more chat flows - message metadata can be
persisted, last-message lookup is available, and message deletion works with
PocketBase string IDs as well as SQLite integer IDs.
- Regenerate remains storage-neutral - turn retry logic can remove the last
assistant message without assuming the backing session store uses integer
primary keys.
Tests
- Added Zulip channel coverage for config parsing, permission checks, duplicate
filtering, mentions, stream topic scoping, attachment extraction, retry
behavior, LaTeX conversion, typing status, sending, uploads, and startup
failures.
- Added TutorBot NVIDIA NIM provider tests for registry detection, schema
acceptance, and streaming request compatibility.
- Added LLM regression tests for Qwen vision capability, explicit context-window
budgets, and minimal-thinking provider kwargs.
- Added CLI coverage so deeptutor start propagates the launcher exit code.
- Added research token-pricing coverage for the DeepSeek v4 model entries.
Upgrade Notes
- Install or refresh the .[tutorbot] extra, or requirements/tutorbot.txt, to
include the new zulip>=0.8.0,<1.0.0 dependency before enabling Zulip bots.
- Configure Zulip bots with site, email, apiKey, allowFrom, and
groupPolicy; use mention for safer stream deployments and open only
when every stream message should reach the bot.
- If you use LLM_REASONING_EFFORT=minimal with DeepSeek, DashScope,
VolcEngine, BytePlus, or MiniMax, keep the setting as-is; v1.3.9 translates it
to the correct provider-specific disable payload.
- Large configured context windows may now be honored instead of capped at
65,536 tokens, so verify provider limits and expected prompt-cost behavior.
- Optional PocketBase deployments should ensure the messages collection has a
metadata_json JSON field before relying on regenerate/session metadata
parity.
What's Changed
- fix: raise context_window ceiling and add qwen vision support by @wedone in #442
- fix: add deepseek-v4-flash and deepseek-v4-pro to model pricing table by @Starfie1d1272 in #447
- fix(llm): stop sending reasoning_effort=minimal as top-level param to providers that reject it by @Starfie1d1272 in #453
- feat: add deeptutor start command to launch backend and frontend together by @Starfie1d1272 in #445
- fix(web): preserve newlines in user chat messages by @kagura-agent in #449
- feat(tutorbot): add Zulip channel support by @wedone in #452
- feat: tooltips for sidebar by @philliplagoc in #457
- fix: add TutorBot NVIDIA NIM provider support by @Bortlesboat in #455
New Contributors
- @philliplagoc made their first contribution in #457
- @Bortlesboat made their first contribution in #455
Full Changelog: v1.3.8...v1.3.9
DeepTutor-v1.3.8
DeepTutor v1.3.8 Release Notes
Release Date: 2026.05.08
v1.3.8 brings DeepTutor's optional multi-user mode into the main release line.
It keeps local single-user installs unchanged while adding authenticated shared
deployments with isolated user workspaces, admin-managed access, and clearer
deployment guidance.
Highlights
Multi-User Workspaces
- Authentication can gate shared deployments - enabling AUTH_ENABLED
adds login, registration, JWT sessions, and a first-user admin flow.
- Each user gets isolated data - ordinary users work under
multi-user/<uid>/ with separate chat history, memory, notebooks, and
knowledge bases, while admins keep the main workspace.
- Admin grants control access - /admin/users lets admins create users and
assign allowed model profiles, knowledge bases, skills, and copied spaces
without exposing API keys.
Safer Runtime Boundaries
- Knowledge and RAG stay scoped - assigned knowledge bases are visible with
badges, and non-admin RAG calls no longer fall back silently to admin data.
- Model routing honors grants - non-admin chat turns use an assigned model
profile and fail early if no LLM is available.
- Settings are redacted for users - non-admin settings show theme, language,
and model summaries, while provider secrets and endpoints remain admin-only.
Deployment and UI
- Frontend auth routes are included - /login, /register, auth-aware
middleware, logout controls, and admin navigation are wired into the web app.
- Multi-user docs are now first-class - README and translated READMEs
document setup, workspace layout, audit logs, env vars, and production
caveats.
- Optional PocketBase remains documented - PocketBase can still be used as a
sidecar path, but true multi-user deployments should leave POCKETBASE_URL
unset and use the built-in JSON/SQLite backend.
Tests
- Added multi-user tests for identity migration, first-admin registration,
grants, settings restrictions, scoped interface preferences, skill access, and
RAG fallback prevention.
- Added status-redaction coverage so non-admin users do not receive provider
model or search endpoint details.
Upgrade Notes
- Existing local installs stay in single-user mode unless AUTH_ENABLED=true.
- For real multi-user deployments, set AUTH_ENABLED=true, keep
POCKETBASE_URL blank, create the first admin through /register, and assign
models before ordinary users start chat turns.
- New deployment state is stored under multi-user/; back up both data/ and
multi-user/ before upgrading shared instances.
- Multi-worker deployments should bootstrap the first admin carefully because
first-user promotion is protected by an in-process lock.
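To show why an in-process lock matters for first-user promotion, here is a toy sketch. UserRegistry, its in-memory dict, and the role strings are all hypothetical; the real user store and schema are not described in these notes.

```python
import threading


class UserRegistry:
    """Toy in-memory registry used only to illustrate the locking pattern."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._users: dict[str, str] = {}

    def register(self, username: str) -> str:
        # The "is this the first user?" check and the insert happen under one
        # lock; otherwise two concurrent registrations could both see an
        # empty registry and both be promoted to admin.
        with self._lock:
            role = "admin" if not self._users else "user"
            self._users[username] = role
            return role
```

A threading.Lock only serializes callers inside one process. With several worker processes, each worker holds its own lock, which is why the note above recommends bootstrapping the first admin carefully.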
Full Changelog: v1.3.7...v1.3.8
DeepTutor-v1.3.7
DeepTutor v1.3.7 Release Notes
Release Date: 2026.05.04
v1.3.7 focuses on thinking-model compatibility, clearer knowledge-base index
history, and safer Co-Writer editing. It keeps provider-specific reasoning
output under control while making index activity easier to understand in the UI.
Highlights
Thinking-Model and Gateway Compatibility
- Reasoning output stays separate - OpenAI-compatible and TutorBot providers
keep reasoning_content out of visible answer text, and streaming avoids
replaying internal scratchpad as final content.
- DeepSeek thinking can be configured from .env - LLM_REASONING_EFFORT
is documented and applied through the resolver path. Use minimal to disable
DeepSeek thinking, or high/max to enable it.
- Custom gateway headers are preserved - chat and explicit LLM calls inherit
profile extra_headers, fixing gateways that require custom headers such as
a User-Agent override.
- Structured generation is more tolerant - book blocks and question ideation
now handle fenced, repaired, list-shaped, or otherwise imperfect JSON outputs
more reliably.
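The kind of tolerance described for structured generation can be sketched as follows. This is a minimal example under stated assumptions: parse_loose_json is a hypothetical helper that handles only fenced output and JSON wrapped in prose, and it does not reproduce whatever repair logic DeepTutor actually uses.

```python
import json
import re
from typing import Any

_TICKS = "`" * 3  # a Markdown code fence, built without writing it literally
_FENCE = re.compile(_TICKS + r"(?:json)?\s*(.*?)" + _TICKS, re.DOTALL | re.IGNORECASE)


def parse_loose_json(text: str) -> Any:
    """Parse JSON that may be fenced or surrounded by prose (hypothetical helper)."""
    match = _FENCE.search(text)
    candidate = (match.group(1) if match else text).strip()
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        # Fall back to the outermost {...} or [...] span in the text.
        starts = [i for i in (candidate.find("{"), candidate.find("[")) if i != -1]
        end = max(candidate.rfind("}"), candidate.rfind("]"))
        if not starts or end <= min(starts):
            raise
        return json.loads(candidate[min(starts):end + 1])
```

Input that contains no recoverable JSON still raises json.JSONDecodeError, so callers can choose their own fallback.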
Knowledge Index Visibility
- Index activity is recorded - create, upload, and re-index flows now store
last_indexed_at, indexed document count, and the index action in knowledge
metadata.
- Progress payloads describe real index changes - backend status updates can
distinguish metadata-only completion from an actual vector-index update.
- The Knowledge UI shows index history - detail, settings, and index-version
panels display the latest index time and document count when available.
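A minimal sketch of recording index activity only when the index actually changes is shown below. The three last_indexed_* field names appear in these notes; the function name, its parameters, and the ISO 8601 UTC timestamp format are assumptions.

```python
from datetime import datetime, timezone
from typing import Any


def record_index_activity(
    metadata: dict[str, Any],
    *,
    action: str,
    document_count: int,
    index_changed: bool,
) -> dict[str, Any]:
    """Return updated knowledge metadata (hypothetical helper)."""
    if not index_changed:
        # Metadata-only completion: leave the index history untouched.
        return metadata
    updated = dict(metadata)  # copy so the caller's dict is not mutated
    updated["last_indexed_at"] = datetime.now(timezone.utc).isoformat()
    updated["last_indexed_count"] = document_count
    updated["last_indexed_action"] = action
    return updated
```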
Co-Writer Editing Safety
- Clear and template actions ask first - replacing a non-empty draft now
opens a confirmation dialog before the editor is cleared or overwritten.
- Undo is more dependable - pending typing snapshots are committed before
toolbar edits, and editor shortcuts support Ctrl/Cmd+Z, Shift+Cmd+Z, and
Ctrl/Cmd+Y.
- Toolbar controls are clearer - destructive and template actions now have
distinct tones, focus states, labels, and accessible tooltips.
Tests
- Added OpenAI-compatible provider tests to keep reasoning_content separate
from visible response content in both service and TutorBot paths.
- Expanded LLM factory tests for inherited extra_headers, inherited
reasoning_effort, and reasoning-only streaming behavior.
- Added knowledge manager coverage for recording last_indexed_* metadata only
when the index actually changes.
Upgrade Notes
- Set LLM_REASONING_EFFORT in .env if you need global thinking control.
Leave it empty to let DeepTutor auto-detect behavior from the active model.
- Knowledge-base metadata may now include last_indexed_at,
last_indexed_count, and last_indexed_action.
- Co-Writer clear/template actions are recoverable through undo until the user
leaves the current draft.
Full Changelog: v1.3.6...v1.3.7
DeepTutor-v1.3.6
DeepTutor v1.3.6 Release Notes
Release Date: 2026.05.03
v1.3.6 focuses on making model routing explicit across DeepTutor. Users can
choose configured LLM profiles from chat and TutorBot flows, runtime services
resolve those choices without leaking provider secrets, and RAG/knowledge-base
index handling is more defensive when persisted embeddings are invalid.
Highlights
Catalog-Based Model Selection
- Chat can target a configured model - unified chat turns now carry a
profile_id and model_id selection through the WebSocket payload, session
preferences, turn snapshots, and regenerate flows.
- Settings exposes safe LLM options - the new settings options endpoint
returns display-ready provider/model choices while omitting credentials and
connection secrets from the response.
- Runtime model overrides are scoped per turn - selected profiles are
resolved through the provider catalog for the active request without writing
temporary choices back to disk or changing global defaults.
- Model-selector UI is shared - chat and TutorBot screens use the same
configured-model selector, with localized labels and system-default handling.
TutorBot Model Control
- Bots can persist model selections - TutorBot create/update flows now accept
llm_selection, validate it against the configured catalog, and store it with
each bot.
- Running bots can reload their LLM - changing a bot's model updates the
active agent loop instead of requiring a full bot restart.
- Recent bot history is steadier - TutorBot history assembly now sorts by
message timestamp with stable tie-breaking before taking the latest context.
- Bot chat route changes are cleaner - the web chat page cancels in-flight
bot requests and resets transient reasoning state when switching bots.
RAG and Knowledge Reliability
- Invalid vectors trigger rebuilds - re-indexing no longer treats a matching
document signature as reusable when the existing vector store fails embedding
validation.
- Full rebuilds use fresh version directories - complete knowledge-base
rebuilds write to a new flat index version while leaving failed old storage
available for inspection.
- RAG tool logs can stream to clients - retrieval runs can forward captured
INFO-level process logs as raw tool events when an event sink is available.
- Knowledge health checks recognize bad embeddings - invalid persisted
vectors are surfaced earlier instead of producing opaque search failures.
Provider and Launch Fixes
- OpenAI Responses token limits are normalized - Responses API calls now map
chat-style max_completion_tokens and max_tokens to max_output_tokens,
fixing the SDK error reported for newer OpenAI models in #437.
- Azure and OpenAI-compatible paths share the mapping - both streaming and
non-streaming Responses API routes use the same conversion helper.
- Launch ports come from .env and environment variables - setup and launch
helpers now keep backend/frontend port behavior aligned around the project
.env file instead of the older runtime settings JSON.
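The token-limit mapping above could be implemented along these lines. This is a sketch, not the project's conversion helper: the function name is hypothetical, and the alias precedence (max_completion_tokens before max_tokens), the rule that an explicit max_output_tokens wins, and the dropping of all None values are assumptions.

```python
from typing import Any

# Chat-style parameter names, in the precedence order assumed for this sketch.
_CHAT_TOKEN_ALIASES = ("max_completion_tokens", "max_tokens")


def to_responses_kwargs(kwargs: dict[str, Any]) -> dict[str, Any]:
    """Return a new kwargs dict that uses max_output_tokens."""
    # Build a fresh dict without the chat-style aliases and without None
    # values, so the caller's dict is never mutated.
    converted = {
        key: value
        for key, value in kwargs.items()
        if key not in _CHAT_TOKEN_ALIASES and value is not None
    }
    if "max_output_tokens" not in converted:
        for alias in _CHAT_TOKEN_ALIASES:
            if kwargs.get(alias) is not None:
                converted["max_output_tokens"] = kwargs[alias]
                break
    return converted
```

Keeping the conversion in one pure function makes it easy to share between streaming and non-streaming call paths.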
Web UX Polish
- Skill names validate before save - the Skills editor slugifies names,
flags invalid input inline, and prevents silent API failures for uppercase
letters, spaces, underscores, or other unsupported characters.
- Skill editor modals are opaque across themes - the editor now uses the
page background token, avoiding text bleed-through in translucent themes.
- Space navigation is easier to scan - Space mini-navigation, notebook,
question-bank, skills, and session-list spacing were tightened with clearer
card and divider treatment.
Tests
- Added model-selection service tests for safe option listing, active markers,
invalid profile/model rejection, and non-mutating catalog overrides.
- Added unified WebSocket turn-runtime tests for persisted LLM selections,
invalid selections, model switching, snapshots, and regenerate behavior.
- Added TutorBot API and manager tests for llm_selection persistence,
validation, runtime reload, and default-model behavior.
- Added settings, provider-runtime, and LLM-config tests for scoped catalog
selection and per-turn config precedence.
- Added RAG and knowledge-router tests for invalid vector stores, re-index
rebuild decisions, and storage version resolution.
- Added OpenAI Responses converter tests for token-limit aliases, precedence,
None filtering, and input immutability.
- Added frontend slug tests for skill-name normalization and validation.
Upgrade Notes
- Chat and TutorBot clients that want explicit model routing should send
llm_selection as { "profile_id": "...", "model_id": "..." }. Omitting it
continues to use the configured system default.
- TutorBot configuration files may now contain llm_selection. Existing bot
configs without that field continue to load, and legacy model values remain
usable as model-name overrides.
- Launch ports should be configured in .env or process environment variables
(BACKEND_PORT / FRONTEND_PORT). The old data/user/settings/env.json
port block is no longer used as a launch-port source.
- Knowledge bases with stale or invalid persisted vectors may rebuild on the
next re-index even when document signatures have not changed.
- Skill names are now normalized and validated as lowercase slugs of up to 64
characters using letters, numbers, and hyphens.
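The slug rule in the last note can be approximated with the sketch below. The 64-character limit and the allowed character set come from these notes; the function names are hypothetical, and rejecting leading, trailing, or doubled hyphens is an assumption. The actual validation lives in the web frontend and is not shown here.

```python
import re

MAX_SKILL_NAME_LENGTH = 64  # limit stated in the upgrade notes
_SLUG_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def slugify_skill_name(raw: str) -> str:
    """Normalize free-form input into a lowercase hyphenated slug."""
    # Collapse every run of unsupported characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", raw.strip().lower())
    # Trim hyphens, cut to the limit, then trim again in case the cut
    # left a trailing hyphen.
    return slug.strip("-")[:MAX_SKILL_NAME_LENGTH].strip("-")


def is_valid_skill_name(name: str) -> bool:
    """Check a name against the assumed slug rule."""
    return len(name) <= MAX_SKILL_NAME_LENGTH and bool(_SLUG_RE.match(name))
```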
Full Changelog: v1.3.5...v1.3.6
DeepTutor-v1.3.5
DeepTutor v1.3.5 Release Notes
Release Date: 2026.05.02
v1.3.5 focuses on making local setup and knowledge-base chat more reliable. The
launcher now follows the same runtime settings users configure in the web app,
RAG tool calls are stricter about real search queries, and local embedding
servers no longer receive placeholder auth headers.
Highlights
Smoother Local Launch
- Setup Tour writes launch ports - the guided installer now records backend
and frontend ports in data/user/settings/env.json, so later launches can use
the same choices.
- start_web.py reads runtime settings first - backend/frontend ports and UI
language come from web settings when available, with .env kept as fallback.
- Cleaner process handling - the launcher records started processes, detects
port conflicts, waits for readiness, and exposes scripts/stop_web.py for
cleaning up recorded backend/frontend processes.
- Setup requirements are clearer - README and environment examples now align
around Node.js 20.9+, install profiles, complete embedding endpoint URLs, and
optional attachment storage.
More Reliable RAG Tool Calls
- RAG queries must be non-empty - tool schemas, prompts, and built-in checks
now reject blank queries early instead of passing empty input into retrieval.
- Chat-side fallback is safer - when a model omits the RAG query, the agentic
pipeline can reuse the user's actual question as the retrieval query.
- ReAct calls accept simple string input - rag actions that provide a
string are normalized to {"query": ...}, reducing fragile tool-call failures.
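The three behaviors above (string normalization, fallback to the user's question, and blank-query rejection) can be combined in one small sketch. normalize_rag_args is a hypothetical name, and raising ValueError is an assumed failure mode; the notes do not say how DeepTutor reports a rejected query.

```python
from typing import Any, Optional


def normalize_rag_args(action_input: Any, user_question: Optional[str] = None) -> dict[str, Any]:
    """Coerce a rag tool call into {"query": ...} form (hypothetical helper)."""
    if isinstance(action_input, str):
        args: dict[str, Any] = {"query": action_input}
    elif isinstance(action_input, dict):
        args = dict(action_input)  # copy so the caller's dict is not mutated
    else:
        args = {}
    query = str(args.get("query") or "").strip()
    if not query and user_question:
        # The model omitted the query: reuse the user's actual question.
        query = user_question.strip()
    if not query:
        raise ValueError("rag query must be a non-empty string")
    args["query"] = query
    return args
```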
Local Embedding Compatibility
- No fake API key for local embedding providers - runtime config no longer
injects sk-no-key-required for local embedding servers.
- Placeholder keys are not sent as auth headers - OpenAI-compatible
embedding requests suppress Authorization and api-key when the configured
key is the local placeholder, which helps LM Studio, Ollama, vLLM, and similar
servers.
- Embedding examples are easier to follow - English and Chinese sample env
files now explain that EMBEDDING_HOST is the exact endpoint DeepTutor calls.
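Header suppression for placeholder keys might look like the following sketch. The placeholder value sk-no-key-required comes from these notes; build_embedding_headers is a hypothetical helper that covers only the Authorization header, on the assumption that an api-key header would be handled the same way.

```python
from typing import Optional

LOCAL_PLACEHOLDER_KEY = "sk-no-key-required"  # placeholder named in the notes


def build_embedding_headers(api_key: Optional[str]) -> dict[str, str]:
    """Build request headers, omitting auth for empty or placeholder keys."""
    headers = {"Content-Type": "application/json"}
    key = (api_key or "").strip()
    if key and key != LOCAL_PLACEHOLDER_KEY:
        # Only a real key is ever transmitted.
        headers["Authorization"] = f"Bearer {key}"
    return headers
```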
Web UX Polish
- Dark-mode provider dropdown is readable - the Settings provider selector
now uses the theme background token, fixing the white native dropdown popover
reported on Edge/Chromium.
- Settings controls are more consistent - select fields and setup tour
spotlight behavior were tightened for a steadier settings experience.
- Book reference payloads are normalized more defensively - selected book
references keep the same behavior with cleaner filtering and deduplication.
Tests
- Added launch settings tests for runtime settings precedence, .env fallback,
and invalid-port handling.
- Added start_web.py tests for translation, state persistence, and recorded
process matching.
- Added Setup Tour coverage for dependency profiles, Math Animator selection,
Node.js version validation, and saved launch ports.
- Added RAG/tool tests for non-empty query schemas, blank-query rejection, and
fallback query behavior.
- Added embedding runtime and adapter tests for local providers, placeholder API
keys, and auth header suppression.
Upgrade Notes
- Local web installs now require Node.js 20.9 or newer.
- start_web.py and setup helpers prefer data/user/settings/env.json and
interface.json over .env; edit the web Settings page or rerun
start_tour.py when changing launch ports.
- Local OpenAI-compatible embedding servers should use an empty API key unless a
real key is required. Avoid relying on sk-no-key-required as a transmitted
credential.
- Custom RAG callers should always provide a non-empty query; blank queries now
fail fast by design.
Full Changelog: v1.3.4...v1.3.5