Releases: HKUDS/DeepTutor

v1.4.2

28 May 14:37

DeepTutor v1.4.2 Release Notes

Release Date: 2026.05.28

v1.4.2 is a stability and polish release on top of v1.4.1.
It unblocks Gemini 2.5+ across Visualize and the chat agent, fixes a
ContextVar regression that silently routed authenticated requests to the
admin workspace, hardens the chat protocol for reasoning models with
native tool calling, ships smooth-streaming UX across every chat
surface, and adds support for the Lemonade local provider.

Gemini 2.5+ Reasoning Default-Off

Gemini 2.5 / 3 ship with thinking enabled by default and burn the entire
max_tokens budget on reasoning unless reasoning_effort: "none" is
sent on the request. v1.4.2 centralizes that logic in
reasoning_params.default_reasoning_effort_for, the single source of
truth used by all three execution paths (the OpenAI SDK, the aiohttp
fallback, and the reasoning-kwargs builder). Visualize, Chat, Solve,
and the agentic loop all stop returning empty bodies when configured
against gemini-2.5-pro / gemini-2.5-flash / gemini-3-*.
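
A minimal sketch of what a centralized lookup like this could look like. Only the function name default_reasoning_effort_for comes from the notes above; the prefix list, the function body, and the build_request_kwargs helper are assumptions made for illustration, not DeepTutor's actual code.

```python
from typing import Optional

# Hypothetical prefix list; the real mapping lives in reasoning_params
# and may match model names differently.
_THINKING_BY_DEFAULT_PREFIXES = ("gemini-2.5", "gemini-3")


def default_reasoning_effort_for(model: str) -> Optional[str]:
    """Return the reasoning_effort to send when the user configured none."""
    name = model.lower().split("/")[-1]  # tolerate "google/gemini-2.5-pro"
    if name.startswith(_THINKING_BY_DEFAULT_PREFIXES):
        return "none"
    return None


def build_request_kwargs(model: str, max_tokens: int) -> dict:
    """Hypothetical caller: every execution path would consult the same lookup."""
    kwargs = {"model": model, "max_tokens": max_tokens}
    effort = default_reasoning_effort_for(model)
    if effort is not None:
        kwargs["reasoning_effort"] = effort
    return kwargs
```

Keeping the decision in one function means the SDK path, the aiohttp fallback, and the kwargs builder cannot drift apart on which models need the override.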

Visualize Pipeline Hardening

Three independent failure modes are fixed:

  • Per-capability max_tokens defaults — Visualize now has its own
    entry in agents.yaml (16k tokens) seeded from
    DEFAULT_AGENTS_SETTINGS, so existing users with a stale
    data/user/settings/agents.yaml pick up the higher cap automatically
    without hand-editing.
  • SVG / HTML root trim — when a model wraps its output with prose
    ("Here you go: <svg>…") or emits a closing fence on the same line
    as the closing tag, the generator agent now trims to the outermost
    <svg>…</svg> / <!doctype>…</html> so the renderer always receives
    a clean root.
  • Review-step JSON-mode crash → graceful fallback — large or
    complex SVGs occasionally trip JSON-mode escaping inside the review
    step. Instead of crashing the turn, Visualize now logs the failure
    and ships the unreviewed draft so the user still sees a rendered
    result.
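
The root trim described above can be approximated with two greedy regular expressions. This is a sketch under assumptions: trim_to_root is a hypothetical name, and the real generator agent may handle more edge cases.

```python
import re

# Greedy ".*" with DOTALL spans from the first opening tag to the last
# closing tag, which yields the outermost root.
_HTML_ROOT = re.compile(r"<!doctype\b.*</html>", re.IGNORECASE | re.DOTALL)
_SVG_ROOT = re.compile(r"<svg\b.*</svg>", re.IGNORECASE | re.DOTALL)


def trim_to_root(raw: str) -> str:
    """Drop prose and stray fences around the outermost HTML or SVG root."""
    # HTML is checked first because an HTML document may embed inline SVG.
    for pattern in (_HTML_ROOT, _SVG_ROOT):
        match = pattern.search(raw)
        if match:
            return match.group(0)
    return raw.strip()  # nothing recognizable; hand back the text unchanged
```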

Authenticated Requests Land In The Right Workspace (#485)

In v1.4.1, require_auth was a sync FastAPI dependency. FastAPI
dispatches sync dependencies via anyio.to_thread.run_sync, which
runs them in a worker thread under a copy of the request context —
so the set_current_user(...) call inside the dependency installed
the user on the thread's context, which was discarded when the
thread returned. The endpoint then read the unset default and fell
back to the admin workspace, silently routing every authenticated
user's reads/writes through the local admin's data.

require_auth and require_admin are now async def, so they
execute in the same asyncio task as the endpoint and the
ContextVar is visible everywhere downstream. HTTP and WebSocket
entry points now share a single _install_current_user helper so
the user object resolved from a token payload is identical across
transports.
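
The failure mode can be reproduced with the standard library alone. In this sketch asyncio.to_thread stands in for anyio.to_thread.run_sync (both run the callable under a copy of the caller's context); the variable and function names are illustrative and not taken from DeepTutor.

```python
import asyncio
import contextvars

current_user = contextvars.ContextVar("current_user", default="admin")


def sync_dependency() -> None:
    # Runs in a worker thread under a copy of the caller's context,
    # so this assignment is lost when the thread returns.
    current_user.set("alice")


async def async_dependency() -> None:
    # Awaited directly, so it runs in the caller's task and context.
    current_user.set("alice")


async def handle_request(use_async: bool) -> str:
    if use_async:
        await async_dependency()
    else:
        await asyncio.to_thread(sync_dependency)
    return current_user.get()  # what the endpoint would read


async def main() -> tuple:
    # Separate tasks give each simulated request its own context.
    sync_result = await asyncio.create_task(handle_request(use_async=False))
    async_result = await asyncio.create_task(handle_request(use_async=True))
    return sync_result, async_result


results = asyncio.run(main())
```

Here results is ("admin", "alice"): the sync dependency's write never reaches the endpoint, while the async one does.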

Reasoning Models + Native Tool Calling: Label Protocol Fixed

v1.4.1 tried to be clever with reasoning models that have native
tool-calling support — it told them to ignore the TOOL/THINK/
FINISH/PAUSE labels and rely on reasoning_content plus
tool_calls alone, and inside run_labeled_step it treated
<think> preludes and any incoming tool-call delta as implicit
label resolutions. In practice both shortcuts hurt: when a tool
call leaked into the content stream as JSON instead of a real
tool_calls delta, there was no label to repair against, and the
loop happily treated the JSON-as-answer as a FINISH. Multi-turn
reasoning + tool workflows would either burn iterations on repair
retries or silently terminate early.

In v1.4.2:

  • Reasoning + native-tools system prompt tells the model that
    reasoning is displayed in a separate trace area, but the formal
    content stream must still start with exactly one of
    FINISH/TOOL/THINK/PAUSE.
  • run_labeled_step no longer treats tool-call deltas as
    authoritative for label resolution, and implicit_think_label is
    ignored (kept for API compatibility). A missing label always falls
    to LABEL_UNKNOWN, so the chat pipeline's protocol-repair path
    catches it instead of silently mis-routing the turn.
  • Inline <think>...</think> preludes are streamed live into the
    reasoning sub-trace and stripped from the formal text returned
    to the loop — so the answer area no longer leaks raw provider
    markers.
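
A rough sketch of the "labels are always required" rule. The label names and LABEL_UNKNOWN come from the notes above; the wire format (label, optional colon, then body), the value of LABEL_UNKNOWN, and the function name are assumptions for illustration.

```python
import re

LABEL_UNKNOWN = "UNKNOWN"  # assumed value; only the constant's name is from the notes

_THINK_PRELUDE = re.compile(r"^\s*<think>(.*?)</think>\s*", re.DOTALL)
_LABEL = re.compile(r"^(FINISH|TOOL|THINK|PAUSE)\b[:\s]*")


def resolve_labeled_text(raw: str):
    """Return (label, body, reasoning) for one model response."""
    reasoning = ""
    prelude = _THINK_PRELUDE.match(raw)
    if prelude:
        # The prelude goes to the reasoning trace, not the answer area.
        reasoning = prelude.group(1).strip()
        raw = raw[prelude.end():]
    label = _LABEL.match(raw)
    if not label:
        # No implicit resolution: the caller's repair path handles this case.
        return LABEL_UNKNOWN, raw.strip(), reasoning
    return label.group(1), raw[label.end():].strip(), reasoning
```

A tool call that leaks into the content stream as JSON now resolves to LABEL_UNKNOWN instead of being accepted as an answer.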

Smooth Streaming Across Every Chat Surface

The rAF typewriter (useSmoothStreamText) introduced last week for
the main chat is now wired through AssistantResponse, so the
book chat panel, quiz follow-up tab, and any other surface that
renders an assistant message all get the same frame-aligned cadence
during streaming and a no-op pass-through for completed messages.

Companion fixes:

  • Book chat panel and quiz follow-up tab moved their autoscroll to
    useLayoutEffect and stopped using
    scrollIntoView({behavior: "smooth"}), because the smooth animation
    races against the next-frame layout update during fast streams and
    produces visible jitter. They now do a single
    scrollTop = scrollHeight pin in layout phase, matching what
    useChatAutoScroll does on the main chat.
  • Book chat panel marks its scroller with data-chat-scroll-root so
    the global overflow-anchor: none rule applies (the browser's
    built-in scroll anchoring fights manual pinning when code blocks
    reflow above the cursor).
  • AssistantResponse is now memoized — completed bubbles stop
    re-parsing markdown when an unrelated streaming sibling updates the
    parent.

Sidebar Redesign

The expanded sidebar's chat-session list moved into its own
collapsible Recents region with an independent scroll viewport, so
long histories no longer push secondary nav off-screen. The "New chat"
button is gone (clicking Chat in the nav already starts a new
session), and a Docs link to deeptutor.info sits next to the GitHub link in
the footer.

Each session now renders with a deterministic, friendly Lucide icon —
sparkles, leaf, feather, cloud, droplet, sun, moon, flame, star, etc.
— so the sidebar feels varied at a glance without shuffling on
re-render. Running sessions add a gentle wiggle animation; idle ones
stay still.

Lemonade Local Provider

New lemonade provider binding for the AMD Ryzen AI / NPU runtime
(default base URL http://localhost:13305/api/v1). Auto-detected by
port 13305, no API key required, listed in the README Docker host-
gateway section and in the provider configuration docs alongside
Ollama / LM Studio / llama.cpp / vLLM.

Models-Endpoint Probe Honors DISABLE_SSL_VERIFY

The context-window auto-detection now passes
aiohttp.TCPConnector(ssl=False) when DISABLE_SSL_VERIFY is set,
matching the behavior of the rest of the HTTP layer. Self-signed local
inference servers no longer fall back to the default context window
just because the probe couldn't verify their cert.

Tests

  • tests/api/test_auth_contextvar.py — pins the regression from #485:
    a sync require_auth would lose the ContextVar; the async version
    preserves it across the dependency boundary.
  • tests/services/llm/test_reasoning_params.py — covers the
    centralized default_reasoning_effort_for mapping.
  • tests/core/test_labeled_step_think_prelude.py — updated to reflect
    the new "labels are always required" semantics.
  • tests/agents/chat/test_agentic_parallel_tools.py — verifies the
    reasoning + native-tools path still resolves multi-tool turns.
  • tests/services/config/test_context_window_detection.py — the
    models-endpoint probe honors DISABLE_SSL_VERIFY and passes a
    TCPConnector(ssl=False) to the aiohttp session.

Upgrade Notes

  • Drop-in from v1.4.1: pip install -U deeptutor; Docker users pull
    ghcr.io/hkuds/deeptutor:latest.
  • If you previously hand-edited data/user/settings/agents.yaml to
    bump Visualize's max_tokens, that value still wins. The new 16k
    default only seeds users whose agents.yaml doesn't mention
    Visualize at all.
  • If you wired a Gemini 2.5+ model and saw empty or truncated outputs,
    no configuration change is needed — the default-off behavior now
    applies automatically.

What's Changed

  • fix(auth): make require_auth async so the user ContextVar reaches the endpoint by @truffle-dev in #485
  • fix(visualize): unblock Gemini 2.5+ and harden Visualize pipeline by @skinred78 in #490

Full Changelog: v1.4.1...v1.4.2

v1.4.1

27 May 15:17

DeepTutor v1.4.1 Release Notes

Release Date: 2026.05.27

v1.4.1 is a security and stability patch on v1.4.0. It locks down
the TutorBot tool sandbox, isolates per-user resources, fixes a v1.4.0 chat
regression, adds an HTTP API for talking to a specific TutorBot, and ships a
multimodal image-fallback fix for providers DeepTutor has no vision entry for.

What's New

TutorBot Tool Sandbox Is Opt-In

The shell exec tool is no longer registered unless an admin sets
allow_shell_exec, and all filesystem + shell access is confined to the bot
workspace by default. Command deny-lists were re-anchored at command
boundaries, and allow_shell_exec can't be flipped on via the update payload.
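
One common way to confine file access to a workspace is to resolve the requested path and reject anything that lands outside the root. The helper below is a sketch of that general technique; its name and error type are hypothetical and it is not taken from the TutorBot code.

```python
from pathlib import Path


def resolve_in_workspace(workspace: Path, requested: str) -> Path:
    """Resolve `requested` under `workspace`, refusing paths that escape it."""
    root = workspace.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check;
    # an absolute `requested` replaces the root entirely and fails below.
    candidate = (root / requested).resolve()
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"{requested!r} escapes the workspace")
    return candidate
```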

Per-User Resource Isolation

Book roots, session databases, turn-runtime stores, and TutorBot directories
are scoped per user, and web/API conversations are keyed per session — so
cross-user requests can't reach each other's data.

HTTP / SSE API For A TutorBot

New POST /{bot_id}/chat and /chat/execute-stream (SSE) endpoints with
auto-start and persistent per-session context, for multi-turn conversations
with a specific bot from external clients.

Multimodal Image Fallback

Images are sent optimistically to every provider; if a request carrying images
fails and the model isn't in the known-vision allowlist, the turn retries
text-only. Fixes silently dropped images on Doubao / VolcEngine and other
multimodal models that lack a capability entry.
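
The retry policy reads roughly as follows. This is a sketch: the send callable, the allowlist contents, and the function name are placeholders, and the real implementation may inspect the error more narrowly.

```python
from typing import Callable, List

# Placeholder allowlist; the real one is maintained inside DeepTutor.
KNOWN_VISION_MODELS = {"example-vision-model"}


def complete_with_image_fallback(
    send: Callable[[str, str, List[bytes]], str],
    model: str,
    prompt: str,
    images: List[bytes],
) -> str:
    """Send images optimistically; retry text-only for models not known to support vision."""
    try:
        return send(model, prompt, images)
    except Exception:
        # A known-vision model failing is a real error, and a request
        # without images has nothing to strip, so both cases re-raise.
        if not images or model in KNOWN_VISION_MODELS:
            raise
        return send(model, prompt, [])
```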

Safe ZIP Upload + Network Settings

.zip knowledge uploads expand member-by-member through the document validator
with size / count / compression-ratio bounds and path-escape guards; the
archive itself is never indexed. A new /settings/network page surfaces ports,
public API base, and CORS origins (normalized to tolerate host:port and
trailing slashes), plus a "fetch models" action listing model IDs from an
OpenAI-compatible endpoint.
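
A sketch of member-by-member expansion with the kinds of bounds listed above, using only the standard zipfile module. The limits and the function name are illustrative assumptions; the real validator's thresholds are not stated in these notes.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Illustrative limits, not DeepTutor's actual values.
MAX_MEMBERS = 1000
MAX_MEMBER_BYTES = 50 * 1024 * 1024
MAX_COMPRESSION_RATIO = 100


def safe_members(data: bytes):
    """Yield (name, content) for each file member that passes the checks."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        infos = [info for info in archive.infolist() if not info.is_dir()]
        if len(infos) > MAX_MEMBERS:
            raise ValueError("too many members")
        for info in infos:
            path = PurePosixPath(info.filename)
            if path.is_absolute() or ".." in path.parts:
                raise ValueError(f"unsafe path: {info.filename}")
            # file_size comes from the archive header; a stricter reader
            # would also cap the bytes actually read.
            if info.file_size > MAX_MEMBER_BYTES:
                raise ValueError(f"member too large: {info.filename}")
            ratio = info.file_size / max(info.compress_size, 1)
            if ratio > MAX_COMPRESSION_RATIO:
                raise ValueError(f"suspicious compression ratio: {info.filename}")
            yield info.filename, archive.read(info)
```

Each yielded member would then go through the normal document validator, so the archive itself never needs to be indexed.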

Community Fixes & Changes

Security — closed TutorBot RCE via the shell tool (#518), path traversal in
the filesystem tool (#517), cross-bot file-management authz bypass (#516),
cross-session turn-regeneration authz bypass (#515), book-confirmation authz
bypass (#514), and ExecTool executing LLM shell commands over chat (#506,
first hardened in PR #507).

Bug fixes — chat input disabled after the first turn (v1.4.0 regression,
#520), knowledge-base embedding failure on long documents (#521 / PR #509),
new users unable to create a profile under Docker (#512 / PR #513), Qwen
reasoning models failing native tool calling (#527 / PR #528), and the GPT-5
init-wizard token parameter (PR #508).

Merged / reworked PRs — native tool calling for reasoning models (#528),
oversized session-event truncation (#524), empty-state profile button (#513),
chunking-pipeline fix (#509), GPT-5 probe (#508), ExecTool hardening (#507).
Contributions #522 (zip upload) and #523 (model fetching + notebook lookup)
were reimplemented locally and ship here.

Feature request — HTTP API for multi-turn chat with a specific TutorBot
(#511).

Upgrade Notes

  • Drop-in from v1.4.0: pip install -U deeptutor; Docker users pull
    ghcr.io/hkuds/deeptutor:latest.
  • TutorBot shell exec is now disabled by default. If you relied on it, set
    allow_shell_exec on the bot; tool access stays confined to the workspace.
  • For cross-site HTTPS auth, set explicit CORS origins and cookie_secure=true.

What's Changed

  • fix(settings): add profile button when no profiles exist by @wedone in #513
  • Fix GPT-5 token limit parameter in init wizard probe by @lezhimiffyliu in #508
  • fix(security): harden ExecTool defaults against command injection (#506) by @kagura-agent in #507
  • fix: prevent Document nodes from bypassing chunking pipeline by @washi4 in #509
  • feat: truncate oversized event payloads in session response by @xiongjnu in #524
  • fix(chat): enable native tool calling for reasoning models via OpenAI-compatible endpoints by @wedone in #528

Full Changelog: v1.4.0...v1.4.1

v1.4.0

22 May 13:10

DeepTutor v1.4.0 Release Notes

Release Date: 2026.05.22

v1.4.0 is the GA cut of the v1.4 line. It carries the full v1.4.0-beta scope
(Auto Mode, three-layer Memory workbench, agentic Deep Research / Deep Solve /
Question, LlamaIndex chat refactor, unified capabilities infrastructure +
i18n, Visualize/Animator merge, new chat tools, settings split, multi-user
isolation hardening, and the deeptutor start / deeptutor init launcher)
and adds a focused set of agent-engine and runtime hardening fixes on top.
For the full beta scope see v1.4.0-beta.

What's New Since v1.4.0-beta

Reasoning Effort + Provider Thinking Flags

The agentic engine now normalizes "reasoning effort" across providers in one
place, instead of each pipeline re-deriving it.

  • build_provider_extra_kwargs in deeptutor/core/agentic/client.py
    resolves reasoning_effort against provider_registry, maps minimal /
    minimum to the right wire shape (DashScope uses minimum), and routes
    the on/off intent through provider thinking-style toggles —
    thinking_type (DeepSeek), enable_thinking (Qwen-style), or
    reasoning_split. Known reasoning models default to high; the
    deepseek-v4-flash shape gets thinking disabled by default.
  • LLMClientConfig.reasoning_effort is plumbed end-to-end. All four
    pipelines — agentic chat, question, research, solve — pass the binding +
    effort into build_completion_kwargs, so a single user-level setting
    controls reasoning behavior across capabilities.

Tool-Schema Fallback For Strict Providers

Some OpenAI-compatible providers reject native function-calling schemas with
a non-standard error. run_labeled_step now detects that case, strips
tools / tool_choice from the request, retries once, and emits a warning
progress event so the user knows the call ran in tool-less prose mode.
Behavior is unchanged for providers that accept tool schemas normally.
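
The retry-once behavior can be sketched as below. ToolSchemaRejected is a hypothetical stand-in for however run_labeled_step recognizes the provider's non-standard error, and the warnings list stands in for the progress event.

```python
from typing import Any, Callable, Dict, List


class ToolSchemaRejected(Exception):
    """Hypothetical marker for a provider rejecting tool schemas."""


def call_with_tool_fallback(
    complete: Callable[..., str],
    request: Dict[str, Any],
    warnings: List[str],
) -> str:
    """Try the request as-is; on schema rejection, retry once without tools."""
    try:
        return complete(**request)
    except ToolSchemaRejected:
        if "tools" not in request and "tool_choice" not in request:
            raise  # nothing to strip, so a retry could not help
        stripped = {
            key: value
            for key, value in request.items()
            if key not in ("tools", "tool_choice")
        }
        warnings.append("Provider rejected tool schemas; retrying without tools.")
        return complete(**stripped)  # single retry; a second failure propagates
```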

Restart-Safe Turn Runtime

A server or container restart used to leave the database row for an
in-flight turn stuck on running, blocking the next message in that
session and leaving the UI hanging on an event stream that would never
complete.

  • Orphan detection: TurnRuntimeManager now checks whether this
    process still owns the turn's in-memory runner. If not, the persisted
    turn is marked failed with "Turn interrupted by server restart. Please retry your message."
  • start_turn sweeps stale active turns for the session before
    creating a new turn, so the user can immediately retry after a restart.
  • subscribe_turn synthesizes a terminal error + done event for
    the orphaned turn so the frontend cleanly closes its streaming state
    instead of waiting indefinitely.
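
The recovery logic can be modeled in a few lines. The classes below are a simplified, in-memory sketch; Turn, TurnStore, and sweep_orphans are hypothetical names, and only the error message is quoted from the notes above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

RESTART_MESSAGE = "Turn interrupted by server restart. Please retry your message."


@dataclass
class Turn:
    turn_id: str
    session_id: str
    status: str = "running"
    error: str = ""


@dataclass
class TurnStore:
    turns: Dict[str, Turn] = field(default_factory=dict)  # stands in for persisted rows
    live_runners: Set[str] = field(default_factory=set)   # in-memory, lost on restart

    def sweep_orphans(self, session_id: str) -> List[str]:
        """Fail running turns in this session that no live runner owns."""
        swept = []
        for turn in self.turns.values():
            if (turn.session_id == session_id and turn.status == "running"
                    and turn.turn_id not in self.live_runners):
                turn.status, turn.error = "failed", RESTART_MESSAGE
                swept.append(turn.turn_id)
        return swept

    def start_turn(self, turn_id: str, session_id: str) -> Turn:
        self.sweep_orphans(session_id)  # unblock the session before creating a turn
        turn = Turn(turn_id, session_id)
        self.turns[turn_id] = turn
        self.live_runners.add(turn_id)
        return turn
```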

Frontend API Base — Docker Placeholder Hardening

web/lib/api.ts switched from exact-match to substring-token detection of
the NEXT_PUBLIC_API_BASE placeholder, and exposes isApiBasePlaceholder()
for reuse. This survives small changes to the Docker placeholder shape that
previously slipped through and produced a blank Settings page with no
visible error. The unused auto-generated web/.env.local shipped with the
beta is removed.

LLM Config Probe — Clearer Status Copy

The /settings/llm probe now emits an explicit "Basic LLM completion
succeeded. Chat additionally validates streaming and provider tool
compatibility at runtime." line after the smoke completion, so users know
the probe deliberately doesn't cover the streaming / tool-call paths that
chat exercises.

CLI Chat REPL — Config Surface Aligned With Docs

The deeptutor chat REPL surface was tightened so it matches the public
docs at deeptutor.info.

  • deeptutor chat --config key=value and --config-json '<json>' seed
    the initial config without entering the REPL first.
  • /config set key value and /config set key=value are both accepted;
    JSON values ([…], {…}) are parsed correctly via shlex.
  • Backslash-continuation lets you send multi-line prompts in one turn.
  • /refs prints a structured state snapshot (session, capability, tools,
    KB, history, notebooks, language, config) instead of a single dim line.
  • deeptutor plugin info <name> now also returns cli_aliases and the
    capability availability block.
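
A sketch of how both /config set forms could be parsed with shlex and json. The function name and the fall-back-to-string rule are assumptions; the actual REPL parser may differ.

```python
import json
import shlex
from typing import Any, Tuple


def parse_config_set(args: str) -> Tuple[str, Any]:
    """Parse the text after '/config set' into (key, value)."""
    tokens = shlex.split(args)  # honors quotes, so quoted JSON stays one token
    if len(tokens) == 1 and "=" in tokens[0]:
        key, raw = tokens[0].split("=", 1)
    elif len(tokens) == 2:
        key, raw = tokens
    else:
        raise ValueError("expected 'key value' or 'key=value'")
    try:
        return key, json.loads(raw)  # numbers, booleans, lists, objects
    except json.JSONDecodeError:
        return key, raw              # anything else stays a plain string
```

For example, parse_config_set("depth=3") gives ("depth", 3), and a quoted JSON list such as tools '["rag", "web_search"]' comes back as a Python list.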

Repo & Docs Hygiene

  • New issue template for the public docs site (.github/ISSUE_TEMPLATE/docs.yml).
  • AGENTS.md modernized to reflect Auto Mode, tool gating, and the
    current capability stages.
  • .gitignore covers .playwright-cli/; stale .playwright-cli/
    captures, .env.example_CN, and DeepTutor.code-workspace removed.

Tests

  • New tests/core/test_agentic_client_provider_kwargs.py covering the
    reasoning-effort + provider thinking-style matrix.
  • New tests/core/test_labeled_step_tool_fallback.py exercising the
    tool-schema retry path end-to-end against a scripted client.
  • tests/services/session/test_turn_runtime_subscribe.py adds coverage
    for orphan-running-turn recovery on both subscribe and start_turn.
  • tests/cli/test_chat_cli.py adds REPL config/backslash tests and a
    plugin-info contract check.
  • New tests/cli/test_docs_contract.py keeps the public site docs in
    sync with the CLI surface (links + deeptutor … examples).
  • tests/scripts/test_docker_compose.py + web/tests/api-resolve-base.test.ts
    cover the new placeholder detection.

Upgrade Notes

  • From v1.4.0-beta: drop-in. PyPI installs upgrade with
    pip install -U deeptutor (under PEP 440 ordering, 1.4.0 sorts after
    the 1.4.0b0 pre-release). Docker users pull
    ghcr.io/hkuds/deeptutor:latest.
  • Reasoning effort is now read from LLMConfig.reasoning_effort. If
    you previously set provider thinking flags manually in extra_body,
    the agent will still respect them, but the recommended path is to set
    reasoning_effort and let the engine map it.
  • Restart recovery is automatic. After upgrading, any existing
    database rows stuck on running from previous crashes will be
    finalized the next time their session is opened.
  • Beta artifact cleanup: web/.env.local was auto-generated by the
    legacy launcher and is no longer needed. If you rebuilt the frontend
    manually with that file, delete it before the next build.

Full Changelog: v1.4.0-beta...v1.4.0

DeepTutor-v1.4.0-beta

21 May 16:26

DeepTutor v1.4.0-beta Release Notes

Release Date: 2026.05.21

v1.4.0-beta is the largest release since the agent-native rewrite. It folds an
end-to-end Auto Mode on top of the existing capabilities, ships a
three-layer memory subsystem (L1/L2/L3) with a dedicated workbench, rebuilds
Deep Research / Deep Solve / Question on the same agentic engine as Chat,
re-architects the chat capability + LlamaIndex RAG pipeline around a
session-cumulative source inventory, unifies the Capabilities infrastructure
and i18n, merges the Animator menu into Visualize, and reorganises
Settings, environment, and the local launcher. Several new chat tools
(ask_user, web_fetch, write_note, list_notebook, github_query) plus a
delete-chat-turn flow, quiz follow-up chat, and a GeoGebra viewer round out the
release.

Highlights

Auto Mode — Agentic Capability Router

A new auto capability sits on top of the existing modes and chooses the right
one for each request, instead of forcing the user to pick a mode up front.

  • Three-stage agent loop: ANALYZING (single LLM call, streamed as
    thinking) → DELEGATING (up to max_iterations of router calls that emit
    delegate_to_<cap> tool calls or atomic tool calls) → SYNTHESIZING (final
    inline answer, either passed through from the loop or assembled by a closing
    LLM call).
  • Routes to real capabilities: deep_solve, deep_question,
    deep_research, math_animator, visualize, plus the chat-level atomic
    tools (web_search, web_fetch, rag, …) live behind the same router so
    the LLM can mix retrieval and full sub-capability runs in one turn.
  • Bounded retries and quotas — independent retry budgets for router-LLM
    errors, per-delegation failures, and arg-validation feedback; a configurable
    max_same_capability_calls quota keeps the loop from spinning on one mode.
  • Clean conversation history — sub-capability events flow through a
    forward_events shim that tags every content event with a call_id, so the
    conversation turn-runtime filter keeps only Auto's own final synthesis in
    saved history. Sub-runs are still streamed live to the UI.
  • answer_now fast-path — when the user asks to "answer now" the pipeline
    skips analysis + delegation and produces an immediate inline reply.
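
The iteration budget and per-capability quota can be illustrated with a toy loop. Only max_iterations and max_same_capability_calls are names from the notes; the function, its defaults, and the choice to stop when the quota trips are assumptions (the real router may feed the rejection back to the LLM instead).

```python
from collections import Counter
from typing import Callable, List, Optional


def run_delegation_loop(
    next_call: Callable[[List[str]], Optional[str]],
    max_iterations: int = 6,
    max_same_capability_calls: int = 2,
) -> List[str]:
    """Collect delegations until the router stops, a quota trips, or the budget ends."""
    history: List[str] = []
    counts: Counter = Counter()
    for _ in range(max_iterations):
        capability = next_call(history)
        if capability is None:  # router chose to move on to synthesis
            break
        if counts[capability] >= max_same_capability_calls:
            break               # quota reached; stop spinning on one mode
        counts[capability] += 1
        history.append(capability)
    return history
```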

Three-Layer Memory Subsystem (Memory v2)

The previous flat memory page is replaced by a structured three-layer store
with an explicit consolidation pipeline and a dedicated workbench.

  • L1 / L2 / L3 layout — L1 captures raw run traces, L2 holds normalised
    document records, L3 holds curated slots per surface (chat, notebook, book,
    TutorBot). Per-user paths flow through PathService so multi-user
    deployments stay isolated.
  • Consolidator pipeline — modular consolidator/ modules (chunker, guards,
    parse, references, runs, modes, line-doc, meta) turn run traces into
    versioned line-oriented documents with stable ids, references between
    layers, and a snapshot history.
  • Memory Workbench UI — new /memory routes (graph, l1, l2, l3,
    resolve) ship as standalone pages with workbench, hub, graph viewer, run
    panel, and an archived-state banner. A reusable MemorySection component is
    embedded where the legacy memory panel used to live.
  • First-class chat tools: read_memory and write_memory are exposed
    as agent tools (with i18n hints) so chat / Auto can recall and update memory
    inside a turn instead of needing a separate save step.
  • Settings integration — Memory now has its own page under
    /settings/memory with run controls, mode toggles, and storage status.

Deep Research, Deep Solve, and Question on the Agentic Engine

The three multi-agent pipelines have been rewritten as orchestrators on top of
the shared agentic-engine primitives, deleting hundreds of bespoke prompt
files and per-agent classes.

  • Deep Research → agents/research/pipeline.py — four phases (Rephrase,
    Decompose, Research blocks, Reporting) implemented as labeled steps
    (THINK / TOOL / APPEND / OUTLINE / SECTION / FINISH). The dynamic
    topic queue and CitationManager are preserved; the new APPEND label lets
    research blocks add follow-up topics to the queue without leaving the loop.
    ask_user v2 drives up to three rephrase rounds with multi-question cards.
  • Deep Solve → agents/solve/pipeline.py: Pre-retrieve (KB-only),
    Plan, Solve (per-step THINK / TOOL / FINISH / REPLAN loop with a
    back-edge from solve to plan), and a final Synthesize step. Each step's
    FINISH flows into the next step's prompt context so the answer reads as
    one continuous narrative.
  • Question / Quiz — coordinator + pipeline replace the old generator /
    idea_agent / models modules; the old prompt directories have been
    removed entirely.
  • All three drop the legacy agents/ and prompts/ directories for their
    respective modes, leaving one pipeline file and shared labeled-step prompts.

Chat Capability & LlamaIndex RAG Refactor

The agentic chat pipeline has been rebuilt around a session-cumulative
"Attached Sources" manifest and a cleaner LlamaIndex pipeline.

  • Branch-isolated source inventory: services/session/source_inventory.py
    materialises every source attached on the active branch's ancestor chain.
    Fresh sources from the current turn show a full preview; historical sources
    show a one-line row with id, name, kind, size, and the turn ordinal where
    they first appeared. The LLM calls read_source(id) to expand the full
    text on demand. Sibling branches never leak sources into each other.
  • LlamaIndex pipeline split-out — dedicated config.py, ingestion.py,
    retrievers.py, and document_loader.py replace the previous monolithic
    pipeline module. Storage stays backward-compatible with v1.3 versioned
    indexes.
  • Lean agentic chat prompt: agentic_chat.yaml (EN/ZH) was rewritten to
    match the new tool surface and the source-inventory contract; the old
    parallel-tool prompt scaffolding is gone.
  • Builtin tools registry: tools/builtin/__init__.py is the single place
    where chat-mounted tools, hint prompts, and arg-augmentation wrappers are
    registered.

Capabilities Infrastructure Unification

Every capability now goes through one shared envelope, one status-i18n loader,
and one cost-tracking surface.

  • emit_capability_result helper — every capability emits its final
    result through one helper that fills the result envelope (label, summary,
    payload, render hints) and the trailing usage-tracker totals consistently.
  • StatusI18n — capability status copy lives in
    capabilities/prompts/{en,zh}/<name>.yaml and is loaded via a shared
    StatusI18n accessor. Hard-coded English status strings have been removed
    from the pipelines.
  • UsageTracker cost surface — token usage and cost are tracked through
    one tracker per capability run, exposed to the result envelope, and shown
    on the new /settings/capabilities admin page (live list, defaults,
    per-capability override toggles).
  • Deprecated main.yaml keys removed — the legacy main.yaml capability
    copy has been deleted in favor of per-capability prompt files.

Visualize: Animator Folded Into One Capability

The standalone Animator menu has been merged into Visualize so the user picks a
visualization once and the system chooses the renderer.

  • render_type discriminator: AnalysisAgent picks one of six render
    types — svg, chartjs, mermaid, html (text-emitting, three-stage
    pipeline) or manim_video / manim_image (Manim subprocess pipeline). The
    result envelope carries render_type so the frontend delegates to the
    right viewer.
  • Single sidebar entry — the old Animator menu entry is gone; users now
    go through Visualize for both static charts and Manim videos. The
    fullscreen viewer / config panel handle all render types.

New Chat Tools

  • ask_user — packages 1–3 structured questions into a single payload that
    pauses the same turn until the user answers. The frontend renders a card
    letting the user navigate questions and submit answers in one batch; the
    pipeline resumes the turn with the answers wired back as the tool result.
    Used by Deep Research's Rephrase phase and available to chat / Auto.
  • web_fetch — URL fetch with readable-content extraction, strict scheme
    / private-IP / size guards (applied both pre-flight and post-redirect),
    and …[truncated] markers when output exceeds the cap.
  • write_note — replaces the old save_to_notebook tool. Two modes:
    append creates a new record (default body is the rendered transcript,
    optional agent-authored body) and edit updates an existing record by
    record_id.
  • list_notebook — read-only index / drill-down listing of the active
    user's notebooks and records. Only mounted when the user actually has
    notebooks, so empty runs are impossible by construction.
  • github_query — read-only gh CLI wrapper covering pr, issue,
    run, repo, and a GET-only api fallback. No mutation verbs are
    reachable through the tool surface. Returns a clean "tool unavailable"
    outcome when gh is not installed.
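
The scheme and private-IP guards mentioned for web_fetch follow a well-known pattern, sketched here with the standard library. Function names and the injectable resolver are illustrative; the actual guard set in DeepTutor may be broader.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_public_address(address: str) -> bool:
    ip = ipaddress.ip_address(address)
    return not (ip.is_private or ip.is_loopback or ip.is_link_local
                or ip.is_reserved or ip.is_multicast or ip.is_unspecified)


def check_url(url: str, resolve=socket.getaddrinfo) -> None:
    """Raise ValueError if the URL should not be fetched."""
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("missing host")
    # Check every resolved address, not just the first one.
    for info in resolve(parts.hostname, parts.port or 443):
        if not is_public_address(info[4][0]):
            raise ValueError(f"non-public address for {parts.hostname}")
```

Running the same check again on each redirect target is what makes the guard hold "post-redirect" as well as pre-flight.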

Chat Surface Features

  • Delete chat turn (#443) — message items now carry a stable id, the
    session API exposes deleteMessage, the chat reducer adds a DELETE_TURN
    action, and a 409 vs 404 check rejects deletion of a still-running turn.
    Optimistic temp ids are resolved before deletion to avoid orphaned UI rows.
  • Quiz follow-up chat composer: FollowupChatComposer and
    QuizFollowupContext let the user start a chat thread directly from a quiz
    question. The composer reuses the ma...

DeepTutor-v1.3.10

10 May 14:58

DeepTutor v1.3.10 Release Notes

Release Date: 2026.05.10

v1.3.10 is a focused reliability release for the issues reported after v1.3.9.
It restores smoother remote Docker access, makes self-signed LLM endpoints work
consistently across SDK-backed providers, protects code snippets from citation
rewrites, and splits Matrix E2EE into an explicit opt-in dependency.

Highlights

Remote Docker and CORS Recovery

  • Remote single-user Docker works out of the box again - when
    AUTH_ENABLED=false, DeepTutor now accepts browser origins over HTTP/HTTPS so
    LAN or remote-server frontends no longer hit the v1.3.8/v1.3.9 CORS
    regression reported in #463.
  • Authenticated deployments stay explicit - when AUTH_ENABLED=true, CORS
    still requires a concrete allowlist through CORS_ORIGIN or CORS_ORIGINS,
    preserving the credentialed-auth safety boundary.
  • Multiple deployment origins are supported - CORS_ORIGINS accepts comma
    or newline separated values, and both Docker Compose files pass the setting
    through to the backend container.
  • Settings no longer drop network flags - CORS_ORIGIN, CORS_ORIGINS, and
    DISABLE_SSL_VERIFY are part of the canonical .env write order.
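
Accepting comma- or newline-separated origins amounts to a small split-and-trim step. The helper name below is hypothetical and the sketch ignores any further normalization the backend may apply.

```python
import re
from typing import List


def parse_cors_origins(raw: str) -> List[str]:
    """Split a CORS_ORIGINS value on commas or newlines, dropping blanks."""
    return [item.strip() for item in re.split(r"[,\n]", raw) if item.strip()]
```

A value such as "https://a.example.com, https://b.example.com" followed by a newline and "https://c.example.com" yields three separate origins.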

Provider TLS and Rendering Fixes

  • DISABLE_SSL_VERIFY now reaches OpenAI SDK paths - OpenAI-compatible,
    Azure OpenAI, executor, TutorBot, and legacy embedding SDK clients all receive
    a shared httpx.AsyncClient(verify=False) when the flag is enabled, fixing
    self-signed HTTPS LLM endpoints reported in #464.
  • Production still blocks unsafe TLS bypasses - ENVIRONMENT=prod or
    ENVIRONMENT=production rejects DISABLE_SSL_VERIFY, with a single warning
    logged in non-production use.
  • Code blocks keep array indexes intact - Markdown citation linkification now
    masks fenced and inline code before rewriting references, so values[0] stays
    code instead of becoming a #references citation link (#468).
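
The mask-then-rewrite approach for citations can be sketched as follows. It is written in Python for illustration even though the real renderer may live in the frontend; the regular expressions, placeholder scheme, and link format are assumptions.

```python
import re

# Fenced blocks first, then inline code spans.
_CODE = re.compile(r"```.*?```|`[^`\n]+`", re.DOTALL)
_CITATION = re.compile(r"\[(\d+)\]")
_PLACEHOLDER = re.compile(r"\x00(\d+)\x00")


def linkify_citations(markdown: str) -> str:
    """Turn [n] into a references link everywhere except inside code."""
    masked = []

    def stash(match):
        # Swap each code span for a placeholder the citation regex cannot match.
        masked.append(match.group(0))
        return f"\x00{len(masked) - 1}\x00"

    text = _CODE.sub(stash, markdown)
    text = _CITATION.sub(r"[\1](#references)", text)
    # Restore the original code spans untouched.
    return _PLACEHOLDER.sub(lambda m: masked[int(m.group(1))], text)
```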

Matrix Install Compatibility

  • Matrix no longer installs E2EE by default - the standard matrix extra and
    requirements/matrix.txt now use plain matrix-nio, avoiding the
    python-olm / libolm build failures seen on macOS Python 3.14 and Apple
    Clang 21 (#462).
  • Encrypted rooms are an explicit add-on - install deeptutor[matrix-e2e]
    or requirements/matrix-e2e.txt when E2EE support is needed and libolm is
    available.
  • Runtime failures are clearer - Matrix defaults to non-E2EE mode, and
    enabling E2EE without crypto dependencies now raises an actionable install
    message instead of failing at import time.

Multi-User Runtime Compatibility

  • Default workspace paths stay stable outside user scope - when no current
    multi-user context is active, path resolution falls back to the default data
    workspace rather than forcing an admin scope.
  • Legacy test and monkeypatch hooks remain available - session and settings
    routers keep compatibility shims used by tests and older integrations.
  • Local agent artifacts are ignored - .claude/ is now excluded from Git so
    local worktrees and agent metadata do not accidentally enter releases.

Tests

  • Added CORS setting tests for unauthenticated remote origins and authenticated
    explicit allowlists.
  • Added shared OpenAI SDK HTTP-client tests across provider-core, Azure,
    executors, TutorBot, and embedding adapters.
  • Added Markdown display tests for prose citations, fenced code, inline code,
    and explicit backticked citations.
  • Added Matrix dependency split tests to keep default installs free of
    matrix-nio[e2e].
  • Re-ran targeted Python tests, web node tests, Ruff checks, and diff whitespace
    validation for the release patch.

Upgrade Notes

  • If you run remote Docker with AUTH_ENABLED=false, no extra CORS setting is
    required for normal HTTP/HTTPS browser origins.
  • If you run a shared or authenticated deployment with AUTH_ENABLED=true, set
    CORS_ORIGIN or CORS_ORIGINS to the exact frontend origin(s), for example
    https://learn.example.com.
  • Use DISABLE_SSL_VERIFY=true only for local, self-signed, or air-gapped test
    LLM endpoints. It remains blocked in ENVIRONMENT=prod and
    ENVIRONMENT=production.
  • Matrix installs are now non-E2EE by default. For encrypted Matrix rooms,
    install .[matrix-e2e] or requirements/matrix-e2e.txt, ensure libolm is
    present, and set e2ee_enabled=true in the Matrix channel config.
  • If you previously installed .[matrix] only to get non-encrypted Matrix
    messaging, reinstalling after this release should no longer require native
    libolm build tooling.

Full Changelog: v1.3.9...v1.3.10

DeepTutor-v1.3.9

08 May 18:59

DeepTutor v1.3.9 Release Notes

Release Date: 2026.05.09

v1.3.9 builds on the v1.3.8 multi-user foundation with broader TutorBot
deployment options, safer provider routing for thinking models, and a smoother
web onboarding path. It adds Zulip and NVIDIA NIM support, improves startup
ergonomics, and folds in the main issue fixes reported after the last release.

Highlights

TutorBot Channel and Provider Expansion

  • Zulip is now a TutorBot channel - bots can listen to private messages and
    stream topics, enforce allow_from, choose mention-only or open stream
    replies, and bridge Zulip's event queue into the async TutorBot bus.
  • Math and files work better in Zulip - LaTeX is converted to Zulip-friendly
    KaTeX markup, upload/download calls use configurable retry behavior, and
    attachment filenames include upload-path digests to avoid collisions.
  • Zulip topics keep conversations separated - stream topics now become part
    of the chat/session key, with a stable (no topic) fallback for empty topics.
  • TutorBot supports NVIDIA NIM - nvidia_nim is available in TutorBot
    provider config and registry detection, including NIM's streaming behavior
    that omits unsupported stream_options.
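
The topic-scoped session key described above could look roughly like the following sketch. Only the (no topic) fallback comes from the release notes; the key layout and the function name are illustrative assumptions.

```python
from typing import Optional

NO_TOPIC = "(no topic)"  # stable fallback named in the release notes


def zulip_session_key(stream: str, topic: Optional[str]) -> str:
    """Build a per-topic session key (hypothetical 'zulip:<stream>:<topic>' layout)."""
    topic_part = (topic or "").strip() or NO_TOPIC
    return "zulip:{}:{}".format(stream, topic_part)
```

Two topics in the same stream then map to different sessions, while empty or whitespace-only topics share one predictable key.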

Model and Runtime Reliability

  • Configured context windows are respected - the safety ceiling is raised to
    1,000,000 tokens while the large-model fallback remains 65,536, so explicit
    128K-style model settings are no longer silently clamped.
  • Qwen vision detection is fixed - Qwen VL models are treated as
    vision-capable across DashScope, OpenAI-compatible, and custom bindings.
  • Minimal thinking mode is provider-safe - DeepSeek, DashScope, VolcEngine,
    BytePlus, and MiniMax no longer receive a rejected top-level
    reasoning_effort=minimal; DeepTutor sends the provider-specific disable
    signal instead.
  • DeepSeek v4 costs are tracked - research token accounting includes
    deepseek-v4-flash and deepseek-v4-pro pricing entries.
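
As a rough sketch of the context-window rule, the two numbers below come from the release notes, while the function name and the assumption that the fallback applies only when no valid window is configured are illustrative.

```python
from typing import Optional

CONTEXT_WINDOW_CEILING = 1_000_000  # safety ceiling from the release notes
LARGE_MODEL_FALLBACK = 65_536       # fallback from the release notes


def resolve_context_window(configured: Optional[int]) -> int:
    """Honor an explicit setting up to the ceiling; otherwise use the fallback."""
    if configured is None or configured <= 0:
        # Assumption: the fallback is used only when nothing valid is configured.
        return LARGE_MODEL_FALLBACK
    return min(configured, CONTEXT_WINDOW_CEILING)
```

Under this reading, an explicit 131,072-token setting passes through unchanged, which is the behavior the fix restores.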

Web and CLI Polish

  • deeptutor start launches the full web stack - the CLI now delegates to
    scripts/start_web.py so backend and frontend can be started from one
    command, and launcher failures propagate through the CLI exit code.
  • Sidebar onboarding is clearer - primary navigation icons now expose
    scoped, localized tooltips with descriptions and keyboard focus support.
  • Multi-line user messages stay readable - chat message rendering preserves
    Shift+Enter line breaks, fixing code blocks and structured prompts that were
    previously collapsed into one line.
  • Assigned resources are easier to understand - model-selection summaries
    and read-only knowledge-base actions now present clearer labels for
    non-admin, grant-scoped sessions.

Multi-User and Session Store Parity

  • Assigned model options match the selector contract - non-admin LLM choices
    now return profile names, model names, labels, and active/default metadata in
    the same shape expected by the web model selector.
  • PocketBase sessions support more chat flows - message metadata can be
    persisted, last-message lookup is available, and message deletion works with
    PocketBase string IDs as well as SQLite integer IDs.
  • Regenerate remains storage-neutral - turn retry logic can remove the last
    assistant message without assuming the backing session store uses integer
    primary keys.
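
A storage-neutral regenerate step can treat message IDs as opaque values. The sketch below uses an in-memory list of dicts as a stand-in for either backend; the record shape and function name are assumptions, not the real session-store API.

```python
from typing import List, Optional, Union

MessageId = Union[int, str]  # SQLite-style integers or PocketBase-style strings


def pop_last_assistant(messages: List[dict]) -> Optional[MessageId]:
    """Remove the most recent assistant message and return its ID, if any."""
    for index in range(len(messages) - 1, -1, -1):
        if messages[index].get("role") == "assistant":
            # The ID is returned as-is; no integer arithmetic or casting.
            return messages.pop(index)["id"]
    return None
```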

Tests

  • Added Zulip channel coverage for config parsing, permission checks, duplicate
    filtering, mentions, stream topic scoping, attachment extraction, retry
    behavior, LaTeX conversion, typing status, sending, uploads, and startup
    failures.
  • Added TutorBot NVIDIA NIM provider tests for registry detection, schema
    acceptance, and streaming request compatibility.
  • Added LLM regression tests for Qwen vision capability, explicit context-window
    budgets, and minimal-thinking provider kwargs.
  • Added CLI coverage so deeptutor start propagates the launcher exit code.
  • Added research token-pricing coverage for the DeepSeek v4 model entries.

Upgrade Notes

  • Install or refresh the .[tutorbot] extra, or requirements/tutorbot.txt, to
    include the new zulip>=0.8.0,<1.0.0 dependency before enabling Zulip bots.
  • Configure Zulip bots with site, email, apiKey, allowFrom, and
    groupPolicy; use mention for safer stream deployments and open only
    when every stream message should reach the bot.
  • If you use LLM_REASONING_EFFORT=minimal with DeepSeek, DashScope,
    VolcEngine, BytePlus, or MiniMax, keep the setting as-is; v1.3.9 translates it
    to the correct provider-specific disable payload.
  • Large configured context windows may now be honored instead of being capped
    at 65,536 tokens, so verify provider limits and expected prompt-cost behavior.
  • Optional PocketBase deployments should ensure the messages collection has a
    metadata_json JSON field before relying on regenerate/session metadata
    parity.

What's Changed

  • fix: raise context_window ceiling and add qwen vision support by @wedone in #442
  • fix: add deepseek-v4-flash and deepseek-v4-pro to model pricing table by @Starfie1d1272 in #447
  • fix(llm): stop sending reasoning_effort=minimal as top-level param to providers that reject it by @Starfie1d1272 in #453
  • feat: add deeptutor start command to launch backend and frontend together by @Starfie1d1272 in #445
  • fix(web): preserve newlines in user chat messages by @kagura-agent in #449
  • feat(tutorbot): add Zulip channel support by @wedone in #452
  • feat: tooltips for sidebar by @philliplagoc in #457
  • fix: add TutorBot NVIDIA NIM provider support by @Bortlesboat in #455

Full Changelog: v1.3.8...v1.3.9

DeepTutor-v1.3.8

07 May 18:37

DeepTutor v1.3.8 Release Notes

Release Date: 2026.05.08

v1.3.8 brings DeepTutor's optional multi-user mode into the main release line.
It keeps local single-user installs unchanged while adding authenticated shared
deployments with isolated user workspaces, admin-managed access, and clearer
deployment guidance.

Highlights

Multi-User Workspaces

  • Authentication can gate shared deployments - enabling AUTH_ENABLED
    adds login, registration, JWT sessions, and a first-user admin flow.
  • Each user gets isolated data - ordinary users work under
    multi-user/<uid>/ with separate chat history, memory, notebooks, and
    knowledge bases, while admins keep the main workspace.
  • Admin grants control access - /admin/users lets admins create users and
    assign allowed model profiles, knowledge bases, skills, and copied spaces
    without exposing API keys.

Safer Runtime Boundaries

  • Knowledge and RAG stay scoped - assigned knowledge bases are visible with
    badges, and non-admin RAG calls no longer fall back silently to admin data.
  • Model routing honors grants - non-admin chat turns use an assigned model
    profile and fail early if no LLM is available.
  • Settings are redacted for users - non-admin settings show theme, language,
    and model summaries, while provider secrets and endpoints remain admin-only.

Deployment and UI

  • Frontend auth routes are included - /login, /register, auth-aware
    middleware, logout controls, and admin navigation are wired into the web app.
  • Multi-user docs are now first-class - README and translated READMEs
    document setup, workspace layout, audit logs, env vars, and production
    caveats.
  • Optional PocketBase remains documented - PocketBase can still be used as a
    sidecar path, but true multi-user deployments should leave POCKETBASE_URL
    unset and use the built-in JSON/SQLite backend.

Tests

  • Added multi-user tests for identity migration, first-admin registration,
    grants, settings restrictions, scoped interface preferences, skill access, and
    RAG fallback prevention.
  • Added status-redaction coverage so non-admin users do not receive provider
    model or search endpoint details.

Upgrade Notes

  • Existing local installs stay in single-user mode unless AUTH_ENABLED=true.
  • For real multi-user deployments, set AUTH_ENABLED=true, keep
    POCKETBASE_URL blank, create the first admin through /register, and assign
    models before ordinary users start chat turns.
  • New deployment state is stored under multi-user/; back up both data/ and
    multi-user/ before upgrading shared instances.
  • Multi-worker deployments should bootstrap the first admin carefully because
    first-user promotion is protected only by an in-process lock, which is not
    shared across worker processes.
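
The multi-worker caveat is easier to see with a small sketch. The class below is hypothetical and uses threading.Lock purely for illustration; DeepTutor's real lock may be of a different kind.

```python
import threading


class FirstAdminRegistry:
    """Toy user registry where the first registrant becomes admin."""

    def __init__(self):
        self._lock = threading.Lock()  # lives in this process only
        self.roles = {}

    def register(self, username: str) -> str:
        # The check-and-insert must be atomic, or two concurrent
        # registrations could both see an empty registry.
        with self._lock:
            role = "admin" if not self.roles else "user"
            self.roles[username] = role
            return role
```

Within one process the lock guarantees exactly one admin. Separate worker processes each hold their own lock object, so two workers could in principle promote different first users unless the first admin is created before traffic is opened.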

Full Changelog: v1.3.7...v1.3.8

DeepTutor-v1.3.7

04 May 08:22

DeepTutor v1.3.7 Release Notes

Release Date: 2026.05.04

v1.3.7 focuses on thinking-model compatibility, clearer knowledge-base index
history, and safer Co-Writer editing. It keeps provider-specific reasoning
output under control while making index activity easier to understand in the UI.

Highlights

Thinking-Model and Gateway Compatibility

  • Reasoning output stays separate - OpenAI-compatible and TutorBot providers
    keep reasoning_content out of visible answer text, and streaming avoids
    replaying internal scratchpad as final content.
  • DeepSeek thinking can be configured from .env - LLM_REASONING_EFFORT
    is documented and applied through the resolver path. Use minimal to disable
    DeepSeek thinking, or high / max to enable it.
  • Custom gateway headers are preserved - chat and explicit LLM calls inherit
    profile extra_headers, fixing gateways that require custom headers such as
    a User-Agent override.
  • Structured generation is more tolerant - book blocks and question ideation
    now handle fenced, repaired, list-shaped, or otherwise imperfect JSON outputs
    more reliably.

Knowledge Index Visibility

  • Index activity is recorded - create, upload, and re-index flows now store
    last_indexed_at, indexed document count, and the index action in knowledge
    metadata.
  • Progress payloads describe real index changes - backend status updates can
    distinguish metadata-only completion from an actual vector-index update.
  • The Knowledge UI shows index history - detail, settings, and index-version
    panels display the latest index time and document count when available.

Co-Writer Editing Safety

  • Clear and template actions ask first - replacing a non-empty draft now
    opens a confirmation dialog before the editor is cleared or overwritten.
  • Undo is more dependable - pending typing snapshots are committed before
    toolbar edits, and editor shortcuts support Ctrl/Cmd+Z, Shift+Cmd+Z, and
    Ctrl/Cmd+Y.
  • Toolbar controls are clearer - destructive and template actions now have
    distinct tones, focus states, labels, and accessible tooltips.

Tests

  • Added OpenAI-compatible provider tests to keep reasoning_content separate
    from visible response content in both service and TutorBot paths.
  • Expanded LLM factory tests for inherited extra_headers, inherited
    reasoning_effort, and reasoning-only streaming behavior.
  • Added knowledge manager coverage for recording last_indexed_* metadata only
    when the index actually changes.

Upgrade Notes

  • Set LLM_REASONING_EFFORT in .env if you need global thinking control.
    Leave it empty to let DeepTutor auto-detect behavior from the active model.
  • Knowledge-base metadata may now include last_indexed_at,
    last_indexed_count, and last_indexed_action.
  • Co-Writer clear/template actions are recoverable through undo until the user
    leaves the current draft.

Full Changelog: v1.3.6...v1.3.7

DeepTutor-v1.3.6

03 May 08:40

DeepTutor v1.3.6 Release Notes

Release Date: 2026.05.03

v1.3.6 focuses on making model routing explicit across DeepTutor. Users can
choose configured LLM profiles from chat and TutorBot flows, runtime services
resolve those choices without leaking provider secrets, and RAG/knowledge-base
index handling is more defensive when persisted embeddings are invalid.

Highlights

Catalog-Based Model Selection

  • Chat can target a configured model - unified chat turns now carry a
    profile_id and model_id selection through the WebSocket payload, session
    preferences, turn snapshots, and regenerate flows.
  • Settings exposes safe LLM options - the new settings options endpoint
    returns display-ready provider/model choices while omitting credentials and
    connection secrets from the response.
  • Runtime model overrides are scoped per turn - selected profiles are
    resolved through the provider catalog for the active request without writing
    temporary choices back to disk or changing global defaults.
  • Model-selector UI is shared - chat and TutorBot screens use the same
    configured-model selector, with localized labels and system-default handling.

TutorBot Model Control

  • Bots can persist model selections - TutorBot create/update flows now accept
    llm_selection, validate it against the configured catalog, and store it with
    each bot.
  • Running bots can reload their LLM - changing a bot's model updates the
    active agent loop instead of requiring a full bot restart.
  • Recent bot history is steadier - TutorBot history assembly now sorts by
    message timestamp with stable tie-breaking before taking the latest context.
  • Bot chat route changes are cleaner - the web chat page cancels in-flight
    bot requests and resets transient reasoning state when switching bots.
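
The history-ordering fix can be sketched as a sort on the timestamp with the original position as an explicit tie-breaker. The message shape (a dict with a timestamp field) and the function name are assumptions for illustration.

```python
from typing import List


def latest_context(messages: List[dict], limit: int) -> List[dict]:
    """Return the newest `limit` messages in chronological order."""
    if limit <= 0:
        return []
    # Sort by (timestamp, original position) so equal timestamps keep
    # their arrival order instead of depending on incidental ordering.
    ordered = sorted(enumerate(messages), key=lambda pair: (pair[1]["timestamp"], pair[0]))
    return [message for _, message in ordered][-limit:]
```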

RAG and Knowledge Reliability

  • Invalid vectors trigger rebuilds - re-indexing no longer treats a matching
    document signature as reusable when the existing vector store fails embedding
    validation.
  • Full rebuilds use fresh version directories - complete knowledge-base
    rebuilds write to a new flat index version while leaving failed old storage
    available for inspection.
  • RAG tool logs can stream to clients - retrieval runs can forward captured
    INFO-level process logs as raw tool events when an event sink is available.
  • Knowledge health checks recognize bad embeddings - invalid persisted
    vectors are surfaced earlier instead of producing opaque search failures.

Provider and Launch Fixes

  • OpenAI Responses token limits are normalized - Responses API calls now map
    chat-style max_completion_tokens and max_tokens to max_output_tokens,
    fixing the SDK error reported for newer OpenAI models in #437.
  • Azure and OpenAI-compatible paths share the mapping - both streaming and
    non-streaming Responses API routes use the same conversion helper.
  • Launch ports come from .env and environment variables - setup and launch
    helpers now keep backend/frontend port behavior aligned around the project
    .env file instead of the older runtime settings JSON.
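
The token-limit normalization can be sketched as a small pure function. The parameter names come from the release notes; the precedence order (max_output_tokens, then max_completion_tokens, then max_tokens) and the function name are assumptions, since the notes mention precedence tests without stating the order.

```python
from typing import Any, Dict


def to_responses_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Map chat-style token limits to max_output_tokens without mutating input."""
    # Copy while dropping None values, so the caller's dict is untouched.
    out = {key: value for key, value in kwargs.items() if value is not None}
    explicit = out.pop("max_output_tokens", None)
    completion = out.pop("max_completion_tokens", None)
    legacy = out.pop("max_tokens", None)
    # Assumed precedence: explicit Responses name first, then newer alias, then legacy.
    for candidate in (explicit, completion, legacy):
        if candidate is not None:
            out["max_output_tokens"] = candidate
            break
    return out
```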

Web UX Polish

  • Skill names validate before save - the Skills editor slugifies names,
    flags invalid input inline, and prevents silent API failures for uppercase
    letters, spaces, underscores, or other unsupported characters.
  • Skill editor modals are opaque across themes - the editor now uses the
    page background token, avoiding text bleed-through in translucent themes.
  • Space navigation is easier to scan - Space mini-navigation, notebook,
    question-bank, skills, and session-list spacing were tightened with clearer
    card and divider treatment.

Tests

  • Added model-selection service tests for safe option listing, active markers,
    invalid profile/model rejection, and non-mutating catalog overrides.
  • Added unified WebSocket turn-runtime tests for persisted LLM selections,
    invalid selections, model switching, snapshots, and regenerate behavior.
  • Added TutorBot API and manager tests for llm_selection persistence,
    validation, runtime reload, and default-model behavior.
  • Added settings, provider-runtime, and LLM-config tests for scoped catalog
    selection and per-turn config precedence.
  • Added RAG and knowledge-router tests for invalid vector stores, re-index
    rebuild decisions, and storage version resolution.
  • Added OpenAI Responses converter tests for token-limit aliases, precedence,
    None filtering, and input immutability.
  • Added frontend slug tests for skill-name normalization and validation.

Upgrade Notes

  • Chat and TutorBot clients that want explicit model routing should send
    llm_selection as { "profile_id": "...", "model_id": "..." }. Omitting it
    continues to use the configured system default.
  • TutorBot configuration files may now contain llm_selection. Existing bot
    configs without that field continue to load, and legacy model values remain
    usable as model-name overrides.
  • Launch ports should be configured in .env or process environment variables
    (BACKEND_PORT / FRONTEND_PORT). The old data/user/settings/env.json
    port block is no longer used as a launch-port source.
  • Knowledge bases with stale or invalid persisted vectors may rebuild on the
    next re-index even when document signatures have not changed.
  • Skill names are now normalized and validated as lowercase slugs of up to 64
    characters using letters, numbers, and hyphens.
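
The skill-name rule in the last note can be approximated as follows. The 64-character limit and the allowed character set come from the note; the exact normalization steps, the ban on leading or trailing hyphens, and the function names are assumptions (the shipped validator is a frontend helper and may differ).

```python
import re

MAX_SLUG_LENGTH = 64
# Lowercase letters and digits, with single hyphens between groups.
SLUG_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def slugify_skill_name(name: str) -> str:
    """Lowercase the name and collapse unsupported characters into hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.strip().lower()).strip("-")
    return slug[:MAX_SLUG_LENGTH].rstrip("-")


def is_valid_skill_name(name: str) -> bool:
    """Check a candidate name against the assumed slug rule."""
    return len(name) <= MAX_SLUG_LENGTH and SLUG_PATTERN.fullmatch(name) is not None
```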

Full Changelog: v1.3.5...v1.3.6

DeepTutor-v1.3.5

02 May 12:16

DeepTutor v1.3.5 Release Notes

Release Date: 2026.05.02

v1.3.5 focuses on making local setup and knowledge-base chat more reliable. The
launcher now follows the same runtime settings users configure in the web app,
RAG tool calls are stricter about real search queries, and local embedding
servers no longer receive placeholder auth headers.

Highlights

Smoother Local Launch

  • Setup Tour writes launch ports - the guided installer now records backend
    and frontend ports in data/user/settings/env.json, so later launches can use
    the same choices.
  • start_web.py reads runtime settings first - backend/frontend ports and UI
    language come from web settings when available, with .env kept as fallback.
  • Cleaner process handling - the launcher records started processes, detects
    port conflicts, waits for readiness, and exposes scripts/stop_web.py for
    cleaning up recorded backend/frontend processes.
  • Setup requirements are clearer - README and environment examples now align
    around Node.js 20.9+, install profiles, complete embedding endpoint URLs, and
    optional attachment storage.

More Reliable RAG Tool Calls

  • RAG queries must be non-empty - tool schemas, prompts, and built-in checks
    now reject blank queries early instead of passing empty input into retrieval.
  • Chat-side fallback is safer - when a model omits the RAG query, the agentic
    pipeline can reuse the user's actual question as the retrieval query.
  • ReAct calls accept simple string input - rag actions that provide a
    string are normalized to {"query": ...}, reducing fragile tool-call failures.
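
Taken together, the three rules above amount to a small normalization step in front of retrieval. The sketch below is illustrative only; the function name, argument names, and error message are assumptions.

```python
from typing import Any, Dict, Optional, Union


def normalize_rag_args(
    action_input: Union[str, Dict[str, Any], None],
    user_question: Optional[str] = None,
) -> Dict[str, Any]:
    """Normalize a rag tool call into {"query": ...} with a non-empty query."""
    if isinstance(action_input, str):
        # ReAct-style plain string input becomes a query dict.
        args = {"query": action_input}
    else:
        args = dict(action_input or {})
    query = str(args.get("query") or "").strip()
    if not query and user_question:
        # Chat-side fallback: reuse the user's actual question.
        query = user_question.strip()
    if not query:
        # Blank queries fail fast instead of reaching retrieval.
        raise ValueError("RAG query must be a non-empty string")
    args["query"] = query
    return args
```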

Local Embedding Compatibility

  • No fake API key for local embedding providers - runtime config no longer
    injects sk-no-key-required for local embedding servers.
  • Placeholder keys are not sent as auth headers - OpenAI-compatible
    embedding requests suppress Authorization and api-key when the configured
    key is the local placeholder, which helps LM Studio, Ollama, vLLM, and similar
    servers.
  • Embedding examples are easier to follow - English and Chinese sample env
    files now explain that EMBEDDING_HOST is the exact endpoint DeepTutor calls.
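
Header suppression for the local placeholder key might look like the sketch below. The placeholder value and the two header names come from the notes above; the function name and the azure_style switch are hypothetical.

```python
from typing import Dict, Optional

LOCAL_PLACEHOLDER_KEY = "sk-no-key-required"


def build_embedding_headers(api_key: Optional[str], azure_style: bool = False) -> Dict[str, str]:
    """Build request headers, omitting auth for empty or placeholder keys."""
    headers = {"Content-Type": "application/json"}
    key = (api_key or "").strip()
    if not key or key == LOCAL_PLACEHOLDER_KEY:
        # Local servers get no Authorization or api-key header at all.
        return headers
    if azure_style:
        headers["api-key"] = key
    else:
        headers["Authorization"] = "Bearer {}".format(key)
    return headers
```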

Web UX Polish

  • Dark-mode provider dropdown is readable - the Settings provider selector
    now uses the theme background token, fixing the white native dropdown popover
    reported on Edge/Chromium.
  • Settings controls are more consistent - select fields and setup tour
    spotlight behavior were tightened for a steadier settings experience.
  • Book reference payloads are normalized more defensively - selected book
    references keep the same behavior with cleaner filtering and deduplication.

Tests

  • Added launch settings tests for runtime settings precedence, .env fallback,
    and invalid-port handling.
  • Added start_web.py tests for translation, state persistence, and recorded
    process matching.
  • Added Setup Tour coverage for dependency profiles, Math Animator selection,
    Node.js version validation, and saved launch ports.
  • Added RAG/tool tests for non-empty query schemas, blank-query rejection, and
    fallback query behavior.
  • Added embedding runtime and adapter tests for local providers, placeholder API
    keys, and auth header suppression.

Upgrade Notes

  • Local web installs now require Node.js 20.9 or newer.
  • start_web.py and setup helpers prefer data/user/settings/env.json and
    interface.json over .env; edit the web Settings page or rerun
    start_tour.py when changing launch ports.
  • Local OpenAI-compatible embedding servers should use an empty API key unless a
    real key is required. Avoid relying on sk-no-key-required as a transmitted
    credential.
  • Custom RAG callers should always provide a non-empty query; blank queries now
    fail fast by design.

Full Changelog: v1.3.4...v1.3.5