revert: undo unconditional chdir that breaks cd persistence (#1400) by TimeToBuildBob · Pull Request #1487 · gptme/gptme

TimeToBuildBob · 2026-02-25T13:03:22Z

Summary

Reverts d2cef94 ("fix(server): always chdir to workspace at step start (#1400)").

That commit changed the server to unconditionally os.chdir(workspace) at the start of every step. This breaks cd persistence between tool calls — any cd a user or agent does in a shell step gets silently reset before the next step runs.

Fixes #1486.

What this restores

The previous behavior: only chdir to the workspace when the conversation has one or fewer user messages (i.e. the initial step). After that, the working directory set by tools is preserved across steps.

Context

The unconditional chdir was intended to fix conversations not starting in the right directory, but it introduced a worse regression: directory changes no longer stick. The original conditional approach handled the "start in the right place" case without clobbering subsequent cd calls.

A proper fix for concurrent conversations (where process-global os.chdir is inherently racy) would require threading the workspace path through the tool chain, but that's a larger refactor tracked separately.

Important

Reverts unconditional os.chdir in step() to restore cd persistence between steps in api_v2_sessions.py.

Behavior:
- Reverts unconditional os.chdir(workspace) at every step start in step() in api_v2_sessions.py.
- Restores behavior to only chdir to workspace if conversation has one or fewer user messages.
Context:
- Fixes regression where cd persistence was broken between steps.
- Original change was to ensure correct start directory but caused worse issues.
- Proper fix for concurrent conversations requires larger refactor.

^{This description was created by}^{for 9d4ac74. You can customize this summary. It will automatically update as commits are pushed.}

This reverts commit d2cef94.

ellipsis-dev

Important

Looks good to me! 👍

Reviewed everything up to 9d4ac74 in 19 seconds. Click for details.

Reviewed 32 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 0 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

Workflow ID: wflow_LMq4z8xztisbVAKb

^{You can customize}^{by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.}

greptile-apps · 2026-02-25T13:06:10Z

Greptile Summary

This PR reverts commit d2cef94 which made os.chdir(workspace) run unconditionally at the start of every step in the server's step() function. That unconditional chdir broke cd persistence — any directory change made by the shell tool (via os.chdir in shell.py) was silently reset before the next step.

The restored behavior:

Initial step (≤1 user message): os.chdir(workspace) runs to ensure the conversation starts in the correct directory
Subsequent steps (>1 user message): the working directory set by tools (e.g., cd commands) is preserved across steps

This aligns with the CLI behavior in chat.py, which only calls os.chdir(workspace) once at session start.

The existing TODO comment acknowledges the limitation: os.chdir() is process-global, so concurrent conversations can still interfere with each other's working directory. A proper fix (threading workspace through the tool chain) is tracked separately.
No new logic is introduced — this is a clean revert to the prior conditional approach.

Confidence Score: 5/5

This PR is a clean, minimal revert that restores previously working behavior. Safe to merge.
Score reflects that this is a simple revert of a single commit in one file. The restored code was the previous production behavior, the logic is sound and consistent with the CLI path (chat.py), and the change fixes a clear regression (broken cd persistence). No new code or logic is introduced.
No files require special attention.

Important Files Changed

Filename	Overview
gptme/server/api_v2_sessions.py	Reverts unconditional `os.chdir(workspace)` to conditional form (only when ≤1 user message), restoring `cd` persistence between steps. Clean revert with no new issues.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Server as step()
    participant OS as os.chdir()
    participant Shell as shell tool

    Note over Server: Conversation start (≤1 user msg)
    Client->>Server: POST /step (1st user message)
    Server->>OS: os.chdir(workspace)
    Note over OS: CWD = /workspace
    Server->>Shell: Execute tool (e.g., cd /project)
    Shell->>OS: os.chdir("/project")
    Note over OS: CWD = /project

    Note over Server: Subsequent step (>1 user msg)
    Client->>Server: POST /step (2nd+ user message)
    Note over Server: Skip os.chdir (preserve CWD)
    Note over OS: CWD still = /project ✓
    Server->>Shell: Execute tool (uses /project)

_{Last reviewed commit: 9d4ac74}

greptile-apps

_{1 file reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

gptme/server/api_v2_sessions.py

codecov · 2026-02-25T15:19:44Z

Codecov Report

❌ Patch coverage is 0% with 4 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
gptme/server/api_v2_sessions.py	0.00%	4 Missing ⚠️

📢 Thoughts on this report? Let us know!

- Replace os.chdir(repo_dir) with cwd=repo_dir in subprocess calls to preserve working directory persistence (per revert in #1487) - Consolidate duplicate 'from datasets import' into single import line

- Replace os.chdir(repo_dir) with cwd=repo_dir in subprocess calls to preserve working directory persistence (per revert in gptme#1487) - Consolidate duplicate 'from datasets import' into single import line

@ErikBjare

…424) (#1489) * feat(eval): add SWE-bench evaluation modules Combines work from two WIP branches into a single recoverable base: - **gptme/eval/swebench/**: Core SWE-bench evaluation framework (originally by @ErikBjare, PR #142) - Instance loading, repository setup utilities - Evaluation runner with gptme agent integration - CLI entry point (`gptme-eval-swebench`) - **gptme/eval/swe_extra/**: SWE-bench setup scripts and data analysis (originally by @bjsi, PR #424) - Setup scripts for running SWE-bench evals - Data loading and analysis utilities (top-50 easiest instances) - Test specification generation - SWE-bench constants and harness integration Both modules are WIP and require further integration work. Key known limitations: - `swe_extra` imports `SWEBenchInfo` from logmanager (not yet in main) - `swe_extra` references `SWEBenchAgent` (not yet merged) - `datasets` and `swebench` added as optional eval dependencies Adds mypy overrides for WIP modules to defer type checking. Co-authored-by: bjsi <44608421+bjsi@users.noreply.github.com> * fix(eval): remove os.chdir() and duplicate import in swebench utils - Replace os.chdir(repo_dir) with cwd=repo_dir in subprocess calls to preserve working directory persistence (per revert in #1487) - Consolidate duplicate 'from datasets import' into single import line * fix(eval): address greptile review - fix repo_dir in files dict and missing model arg - evaluate.py: replace incorrect {"repo_dir": repo_dir} Files dict with proper prompt-embedded context; add missing log_dir/workspace_dir to EvalResult instances - run_swe_extra.py: add --model CLI arg so cli() can call main() correctly for non-resume evaluations * fix(eval): address greptile review feedback on swebench module * fix(eval): copy repo to agent workspace and capture diff via git * fix(eval): enable mypy for swebench modules per review Remove WIP mypy ignore_errors for gptme.eval.swebench.* — code is clean. Keep ignore_errors for gptme.eval.swe_extra.* (depends on unmerged SWEBenchInfo). * feat(eval): integrate SWEBenchInfo/SWEBenchAgent and fix swe_extra for current swebench API Per review, integrate bjsi's SWEBenchInfo and SWEBenchAgent code from PR #424: - Add SWEBenchInfo frozen dataclass to logmanager.py for evaluation metadata - Add SWEBenchAgent class in eval/agents/swebench.py with multi-stage pipeline - Convert eval/agents.py to package (eval/agents/__init__.py + swebench.py) - Update swe_bench_constants.py for current swebench library API - Fix swe_bench_test_spec.py imports for restructured swebench package - Fix type errors in swe_bench_extra_data.py and run_swe_extra.py - Add matplotlib to mypy ignore_missing_imports - Remove WIP ignore_errors override for swe_extra modules * fix(eval): restore cwd after os.chdir in SWEBenchAgent.replay() * refactor(eval): move SWEBenchInfo from logmanager to eval/swebench/info.py Per Erik's review: SWEBenchInfo doesn't belong in logmanager.py, move it into the eval/swebench module where it belongs. - New: gptme/eval/swebench/info.py with SWEBenchInfo dataclass - Updated gptme/eval/swebench/__init__.py to export SWEBenchInfo - Removed SWEBenchInfo from gptme/logmanager.py - Updated all imports in eval/agents/swebench.py, swe_extra/{run_swe_extra, swe_bench_test_spec,swe_bench_extra_data}.py --------- Co-authored-by: TimeToBuildBob <TimeToBuildBob@users.noreply.github.com> Co-authored-by: bjsi <44608421+bjsi@users.noreply.github.com>

Revert "fix(server): always chdir to workspace at step start (#1400)"

9d4ac74

This reverts commit d2cef94.

ellipsis-dev bot reviewed Feb 25, 2026

View reviewed changes

TimeToBuildBob mentioned this pull request Feb 25, 2026

fix(server): always chdir to workspace at step start #1400

Merged

4 tasks

greptile-apps bot reviewed Feb 25, 2026

View reviewed changes

ErikBjare reviewed Feb 25, 2026

View reviewed changes

gptme/server/api_v2_sessions.py Show resolved Hide resolved

Apply suggestions from code review

f82fa8c

ErikBjare merged commit e90f9e5 into master Feb 25, 2026
8 checks passed

This was referenced Feb 25, 2026

feat(eval): add SWE-bench evaluation modules (recovered from #142 and #424) #1489

Merged

bug(server): unconditional chdir breaks cd persistence between steps #1486

Closed

TimeToBuildBob deleted the revert-chdir-unconditional branch February 28, 2026 21:05

ErikBjare mentioned this pull request Mar 22, 2026

fix(server): make shell cwd thread-safe for concurrent sessions #1749

Merged

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

revert: undo unconditional chdir that breaks cd persistence (#1400)#1487

revert: undo unconditional chdir that breaks cd persistence (#1400)#1487
ErikBjare merged 2 commits intomasterfrom
revert-chdir-unconditional

TimeToBuildBob commented Feb 25, 2026 •

edited by ellipsis-dev bot

Loading

Uh oh!

ellipsis-dev bot left a comment

Uh oh!

greptile-apps bot commented Feb 25, 2026

Uh oh!

greptile-apps bot left a comment

Uh oh!

Uh oh!

Uh oh!

codecov bot commented Feb 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

TimeToBuildBob commented Feb 25, 2026 • edited by ellipsis-dev bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What this restores

Context

Uh oh!

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot commented Feb 25, 2026

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

codecov bot commented Feb 25, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

TimeToBuildBob commented Feb 25, 2026 •

edited by ellipsis-dev bot

Loading