SWE-Effi

Note

Our evaluation script will be released soon!

A comprehensive benchmark evaluation platform for Software Engineering Efficiency across different AI scaffolds and models.

📊 Overview

SWE-Effi provides a standardized platform for evaluating and comparing AI-powered software engineering tools across different scaffolds and language models. Our platform aggregates benchmark results and presents them through an interactive web interface.

🌐 Visit the Live Platform
📝 Submit Your Results

📁 Repository Structure

```
SWE-Effi
├── benchmark
│   └── results
│       └── agent-scaffold-stats
│           ├── agentless/
│           │   ├── GPT-4o-mini-2024-07-18/
│           │   │   ├── combined_stats.json
│           │   │   └── summary_stats.json
│           │   └── qwen3-32B/
│           │       ├── combined_stats.json
│           │       └── summary_stats.json
│           ├── agentless-mini/
│           ├── auto-code-rover/
│           ├── openhands/
│           └── swe-agent/
├── scripts/
│   ├── transform-benchmark.py      # data transformation
│   └── update-website.sh           # easy update script
└── website/
    ├── public/
    │   └── data/
    │       └── benchmark/
    │           └── raw/            # benchmark data
    │               └── summary/    # benchmark data
    └── src/
        └── docs/
            ├── about/
            └── index.tsx
```
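
The `agent-scaffold-stats` layout above is one directory per scaffold, each holding one directory per model. As a rough illustration only, the sketch below walks that layout and lists the scaffold/model pairs it finds. The function name `list_pairs` is hypothetical and is not part of the repository's scripts.

```python
from pathlib import Path


def list_pairs(stats_root):
    """Return sorted (scaffold, model) pairs found under stats_root.

    Assumes the <scaffold>/<model>/ layout shown in the tree above.
    Returns an empty list if stats_root does not exist.
    """
    root = Path(stats_root)
    if not root.is_dir():
        return []
    pairs = []
    for scaffold_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for model_dir in sorted(p for p in scaffold_dir.iterdir() if p.is_dir()):
            pairs.append((scaffold_dir.name, model_dir.name))
    return pairs


if __name__ == "__main__":
    # Run from the repository root.
    for scaffold, model in list_pairs("benchmark/results/agent-scaffold-stats"):
        print(f"{scaffold}/{model}")
```

Scaffold directories that contain no model subdirectories simply contribute no pairs.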

🚀 Quick Start

For Contributors

Want to submit your benchmark results? Follow our submission guide →

For Developers & Maintainers

Clone the repository:

```
git clone https://github.com/your-org/swe-effi.git
cd swe-effi
```

Process benchmark data:

```
# Process all new benchmark data
./scripts/update-website.sh --auto

# Process specific scaffold/model
./scripts/update-website.sh agentless gpt-4

# Validate files before processing
./scripts/update-website.sh --validate-only
```

Run the website locally:
```
cd website
npm install
npm run dev
```

🛠 Development Workflow

Processing New Submissions

When contributors submit benchmark results via PR:

1. Review the Pull Request for correctness

2. Validate locally (optional):

   ```
   git checkout [pr-branch]
   python3 scripts/transform-benchmark.py --validate-only
   ```

3. Merge the PR

4. Update the website:

   ```
   ./scripts/update-website.sh --auto
   ```

Script Reference

update-website.sh options:

--auto: Process all available data automatically
--validate-only: Only validate files, don't transform
--verbose: Show detailed logs
--help: Show help information

transform-benchmark.py options:

--scaffold NAME --model NAME: Process specific combination
--validate-only: Only validate file format
--auto: Auto process all data with validation
--verbose: Show detailed logs
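
To make the flag combinations concrete, here is a minimal `argparse` sketch that mirrors the documented `transform-benchmark.py` options. It is not the actual script. The rule that `--scaffold` and `--model` must be passed together is an assumption drawn from the "specific combination" wording above, and the names `build_parser` and `parse_args` are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser exposing the flags listed in the script reference."""
    parser = argparse.ArgumentParser(prog="transform-benchmark.py")
    parser.add_argument("--scaffold", metavar="NAME", help="scaffold to process")
    parser.add_argument("--model", metavar="NAME", help="model to process")
    parser.add_argument("--validate-only", action="store_true",
                        help="only validate file format")
    parser.add_argument("--auto", action="store_true",
                        help="process all data with validation")
    parser.add_argument("--verbose", action="store_true",
                        help="show detailed logs")
    return parser


def parse_args(argv=None):
    """Parse argv (defaults to sys.argv[1:]) and check flag pairing."""
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: --scaffold and --model select one combination,
    # so passing only one of them is treated as a usage error.
    if (args.scaffold is None) != (args.model is None):
        parser.error("--scaffold and --model must be given together")
    return args


# Example: validate a single scaffold/model combination.
args = parse_args(["--scaffold", "agentless", "--model", "qwen3-32B", "--validate-only"])
print(args.scaffold, args.model, args.validate_only)
```

`argparse` converts `--validate-only` to the attribute `validate_only`, which is why the example reads `args.validate_only`.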

🔧 Technical Requirements

Prerequisites

Python 3 for data processing
Node.js and npm for the website

Environment Setup

```
cd website && npm install
```

🤝 Contributing

Submit Benchmark Results Data Flow

Contributor Results → PR Submission → Validation → Processing → Website Integration

Results Collection: Contributors submit via GitHub PRs
Validation: Automated checks ensure data quality
Processing: Scripts transform data for website consumption
Integration: Website automatically displays new results

File Format

Results must include:

combined_stats.json
summary_stats.json
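
Before opening a PR, a contributor could check a results directory with something like the sketch below. It only confirms that both required files exist and parse as JSON. The schema of each file is not described here, so no field-level checks are attempted, and the function name `validate_submission` is hypothetical rather than taken from the repository's scripts.

```python
import json
from pathlib import Path

REQUIRED_FILES = ("combined_stats.json", "summary_stats.json")


def validate_submission(model_dir):
    """Return a list of problems found in model_dir; empty means the check passed."""
    problems = []
    for name in REQUIRED_FILES:
        path = Path(model_dir) / name
        if not path.is_file():
            problems.append(f"{name}: missing")
            continue
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except ValueError as exc:
            # json.JSONDecodeError and UnicodeDecodeError both subclass ValueError.
            problems.append(f"{name}: not valid JSON ({exc})")
    return problems


if __name__ == "__main__":
    # Example path taken from the repository structure; run from the repository root.
    target = "benchmark/results/agent-scaffold-stats/agentless/qwen3-32B"
    for problem in validate_submission(target):
        print(problem)
```

Returning a list of problems instead of raising on the first one lets a reviewer see every issue in a submission at once.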

📄 License

Apache License 2.0
