A curated list of frameworks, tools, platforms, and resources for AI coding agents — systems that can write, modify, test, and manage code with varying levels of autonomy.
- Agent Frameworks
- Coding Agents & Platforms
- IDE Integrations
- Agent Infrastructure
- Testing & Evaluation
- Datasets & Benchmarks
- Research & Papers
## Agent Frameworks

Frameworks and libraries for building AI agents capable of multi-step reasoning, tool use, and code execution.
- LangChain — Framework for building LLM-powered applications with tools, memory, and agents.
- LangGraph — Stateful, graph-based orchestration for building reliable AI agents.
- AutoGen — Multi-agent conversation framework for complex task execution and collaboration.
- CrewAI — Role-based multi-agent system for orchestrating collaborative AI workflows.
- Semantic Kernel — SDK for integrating AI models with plugins, memory, and planning.
- OpenAgents — Open platform for building general-purpose AI agents.
- Haystack Agents — Modular framework for building search and agent pipelines.
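Most of these frameworks abstract the same core loop: the model chooses a tool, the runtime executes it, and the observation is fed back until the task is done. A minimal sketch of that loop in plain Python, with a stubbed model standing in for a real LLM — all names here are illustrative, not any listed framework's actual API:

```python
# Minimal tool-calling agent loop. The "model" is a stub that emits
# (tool_name, argument) decisions; a real agent would query an LLM here.

def calculator(expr: str) -> str:
    # Illustrative tool: evaluate a simple arithmetic expression safely-ish.
    return str(eval(expr, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def stub_model(task: str, history: list) -> tuple:
    # Stand-in for an LLM policy: call the calculator once, then finish
    # with the last observation as the answer.
    if not history:
        return ("calculator", task)
    return ("finish", history[-1])

def run_agent(task: str, max_steps: int = 5) -> str:
    history = []
    for _ in range(max_steps):
        tool, arg = stub_model(task, history)
        if tool == "finish":
            return arg
        observation = TOOLS[tool](arg)  # execute the chosen tool
        history.append(observation)     # feed the result back
    return "max steps reached"

print(run_agent("2 + 3 * 4"))  # → 14
```

The frameworks above differ mainly in how they structure this loop: LangGraph makes the state transitions an explicit graph, while AutoGen and CrewAI distribute the steps across conversing agents.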
## Coding Agents & Platforms

End-to-end systems focused on autonomous or semi-autonomous coding, task execution, and software development.
- OpenDevin — Open-source platform for autonomous software engineering agents (since renamed OpenHands).
- Devika — AI software engineer that plans, writes, and executes code tasks.
- SWE-agent — Language model agent for fixing issues in real GitHub repositories.
- GPT Engineer — Generates entire codebases from prompts with iterative refinement.
- Smol Developer — Minimal agent framework for generating and refining software projects.
- MetaGPT — Multi-agent software company simulation for structured code generation.
- AgentCoder — Multi-agent coding system with planning, coding, and testing roles.
## IDE Integrations

Tools and extensions that bring AI coding agents directly into development environments.
- Cursor — AI-first code editor with integrated agent capabilities.
- Continue — Open-source AI coding assistant for VS Code and JetBrains.
- Codeium — AI coding assistant with autocomplete and chat capabilities.
- Tabnine — AI code completion tool with privacy-focused deployment options.
- GitHub Copilot — AI pair programmer integrated into popular IDEs.
## Agent Infrastructure

Core components and tooling required to build, run, and scale AI coding agents.
- OpenAI API — API for accessing advanced language models and tool use capabilities.
- vLLM — High-performance inference engine for LLMs.
- Ollama — Local model runtime for running LLMs on personal machines.
- Ray — Distributed execution framework for scaling AI workloads.
- Modal — Serverless infrastructure for AI workloads and agents.
## Testing & Evaluation

Tools and frameworks for validating agent performance, reliability, and correctness.
- LangSmith — Observability and evaluation platform for LLM applications and agents.
- Promptfoo — Tool for testing and evaluating LLM prompts and outputs.
- DeepEval — Evaluation framework for LLM systems and agents.
- HELM — Benchmarking framework for evaluating language models.
- Ragas — Evaluation toolkit for RAG systems and agent outputs.
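Under the hood, most of these tools run model outputs against expected answers and aggregate the scores. A minimal exact-match harness as a sketch — the metric and report structure are illustrative, not any listed tool's API:

```python
# Minimal evaluation harness: run each case, score exact match, aggregate.

def evaluate(model, cases: list) -> dict:
    results = [model(case["input"]) == case["expected"] for case in cases]
    return {
        "passed": sum(results),
        "total": len(results),
        "accuracy": sum(results) / len(results),
    }

# A trivial "model" that uppercases its input, plus two test cases.
cases = [
    {"input": "hello", "expected": "HELLO"},
    {"input": "world", "expected": "world"},  # intentionally failing case
]
report = evaluate(str.upper, cases)
print(report)  # → {'passed': 1, 'total': 2, 'accuracy': 0.5}
```

Real evaluation frameworks replace exact match with richer scorers (LLM-as-judge, semantic similarity, faithfulness) but keep this run-score-aggregate shape.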
## Datasets & Benchmarks

Datasets and benchmarks used to evaluate coding agents and software engineering capabilities.
- SWE-bench — Benchmark for evaluating models on real-world GitHub issues.
- HumanEval — Dataset for evaluating code generation capabilities.
- MBPP — Benchmark for Python programming tasks.
- CodeContests — Competitive programming dataset for code generation.
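Results on code benchmarks like HumanEval are usually reported with the unbiased pass@k estimator from the HumanEval paper: given n samples per problem of which c pass, pass@k = 1 − C(n−c, k) / C(n, k). A direct implementation:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples
    drawn from n total (with c correct) passes."""
    if n - c < k:
        return 1.0  # too few failures to fill k draws: guaranteed pass
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

# With 10 samples per problem and 3 correct, pass@1 = 3/10.
print(pass_at_k(10, 3, 1))  # ≈ 0.3
```

Averaging this per-problem estimate over the benchmark gives the headline pass@k score.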
## Research & Papers

Key research papers and resources exploring AI agents and autonomous coding systems.
- SWE-agent Paper — Language agents solving real GitHub issues.
- Reflexion — Framework for iterative self-improvement in agents.
- ReAct — Combining reasoning and acting in language models.
- Voyager — Lifelong learning agent with skill acquisition.
- Toolformer — Training models to use external tools.
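The ReAct paper above interleaves free-form reasoning ("thoughts") with tool calls ("actions") and their results ("observations"). A stubbed sketch of that trace structure — the scripted responses stand in for a real model, and the tool is hypothetical:

```python
# ReAct-style trace: alternate thought → action → observation until the
# scripted "model" emits a final answer. Real systems re-prompt an LLM
# with the accumulated trace at every step.

def lookup(term: str) -> str:
    # Illustrative knowledge-base tool.
    facts = {"python release year": "1991"}
    return facts.get(term, "unknown")

SCRIPT = [
    ("thought", "I should look up when Python was released."),
    ("action", ("lookup", "python release year")),
    ("thought", "The observation gives the year; I can answer."),
    ("answer", "Python was first released in 1991."),
]

def react(script):
    trace = []
    for kind, content in script:
        if kind == "action":
            tool, arg = content
            observation = {"lookup": lookup}[tool](arg)
            trace.append(("observation", observation))
        elif kind == "answer":
            return content, trace
        else:
            trace.append((kind, content))
    return "", trace

answer, trace = react(SCRIPT)
print(answer)  # → Python was first released in 1991.
```

Reflexion builds on the same trace by feeding failed episodes back as self-critiques; Toolformer instead trains the tool-call decisions into the model's weights.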
## Related Lists

- Awesome AI — General tools, frameworks, and resources for artificial intelligence.
- Awesome AI Agents — Frameworks and tools for building autonomous AI agents.
- Awesome Prompt Engineering — Techniques, tools, and resources for prompt design.
- Awesome LLMOps — Infrastructure and operational practices for LLM systems.
## Contributing

Contributions are welcome. Please make sure your submission follows the guidelines in CONTRIBUTING.md, including formatting, scope alignment, and category placement; pull requests that do not adhere to them may be closed.