Dive into the rapidly evolving world of AI-powered research automation! This curated repository is your compass to navigate the cutting-edge tools, frameworks, and methodologies that are revolutionizing scientific discovery. We track the latest advancements, focusing on resources that empower researchers and accelerate breakthroughs.
AI is transforming the scientific landscape, offering powerful tools to assist researchers throughout the entire research lifecycle. This repository provides a structured overview of the AI researcher ecosystem, focusing on key components and advancements.
We categorize resources based on the core stages of research automation impacted by AI, drawing heavily from the surveyed literature. For the papers compiled, we provide links to the paper itself, associated code, and available reviews. Crucially, for some publications lacking open reviews, we've utilized our DeepReviewer to generate AI-powered review analyses, also including links to these results. We welcome everyone to use our AI-Researcher.net, including DeepReviewer, CycleResearcher, AutoSurvey, and others!
- Core Components
- Latest Buzz / News
- Featured Projects
- Knowledge Acquisition
- Idea Generation
- Verification and Falsification
- Review System
- Evolution
- Survey
- Benchmarks
- Community
- Contributing
- License
The fundamental layers where AI is impacting the research lifecycle:
| Layer | Key Papers Category Alignment | Example Technologies / Concepts | Description |
|---|---|---|---|
| Knowledge Building | Knowledge Building | Elicit, Consensus, SciSpace, Semantic Scholar, KG Tools | AI tools for literature discovery, summarization, knowledge graph construction, understanding figures/tables, and synthesizing existing work. |
| Idea Generation | Idea Generation | Hypothesis generators (e.g., based on KGs, LLMs), Agent-based exploration (e.g., ResearchAgent) | AI assisting in formulating novel research questions, hypotheses, identifying gaps, and exploring new research directions. |
| Experimentation | Experiment Execution | Autonomous Labs (e.g., CoScientist), AlphaFold, AutoML, Agent-driven simulation | AI aiding in experimental design, automating experiment execution (physical or simulated), data collection, analysis, and interpretation. |
| Paper Writing | Paper Writing | Overleaf AI, PaperRobot, CycleResearcher, Citation/Figure generators | AI assistance in drafting sections (abstract, related work), generating figures/tables, citation management, editing, and formatting. |
| Paper Review | Paper Review | Reviewer assignment tools, Argument mining, Review generation aids (e.g., ReviewRobot, CycleResearcher) | AI tools supporting reviewer assignment, review quality assessment, argument extraction, and potentially review generation/summarization. |
Exciting frontiers pushing the boundaries of AI in research:
- End-to-End Research Agents: AI systems attempting to automate multiple stages of the research cycle autonomously (e.g., The AI Scientist, AutoSurvey).
- Human-AI Collaboration Patterns: Novel workflows and interfaces integrating AI assistance seamlessly into researcher tasks for enhanced productivity and creativity.
- AI for Reproducibility & Validation: Tools focused on verifying scientific claims, attempting automated replication of results, and ensuring overall scientific rigor.
Recent news, announcements, and noteworthy updates circulating in the AI researcher ecosystem.
- Launch | Airaxiv. Your Gateway to AI-Generated Research!
  - Source: Airaxiv Website
  - Link: Website
- Paper | Using generative AI, researchers design compounds that can kill drug-resistant bacteria
  - Source: MIT News (Aug 2025)
  - Link: Article
- Survey | How Far Are AI Scientists from Changing the World?
  - Source: arXiv Publication (Jul 2025)
  - Link: Paper
- Paper | Autonomous Scientific Discovery Through Hierarchical AI Scientist Systems
  - Source: Preprints.org Publication (Jul 2025)
  - Link: Paper
- Position Paper | AI Scientists Fail Without Strong Implementation Capability
  - Source: arXiv Publication (Jun 2025)
  - Link: Paper
- Sakana AI Labs Update on AI Scientist
  - Source: X (Twitter) Post (Mar 2025)
  - Link: Tweet
- Meet CARL: The First AI System To Produce Academically Peer-Reviewed Research
  - Source: AutoScience AI Blog Post (Mar 2025)
  - Link: Blog Post
| Project | Description | Stars | Link |
|---|---|---|---|
| CycleResearcher & DeepReviewer | Improving Automated Research via Automated Review. | GitHub | |
| AutoSurvey | Large language models automatically writing surveys. | GitHub | |
| AI-Scientist | Towards Fully Automated Open-Ended Scientific Discovery. | GitHub | |
| Galactica | Meta's scientific language model for research tasks. | GitHub | |
| SciKit-LLM | Integration of LLMs into scientific workflows. | GitHub | |
| CoScientist | Autonomous chemical research with LLMs. | GitHub | |
| PaperRobot | Incremental Draft Generation of Scientific Ideas. | GitHub | |
- AI Researcher
- aiXiv - Next-generation open access ecosystem for AI scientists [Paper]
- AiraXiv
- Elicit
- Consensus
- SciSpace
- Wolfram Research Assistant
- IBM Watson Discovery
- Overleaf AI
- A dataset of peer reviews (peerread): Collection, insights and nlp applications [Paper]
[AI Review]
- Biobert: a pre-trained biomedical language representation model for biomedical text mining [Paper]
[AI Review]
- Scibert: Pretrained language model for scientific text [Paper]
[AI Review]
- Probing biomedical embeddings from language models [Paper]
[AI Review]
- Clinicalbert: Modeling clinical notes and predicting hospital readmission [Paper]
[AI Review]
- Identification of tasks, datasets, evaluation metrics, and numeric scores for scientific leaderboards construction [Paper]
[AI Review]
- AxCell: Automatic extraction of results from machine learning papers [Paper]
[AI Review]
- TLDR: Extreme summarization of scientific documents [Paper]
[Review]
- Domain-specific language model pretraining for biomedical natural language processing [Paper]
[AI Review]
- BioMegatron: Larger biomedical domain language model [Paper]
[AI Review]
- BioM-transformers: Building large biomedical language models with BERT, ALBERT and ELECTRA [Paper] [No Review]
- Towards table-to-text generation with numerical reasoning [Paper] [No Review]
- Scigen: a dataset for reasoning-aware text generation from scientific tables [Paper] [No Review]
- MatSciBERT: A materials domain language model for text mining and information extraction [Paper]
[Review]
- The diminishing returns of masked language models to science [Paper]
[AI Review]
- TELIN: Table entity LINker for extracting leaderboards from machine learning publications [Paper] [No Review]
- Comlittee: Literature discovery with personal elected author committees [Paper]
[AI Review]
- Orkg-leaderboards: a systematic workflow for mining leaderboards as a knowledge graph [Paper]
[AI Review]
- All data on the table: Novel dataset and benchmark for cross-modality scientific information extraction [Paper]
[AI Review]
- Paperqa: Retrieval-augmented generative agent for scientific research [Paper]
[Review]
- Retrieval-augmented generation for large language models: A survey [Paper]
[AI Review]
- Paperweaver: Enriching topical paper alerts by contextualizing recommended papers with user-collected papers [Paper]
[AI Review]
- The ai scientist: Towards fully automated open-ended scientific discovery [Paper]
[Review]
- Language agents achieve superhuman synthesis of scientific knowledge [Paper]
[AI Review]
- Agent laboratory: Using LLM agents as research assistants [Paper]
[AI Review]
- Pasa: An LLM agent for comprehensive academic paper search [Paper]
[AI Review]
- Evaluating sakana's ai scientist for autonomous research: Wishful thinking or an emerging reality towards 'artificial research intelligence' (ari)? [Paper]
[AI Review]
- Towards an ai co-scientist [Paper]
[Review]
- Dora ai scientist: Multi-agent virtual research team for scientific exploration discovery and automated report generation [Paper] [No Review]
- Codescientist: End-to-end semi-automated scientific discovery with code-based experimentation [Paper]
[AI Review]
- Ai-researcher: Autonomous scientific innovation [Paper]
[AI Review]
- Bigpatent: A large-scale dataset for abstractive and coherent summarization [Paper]
[AI Review]
- TLDR: Extreme summarization of scientific documents [Paper]
[Review]
- X-scitldr: cross-lingual extreme summarization of scholarly documents [Paper]
[AI Review]
- Structured information extraction from scientific text with large language models [Paper]
[AI Review]
- Legobench: Scientific leaderboard generation benchmark [Paper]
[Review]
- Litllm: A toolkit for scientific literature review [Paper]
[AI Review]
- Data extraction from polymer literature using large language models [Paper] [No Review]
- Automated literature review using nlp techniques and LLM-based retrieval-augmented generation [Paper]
[AI Review]
- Llms for literature review: Are we there yet? [Paper]
[Review]
- Extracting knowledge from scientific texts on patient-derived cancer models using large language models: algorithm development and validation [Paper] [No Review]
- Lag: Llm agents for leaderboard auto generation on demanding [Paper]
[Review]
- Can llms help uncover insights about llms? a large-scale, evolving literature analysis of frontier LLMs [Paper]
[AI Review]
- Agentrxiv: Towards collaborative autonomous research [Paper]
[Review]
- How Deep Do Large Language Models Internalize Scientific Literature and Citation Practices? [Paper]
[AI Review]
- SafeScientist: Toward Risk-Aware Scientific Discoveries by LLM Agents [Paper]
[AI Review]
- The Budget AI Researcher and the Power of RAG Chains [Paper]
[AI Review]
- TLDR: Extreme summarization of scientific documents [Paper]
[Review]
- X-scitldr: cross-lingual extreme summarization of scholarly documents [Paper]
[AI Review]
- Appraising the potential uses and harms of llms for medical systematic reviews [Paper]
[AI Review]
- Citeme: Can language models accurately cite scientific claims? [Paper]
[Review]
- CHIME: LLM-assisted hierarchical organization of scientific studies for literature review support [Paper]
[Review]
- Litsearch: A retrieval benchmark for scientific literature search [Paper]
[AI Review]
- Llms for literature review: Are we there yet? [Paper]
[Review]
- Unsupervised word embeddings capture latent knowledge from materials science literature [Paper] [No Review]
- Paperrobot: Incremental draft generation of scientific ideas [Paper]
[AI Review]
- Exploring the limits of transfer learning with a unified text-to-text transformer [Paper]
[AI Review]
- Agatha: automatic graph mining and transformer based hypothesis generation approach [Paper]
[AI Review]
- Literature-based discovery beyond the abc paradigm: a contrastive approach [Paper] [No Review]
- Chain-of-thought prompting elicits reasoning in large language models [Paper]
[AI Review]
- Knowledge integration and decision support for accelerated discovery of antibiotic resistance genes [Paper] [No Review]
- Reflexion: Language agents with verbal reinforcement learning [Paper]
[AI Review]
- SciMON: Scientific inspiration machines optimized for novelty [Paper]
[AI Review]
- Large language models for automated open-domain scientific hypotheses discovery [Paper]
[AI Review]
- Large language models are zero shot hypothesis proposers [Paper]
[AI Review]
- The ai scientist: Towards fully automated open-ended scientific discovery [Paper]
[Review]
- Can llms generate novel research ideas? a large-scale human study with 100+ nlp researchers [Paper]
[AI Review]
- MOOSE-chem: Large language models for rediscovering unseen chemistry scientific hypotheses [Paper]
[AI Review]
- Chain of ideas: Revolutionizing research via novel idea development with LLM agents [Paper]
[Review]
- Nova: An iterative planning and search approach to enhance novelty and diversity of LLM generated ideas [Paper]
[AI Review]
- Cycleresearcher: Improving automated research via automated review [Paper]
[AI Review]
- Aigs: Generating science from ai-powered automated falsification [Paper]
[AI Review]
- Towards an ai co-scientist [Paper]
[Review]
- Codescientist: End-to-end semi-automated scientific discovery with code-based experimentation [Paper]
[AI Review]
- The ai scientist-v2: Workshop-level automated scientific discovery via agentic tree search [Paper]
[Review]
- Spark: A System for Scientifically Creative Idea Generation [Paper]
[AI Review]
- Sparks of Science: Hypothesis Generation Using Structured Paper Data [Paper]
[AI Review]
- Scientific Hypothesis Generation and Validation: Methods, Datasets, and Future Directions [Paper]
[AI Review]
- Dynamic Knowledge Exchange and Dual-diversity Review: Concisely Unleashing the Potential of a Multi-Agent Research Team [Paper]
[AI Review]
- AlphaEvolve: A coding agent for scientific and algorithmic discovery [Paper]
[AI Review]
- Bleu: a method for automatic evaluation of machine translation [Paper] [No Review]
- Rouge: A package for automatic evaluation of summaries [Paper] [No Review]
- The perils of using mechanical turk to evaluate open-ended text generation [Paper]
[AI Review]
- Longeval: Guidelines for human evaluation of faithfulness in long-form summarization [Paper]
[AI Review]
- SciMON: Scientific inspiration machines optimized for novelty [Paper]
[AI Review]
- Judging llm-as-a-judge with mt-bench and chatbot arena [Paper]
[AI Review]
- Large language models for automated open-domain scientific hypotheses discovery [Paper]
[AI Review]
- Detecting pretraining data from large language models [Paper]
[AI Review]
- Large language models are zero shot hypothesis proposers [Paper]
[AI Review]
- Exploring scientific hypothesis generation with mamba [Paper] [No Review]
- Can llms generate novel research ideas? a large-scale human study with 100+ nlp researchers [Paper]
[AI Review]
- MOOSE-chem: Large language models for rediscovering unseen chemistry scientific hypotheses [Paper]
[AI Review]
- Chain of ideas: Revolutionizing research via novel idea development with LLM agents [Paper]
[Review]
- Nova: An iterative planning and search approach to enhance novelty and diversity of LLM generated ideas [Paper]
[AI Review]
- An empirical analysis of uncertainty in large language model evaluations [Paper]
[AI Review]
- Dynamic Knowledge Exchange and Dual-diversity Review: Concisely Unleashing the Potential of a Multi-Agent Research Team [Paper]
[AI Review]
- HypoBench: Towards Systematic and Principled Benchmarking for Hypothesis Generation [Paper]
[AI Review]
- AI Idea Bench 2025: AI Research Idea Generation Benchmark [Paper]
[AI Review]
- MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning [Paper]
[AI Review]
- Highly accurate protein structure prediction with alphafold [Paper] [No Review]
- The ai scientist: Towards fully automated open-ended scientific discovery [Paper]
[Review]
- Figureqa: An annotated figure dataset for visual reasoning [Paper]
[AI Review]
- Reviewergpt? an exploratory study on using large language models for paper reviewing [Paper]
[AI Review]
- Swe-bench: Can language models resolve real-world github issues? [Paper]
[Review]
- Teaching code llms to use autocompletion tools in repository-level code generation [Paper]
[AI Review]
- Codeagent: Enhancing code generation with tool-integrated agent systems for real-world repo-level coding challenges [Paper]
[AI Review]
- Multimodal arxiv: A dataset for improving scientific comprehension of large vision-language models [Paper]
[AI Review]
- Iterative refinement of project-level code context for precise code generation with compiler feedback [Paper]
[AI Review]
- Enhancing repository-level code generation with integrated contextual information [Paper]
[AI Review]
- MMSci: A dataset for graduate-level multi-discipline multimodal scientific understanding [Paper]
[AI Review]
- Scicode: A research coding benchmark curated by scientists [Paper]
[Review]
- Mlr-copilot: Autonomous machine learning research based on large language models agents [Paper]
[AI Review]
- Dsbench: How far are data science agents from becoming data science experts? [Paper]
[AI Review]
- Scienceagentbench: Toward rigorous assessment of language agents for data-driven scientific discovery [Paper]
[Review]
- Repograph: Enhancing ai software engineering with repository-level code graph [Paper]
[Review]
- Llm-ref: Enhancing reference handling in technical writing with large language models [Paper]
[AI Review]
- Mcx-llm: an experiment in bridging natural language problem descriptions with quantitative scientific simulations [Paper] [No Review]
- Ai becomes a masterbrain scientist [Paper] [No Review]
- Zochi technical report [Paper] [No Review]
- Aide: Ai-driven exploration in the space of code [Paper]
[AI Review]
- Repocoder: Repository-level code completion through iterative retrieval and generation [Paper]
[AI Review]
- The ai scientist-v2: Workshop-level automated scientific discovery via agentic tree search [Paper]
[Review]
- Autop2c: An LLM-based agent framework for code repository generation from multimodal content in academic papers [Paper]
[AI Review]
- Biomni: A general-purpose biomedical ai agent [Paper] [No Review]
- ResearchCodeAgent: An LLM Multi-Agent System for Automated Codification of Research Methodologies [Paper]
[AI Review]
- Scientific Hypothesis Generation and Validation: Methods, Datasets, and Future Directions [Paper]
[AI Review]
- GenoMAS: A Multi-Agent Framework for Scientific Discovery via Code-Driven Gene Expression Analysis [Paper]
[Code] [AI Review]
- Core-bench: Fostering the credibility of published research through a computational reproducibility agent benchmark [Paper]
[AI Review]
- Mle-bench: Evaluating machine learning agents on machine learning engineering [Paper]
[Review]
- Ml-dev-bench: Comparative analysis of ai agents on ml development workflows [Paper]
[AI Review]
- Scireplicate-bench: Benchmarking llms in agent-driven algorithmic reproduction from research papers [Paper]
[AI Review]
- GenoTEX: An LLM Agent Benchmark for Automated Gene Expression Data Analysis [Paper]
[Code] [AI Review]
- Paperbench: Evaluating ai's ability to replicate ai research [Paper]
[AI Review]
- MLRC-Bench: Can Language Agents Solve Machine Learning Research Challenges? [Paper]
[AI Review]
- ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows [Paper]
[AI Review]
- Benchmarking AI scientists in omics data-driven biological research [Paper]
[AI Review]
- EXP-Bench: Can AI Conduct AI Research Experiments? [Paper]
[AI Review]
- The toronto paper matching system: an automated paper-reviewer assignment system [Paper] [No Review]
- Peerreview4all: Fair and accurate reviewer assignment in peer review [Paper]
[AI Review]
- Reviewrobot: Explainable paper review generation based on knowledge synthesis [Paper]
[AI Review]
- Towards fair, equitable, and efficient peer review [Paper] [No Review]
- Investigating fairness disparities in peer review: A language model enhanced approach [Paper]
[AI Review]
- PlagBench: Exploring the duality of large language models in plagiarism generation and detection [Paper]
[AI Review]
- Usefulness of llms as an author checklist assistant for scientific papers: Neurips'24 experiment [Paper]
[AI Review]
- Openreviewer: A specialized large language model for generating critical scientific paper reviews [Paper]
[Review]
- Deepreview: Improving llm-based paper review with human-like deep thinking process [Paper]
[Review]
- Automated research review support using machine learning, large language models, and natural language processing [Paper] [No Review]
- Does my rebuttal matter? insights from a major nlp conference [Paper]
[AI Review]
- Matching papers and reviewers at large conferences [Paper]
[AI Review]
- Has the machine learning review process become more arbitrary as the field has grown? the neurips 2021 consistency experiment [Paper]
[AI Review]
- Marg: Multi-agent review generation for scientific papers [Paper]
[AI Review]
- Zero-shot generative large language models for systematic review screening automation [Paper]
[AI Review]
- Reviewer2: Optimizing review generation through prompt generation [Paper]
[AI Review]
- Prompting is all you need: Llms for systematic review screening [Paper] [No Review]
- Autosurvey: Large language models can automatically write surveys [Paper]
[AI Review]
- Relevai-reviewer: A benchmark on ai reviewers for survey paper relevance [Paper]
[AI Review]
- AgentReview: Exploring peer review dynamics with LLM agents [Paper]
[Review]
- A systematic survey and critical review on evaluating large language models: Challenges, limitations, and recommendations [Paper]
[AI Review]
- Review-llm: Harnessing large language models for personalized review generation [Paper]
[AI Review]
- Cutting through the clutter: The potential of llms for efficient filtration in systematic literature reviews [Paper]
[AI Review]
- Automated review generation method based on large language models [Paper]
[AI Review]
- Usefulness of LLMs as an author checklist assistant for scientific papers: Neurips'24 experiment [Paper]
[AI Review]
- Are we there yet? revealing the risks of utilizing large language models in scholarly peer review [Paper]
[AI Review]
- Metawriter: Exploring the potential and perils of ai writing support in scientific peer review [Paper] [No Review]
- Surveyx: Academic survey automation via large language models [Paper]
[AI Review]
- Reviewagents: Bridging the gap between human and ai-generated paper reviews [Paper]
[Review]
- Lgar: Zero-shot llm-guided neural ranking for abstract screening in systematic literature reviews [Paper]
[AI Review]
- Reviewriter: Ai-generated instructions for peer review writing [Paper]
[AI Review]
- The illusion of thinking: Understanding the strengths and limitations of reasoning models via the lens of problem complexity [Paper]
[AI Review]
- Can we automate scientific reviewing? [Paper]
[AI Review]
- Peersum: a peer review dataset for abstractive multi-document summarization [Paper]
[AI Review]
- React: A review comment dataset for actionability (and more) [Paper]
[AI Review]
- Moprd: A multidisciplinary open peer review dataset [Paper]
[AI Review]
- Scientific opinion summarization: Paper meta-review generation dataset, methods, and evaluation [Paper]
[AI Review]
- Exploring jiu-jitsu argumentation for writing peer review rebuttals [Paper]
[AI Review]
- Automatic analysis of substantiation in scientific peer reviews [Paper]
[AI Review]
- Politepeer: does peer review hurt? a dataset to gauge politeness intensity in the peer reviews [Paper] [No Review]
- Is LLM a reliable reviewer? a comprehensive evaluation of LLM on automatic paper reviewing tasks [Paper] [No Review]
- Automated focused feedback generation for scientific writing assistance [Paper]
[AI Review]
- Peer review as a multi-turn and long-context dialogue with role-based interactions [Paper]
[AI Review]
- Relevai-reviewer: A benchmark on ai reviewers for survey paper relevance [Paper]
[AI Review]
- Aaar-1.0: Assessing ai's potential to assist research [Paper]
[AI Review]
- Large language models for automated scholarly paper review: A survey [Paper]
[AI Review]
- Peerqa: A scientific question answering dataset from peer reviews [Paper]
[AI Review]
- Is your paper being reviewed by an llm? a new benchmark dataset and approach for detecting ai text in peer review [Paper]
[AI Review]
- Automatic evaluation metrics for artificially generated scientific research [Paper]
[AI Review]
- Discoveryworld: A virtual environment for developing and evaluating automated scientific discovery agents [Paper]
[AI Review]
- Dolphin: Closed-loop open-ended auto-research through thinking, practice, and feedback [Paper]
[AI Review]
- The ai scientist-v2: Workshop-level automated scientific discovery via agentic tree search [Paper]
[Review]
- Zochi technical report [Paper] [No Review]
- Star: Bootstrapping reasoning with reasoning [Paper]
[AI Review]
- Constitutional ai: Harmlessness from ai feedback [Paper]
[AI Review]
- Large language models are better reasoners with self-verification [Paper]
[AI Review]
- Self-refine: Iterative refinement with self-feedback [Paper]
[AI Review]
- Automatically correcting large language models: Surveying the landscape of diverse automated correction strategies [Paper]
[AI Review]
- Mathematical discoveries from program search with large language models [Paper] [No Review]
- Cycleresearcher: Improving automated research via automated review [Paper]
[AI Review]
- The ai scientist: Towards fully automated open-ended scientific discovery [Paper]
[Review]
- Researchtown: Simulator of human research community [Paper]
[AI Review]
- Agentrxiv: Towards collaborative autonomous research [Paper]
[Review]
- Alphaevolve: A coding agent for scientific and algorithmic discovery [Paper]
[AI Review]
- Scientific Hypothesis Generation and Validation: Methods, Datasets, and Future Directions [Paper]
[AI Review]
- Hallucination, reliability, and the role of generative AI in science [Paper] [AI Review]
- From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery [Paper] [Review]
- Are Large Language Models Reliable AI Scientists? Assessing Reverse-Engineering of Black-Box Systems [Paper] [Review]
- AI Scientists Fail Without Strong Implementation Capability [Paper] [AI Review]
- Position: Intelligent Science Laboratory Requires the Integration of Cognitive and Embodied AI [Paper] [Review]
- The Singapore Consensus on Global AI Safety Research Priorities [Paper] [Review]
- Agent4S: The Transformation of Research Paradigms from the Perspective of Large Language Models [Paper] [AI Review]
- AI4Research: A Survey of Artificial Intelligence for Scientific Research [Paper] [AI Review]
- AI2 Scientific Reasoning Challenge (ARC) [Dataset]
- MLAgentBench [Paper]
[Code]
- SWE-bench [Paper]
[Code]
- ScienceAgentBench [Paper]
[Code]
- DSBench [Paper]
[Code]
- AAAR-1.0 [Paper]
- MLGym [Paper]
[Code]
- PaperBench [Paper]
[Code]
Connect with fellow researchers, share insights, and stay updated!
We encourage you to get involved! Stay updated on the latest discussions and breakthroughs by joining these community events:
Join the conversation and exchange ideas in these online communities:
- An active community for general AI research discussions.
- AI Scientist Research Discussion Group: (Scan QR Code to join. If expired, please contact the maintainer via Email: 18856306350@163.com or WeChat: nauhcutnil)
We welcome contributions! Please follow this workflow:
- Fork the repository.
- Create a new branch: `git checkout -b feature/your-contribution`
- Add or modify resources. Please try to follow the existing format, including adding conference/journal badges.
- Submit a Pull Request (PR) with a clear description of your changes.
For detailed guidelines, please see CONTRIBUTING.md.
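To keep new entries consistent with the existing format, a small helper like the one below could render a paper entry before you paste it into the list. This is only a sketch: the function name `format_entry`, its parameters, and the `[[Label]](url)` link layout are assumptions inferred from how entries appear in this README, and the `example.org` URLs are placeholders. CONTRIBUTING.md remains the authoritative reference.

```python
from typing import Optional


def format_entry(
    title: str,
    paper_url: str,
    code_url: Optional[str] = None,
    review_url: Optional[str] = None,
    ai_generated_review: bool = False,
) -> str:
    """Render one paper entry as a Markdown bullet.

    The link layout is an assumption based on the entries in this list:
    title, then [Paper], then optional [Code], then a review marker.
    """
    parts = [f"- {title.strip()} [[Paper]]({paper_url})"]
    if code_url:
        parts.append(f"[[Code]]({code_url})")
    if review_url:
        # Reviews generated by an AI reviewer are labeled differently
        # from open human reviews.
        label = "AI Review" if ai_generated_review else "Review"
        parts.append(f"[[{label}]]({review_url})")
    else:
        # Entries without any review are marked explicitly.
        parts.append("[No Review]")
    return " ".join(parts)


if __name__ == "__main__":
    # Placeholder title and URLs, for illustration only.
    print(
        format_entry(
            "Example paper title",
            "https://example.org/paper",
            code_url="https://example.org/code",
            review_url="https://example.org/review",
            ai_generated_review=True,
        )
    )
```

Conference/journal badges are not covered by this sketch and would still need to be added by hand.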
This project is licensed under the MIT License - see the LICENSE file for details.
