LLM Reference

LLM Reference helps tech leaders quickly find and compare the best AI models and providers for their specific project needs.

Published on: May 29, 2026

Category: AI Assistants

Pricing: Free

About LLM Reference

LLM Reference is a decision-support directory for engineers, technology leaders, and AI practitioners who need to choose a large language model (LLM) and a provider. The platform tracks over 1,800 language models from more than 140 providers and 247 research labs, with data refreshed weekly to include new releases, verified price changes, and benchmark updates. Its core value proposition is replacing the hunt through scattered sources with a single place to compare models side by side, whether the project is a coding assistant, an agentic workflow, a writing tool, or a research pipeline.

Users can see which provider offers the cheapest pricing for frontier output, browse curated editors' picks for tasks such as coding, agents, writing, research, image generation, and video creation, and follow a Pulse feed that highlights weekly changes, including new models, price cuts, and benchmark refreshes. The site is designed for fast triage: identify the right model for the job, find the most cost-effective provider, and get back to building. LLM Reference is built by the Data Advantage project and is described as updated daily, with the data refresh and Pulse summaries following a weekly cadence.

Features of LLM Reference

Comprehensive Model Directory

The platform maintains an extensive, searchable directory of over 1,800 language models from more than 140 providers and 247 research labs. Users can search by model name, provider, capability, or use case, and filter results based on specific tasks such as coding, RAG, agents, long context, vision, classification, and JSON or tool use. This centralized repository eliminates the need to visit multiple websites or rely on fragmented information sources.
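The search-and-filter behavior described above can be illustrated with a minimal local sketch. This is not LLM Reference's implementation or API: the `ModelEntry` type, the `search` function, and all model and provider names below are hypothetical, and only the task tags are taken from the listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    """One directory row (hypothetical shape, for illustration only)."""
    name: str
    provider: str
    tasks: frozenset  # task tags such as "coding", "RAG", "vision"


def search(entries, query="", required_tasks=()):
    """Return entries whose name or provider contains `query`
    (case-insensitive) and that carry every tag in `required_tasks`."""
    q = query.lower()
    needed = set(required_tasks)
    return [
        e for e in entries
        if (q in e.name.lower() or q in e.provider.lower()) and needed <= e.tasks
    ]


# Hypothetical sample data; these are not real models or providers.
entries = [
    ModelEntry("example-coder-large", "example-provider-a",
               frozenset({"coding", "agents", "tool use"})),
    ModelEntry("example-vision-mini", "example-provider-b",
               frozenset({"vision", "classification"})),
    ModelEntry("example-long-context", "example-provider-a",
               frozenset({"coding", "long context", "RAG"})),
]

coding_models = search(entries, required_tasks=["coding"])
from_provider_a = search(entries, query="provider-a", required_tasks=["RAG"])
print([e.name for e in coding_models])
print([e.name for e in from_provider_a])
```

Requiring every tag to match (a subset check) narrows results as filters are added, which fits the triage use the listing describes.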

Curated Editors' Picks

LLM Reference offers expert-curated recommendations for specific use cases, organized by audience: developers, knowledge workers, and creatives. Each pick includes a detailed rationale, benchmark scores, and eligibility information. For example, the default coding pick is Claude Fable 5, citing scores of 80.3% on SWE-bench Pro and 96% on SWE-bench Verified, while the writing pick is Claude Opus 4.7, based on Chatbot Arena rankings and real-world writing quality.

Pulse Feed and Weekly Updates

The Pulse feature provides a weekly snapshot of what changed in the model market, including new models, verified price cuts, and benchmark refreshes. Users can see at a glance how many new models were added (e.g., 177), how many price cuts occurred (e.g., 53), and how many benchmark refreshes were processed (e.g., 368). This keeps users informed without overwhelming them with noise, ensuring they always have the latest information for decision-making.

Side-by-Side Model Comparison

The compare feature allows users to evaluate two models directly against each other, examining their benchmark scores, pricing, and suitability for specific tasks. Popular comparisons include Claude Fable 5 versus Claude Opus 4.8, and GPT-5.5 versus Gemini 3.1 Pro Preview. This feature is essential for making informed decisions when multiple models appear equally capable for a given task.
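As a rough illustration of a side-by-side comparison, the sketch below lines up the metrics two models have in common and reports the difference. The `compare` function, the metric keys, and every number are hypothetical placeholders, not values or code from LLM Reference.

```python
def compare(a, b):
    """Return (metric, value_a, value_b, a_minus_b) rows for metrics
    present in both dictionaries, sorted by metric name."""
    rows = []
    for metric in sorted(a.keys() & b.keys()):
        rows.append((metric, a[metric], b[metric], round(a[metric] - b[metric], 4)))
    return rows


# Hypothetical figures for two made-up models.
model_x = {"swe_bench_verified": 90.0, "usd_per_1m_output": 10.0, "tau_bench": 80.0}
model_y = {"swe_bench_verified": 85.5, "usd_per_1m_output": 4.0}

rows = compare(model_x, model_y)
for metric, x, y, delta in rows:
    print(f"{metric:<22} {x:>8} {y:>8} {delta:>+8}")
```

Only shared metrics are shown, so a benchmark reported for one model but not the other (here `tau_bench`) is left out rather than compared against a missing value. Note that a positive difference is good for benchmark scores but bad for price.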

Use Cases of LLM Reference

Selecting a Coding Assistant Model

Engineering teams building AI-assisted coding tools can use LLM Reference to identify the best model for their needs. The platform provides specific benchmark scores like SWE-bench Pro and SWE-bench Verified, along with editors' picks for coding tasks. For example, Claude Fable 5 is recommended for production coding with 80.3% SWE-bench Pro and 96% SWE-bench Verified on Vals.ai, making it suitable for non-trivial engineering tasks.

Choosing a Cost-Effective Provider for Frontier Output

Technology leaders responsible for budget management can use the frontier pricing data to find the cheapest provider for high-quality model output. The platform tracks the lowest cost per million output tokens across all providers, with current data showing Hunyuan HY3 Preview via Tencent Cloud TI Platform at $0.260 per 1M output tokens. This enables teams to maximize performance while minimizing operational costs.
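To make the per-token pricing concrete, here is a small sketch that picks the lowest output price from a list of offers and projects a monthly bill. The `Offer` type, both functions, the monthly token volume, and the two `example-*` offers are assumptions for illustration; only the Hunyuan HY3 Preview price comes from the listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """A model as sold by one provider (hypothetical shape)."""
    model: str
    provider: str
    usd_per_1m_output: float  # price per one million output tokens


def cheapest(offers):
    """Return the offer with the lowest output-token price."""
    return min(offers, key=lambda o: o.usd_per_1m_output)


def monthly_output_cost(offer, output_tokens_per_month):
    """Projected spend in USD for a given monthly output-token volume."""
    return offer.usd_per_1m_output * output_tokens_per_month / 1_000_000


offers = [
    # Price quoted in the listing above.
    Offer("Hunyuan HY3 Preview", "Tencent Cloud TI Platform", 0.260),
    # Hypothetical offers, included only for contrast.
    Offer("example-model-a", "example-provider-a", 1.50),
    Offer("example-model-b", "example-provider-b", 4.00),
]

best = cheapest(offers)
cost = monthly_output_cost(best, 250_000_000)  # assumed volume: 250M tokens
print(f"{best.model} via {best.provider}: ${cost:.2f}/month")
```

This covers output tokens only. A real budget would also account for input-token pricing and any quality difference between models, neither of which is modeled here.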

Building Agentic Workflows

Developers creating AI agents can draw on the agent-specific editors' picks and benchmark data. LLM Reference highlights Claude Sonnet 4.6, which has the best generally available tau-bench score (87.5) and is described as staying on task across long tool loops and self-correcting without prompting. This helps teams build agentic workflows on well-tested foundation models.

Comparing Models for Research and Analysis

Knowledge workers conducting research or data analysis can use the platform to find models suited to their specific tasks. The research section recommends Claude Fable 5, based on a GDPval-AA Elo of 1932 and reported wins in finance, trading, and analytics. The data and SQL category recommends GPT-5.5, while the summarization and translation picks are Gemini 3 Flash and Gemini 3 Pro, respectively.

Frequently Asked Questions

How often is the data on LLM Reference updated?

The data is refreshed weekly with new model releases, verified price changes, and benchmark refreshes. The Pulse feed shows exactly what changed each week, including the number of new models added, price cuts verified, and benchmark scores refreshed. The platform is built by the Data Advantage project, and the site itself is described as updated daily.

How are the editors' picks determined?

Editors' picks are curated based on a combination of benchmark performance, real-world testing, and expert evaluation. Each pick includes specific benchmark scores, the date it was researched, and a detailed rationale explaining why the model is recommended for that particular use case. For example, the coding pick for Claude Fable 5 cites specific SWE-bench scores, while the writing pick for Claude Opus 4.7 references Chatbot Arena rankings and practical writing quality.

Can I compare two specific models directly?

Yes, the compare feature allows you to select any two models from the directory and view them side by side. You can compare benchmark scores, pricing, provider information, and suitability for specific tasks. Popular comparisons are also provided as quick links, including Claude Fable 5 versus Claude Opus 4.8 and GPT-5.5 versus Gemini 3.1 Pro Preview.

Is LLM Reference free to use?

Yes, LLM Reference is a free resource provided by the Data Advantage project. Users can browse the model directory, access editors' picks, compare models, view the Pulse feed, and use all other features without any cost. The platform is designed to be an essential, accessible tool for anyone navigating the LLM ecosystem.

Pricing of LLM Reference

LLM Reference itself is a free resource. There are no paid tiers, subscriptions, or usage fees for accessing the directory, editors' picks, comparison tools, Pulse feed, or any other feature. The platform is provided as a public service by the Data Advantage project to help engineers and technology leaders make informed decisions about LLM selection and provider choice.
