💰
Live-synced input and output token rates
Each row shows the vendor's current published rates for input and output tokens, both per 1M tokens. Because output tokens usually cost 3-5× as much as input tokens, splitting the two is essential: an AI model comparison that shows only a blended rate hides the real cost of long-response workloads.
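To see why a blended rate misleads, here is a minimal sketch of the arithmetic. The rates and token volumes are made-up placeholders, not any vendor's pricing, and the 50/50 blend is just one assumed way a single blended figure might be computed.

```python
def cost_usd(input_tokens: int, output_tokens: int,
             input_rate: float, output_rate: float) -> float:
    """Return the cost in USD, given rates quoted per 1M tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical rates for illustration only.
INPUT_RATE = 2.0    # USD per 1M input tokens
OUTPUT_RATE = 8.0   # USD per 1M output tokens (4x the input rate)
BLENDED_RATE = (INPUT_RATE + OUTPUT_RATE) / 2  # naive 50/50 blend: 5.0

# A long-response workload: 1M input tokens, 4M output tokens.
split_cost = cost_usd(1_000_000, 4_000_000, INPUT_RATE, OUTPUT_RATE)
blended_cost = cost_usd(1_000_000, 4_000_000, BLENDED_RATE, BLENDED_RATE)

print(f"split rates:  ${split_cost:.2f}")    # $34.00
print(f"blended rate: ${blended_cost:.2f}")  # $25.00
```

Under these assumed numbers the blended figure understates the bill by $9, and the gap widens as the share of output tokens grows.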
📏
Context window and long-context support
The context column shows how much input a model handles in one call: 200k tokens for Claude Opus 4.8, 1M for Gemini 3 Pro, 128k for GPT-5. If your workload includes long documents or huge codebases, this is often more decisive than raw price. Prompt caching discounts are also noted.
🔧
Feature matrix: vision, tools, JSON, streaming
Cost isn't the only axis. The dashboard surfaces which models support vision, function/tool calling, JSON mode, streaming, prompt caching, and structured output schemas. Teams evaluating GPT-5 pricing against Claude Sonnet 4.6 often find the right pick is the one whose feature set matches the workload.
🧮
Estimated monthly cost calculator
Enter your rough monthly input volume, output volume, and cache hit rate; the calculator returns a projected monthly bill per model. Sort by cost, filter by capability, screenshot the result for finance. No account required — the dashboard is publicly accessible.
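For readers who want to reproduce a projection by hand, the sketch below shows one plausible way to do the calculation. It is not the dashboard's actual formula: it assumes cached input tokens are billed at a separate discounted rate, and the `ModelRates` type, model names, and prices are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRates:
    name: str
    input_rate: float         # USD per 1M uncached input tokens
    cached_input_rate: float  # USD per 1M cached input tokens (assumed discount)
    output_rate: float        # USD per 1M output tokens


def monthly_cost(rates: ModelRates, input_tokens: int, output_tokens: int,
                 cache_hit_rate: float) -> float:
    """Project a monthly bill in USD; cache_hit_rate is a fraction in [0, 1]."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    cached = input_tokens * cache_hit_rate
    uncached = input_tokens - cached
    total = (uncached * rates.input_rate
             + cached * rates.cached_input_rate
             + output_tokens * rates.output_rate)
    return total / 1_000_000


# Placeholder models and prices, for illustration only.
MODELS = [
    ModelRates("model-a", input_rate=4.0, cached_input_rate=0.5, output_rate=16.0),
    ModelRates("model-b", input_rate=1.0, cached_input_rate=0.25, output_rate=4.0),
]

# 100M input tokens, 20M output tokens, 50% cache hit rate; cheapest first.
projections = sorted(
    (monthly_cost(m, 100_000_000, 20_000_000, 0.5), m.name) for m in MODELS
)
for cost, name in projections:
    print(f"{name}: ${cost:,.2f}")
```

With these placeholder inputs, `model-b` projects to $142.50 and `model-a` to $545.00 per month.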
⚡
Latency benchmarks per provider
Beyond price, we track median and p95 response latency per model per provider (OpenRouter, direct Anthropic, direct OpenAI, direct Google). If your workload needs a sub-second time to first token, the latency column rules out models the price column would otherwise recommend.
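As a rough illustration of how median and p95 figures can gate a choice, the sketch below computes both from raw samples and filters on a latency budget. The sample data and model names are invented, and the nearest-rank percentile is one common definition, not necessarily the one the dashboard uses.

```python
from statistics import median


def p95(samples: list[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


# Hypothetical time-to-first-token samples in milliseconds.
SAMPLES_MS = {
    "fast-model": [300.0] * 18 + [400.0, 900.0],
    "slow-model": [800.0] * 18 + [1500.0, 2500.0],
}

BUDGET_MS = 1000.0  # sub-second first-token requirement

# Keep only models whose p95 fits the budget, regardless of price.
eligible = [name for name, s in SAMPLES_MS.items() if p95(s) < BUDGET_MS]

for name, s in SAMPLES_MS.items():
    print(f"{name}: median={median(s):.0f} ms, p95={p95(s):.0f} ms")
print("eligible:", eligible)
```

Filtering on p95 rather than the median matters here: in this invented data `slow-model` has a median of 800 ms, which would pass a one-second budget, but its p95 of 1500 ms does not.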
📈
Historical pricing chart per model
Every model row expands into a 12-month price history — when it launched, when the rate dropped, when a new tier appeared. Handy for finance teams building forecasts and for engineers negotiating internal cost budgets: 'GPT-5 input tokens are down 40% since Q1.'
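A figure like "down 40% since Q1" is a simple percent change between two points in the price history. The sketch below shows that calculation on invented data; the dates and rates are placeholders, not real pricing for any model.

```python
from datetime import date


def percent_change(history: list[tuple[date, float]]) -> float:
    """Percent change from the earliest to the latest rate in the history."""
    if len(history) < 2:
        raise ValueError("need at least two price points")
    ordered = sorted(history)  # tuples sort by date first
    first_rate = ordered[0][1]
    last_rate = ordered[-1][1]
    return (last_rate - first_rate) / first_rate * 100


# Hypothetical input-token rates (USD per 1M tokens) over one year.
HISTORY = [
    (date(2025, 1, 15), 5.0),
    (date(2025, 6, 1), 4.0),
    (date(2025, 11, 20), 3.0),
]

print(f"{percent_change(HISTORY):+.0f}% since the first recorded rate")  # -40%
```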