State of AI Models — May 2026

Every major AI model compared in one place. Pricing, capabilities, context windows, and honest ratings — updated monthly. Bookmark this page.

Feature
🟢ChatGPT
GPT-5.5
🟠Claude
Opus 4.8
🔵Gemini
Gemini 3.5 Flash
🔷Perplexity
Multi-model (GPT-5.5, Claude, etc.)
🟦Copilot
GPT-5.5 (Microsoft hosted)
Grok
Grok 4.3
Flagship ModelGPT-5.5Opus 4.8Gemini 3.5 FlashMulti-model (GPT-5.5, Claude, etc.)GPT-5.5 (Microsoft hosted)Grok 4.3
Consumer Price$20/mo (Plus) / $200/mo (Pro)$20/mo (Pro) / $100 (Max 5x) / $200 (Max 20x)$19.99/mo (AI Pro) / $99.99 (Ultra)$20/mo (Pro)$20/mo (Pro) / $30/mo (M365)Included with X Premium+ ($16/mo)
Free Tier
Context Window256K tokens200K standard / 1M beta1M tokensN/A (search-based)128K tokens128K tokens
Best ForGeneralists, beginners, creative tasksWriters, developers, agents, long docsGoogle users, speed-sensitive work, multimodalResearchers, fact-checkers, journalistsMicrosoft ecosystem users, Office workersX/Twitter power users, unfiltered analysis
Writing
4/5
5/5
4/5
3/5
3/5
3/5
Coding
5/5
5/5
4/5
2/5
4/5
4/5
Research
4/5
4/5
4/5
5/5
3/5
4/5
Speed
4/5
4/5
5/5
4/5
4/5
4/5
Value
4/5
4/5
5/5
4/5
3/5
3/5
Web Search
Image Gen
Voice Mode
File Upload
Code Exec
API Access
Mobile App
Strengths
  • + Plugin ecosystem
  • + Image gen (DALL-E)
  • + Voice mode
  • + Broadest feature set
  • + Improved factuality (GPT-5.5)
  • + Agentic coding leader
  • + Calibrated honesty
  • + Superior writing
  • + Instruction following
  • + API $5/$25 per M tokens
  • + 4x faster output vs frontier
  • + Google Workspace integration
  • + Best free tier
  • + Multimodal (video/audio)
  • + API $1.50/$9 per M tokens
  • + Real-time citations
  • + Source transparency
  • + Research-grade accuracy
  • + Multi-model access
  • + Office 365 integration
  • + Windows native
  • + Free web access
  • + Enterprise-friendly
  • + Real-time X/Twitter data
  • + Unfiltered responses
  • + Image generation
  • + Strong reasoning
  • + Competitive API pricing
Weaknesses
  • Limits on Thinking model
  • No long-context flat pricing
  • Generic writing style
  • No image generation
  • Smaller plugin ecosystem
  • Fewer integrations
  • Writing still behind Claude
  • 2x price above 200K context
  • Quality varies by task
  • Not ideal for creative writing
  • Limited long conversations
  • No custom workflows
  • Relies on OpenAI models
  • Less customizable
  • Aggressive upselling
  • Tied to X ecosystem
  • Less polished UX
  • Smaller developer ecosystem

Quick Verdicts

🟢
ChatGPT
OpenAI

GPT-5.5 is the broadest feature set and biggest ecosystem. If you want one tool that does everything — search, images, voice, code — this is still the default. Hallucination rates improved vs GPT-5.4, but writing quality is fine, rarely great.

🟠
Claude
Anthropic

Opus 4.8 leads agentic coding and calibrated honesty — the model that flags its own uncertainty instead of bluffing. Best writer and coder for hard work; 200K standard context (1M in beta). No image generation.

🔵
Gemini
Google

Gemini 3.5 Flash is Google's default model in 2026: roughly 4x faster output than other frontier models, the best free tier, and tight Workspace integration. Writing still trails Claude; for a three-way benchmark read, see our Opus 4.8 vs GPT-5.5 vs Gemini breakdown.

🔷
Perplexity
Perplexity AI

Built for research, not conversation. Every answer comes with citations. If you need facts with sources, this is the tool. Not great for creative work or coding.

🟦
Copilot
Microsoft

Makes sense if you're already paying for Microsoft 365 — you get GPT-5.5 (Microsoft hosted) inside Office. Otherwise, you're getting repackaged OpenAI with fewer features. Office integration is the main edge.

Grok
xAI

Grok 4.3 is the X/Twitter-native option: real-time social data and unfiltered analysis. The ecosystem is small and the UX is rough, but reasoning and API pricing are competitive.

How We Rate

Ratings are 1–5 dots for Writing, Coding, Research, Speed, and Value.

They reflect practical, day-to-day usefulness for knowledge work (not benchmark scores). When models ship major updates, we revise these monthly.

Want to optimize prompts across multiple AI models?

Get TresPrompt →
Updated monthly. Last verified: May 2026. Something wrong? Let us know.