State of AI Models — May 2026
Every major AI model compared in one place. Pricing, capabilities, context windows, and honest ratings — updated monthly. Bookmark this page.
| Feature | 🟢ChatGPT GPT-5.5 | 🟠Claude Opus 4.8 | 🔵Gemini Gemini 3.5 Flash | 🔷Perplexity Multi-model (GPT-5.5, Claude, etc.) | 🟦Copilot GPT-5.5 (Microsoft hosted) | ⚫Grok Grok 4.3 |
|---|---|---|---|---|---|---|
| Flagship Model | GPT-5.5 | Opus 4.8 | Gemini 3.5 Flash | Multi-model (GPT-5.5, Claude, etc.) | GPT-5.5 (Microsoft hosted) | Grok 4.3 |
| Consumer Price | $20/mo (Plus) / $200/mo (Pro) | $20/mo (Pro) / $100 (Max 5x) / $200 (Max 20x) | $19.99/mo (AI Pro) / $99.99 (Ultra) | $20/mo (Pro) | $20/mo (Pro) / $30/mo (M365) | Included with X Premium+ ($16/mo) |
| Free Tier | ✓ | ✓ | ✓ | ✓ | ✓ | — |
| Context Window | 256K tokens | 200K standard / 1M beta | 1M tokens | N/A (search-based) | 128K tokens | 128K tokens |
| Best For | Generalists, beginners, creative tasks | Writers, developers, agents, long docs | Google users, speed-sensitive work, multimodal | Researchers, fact-checkers, journalists | Microsoft ecosystem users, Office workers | X/Twitter power users, unfiltered analysis |
| Writing | 4/5 | 5/5 | 4/5 | 3/5 | 3/5 | 3/5 |
| Coding | 5/5 | 5/5 | 4/5 | 2/5 | 4/5 | 4/5 |
| Research | 4/5 | 4/5 | 4/5 | 5/5 | 3/5 | 4/5 |
| Speed | 4/5 | 4/5 | 5/5 | 4/5 | 4/5 | 4/5 |
| Value | 4/5 | 4/5 | 5/5 | 4/5 | 3/5 | 3/5 |
| Web Search | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Image Gen | ✓ | — | ✓ | ✓ | ✓ | ✓ |
| Voice Mode | ✓ | — | ✓ | — | ✓ | — |
| File Upload | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Code Exec | ✓ | ✓ | ✓ | — | — | — |
| API Access | ✓ | ✓ | ✓ | ✓ | — | ✓ |
| Mobile App | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Strengths |
|
|
|
|
|
|
| Weaknesses |
|
|
|
|
|
|
Quick Verdicts
GPT-5.5 is the broadest feature set and biggest ecosystem. If you want one tool that does everything — search, images, voice, code — this is still the default. Hallucination rates improved vs GPT-5.4, but writing quality is fine, rarely great.
Opus 4.8 leads agentic coding and calibrated honesty — the model that flags its own uncertainty instead of bluffing. Best writer and coder for hard work; 200K standard context (1M in beta). No image generation.
Gemini 3.5 Flash is Google's default model in 2026: roughly 4x faster output than other frontier models, the best free tier, and tight Workspace integration. Writing still trails Claude; for a three-way benchmark read, see our Opus 4.8 vs GPT-5.5 vs Gemini breakdown.
Built for research, not conversation. Every answer comes with citations. If you need facts with sources, this is the tool. Not great for creative work or coding.
Makes sense if you're already paying for Microsoft 365 — you get GPT-5.5 (Microsoft hosted) inside Office. Otherwise, you're getting repackaged OpenAI with fewer features. Office integration is the main edge.
Grok 4.3 is the X/Twitter-native option: real-time social data and unfiltered analysis. The ecosystem is small and the UX is rough, but reasoning and API pricing are competitive.
How We Rate
Ratings are 1–5 dots for Writing, Coding, Research, Speed, and Value.
They reflect practical, day-to-day usefulness for knowledge work (not benchmark scores). When models ship major updates, we revise these monthly.
Want to optimize prompts across multiple AI models?
Get TresPrompt →