Ranked: The Smartest AI Models of 2026



Key Takeaways

  • Grok-4.20 Expert Mode and OpenAI GPT 5.4 Pro (Vision) tie for the top spot in TrackingAI’s April 2026 Mensa Norway benchmark, each scoring 145.
  • The top tier is getting crowded, with several leading models now separated by only a few points.
  • Scores have risen sharply from 2025, highlighting how quickly frontier AI reasoning has improved on visual pattern-recognition tests.

The race to build smarter AI models is getting tighter at the top.

This visualization, part of Visual Capitalist’s AI Week, sponsored by Terzo, ranks leading systems using data from TrackingAI, which benchmarks models on the Mensa Norway IQ test as of April 2026.

The results show both who leads today and how little now separates the top contenders, with multiple frontier models clustered near the top of the leaderboard.

A Tie at the Top

The ranking offers a snapshot of how today’s leading AI models perform on abstract pattern-recognition tasks, and just how close the race has become.

As the table below shows, only a small gap now separates the top models:

| Model | Mensa Norway IQ (April 2026) |
| --- | --- |
| Grok-4.20 Expert Mode | 145 |
| OpenAI GPT 5.4 Pro (Vision) | 145 |
| Gemini 3.1 Pro Preview | 141 |
| OpenAI GPT 5.4 Thinking (Vision) | 139 |
| OpenAI GPT 5.3 | 136 |
| Grok-4.20 Expert Mode (Vision) | 133 |
| OpenAI GPT 5.4 Thinking | 133 |
| Meta Muse Spark | 133 |
| Gemini 3.1 Pro Preview (Vision) | 132 |
| Qwen 3.5 | 130 |
| Claude-4.6 Opus | 130 |
| Kimi K2.5 | 127 |
| Manus | 115 |
| DeepSeek R1 | 112 |
| DeepSeek V3 | 111 |
| Gemini 3.1 Flash Preview | 110 |
| Llama 4 Maverick | 110 |
| OpenAI GPT 5.3 (Vision) | 109 |
| Claude-4.6 Sonnet | 106 |
| Bing Copilot | 101 |
| Perplexity | 97 |
| Mistral Medium 3.1 | 96 |
| Claude-4.6 Sonnet (Vision) | 94 |
| Claude-4.6 Opus (Vision) | 82 |
| Llama 4 Maverick (Vision) | 79 |
| OpenAI GPT 5.4 Pro | 73 |

The biggest takeaway is how compressed the top of the leaderboard has become. Grok-4.20 Expert Mode and OpenAI GPT 5.4 Pro (Vision) are tied for first at 145, while Gemini 3.1 Pro Preview follows closely at 141.

That narrow spread suggests frontier AI models are increasingly converging at the top, where a difference of just a few points can shift the rankings.

The gains from 2025 are also notable. Last year’s top score was 135, compared with 145 in this year’s results, highlighting the speed at which leading models are improving on this benchmark.

Not all models are keeping pace. Among major AI developers, Mistral's top model ranks lowest in this dataset, scoring 96, well below the leading group.

How TrackingAI Runs the Test

TrackingAI uses the public Mensa Norway test, a set of 35 visual-pattern puzzles. For non-vision models, the questions are verbalized, while vision models receive the original images directly.

These scores are therefore best understood as a benchmark comparison, not a definitive measure of overall intelligence. Because the test is fundamentally visual, a model's score can vary depending on how the questions are presented to it.

Why This Benchmark Matters

TrackingAI’s leaderboard is useful because it offers a simple, familiar way to compare reasoning performance over time. The site also notes that if a model refuses to answer, it is asked the same question up to 10 times, and the most recent answer is used for scoring.
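The retry rule described above can be sketched in a few lines. This is an illustrative reconstruction based only on the article's description (ask up to 10 times on refusal, score the most recent answer); the `ask_model` callable and all names here are assumptions, not TrackingAI's actual code.

```python
def score_question(ask_model, question, max_attempts=10):
    """Ask `question` up to `max_attempts` times; return the last response.

    `ask_model(question)` is assumed to return an answer string, or None
    when the model refuses. Retries stop as soon as an answer is given,
    so the most recent (i.e. final) response is what gets scored.
    """
    answer = None
    for _ in range(max_attempts):
        answer = ask_model(question)
        if answer is not None:  # the model answered; no further retries
            break
    return answer  # None means the model refused all attempts
```

Under this reading, a question only goes unscored if the model refuses all ten attempts.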

Still, an IQ-style benchmark captures only one slice of capability. It does not measure everything that matters in real-world AI use, such as coding ability, factual reliability, tool use, or performance in professional domains.

Learn More on the Voronoi App

If you enjoyed today’s post, check out Global AI Adoption on Voronoi.
