o3 and o4-mini - they’re great, but easy to over-hype


AI Explained Official Podcast · 2025-04-16

Critical analysis of the two most powerful new models behind ChatGPT, o3 and o4-mini. I cover not just the system cards, benchmarks, and my own tests, but also some you may not have seen before. Yes, they can whip up an amazing front-end in a few seconds, but you always have to ask what is in their data. Either way, they prove the gains from RL are just beginning…

https://weave-docs.wandb.ai/?utm_source=sponsorship&utm_medium=simple_bench&utm_campaign=ai_explained

AI Insiders ($9!): https://www.patreon.com/AIExplained


Chapters:
00:00 - o3 and o4-mini


https://simple-bench.com/

Plus, Teams and Pro, plus token count: https://x.com/btibor91/status/1912568994512662679

System Card: https://openai.com/index/o3-o4-mini-system-card/

Release Notes: https://openai.com/index/introducing-o3-and-o4-mini/

https://deepmind.google/technologies/gemini/pro/

https://x.com/DeryaTR_/status/1912558350794961168

https://x.com/polynoamial/status/1912564068168450396

API Pricing: https://openai.com/api/pricing/

https://aider.chat/docs/leaderboards/


Non-hype Newsletter: https://signaltonoise.beehiiv.com/

AI Explained Official Podcast

Covering the biggest news of the century - the arrival of smarter-than-human AI. From the author of Simple Bench, which reveals the remaining gap between LLM and human reasoning. Hype-free, and the British accent is a freebie bonus.
