# o3 Pro Beats Other AI Advisors

**Source:** [https://www.youtube.com/watch?v=5kWuXbiQ2zY](https://www.youtube.com/watch?v=5kWuXbiQ2zY)

**Duration:** 00:14:28

## Summary

- The speaker evaluated several top AI models (Gemini 2.5 Pro, Claude 4, o3) and found that only o3 Pro consistently delivered insights that felt “resonant” and personally relevant.
- In three benchmark tests (critiquing the Apple “illusion” paper, drafting a Datadog roadmap, and optimizing a Wordle algorithm), o3 Pro outperformed the baseline o3, even when its answers were shorter or less exhaustive.
- o3 Pro’s edge came from its ability to recognize tool-calling limits and deliberately stop or clarify rather than hallucinate data, which produced more trustworthy and actionable results.
- Although not perfect, the speaker argues that o3 Pro is the first model capable of acting as a strategic advisor at the founder level without major caveats, highlighting its rapid development within just 48 days of the original o3 launch.
- This progress signals a shift from AI being merely tactical to becoming a genuine strategic partner for complex, multi-dimensional problems.

## Sections

- [00:00:00](https://www.youtube.com/watch?v=5kWuXbiQ2zY&t=0s) **o3 Pro Outshines Rival AI Models** - The speaker compared o3 Pro to o3 across three assessments (an Apple paper review, a Datadog roadmap, and a Wordle optimization) and found o3 Pro consistently provided more insightful, resonant answers, not merely longer ones.
- [00:03:28](https://www.youtube.com/watch?v=5kWuXbiQ2zY&t=208s) **Leveraging o3 Pro for Complex Strategy** - The speaker highlights o3 Pro’s rapid advancement and its ability to deliver deep, strategic insights on heavyweight, context-rich problems, provided users feed it extensive background, set clear constraints, and tell it where to gather information.
- [00:06:36](https://www.youtube.com/watch?v=5kWuXbiQ2zY&t=396s) **o3 Pro: The Ultimate AI Ferrari** - The speaker praises o3 Pro as the current top AI model, likening its power to a Ferrari while warning that it requires careful prompting and may struggle with simple tasks like document summarization.
- [00:10:23](https://www.youtube.com/watch?v=5kWuXbiQ2zY&t=623s) **Executive Strategy Model Insights** - The speaker explains how advanced AI can generate high-level strategic plans but still requires careful prompting and cannot fully replace human nuance, context, or the desire for external expertise.
- [00:14:00](https://www.youtube.com/watch?v=5kWuXbiQ2zY&t=840s) **Seeking AI Insight for Happiness** - The speaker shares a starter prompt that asks the AI what would make them 50% happier and recommends giving it to o3 Pro rather than o3 for more insightful advice.

## Full Transcript
o3 Pro is out. I've been testing AI
models for years now. They're helpful.
They're tactical. They've recently
become strategic. I'm looking at you,
Gemini 2.5 Pro, Claude 4, o3.
But they have not yet been resonant.
What I mean by that is they haven't yet
been so on the money consistently with
their perspective that their words stick
in my head and just live rent-free.
That is what we are getting to with o3
Pro. I don't just mean it's a good
writer. What I mean by that is it's
so insightful that I feel like I am
profoundly known in the problems I am
grappling with.
So when I started to dig into o3 Pro, I
wanted to give it, you know, an honest
test. I wanted to give it something that
would give me a sense of how it actually
works. And so I picked three things
where I felt like I could make an
assessment.
One was an assessment of that infamous
Apple paper, and I wanted to stack it up
against o3's assessment.
One was a roadmap that I would share
with the C-suite, and I wanted to pick a
company I knew reasonably well. I picked
Datadog.
And one was an interesting algorithm
optimization problem, and I picked Wordle
optimization.
Now that's a fairly easy one if
you've done optimization problems, but I
wanted to see what it would do and
how it would write it relative to what
o3 would do.
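
For readers who have not seen this kind of problem, here is a minimal sketch of one common way to frame Wordle optimization: pick the guess whose feedback pattern is expected to carry the most information about the answer. The speaker does not say which approach either model took, so the entropy criterion and the function names `feedback`, `expected_information`, and `best_guess` are illustrative assumptions, not a reconstruction of the models' output.

```python
from collections import Counter
from math import log2


def feedback(guess: str, answer: str) -> str:
    """Return Wordle-style feedback: 'g' green, 'y' yellow, '-' gray."""
    result = ["-"] * len(guess)
    remaining = Counter()
    # First pass: mark greens and count the answer letters left unmatched.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "g"
        else:
            remaining[a] += 1
    # Second pass: mark yellows only while unmatched copies remain.
    for i, g in enumerate(guess):
        if result[i] == "-" and remaining[g] > 0:
            result[i] = "y"
            remaining[g] -= 1
    return "".join(result)


def expected_information(guess: str, candidates: list[str]) -> float:
    """Shannon entropy (in bits) of the feedback pattern over the candidates."""
    counts = Counter(feedback(guess, answer) for answer in candidates)
    total = len(candidates)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def best_guess(guesses: list[str], candidates: list[str]) -> str:
    """Pick the guess that splits the remaining candidates most evenly."""
    return max(guesses, key=lambda g: expected_information(g, candidates))
```

A guess that produces a different pattern for every remaining candidate scores highest, because any feedback it receives identifies the answer.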
I looked at all three.
In every single case, o3 Pro did better.
And the reason why it did better
surprised me. It was not that it was a
longer answer. It was not even
necessarily that it had all of the
sections or was more complete. In fact,
in one case, it was less complete than
o3 and still won anyway. Do you know
why? Because o3 went beyond its tool
calling capabilities, and o3 Pro knew
when to stop and could explain why. That
is a huge deal. This was a case where it
was looking at Twitter mentions for the
Apple "thinking is an illusion" paper and
it was looking at a very specific
criterion around retweets with a certain
number of likes, and it said, "I can't get
that out of my tool call right now. I'm
just going to not mention it. Also, I'm
near the word limit you imposed in the
prompt, so it's not worth me going
after." Both were correct.
o3 phoned something in in a table, and
it looked plausible. It named real
Twitter users who had really talked
about the paper,
but the table itself wasn't useful
because it didn't specifically refer to
the tweets, because underneath the hood,
o3 couldn't get to them. Now I am not
here to tell you that this is a perfect
model. I do think it is the first model
that can operate as a strategic advisor
at the founder level without any
caveats. Does that mean that I think
it's the best founder advisor in the
world? Didn't say that, did I?
But the fact that we're even talking
about that 48 days after o3 itself
launched is a big deal. That is how fast
progress is going right now.
o3 Pro is able to strategically
understand very difficult,
multi-dimensional, heavy-context problems
and come out with strategic insights
that are correct and act as a sparring
partner. This is a model that is hungry
for context. I have made the mistake
even in the little bit of time I've been
using it of feeding it prompts where the
context was too light. This is a model
that seeks to understand like global
thinking. It wants to think big. It will
go get context. And if it goes and gets
context that you didn't direct it to go
and get, you're going to be surprised
perhaps unpleasantly at what you find.
And so my advice to you, which I've seen
elsewhere around the web as people have
played with o3 Pro, is that you should
use this model for hard problems that
you can give a lot of context to the
model on. That is what it shines at. If
you have a truly strategic conundrum,
something you're wrestling with, you
should be able to come up with a lot of
context either from your own head or
from the web. And you should be able to
feed it to the model, tell the model
where to go, give the model constraints
and warnings, and set the model up for a
really successful hard think. And I mean
like a 15- or 20-minute think. This is a
get-a-sandwich-while-you-wait kind of
model experience.
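
As a rough illustration of that advice, the sketch below assembles a prompt from the pieces the speaker lists: the problem, a lot of context, where to look, constraints, and warnings. The helper `build_prompt`, its parameter names, and the section headings are hypothetical; nothing in the video prescribes this exact layout, and the function only builds a string and does not call any model API.

```python
def build_prompt(problem, context, constraints, warnings, sources=()):
    """Assemble a context-heavy prompt from labeled sections (illustrative layout)."""

    def bullets(items):
        return "\n".join(f"- {item}" for item in items)

    sections = [
        ("Problem", problem),
        ("Context", "\n\n".join(context)),
        ("Where to look", bullets(sources)),
        ("Constraints", bullets(constraints)),
        ("Warnings", bullets(warnings)),
    ]
    # Skip empty sections so the model is not handed blank headings.
    return "\n\n".join(f"## {title}\n{body}" for title, body in sections if body)
```

Keeping the constraints and warnings in their own sections makes it easy to see at a glance whether the context is too light before committing to a long-running request.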
And I got used to that with o1 Pro. But
the difference with o1 Pro is that o1
Pro felt like a complete essayist. It
would come back with a very well-written
response, but o3 Pro comes back with the
strategic insight
that actually underlies that response.
And is it more readable than o3? It
actually is. One of the things I've
noticed in the roughly 6 weeks I've been
using o3
is that o3 is extremely technically
intelligent
and has real trouble dumbing that down
into writing that is clear for
non-technical audiences. And I say
dumbing that down because I think o3
thinks of it that way. o3 has trouble
simplifying prose into plain English a
fair bit.
o3 Pro is much better at it. If you ask
it for a plain English summary of a very
technical topic, you are likely to get a
better result out of o3 Pro. Now, that
is not a full measure of intelligence.
There are other models out there that do
that very well. Sonnet 4 is a phenomenal
writer. It just is. I've been playing
with it a bit and have been struck by
how Opus and Sonnet really have a good
one-two punch when it comes to thinking
about hard problems with Opus and then
writing well with Sonnet. And I guess
that's as good a bridge as any to
talking about model comparison.
o3 Pro is
unquestionably a model in a class of its
own. I get asked a lot, "Is X the best
model in the world?" Then people will
throw out a name like Grok 3, Opus 4, o3,
DeepSeek, whatever it is. Gemini
2.5 Pro.
I feel very good, after playing with
this, telling you that o3 Pro is
unquestionably
the biggest and best model on the planet
right now, and it's not close. However,
I do not think a lot of people will
understand or appreciate it. Partly
because they're releasing it only on the
Pro and Team plans.
And I think they'll bring it down
because the unit economics seem to be
much more favorable with o3 Pro. They
released o3 Pro for 87% less than o1
Pro.
But even if they bring it down into the
lower tiers, this is still a model that
takes careful prompting. You need to
be thoughtful about the problems you
hand this model. It's like driving a
Ferrari. If you drive it well, it's
going to do a phenomenal job on amazing
roads and you'll have a great time. If
you take it to the grocery store, you'll
regret it. And if you drive it on bad
roads, you'll just blow it up. And I
will say there are ways you can make
this model quote unquote blow up. I
don't mean actually malfunction, but I
mean I have found that when I have just
attached a document and asked it to
summarize the document, it doesn't do a
super great job at that because it is
unable to restrain itself from being a
global thinker and bringing in extra
context.
And by the way, people are probably
going to call that habit hallucinations.
And I think that is probably incorrect.
And I'll explain why. If we look at
hallucinations the way we name weeds
in gardens, a weed is just an undesired
plant. A hallucination is just an
undesired thought from a model, right?
Whatever you want to call it. In this
case, I think it's actually very
intentional on OpenAI's part to launch a
model that is a true global thinker
because they need that on the path to
AGI. So that part makes sense.
I think the challenge is
because it gathers context from across
the web and it is difficult to
understand what all the sources were
that it got a hold of.
It is hard to know at first glance
whether the numbers it is giving you and
the facts it is giving you in a response
are absolutely correct every single one
of them or whether some of them might be
made up. And it is so persuasive
and so clean in its prose and so
insightful
that you won't get the feeling intuitively
that the numbers are made up. They won't
stick out to you and jump out and
say, "Oh, this is a made-up number," the
way they have in the past. You will have
to do your checking. And so, I would
think this is the first model where it
is probably going to end up being
malpractice
not to check the model's response with
another model before publication. If
you're going to go to an executive, if
you're going to go to the internet with
a model's output, it is on you to use
another model to help you check, because
the number of things it is looking at
is kind of too high at this point for a
human to fact check individually unless
you have hours and hours and hours and
hours.
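
One way to act on that advice is to pull out the sentences that carry numbers and hand each one to a second checker. The sketch below is an assumption about how that could be wired up, not something the speaker describes: `verify` is a hypothetical callable standing in for whatever second model or lookup you use, and the sentence splitting is deliberately naive.

```python
import re
from typing import Callable


def sentences_with_numbers(text: str) -> list[str]:
    """Return the sentences in text that contain at least one digit."""
    # Naive split: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if re.search(r"\d", s)]


def flag_unverified(text: str, verify: Callable[[str], bool]) -> list[str]:
    """Return numeric claims that the supplied checker does not confirm."""
    return [s for s in sentences_with_numbers(text) if not verify(s)]
```

Anything returned by `flag_unverified` is a claim to check by hand or drop before the output goes to an executive or to the internet.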
And part of what we're using these
models for is that they save us time.
Like this is better than McKinsey
strategy decks I've seen. Like this is
truly an
executive-level strategic thinking
model.
And so I guess if you bring that back
around to where I started, you have an
executive strategic model in your
pocket.
It is picky about prompting. It is a
global thinker. You have to expect it to
be one.
What are you going to do with it? How
are you going to prompt it? What are the
problems that you are going to give it?
And I deliberately want to point out
that just because
this model is capable of this level of
strategic thinking, that does not mean
everyone is going to go out, use this
model for strategic thinking and make
all consultants go away. Partly because
no matter how good this model is, it is
not going to be able to understand the
hidden depths of quiet context, the vibe
in your office the way you do and the
way a consultant does if they really sit
down with you. And honestly, partly
because people are kind of lazy and
don't always
actually use the capabilities that are
in front of them.
And so imagine this as like an
incredibly powerful home cooking machine
that has magically arrived in all of our
homes or soon will.
We're still going to go out to eat at
restaurants. We still want to order in
shrimp lo mein or poke or sushi sometimes,
even if the magical cooking machine can
do a phenomenal job, because we're human.
And that's actually a point that Sam
Altman made. He intentionally published
an essay today called "The Gentle
Singularity" where he talked about the
fact that a lot of what humans care
about is going to continue to exist in
the 2030s. And yet at the same time this
takeoff into intelligence is going to
continue to happen. And his thesis is
essentially that we will have much more
abundance etc because we let the
intelligence happen. We will see how all
of that plays out. Sam has the ability
to actually drive some of that. You and
I don't. We're just along for the ride,
and it's helpful to understand what's
going on. In my view, there are two big
things that stand out that I want you to
take away. One, it's been 48 days. I
said it before, it's been 48 days since
o3. We are going fast. o4 is around the
corner. o4 Pro is coming. GPT-5 is
coming. That is all from one model maker
this year alone. Plus the open source
model they're going to release.
There are other model makers right around
the corner. I'm sure they're working
late tonight on o3 Pro,
so things are going fast. Number two,
yes, this is a model that is worth
getting to know. It is the best model in
the world. It needs an excellent,
excellent problem. Having
talked to a lot of people in
tech and outside of tech, we all grapple
with problems that are really, really
tough for us. It is worth it to have a
strategic thinking partner. And I don't
just mean for business, although I've
spent most of this video talking
about the business side. On the Substack
that I wrote about this, which you can
check out if you like, I actually
include a very simple prompt for people
who find this model scary to get
started. I'm going to go ahead and read
it here as well.
"Based on everything you know about me,
what would make me 50% happier?" If you
have been talking to your ChatGPT, give
that to o3 Pro and see what a difference
it makes.
It is a
much more insightful model than o3.
And I think that's where I'll leave it.
Good luck becoming 50% happier with
o3