# GPT-5 Pro: Smarter Yet Experientially Worse

**Source:** [https://www.youtube.com/watch?v=7-LFn11dNHA](https://www.youtube.com/watch?v=7-LFn11dNHA)
**Duration:** 00:23:52

## Summary

- GPT-5 Pro is the first AI model that is provably smarter yet experientially worse, a paradox that signals a fundamental shift in AI development.
- Its superior intelligence comes from an inference-time compute architecture that runs multiple parallel reasoning chains, letting the model debate internally like a panel of experts before delivering a unified answer.
- This emphasis on coherent judgment enables GPT-5 Pro to excel on tasks that require strong decision-making, such as scoring highly on IQ-style tests.
- Effective business adoption isn't as simple as swapping GPT-5 Pro into any use case; careful selection of scenarios where its judgment-focused strengths shine is essential.
- Prospective users, especially individual subscribers facing a $200-per-month price tag, need a decision-making framework to evaluate whether the upgrade delivers enough practical value for their workflow.

## Sections

- [00:00:00](https://www.youtube.com/watch?v=7-LFn11dNHA&t=0s) **GPT-5 Pro: Smarter Yet Worse** - The speaker introduces GPT-5 Pro, explains its paradox of higher intelligence paired with poorer user experience, and offers guidance on assessing its practical value and upgrade worthiness for businesses and individual users.
- [00:04:11](https://www.youtube.com/watch?v=7-LFn11dNHA&t=251s) **GPT-5 Pro Parallel Compute Trade-offs** - Explains that GPT-5 Pro's higher price stems from running multiple reasoning threads simultaneously, boosting correctness but incurring heavy compute costs and occasional instability.
- [00:07:22](https://www.youtube.com/watch?v=7-LFn11dNHA&t=442s) **GPT-5 Trade-offs: Personality, Context, Data** - The speaker explains how shifting from the emotionally driven GPT-4o to the correctness-focused GPT-5 sacrifices personality, strains multi-thread context continuity, and requires highly structured, multi-perspective data.
- [00:10:28](https://www.youtube.com/watch?v=7-LFn11dNHA&t=628s) **GPT-5 Pro as Research Partner** - The speaker highlights GPT-5 Pro's capacity to generate advanced scientific insights and conduct comprehensive financial modeling, emphasizing that clean, multi-layered data is essential for optimal performance.
- [00:14:40](https://www.youtube.com/watch?v=7-LFn11dNHA&t=880s) **Parallel Reasoning Limits for GPT-5** - The speaker warns that GPT-5 Pro's parallel, architecture-focused reasoning can cause it to lose coherence in sequential tasks like coding and singular-voice creative writing.
- [00:19:32](https://www.youtube.com/watch?v=7-LFn11dNHA&t=1172s) **Anthropic vs Google Model Strategy** - The speaker questions whether Anthropic should keep optimizing coding-centric, tool-using models or pivot to broader reasoning while preserving Claude's personality, and notes that Google already possesses strong reasoning capabilities but must decide how to productize them into a distinctive chat-based product.

## Full Transcript
This is your introduction to GPT5 Pro.
Now, I know that not everybody has
GPT5 Pro. The reason I'm covering
GPT5 Pro is because it represents a
different kind of computing. It gives us
a hint of where AI is scaling next. And
figuring out how to apply it in your
business is not nearly as simple as
taking all the AI use cases and adding
GPT5 Pro to them. It takes a lot of
judgment. What follows are my field
notes as I've dug into the use cases
that I'm seeing actually work and the
rationale for why those use cases work
so you can figure out where GPT5 Pro
might work in your business. The central
thesis I want to explore over the course
of these notes is this. GPT5 Pro is the
first AI model that is provably smarter
and also experientially worse. and that
this paradox reveals something really
fundamental about the future of AI
development. So, I'm gonna say it again
because I think that people are going to
kind of cough and spit out their coffee.
This model is smarter, yes, which
everybody expected, but it's also
experientially worse. And I'm going to
get into why and kind of how that works.
We're going to dive into the details on
this one because I want you to walk away
with the tools that you need to figure
out where GPT5 Pro fits in your workflow
and whether it's worth upgrading, right?
because some people are asking the
question like I'm an individual user.
Should I pay the really expensive $200 a
month to get this thing? And I want you
to walk away with the tools to make that
decision. Okay, first let's talk about
the architecture of GPT5 Pro because
that underlies everything else we're
going to discuss today. OpenAI has
reimagined intelligence in terms of
time. Now, I've talked about inference
compute a fair bit, but it is worth
revisiting because fundamentally with
GPT5 Pro, that is where the smarts come
from. It is not just model size. It is
compute time. Specifically, GPT5 Pro, it
doesn't just process your query. It's
running multiple parallel reasoning
chains at once. It can explore multiple
solution paths independently. It
evaluates them against each other and
then it synthesizes the best approach
out of all those reasoning chains. What
this enables it to do is to think like a
panel of experts that's debating
internally before presenting a unified
answer. I don't want to pretend to you
that ChatGPT has a monopoly on this
general approach to inference time
compute. It doesn't. There's other model
makers out there that are working on
this too. However, what GPT5 Pro does
really, really well is it actually takes
all of that parallel reasoning and it
judges really coherently what is the
correct decision or approach. And this
emphasis on judging correctly is one of
the hallmarks of GPT5 Pro and it's
something that you'll see as a
throughline when we get to the use cases
that work. I think it is why GPT5 Pro
with internet access scored so well on
the IQ test. Now, I'm not a huge
believer in IQ tests. I think they're
interesting directionally. It is
unquestionably true that if you are
following the story of LLMs and IQ
tests, GPT5 Pro is really good. I think
it scored a 148. Like, it's a
phenomenally smart model in that
specific measured test environment. And
I think why is because that test
environment values correctness too. And
so GPT5 Pro is sort of in its element
there. But this idea, let's come back to
this idea of this panel of experts
debating. This mirrors how humans
actually solve hard problems. And I
haven't seen this part discussed a ton
online. When you face a difficult
decision, you don't really just think
linearly. If A then B, right? Like
that's not how we actually think. It may
be how we write, but it's not how we
think. You are actually considering
multiple perspectives like facets
simultaneously. When you ruminate, when
you think about an idea, it's almost
like you're walking through different
ideas at once and kind of even in the
back of your head turning them over and
looking at different angles of the idea.
You might be saying, "What are the
risks? What are the opportunities? How
does this affect this other concept?
What would happen if...?" In a sense,
GPT5 Pro is mechanizing this parallel
deliberation that we do in our heads.
It's trying to simulate it a little bit.
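This panel-of-experts pattern can be sketched in a few lines. This is a minimal illustration of the general idea, not OpenAI's actual implementation: `ask_model` is a hypothetical stand-in for one model call per reasoning lens, stubbed out here so the sketch runs.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for one LLM reasoning chain.
# In a real system this would be a model call per lens.
def ask_model(lens: str, question: str) -> str:
    return f"[{lens}] analysis of: {question}"

def deliberate(question: str, lenses: list[str]) -> str:
    # Run one reasoning chain per lens in parallel,
    # like a panel of experts debating independently.
    with ThreadPoolExecutor(max_workers=len(lenses)) as pool:
        drafts = list(pool.map(lambda lens: ask_model(lens, question), lenses))
    # A final "judge" pass would synthesize the drafts into one
    # answer; here we just join them to show the shape of the pattern.
    return "\n".join(drafts)

answer = deliberate(
    "Should we enter this market?",
    ["risk", "opportunity", "competition"],
)
print(answer)
```

The key design point is the last step: the value comes less from the parallel drafts themselves than from the synthesis that judges between them.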
You're not just paying $200 for access
to a smarter model. You're paying for
the compute to run multiple reasoning
threads at once. And that gives you a
clue as to why it's reserved for the
smarter model. It's not cheap to run.
Every query spawns parallel processes
that take real compute resources. The
thing is you get an advance in
correctness. And so you can look at
different sorts of tests that show, you
know, 100% on advanced mathematics,
right? Or 88.4% on graduate-level
reasoning, 22% fewer major errors on
benchmarks. Okay, fine. Right? Like I
have learned to take the tests with a
grain of salt. What I'm more interested
in is the architecture that leads to
correctness because that's what actually
gets us where we need to go. However,
before we get into use cases, this is
where I talk about the disappointments
or the fact that this is both a smarter
model, which I think I've talked about
with this concept of of inference time
compute and the value and correctness.
That's one of the things GPT5 Pro has
really emphasized. We have a trade-off
here. This is one of the reasons why
this experience is somewhat
disappointing. The parallel processing
that makes GPT5 Pro really smart also
breaks it depending on how you define
broken in some very specific and
predictable ways. The first one is a
little bit ironic and it's worth paying
attention to if you're in a business
context. Right now GPT5
Pro is much, much more vulnerable from a
security perspective than earlier GPT
models. And that's not just me saying
that; that's widely reported across the
security publications that matter. They are using
adversarial techniques, jailbreaking
techniques to test these models. And
what they're discovering is GPT5 Pro and
the GPT5 family overall don't test well.
And by the way, if you're wondering what
is the difference between pro and GPT5
thinking, very simply, it's about how
much you're turning up the dial on that
parallel reasoning. And GPT5 Pro is
turned up to 11 like Spinal Tap, right?
Like that's just how it works. When
the model is exploring multiple
perspectives, adversarial prompts can
poison a particular thread and influence
the eventual synthesis. Essentially, you
have more surface area for the prompts
to attack. That's the architectural cost
of parallel reasoning. Now, is somebody
at OpenAI hard at work fixing that? I
have no doubt. But at the moment, that
is part of the challenge right now with
GPT5 Pro. When you expand parallel
threads, you expand the attack surface.
You just do. Trade-off number
two, personality loss. When you
synthesize multiple reasoning chains,
you get a synthesis. The model can
struggle to maintain a consistent voice
when it's aggregating perspectives. This
is why you sometimes get really clean,
really correct, but what users might
call robotic responses from GPT5 Pro.
It's part of the root cause for the
frustration
with the move from 4o, which was an
emotional model, to GPT5, which is a
model that values correctness.
When you look at multiple
viewpoints and you pick the exact right
one, and you're averaging and
synthesizing, a lot of the personality
just isn't there anymore. Trade-off
number three, context degradation.
Maintaining coherent context across
parallel threads is much much harder
than maintaining a single narrative
thread which creates challenges because
the parallel paths can start to diverge
and create sort of memory fragmentation
issues, etc. This will come back as we
talk about use cases and where to use
GPT5 Pro. Before we jump on from this:
ChatGPT has done a lot of work behind
the scenes, I think, to manage the risk
of this so it's still usable for
context. So, we'll get into that. The
fourth trade-off: data
structure requirements. GPT5 Pro is
hungry for data, but it needs data
organized for multi-perspective
analysis. A financial document, for
example, should not just contain the
numbers. It should contain multiple
structured layers that speak to a
strategic perspective, a risk
perspective, an accounting perspective.
Organizations that are used to holding a
lot of those strategic layers in the
CFO's head or in multiple people's heads
really are going to struggle with
presenting GPT5 Pro with the kind of
data it needs to thrive. So, let's get
into the use cases. We've talked about
some of the things that GPT5 Pro does
well. We've talked about how that very
power, the parallel reasoning creates
vulnerabilities. Let's start to dive
into where do we have use cases that
work and where do we have use cases that
don't. And I want to give you a key so
that you can start to use these for
yourself. Use GPT5 in cases where
parallel reasoning is going to serve you
really really well and correctness
really really matters. As an example,
scientific research: when Amgen (and I
believe this is a real example)
analyzes polymer structures, GPT5 Pro
can evaluate chemical properties. It can
evaluate structural integrity,
manufacturing feasibility, and
regulatory compliance all at once. We
actually have like a lot of
documentation on the web about the way
GPT5 Pro and other o-series reasoning
models have helped to advance scientific
research. And you see this thread over
at Google as well. It's not the o-series
model. They have their own reasoning
models, but they are fundamentally going
after scientific research because it
enables you to reason across different
perspectives on a body of data at once
and it enables you to converge on a
correct solution and correctness really
matters. And so in the GPT5 Pro case, if
you're analyzing these polymer
structures, you can bring in multiple
perspectives in each reasoning thread,
right? Domain expertise. You can bring
in the structure of the molecule etc.
And eventually the synthesis can produce
insights that a single reasoning trace
could not match and critically that can
advance the field or at least act as a
very strong thought partner to a PhD
level researcher. And that is part of
the reason why scientific research is so
emphasized by modelmakers. They're good
at it. The model's good at it. Not too
many of us are scientists. So I want to
give you some other examples of GPT5 Pro
use cases that feel a little more
accessible. Financial modeling. Every
business at a certain scale has to
financially model. GPT5 Pro is the kind
of model that can simultaneously parse
income statements, balance sheet, and
cash flows and cross reference them for
consistency. It can look at reconciling
multiple data sources. It can look at
accounting standards. It can look at
time periods. If you process the data
and feed it in a structured manner, it
actually is going to do a great job of
this. One of the things that I chuckled
about when I did my review of GPT5
is that I deliberately didn't do this as
a way to test the model. And this is my
chance to make it up to GPT5 Pro. I know
I gave it really dirty data on purpose
as a way of testing its reasoning
ability. It did okay. I would recommend
in practice you put the effort in to
giving GPT5 Pro multiple perspectives at
different layers in the business and
make the data as clean as you possibly
can because then you're going to get
more useful information back. I do think
financial modeling is a nice use case
for GPT5 Pro. Legal analysis. Do some
due diligence on large collections of
documents. Look at contract terms. Maybe
you identify legal risk. Look at
dependencies. These reasoning traces can
look at things from multiple
perspectives and the synthesis can catch
things that human reviewers might miss.
This is not about saying the humans
don't need to review the legal
documents. It is about saying how can a
tool that is designed for parallel
reasoning converge toward correctness
when a correct answer is available.
Because in legal analysis also a correct
answer is available. There's a correct
and optimal legal stance on a particular
due diligence question. You can name the
top risks and you would be wrong if you
missed one. Similarly, with financial
modeling, you can name the overall
correct financial output statement and
you would be incorrect not just if a
number was wrong, but if you did not
take account of all of the components of
the business and the financial model.
GPT5 Pro excels at that kind of
analysis. And so you have opportunities.
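Part of what "a correct financial output" means here is mechanical: some of the cross-statement consistency checks described above can be expressed as hard rules. A toy sketch, with invented figures:

```python
# Invented example figures for illustration only.
balance_sheet = {"assets": 500.0, "liabilities": 300.0, "equity": 200.0}
cash_flow = {"opening_cash": 40.0, "net_change": 10.0, "closing_cash": 50.0}

def check_consistency(bs: dict, cf: dict) -> list[str]:
    issues = []
    # Accounting identity: assets must equal liabilities + equity.
    if abs(bs["assets"] - (bs["liabilities"] + bs["equity"])) > 1e-6:
        issues.append("balance sheet does not balance")
    # The cash flow statement must tie out internally.
    if abs(cf["opening_cash"] + cf["net_change"] - cf["closing_cash"]) > 1e-6:
        issues.append("cash flow does not reconcile")
    return issues

print(check_consistency(balance_sheet, cash_flow))  # → []
```

A reasoning model's judgment sits on top of checks like these: the mechanical identities bound what "correct" can mean, and the parallel perspectives fill in what the numbers imply.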
And by the way, the financial modeling
and the legal analysis use cases are also
based on early insights from teams. And
so science and finance and legal. Fine. What about
something that's closer to tech? Mckay
Wrigley is both a content creator and
also a coder. One of the things that
he's called out is that he is excited
about GPT5 Pro in the coding space
specifically for architectural
decisions. And that has been one of the
areas where LLMs have historically
struggled. Defining how you put
technical systems together has been
hard. GPT5 Pro with a sizable context
window can enable you to look across
large chunks of your codebase and make
architectural recommendations about that
codebase and it reasons toward
correctness. Like, it will think through
coding best practices, run multiple
reasoning traces, all of those hallmarks
of parallel reasoning, and where it sings,
they come through and it thinks
correctly. If you want to talk about
marketing, if you want to talk about
product, and where those things have GPT5
Pro use cases, look for areas where you
have a correct or optimal decision and
you can feed the model multiple parallel
perspectives. And so if you are trying
to enter the market and your
product team and your marketing team are
there and they're trying to figure out
how to crack the market with a product:
great opportunity. Bring in some
user interviews, bring in a survey of
the market, bring in a company profile,
bring in some product opportunities. Lots
of grounding that helps GPT5 Pro reason in
parallel, and you're going to get to a
correct answer. That is the goal, right?
Like you're going to get to something
that gives you an optimal path through
all of those variables. Let's look at a
few cases where parallel reasoning
probably doesn't help. I'm going to
suggest to you that GPT5 Pro requires
you to think architecturally to the
extent that it may not help you with
thinking sequentially. And that's where
parallel reasoning can be a challenge
because it can produce an overall
coherent perspective in the ways I've
described. That's really good. But for
example, coding, which a lot of other
LLM agents are actually quite good at.
Coding is a much lower level of
decisioning than architecture. Coding
requires very sequential logic. There
are reports already coming out that GPT5
Pro can weirdly lose the plot sometimes
when it is producing code. And that is
likely because it is running multiple
plots, multiple sequential coding
threads simultaneously. So be aware of
that. You may not want to use it for
coding. Creative writing: you have to
have a narrative with a particular
singular voice. I would not use GPT5 Pro
for this. And I don't know of many
people who are, so this feels like an
easy one. But you're going to get maybe
some really coherent, thoughtful plot
feedback from this model, plot
architecture, where it's going to give
you its solution to a particular plot
problem, but it's not going to make the
bold creative choice. It's not going to
write in a particular voice. That is not
really what this model does.
Conversation. And this is a really
important LLM use case. A lot of the LLM
use cases that we see in production are
conversational use cases. This is not a
model for conversation. One, it takes a
long time. And two, human dialogue needs
consistency and personality. If it feels
robotic, which GPT5 Pro is going to feel
robotic, if it doesn't feel sequential,
if it jumps around, humans aren't
going to like it. And I think that is
part of the reason why 4o is preferred
by a lot of people and why ultimately
ChatGPT had to bring it back. So those
are a few cases. I hope they give you a
sense of where parallel reasoning works
well, where parallel reasoning doesn't
work well. The key is can you give it
the data it needs. And that brings me to
the infrastructure cost of using GPT5
Pro. Success with GPT5 Pro requires a
fundamental data restructuring that
organizations tend to underestimate.
Instead of linear documents that you
feed, it would be ideal to feed GPT5
Pro more multi-dimensional data
architectures. So if you're doing
financial analysis, feed it the core
data statements. These are facts,
metrics, these are calculations. And
then feed it perspectives. Here's a risk
lens. What we think could go wrong.
Here's a growth lens. What are the
opportunities in the space? Here's a
competitive lens with our market
positioning. Then feed it
cross-references: temporal, how metrics
change over time; relational, how
departments interact.
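The core-data / perspectives / cross-references layering just described can be sketched as a data structure. The field names and figures below are invented for illustration; this is the shape of the idea, not a required schema.

```python
# A sketch of the multi-dimensional structure described above.
# Each parallel reasoning thread gets a coherent, labeled slice.
financial_context = {
    "core_data": {  # facts, metrics, calculations
        "revenue": 1_200_000,
        "gross_margin": 0.42,
    },
    "perspectives": {  # one lens per reasoning thread
        "risk": "Customer concentration: top client is 35% of revenue.",
        "growth": "Adjacent market expansion under evaluation.",
        "competitive": "Mid-market position; two larger incumbents.",
    },
    "cross_references": {
        "temporal": "Gross margin up from 0.38 year over year.",
        "relational": "Sales headcount drives support costs a quarter later.",
    },
}

print(list(financial_context))
```

The point of the structure is that a risk thread, a growth thread, and a competitive thread can each pull a coherent data path, while the cross-references give the synthesis step something to reconcile.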
Basically, you need to start thinking of
it as giving this multiple thread
reasoning agent as much context as you
can in a very structured way because
each parallel thread will need a
coherent data path to run. And so you
want to think about how you are
orchestrating a symphony of reasoning
threads that need to maintain some
degree of coherence. One of the things
that's interesting is the Responses API
is able to maintain some chain of
thought persistence across threads. And
so if you're giving it multiple whacks
at the apple, if you're giving it
multiple attacks at the problem with
context, this kind of multi-dimensional
data architecture can let you start to
feed it perspectives that build over
time. I think the thing I want to call
out here is that most organizations
don't have the actual patience in
practice to do this. And if you're going
to use GPT5 Pro at its best, this
underlines one of the consistent themes
with AI, which is that we need to change
to take advantage of what AI brings to
the table. Our data needs to change to
take advantage of what GPT5 Pro and
other AIs bring to the table. And GPT5
Pro really forces that with a parallel
reasoning architecture. So what are the
strategic implications here? I would
argue that GPT5 Pro presents the
industry with some interesting strategic
questions. So for OpenAI, they've proven
that they can innovate on inference time
compute and they can command premium
pricing for specific use cases, but they
haven't yet shown they can expand these
use cases more generally. I've had to
spend a lot of this video talking about
where you don't use GPT5 Pro, and I
think that's indicative. Claude is not
actually an inference time compute
model. Claude Opus 4.1 is using
tools. It is interpreting, but it is not
a traditional inference time compute
model the way I've described GPT5.
That's really interesting. Anthropic has
been happy to train a model that is very
good at tool use and tool calling and
has been getting great results and great
reviews, especially in the coding arena
for that choice. Does Anthropic want to
keep going down that path? Do they want
to keep optimizing for coding because
they believe coding has so much
explanatory power long term over
technical development trajectories? Or
do they want to start to lean in on a
thinking and reasoning model? And if
they do, how does it reinforce their
core value proposition around coding and
their core value proposition around
their personality? Because people love
Claude's personality. Do they want to
risk losing that? It's an interesting
question. Google has to figure out how
they are going to get to a model with a
chat surface that is widely used and
decide where they want to apply that
reasoning power that they do have. They
have reasoning power now that they
employ to get phenomenal results in
academic and technical domains. They
have the awards for science research
and for protein folding and for math
olympiads, etc. It's not that they're
missing the knowhow here at all, nor are
they missing the technical architecture
to get it done. They have their own
separate architecture based on TPU
chips, but they have to figure out where
to productize that architectural
innovation so that they have a unique
product surface that people know to go
to Google for. And that's something that
Google has been struggling with for a
while. Right now, the reason to go to
Google is either you're already in
Google Cloud or you really want the
cheapest tokens per intelligence and you
go to Google for that. Is that enough to
sustain a strategic advantage or
strategic share of the market over time?
That's a question and I think it's a
question GPT5 Pro puts a fine point on
because what OpenAI is basically saying
is: we have a scaling paradigm here.
We're going to keep making the model
smarter, and we're kind of going to dare
you to beat us on smart reasoning models.
Anthropic has their own corner with
coding and non-reasoning models, and
Google's sort of in the middle right
now. We are entering an era of
architectural specialization. The next
breakthrough, and I think that
people need to get past this idea of
bigger models, may not be a bigger
model. It may be how we
use reasoning architecture for specific
cognitive tasks. Now that we're in the
LLM era, we may see more specialization.
That would not surprise me. So where do
I want to leave you? Intelligence is not
the same as utility. GPT5, however you
measure it, is a very intelligent model,
but its intelligence is not what makes
it a success or a failure. The key is
understanding that intelligence and
utility are diverging as we get farther
into the LLM era. And it's up to you to
figure out if parallel reasoning makes
AI smarter for the tasks that you want
to accomplish. I think we're headed
toward a future of AI stratification. I
think we're going to have deep reasoning
systems for very high stakes analysis.
We're going to have conversational
systems for daily interaction and we're
going to have specialized tools for
specific domains. The dream of one model
that's better at everything is, I think,
dead. I don't think it's happening. And I think
what's ironic is it's killed by the very
GPT generation that promised the one
model better at everything. I think what
GPT5 Pro is showing us is that it's
possible to have a model that is indeed
better and also in some ways worse than
its predecessors. There will not be one
model to rule them all. And so the
question for you isn't whether GPT5 Pro
is worth $200 a month. It's whether you
can define use cases that fit better
with specialized tools or with deep
reasoning systems or with conversational
systems. If you are a conversational
model person, do not pay the $200 a
month. If you are a deep reasoning
person, well, now you have to think
about the analysis and whether you have
the data to get ready and then maybe
you're ready for GPT5 Pro. And if you're
someone who only uses specialized tools,
maybe you're not even using ChatGPT at
all. This is the opening move in a new
AI game where architectural
differentiation is going to matter more
and more. And that is why I've spent so
much of this video explaining
architectures and how they work and why
GPT5 Pro is different. I hope this has
been helpful. I hope you have a sense of
where to use GPT5 Pro or whether or not
to get it at all. Cheers.