Switch Models, Prompt Smarter
Key Points
- The video’s first goal is to steer users away from defaulting to GPT‑4o and instead adopt stronger reasoning models such as OpenAI o3, Claude Opus 4, or Gemini 2.5 Pro, which deliver better performance and tool‑use transparency.
- After selecting a superior model, the second goal is to simplify prompting by focusing on a handful of evidence‑based, memorable techniques rather than overwhelming users with dozens of tips.
- The presenter distilled core prompting principles by reviewing guides from Anthropic, Google, OpenAI, and third‑party sources, identifying the most reliable strategies that actually improve results.
- Effective prompting in 2025 emphasizes leveraging the model’s reasoning capabilities—asking it to think step‑by‑step, generate multiple candidate answers, and then evaluate or compare them.
- By consistently using these concise, tested methods with a better model, users can achieve measurable gains without needing to study extensive prompting literature.
Sections
- Upgrade Model, Simplify Prompting - The speaker explains the importance of moving to a superior AI model and introduces a concise, proven set of prompting principles that are easy to remember and apply without sifting through extensive guides.
- Three Prompting Strategies - The speaker outlines three validated techniques for improving LLM performance: request tool-generated code for math, generate multiple responses for self‑consistency, and have the model create a step‑by‑step plan before execution.
- Prompt Engineering: Guardrails & Positioning - The speaker outlines three essential prompting strategies—constructing comprehensive guardrails and edge‑case handling, positioning critical instructions in the first and last 10% of the prompt, and prioritizing negative over positive examples—to keep probabilistic models reliably aligned with user intent.
- Prompt Optimization via Uncertainty Checks - The speaker outlines techniques—uncertainty probing, capability discovery, and self‑improvement loops—to refine prompts, expose hidden ambiguities, and align model responses with realistic abilities.
- Beyond GPT-4o: Prompt Best Practices - The speaker outlines structural prompting guardrails, self‑consistency, tool use, and planning techniques—effective for newer inference models—and urges users to stop relying on legacy GPT‑4o.
**Source:** [https://www.youtube.com/watch?v=hMKRBldkWEk](https://www.youtube.com/watch?v=hMKRBldkWEk)
**Duration:** 00:15:10
**Section timestamps:** [00:00:00](https://www.youtube.com/watch?v=hMKRBldkWEk&t=0s), [00:03:39](https://www.youtube.com/watch?v=hMKRBldkWEk&t=219s), [00:06:47](https://www.youtube.com/watch?v=hMKRBldkWEk&t=407s), [00:09:57](https://www.youtube.com/watch?v=hMKRBldkWEk&t=597s), [00:13:03](https://www.youtube.com/watch?v=hMKRBldkWEk&t=783s)
Full Transcript
We're going to do two things today.
Number one, we are going to help you
into a better model. I sound like a used
car salesman, but we're going to get you
into a better AI model, and I'm going to
explain why it matters. And number two,
we're going to talk about the state of
prompting, and how prompting with that
better model is something that you can
learn to do without reading the
thousands of pages of tips and guides.
And frankly, to get ready for this
video, I did a ton of research on those
prompts and guides. So, you're not
missing out. I looked at Anthropic, I
looked at Google, I looked at third-party
guides, I looked at OpenAI's guides. I
wanted to see what are the overall
principles of prompting now that we have
reasoning models that we can start to
pull out and name and really drill on
clearly so that there's just a few
things you can learn and take away with
you that are easy to remember so your
prompting actually gets better. That is
my goal with this video. I don't want
you to remember 25 things that you
need to think about when you prompt. I
want you to have a clear model you can
pick and I want you to have very
memorable prompting tidbits that like
actually make a measurable difference
that have been tested to work and that
there's not very many of. So let's get
into it. Number one, finding a better
model. Don't just use default ChatGPT. That is almost always what people use when they say they use AI, and that goes for CEOs. I have talked with CEOs who think there is no better model than GPT-4o, because four is bigger than three, and ChatGPT is obviously the best. Neither of those two statements is as true as you might think. Here are three better models you can pick, all much better than 4o. OpenAI o3: it's a reasoning model. I don't care what the name says. It's a little colder personality-wise, but it does a lot of work. It's my daily driver.
Claude Opus 4. Fantastic model. Great
writer. It thinks things through. It's a
really, really good reader and it
exposes its tool use transparently.
Gemini 2.5 Pro, very strong model. You can go and use it in Google's Vertex or in a lot of other places now that they're starting to expose it. It's
a thinking model, a reasoning model. It
also does tool use. It has a nice big
context window and it works fast. These
are all great choices. I'm not here to
give you an extensive discussion about
o3 versus Opus 4 versus Gemini 2.5 Pro, because frankly that is a 5% of the population problem. The 95% of the population problem is getting people to stop using 4o. If we just decided to use o3, our collective perception of
what AI could do would significantly
improve. Okay, that being said, once you
were using a better model, a reasoning
model, a model that takes its time to
think, which is what you get with o3,
what do you do to prompt in a way that
makes sense in 2025? I want to give you
a few evidence-based techniques that
actually work. Number one, ask the model
to generate a few responses and then
check the responses for consistency. And
I'm not saying a few different responses
to different questions. I'm actually
saying ask the model to generate
optionality and then to check responses
for consistency. And so basically an
example of that would be saying, give me
five ways that you could define the
answer to this question. Whatever the
question might be, maybe tell me what an
amphibian is, right, for biology. Give
me five different definitions and check
them for consistency. If you're solving
a coding problem, give me five possible
solutions and check them for
consistency. Having the model check its
work across multiple options is really
cheap because producing those new
solutions is relatively easy for these
reasoning models. And having them check
their work ensures that you are actually
getting the benefit of that reflection
or inference boost that you get from a
reasoning model. The second technique I
want to call out is program of thought: ask the model to solve the problem with math or code. If you have a math problem, instead of saying "please explain how you solve this," just say "write a function to solve this" and suggest that it call a tool. These inference machines, o3, Opus 4, Gemini 2.5 Pro, they can all write
Python code. They can write code to
solve these math problems. And that
makes them much more accurate with
numbers. Call for the use of the tool in
the prompt. It's called program of
thought, whatever you want to call it.
Like you're basically programming it to
call the tool by asking for it. It's not
that hard. Number three, plan and solve.
Ask it to create a step-by-step plan for
a particular task first and then you get
into execution. I do this with writing
all the time or with anything that I
need to do from a software perspective.
Create a step-by-step plan, too. I just
ask it to lay out its thinking and then
I critique that thinking. It's a huge
step forward. It makes a big difference.
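That plan-and-solve flow splits naturally into two turns. A minimal sketch, with hypothetical `plan_prompt`/`solve_prompt` helpers (these names and templates are illustrative, not from the video):

```python
def plan_prompt(task: str) -> str:
    """First turn: ask for a step-by-step plan only, no execution yet."""
    return (
        f"Task: {task}\n\n"
        "Before doing anything, lay out a numbered step-by-step plan for "
        "how you would complete this task. Do not execute it yet."
    )

def solve_prompt(task: str, approved_plan: str) -> str:
    """Second turn: execute the plan after you've critiqued and approved it."""
    return (
        f"Task: {task}\n\n"
        f"Follow this plan:\n{approved_plan}\n\n"
        "Now execute it step by step."
    )
```

You critique the plan between the two turns; that critique step is where most of the benefit comes from.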
Plan and solve. Plan and solve. So
already there's three things that you
can remember, right? Ask it to generate
multiple responses, which is easy to do.
and then critique them for consistency.
That's a self-consistency tip. Ask it to
use tools to solve math problems. That's
a program of thought tip. Ask it to plan
and solve. By the way, I'm not making
these up. These are actually validated.
These are the creme de la creme. These
are the examples that are standing out
as I survey the use of reasoning models
across thousands and thousands of pages
of prompt engineering best practices.
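The self-consistency tip above can be sketched in a few lines. Here `sample` is a hypothetical stand-in for one model call returning a candidate answer; in a chat UI the equivalent is simply "give me five possible answers and check them for consistency":

```python
from collections import Counter

def self_consistent_answer(sample, question, k=5):
    """Draw k candidate answers, then keep the one they agree on most.

    `sample` stands in for a single model call; majority voting across
    candidates is what "check them for consistency" amounts to.
    """
    candidates = [sample(question) for _ in range(k)]
    answer, votes = Counter(candidates).most_common(1)[0]
    return answer, votes / k  # winning answer plus a rough agreement score
```

A low agreement score is itself a signal: if the candidates disagree, the model is guessing, and you want to know that before trusting any single answer.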
I'm trying to give you the stuff that is
actually easy to remember, easy to
implement, and that you will see real
benefit from right away. I'm not trying
to give you fancy magic words. I'm
trying to give you ways to engineer and
work with these machines. So, with that
in mind, I want to go below the level of
the prompt. And I want to suggest that
you also need to understand a couple of
structural principles for whatever
prompt you're writing in 2025. Number
one, understand that you really need to
give guardrails and edge cases. You
can't just prompt for the happy path. If
unable to X, then do Y. Please privately reflect on X, then summarize Y, then do Z. You need to give it ways to handle
the content you're giving it explicitly.
Too many people stop at that here's the
actual instructions happy path and don't
do anything to provide structure around
how the prompt is executed or edge cases
or guardrails. I have so many prompts
that I've put together and shown that
illustrate this but the core concepts
remain really consistent, and it works regardless of whether you're in a Claude model, an OpenAI model, or a Google model. You want to be clear
with your constraints and edge cases.
You want to be clear with how you handle
fallbacks. You want to be clear with the
output structure. The actual
instructions just need to be very clear
and live nested inside a format that
allows the model to know what your
intent is if things don't go perfectly
well or if it starts to stray far afield from the actual instructions.
Essentially,
the actual instructions are the motor
and you have to build the ship around
the motor through the guardrails and
edge cases and all of that to get where
you want to go. Otherwise, if you just
stick a motor in the water, what's it
going to do? It's going to sink no
matter how well it runs. You can have a
perfectly good set of actual
instructions and it just doesn't go
anywhere because it doesn't teach the
model how to handle all of the things
that can go wrong when a probabilistic
model tries to understand your intent
like that. So, I always say look for
guardrails and edge cases. Claude's system
prompt is 90% guardrails and edge cases
by the way. 90%. Just think about that.
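One way to make the guardrails habit concrete is to assemble prompts from parts, so the fallback and output structure can never be forgotten. A minimal sketch; the `build_prompt` helper and its section wording are assumptions for illustration:

```python
def build_prompt(instructions: str, fallback: str, output_format: str) -> str:
    """Wrap the happy-path instructions with guardrails: an explicit
    fallback for when things fail, and an output shape that always holds."""
    return "\n\n".join([
        f"Instructions:\n{instructions}",
        # Guardrail: tell the model what to do instead of letting it improvise.
        f"If you are unable to follow the instructions, do this instead:\n{fallback}",
        f"Output format (use this even when falling back):\n{output_format}",
    ])
```

The point of the helper is structural: you physically cannot ship a happy-path-only prompt, because the fallback and format are required arguments.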
Okay, that's principle one. If you're
prompting in 2025, principle two,
context positioning. Attention is not
uniform. Put your critical instructions
in the first 10% of the prompt. You can
put examples, you can put data in the
middle, and then you reiterate with key
constraints at the end. You want to
assume that the model needs to anchor on
this prompt and it takes the first part
and the last part of that prompt very
seriously. Principle number three,
negative examples are more critical to
include than positive examples. It is good to include positive examples. It
is even more important to include don't
do X examples. Show failure modes
explicitly and say avoid it. For
example, if you're writing and you want
it to sound like it's not silly, you
might say, "Never use phrases like 'in conclusion' or never use phrases like 'wrapping all of this up.'" Whatever you
want to avoid and then it will just do
that. It will know what bad looks like.
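Following the advice above, concrete "never do X" examples work best when couched inside a general behavior to avoid. A small sketch of that pattern; the `add_negative_examples` helper is a hypothetical name for illustration:

```python
def add_negative_examples(prompt: str, banned_phrases: list[str]) -> str:
    """Append concrete failure-mode examples, framed by the general
    behavior to avoid, so the model knows what bad looks like."""
    lines = [
        prompt,
        # General statement first, then specific examples under it.
        "Avoid generic filler phrasing. In particular, never use phrases like:",
    ]
    lines += [f'- "{phrase}"' for phrase in banned_phrases]
    return "\n".join(lines)
```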
Now, you're not going to be able to
describe it exhaustively and so you're
going to need to couch those bad
examples inside a general statement that
says this is the overall behavior to
avoid and here's an example. But it is
really important to include those
negative examples so you don't run into
issues. So, let's just sort of review where we've been. We talked first
about evidence-based techniques that
really work like self-consistency where
you make it generate multiple responses
and make it self-consistent across those
responses which helps it to reduce
random incorrect answers. We talked
about program of thought where you ask
it to use a tool like Python to solve a
problem. We talked about planning first
and then solving. So, those are specific
tips. We then talked about structural
principles: the idea of having
guardrails and edge cases, the idea of
having context positioning, first 10%,
last 10% of the prompt being more
important. We talked about negative
examples beating positives just now. I
want to give you one big insight at the
end here that really unlocks where
prompting is at. The key insight is
this. Models know themselves better than
we know them at this point. And so part
of what you're doing is you're using a
technique called metaprompting to help
the models reveal what they know to you
in a way that's usable. Again, we go
back to the fact that we started this
video moving from 4o to o3. If you are not using a better inference model, this will not go as well. So go back, use a better inference model, and the inference model will help you when you're doing metaprompting. Let me
give you a few examples with specific
prompt phrases that you can use that are
in the spirit of metaprompting or
getting the model to help you prompt
itself. A self-improvement loop. Here's
my current prompt. Just write it out.
How would you improve this prompt to get
better results from you? It will come
back with an answer. Very simple. You
can write that different ways, but
oftentimes it's going to tell you
things it knows about prompting that you
didn't know for free. Isn't that great?
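That self-improvement loop is just a template you wrap around whatever prompt you're working on. A minimal sketch, using the exact question from the transcript (the `improvement_metaprompt` name is an assumption for illustration):

```python
def improvement_metaprompt(current_prompt: str) -> str:
    """One turn of the self-improvement loop: show the model your
    current prompt and ask it to critique the prompt for you."""
    return (
        "Here is my current prompt:\n\n"
        f"{current_prompt}\n\n"
        "How would you improve this prompt to get better results from you?"
    )
```

Feed the answer back into the next draft of your prompt and repeat; that iteration is what makes it a loop rather than a one-off question.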
Number two, uncertainty check. What
parts of this request are unclear or
ambiguous? What assumptions are you
making? What additional information
would help you execute this prompt with
more accuracy? This pushes the model to
voice hidden uncertainties that it would
otherwise infer. It helps to prevent
overconfident hallucination. Capability
discovery is another one. How would you
approach this if you had no constraints?
What would be your ideal process? What
tools or information would help you?
This reveals what the model thinks it's
capable of, which by the way is not
always true. But often times you get
suggestions for approaches that you
hadn't considered before. And the model
will reveal to you an overall approach
and desire for information that you can
help it with. And then you can wind it
back from imagined capabilities that
aren't true and get to a point where
it's actually a useful prompt that has
all the information that it needs. See,
even those things, a self-improvement
loop, an uncertainty probe, just
checking to see if things are unclear,
checking for the edge of capabilities
that models have. These are techniques
to help the model help you. It helps the model understand what it's able to do in order to support you as you
prompt. But they're not the only ones.
There's other things that sort of fit in
this metaprompting bucket. Explain your
reasoning step by step. What parts are
you most or least confident about?
Sometimes they hide the reasoning. And
sometimes it's actually good to have prompts that hide reasoning; that can make sense and not distract from the output. But when you want to
diagnose it, even if the reasoning isn't
pure in the sense that it's exactly what
it thought, asking the model to go back
and think step by step is still really
helpful and it helps you understand what
the model thinks about the work that
it's done so far. You can also use
really old human techniques and the
model has read so much human literature
it actually works. The Socratic method works. Why did you choose that approach? I will ask the model that.
What alternatives did you consider? I ask the model that all the time. What
would you change if I changed this
constraint? I also ask that. It uncovers
implicit assumptions from the model,
which is really, really interesting. So,
these are not random tips. I don't want
you to walk away from this video and
think it's a bucket of random tips and
an ask to not use 40. Instead, what I am
surfacing are the best practices that
have popped up again and again and again
and again in this survey of literature
that I've done. And so, when I look
across all of prompting literature, I
see some of these themes emerge and I
don't see them talked about as clearly
as they need to be. And so, my goal with this video has been to be crystal clear. You want to use metaprompting.
You want to ask the model to help you
prompt. There are documented ways to do
that: the self-improvement loop, the
uncertainty probe, checking for
capabilities, making sure that you
understand how to ask for an explanation
of work, understanding how to ask for clear reasoning through the Socratic method. It's not just me saying that. I'm trying to actually surface the best practices so they're accessible. The
structural principles I called out
around guardrails and edge cases, around
context positioning, around adding
negative examples, these are structural,
too. Like, they're really helpful to
have regardless of model. And the
evidence-based techniques that work,
they're evidence-based for a reason.
They're repeated for a reason. You want
to have clear self-consistency so that
you can give the model the chance to
generate lots of options and then
reinforce the elimination of
hallucinations by forcing it to make
those options consistent. You want to
give the model a chance to call tools,
program of thought. You want to give the
model a chance to plan first. And again,
I'll go back to the very beginning.
These work because we have inference
models. And for so many people, the
reason why all of these tips are
frustrating and why people keep earning
clicks making all of these tips in
prompting technique articles on the
internet is because everyone's using
GPT-4o and we're not talking about
that as a problem. It is a problem if
you are using a model that is last
generation. The techniques I'm describing here will inform the rest of 2025; they will be helpful for GPT-5 too. But it's hard to make them work if all you're doing is sticking with 4o.
So, please move beyond GPT-4o. Now, eventually GPT-5 may come out and
they will literally drop it from the
menu and you can't use it anymore. And
when that happens, that's fantastic
because people are going to get access
to a reasoning model they didn't know
they had. But again, at that point,
they're still going to need prompting
techniques that work. They're still
going to need to understand how to use
these models usefully.
I hope that this has been a helpful
summation of the literature. Are there thousands of pages and many, many more things to discover? There are, and I've written up more on this so that you can go more in depth. But from an
overall survey perspective talking with
hundreds of people looking across the
population, this is what I would want to
say to most people about AI. Basically,
just a few techniques that will help you dramatically improve your prompting, plus picking a better model. Sometimes that's all it takes to get better at AI.