# From Keywords to AI Search

**Source:** [https://www.youtube.com/watch?v=iVUMuC7OzUI](https://www.youtube.com/watch?v=iVUMuC7OzUI)
**Duration:** 00:12:01

## Key Points

- Traditional search relied on keyword matching, TF‑IDF weighting, and PageRank link analysis, which struggled with context, synonyms, and user intent.
- The introduction of transformer‑based models like BERT (2019) and MUM brought deep natural‑language understanding to search, enabling more accurate interpretation of queries.
- Modern AI search pipelines begin with natural‑language query processing, using an LLM’s NLU capabilities to infer the user’s intent and nuances.
- Retrieval now leverages vector embeddings and semantic (vector) search, matching query vectors with document vectors to find conceptually related content rather than exact keyword matches.
- Large language models generate direct answers from the retrieved information, shifting search from simply providing links to delivering concise, context‑aware responses.

## Sections

- [00:00:00](https://www.youtube.com/watch?v=iVUMuC7OzUI&t=0s) **Evolution of AI-Powered Search** - The speaker explains how search has progressed from basic keyword matching and TF‑IDF to link‑based ranking with PageRank, and finally to transformer models like BERT, MUM, and modern large language models that grasp context, intent, and generate direct answers.
- [00:03:15](https://www.youtube.com/watch?v=iVUMuC7OzUI&t=195s) **Embedding-Based Retrieval Augmented Generation** - The speaker outlines how text is converted into semantic vectors, matched in a vector database, and combined with retrieved snippets in an LLM to produce cited, trustworthy answers, concluding with a feedback loop.
- [00:06:27](https://www.youtube.com/watch?v=iVUMuC7OzUI&t=387s) **AI Search vs Traditional Search** - The speaker contrasts traditional search’s limited memory and list-based results with AI search’s contextual, multi‑turn conversation and synthesized answers, highlighting the resulting challenges for SEO.
- [00:09:37](https://www.youtube.com/watch?v=iVUMuC7OzUI&t=577s) **EEAT Optimization and Formatting Relevance** - The speaker explains applying Google's E‑E‑A‑T principles to make content machine‑readable while noting that traditional HTML formatting like H1s still helps SEO but is less critical for AI‑driven retrieval.

## Full Transcript
AI search is transforming how we locate and consume information online,
but how?
Well, back in the day, search engines were pretty simple because they were based more or less just on keyword search.
They matched words in a user's query to words in documents.
There were several methods they would use for that, including things like Boolean keyword matching.
That was one method to do it.
Now, keyword search has moved on since then. Algorithms such as TF-IDF
rank documents by term frequency and inverse document frequency, and that helps improve relevance by assigning more weight to distinctive terms.
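The TF-IDF weighting described above can be sketched in a few lines. This is a minimal illustration with a hypothetical three-document corpus; production systems normalize and smooth these scores in various ways.

```python
import math

def tf_idf(term, doc, corpus):
    """Score a term in one document against a small corpus.

    tf: how often the term appears in the document.
    idf: log of (corpus size / documents containing the term),
    so rare terms get more weight than common ones.
    """
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in corpus if term in d)
    idf = math.log(len(corpus) / df) if df else 0.0
    return tf * idf

# Hypothetical mini-corpus of tokenized documents.
corpus = [
    ["apple", "pie", "recipe"],
    ["apple", "iphone", "launch"],
    ["banana", "bread", "recipe"],
]

# "recipe" appears in two of three documents, "pie" in only one,
# so "pie" carries more weight in the first document.
print(tf_idf("pie", corpus[0], corpus) > tf_idf("recipe", corpus[0], corpus))
```

The inverse-document-frequency factor is what keeps ubiquitous words from dominating the ranking.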
And Google's breakthrough in the late 1990s was called PageRank, and that added link analysis to judge a page's authority.
But traditional keyword search
has some clear limitations:
it can't truly understand context, synonyms, or user intent.
So when my search string includes the word Apple, am I referring to the fruit or the tech company?
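Before moving on, the PageRank idea mentioned above can be sketched as a simple power iteration. The three-page graph here is hypothetical, and real implementations handle dangling pages and scale to billions of nodes.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively estimate page authority from the link graph.

    links maps each page to the pages it links to. Each round,
    a page splits its current rank evenly among its outgoing links.
    """
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / len(pages) for p in pages}
        for page, outgoing in links.items():
            share = rank[page] / len(outgoing)
            for target in outgoing:
                new[target] += damping * share
        rank = new
    return rank

# Tiny hypothetical web: two pages link to "home", so it ends up
# with the most authority.
graph = {"home": ["about"], "about": ["home"], "blog": ["home"]}
ranks = pagerank(graph)
print(max(ranks, key=ranks.get))  # "home"
```

The damping factor models a surfer who occasionally jumps to a random page instead of following links.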
Well, enter machine learning and the world of AI search.
So technologies like BERT, which Google brought into search in 2019,
introduced a transformer-based language model into search, helping it better understand the context of natural language queries.
And that was followed two years later by MUM, that's Multitask Unified Model, a much more powerful model than BERT to both understand and generate language,
And then today we have large language models,
where the AI generates an answer rather than just retrieving links.
So how does AI Search powered by large language models actually work?
Well, we can think of it in four stages and at the top here, first of all, we've got the natural language that's coming in.
Specifically, we're gonna perform natural language query processing.
So when a user asks a question in plain language, the system uses an LLM to interpret the query.
That uses the LLM's natural language understanding capabilities, its NLU, to parse the query's intent and nuances.
So if I ask what's the best way to peel an orange,
well, the system recognizes I'm probably looking for a method or a tutorial, even though the query doesn't explicitly contain those words.
We've moved far beyond the old days of keyword matching here.
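One way to picture this stage is as a prompt sent to an LLM asking it to classify the query's intent. Everything here is a sketch: `call_llm` is a hypothetical stand-in for whatever model client a real system uses, and the intent labels are illustrative.

```python
def build_intent_prompt(query):
    """Assemble a prompt asking an LLM to classify a search query's intent."""
    return (
        "Classify the intent of this search query as one of: "
        "informational, navigational, transactional, tutorial.\n"
        f"Query: {query}\nIntent:"
    )

def call_llm(prompt):
    # Hypothetical stand-in: a real pipeline would send the prompt to a
    # hosted model. A canned answer keeps this sketch runnable.
    return "tutorial"

prompt = build_intent_prompt("what's the best way to peel an orange")
print(call_llm(prompt))  # a how-to query maps to the "tutorial" intent
```

The inferred intent then steers retrieval, for instance by preferring step-by-step content for tutorial-type queries.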
Now, with intent established...
we move to the next stage, which is retrieval.
Now, instead of relying solely on keyword matching, although that does still play a part, AI search often uses vectors.
Specifically, it uses vector search to find relevant documents semantically.
Now, how does that work?
Well, text, both search queries and documents, is encoded into numerical vectors.
Those are called embeddings.
And those vectors capture semantic meaning.
The user's query vector is then matched with vectors of documents in a vector database to find the content that is conceptually related.
This allows, for instance, a query about puppy play things
to retrieve an article that talks about dog toys, even though the wording differs because these terms are semantically similar.
Who's a good boy?
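The matching step above boils down to comparing vectors, most commonly by cosine similarity. The three-dimensional embeddings below are made up for illustration; real embedding models produce hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Measure how closely two embedding vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: "puppy play things" and "dog toys" share no
# words, but their vectors land close together in embedding space.
embeddings = {
    "puppy play things": [0.9, 0.8, 0.1],
    "dog toys":          [0.85, 0.75, 0.15],
    "tax filing guide":  [0.05, 0.1, 0.95],
}

query = embeddings["puppy play things"]
best = max(
    (doc for doc in embeddings if doc != "puppy play things"),
    key=lambda doc: cosine_similarity(query, embeddings[doc]),
)
print(best)  # "dog toys" is retrieved despite the different wording
```

A vector database performs essentially this comparison, but with approximate nearest-neighbor indexes so it scales to millions of documents.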
Now the next stage is answer generation.
So this is where retrieval has now happened,
and we've retrieved some relevant documents, or actually, more likely, not entire documents but really snippets of those documents.
And now an LLM is given the query along with those retrieved snippets, and it generates a cohesive answer in natural language,
using those sources of information.
Now regular viewers of this channel probably recognize what this is.
It's our old friend RAG or retrieval augmented generation where the LLM's knowledge is augmented with up-to-date retrieved data.
By grounding its answer in retrieved facts the AI search system can provide current and accurate information.
The generated answer can include citations
linking back to the original sources,
which is a level of transparency that's important for gaining a user's trust, showing that the answer is not just hallucinated by the model.
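The grounding step in this RAG stage amounts to assembling a prompt that pairs the query with numbered, attributed snippets. This is a minimal sketch; the snippet contents and source names are hypothetical, and real systems add instructions about tone, length, and refusal behavior.

```python
def build_rag_prompt(query, snippets):
    """Combine the user's query with retrieved snippets so the LLM
    grounds its answer in them and can cite each source by number."""
    context = "\n".join(
        f"[{i + 1}] ({s['source']}) {s['text']}" for i, s in enumerate(snippets)
    )
    return (
        "Answer the question using only the sources below. "
        "Cite sources by number, e.g. [1].\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

# Hypothetical snippets; a real pipeline would pull these from the
# vector database in the retrieval stage.
snippets = [
    {"source": "citrus.example.com", "text": "Score the peel, then pull it off in quarters."},
    {"source": "kitchen.example.org", "text": "Rolling the orange first loosens the peel."},
]
prompt = build_rag_prompt("What's the best way to peel an orange?", snippets)
print(prompt)
```

Because each snippet keeps its source label inside the prompt, the model can emit the citation markers that link the answer back to the original pages.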
Now, the final stage in all of this is the feedback stage.
Many AI search implementations learn from feedback to improve.
So users might give a thumbs up or a thumbs down, or the system observes follow-up queries to figure out if the answer was helpful.
This data can fine-tune the LLM and the retrieval component over time.
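A minimal sketch of that feedback loop: log thumbs-up/thumbs-down votes per query-document pair, then aggregate them into a helpfulness score. The signal names here are illustrative; real systems also mine implicit signals like follow-up queries.

```python
from collections import defaultdict

def record_feedback(log, query, doc_id, helpful):
    """Log a thumbs-up (True) or thumbs-down (False) for a query/document pair."""
    log[(query, doc_id)].append(1 if helpful else 0)

def helpfulness(log, query, doc_id):
    """Fraction of positive votes; usable later to rerank retrieval
    or to select fine-tuning examples for the LLM."""
    votes = log[(query, doc_id)]
    return sum(votes) / len(votes) if votes else 0.0

log = defaultdict(list)
record_feedback(log, "peel an orange", "doc-42", helpful=True)
record_feedback(log, "peel an orange", "doc-42", helpful=True)
record_feedback(log, "peel an orange", "doc-42", helpful=False)
print(round(helpfulness(log, "peel an orange", "doc-42"), 2))  # 0.67
```

Documents that consistently score low for a query can be demoted in future retrievals, closing the loop the speaker describes.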
So how do traditional search and AI search powered by large language models compare?
Well, in response format, what does traditional search return?
It typically returns a list of links for a user to click through,
but AI search doesn't provide a list of links,
it provides a direct answer to whatever it is you were searching for, to your query, in natural language.
It's generated original content on the fly. Now, as for query understanding, traditional search, as we've already mentioned, is primarily keyword based,
whereas AI search is based on NLU, or natural language understanding, to derive context and intent.
And speaking of context, when it comes to contextual awareness, a traditional search, that is pretty independent.
What I mean by that is it has a limited memory of a user's previous interactions.
Whereas, AI search, that really keeps the context in mind.
It maintains context, allowing a multi-turn conversation, allowing follow-up questions that understand references to earlier parts of a dialog.
And when it comes to information synthesis, well, traditional search, that really separates results out.
So, different sources,
different lists, whereas AI search, it really combines information.
It takes information from multiple sources and puts them into one coherent answer.
Now AI search isn't just changing how results are displayed,
it's really challenging how the entire web has been built, because for years websites have been optimized for traditional search engines using a practice called SEO,
search engine optimization, to rank as high as possible in results pages.
But what happens now when the result of an AI search isn't a list of links, but instead is written text incorporating snippets from multiple web page sources?
Well, that's a good question for Donna Bedford.
Donna Bedford, Global SEO at Lenovo.
Now, for years, Donna has been making sure that her company's web pages rank as highly as possible in search engine results.
So Donna, if somebody wanted to make their content a bit more AI-friendly today, where would they start?
Well, the great news is they don't have to start afresh.
What they're already doing for traditional search is gonna work for them.
What they need to do is like up the game, but have a narrow focus.
So what you're gonna do is focus on two real things.
One, think human.
And two, think like the machine.
So what do I mean about that?
So AI is still a machine.
It still has to come and find your information.
It still has to work it out.
So you want to make it as easy as possible, bite-sized chunks, good structure, good navigation, so it understands.
And you want a complete journey.
You want everything in there so it understands.
But you also have to tackle the human element. Whereas traditional search tends to be singular words, a couple of words,
this is more conversational.
It is definitely more a personal journey.
So you need to start writing like a human might ask.
Okay, so addressing both sides of it now that makes me think about keyword counts
because I know in the old days you would kind of want to stuff a webpage with as many keywords as possible to get as high a count as possible.
Is that old news now?
It's kind of old news but there is a variable in it that works.
So what you're talking about is like keyword density, how many times can you write the exact match word on the page.
What we're now extending out is using a set of guidelines that Google came out with a number of years ago
and that is commonly used, which is called E-E-A-T.
E-E-A-T, and it's actually two E's in there now; originally it was just one. So we're talking about experience, expertise, authority, and trust, right?
So as you mentioned, traditionally the signals here are things like a number of links and all sorts of things.
Here, what you're trying to do is give the full experience to the machines, to the AI,
to tell them that you have the expertise, the authority, the experience, the trust, and you're a trusted source for this information.
So you write like a human, but you give the information that a machine needs to logically make the response.
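One concrete, machine-readable way to surface those E-E-A-T signals is structured data, such as schema.org Article markup with author credentials. The sketch below builds the JSON-LD a page might embed; the headline, author name, and credentials are hypothetical.

```python
import json

def article_metadata(headline, author, credentials, date_published):
    """Build schema.org Article JSON-LD, one machine-readable way to
    signal experience, expertise, authority, and trust to crawlers."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": date_published,
        "author": {
            "@type": "Person",
            "name": author,
            "description": credentials,
        },
    }

# Hypothetical article and author, purely for illustration.
markup = article_metadata(
    "How to Choose a Laptop",
    "A. Reviewer",
    "10 years reviewing consumer hardware",
    "2024-01-15",
)
print(json.dumps(markup, indent=2))
```

Embedded in a page's HTML inside a `script type="application/ld+json"` tag, markup like this tells a crawler who wrote the content and why they are credible.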
Okay, gotcha.
So one more question for you.
Okay.
I want to know how much formatting matters.
So formatting, like making sure you're using H1s and stuff like that.
When we think now that AI is putting information from all sorts of different web pages rather than just a single page, so does formatting stuff still matter?
So it does, but not in the same way.
So traditional search engines, you'll use like H1 to tell the search engine how important an element is or what your page is about.
In most cases, whatever you do for the AI is gonna benefit your traditional search and traditional search is not going away, right?
But there's a gotcha in here that you have to watch out for.
I'm saying you make it better every time, but there's one particular element that you're actually gonna have to step back on.
And that's JavaScript.
Traditional search engines at the beginning had a problem with JavaScript.
They've managed to solve that.
The AI models haven't.
So they have an issue with the JavaScript.
So you just wanna make sure that, again, going back to the very first question, your site is crawlable and navigable, and that they can find the information.
Because if they can't find the information about you, they can't have a story about you.
That makes a lot of sense.
Well, thank you, Donna.
So that's AI Search.
It's changing both how users locate and consume information online, and even how that information is represented online in the first place.