Vector Databases: Bridging the Semantic Gap

Key Points

  • Vector databases store data as mathematical vector embeddings—arrays of numbers—that capture the semantic essence of unstructured items like images, text, and audio.
  • Traditional relational databases rely on structured metadata and manual tags, which creates a “semantic gap” that makes it difficult to query for nuanced concepts such as similar color palettes or scene content.
  • In a vector space, similar items are positioned close together and dissimilar items far apart, enabling similarity searches through simple distance calculations.
  • By converting complex, unstructured data into embeddings and indexing them in a vector database, you can perform efficient, semantically aware retrieval that goes beyond exact keyword matching.
  • This approach allows queries like “find images with similar landscapes” or “retrieve audio clips with similar tones,” which are impractical with conventional SQL‑style queries.
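The "close together in vector space" idea from the points above can be sketched with a toy cosine-similarity function. The three-dimensional vectors here are made up for illustration (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """1.0 means same direction in vector space; values near 0 mean unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up 3-dimensional embeddings; real models emit far more dimensions.
mountain_sunset = [0.91, 0.15, 0.83]
beach_sunset    = [0.12, 0.08, 0.89]
city_street     = [0.05, 0.97, 0.20]

# The two sunsets score as more similar to each other than to the city scene.
print(cosine_similarity(mountain_sunset, beach_sunset))
print(cosine_similarity(mountain_sunset, city_street))
```

A similarity search is then just this calculation repeated against every stored vector, keeping the highest scores.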

Full Transcript

**Source:** [https://www.youtube.com/watch?v=gl1r1XV0SLw](https://www.youtube.com/watch?v=gl1r1XV0SLw)
**Duration:** 00:09:36

## Sections

- [00:00:00](https://www.youtube.com/watch?v=gl1r1XV0SLw&t=0s) **Bridging the Semantic Gap** - The speaker explains how relational databases store image files and basic metadata but fail to capture semantic context, highlighting the need for vector databases to enable similarity-based queries.
- [00:03:20](https://www.youtube.com/watch?v=gl1r1XV0SLw&t=200s) **Understanding Image Vector Embeddings** - The speaker explains that image embeddings are numeric vectors whose dimensions encode learned visual features, illustrating this by comparing a mountain scene and a beach sunset and showing how similar dimensions reflect shared attributes like warm colors.
- [00:06:25](https://www.youtube.com/watch?v=gl1r1XV0SLw&t=385s) **Layered Feature Extraction & Vector Indexing** - The passage explains how embedding models progressively abstract data into high-dimensional vectors and why vector indexing is essential for fast similarity search across massive vector databases.
- [00:09:28](https://www.youtube.com/watch?v=gl1r1XV0SLw&t=568s) **Dual Role of Vector Stores** - The passage explains that these systems act both as repositories for unstructured data and as engines for rapid, semantic retrieval.

## Full Transcript
0:00 What is a vector database? 0:02 Well, they say a picture is worth a thousand words. 0:04 So let's start with one. 0:06 Now in case you can't tell, this is a picture of a sunset on a mountain vista. 0:12 Beautiful. 0:13 Now let's say this is a digital image and we want to store it. 0:18 We want to put it into a database, and we're going to use a traditional database here called a relational database.

0:29 Now what can we store in that relational database of this picture? 0:34 Well, we can put the actual picture binary data into our database to start with, 0:41 so this is the actual image file, but we can also store some other information as well, 0:45 like some basic metadata about the picture. That would be 0:50 things like the file format and the date that it was created, stuff like that. 0:55 And we can also add some manually added tags to this as well. 1:01 So we could say, let's have tags for sunset and landscape and orange, 1:07 and that sort of gives us a basic way to be able to retrieve this image, 1:12 but it largely misses the image's overall semantic context. 1:17 Like how would you query for images with similar color palettes using this information, 1:23 or images with landscapes of mountains in the background, for example? 1:28 Those concepts aren't really represented very well in these structured fields, 1:34 and that disconnect between how computers store data and how humans understand it has a name. 1:41 It's called the semantic gap.

1:45 Now a traditional database query like "select star where color equals orange" 1:52 falls short because it doesn't really capture the nuanced, multi-dimensional nature of unstructured data. 2:00 Well, that's where vector databases come in, by representing data as mathematical vector embeddings. 2:11 And what is a vector embedding? 2:16 It's essentially an array of numbers.
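The relational setup the speaker describes can be sketched with Python's built-in sqlite3 module. The schema, file format, and date here are invented for illustration; the point is that an exact tag lookup works, while the semantic questions from the transcript have nowhere to go:

```python
import sqlite3

# In-memory relational store: image metadata plus manually added tags.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, file_format TEXT, created TEXT)")
conn.execute("CREATE TABLE tags (image_id INTEGER, tag TEXT)")
conn.execute("INSERT INTO images VALUES (1, 'jpeg', '2024-06-01')")
conn.executemany("INSERT INTO tags VALUES (1, ?)",
                 [("sunset",), ("landscape",), ("orange",)])

# Exact tag matching works fine...
rows = conn.execute("SELECT image_id FROM tags WHERE tag = 'orange'").fetchall()

# ...but there is no query for "similar color palettes" or "mountains in
# the background": that meaning was never captured in the stored fields.
print(rows)
```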
2:19 Now these vectors capture the semantic essence of the data, where 2:23 similar items are positioned close together in vector space and dissimilar items are positioned far apart, 2:30 and with vector databases, we can perform similarity searches as mathematical operations, 2:36 looking for vector embeddings that are close to each other, 2:39 and that translates to finding semantically similar content.

2:43 Now we can represent 2:45 all sorts of unstructured data in a vector database. 2:49 What could we put in here? 2:51 Well, image files of course, like our mountain sunset. 2:56 We could put in a text file as well, or we could even store audio files in here. 3:04 Well, this is unstructured data, and these complex objects are actually transformed into vector embeddings, 3:15 and those vector embeddings are then stored in the vector database.

3:21 So what do these vector embeddings look like? 3:24 Well, I said they are arrays of numbers, 3:27 and they are arrays of numbers where each position represents some kind of learned feature. 3:31 So let's take a simplified example. 3:35 So remember our mountain picture here? 3:38 Yep, we can represent that as a vector embedding. 3:42 Now, let's say that the vector embedding for the mountain has a first dimension of, say, 0.91, 3:50 then let's say the next one is 0.15, and then there's a third dimension of 0.83, and so forth. 3:59 What does all that mean? 4:00 Well, the 0.91 in the first dimension indicates significant elevation changes because, hey, this is the mountains. 4:10 Then 0.15 in the second dimension shows few urban elements; we 4:16 don't see many buildings here, so that's why that score is quite low. 4:20 0.83 in the third dimension represents strong warm colors, like a sunset, and so on. 4:27 All sorts of other dimensions can be added as well.

4:30 Now we could compare that to a different picture. 4:33 What about this one, which is a sunset at the beach?
4:37 So let's have a look at the vector embedding for the beach example. 4:43 This would also have a series of dimensions. 4:46 Let's say the first one is 0.12, then we have 0.08, and then finally we have 0.89, with more dimensions to follow. 4:59 Now, notice how there are some similarities here. 5:02 The third dimension, 0.83 and 0.89, is pretty similar. 5:09 That's because they both have warm colors. 5:11 They're both pictures of sunsets, 5:14 but the first dimension differs quite a lot here, 5:18 because a beach has minimal elevation changes compared to the mountains.

5:24 Now this is a very simplified example. 5:26 In real machine learning systems, vector embeddings typically contain hundreds or even thousands of dimensions, 5:33 and I should also say that individual dimensions like this rarely correspond 5:37 to such clearly interpretable features, but you get the idea.

5:42 And this all brings up the question of how these vector embeddings are actually created. 5:48 Well, the answer is through embedding models that have been trained on massive data sets. 5:53 So each type of data has its own specialized type of embedding model that we can use. 6:02 So I'm gonna give you some examples of those. 6:06 For example, CLIP. 6:07 You might use CLIP for images. 6:10 If you're working with text, you might use GloVe, and if you're working with audio, you might use Wav2Vec.

6:21 These processes are all pretty similar. 6:25 Basically, you have data that passes through multiple layers. 6:30 And as it goes through the layers of the embedding model, each layer is extracting progressively more abstract features. 6:38 So for images, the early layers might detect some pretty basic stuff, like, let's say, edges, 6:45 and then as we get to deeper layers, we would recognize more complex stuff, like maybe entire objects.
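The mountain-versus-beach comparison in the transcript can be checked numerically with its own toy vectors (again, real embedding dimensions are rarely this interpretable):

```python
import math

# Toy vectors from the transcript: dimensions stand for elevation
# change, urban elements, and warm colors.
mountain = [0.91, 0.15, 0.83]
beach    = [0.12, 0.08, 0.89]

# Per-dimension gaps mirror the narration: warm colors (index 2)
# nearly match, while elevation (index 0) differs sharply.
diffs = [abs(m - b) for m, b in zip(mountain, beach)]

# A single Euclidean distance summarizes overall similarity.
distance = math.sqrt(sum(d * d for d in diffs))
print(diffs, distance)
```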
6:53 Perhaps for text, these early layers would figure out the individual words that we're looking at, 7:01 but then later, deeper layers would be able to figure out context and meaning, 7:07 and how this essentially works is we take the high-dimensional vectors from this deeper layer here, 7:16 and those high-dimensional vectors often have hundreds 7:19 or maybe even thousands of dimensions that capture the essential characteristics of the input.

7:25 Now we have vector embeddings created. 7:28 We can perform all sorts of powerful operations that just weren't possible with those traditional relational databases, 7:34 things like similarity search, where we can find items 7:37 that are similar to a query item by finding the closest vectors in the space. 7:42 But when you have millions of vectors in your database, and those vectors are made up of hundreds or maybe even 7:50 thousands of dimensions, 7:53 you can't effectively and efficiently compare your query vector to every single vector in the database. 8:00 It would just be too slow. 8:02 So there is a process to address that, and it's called vector indexing.

8:09 Now this is where vector indexing uses something called approximate nearest neighbor, or ANN, algorithms, 8:16 and instead of finding the exact closest match, 8:20 these algorithms quickly find vectors that are very likely to be among the closest matches. 8:26 Now there are a bunch of approaches for this. 8:29 For example, HNSW, that is Hierarchical Navigable Small World, which creates multi-layered graphs connecting similar vectors, 8:39 and there's also IVF, 8:41 that's Inverted File Index, which divides the vector space into clusters and only searches the most relevant of those clusters. 8:50 These indexing methods are basically trading a small amount of accuracy 8:54 for pretty big improvements in search speed.
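The IVF idea described above can be sketched in a few lines of pure Python: partition the vectors into clusters around a handful of centroids, then at query time search only the closest cluster. This is an illustrative toy, not a real index (production systems use libraries such as FAISS, which implement proper IVF and HNSW):

```python
import math
import random

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

random.seed(0)
vectors = [[random.random() for _ in range(8)] for _ in range(1000)]

# Toy IVF index: pick a few "centroids" and assign each vector to its
# nearest one, forming clusters.
centroids = random.sample(vectors, 8)
clusters = {i: [] for i in range(len(centroids))}
for v in vectors:
    nearest = min(range(len(centroids)), key=lambda i: dist(v, centroids[i]))
    clusters[nearest].append(v)

def ann_search(query):
    # Search only the most relevant cluster instead of all 1000 vectors.
    i = min(range(len(centroids)), key=lambda i: dist(query, centroids[i]))
    return min(clusters[i], key=lambda v: dist(query, v))

query = [0.5] * 8
approx = ann_search(query)                           # scans one cluster
exact = min(vectors, key=lambda v: dist(query, v))   # scans everything
```

The approximate result may not be the true nearest neighbor, which is exactly the accuracy-for-speed trade the transcript describes.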
8:57 Now, vector databases are a core feature of something called RAG, retrieval-augmented generation, 9:06 where vector databases store chunks of documents 9:09 and articles and knowledge bases as embeddings, and 9:13 then when a user asks a question, the system finds the relevant text chunks by comparing vector similarity 9:20 and feeds those to a large language model to generate responses using the retrieved information. 9:26 So that's vector databases. 9:29 They are both a place to store unstructured data and a place to retrieve it quickly and semantically.
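The RAG retrieval step can be sketched as follows. The `embed` function here is a deliberately crude stand-in (character counts) so the example is self-contained; a real system would call an embedding model, so the rankings below are illustrative only:

```python
import math
from collections import Counter

def embed(text):
    # Stand-in embedding: letter frequencies over a-z. A real system
    # would use a trained text embedding model instead.
    counts = Counter(text.lower())
    return [counts.get(c, 0) for c in "abcdefghijklmnopqrstuvwxyz"]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Knowledge-base chunks are embedded and stored ahead of time.
chunks = [
    "Vector indexes trade accuracy for search speed.",
    "Relational databases store rows and columns.",
    "HNSW builds multi-layered graphs of similar vectors.",
]
store = [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(question, k=2):
    # Embed the question and return the k most similar stored chunks.
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

# These retrieved chunks would then be handed to a large language model
# as context for generating the answer.
context = retrieve("How do vector indexes affect speed?")
```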