
Understanding Generative Adversarial Networks

Key Points

  • GANs are an unsupervised learning framework that pits a **generator** (which creates fake data) against a **discriminator** (which learns to tell real from fake), forming an adversarial training loop.
  • Unlike typical supervised models that predict outputs from labeled inputs and adjust based on prediction error, GANs “self‑supervise” by using the discriminator’s feedback to improve the generator.
  • The generator receives a random input vector and iteratively refines its output until it produces samples—often images such as faces, cats, or 3D objects—that can deceive both the discriminator and human observers.
  • Training typically starts by teaching the discriminator to recognize real examples (e.g., many flower photos) and reject non‑examples, after which the generator begins producing fake flowers that the discriminator evaluates, driving both networks toward higher realism.
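The adversarial loop summarized above can be sketched end to end on toy data. The following is a minimal, illustrative GAN in plain Python, not code from the video: the "domain" is numbers drawn from a Gaussian around 4 (standing in for the flower photos), the generator is an affine map of random noise, and the discriminator is a one-dimensional logistic regression. All names, losses, and hyperparameters here are assumptions chosen for the demo.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    # numerically stable logistic function
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    ex = math.exp(x)
    return ex / (1.0 + ex)

# "Real" domain: numbers drawn from N(4, 0.5) -- a stand-in for flower photos.
def real_sample():
    return random.gauss(4.0, 0.5)

# Generator: affine map of noise z ~ N(0, 1); two parameters to learn.
g_w, g_b = 0.1, 0.0
# Discriminator: logistic regression on a single scalar "sample".
d_w, d_b = 0.1, 0.0

lr, batch = 0.03, 64
for step in range(4000):
    # --- discriminator step: push real samples toward 1, fakes toward 0 ---
    gw_d = gb_d = 0.0
    for _ in range(batch):
        x_real = real_sample()
        z = random.gauss(0.0, 1.0)
        x_fake = g_w * z + g_b
        # gradients of -log D(real) - log(1 - D(fake)) w.r.t. the logits
        gr = sigmoid(d_w * x_real + d_b) - 1.0
        gf = sigmoid(d_w * x_fake + d_b)
        gw_d += gr * x_real + gf * x_fake
        gb_d += gr + gf
    d_w -= lr * gw_d / batch
    d_b -= lr * gb_d / batch

    # --- generator step: fool D (non-saturating loss -log D(fake)) ---
    gw_g = gb_g = 0.0
    for _ in range(batch):
        z = random.gauss(0.0, 1.0)
        x_fake = g_w * z + g_b
        gl = sigmoid(d_w * x_fake + d_b) - 1.0  # dLoss/dlogit
        gx = gl * d_w                           # chain rule into D's input
        gw_g += gx * z
        gb_g += gx
    g_w -= lr * gw_g / batch
    g_b -= lr * gb_g / batch

# After training, generated samples should cluster around the real mean (~4).
fakes = [g_w * random.gauss(0.0, 1.0) + g_b for _ in range(1000)]
mean_fake = sum(fakes) / len(fakes)
print(round(mean_fake, 1))
```

The structure mirrors the transcript: the discriminator learns to score real versus fake, the generator's only learning signal is the discriminator's output, and the two improve in alternation until the fakes resemble the domain.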

Full Transcript

**Source:** [https://www.youtube.com/watch?v=TpMIssRdhco](https://www.youtube.com/watch?v=TpMIssRdhco)
**Duration:** 00:08:18

## Sections

- [00:00:00](https://www.youtube.com/watch?v=TpMIssRdhco&t=0s) **GANs: Dual-Model Unsupervised Learning** - The speaker explains how Generative Adversarial Networks use a generator and a discriminator in an adversarial, self‑supervising framework to produce realistic data—particularly images—contrasting this approach with traditional supervised prediction models.
- [00:03:13](https://www.youtube.com/watch?v=TpMIssRdhco&t=193s) **Training a GAN with Flowers** - The speaker outlines how a discriminator is first taught to recognize real flower images before the generator creates fake flowers, turning the pair into an adversarial zero‑sum game where both models iteratively improve.
- [00:06:21](https://www.youtube.com/watch?v=TpMIssRdhco&t=381s) **GAN Applications Beyond Images** - The speaker explains how CNNs support GANs and outlines various use cases such as video frame prediction, image super‑resolution, and even encryption.

## Full Transcript
[0:01] One of my favorite machine learning algorithms is Generative Adversarial Networks, or GANs. It pits two AI models against each other, hence the "adversarial" part.

[0:16] Now, most machine learning models are used to generate a prediction. So we start with some input training data, and we feed that into our model. The model then makes a prediction in the form of output. We can compare the predicted output with the expected output from the training data set, and based upon that expected output and the actual predicted output, we can figure out how we should update our model to create better outputs. That is an example of supervised learning.

[1:10] A GAN is an example of unsupervised learning; it effectively supervises itself, and it consists of two submodels: a generator submodel and a discriminator submodel. The generator's job is to create fake input, or fake samples. The discriminator's job is to take a given sample and figure out if it is a fake sample or a real sample from the domain.

[2:12] And therein lies the adversarial nature of this. We have a generator creating fake samples and sending them to a discriminator. The discriminator is taking a look at a given sample and figuring out, "Is this a fake sample from the generator, or is this a real sample from the domain set?"

[2:35] Now, this sort of scenario is often applied in image generation. There are images all over the internet from generators that have been used to create fake 3D models, fake faces, fake cats and so forth. This really works by the generator iterating through a number of different cycles of creating samples, updating its model and so forth, until it can create a sample that is so convincing that it can fool the discriminator, and also fool us humans as well.
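The adversarial setup the speaker describes has a compact formal statement. In the original GAN formulation (Goodfellow et al., 2014), the generator $G$ and discriminator $D$ play a minimax game over the value function:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator maximizes its chance of labeling real and generated samples correctly, while the generator minimizes that same quantity; training alternates between the two objectives.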
[3:14] So let's take an example of how this works with, let's say, a flower. We are going to train a generator to create really convincing fake flowers, and the way we start is that we first need to train our discriminator model to recognize what a picture of a flower looks like. Our domain is lots of pictures of flowers, and we will be feeding these into the discriminator model and telling it to look at all of the attributes that make up those flower images: the colors, the shading, the shapes and so forth. And when our discriminator gets good at recognizing real flowers, we'll feed in some shapes that are not flowers at all and make sure that it can discriminate those as being not-flowers.

[4:05] Now, this whole time our generator was frozen; it wasn't doing anything. But once our discriminator gets good enough at recognizing things from our domain, we apply our generator to start creating fake versions of those things. The generator takes a random input vector and uses it to create its own fake flower.

[4:34] Now, this fake flower image is sent to the discriminator, and the discriminator has a decision to make: is that image of a flower the real thing from the domain, or is it a fake from the generator? The answer is revealed to both the generator and the discriminator. The flower was fake, and based upon that, the generator and discriminator will change their behavior.

[5:06] This is a zero-sum game: there's always a winner and a loser. The winner gets to remain blissfully unchanged; its model doesn't change at all, whereas the loser has to update its model. So if the discriminator successfully spotted that this flower was a fake image, then the discriminator remains unchanged, but the generator will need to change its model to generate better fakes.
[5:31] Whereas if the reverse is true and the generator is creating something that fools the discriminator, the discriminator model will need to be updated so that it can better tell when a fake sample is coming in and is fooled less easily. And that's basically how these things work: we go through many, many iterations of this until the generator gets so good that the discriminator can no longer pick out its fakes. At that point we have built a very successful generator to do whatever it is we wanted it to do.

[6:09] Now, for images, the generator and the discriminator are often implemented as CNNs, or Convolutional Neural Networks. CNNs are a great way of recognizing patterns in image data, entering into the area of object identification. We have a whole separate video on CNNs, but they're a great way to implement the generator and discriminator in this scenario.

[6:40] But the whole point of a GAN isn't just to create really good fake flowers or fake cat images for the internet. You can apply it to all sorts of use cases. Take, for example, video frame prediction. If we feed in a particular frame of video from a camera, we can use a GAN to predict what the next frame in the sequence will look like. This is a great way to predict what's going to happen in the immediate future, and it might be used, for example, in a surveillance system: if we can figure out what is likely to happen next, we can take some action based upon that.

[7:21] There are also other things you can do, like image enhancement. If we have a low-resolution image, we can use a GAN to create a much higher-resolution version of the image by figuring out what each individual pixel is and then creating a higher-resolution version of that.
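The zero-sum "winner stays, loser updates" rule from the flower example can be sketched as a tiny decision helper. The function name is hypothetical, and note that most practical GAN implementations nudge both networks a little on every iteration rather than literally freezing the winner:

```python
# Sketch of the "only the loser updates" rule from the transcript.
# (Hypothetical helper; real GAN training usually updates both nets each step.)
def who_updates(discriminator_was_right: bool) -> str:
    """Return which submodel must update after one adversarial round."""
    if discriminator_was_right:
        # D spotted the fake: the generator lost, so it must make better fakes.
        return "generator"
    # D was fooled: it lost, so it must get better at spotting fakes.
    return "discriminator"

print(who_updates(True))   # prints "generator"
```

Iterating this round many times is what drives the pair toward the equilibrium the video describes, where the discriminator can no longer pick out the fakes.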
[7:39] And we can even go as far as using this for things that are not related to images at all, like encryption. We can create a secure encryption algorithm whose messages can be encrypted and decrypted by the sender and receiver but cannot be easily intercepted, again by going through these GAN iterations to create a really good generator.

[7:59] So that's a GAN. It's the battle of the bots, where you can take your young, impressionable and untrained generator and turn it into a master of forgery. If you have any questions, please drop us a line below, and if you want to see more videos like this in the future, please like and subscribe.