“The most interesting idea in the last 10 years in ML” – Yann LeCun (Facebook AI Research Director)
In 2014, when Ian Goodfellow, Yoshua Bengio and a few other researchers from the University of Montreal introduced GANs in their seminal research paper, it caused the kind of disruption the Machine Learning world had not seen in a long, long time. Make no mistake: the Machine Learning world is one of the most heavily researched, most lavishly funded, and most dynamic fields across the globe right now.
There are new papers every other day, beating previous results and setting new benchmarks in performance, speed and quality. We wouldn’t be too wrong in saying that the definition of ‘state-of-the-art’ is essentially rewritten every other day.
And in a world that is changing every day, to cause a wave of the magnitude that GANs did is by no means a small thing.
If you’ll indulge us in a little philosophy, we’d say that ideas in general, in any field, on any front, always fall into two camps.
On one side of this divide, there are the good ideas, and the great ideas, and the smart ideas and all the other adjectives for all the possible types of ideas you can think of.
These are the ideas that hit the target. These are the ideas that get the job done.
And then on the other side of this divide, there are revolutionary ideas.
Ideas that don’t just shake things up, but take them to a completely new level no one even knew existed.
Ideas that hit a target that nobody even saw.
And it is these ideas that are the most disruptive and also the most dangerous.
It is the second category of ideas that GANs belong to. We’ll look at the disruptive part in this blog and the next few, and then we’ll take up the dangerous part in another blog.
The Basic Concept
Well, there is a technical way to explain the basic concept behind GANs, but it involves a lot of jargon that doesn’t do anybody any good. At Eduonix, we’re always focused on the kind of learning that stays with the student, and we don’t believe starting out with jargon is the best way to achieve that.
So, taking the alternative route of analogy, we invite you to imagine two situations.
Situation A
Imagine that you, a student not particularly skilled in the art, are in an art class, and there is a certain painting that you have to replicate. The painting you have to replicate is in front of you, along with all the necessary colors, palettes and brushes. You also have an infinite number of canvases available on which to attempt your replica.
Situation B
In this scenario, you are that same student, with the same skill set in art and the same task of replicating the said painting, with the same colors, palettes, brushes and canvases available in front of you. Only this time, an art expert witnesses your attempts at the painting, giving you criticism and essentially telling you whether the painting is a decent enough replica or not.
Let us analyze both situations now.
Situation A – ANALYSIS
You take the first canvas, mix the colors in the palette to the best of your judgment, and try your hand at the painting. It is likely that you yourself won’t be satisfied with your first attempt. So over the hours, with the painting as your reference, you try to match it patch by patch, trying your best to minimize the difference between the painting and your replica. Perhaps, after a few dozen attempts, you decide, to the best of your judgment, that you’ve managed a decent job with the replica, and that, of course, will be the end of your assignment.
Now let’s analyze the other situation.
Situation B – ANALYSIS
You’d proceed the same way you did in the previous scenario. The difference, of course, is that now you have an art expert scrutinizing every canvas you produce. In your first attempt, you’d just be warming up, so you and the art expert will both agree that the painting doesn’t match the original.
This will naturally continue for a while, since you and the art expert are likely to agree in your first few attempts that the painting isn’t a good enough replica. But just like in the previous scenario, after several tedious attempts, when you finally think you’ve done a good enough job, feel satisfied with your performance and begin to leave, the art expert will give you a piece of his mind and tell you that your painting is not nearly as close to the original as you’d like to think.
Now you’d have to listen to the guy because he’s obviously the expert and maybe he found something in the hue of the colors that was off or maybe the brush strokes were a little too wide.
So you’d pick up a fresh new canvas, more careful now to take care of those small details that your eye couldn’t catch at first glance but the discriminative eye of the expert could.
You redo the painting, and yet again the expert doesn’t think the paintings match.
You’re even more careful of the details this time but he doesn’t seem to agree still.
You redo, and you redo again. But he just won’t leave you in peace.
In all honesty, it makes you almost hate the man: an adversary you have to beat before you can get done here and go home.
After several more of these exhausting attempts, you are finally both in agreement over the painting and you are free to go.
Which final painting do you think would make a better replica? The painting in Situation A or Situation B?
I think we can safely say we’d all agree that Situation B would result in a much better replica than Situation A, and the credit for this, of course, goes to our hated adversary, the art expert.
Situation B, in essence, also happens to be the basic principle behind Generative Adversarial Networks.
In Situation B, we had some rigorous adversarial ‘training’ from the art expert, who blatantly kept denying that your painting matched the original even when you thought it did with your untrained judgment. This kept you looking for the finer details you might have missed, in an attempt to convince or ‘bypass’ the expert, and it eventually resulted in a much better replica than you would have achieved without your adversary.
If you’ve understood this part, believe us, you’ve understood Generative Adversarial Networks. Let’s make things a little more formal in the next section.
Formally Describing Generative Adversarial Networks (GANs)
In Generative Adversarial Networks, we have two Neural Networks pitted against each other, much like you and the art expert.
One of the Neural Networks is called the Generator, because its job is to generate data whose distribution most closely resembles that of the target data, just like you trying to produce a painting that most closely resembles the original.
And just like the eyes of the art expert in Situation B, in GANs we have a Neural Network that essentially acts as a discriminative algorithm: it classifies input data into the label or category to which that data belongs. In our case, the categories for the discriminator (our art expert) would be ‘closely resembles the original’ and ‘doesn’t closely resemble the original’. The discriminator Neural Network is concerned solely with that classification.
To sum this discussion up: in GANs we have one neural network, called the generator, that generates new data instances, while the other, the discriminator, evaluates them for authenticity. That is, the discriminator decides whether each instance of data it reviews belongs to the actual training dataset or not, which forces the generator to learn to produce outputs that follow a similar distribution.
Essentially, the goal of the generator is to ‘bypass’ or ‘convince’ the discriminator that its generated output is indeed authentic.
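To make the face-off a little more concrete before the full MNIST example, here is a deliberately tiny sketch of the adversarial loop in plain NumPy. This is our own toy illustration, not the code from the original paper: the “generator” is a single affine map trying to imitate samples from a Gaussian (the “original painting”), the “discriminator” is a single logistic unit, and all the numbers (learning rate, target distribution, step count) are arbitrary choices for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Generator G(z) = w_g * z + b_g : maps noise z to a "fake" sample.
# Discriminator D(x) = sigmoid(w_d * x + b_d) : estimates P(x is real).
w_g, b_g = rng.normal(), 0.0
w_d, b_d = rng.normal(), 0.0
lr = 0.05

for step in range(2000):
    # "Real" data: the distribution the generator must imitate.
    x_real = rng.normal(4.0, 1.0, size=32)
    z = rng.normal(0.0, 1.0, size=32)
    x_fake = w_g * z + b_g

    # --- Discriminator step: label real as 1, fake as 0 (cross-entropy) ---
    d_real = sigmoid(w_d * x_real + b_d)
    d_fake = sigmoid(w_d * x_fake + b_d)
    grad_w_d = np.mean((d_real - 1.0) * x_real) + np.mean(d_fake * x_fake)
    grad_b_d = np.mean(d_real - 1.0) + np.mean(d_fake)
    w_d -= lr * grad_w_d
    b_d -= lr * grad_b_d

    # --- Generator step: try to make the discriminator say "real" ---
    x_fake = w_g * z + b_g
    d_fake = sigmoid(w_d * x_fake + b_d)
    # The gradient flows *through* the discriminator into the generator.
    grad_w_g = np.mean((d_fake - 1.0) * w_d * z)
    grad_b_g = np.mean((d_fake - 1.0) * w_d)
    w_g -= lr * grad_w_g
    b_g -= lr * grad_b_g

print(f"mean of generated samples: {w_g * 0.0 + b_g:.2f} (real data mean: 4.0)")
```

The two `for`-loop halves are the painting sessions: the discriminator sharpens its criticism, then the generator adjusts its “brush strokes” to get past it, and the two updates alternate until neither can easily improve.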
Of course, we are aware that these ideas can still come off as a little abstract even after our explanation; more specifically, it can be difficult to see how this scenario actually pans out in terms of Neural Networks and code. It’s all well and good while the generator is you trying to replicate a painting and the discriminator is the art expert, but the implementation details of how you get two Neural Networks to face off against each other in code are rather fuzzy.
This is also exactly the reason we have split this topic of introducing you to GANs into two parts.
The best way to learn, as we always emphasize here at Eduonix, is by example. So in the next part of this blog, we’ll walk you through an example of a basic GAN that tries to generate handwritten digits from the MNIST dataset. And rest assured, we’ll do it with code.
Be sure to tune in for the action in the continuation of this blog, and we say ‘action’ because we’ll get to witness something you’ve never even dreamed of: two Neural Networks facing off against each other.
See you on the other side fellas!
To continue reading about GANs, you can click on the link below, which will take you to the next part of this series.