15 February 2018

When Will Networks Have Common Sense? Generative Adversarial Networks are on the way…


Generative Adversarial Networks (GANs) are a relatively new Machine Learning architecture for neural networks: they were first introduced in 2014 by researchers at the University of Montreal (see this paper).

In order to better appreciate the value of GANs, one has to consider the difference between Supervised and Unsupervised learning. Supervised networks are trained and tested on large quantities of “labeled” samples. For example, a supervised image classifier requires a set of images paired with correct labels (e.g. cats, dogs, birds, . . .). Unsupervised networks, by contrast, learn on the job from their own mistakes and try to avoid those errors in the future, without any labels to guide them. One can view a GAN as a new architecture for an unsupervised neural network that achieves far better performance than traditional ones.
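To make the distinction concrete, here is a minimal sketch in Python (using PyTorch) of what the two training regimes actually consume. The tensors are random placeholders standing in for real images, and the shapes and class count are assumptions chosen purely for illustration.

```python
import torch

# Supervised setting: every image comes paired with a correct label
# (e.g. cat = 0, dog = 1, bird = 2).
images = torch.randn(64, 3, 32, 32)   # a batch of 64 RGB 32x32 "images"
labels = torch.randint(0, 3, (64,))   # one ground-truth class per image
# a classifier would be trained to map images -> labels

# Unsupervised setting (where GANs live): only the images themselves are
# available; no labels tell the network what each image contains.
unlabeled_images = torch.randn(64, 3, 32, 32)
```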

The main idea of a GAN is to let two neural networks compete in a zero-sum game. The first network, the generator, takes noise as input and produces samples. The second, the discriminator, receives samples from both the generator and the training data, and has to distinguish between the two sources.
The two networks play a game in which the generator learns to produce more and more realistic samples, and the discriminator learns to get better and better at telling generated data apart from real data. The two networks are trained simultaneously, driving the generated samples to become indistinguishable from real data.
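As a concrete illustration of this game, here is a minimal PyTorch sketch of the two networks trained adversarially on toy 2-D data. The architectures, learning rates, and the Gaussian “real” data are assumptions chosen to keep the example small, not the setup of any particular paper.

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 8, 2

generator = nn.Sequential(
    nn.Linear(latent_dim, 32), nn.ReLU(),
    nn.Linear(32, data_dim),
)
discriminator = nn.Sequential(
    nn.Linear(data_dim, 32), nn.ReLU(),
    nn.Linear(32, 1), nn.Sigmoid(),            # outputs P(sample is real)
)

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCELoss()

def real_batch(n=64):
    # Stand-in "real" data: points drawn from a fixed Gaussian blob.
    return torch.randn(n, data_dim) * 0.5 + torch.tensor([2.0, 2.0])

for step in range(5000):
    # --- Discriminator step: label real data 1, generated data 0 ---
    real = real_batch()
    noise = torch.randn(real.size(0), latent_dim)
    fake = generator(noise).detach()           # don't backprop into G here
    d_loss = bce(discriminator(real), torch.ones(real.size(0), 1)) + \
             bce(discriminator(fake), torch.zeros(real.size(0), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # --- Generator step: try to fool the discriminator into outputting 1 ---
    noise = torch.randn(64, latent_dim)
    g_loss = bce(discriminator(generator(noise)), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

The discriminator step pushes its output toward 1 on real points and 0 on generated points, while the generator step pushes the discriminator's output on generated points back toward 1; alternating the two is what drives the generated samples toward the real data.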

GANs allow training the discriminator as an unsupervised “density estimator”, i.e. a contrast function that gives a low value for real data and a higher output for everything else. To solve this problem properly, the discriminator has to develop a good internal representation of the data. More details here.
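Continuing the toy sketch above, one way to read the trained discriminator as such a contrast function is to take -log D(x), which is low on points that resemble the training data and higher elsewhere. This is only an illustration of the idea under the assumptions of the earlier sketch, not a recipe from the linked discussion.

```python
def contrast(x):
    # Low for data-like points, higher for everything else.
    return -torch.log(discriminator(x) + 1e-8).squeeze()

with torch.no_grad():
    on_manifold = real_batch(5)                    # points near the real data
    off_manifold = torch.randn(5, data_dim) * 5.0  # points far from it
    print(contrast(on_manifold))    # low values
    print(contrast(off_manifold))   # higher values
```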

GANs were previously thought to be unstable. Facebook AI Research (FAIR) published a set of papers on stabilizing adversarial networks, starting with image generators using a Laplacian pyramid of adversarial networks (LAPGAN) and Deep Convolutional Generative Adversarial Networks (DCGAN), and continuing into the more complex endeavor of video generation using Adversarial Gradient Difference Loss predictors (AGDL).

As claimed here, it seems that GANs can provide a strong algorithmic framework for building unsupervised learning models that incorporate properties such as common sense.

There is a nice metaphor here about GANs: “In a way of an analogy, GANs act like the political environment of a country with two rival political parties. Each party continuously attempts to improve on its weaknesses while trying to find and leverage vulnerabilities in their adversary to push their agenda. Over time both parties become better operators”.