Vanilla Generative Adversarial Networks
Implementation with TensorFlow2 Keras
Summary of Generative Adversarial Network
Essentially, in the case of vanilla GANs, the entire architecture is composed of a generator and a discriminator.
- The goal of the generator is to create “Fake Data” that resembles the “Real Data” using the “Noise Vectors”.
- Whereas, the goal of the discriminator is to distinguish generated “Fake Data” from the “Real Data”.
These two models competitively learned new weights. In other words, given noise vectors, the generator gradually learns to create outputs that hopefully trick the discriminator. When the discriminator is not able to give accurate predictions on whether the input is from the real dataset, the generator is recognized as a good model.
Here is the code that demonstrates how these models learn
Note that BCE Loss is not the only loss that can be used for training GANs! There are different losses for different GAN architectures. This is just an example that affiliates with Vanilla GANs.
# Initiate BCE Loss
loss_func = tf.keras.losses.BinaryCrossentropy(from_logits=False)# Simple Classification Task
def discriminator_loss(real_output, fake_output):
real_loss = loss_func(tf.ones_like(real_output), real_output)
fake_loss = loss_func(tf.zeros_like(fake_output), fake_output)
total_loss = (real_loss + fake_loss) / 2
return total_loss# Give Generated Outputs with 1 as to pretend its real data // This eventually help the generator to know how discriminator thinks about its outputs. (If loss is high, the generator is doing a poor job, vice versa)def generator_loss(fake_output):
return loss_func(tf.ones_like(fake_output), fake_output)
Implementation and Results
Vanilla GAN on MNIST and Fashion-MNIST Datasets
generator = tf.keras.models.Sequential([
tf.keras.layers.Input(shape=[z_dim]),
tf.keras.layers.Dense(dense_units[0], activation='selu'),
tf.keras.layers.Dense(dense_units[1], activation='selu'),
tf.keras.layers.Dense(img_dim,activation='sigmoid'),
])discriminator = tf.keras.models.Sequential([
tf.keras.layers.Input(shape=[img_dim]),
tf.keras.layers.Dense(dense_units[0], activation='relu'),
tf.keras.layers.Dense(dense_units[1], activation='relu'),
tf.keras.layers.Dense(1,activation='sigmoid')
])
Deep Convolutional GAN on MNIST and Fashion-MNIST Datasets
generator = tf.keras.models.Sequential([
tf.keras.layers.Dense(7 * 7 * 128, input_shape=[z_dim]),
tf.keras.layers.Reshape([7, 7, 128]),
tf.keras.layers.BatchNormalization(),
tf.keras.layers.Conv2DTranspose(filters, kernel_size=5, strides=2, padding="SAME",activation="selu"),
tf.keras.layers.BatchNormalization(),
tf.keras.layers.Conv2DTranspose(1, kernel_size=5, strides=2, padding="SAME",activation="tanh")
])discriminator = tf.keras.models.Sequential([
tf.keras.layers.Conv2D(filters, kernel_size=5, strides=2, padding="SAME",activation=tf.keras.layers.LeakyReLU(0.2),input_shape=img_dim),
tf.keras.layers.Dropout(0.4),
tf.keras.layers.Conv2D(filters * 2, kernel_size=5, strides=2, padding="SAME", activation=tf.keras.layers.LeakyReLU(0.2)),
tf.keras.layers.Dropout(0.4),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(1, activation="sigmoid")
])
Wasserstein GAN with Gradient Penalty on MNIST and Fashion-MNIST Datasets
generator = tf.keras.models.Sequential([
tf.keras.layers.Dense(7 * 7 * 128, input_shape=[z_dim]),
tf.keras.layers.Reshape([7, 7, 128]),
tf.keras.layers.BatchNormalization(),
tf.keras.layers.Conv2DTranspose(filters, kernel_size=5, strides=2, padding="SAME",activation="selu"),
tf.keras.layers.BatchNormalization(),
tf.keras.layers.Conv2DTranspose(1, kernel_size=5, strides=2, padding="SAME",activation="tanh"),
])discriminator = tf.keras.models.Sequential([
tf.keras.layers.Conv2D(filters, kernel_size=5, strides=2, padding="SAME",activation=tf.keras.layers.LeakyReLU(0.2),
input_shape=img_dim),
tf.keras.layers.Dropout(0.2), tf.keras.layers.Conv2D(filters * 4, kernel_size=5, strides=2, padding="SAME",activation=tf.keras.layers.LeakyReLU(0.2)),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Conv2D(filters * 2, kernel_size=5, strides=2, padding="SAME", activation=tf.keras.layers.LeakyReLU(0.2)), tf.keras.layers.Dropout(0.2),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(1)
])
Conditional WGAN-GP with on MNIST and Fashion-MNIST Datasets
# Noise Input & Condition Input
noise_vec = tf.keras.Input(shape = [z_dim], name='noise')
condition_vec = tf.keras.Input(shape=[condition_dim], name='condition')
inputs = tf.keras.layers.Concatenate()([noise_vec, condition_vec])# Model Layers
dense = tf.keras.layers.Dense(7 * 7 * 128)(inputs)
net = tf.keras.layers.Reshape([7, 7, 128])(dense)
net = tf.keras.layers.BatchNormalization()(net)
net = tf.keras.layers.Conv2DTranspose(filters, kernel_size=5, strides=2,
padding="SAME",activation="selu")(net)
net = tf.keras.layers.BatchNormalization()(net)
net = tf.keras.layers.Conv2DTranspose(1, kernel_size=5, strides=2,
padding="SAME",activation="tanh")(net)
generator = tf.keras.Model(inputs=[noise_vec, condition_vec], outputs=net)critic = tf.keras.models.Sequential([
tf.keras.layers.Conv2D(filters, kernel_size=5, strides=2, padding="SAME",
activation=tf.keras.layers.LeakyReLU(0.2),
input_shape=img_dim),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Conv2D(filters * 4, kernel_size=5, strides=2, padding="SAME",
activation=tf.keras.layers.LeakyReLU(0.2)),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Conv2D(filters * 2, kernel_size=5, strides=2, padding="SAME",
activation=tf.keras.layers.LeakyReLU(0.2)),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Flatten(),
# tf.keras.layers.Dense(1, activation="sigmoid")
tf.keras.layers.Dense(1)
])
WGAN-GP & Deep Convolutional GAN on Anime faces Dataset
(Both are generated results)
Conclusion
Starting from the vanilla GAN architecture to a more advanced WGAN-GP architecture, the generated images gradually gain more diversity and quality. For example, comparing vanilla GAN and DCGAN, we see that the generator generates trousers and shirts. However, in DCGAN, we start to see dresses and other fashion items of higher quality.
The goal of this post is not to teach anyone how to write codes for GAN, instead is a quick session that introduces the power of GAN to the audience. Is it amazing that computers can generate realistic results from random noises?