Vanilla Generative Adversarial Networks

Implementation with TensorFlow 2 Keras

Che-Jui Huang
4 min read · Jun 15, 2022

Summary of Generative Adversarial Networks

Essentially, a vanilla GAN is composed of two networks: a generator and a discriminator.

  1. The goal of the generator is to create “fake data” that resembles the “real data”, starting from random “noise vectors”.
  2. The goal of the discriminator is to distinguish the generated “fake data” from the “real data”.

These two models learn their weights competitively. In other words, given noise vectors, the generator gradually learns to create outputs that trick the discriminator. Once the discriminator can no longer accurately predict whether an input comes from the real dataset, the generator is considered a good model.
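For reference, this competition is exactly the minimax game from the original GAN paper (Goodfellow et al., 2014), written here in standard notation:

\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]

The discriminator loss below is the binary cross-entropy form of this objective, and the generator loss is the common non-saturating variant: it maximizes \log D(G(z)) instead of minimizing \log(1 - D(G(z))).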

Vanilla GAN

Here is the code that demonstrates how these two models learn.

Note that BCE loss is not the only loss that can be used for training GANs! Different GAN architectures use different losses; this is just the one associated with vanilla GANs.

import tensorflow as tf

# Initiate BCE Loss
loss_func = tf.keras.losses.BinaryCrossentropy(from_logits=False)

# Simple classification task: real data should be labeled 1, fake data 0
def discriminator_loss(real_output, fake_output):
    real_loss = loss_func(tf.ones_like(real_output), real_output)
    fake_loss = loss_func(tf.zeros_like(fake_output), fake_output)
    total_loss = (real_loss + fake_loss) / 2
    return total_loss

# Label the generated outputs with 1, as if they were real data.
# This tells the generator how the discriminator judges its outputs:
# if the loss is high, the generator is doing a poor job, and vice versa.
def generator_loss(fake_output):
    return loss_func(tf.ones_like(fake_output), fake_output)
A visualization for the code above (How Weights are Updated)
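The full training loop is not reproduced in this post, but a minimal sketch of one training step (assuming a generator and a discriminator like the models below, plus hypothetical Adam optimizers and a global z_dim) shows where these losses drive the weight updates:

# Hypothetical optimizers; the learning rate is chosen for illustration
gen_optimizer = tf.keras.optimizers.Adam(1e-4)
disc_optimizer = tf.keras.optimizers.Adam(1e-4)

def train_step(real_images):
    noise = tf.random.normal([tf.shape(real_images)[0], z_dim])
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        fake_images = generator(noise, training=True)
        real_output = discriminator(real_images, training=True)
        fake_output = discriminator(fake_images, training=True)
        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)
    # Each network is updated only by its own loss, through its own tape
    gen_grads = gen_tape.gradient(gen_loss, generator.trainable_variables)
    disc_grads = disc_tape.gradient(disc_loss, discriminator.trainable_variables)
    gen_optimizer.apply_gradients(zip(gen_grads, generator.trainable_variables))
    disc_optimizer.apply_gradients(zip(disc_grads, discriminator.trainable_variables))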

Implementation and Results

Vanilla GAN on MNIST and Fashion-MNIST Datasets

# Assumed values (not defined in the snippet): e.g. z_dim = 100,
# dense_units = [128, 256], img_dim = 28 * 28 for flattened MNIST images
generator = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=[z_dim]),
    tf.keras.layers.Dense(dense_units[0], activation='selu'),
    tf.keras.layers.Dense(dense_units[1], activation='selu'),
    tf.keras.layers.Dense(img_dim, activation='sigmoid'),  # pixel values in [0, 1]
])

discriminator = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=[img_dim]),
    tf.keras.layers.Dense(dense_units[0], activation='relu'),
    tf.keras.layers.Dense(dense_units[1], activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),  # probability that the input is real
])
Real Data (Left) | Generated Data (Right)
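As a quick usage sketch (assuming the illustrative values from the comment above, i.e. flattened 28 × 28 images), generating samples is a single forward pass over a batch of noise vectors:

z = tf.random.normal([16, z_dim])                     # batch of 16 noise vectors
fake_images = generator(z, training=False)            # shape: (16, img_dim)
fake_images = tf.reshape(fake_images, [-1, 28, 28])   # back to 2-D for plotting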

Deep Convolutional GAN on MNIST and Fashion-MNIST Datasets

# Assumed values (not defined in the snippet): e.g. filters = 64,
# img_dim = (28, 28, 1) for MNIST / Fashion-MNIST images
generator = tf.keras.models.Sequential([
    tf.keras.layers.Dense(7 * 7 * 128, input_shape=[z_dim]),
    tf.keras.layers.Reshape([7, 7, 128]),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2DTranspose(filters, kernel_size=5, strides=2,
                                    padding="SAME", activation="selu"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2DTranspose(1, kernel_size=5, strides=2,
                                    padding="SAME", activation="tanh"),  # output in [-1, 1]
])

discriminator = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(filters, kernel_size=5, strides=2, padding="SAME",
                           activation=tf.keras.layers.LeakyReLU(0.2),
                           input_shape=img_dim),
    tf.keras.layers.Dropout(0.4),
    tf.keras.layers.Conv2D(filters * 2, kernel_size=5, strides=2, padding="SAME",
                           activation=tf.keras.layers.LeakyReLU(0.2)),
    tf.keras.layers.Dropout(0.4),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
Real Data (Left) | Generated Data (Right)
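One detail worth noting: since this generator ends with tanh, the training images presumably have to be rescaled from [0, 255] to [-1, 1] before being shown to the discriminator (a common DCGAN convention; the preprocessing below is an assumption, not code from the original post):

(x_train, _), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.astype("float32")
x_train = (x_train - 127.5) / 127.5   # scale pixels to [-1, 1] to match tanh
x_train = x_train[..., tf.newaxis]    # add a channel dimension: (N, 28, 28, 1)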

Wasserstein GAN with Gradient Penalty on MNIST and Fashion-MNIST Datasets

generator = tf.keras.models.Sequential([
    tf.keras.layers.Dense(7 * 7 * 128, input_shape=[z_dim]),
    tf.keras.layers.Reshape([7, 7, 128]),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2DTranspose(filters, kernel_size=5, strides=2,
                                    padding="SAME", activation="selu"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2DTranspose(1, kernel_size=5, strides=2,
                                    padding="SAME", activation="tanh"),
])

discriminator = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(filters, kernel_size=5, strides=2, padding="SAME",
                           activation=tf.keras.layers.LeakyReLU(0.2),
                           input_shape=img_dim),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Conv2D(filters * 4, kernel_size=5, strides=2, padding="SAME",
                           activation=tf.keras.layers.LeakyReLU(0.2)),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Conv2D(filters * 2, kernel_size=5, strides=2, padding="SAME",
                           activation=tf.keras.layers.LeakyReLU(0.2)),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1),  # no sigmoid: the WGAN critic outputs an unbounded score
])
Real Data (Left) | Generated Data (Right)
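The gradient penalty itself is not shown above; here is a minimal sketch of the WGAN-GP losses (treating the discriminator above as the critic, with the penalty weight of 10 used in the original WGAN-GP paper):

def gradient_penalty(critic, real_images, fake_images):
    # Score random interpolations between real and fake samples
    batch_size = tf.shape(real_images)[0]
    alpha = tf.random.uniform([batch_size, 1, 1, 1], 0.0, 1.0)
    interpolated = alpha * real_images + (1.0 - alpha) * fake_images
    with tf.GradientTape() as tape:
        tape.watch(interpolated)
        scores = critic(interpolated, training=True)
    grads = tape.gradient(scores, interpolated)
    # Penalize any deviation of the per-sample gradient norm from 1
    norms = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]) + 1e-12)
    return tf.reduce_mean(tf.square(norms - 1.0))

def critic_loss(real_scores, fake_scores, gp, gp_weight=10.0):
    return tf.reduce_mean(fake_scores) - tf.reduce_mean(real_scores) + gp_weight * gp

def wgan_generator_loss(fake_scores):
    return -tf.reduce_mean(fake_scores)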

Conditional WGAN-GP on MNIST and Fashion-MNIST Datasets

# Noise input & condition input
noise_vec = tf.keras.Input(shape=[z_dim], name='noise')
condition_vec = tf.keras.Input(shape=[condition_dim], name='condition')
inputs = tf.keras.layers.Concatenate()([noise_vec, condition_vec])

# Model layers
dense = tf.keras.layers.Dense(7 * 7 * 128)(inputs)
net = tf.keras.layers.Reshape([7, 7, 128])(dense)
net = tf.keras.layers.BatchNormalization()(net)
net = tf.keras.layers.Conv2DTranspose(filters, kernel_size=5, strides=2,
                                      padding="SAME", activation="selu")(net)
net = tf.keras.layers.BatchNormalization()(net)
net = tf.keras.layers.Conv2DTranspose(1, kernel_size=5, strides=2,
                                      padding="SAME", activation="tanh")(net)

generator = tf.keras.Model(inputs=[noise_vec, condition_vec], outputs=net)

critic = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(filters, kernel_size=5, strides=2, padding="SAME",
                           activation=tf.keras.layers.LeakyReLU(0.2),
                           input_shape=img_dim),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Conv2D(filters * 4, kernel_size=5, strides=2, padding="SAME",
                           activation=tf.keras.layers.LeakyReLU(0.2)),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Conv2D(filters * 2, kernel_size=5, strides=2, padding="SAME",
                           activation=tf.keras.layers.LeakyReLU(0.2)),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1)  # unbounded critic score, as in WGAN-GP
])
Real Data (Left) | Generated Data (Right)
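To generate a specific class, the condition is typically a one-hot label; here is a short sketch (assuming condition_dim = 10 for the ten digit/fashion classes — how the condition reaches the critic, e.g. as extra image channels, is not shown in the post):

labels = tf.constant([3, 7])                              # classes we want to generate
condition = tf.one_hot(labels, depth=condition_dim)       # shape: (2, 10)
z = tf.random.normal([2, z_dim])
fake_images = generator([z, condition], training=False)   # shape: (2, 28, 28, 1)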

WGAN-GP & Deep Convolutional GAN on Anime Faces Dataset
(Both are generated results)

DCGAN (Left) | WGAN-GP (Right)

Conclusion

Moving from the vanilla GAN architecture to the more advanced WGAN-GP architecture, the generated images gradually gain diversity and quality. For example, comparing the vanilla GAN with DCGAN on Fashion-MNIST, the vanilla generator mostly produces trousers and shirts, whereas with DCGAN we start to see dresses and other fashion items of higher quality.

The goal of this post is not to teach anyone how to write code for GANs; rather, it is a quick session introducing the power of GANs to the audience. Isn't it amazing that computers can generate realistic results from random noise?
