Deep Feature Consistent Variational Autoencoder
We present a novel method for constructing a Variational Autoencoder (VAE). Instead of using a pixel-by-pixel loss, we enforce deep feature consistency between the input and the output of the VAE, which gives the output images a more natural visual appearance and better perceptual quality. Inspired by recent deep learning work such as style transfer, we employ a pre-trained deep convolutional neural network (CNN) and use its hidden features to define a feature perceptual loss for VAE training. Evaluated on the CelebA face dataset, our model produces better results than other methods in the literature. We also show that our method yields latent vectors that capture the conceptual information of facial expressions and can be used to achieve state-of-the-art performance in facial attribute prediction.
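To make the idea of a feature perceptual loss concrete, the following is a minimal sketch and not the implementation used in this work. It assumes a small stack of fixed random ReLU layers as a stand-in for the pre-trained CNN, operates on flattened images, and combines the feature loss with the standard VAE KL term. The function names, the layer sizes, and the weights `alpha` and `beta` are hypothetical and chosen only for illustration.

```python
import math
import random


def make_layers(sizes, seed=0):
    """Fixed random weight matrices (hypothetical stand-in for a pre-trained CNN)."""
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
         for _ in range(n_out)]
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def hidden_features(x, layers):
    """Return the ReLU activations of every layer for a flattened image x."""
    feats, h = [], x
    for weights in layers:
        h = [max(0.0, sum(w * v for w, v in zip(row, h))) for row in weights]
        feats.append(h)
    return feats


def feature_perceptual_loss(x, x_rec, layers):
    """Sum over layers of the mean squared difference between hidden features."""
    total = 0.0
    for fx, fr in zip(hidden_features(x, layers), hidden_features(x_rec, layers)):
        total += sum((a - b) ** 2 for a, b in zip(fx, fr)) / len(fx)
    return total


def kl_divergence(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), the usual VAE regulariser."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))


def vae_loss(x, x_rec, mu, log_var, layers, alpha=1.0, beta=1.0):
    """Weighted sum of the KL term and the feature loss (alpha, beta are assumptions)."""
    return alpha * kl_divergence(mu, log_var) + beta * feature_perceptual_loss(x, x_rec, layers)


# Toy usage: a 16-pixel "image" and a slightly perturbed reconstruction.
layers = make_layers([16, 8, 4])
x = [i / 15.0 for i in range(16)]
x_rec = [v + 0.1 for v in x]
loss = vae_loss(x, x_rec, mu=[0.0, 0.0], log_var=[0.0, 0.0], layers=layers)
```

The feature network's weights stay fixed and only the encoder and decoder would be trained, so the loss measures how far the reconstruction is from the input in feature space rather than in pixel space. In practice, the random layers would be replaced by the hidden layers of an actual pre-trained CNN.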