DCGAN

PyTorch reimplementation of "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks" (Radford et al., 2016).

Figure 1: DCGAN generator used for LSUN (Radford et al., 2016).
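The Figure 1 architecture can be sketched in PyTorch roughly as follows. This is a hedged sketch of the standard DCGAN generator (a projection of z followed by fractionally-strided convolutions with batch norm and ReLU, and a tanh output); the channel width `ngf` is an assumption, not necessarily the exact value used in this repository.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    # Sketch of the Figure 1 generator: z in R^100 -> 3x64x64 image.
    def __init__(self, z_dim: int = 100, ngf: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            # z (treated as a 1x1 spatial map) -> 4x4
            nn.ConvTranspose2d(z_dim, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(inplace=True),
            # 4x4 -> 8x8
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(inplace=True),
            # 8x8 -> 16x16
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(inplace=True),
            # 16x16 -> 32x32
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(inplace=True),
            # 32x32 -> 64x64; tanh maps pixels to [-1, 1]
            nn.ConvTranspose2d(ngf, 3, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z.view(z.size(0), -1, 1, 1))

g = Generator()
out = g(torch.randn(2, 100))
print(out.shape)  # torch.Size([2, 3, 64, 64])
```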

Equation 1: Loss of the discriminator and generator (Lucic et al., 2018).
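In code, both losses reduce to binary cross-entropy on the discriminator's outputs. The sketch below assumes the discriminator returns raw logits and uses the common non-saturating generator loss; `d_real` and `d_fake` stand in for D(x) and D(G(z)).

```python
import torch
import torch.nn.functional as F

# Stand-ins for discriminator logits on a batch of 8 real / generated images.
d_real = torch.randn(8, 1)
d_fake = torch.randn(8, 1)

# Discriminator: push real logits toward 1, fake logits toward 0.
loss_d = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
          + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))

# Generator (non-saturating form): push fake logits toward 1.
loss_g = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
print(loss_d.item(), loss_g.item())
```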

Large-scale CelebFaces Attributes (CelebA) Dataset

1. Data collection

Download CelebA from one of the two sources:

Ensure the data folder matches the following tree structure and naming convention:

|-- celeba
|   |-- data  // image folder or prepro_data folder
|   |-- landmarks.csv
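Assuming this layout, a minimal PyTorch `Dataset` over the image folder could look like the sketch below. The class name and the scaling to `[-1, 1]` (to match the generator's tanh output) are illustrative, not the repository's actual loader.

```python
from pathlib import Path

import torch
from PIL import Image
from torch.utils.data import Dataset

class CelebAFaces(Dataset):
    # Hypothetical loader for the tree above; point `root` at the celeba
    # folder and `subdir` at either "data" or "prepro_data".
    def __init__(self, root: str, subdir: str = "data"):
        self.paths = sorted(Path(root, subdir).glob("*.jpg"))

    def __len__(self) -> int:
        return len(self.paths)

    def __getitem__(self, i: int) -> torch.Tensor:
        img = Image.open(self.paths[i]).convert("RGB").resize((64, 64))
        # HWC uint8 -> CHW float in [-1, 1], matching the generator's tanh range
        x = torch.tensor(list(img.getdata()), dtype=torch.float32).view(64, 64, 3)
        return x.permute(2, 0, 1) / 127.5 - 1.0
```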

2. Preprocessing (optional)

The original CelebA dataset consists of human faces against diverse backgrounds, which makes the generator's task harder. To simplify it, I used a pretrained YOLOv8 medium (v0.2) face detector to extract tight face crops from the original CelebA images (see utils/prepro.py).

Ensure you download the pretrained weights of YOLOv8 medium, and specify the path to the data directory in ./utils/prepro.py. After that, you are ready to run:

python3 ./utils/prepro.py

NOTE: The preprocessing step using the YOLOv8 model is optional.

3. Training

Modify the training script train_celeba.py to point to your data directory.

python3 train_celeba.py

This script will train both the generator and discriminator on the CelebA dataset. It will automatically create:

  • a file dcgan_report_celeba.csv where the losses of both the generator and the discriminator are stored,
  • a directory checkpoints_dcgan where the weights will be saved,
  • a directory celeba_preview where example images are saved during training.
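Loss curves like those in Figure 2 can be plotted from dcgan_report_celeba.csv. The reader below is a sketch: the column names `loss_d` and `loss_g` are assumptions and may differ from the header the training script actually writes.

```python
import csv

def read_losses(path: str) -> tuple[list[float], list[float]]:
    """Read per-step discriminator/generator losses from the report CSV.
    Column names `loss_d` / `loss_g` are assumed, not guaranteed."""
    loss_d, loss_g = [], []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            loss_d.append(float(row["loss_d"]))
            loss_g.append(float(row["loss_g"]))
    return loss_d, loss_g
```

From there, plotting the two lists with matplotlib reproduces curves in the style of Figure 2.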

Figure 2: Loss of the DCGAN discriminator (left) and the DCGAN generator (right) when training on CelebA.

4. Inference: Generated faces using the DCGAN generator

Figure 3: Image generated by the DCGAN generator by following the instructions in this repository.

Manifold Interpolation

Using the trained generator, you can linearly interpolate between two noise samples and project each intermediate vector through the generator as follows:

# Generator output shape is [1, 3, 64, 64]

noise_start = torch.rand((1, 100))*2 - 1  # uniform in [-1, 1], shape [1, 100]
noise_end = torch.rand((1, 100))*2 - 1    # shape [1, 100]

n_steps = 16
steps = torch.linspace(0, 1, n_steps)     # shape [n_steps]

zs = []
for i in range(n_steps):
    alpha = steps[i]
    zs.append((1 - alpha)*noise_start + alpha*noise_end)
inter_noise = torch.cat(zs, dim=0)        # shape [n_steps, 100]

fake_imgs = generator(inter_noise)        # shape [n_steps, 3, 64, 64]
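The interpolation loop can also be written as one broadcasted expression — a sketch, equivalent up to floating point:

```python
import torch

noise_start = torch.rand((1, 100)) * 2 - 1
noise_end = torch.rand((1, 100)) * 2 - 1

steps = torch.linspace(0, 1, 16).unsqueeze(1)                # shape [16, 1]
inter_noise = (1 - steps) * noise_start + steps * noise_end  # broadcasts to [16, 100]
print(inter_noise.shape)  # torch.Size([16, 100])
```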

Figure 4: Interpolation between a series of 9 noise samples, each projected through the trained DCGAN generator (left to right).

LSUN

1. Data collection

Since the original LSUN dataset is large (> 43 GB), I used a subset of it.

Download the subset (or the original one), and ensure the following tree structure inside the data directory:

|-- LSUN 
|   |-- bedroom
|   |   |-- 0     // image folder
|   |   |-- 1     // image folder
|   |   |-- 2     // image folder
|   |   |-- .      
|   |   |-- .      
|   |   |-- .      

2. Training

Modify the training script train_lsun.py to point to your data directory.

python3 train_lsun.py

This script will train both the generator and discriminator on the LSUN dataset, creating the same report file, checkpoint directory, and preview directory as described for CelebA.

Figure 5: Loss of the DCGAN discriminator (left) and the DCGAN generator (right) when training on LSUN.

3. Inference: Generated bedrooms using the DCGAN generator

Figure 6: Image generated by the DCGAN generator by following the instructions in this repository.

Experimental setup

  • OS: Fedora Linux 42 (Workstation Edition) x86_64

  • CPU: AMD Ryzen 5 2600X (12) @ 3.60 GHz

  • GPU: NVIDIA GeForce RTX 3060 Ti (8 GB VRAM)

  • RAM: 32 GB DDR4 3200 MHz

  • CelebA training time: < 3 hours

  • LSUN training time: < 3 hours

Citations

@misc{radford2016unsupervisedrepresentationlearningdeep,
      title={Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks}, 
      author={Alec Radford and Luke Metz and Soumith Chintala},
      year={2016},
      eprint={1511.06434},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/1511.06434}, 
}
@misc{goodfellow2014generativeadversarialnetworks,
      title={Generative Adversarial Networks}, 
      author={Ian J. Goodfellow and Jean Pouget-Abadie and Mehdi Mirza and Bing Xu and David Warde-Farley and Sherjil Ozair and Aaron Courville and Yoshua Bengio},
      year={2014},
      eprint={1406.2661},
      archivePrefix={arXiv},
      primaryClass={stat.ML},
      url={https://arxiv.org/abs/1406.2661}, 
}
@misc{lucic2018ganscreatedequallargescale,
      title={Are GANs Created Equal? A Large-Scale Study}, 
      author={Mario Lucic and Karol Kurach and Marcin Michalski and Sylvain Gelly and Olivier Bousquet},
      year={2018},
      eprint={1711.10337},
      archivePrefix={arXiv},
      primaryClass={stat.ML},
      url={https://arxiv.org/abs/1711.10337}, 
}
@inproceedings{liu2015faceattributes,
  title = {Deep Learning Face Attributes in the Wild},
  author = {Liu, Ziwei and Luo, Ping and Wang, Xiaogang and Tang, Xiaoou},
  booktitle = {Proceedings of International Conference on Computer Vision (ICCV)},
  month = {December},
  year = {2015} 
}
@article{yu15lsun,
    Author = {Yu, Fisher and Zhang, Yinda and Song, Shuran and Seff, Ari and Xiao, Jianxiong},
    Title = {LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop},
    Journal = {arXiv preprint arXiv:1506.03365},
    Year = {2015}
}
