Conversation
Job PR-1446-1 is done.
bryanyzhu left a comment
Minor comments, otherwise LGTM. Thank you for the contribution.
| *Notes for training* | ||
| 1) The original tensorflow implementation can't be 100% converted by MXNet. Two functions are missing, gradient penalty and blur. The lack of gradient penalty can cause mode collapse while training, so it is neccessary to tune the learning rate based on the number of GPUs and apply early stop. The lack of blur function results in the low image quality and this is one of the important reasons that high-resolution images can't be generted via our implementation. |
Please add links to the gradient penalty and blur functions in the original TensorFlow implementation, so that interested readers can check them out.
Also, a typo: generted -> generated.
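For readers unfamiliar with the blur step that note 1 says is missing, here is a minimal NumPy sketch of what such a function might look like. It assumes a separable [1, 2, 1] binomial low-pass filter applied per channel with zero "same" padding; the name `blur2d`, the NCHW layout, and the padding choice are illustrative assumptions, not code from this PR or an MXNet API.

```python
import numpy as np

def blur2d(x, taps=(1, 2, 1)):
    """Depthwise low-pass filter over an NCHW array (hypothetical helper).

    Assumes an odd number of taps and zero padding so the output keeps
    the input's spatial size.
    """
    f = np.asarray(taps, dtype=np.float64)
    kernel = np.outer(f, f)       # separable 2D kernel built from the 1D taps
    kernel /= kernel.sum()        # normalize so flat regions keep their level
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(x, ((0, 0), (0, 0), (pad, pad), (pad, pad)), mode="constant")
    h, w = x.shape[2], x.shape[3]
    out = np.zeros(x.shape, dtype=np.float64)
    # Accumulate shifted copies of the padded input, weighted by the kernel.
    for i in range(k):
        for j in range(k):
            out += kernel[i, j] * padded[:, :, i:i + h, j:j + w]
    return out

if __name__ == "__main__":
    image = np.random.rand(1, 3, 8, 8)
    print(blur2d(image).shape)
```

Because the same kernel is applied to every channel independently, this corresponds to a depthwise convolution; in a training framework it would have to be expressed with differentiable operators rather than a NumPy loop.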
| 2) This is an unstable version of StyleGAN. We've tested the training by using 8 GPUs and single GPU. Single GPU can be problematic. The following images are generated by a model trained with 8 GPUs. |
This is an unstable version of StyleGAN -> The training of StyleGAN is not stable at this moment due to the aforementioned reasons.
| 3) It takes around 4 days with 8 GPUs to train a StyleGAN to generate 128x128 images. |
What GPU? 8 GPUs -> 8 K80 GPUs
All minor changes have been done. Please check.
Job PR-1446-2 is done.
Job PR-1446-3 is done.
Job PR-1446-4 is done.
Pull request for adding training for StyleGAN