Controllable Image-to-Video Translation: A Case Study on Facial Expression Generation
Authors: Lijie Fan, Wenbing Huang, Chuang Gan, Junzhou Huang, Boqing Gong
AAAI 2019, pp. 3510–3517 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments and user studies verify the effectiveness of our approach. |
| Researcher Affiliation | Collaboration | Lijie Fan (1); affiliations: 1. Massachusetts Institute of Technology, 2. Tencent AI Lab, 3. MIT-Watson Lab; emails: lijiefan@mit.edu, hwenbing@126.com, chuangg@mit.edu, jzhuang@uta.edu, boqinggo@outlook.com |
| Pseudocode | No | The paper does not contain a block labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | No | The paper provides no statement or link indicating that the source code for its method is publicly available. |
| Open Datasets | Yes | We not only use the public CK+ (Lucey et al. 2010) dataset for model training but also significantly extend it in scale. The new larger-scale dataset is named CK++. |
| Dataset Splits | Yes | We use 10 video clips from the CK++ dataset for validation and all the others for training. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using the 'Dlib Library (King 2009)' but does not specify its version, nor the versions of any other software dependencies. |
| Experiment Setup | Yes | The Adam optimizer is used in the experiments, with an initial learning rate of 0.0002. The whole training process takes 2100 epochs, where one epoch means a complete pass over the training data. All images are resized to 289×289 and randomly cropped to 256×256 before being fed into the network. The small increment is set to α = 0.1 for the temporal regularization term R_t. |
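The setup and split rows pin down enough hyper-parameters to reconstruct a training skeleton even though no code was released. Below is a minimal PyTorch sketch of that configuration; the single-layer model, random frames, and L1 loss are placeholder assumptions, while the resize/crop pipeline, Adam optimizer, 2e-4 learning rate, and 2100-epoch schedule follow the values quoted above.

```python
# Minimal sketch of the reported training configuration (not the authors' code).
# Only the preprocessing, optimizer, learning rate, and epoch count follow the
# quoted setup; the model, data, and loss are stand-in assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from torchvision import transforms

# Resize to 289x289, then randomly crop to 256x256, as reported.
preprocess = transforms.Compose([
    transforms.Resize((289, 289)),
    transforms.RandomCrop(256),
])

VAL_CLIPS = 10    # per the "Dataset Splits" row: 10 CK++ clips held out for validation
NUM_EPOCHS = 2100 # "one epoch means a complete pass over the training data"

# Placeholder stand-ins for the paper's generator and CK++ frames.
model = nn.Conv2d(3, 3, kernel_size=3, padding=1)
frames = preprocess(torch.rand(8, 3, 320, 320))  # fake "video frames"
loader = DataLoader(TensorDataset(frames, frames), batch_size=4)

# Adam with the reported initial learning rate of 2e-4.
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
criterion = nn.L1Loss()  # the paper's actual loss differs; L1 is an assumption

for epoch in range(NUM_EPOCHS):
    for inputs, targets in loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
```

Note that the sketch applies the random crop once up front for brevity; a faithful pipeline would re-crop each sample per epoch inside a Dataset's transform.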