Visual Data Synthesis via GAN for Zero-Shot Video Classification
Authors: Chenrui Zhang, Yuxin Peng
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on four video datasets demonstrate that our approach can improve the zero-shot video classification performance significantly. |
| Researcher Affiliation | Academia | Institute of Computer Science and Technology, Peking University, Beijing 100871, China |
| Pseudocode | Yes | Algorithm 1 Training process of the proposed framework |
| Open Source Code | No | The paper links to PyTorch (http://pytorch.org/) but does not provide a specific link or statement about releasing the source code for their proposed methodology. |
| Open Datasets | Yes | HMDB51 [Kuehne et al., 2013], UCF101 [Soomro et al., 2012], Olympic Sports [Niebles et al., 2010] and Columbia Consumer Video (CCV) [Jiang et al., 2011] |
| Dataset Splits | Yes | 50/50 for every dataset, i.e., video features of 50% of the categories are used for model training and the other 50% of categories are held unseen until test time. We take the average accuracy and standard deviation as evaluation metrics and report results over 50 independent, randomly generated splits (a split-generation sketch follows the table). |
| Hardware Specification | No | The paper does not explicitly mention the hardware specifications (e.g., specific GPU or CPU models) used for running the experiments. |
| Software Dependencies | No | Our model is implemented with PyTorch. We adopt GloVe [Pennington et al., 2014] trained on Wikipedia, with a vocabulary of more than 2.2 million unique words, to obtain 300-dimensional semantic embeddings. No version numbers are given for either dependency (a class-embedding sketch follows the table). |
| Experiment Setup | Yes | We train our framework for 300 epochs using the Adam optimizer with momentum 0.9. We initialize the learning rate to 0.01 and decay it by a factor of 0.5 every 50 epochs. Both λ1 and λ2 are set to 1 (an optimizer/scheduler sketch follows the table). |
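The "Dataset Splits" row describes the evaluation protocol but not its implementation. Below is a minimal sketch of how the 50 random 50/50 category splits could be generated, assuming a hypothetical `evaluate_split` callback that trains on the seen categories and tests on the unseen ones; neither the function name nor its interface comes from the paper:

```python
import numpy as np

def zero_shot_protocol(num_classes, evaluate_split, num_splits=50, seed=0):
    """Mean accuracy +/- std over random 50/50 category splits.

    `evaluate_split(seen, unseen)` is a placeholder (not from the paper)
    that trains on the seen categories and returns test accuracy on the
    unseen ones.
    """
    rng = np.random.default_rng(seed)
    accuracies = []
    for _ in range(num_splits):
        perm = rng.permutation(num_classes)
        seen = perm[: num_classes // 2]    # categories used for training
        unseen = perm[num_classes // 2 :]  # categories held unseen until test
        accuracies.append(evaluate_split(seen, unseen))
    return float(np.mean(accuracies)), float(np.std(accuracies))
```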
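For the 300-dimensional GloVe semantic embeddings, the following sketch loads pre-trained vectors from a standard GloVe text file and embeds class names. Averaging the word vectors of multi-word class names (e.g., "ride horse") is an assumption; the paper does not say how such names are handled:

```python
import numpy as np

def load_glove(path):
    """Parse a GloVe text file ("word v1 v2 ...") into a dict of vectors."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            word, *values = line.rstrip().split(" ")
            vectors[word] = np.asarray(values, dtype=np.float32)
    return vectors

def class_embedding(name, vectors, dim=300):
    """Embed a class name; multi-word names are averaged (an assumption)."""
    words = [w for w in name.lower().split() if w in vectors]
    if not words:
        return np.zeros(dim, dtype=np.float32)
    return np.mean([vectors[w] for w in words], axis=0)
```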
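The "Experiment Setup" row maps directly onto PyTorch's optimizer API. A sketch of that configuration, assuming "Adam with momentum 0.9" means beta1 = 0.9, and using a stand-in network and dummy loss in place of the paper's GAN objectives:

```python
import torch

# Stand-in network; the paper's generator/discriminator are not specified here.
model = torch.nn.Linear(300, 4096)

# "Adam with momentum 0.9" corresponds to beta1 = 0.9 in PyTorch's Adam.
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, betas=(0.9, 0.999))
# Decay the learning rate by a factor of 0.5 every 50 epochs, per the table.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.5)

for epoch in range(300):
    optimizer.zero_grad()
    # Dummy loss standing in for the paper's objectives (lambda1 = lambda2 = 1).
    loss = model(torch.randn(8, 300)).pow(2).mean()
    loss.backward()
    optimizer.step()
    scheduler.step()
```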