Efficient Large-Scale Multi-Modal Classification
Authors: Douwe Kiela, Edouard Grave, Armand Joulin, Tomas Mikolov
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate on three datasets that are large enough to examine accuracy/speed trade-offs in a meaningful way. Two of our datasets (Food101 and MM-IMDB) are medium-sized, while the third dataset (Flickr Tag) is very large by today's standards. The quantitative properties of the respective datasets are shown in Table 1 and they are described in more detail in what follows. [...] The results of the comparison may be found in Table 2. |
| Researcher Affiliation | Industry | Douwe Kiela, Edouard Grave, Armand Joulin, Tomas Mikolov Facebook AI Research {dkiela,egrave,ajoulin,tmikolov}@fb.com |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found. |
| Open Source Code | No | The paper does not provide explicit statements or links for its own open-source code release. |
| Open Datasets | Yes | Food101 The UPMC Food101 dataset (Wang et al. 2015) contains web pages with textual recipe descriptions for 101 food labels automatically retrieved online. [...] MM-IMDB The recently introduced MM-IMDB dataset (Arevalo et al. 2017) contains movie plot outlines and movie posters. [...] Flickr Tag and Flickr Tag-1 We use the Flickr Tag dataset based on the massive YFCC100M Flickr dataset of (Thomee et al. 2016) that was used in (Joulin et al. 2016). |
| Dataset Splits | Yes | Table 1: Evaluation datasets with their quantitative properties. (showing #Train, #Valid, #Test columns) |
| Hardware Specification | No | The paper mentions "trained asynchronously on multiple CPUs" and gives training times but does not specify the exact hardware (e.g., CPU model, GPU type, memory) used for the experiments. |
| Software Dependencies | No | The paper mentions using fastText but does not specify its version or any other software dependencies with version numbers. |
| Experiment Setup | Yes | In all experiments, the model is tuned on the validation set. We tried the following hyperparameters: a learning rate in {0.1, 0.25, 0.5, 1.0, 2.0}, a number of epochs in {5, 10, 20}, a reweighting parameter in {0.01, 0.02, 0.05, 0.1, 0.2, 0.5} and an embedding dimensionality of either 20 or 100. These hyperparameters were swept using grid search, and we used a softmax loss. Other hyperparameters, such as the number of threads in the parallel optimization and the minimum word count, were fixed to fastText's standard values (4 threads and a minimum count of 1, respectively). |
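The grid search described in the Experiment Setup row can be sketched as follows. This is a minimal illustration of the reported search space only; `train_and_eval` is a hypothetical stand-in for training the fastText-based model and returning validation accuracy, not the authors' actual code:

```python
from itertools import product

# Hyperparameter grids as reported in the paper.
learning_rates = [0.1, 0.25, 0.5, 1.0, 2.0]
num_epochs = [5, 10, 20]
reweighting = [0.01, 0.02, 0.05, 0.1, 0.2, 0.5]
embedding_dims = [20, 100]

# Full Cartesian product of the grids: 5 * 3 * 6 * 2 = 180 configurations.
grid = list(product(learning_rates, num_epochs, reweighting, embedding_dims))
print(len(grid))  # 180

def train_and_eval(lr, epochs, gamma, dim):
    # Hypothetical stand-in: train with softmax loss, 4 threads,
    # minimum word count of 1, then score on the validation set.
    raise NotImplementedError

# Selection would then pick the configuration with the best
# validation accuracy, e.g.:
# best_cfg = max(grid, key=lambda cfg: train_and_eval(*cfg))
```

Each configuration is tuned on the validation set, matching the paper's statement that the model is selected there rather than on the test set.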