Zero-Shot Learning by Convex Combination of Semantic Embeddings
Authors: Mohammad Norouzi, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea Frome, Greg S. Corrado, Jeffrey Dean
ICLR 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The effectiveness of our method, called convex combination of semantic embeddings (ConSE), is evaluated on the ImageNet zero-shot learning task. By employing a convolutional neural network [7] trained only on 1000 object categories from ImageNet, the ConSE model is able to achieve 9.4% hit@1 and 24.7% hit@5 on 1600 unseen object categories, which were omitted from the training dataset. We report quantitative results in terms of two metrics: flat hit@k and hierarchical precision@k. |
| Researcher Affiliation | Collaboration | Mohammad Norouzi (University of Toronto, ON, Canada; norouzi@cs.toronto.edu); Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea Frome, Greg S. Corrado, Jeffrey Dean (Google, Inc., Mountain View, CA, USA; {tmikolov, bengio, singer}@google.com, {shlens, afrome, gcorrado, jeff}@google.com) |
| Pseudocode | No | The paper describes the model and method using mathematical formulas and descriptive text but does not include any formal pseudocode blocks or algorithms. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing the source code for the methodology described in this paper, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We compare our approach, convex combination of semantic embeddings (ConSE), with a state-of-the-art method called Deep Visual-Semantic Embedding (DeViSE) [6] on the ImageNet dataset [3]. |
| Dataset Splits | No | The convolutional neural network of [7], used in both ConSE and DeViSE, is trained on the ImageNet 2012 1K set with 1000 training labels. These test datasets do not include any image labeled with any of the 1000 training labels. The paper clearly defines training and test sets based on class labels, but it does not provide details on a separate validation set or specific image-level splits for training, validation, and testing. |
| Hardware Specification | No | The paper mentions using a 'convolutional neural network' but does not specify any hardware components (e.g., GPU models, CPU types, or cloud computing instances) used for running the experiments. |
| Software Dependencies | No | The paper refers to using a 'convolutional neural network [7]' and a 'skipgram text model [12]' but does not provide specific version numbers for any software dependencies, libraries, or frameworks used in their implementation. |
| Experiment Setup | Yes | T is a hyper-parameter controlling the maximum number of embedding vectors to be considered. We report the results for T = 1, 10, 1000 as ConSE(T) in Table 1. The ConSE(10) model uses the top T = 10 predictions of the Softmax baseline to generate a convex combination of embeddings (a minimal sketch of this prediction rule follows the table). |
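
For reference, below is a minimal sketch of the ConSE prediction rule described in the Experiment Setup and Research Type rows: the top-T softmax scores over the seen classes are renormalized into a convex combination of the corresponding label embeddings, and unseen classes are then ranked by cosine similarity in the embedding space. This is an illustrative reconstruction, not the authors' code; function and variable names are assumptions.

```python
import numpy as np

def conse_embedding(softmax_probs, train_class_embeddings, T=10):
    """Convex combination of semantic embeddings (ConSE) for one image.

    softmax_probs: (n_train_classes,) probabilities from a classifier
        trained only on the seen (training) classes.
    train_class_embeddings: (n_train_classes, d) word embeddings of the
        seen class labels (e.g., from a skip-gram text model).
    T: maximum number of top classifier predictions to combine.
    """
    top = np.argsort(softmax_probs)[::-1][:T]        # indices of the top-T seen classes
    weights = softmax_probs[top]
    weights = weights / weights.sum()                # renormalize -> convex combination
    return weights @ train_class_embeddings[top]     # (d,) predicted image embedding

def zero_shot_predict(image_embedding, test_class_embeddings, k=5):
    """Rank unseen (zero-shot) classes by cosine similarity to the
    predicted image embedding; return indices of the top-k classes."""
    x = image_embedding / np.linalg.norm(image_embedding)
    C = test_class_embeddings / np.linalg.norm(test_class_embeddings,
                                               axis=1, keepdims=True)
    return np.argsort(C @ x)[::-1][:k]
```

With T = 1 this reduces to using only the embedding of the single most likely seen class, which matches the ConSE(1) entry reported in Table 1 of the paper.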
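
The flat hit@k metric quoted in the Research Type row can likewise be sketched as below (an assumed implementation, not code from the paper). Hierarchical precision@k, which additionally credits predictions that fall close to the true label in the ImageNet label hierarchy, is not sketched here.

```python
def flat_hit_at_k(ranked_predictions, true_labels, k=5):
    """Fraction of test images whose true (unseen) label appears among
    the top-k ranked zero-shot predictions.

    ranked_predictions: list of per-image class rankings (best first).
    true_labels: list of ground-truth class indices, one per image.
    """
    hits = sum(1 for ranks, y in zip(ranked_predictions, true_labels)
               if y in list(ranks)[:k])
    return hits / len(true_labels)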