When is an Embedding Model More Promising than Another?

Authors: Maxime Darrin, Philippe Formont, Ismail Ben Ayed, Jackie CK Cheung, Pablo Piantanida

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | We demonstrate experimentally that our approach aligns closely with the capability of embedding models to facilitate various downstream tasks in both natural language processing and molecular biology. This effectively offers practitioners a valuable tool for prioritizing model trials.
Researcher Affiliation | Academia | Maxime Darrin (1,2,3,4), Philippe Formont (1,2,4,5), Ismail Ben Ayed (1,5), Jackie Chi Kit Cheung (2,3), Pablo Piantanida (1,2,4,6). Affiliations: 1 International Laboratory on Learning Systems, 2 Mila - Quebec AI Institute, 3 McGill University, 4 Université Paris-Saclay, 5 ÉTS Montréal, 6 CNRS, CentraleSupélec.
Pseudocode | Yes | Procedure 1: Estimation of IS(U → Z); GM_{µ,Σ,w} denotes the Gaussian mixture model with means µ, covariances Σ, and weights w. (A hedged sketch of such an estimator is given after the table.)
Open Source Code | Yes | The code used to perform all experiments is available at https://github.com/ills-montreal/emir
Open Datasets | Yes | We used them to extract embeddings for many different datasets from the MTEB benchmark such as Banking77 [19], SICK-R [122], Amazon Polarity [72], SNLI [120] and IMDB [70]. (An embedding-extraction sketch follows the table.)
Dataset Splits | Yes | Datasets collected are split into a training, validation, and test set, following the scaffold-split strategy, further described in Sec. D.3. (A scaffold-split sketch follows the table.)
Hardware Specification | Yes | All our experiments were conducted on NVIDIA V100 and NVIDIA A6000 GPUs.
Software Dependencies | No | The paper mentions "ADAM [56]" as an optimizer and "RD-Kit and Datamol tool-kits [61, 71]" but does not specify version numbers for these or other key software dependencies required for reproducibility.
Experiment Setup | Yes | All the downstream tasks are trained in the exact same way. We use a dense classifier with two hidden layers of dimension 256 and train for two epochs using ADAM [56] with a learning rate of 10^-3, on the official training set and evaluated on either the validation or test set when they are available (with respect to the Huggingface datasets). (A probe-training sketch follows the table.)
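
The paper's Procedure 1 for estimating IS(U → Z) is only summarized in the Pseudocode row above. Below is a minimal, hedged sketch of one way such an estimator could look: it scores how much better Z is predicted from U than from its marginal density, using scikit-learn's GaussianMixture for the marginal q(z) and a simple linear-Gaussian conditional model for q(z | u). The helper name `information_sufficiency`, the diagonal covariances, and the ridge-regression conditional are illustrative assumptions, not the authors' implementation.

```python
# Sketch only: a crude IS(U -> Z) estimate as E[log q(z|u)] - E[log q(z)].
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.linear_model import Ridge
from sklearn.mixture import GaussianMixture


def information_sufficiency(U, Z, n_components=8):
    """Rough IS(U -> Z) estimate in nats per sample (assumed simplification)."""
    # Marginal model q(z): a Gaussian mixture GM_{mu, Sigma, w} fitted on Z.
    marginal = GaussianMixture(n_components=n_components, covariance_type="diag").fit(Z)
    log_qz = marginal.score_samples(Z)  # per-sample log q(z_i)

    # Conditional model q(z | u): linear mean plus homoscedastic diagonal noise.
    reg = Ridge(alpha=1.0).fit(U, Z)
    residuals = Z - reg.predict(U)
    var = residuals.var(axis=0) + 1e-6
    log_qz_given_u = multivariate_normal.logpdf(
        residuals, mean=np.zeros(Z.shape[1]), cov=np.diag(var)
    )

    return float(np.mean(log_qz_given_u - log_qz))


# Toy usage with random matrices standing in for two models' embeddings.
rng = np.random.default_rng(0)
U = rng.normal(size=(1000, 32))
Z = U @ rng.normal(size=(32, 16)) + 0.1 * rng.normal(size=(1000, 16))
print(information_sufficiency(U, Z))
```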
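For the MTEB datasets cited in the Open Datasets row, a typical embedding-extraction loop looks like the sketch below. The dataset id `banking77` and the `all-MiniLM-L6-v2` encoder are illustrative placeholders, not necessarily the models or loading code used in the paper (see the linked repository for the actual pipeline).

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer

# Illustrative choices: the Banking77 training split and a small sentence encoder.
dataset = load_dataset("banking77", split="train")
model = SentenceTransformer("all-MiniLM-L6-v2")

# Encode the raw text column into an (n_samples, dim) embedding matrix.
embeddings = model.encode(dataset["text"], batch_size=64, show_progress_bar=True)
print(embeddings.shape)
```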
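The scaffold-split strategy mentioned in the Dataset Splits row groups molecules by their Bemis-Murcko scaffold so that structurally related compounds never straddle the train/test boundary. The sketch below is a generic version of that idea using RDKit's `MurckoScaffoldSmiles`; the split ratios and the largest-groups-first ordering are assumptions, not the exact procedure from Sec. D.3.

```python
from collections import defaultdict
from rdkit.Chem.Scaffolds.MurckoScaffold import MurckoScaffoldSmiles


def scaffold_split(smiles_list, frac_train=0.8, frac_valid=0.1):
    """Assign whole scaffold groups to train/valid/test so scaffolds never leak across splits."""
    groups = defaultdict(list)
    for idx, smi in enumerate(smiles_list):
        groups[MurckoScaffoldSmiles(smiles=smi)].append(idx)

    # Generic heuristic: place the largest scaffold groups in train first.
    n = len(smiles_list)
    train, valid, test = [], [], []
    for group in sorted(groups.values(), key=len, reverse=True):
        if len(train) + len(group) <= frac_train * n:
            train.extend(group)
        elif len(valid) + len(group) <= frac_valid * n:
            valid.extend(group)
        else:
            test.extend(group)
    return train, valid, test


# Toy usage on three SMILES strings.
print(scaffold_split(["CCO", "c1ccccc1O", "CC(=O)Oc1ccccc1C(=O)O"]))
```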
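The Experiment Setup row pins down the downstream probe fairly precisely: a dense classifier with two hidden layers of width 256, trained for two epochs with Adam at a learning rate of 10^-3 on frozen embeddings. The PyTorch sketch below follows that description; the ReLU activation, batch size, and cross-entropy loss are assumptions not stated in the quote.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train_probe(embeddings, labels, n_classes):
    """Two-hidden-layer (256-unit) probe, two epochs of Adam at lr 1e-3, as in the quoted setup."""
    probe = nn.Sequential(
        nn.Linear(embeddings.shape[1], 256), nn.ReLU(),
        nn.Linear(256, 256), nn.ReLU(),
        nn.Linear(256, n_classes),
    )
    optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()  # assumed loss; not stated in the quote
    loader = DataLoader(TensorDataset(embeddings, labels), batch_size=256, shuffle=True)

    for _ in range(2):  # two epochs
        for x, y in loader:
            optimizer.zero_grad()
            criterion(probe(x), y).backward()
            optimizer.step()
    return probe


# Toy usage with random tensors standing in for frozen embeddings and labels.
probe = train_probe(torch.randn(512, 768), torch.randint(0, 5, (512,)), n_classes=5)
```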