Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Your Mixture-of-Experts LLM Is Secretly an Embedding Model for Free
Authors: Ziyue Li, Tianyi Zhou
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments are conducted on 6 embedding tasks with 20 datasets from the Massive Text Embedding Benchmark (MTEB). The results demonstrate the significant improvement brought by MOEE to LLM-based embedding without further finetuning. |
| Researcher Affiliation | Academia | Ziyue Li, Tianyi Zhou Department of Computer Science University of Maryland, College Park EMAIL |
| Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. It primarily describes methods using natural language and mathematical equations. |
| Open Source Code | Yes | Project: https://github.com/tianyi-lab/MoE-Embedding |
| Open Datasets | Yes | Our experiments are conducted on 6 embedding tasks with 20 datasets from the Massive Text Embedding Benchmark (MTEB). The results demonstrate the significant improvement brought by MOEE to LLM-based embedding without further finetuning. |
| Dataset Splits | Yes | We conduct extensive evaluations of MOEE and compare it with baselines on the Massive Text Embedding Benchmark (MTEB) (Muennighoff et al., 2022), which covers a wide range of tasks designed to test embedding quality. ... For consistent and fair comparisons, we adopt the MTEB evaluation framework and use task-specific metrics... |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper does not list specific versions for any key software components or libraries used in the implementation or experimentation. |
| Experiment Setup | Yes | The final similarity score is then computed as: $\mathrm{sim}_{\text{final}} = \mathrm{sim}_{\text{HS}} + \alpha \cdot \mathrm{sim}_{\text{RW}}$, where α is used as a hyperparameter to control the contribution of RW. To maximize the complementary strengths of HS and RW, we optimize α adaptively at test time. ... All models use per-token routing, but MOEE uses the last token’s routing weights, which consistently outperform averaging across all tokens. For the hidden state (HS) embeddings, we use the last-layer hidden state of the last token. |
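
The weighted-sum similarity quoted in the Experiment Setup row can be illustrated with a minimal sketch. It is not taken from the authors' repository. It assumes cosine similarity for both the hidden-state (HS) and routing-weight (RW) terms, and the function names, toy vectors, and the value of `alpha` are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_similarity(
    hs_a: Sequence[float],
    hs_b: Sequence[float],
    rw_a: Sequence[float],
    rw_b: Sequence[float],
    alpha: float,
) -> float:
    """sim_final = sim_HS + alpha * sim_RW.

    Assumption: both terms are cosine similarities; the excerpt above only
    specifies the weighted sum, not how each term is computed.
    """
    sim_hs = cosine_similarity(hs_a, hs_b)
    sim_rw = cosine_similarity(rw_a, rw_b)
    return sim_hs + alpha * sim_rw


# Toy inputs (hypothetical): last-token hidden states and routing weights
# for two texts; real vectors would come from an MoE LLM forward pass.
hs_text_a, hs_text_b = [1.0, 0.0], [0.0, 1.0]
rw_text_a, rw_text_b = [0.0, 2.0], [0.0, 3.0]
score = combined_similarity(hs_text_a, hs_text_b, rw_text_a, rw_text_b, alpha=0.5)
```

In this toy case the hidden states are orthogonal while the routing weights point the same way, so the RW term supplies the entire score. The quoted text says α is optimized adaptively at test time; that procedure is not described in the excerpt and is not sketched here.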