Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Global Minimizers of Sigmoid Contrastive Loss
Authors: Kiril Bangachev, Guy Bresler, Iliyas Noman, Yury Polyanskiy
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper we theoretically explain the advantages of synchronizing with trainable inverse temperature and bias under the sigmoid loss, as implemented in the recent SigLIP and SigLIP2 models of Google DeepMind. ... All code is available at RepresentationLearningTheory/SigLIP. ... We verify our findings by performing experiments with 8 different SigLIP models from Hugging Face on the ImageNet dataset (models given in Table 5). ... As we can see in Figure 7, the models with trainable $t, b$ (respectively $t, b_{\mathrm{rel}}$) significantly outperform the model with fixed temperature and bias. |
| Researcher Affiliation | Academia | Kiril Bangachev EMAIL Guy Bresler EMAIL Iliyas Noman EMAIL Yury Polyanskiy EMAIL Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology Cambridge, MA, 02139 |
| Pseudocode | No | The paper describes methods and constructions in text and mathematical formulas (e.g., Construction 1, Construction 2), but does not contain explicitly formatted pseudocode or algorithm blocks. |
| Open Source Code | Yes | All code is available at RepresentationLearningTheory/SigLIP. ... Justification: In a repo linked in the abstract: RepresentationLearningTheory/SigLIP. |
| Open Datasets | Yes | We verify our findings by performing experiments with 8 different SigLIP models from Hugging Face on the ImageNet dataset (models given in Table 5). ... The data we used is the validation dataset of ImageNet which contains 50000 captioned images with 1000 distinct captions. |
| Dataset Splits | Yes | The data we used is the validation dataset of ImageNet which contains 50000 captioned images with 1000 distinct captions. ... We embedded all images and labels in the validation set using the B/16 model. |
| Hardware Specification | Yes | For the experiments in Appendix D.1, we used a single A100 GPU. All other experiments are done on a standard CPU and take at most several minutes. |
| Software Dependencies | No | The paper references Adam [KB15] as an optimizer and PIL for image resizing ("We used PIL to resize all images to 224x224."), but it does not specify version numbers for these or other key software libraries. |
| Experiment Setup | Yes | Fixed Low Temperature $t = 200$ and bias $b = 0$. ... Fixed High Temperature $t = 10$ and bias $b = 0$. ... Trainable Temperature and Bias. We initialize at $t = 10 = e^{t'}$, $b = 0$ and run Adam on $\{V_i\}_{i=1}^N, t', b$ for the loss $\mathcal{L}^{\mathrm{Sig}}(\{U_i\}_{i=1}^N, \{V_i\}_{i=1}^N; e^{t'}, b)$ with initial learning rate 0.01. ... The specific experiment in Fig. 7 is for $d = 10$, $N = 100$. |
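
The Experiment Setup row describes a sigmoid contrastive loss evaluated at inverse temperature t = e^{t'} and bias b, with d = 10 and N = 100. The following is a minimal sketch of that loss and of the three initial configurations, not the authors' code. Several details are assumptions made for illustration: the SigLIP-style convention logit_ij = t * <U_i, V_j> + b with label +1 on the diagonal and -1 elsewhere, averaging over N, unit-norm random embeddings, and all function and variable names. The Adam optimization over the V_i, t', and b is deliberately omitted.

```python
import math
import random


def log1pexp(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def random_unit_vector(d, rng):
    """Sample a vector uniformly from the unit sphere in R^d (assumed setup)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def sigmoid_loss(U, V, t_prime, b):
    """Sigmoid contrastive loss with inverse temperature t = exp(t_prime).

    Assumed convention: logit_ij = t * <U_i, V_j> + b, label +1 if i == j
    and -1 otherwise, and the sum of -log sigmoid(label * logit) is
    divided by N.
    """
    t = math.exp(t_prime)
    n = len(U)
    total = 0.0
    for i in range(n):
        for j in range(n):
            inner = sum(a * c for a, c in zip(U[i], V[j]))
            logit = t * inner + b
            label = 1.0 if i == j else -1.0
            # -log sigmoid(z) equals log(1 + exp(-z))
            total += log1pexp(-label * logit)
    return total / n


rng = random.Random(0)
d, N = 10, 100  # sizes quoted for the Fig. 7 experiment
U = [random_unit_vector(d, rng) for _ in range(N)]
V = [random_unit_vector(d, rng) for _ in range(N)]

# Trainable setting: parameterize t through t' so that t = e^{t'} = 10 at start.
t_prime0, b0 = math.log(10.0), 0.0
loss_trainable_init = sigmoid_loss(U, V, t_prime0, b0)

# Fixed settings from the table: t = 200 and t = 10, both with b = 0.
loss_fixed_t200 = sigmoid_loss(U, V, math.log(200.0), 0.0)
loss_fixed_t10 = sigmoid_loss(U, V, math.log(10.0), 0.0)
```

Parameterizing through t' keeps the inverse temperature positive under unconstrained gradient updates, which is one plausible reason for the e^{t'} form quoted in the table.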