Contrastive losses as generalized models of global epistasis

Authors: David Brookes, Jakub Otwinowski, Sam Sinai

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We show that contrastive losses are able to accurately estimate a ranking function from limited data even in regimes where MSE is ineffective and validate the practical utility of this insight by demonstrating that contrastive loss functions result in consistently improved performance on empirical benchmark tasks." and "Next, we present simulation results aimed at showing that global epistasis adversely effects the ability of models to effectively learn fitness functions from incomplete data when trained with MSE loss and that models trained with BT loss are more robust to the effects of global epistasis." (A sketch of the MSE and Bradley-Terry losses follows the table.)
Researcher Affiliation | Industry | "David H. Brookes (Dyno Therapeutics), Jakub Otwinowski (Dyno Therapeutics), Sam Sinai (Dyno Therapeutics)" and "david.brookes@dynotx.com"
Pseudocode | No | The paper describes its methods in prose and equations but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "The code for running the simulations used herein is available at https://github.com/dhbrookes/Contrastive-Losses-Global-Epistasis.git."
Open Datasets | Yes | "We particularly focus on the FLIP benchmark [14], which comprises a total of 15 fitness prediction tasks derived from three empirical fitness datasets." and "ProteinGym [28]"
Dataset Splits | Yes | "In every case, the models were fully-connected neural networks with two hidden layers and the optimization was terminated using early stopping with 20 percent of the training data used as a validation set." and "The validation metrics for models trained with the BT and MSE losses were the Spearman and Pearson correlations, respectively, between the model predictions on the validation set and the corresponding labels." (A sketch of this split-and-validate protocol follows the table.)
Hardware Specification | No | The paper describes the computational models and training procedures but does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions using the Adam optimizer [25] and neural networks, but it does not specify version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages (e.g., Python version) used for the implementation.
Experiment Setup | Yes | "For this task, we first sampled a complete latent fitness function, f(x_i) for i = 1, 2, ..., 2^L, from the NK model, using the parameters L = 8, K = 2 and q = 2." and "We then constructed a neural network model, f_θ, in which input binary sequences of length L were transformed by two hidden layers with 100 nodes each and ReLU activation functions and a final linear layer that produced a single fitness output. To fit the parameters of this model, we performed stochastic gradient descent using the Adam method [25] on the Bradley-Terry (BT) loss with all (x_i, y_i) pairs as training data, a learning rate of 0.001, and a batch size of 256." (A sketch of this model and training step follows the table.)
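The Research Type row contrasts training with the MSE loss against the Bradley-Terry (BT) contrastive loss. Below is a minimal sketch of the two objectives on a batch of (sequence, fitness) pairs. The choice of PyTorch is an assumption (the paper names no framework, per the Software Dependencies row), and this is the standard pairwise BT formulation rather than the authors' implementation, which is linked in the Open Source Code row.

```python
import torch
import torch.nn.functional as F

def mse_loss(pred, y):
    """Mean-squared-error regression loss on the raw fitness labels."""
    return F.mse_loss(pred.squeeze(-1), y)

def bradley_terry_loss(pred, y):
    """Pairwise Bradley-Terry (contrastive) loss over all ordered pairs in a batch.

    For every pair (i, j) with y_i > y_j, the model is penalized unless its
    predicted fitness satisfies f(x_i) > f(x_j); only the ranking of the labels
    matters, not their scale, which is what makes the loss robust to monotonic
    (global-epistasis-like) distortions of the measured fitness.
    """
    f = pred.squeeze(-1)
    diff_f = f.unsqueeze(1) - f.unsqueeze(0)          # diff_f[i, j] = f_i - f_j
    higher = (y.unsqueeze(1) - y.unsqueeze(0)) > 0    # True where y_i > y_j
    # -log sigmoid(f_i - f_j) = softplus(-(f_i - f_j)), averaged over valid pairs
    return F.softplus(-diff_f)[higher].mean()

# Example usage on random data:
pred = torch.randn(256, 1)   # model outputs for a batch of 256 sequences
y = torch.randn(256)         # measured fitness labels
print(mse_loss(pred, y).item(), bradley_terry_loss(pred, y).item())
```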
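The Dataset Splits row describes holding out 20 percent of the training data as a validation set for early stopping, monitoring Spearman correlation for BT-trained models and Pearson correlation for MSE-trained models. A minimal sketch of that split-and-validate protocol, assuming NumPy and SciPy (neither library is named in the paper):

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def split_train_val(X, y, val_fraction=0.2, seed=0):
    """Hold out a random 20% of the training data as a validation set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_val = int(val_fraction * len(X))
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    return X[train_idx], y[train_idx], X[val_idx], y[val_idx]

def validation_score(pred_val, y_val, loss_name):
    """Early-stopping metric: Spearman for BT-trained models, Pearson for MSE."""
    corr = spearmanr if loss_name == "bt" else pearsonr
    return corr(pred_val, y_val)[0]
```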
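The Experiment Setup row fixes the simulation architecture and optimizer: two hidden layers of 100 ReLU units, a final linear layer producing a single fitness output, and Adam with a learning rate of 0.001 and a batch size of 256 on the BT loss. The sketch below mirrors those hyperparameters under the same PyTorch assumption as above; sampling the NK landscape itself (L = 8, K = 2, q = 2) is omitted, and the authors' actual code is in the repository linked in the table.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

L = 8  # binary sequence length from the NK simulation (L = 8, K = 2, q = 2)

# Two hidden layers of 100 ReLU units, then a linear layer -> scalar fitness.
model = nn.Sequential(
    nn.Linear(L, 100), nn.ReLU(),
    nn.Linear(100, 100), nn.ReLU(),
    nn.Linear(100, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(x_batch, y_batch):
    """One Adam step on the Bradley-Terry loss for a single batch."""
    optimizer.zero_grad()
    f = model(x_batch).squeeze(-1)
    diff_f = f.unsqueeze(1) - f.unsqueeze(0)                    # f_i - f_j for all pairs
    higher = (y_batch.unsqueeze(1) - y_batch.unsqueeze(0)) > 0  # pairs with y_i > y_j
    loss = F.softplus(-diff_f)[higher].mean()                   # pairwise BT loss
    loss.backward()
    optimizer.step()
    return loss.item()

# Example usage with random binary sequences standing in for NK-model samples:
x = torch.randint(0, 2, (256, L)).float()
y = torch.randn(256)
print(train_step(x, y))
```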