Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Gaussian Copula Embeddings
Authors: Chien Lu, Jaakko Peltonen
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments on five different scenarios, the proposed model is shown to be effective, outperforming competitive methods in task-based evaluations and yielding insights in a social media analysis task. |
| Researcher Affiliation | Academia | Chien Lu, Jaakko Peltonen (Tampere University) |
| Pseudocode | Yes | The complete stochastic inference procedure is given in Algorithm 1. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | Yes | The Anime rating data is a set of user ratings on anime movies and series collected from myanimelist.net (from Kaggle, https://www.kaggle.com/datasets/CooperUnion/anime-recommendations-database). The CIC Dark-net traffic data set [6]. Spanish Twitch gamers is a subgraph of the Twitch gamers graph data [17]. The Reddit Hyperlink Network [12]. |
| Dataset Splits | No | For all methods, when training the model, we hold out 10% of the data as the testing data set, and the trained models are used to predict the ratings in the test data. The paper specifies a test split but does not explicitly mention a separate validation split or how it was used if present. |
| Hardware Specification | No | The main text of the paper does not provide specific details on the hardware used for running experiments (e.g., GPU/CPU models, memory, or cloud instances). While the checklist indicates this information is in supplementary materials, it is not present in the main paper. |
| Software Dependencies | No | The paper mentions software components and algorithms like 'Adam optimizer' and 'Plackett-Luce model', but it does not provide specific version numbers for any libraries, frameworks, or programming languages used in the implementation. |
| Experiment Setup | Yes | The precision parameter λα is set to 0 corresponding to a very wide prior for α. In experiments we use M = 1000 mini-batches and 5 negative samples for each positive sample. The optimization then updates the embedding vectors in each epoch by gradient steps with step sizes chosen by the Adam optimizer. |
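
The split and sampling settings quoted in the table (a 10% hold-out test set, mini-batches, and 5 negative samples per positive sample) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the representation of data as (user, item) pairs, the uniform negative sampler, and the fixed seeds are all assumptions made for illustration only. The toy example uses 10 mini-batches, whereas the paper reports M = 1000.

```python
import random


def holdout_split(records, test_fraction=0.1, seed=0):
    """Shuffle the records and hold out a fraction of them as test data."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def make_minibatches(records, num_batches):
    """Partition the records into num_batches roughly equal mini-batches."""
    return [records[i::num_batches] for i in range(num_batches)]


def sample_negatives(positive, num_items, observed, k=5, rng=None):
    """Draw k unobserved (user, item) pairs for one positive pair.

    Uniform sampling over items is an assumption; the paper may use a
    different negative-sampling distribution. The loop assumes the user
    has at least one unobserved item, otherwise it would not terminate.
    """
    rng = rng or random.Random(0)
    user, _ = positive
    negatives = []
    while len(negatives) < k:
        candidate = rng.randrange(num_items)
        if (user, candidate) not in observed:
            negatives.append((user, candidate))
    return negatives


# Toy data (hypothetical): 100 users, 20 items, 10 observed items per user.
NUM_ITEMS = 20
pairs = [(u, i) for u in range(100) for i in range(NUM_ITEMS) if (u + i) % 2 == 0]
observed = set(pairs)

train, test = holdout_split(pairs, test_fraction=0.1)
batches = make_minibatches(train, num_batches=10)
negatives = sample_negatives(train[0], NUM_ITEMS, observed, k=5)
```

Because the summary notes that no validation split is described, the sketch produces only train and test partitions; any validation set would be a further assumption.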