Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Diffusing Gaussian Mixtures for Generating Categorical Data
Authors: Florence Regol, Mark Coates
AAAI 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our method of evaluation highlights the capabilities and limitations of different generative models for generating categorical data, and includes experiments on synthetic and real-world protein datasets. |
| Researcher Affiliation | Academia | Florence Regol, Mark Coates Dept. Electrical and Computer Engineering, McGill University, Montréal, QC, Canada EMAIL, EMAIL |
| Pseudocode | No | Algorithms detailing the training and sampling procedures are provided in the supplementary. |
| Open Source Code | Yes | The source code is available at https://github.com/networkslab/gmcd. |
| Open Datasets | Yes | As a real world application, we measure the performance of the models on two protein datasets from the Pfam protein family: PF00076, which contains N = 137,605 proteins of length S = 70 and PF00014, which contains N = 13,600 proteins of length S = 53. The number of categories for both datasets corresponds to the list of amino acids K = 21. |
| Dataset Splits | Yes | A split of 70/20/10 is used for the protein datasets. |
| Hardware Specification | Yes | Experiments are conducted on GPU machines NVIDIA GeForce RTX 2060. |
| Software Dependencies | No | The paper mentions using the RAdam optimizer but does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | No | The paper states, 'We provide a complete description of architectures, the hyperparameters selection procedure in the supplementary,' indicating that specific experimental setup details like hyperparameter values are not in the main text. |
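
To make the reported 70/20/10 split concrete, the sketch below shows one way such a split could be produced for a dataset of N sequences. It is illustrative only and is not taken from the paper or its repository: the function name `split_70_20_10`, the fixed seed, the random shuffle, and the train/validation/test ordering of the three parts are all assumptions.

```python
import random


def split_70_20_10(items, seed=0):
    """Shuffle items and split them 70/20/10 (hypothetical helper).

    Assumes the three parts are train/validation/test, in that order,
    and that the split is a uniform random one; the source only states
    the 70/20/10 proportions.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = n * 70 // 100
    n_val = n * 20 // 100

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder, roughly 10%
    return train, val, test


# Example with N = 13,600, the reported size of PF00014.
train, val, test = split_70_20_10(range(13600))
print(len(train), len(val), len(test))
```

With N = 13,600 this yields parts of 9,520, 2,720, and 1,360 items. Any rounding remainder goes to the last part, so the three parts always cover the full dataset without overlap.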