Generative Forests

Authors: Richard Nock, Mathieu Guillame-Bert

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on the quality of generated data display substantial improvements compared to the state of the art.
Researcher Affiliation | Industry | Richard Nock (Google Research, richardnock@google.com); Mathieu Guillame-Bert (Google, gbm@google.com)
Pseudocode | Yes | Algorithm 1 INIT({Υ_t}_{t=1}^T); Algorithm 2 STARUPDATE(Υ, C, R); Algorithm 3 GF.BOOST(R, J, T)
Open Source Code | Yes | Our code is provided and commented in Appendix, Section V.2.
Open Datasets | Yes | We carried out experiments on a total of 21 datasets, from UCI [10], Kaggle, OpenML, the Stanford Open Policing Project, or simulated. All are presented in Appendix, Section V.1.
Dataset Splits | Yes | The evaluation pipeline is simple: we create for each domain a 5-fold stratified experiment. (A hedged split sketch is given after this table.)
Hardware Specification | Yes | We ran part of the experiments on a MacBook Pro with 16 GB RAM and a 2 GHz Quad-Core Intel Core i5 processor, and part on a desktop Intel(R) Xeon(R) 3.70 GHz machine with 12 cores and 64 GB RAM.
Software Dependencies | Yes | MICE: we used the R MICE package v3.13.0 with two choices of method for the round-robin (column-wise) prediction of missing values: CART [1] and random forests (RF) [42]. (A hedged imputation sketch is given after this table.)
Experiment Setup | Yes | In Table 5, contenders are parameterized as follows: ARFs learn sets of 200 trees; CT-GANs are trained for 1,000 epochs; Forest Flows and VCAEs are run with otherwise default parameters. We optimized MICE by choosing trees (CART) and random forests (RFs) as the supervised models, increasing the number of trees to 100 for better results.
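
The 5-fold stratified evaluation reported above can be reproduced with standard tooling. Below is a minimal sketch assuming scikit-learn and a pandas dataframe whose class column is named `label`; the file name and column name are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of a 5-fold stratified experiment (assumes scikit-learn/pandas;
# the CSV path and "label" column are hypothetical, not taken from the paper).
import pandas as pd
from sklearn.model_selection import StratifiedKFold

df = pd.read_csv("dataset.csv")            # hypothetical dataset file
X = df.drop(columns=["label"])
y = df["label"]

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(skf.split(X, y)):
    train, test = df.iloc[train_idx], df.iloc[test_idx]
    # ... train a generative model on `train`, score generated data against `test` ...
    print(f"fold {fold}: {len(train)} train rows, {len(test)} test rows")
```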
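The MICE baseline was run with the R MICE package, as quoted above. As a rough Python analogue only (not the authors' setup), scikit-learn's IterativeImputer performs the same column-wise round-robin imputation and accepts a tree-based estimator; the sketch below uses 100 random-forest trees to mirror the reported RF setting.

```python
# Rough Python analogue of MICE-style round-robin imputation.
# The paper used the R MICE package (v3.13.0) with CART / random-forest methods;
# this sketch is NOT that code, only an illustration of the same idea.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[rng.random(X.shape) < 0.2] = np.nan      # inject ~20% missing values

# Each column is predicted in turn from the others (round robin), as in MICE.
imputer = IterativeImputer(
    estimator=RandomForestRegressor(n_estimators=100, random_state=0),
    max_iter=10,
    random_state=0,
)
X_imputed = imputer.fit_transform(X)
```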