Separation Results between Fixed-Kernel and Feature-Learning Probability Metrics

Authors: Carles Domingo i Enrich, Youssef Mroueh

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type: Experimental — "To validate and clarify our findings, we perform experiments in the settings studied in Sec. 5, Sec. 6 and Sec. 7. We use the ReLU activation function σ(x) = (x)+, although we remark that the results of Sec. 5 hold for a generic activation function, and the results of Sec. 6 and Sec. 7 hold for non-negative integer powers of the ReLU activation. The empirical estimates in the plots are detailed in App. G. They are averaged over 10 repetitions; the error bars show the maximum and minimum."
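The experimental protocol quoted above (a ReLU-family activation, estimates averaged over 10 repetitions, error bars given by the min and max) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the estimator below (the empirical mean of σ(x) over standard Gaussian samples) is a hypothetical stand-in for the paper's kernel estimates.

```python
import random

def relu_power(x, k=1):
    # sigma(x) = max(x, 0)**k; k = 1 is the plain ReLU sigma(x) = (x)+
    # used in the paper's experiments, and non-negative integer powers
    # k cover the activations assumed in Sec. 6 and Sec. 7.
    return max(x, 0.0) ** k

def repeated_estimate(estimator, repetitions=10):
    # Average an empirical estimate over several repetitions; report the
    # min and max across repetitions, matching the paper's error bars.
    values = [estimator() for _ in range(repetitions)]
    mean = sum(values) / len(values)
    return mean, min(values), max(values)

def example_estimator(n=1000):
    # Hypothetical estimator: empirical mean of sigma(x) over n samples
    # drawn from a standard Gaussian.
    samples = [random.gauss(0.0, 1.0) for _ in range(n)]
    return sum(relu_power(x) for x in samples) / n

random.seed(0)
mean, lo, hi = repeated_estimate(example_estimator)
```

For a standard Gaussian, E[max(X, 0)] = 1/√(2π) ≈ 0.399, so `mean` should land near that value, with `lo` and `hi` giving the spread across the 10 repetitions.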
Researcher Affiliation: Collaboration — Carles Domingo-Enrich, Courant Institute of Mathematical Sciences (NYU), cd2754@nyu.edu; Youssef Mroueh, IBM Research AI, mroueh@us.ibm.com
Pseudocode: No — The paper provides mathematical definitions and descriptions of methods but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code: No — The paper does not contain any explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets: No — The paper constructs specific probability measures µd and νd for its experiments and uses "samples of µd and νd" or "a standard multivariate Gaussian and a Gaussian with unit variance", but it does not state that these are publicly available datasets or provide access information for them.
Dataset Splits: No — The paper describes using a certain number of samples for empirical estimates (e.g., "4400 million samples of µd and νd are used"), but it does not specify any training, validation, or test dataset splits.
Hardware Specification: No — The paper describes experiments but does not specify any hardware details such as GPU models, CPU types, or memory specifications.
Software Dependencies: No — The paper mentions the ReLU activation function but does not provide any specific software names with version numbers that are necessary for replication.
Experiment Setup: No — The paper mentions the use of the ReLU activation function and the number of samples used for estimates (e.g., "4400 million samples"), but it does not provide specific hyperparameter values like learning rates, batch sizes, optimizer settings, or detailed training configurations in the main text.