Neural Conditional Probability for Uncertainty Quantification
Authors: Vladimir Kostic, Grégoire Pacreau, Giacomo Turri, Pietro Novelli, Karim Lounici, Massimiliano Pontil
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, we show that NCP with a 2-hidden-layer network matches or outperforms leading methods. This demonstrates that a minimalistic architecture with a theoretically grounded loss can achieve competitive results, even against more complex architectures. |
| Researcher Affiliation | Academia | CSML, Istituto Italiano di Tecnologia; University of Novi Sad; CMAP, École Polytechnique; AI Centre, University College London |
| Pseudocode | Yes | Algorithm 1 Separable density learning procedure |
| Open Source Code | Yes | Code is available at https://github.com/CSML-IIT-UCL/NCP. |
| Open Datasets | Yes | To sample data from Econ Density, Arma Jump, Gaussian Mixture, and Skew Normal, we used the library Conditional Density Estimation (Rothfuss et al., 2019) available at https://github.com/freelunchtheorem/Conditional_Density_Estimation, and the Student Performance dataset available at https://www.kaggle.com/datasets/nikhil7280/student-performance-multiple-linear-regression/data. |
| Dataset Splits | Yes | ranging from 10^2 to 10^5, with a validation set of 10^3 samples. |
| Hardware Specification | Yes | Experiments were conducted on a high-performance computing cluster equipped with an Intel(R) Xeon(R) Silver 4210 (Sky Lake) CPU @ 2.20GHz, 377GB RAM, and an NVIDIA Tesla V100 16GB GPU. |
| Software Dependencies | No | The paper mentions various software components and libraries (e.g., 'normflows', 'rfcde library'), but does not provide a comprehensive list of key software dependencies with specific version numbers (e.g., Python, PyTorch, CUDA versions) required to reproduce their implementation. |
| Experiment Setup | Yes | We trained an NCP model with u_θ and v_θ as multi-layer perceptrons (MLPs), each having two hidden layers of 64 units with the GELU activation function in between. The vector σ_θ has a size of d = 100, and γ is set to 10^-3. Optimization was performed over 10^4 epochs using the Adam optimizer with a learning rate of 10^-3. Early stopping was applied based on the validation set with a patience of 1000 epochs. (An illustrative sketch of this setup follows the table.) |
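
As a rough illustration of the Experiment Setup row, the following PyTorch sketch builds two MLPs (u_θ and v_θ) with two hidden layers of 64 GELU units, a learnable σ_θ of size d = 100, and an Adam optimizer with learning rate 10^-3, matching the quoted hyperparameters. The module structure, names (`NCPModel`, `make_mlp`), input dimensions, and the schematic training loop are assumptions for illustration only; the authors' released implementation is in the linked repository, and the NCP loss itself is not reproduced here.

```python
# Minimal sketch of the reported NCP setup (not the authors' code).
# Hyperparameters follow the table; module names and dimensions are assumed.
import torch
import torch.nn as nn

D = 100  # size d of sigma_theta, as reported


def make_mlp(in_dim: int, out_dim: int) -> nn.Sequential:
    """Two hidden layers of 64 units with GELU activations, as described."""
    return nn.Sequential(
        nn.Linear(in_dim, 64), nn.GELU(),
        nn.Linear(64, 64), nn.GELU(),
        nn.Linear(64, out_dim),
    )


class NCPModel(nn.Module):
    """Hypothetical container for u_theta, v_theta, and sigma_theta."""

    def __init__(self, x_dim: int, y_dim: int, d: int = D):
        super().__init__()
        self.u = make_mlp(x_dim, d)                     # u_theta(x)
        self.v = make_mlp(y_dim, d)                     # v_theta(y)
        self.log_sigma = nn.Parameter(torch.zeros(d))   # sigma_theta (size d = 100)

    def forward(self, x, y):
        return self.u(x), self.v(y), self.log_sigma.exp()


# Reported optimization settings; the loop below is schematic only.
model = NCPModel(x_dim=1, y_dim=1)   # x_dim / y_dim depend on the dataset
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
gamma = 1e-3                          # regularization weight gamma
max_epochs = 10_000
patience = 1_000                      # early stopping on the validation set

# for epoch in range(max_epochs):
#     u, v, sigma = model(x_train, y_train)
#     loss = ncp_loss(u, v, sigma, gamma)   # NCP objective from the paper (not shown here)
#     optimizer.zero_grad(); loss.backward(); optimizer.step()
#     # evaluate on the 10^3-sample validation set; stop after `patience` epochs without improvement
```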