Estimating Regression Predictive Distributions with Sample Networks
Authors: Ali Harakeh, Jordan Sir Kwang Hu, Naiqing Guan, Steven Waslander, Liam Paull
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform experiments to answer three questions and draw the following conclusions. Can SampleNet estimate multimodal probability distributions? We test SampleNet on two real-world datasets with data-generating distributions that are multimodal and show its ability to accurately predict these distributions with no assumptions on their parametric form. How does SampleNet perform in comparison to distributional regression baselines? We compare SampleNet to other lightweight distributional regression methods on real-world regression datasets and on monocular depth prediction. SampleNet is shown to perform on par with or better than all tested baselines. ... (a sketch of a sample-predicting head is included after this table) |
| Researcher Affiliation | Academia | ¹Mila - Quebec AI Institute, ²Université de Montréal, ³University of Toronto Institute for Aerospace Studies, ⁴University of Toronto |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper. The methodology is described through prose and mathematical equations. |
| Open Source Code | Yes | Example code can be found at: https://samplenet.github.io/. |
| Open Datasets | Yes | We perform experiments on the Weather (figure 4) and Traffic datasets (figure A.5)... We evaluate the performance of Sample Net in comparison to baselines on datasets from the UCI regression benchmark (Dua and Graff 2017)... We use the NYUv2 dataset (Nathan Silberman and Fergus 2012) for training and testing. |
| Dataset Splits | Yes | Using a proper scoring rule allows us to quantitatively evaluate both the sharpness and the calibration of pθ(y\|x) (Gneiting, Balabdaoui, and Raftery 2007) on validation and test datasets, where the lower its value, the closer pθ(y\|x) is to the data-generating distribution p*(y\|x). ... We report the mean and standard deviation of the ES in table 2 and the Gaussian NLL in table 3, computed over 20 train-test splits, similar to (Seitzer et al. 2022). (A sketch of the sample-based energy score follows this table.) |
| Hardware Specification | No | The paper mentions 'Our compute resources allow us to set a maximum value of M = 25 when K=M and L=1 before running out of memory' (Section 5), but does not provide specific details on CPU, GPU, or other hardware components used for the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the implementation or experimentation. |
| Experiment Setup | Yes | Details of our experimental setup can be found in sect. A. ... We use a learning rate of 0.001 and train for 200 epochs. We set the batch size to 64 for all datasets. |
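
The Research Type row notes that SampleNet predicts distributions "with no assumptions on their parametric form." As a hedged illustration of that general idea (not the authors' architecture), a regression head can represent the predictive distribution pθ(y|x) non-parametrically by emitting a fixed set of M samples per input. The module name `SampleHead`, the layer sizes, and the default M = 25 (borrowed from the memory note in the Hardware Specification row) are assumptions.

```python
import torch
import torch.nn as nn


class SampleHead(nn.Module):
    """Illustrative head that represents p_theta(y|x) by M predicted samples.

    A sketch of the general 'predict a set of samples' idea, not the authors'
    SampleNet architecture; layer sizes and the default M are assumptions.
    """

    def __init__(self, in_features: int, num_samples: int = 25, hidden: int = 128):
        super().__init__()
        self.num_samples = num_samples
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_samples),  # one scalar sample per output unit
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Output shape: (batch, M) -- M samples approximating p_theta(y|x).
        return self.net(x)
```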
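The Dataset Splits row reports the Energy Score (ES), a proper scoring rule, averaged over 20 train-test splits. The sketch below implements the standard sample-based energy score estimator (Gneiting and Raftery) for scalar targets; it is not the authors' evaluation code, and the function name and tensor shapes are assumptions.

```python
import torch


def energy_score(samples: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Sample-based energy score (lower is better) for scalar regression.

    samples: (batch, M) predicted samples approximating p_theta(y|x)
    target:  (batch,)   observed values y
    ES = E|X - y| - 0.5 * E|X - X'|, estimated from the M samples.
    """
    # Mean distance between each predicted sample and the observation.
    term1 = (samples - target.unsqueeze(1)).abs().mean(dim=1)
    # Mean pairwise distance between predicted samples (rewards sharpness).
    pairwise = (samples.unsqueeze(2) - samples.unsqueeze(1)).abs()
    term2 = 0.5 * pairwise.mean(dim=(1, 2))
    return (term1 - term2).mean()
```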
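The Experiment Setup row states a learning rate of 0.001, 200 training epochs, and a batch size of 64. The following minimal sketch wires those reported hyperparameters into a generic training loop; the optimizer choice (Adam), model, dataset, and loss function are illustrative assumptions rather than details from the paper.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hyperparameters reported in the paper's experimental setup.
LEARNING_RATE = 0.001
EPOCHS = 200
BATCH_SIZE = 64


def train(model: torch.nn.Module, dataset: TensorDataset, loss_fn) -> None:
    """Generic training loop using the reported hyperparameters.

    The optimizer and loss are assumptions for illustration, not the authors' code.
    """
    loader = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
    for _ in range(EPOCHS):
        for x, y in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
```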