Generalized Belief Transport
Authors: Junqi Wang, Pei Wang, Patrick Shafto
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We visualize the space of learning models encoded by GBT as a cube which includes classic learning models as special points. We derive critical properties of this parameterized space, including proving continuity and differentiability, which is the basis for model interpolation, and study limiting behavior of the parameters, which allows attaching learning models on the boundaries. Moreover, we investigate the long-run behavior of GBT, explore convergence properties of models in GBT mathematically and computationally, document the ability to learn in the presence of distribution drift, and formulate conjectures about general behavior. Fig. 2 illustrates convergence over learning problems and episodes. In each bar, we sample 100 learning problems (C, θ0, h∗) from a Dirichlet distribution with hyperparameter vector 1. Then we sample 1000 data sequences (episodes) of maximal length N = 10000. The learner learns with Algo. 2, where the stopping condition ω is set to be max_{h∈H} θ(h) > 1 − s with s = 0.001. |
| Researcher Affiliation | Academia | Junqi Wang, Department of Math & CS, Rutgers University, Newark, NJ 07102, junqi.wang@rutgers.edu; Pei Wang, Department of Math & CS, Rutgers University, Newark, NJ 07102, peiwang@rutgers.edu; Patrick Shafto, Department of Math & CS, Rutgers University, Newark, NJ 07102, shafto@rutgers.edu |
| Pseudocode | Yes | Algorithm 1 (Unbalanced Sinkhorn Scaling) and Algorithm 2 (Generalized Belief Transport); a hedged sketch of the unbalanced Sinkhorn step appears after this table. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide links to a code repository or mention code availability in supplementary materials. |
| Open Datasets | No | The paper mentions generating synthetic data for simulations (e.g., 'sample 100 learning problems (C, θ0, h ) from Dirichlet distribution') and uses abstract concepts like 'data sampled from D,' but it does not specify any publicly available datasets by name, provide direct links, DOIs, or formal citations for data access. |
| Dataset Splits | No | The paper does not provide specific details about training, validation, or test dataset splits (e.g., percentages, sample counts, or methodology for splitting) for its experiments. It describes simulations and 'learning episodes' without these explicit divisions. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., exact GPU/CPU models, memory, or detailed computer specifications) used to run the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9') that would allow for reproducible setup of the experiment environment. |
| Experiment Setup | Yes | In each bar, we sample 100 learning problems (C, θ0, h∗) from a Dirichlet distribution with hyperparameter vector 1. Then we sample 1000 data sequences (episodes) of maximal length N = 10000. The learner learns with Algo. 2, where the stopping condition ω is set to be max_{h∈H} θ(h) > 1 − s with s = 0.001. We sample a learning problem with dimension 5 × 5 from a Dirichlet distribution with hyperparameter vector 1. Each learner ϵ = (1, ϵ, η) is equipped with a fixed C, θ0, and ηk = η for all k. We run 400,000 learning episodes per learner. A hedged simulation sketch of this setup appears after the table. |
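The Pseudocode row points to Algorithm 1, Unbalanced Sinkhorn Scaling. As a rough illustration of what such a scaling step looks like, below is a minimal NumPy sketch of entropic unbalanced Sinkhorn iterations with KL-relaxed marginals (in the style of Chizat et al.). The function name, the parameters `eps`, `rho`, `n_iter`, and `tol`, and the choice of relaxing both marginals symmetrically are assumptions for illustration, not the paper's exact Algorithm 1.

```python
import numpy as np

def unbalanced_sinkhorn(C, a, b, eps=0.1, rho=1.0, n_iter=500, tol=1e-9):
    """Sketch of entropic unbalanced Sinkhorn scaling (assumed setup, not the paper's Algo. 1).

    C   : (m, n) cost matrix
    a   : (m,) source marginal (need not be normalized)
    b   : (n,) target marginal
    eps : entropic regularization strength
    rho : KL marginal-relaxation strength (rho -> infinity recovers balanced Sinkhorn)
    """
    K = np.exp(-C / eps)                   # Gibbs kernel
    u = np.ones_like(a, dtype=float)
    v = np.ones_like(b, dtype=float)
    exponent = rho / (rho + eps)           # soft-marginal exponent from the KL relaxation
    for _ in range(n_iter):
        u_prev = u
        u = (a / (K @ v)) ** exponent      # scale rows toward marginal a
        v = (b / (K.T @ u)) ** exponent    # scale columns toward marginal b
        if np.max(np.abs(u - u_prev)) < tol:
            break
    return u[:, None] * K * v[None, :]     # transport plan diag(u) K diag(v)
```

The Experiment Setup row describes sampling 5 × 5 learning problems from a flat Dirichlet and running episodes until the belief on some hypothesis exceeds 1 − s. The sketch below mirrors that protocol with a plain Bayesian reweighting standing in for the GBT update of Algorithm 2, which the excerpt does not spell out; the function names, the random seed, and the stand-in update rule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_learning_problem(n_data=5, n_hyp=5):
    """Sample a learning problem (C, theta0, h_star) with Dirichlet(1, ..., 1) draws.
    C[d, h] is the probability of datum d under hypothesis h (columns sum to 1)."""
    C = rng.dirichlet(np.ones(n_data), size=n_hyp).T   # each column is a flat-Dirichlet draw
    theta0 = rng.dirichlet(np.ones(n_hyp))              # prior belief over hypotheses
    h_star = rng.integers(n_hyp)                        # true hypothesis generating the data
    return C, theta0, h_star

def run_episode(C, theta0, h_star, N=10_000, s=1e-3):
    """One learning episode: observe data from h_star and update the belief until
    max_h theta(h) > 1 - s or N observations are exhausted.
    The Bayes-style reweighting below is a stand-in for the GBT update (Algo. 2)."""
    theta = theta0.copy()
    for t in range(1, N + 1):
        d = rng.choice(C.shape[0], p=C[:, h_star])      # observe a datum
        theta = theta * C[d, :]                         # reweight belief by the likelihood row
        theta /= theta.sum()
        if theta.max() > 1 - s:                         # stopping condition omega
            return t, theta
    return N, theta

C, theta0, h_star = sample_learning_problem()
steps, belief = run_episode(C, theta0, h_star)
print(f"stopped after {steps} observations; belief on true hypothesis: {belief[h_star]:.3f}")
```

In this sketch, repeating `run_episode` over many sampled problems and episodes (as in the paper's 100 problems × 1000 episodes) would give the kind of convergence statistics summarized in Fig. 2.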