reproducibilityindex.ai

Learning Mixture of Gaussians with Streaming Data

Authors: Aditi Raghunathan, Prateek Jain, Ravishankar Krishnaswamy

NeurIPS 2017

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	In this paper, we study the problem of learning a mixture of Gaussians with streaming data: given a stream of N points in d dimensions generated by an unknown mixture of k spherical Gaussians, the goal is to estimate the model parameters using a single pass over the data stream. We analyze a streaming version of the popular Lloyd's heuristic and show that the algorithm estimates all the unknown centers of the component Gaussians accurately if they are sufficiently separated. Our main contribution is the first bias-variance bound for the problem of learning Gaussian mixtures with streaming data.
Researcher Affiliation	Collaboration	Aditi Raghunathan, Stanford University, aditir@stanford.edu; Prateek Jain, Microsoft Research, India, prajain@microsoft.com; Ravishankar Krishnaswamy, Microsoft Research, India, rakri@microsoft.com
Pseudocode	Yes	Algorithm 1 InitAlg(N0) ... Algorithm 2 StreamKmeans(N, N0) ... Algorithm 3 StreamSoftUpdate(N, N0)
Open Source Code	No	The paper does not provide any statement or link indicating the availability of open-source code for the described methodology.
Open Datasets	No	The paper describes a synthetic data generation model ('mixture of k spherical Gaussians distributions') but does not specify or provide access information for any publicly available or open dataset.
Dataset Splits	No	The paper is theoretical and does not describe experimental validation with specific training, validation, or test dataset splits.
Hardware Specification	No	The paper does not provide any specific details about the hardware used for running experiments.
Software Dependencies	No	The paper does not provide specific software dependency details, such as library names with version numbers.
Experiment Setup	No	The paper does not provide specific experimental setup details such as hyperparameter values or training configurations.
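
The Research Type row above describes a streaming version of Lloyd's heuristic that estimates the centers in a single pass over the data. The Python sketch below illustrates that general idea only: each arriving point is assigned to its nearest current center, and that center is moved a small step toward the point. It is not the paper's algorithm. The function names, the fixed step size `eta`, the hand-picked initial centers, and the synthetic two-component mixture are all assumptions made for illustration, and the paper's own initialization procedure and learning rate are not reproduced here.

```python
import random


def nearest_center(x, centers):
    """Return the index of the center closest to x (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return min(range(len(centers)), key=lambda i: sq_dist(centers[i]))


def streaming_lloyd(stream, init_centers, eta=0.05):
    """Make one pass over `stream`, nudging the nearest center toward each point.

    `eta` is a hypothetical fixed step size chosen for illustration,
    not the rate analyzed in the paper.
    """
    centers = [list(c) for c in init_centers]
    for x in stream:
        i = nearest_center(x, centers)
        # Convex combination of the old center and the new point.
        centers[i] = [(1 - eta) * c + eta * xi for c, xi in zip(centers[i], x)]
    return centers


def sample_mixture(true_centers, sigma, n, rng):
    """Yield n points from a uniform mixture of spherical Gaussians."""
    for _ in range(n):
        mu = rng.choice(true_centers)
        yield [rng.gauss(m, sigma) for m in mu]


# Synthetic example: two well-separated components in two dimensions.
rng = random.Random(0)
true_centers = [[0.0, 0.0], [10.0, 10.0]]
init_centers = [[1.0, -1.0], [8.0, 11.0]]  # assumed rough initial guesses
estimate = streaming_lloyd(
    sample_mixture(true_centers, 1.0, 5000, rng), init_centers
)
```

Because the points are consumed from a generator, the sketch never stores the stream, which is the property that distinguishes the streaming setting from batch Lloyd's iterations. With a fixed step size the estimates keep fluctuating around the true centers rather than converging exactly, so a smaller `eta` trades slower forgetting of the initial guess for lower variance.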