On Variational Inference in Biclustering Models
Authors: Guanhua Fang, Ping Li
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical results are provided in Section 5 to validate the paper's new theoretical findings. |
| Researcher Affiliation | Industry | Guanhua Fang, Ping Li; Cognitive Computing Lab, Baidu Research; 10900 NE 8th St, Bellevue, WA 98004, USA; {guanhuafang, liping11}@baidu.com |
| Pseudocode | Yes | Algorithm 1 (CAVI for Biclustering Model). 1: Input: observations {y_ij}. 2: Output: estimated parameter θ̂. 3: Initialization: randomly sample θ^(0) from the parameter space B; set π_1^(0) = (1/J_1, ..., 1/J_1) and π_2^(0) = (1/J_2, ..., 1/J_2); sample variational parameters φ_{1i}^(0) independently so that Σ_{k∈[J_1]} φ_{1i}^(0)[k] = 1, and φ_{2j}^(0) independently so that Σ_{l∈[J_2]} φ_{2j}^(0)[l] = 1. 4: while not converged do 5: increase the time index, t = t + 1. 6: for each k ∈ [J_1] and l ∈ [J_2], update θ_kl via θ_kl^(t) = argmax_θ Σ_{i,j} φ_{1i}^(t−1)[k] φ_{2j}^(t−1)[l] log f_θ(y_ij \| k, l). 7: for each i ∈ [m_1], update φ_{1i} via φ_{1i}^(t) = argmax_φ Σ_{j∈[m_2]} Σ_{k∈[J_1], l∈[J_2]} φ[k] φ_{2j}^(t−1)[l] log f_{θ^(t)}(y_ij \| k, l) − Σ_{k∈[J_1]} φ[k](log φ[k] − log π_1^(t−1)[k]). 8: for each j ∈ [m_2], update φ_{2j} via φ_{2j}^(t) = argmax_φ Σ_{i∈[m_1]} Σ_{k∈[J_1], l∈[J_2]} φ[l] φ_{1i}^(t)[k] log f_{θ^(t)}(y_ij \| k, l) − Σ_{l∈[J_2]} φ[l](log φ[l] − log π_2^(t−1)[l]). 9: for k ∈ [J_1], update π_1^(t)[k] = (1/m_1) Σ_{i∈[m_1]} φ_{1i}^(t)[k]. 10: for l ∈ [J_2], update π_2^(t)[l] = (1/m_2) Σ_{j∈[m_2]} φ_{2j}^(t)[l]. 11: end while. 12: Set θ̂ = θ^(T_c), where T_c is the time index at which the algorithm converges. (A runnable NumPy sketch of these updates appears after the table.) |
| Open Source Code | No | The paper does not contain any statement about releasing source code, nor does it provide links to a code repository. |
| Open Datasets | No | The paper describes generating synthetic data based on Bernoulli and Poisson biclustering models (e.g., Y_ij ∼ Bernoulli(θ_{z_i z_j}) and Y_ij ∼ Poisson(θ_{z_i z_j})). It does not use or provide access to any publicly available real-world datasets for training or evaluation. |
| Dataset Splits | No | The paper uses simulated data for its experiments and does not specify training, validation, or testing splits. It mentions setting sample sizes (e.g., 'm1 = m2 = m, where m takes values from {100, 200, 300, 400, 500}') and running replications, but no standard dataset splits. |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware (e.g., GPU models, CPU types, memory) used to run the numerical experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details, such as library names with version numbers, that would be needed to replicate the experiments. |
| Experiment Setup | Yes | We set sample size m1 = m2 = m, where m takes values from {100, 200, 300, 400, 500}. We set the number of classes J1 = J2 = J = 2 or 3. The true parameter θ is randomly generated, and π1, π2 are set to uniform priors. For each setting, we run 100 replications. ... The parameter choice is specified as follows. Upper left: π1 = (0.1, 0.2, 0.7), θ11 = 0.9, θ21 = 0.7. Upper right: π1 = (0.1, 0.2, 0.7), θ11 = 0.9, θ21 = 0.6. Bottom left: π1 = (0.2, 0.2, 0.6), θ11 = 1, θ21 = 0. Bottom right: π1 = (0.3, 0.1, 0.6), θ11 = 1, θ21 = 0. For all four cases, m1 = 500 and m2 ∈ {250, 500, 750, 1000}. (A data-generation sketch matching this setup appears after the table.) |
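
The CAVI updates in Algorithm 1 have closed forms in the Bernoulli case. The following is a minimal NumPy sketch of those updates, not code released by the authors: the function and variable names (`cavi_bernoulli_biclustering`, `phi1`, `phi2`) are our own, and the Bernoulli likelihood is an assumed special case of the generic f_θ in the pseudocode.

```python
import numpy as np

def _softmax_rows(logp):
    # Normalize each row of log-probabilities into a probability vector.
    logp = logp - logp.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logp)
    return p / p.sum(axis=1, keepdims=True)

def cavi_bernoulli_biclustering(Y, J1, J2, n_iters=200, tol=1e-6, seed=0):
    """CAVI sketch for Y_ij ~ Bernoulli(theta[z_i, w_j]), following Algorithm 1."""
    rng = np.random.default_rng(seed)
    m1, m2 = Y.shape
    eps = 1e-10
    # Step 3: random initialization; rows of phi1/phi2 sum to one.
    phi1 = rng.dirichlet(np.ones(J1), size=m1)            # q(z_i = k)
    phi2 = rng.dirichlet(np.ones(J2), size=m2)            # q(w_j = l)
    pi1, pi2 = np.full(J1, 1 / J1), np.full(J2, 1 / J2)   # uniform priors
    theta = rng.uniform(eps, 1 - eps, size=(J1, J2))      # overwritten at step 6
    for _ in range(n_iters):
        theta_old = theta
        # Step 6: weighted MLE of theta_kl under the Bernoulli likelihood.
        num = np.einsum('ik,jl,ij->kl', phi1, phi2, Y)
        den = np.outer(phi1.sum(0), phi2.sum(0))
        theta = np.clip(num / den, eps, 1 - eps)
        log_t, log_1t = np.log(theta), np.log1p(-theta)
        # Step 7: row updates. S1[i, l] = sum_j phi2[j, l] * y_ij; S0 is its complement.
        S1 = Y @ phi2
        S0 = phi2.sum(0) - S1
        phi1 = _softmax_rows(np.log(pi1) + S1 @ log_t.T + S0 @ log_1t.T)
        # Step 8: column updates, using the freshly updated phi1.
        T1 = Y.T @ phi1
        T0 = phi1.sum(0) - T1
        phi2 = _softmax_rows(np.log(pi2) + T1 @ log_t + T0 @ log_1t)
        # Steps 9-10: update the mixing proportions.
        pi1, pi2 = phi1.mean(0), phi2.mean(0)
        if np.max(np.abs(theta - theta_old)) < tol:  # convergence check on theta
            break
    return theta, phi1, phi2, pi1, pi2
```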
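The experiments run entirely on synthetic data. Below is a sketch of one replication of the quoted setup under the Bernoulli model; `simulate_bernoulli_biclustering` and the specific values of `m`, `J`, and `theta_true` are illustrative assumptions consistent with the description (uniform priors, randomly generated true θ), not the paper's actual script.

```python
import numpy as np

def simulate_bernoulli_biclustering(m1, m2, pi1, pi2, theta, seed=0):
    """Draw z_i ~ pi1 and w_j ~ pi2, then Y_ij ~ Bernoulli(theta[z_i, w_j])."""
    rng = np.random.default_rng(seed)
    z = rng.choice(len(pi1), size=m1, p=pi1)   # latent row classes
    w = rng.choice(len(pi2), size=m2, p=pi2)   # latent column classes
    Y = rng.binomial(1, theta[np.ix_(z, w)])   # (m1, m2) binary observations
    return Y, z, w

# One replication mirroring the quoted setup: m1 = m2 = m with m in
# {100, ..., 500}, J1 = J2 = J = 2, uniform priors, random true theta.
m, J = 300, 2
rng = np.random.default_rng(1)
theta_true = rng.uniform(0.1, 0.9, size=(J, J))
Y, z, w = simulate_bernoulli_biclustering(m, m, np.full(J, 1 / J), np.full(J, 1 / J), theta_true)
theta_hat, *_ = cavi_bernoulli_biclustering(Y, J, J)  # from the sketch above
```

Note that θ is identifiable only up to permutations of the row and column class labels, so any comparison of `theta_hat` with `theta_true` should be made over the best-matching permutation.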