Memory (and Time) Efficient Sequential Monte Carlo

Authors: Seong-Hwan Jun, Alexandre Bouchard-Côté

ICML 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that this difference has a large empirical impact on the quality of the approximation in realistic scenarios and, since memory access is generally slow, on the running time. Our algorithm adaptively selects an optimal number of particles to exploit this fixed memory budget. We show that this adaptation does not interfere with the usual consistency guarantees that come with SMC algorithms. In Figure 2 (b), we show the actual time. We have computed an estimate of P(X1 = +1) for the two methods, which is shown in Figure 1 of the supplement. It can be seen there that both methods approach the true value of 0.5. The results shown in this section are averaged over 3 different runs initialized with different random seeds. To assess the quality of the approximation obtained by large numbers of implicit particles, we first looked at the estimate of the marginal negative log-likelihood as the number of particles is increased. The results over 5 runs are plotted in Figure 3 (a). This experiment was carried out on a simulated dataset of 20 taxa and 1000 sites. We explain data simulation steps in Section 6 of the supplement. In the next experiment, we consider the problem of reconstructing the ancestral relationships between the taxa by inferring the latent tree structure along with the branch lengths. For each pair of taxa, we estimate the pairwise distances; in this case we experimented on a simulated dataset involving 20 taxa and hence there are (20 choose 2) such pairwise distances. We show in Figure 3 (b) the SSD plotted as time of execution increases. (A generic sketch of an SMC sweep under a fixed memory budget is given after the table.)
Researcher Affiliation | Academia | Seong-Hwan Jun SEONG.JUN@STAT.UBC.CA The University of British Columbia, Vancouver, Canada; Alexandre Bouchard-Côté BOUCHARD@STAT.UBC.CA The University of British Columbia, Vancouver, Canada
Pseudocode | Yes | See Figure 1 and Algorithm 1 in the Supplement. See Algorithm 2 in the Supplement for details. See Algorithm 3 in the Supplement for details.
Open Source Code | No | The paper does not provide any statement about releasing the source code for the IPSMC method or a link to a code repository.
Open Datasets | No | This experiment was carried out on a simulated dataset of 20 taxa and 1000 sites. We explain data simulation steps in Section 6 of the supplement. In the next experiment, we consider the problem of reconstructing the ancestral relationships between the taxa by inferring the latent tree structure along with the branch lengths. For each pair of taxa, we estimate the pairwise distances; in this case we experimented on a simulated dataset involving 20 taxa. The paper mentions using 'simulated datasets' but does not provide concrete access information (e.g., a link, DOI, or citation for a public dataset) for these datasets.
Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits, percentages, or sample counts for reproducibility.
Hardware Specification | No | In contrast, we were able to run most of these experiments on a laptop using 4 gigabytes of RAM. The paper mentions 'a laptop using 4 gigabytes of RAM' but does not provide specific CPU or GPU models or other detailed hardware specifications.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup | Yes | We consider the problem for L = 32 at temperatures Tstart = 100 to Tend = 1 with an annealing step size of 0.5. We carried out a simple experiment using K = 10,000 as the memory budget for both IPSMC and SMC, and N = 1 million as the computation ceiling for IPSMC. We examined the density estimated by the particles for R = 100 using σ²_V = σ²_W = 1. (The reported parameters are collected into a configuration sketch after the table.)
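
To make the fixed-memory idea quoted in the Research Type row concrete, the sketch below shows a generic SMC sweep that scores up to N proposals per generation while never holding more than K particles in memory at once. It is not the paper's IPSMC algorithm: `propose(parent, rng)` and `log_weight(particle, t)` are hypothetical, user-supplied callables, and the parent-cycling and streaming resampling scheme are illustrative assumptions chosen only to show how a memory budget (K) can be decoupled from a computation ceiling (N).

```python
import math
import random


def smc_fixed_memory(propose, log_weight, num_generations, K, N, seed=0):
    """Sketch only: a generic SMC sweep under a fixed memory budget of K
    particles, with up to N proposals evaluated per generation.
    """
    rng = random.Random(seed)
    particles = [propose(None, rng) for _ in range(K)]  # initial K-particle population
    log_Z = 0.0                                         # running log-evidence estimate

    for t in range(num_generations):
        survivors = particles              # parents resampled in the previous generation
        slots = [None] * K                 # next generation, filled in a single pass
        max_lw, sum_w = -math.inf, 0.0     # streaming log-sum-exp of the weights

        for i in range(N):
            child = propose(survivors[i % K], rng)  # cycle parents through the budget
            lw = log_weight(child, t)

            # Keep the running weight sum in units of exp(max_lw) for stability.
            if lw > max_lw:
                sum_w = sum_w * math.exp(max_lw - lw) + 1.0
                max_lw = lw
                w = 1.0
            else:
                w = math.exp(lw - max_lw)
                sum_w += w

            # Streaming multinomial resampling: replacing each slot with the current
            # child with probability w / (running sum) leaves every slot distributed
            # according to the normalised weights once the pass is complete.
            for k in range(K):
                if rng.random() * sum_w < w:
                    slots[k] = child

        # log( (1/N) * sum of weights ), computed stably.
        log_Z += max_lw + math.log(sum_w) - math.log(N)
        particles = slots  # the first proposal always fills every slot, so no Nones remain

    return particles, log_Z
```

The inner loop over all K slots makes this O(NK) per generation and is written purely for clarity; the point of the sketch is that the memory footprint stays at K particles even as the number of evaluated proposals N grows.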
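
For reference, the reported setup parameters from the Experiment Setup row can be collected into a small configuration sketch. Only the numeric values come from the paper; the field names and the grouping into three sub-experiments are assumptions made for illustration.

```python
# Values quoted above; structure and names are illustrative assumptions.
experiment_setup = {
    "annealed_model": {
        "L": 32,               # problem size
        "T_start": 100.0,      # initial temperature
        "T_end": 1.0,          # final temperature
        "annealing_step": 0.5, # temperature decrement per step
    },
    "memory_budget_experiment": {
        "K": 10_000,           # memory budget shared by IPSMC and SMC
        "N": 1_000_000,        # computation ceiling for IPSMC
    },
    "density_estimation": {
        "R": 100,              # number of steps examined
        "sigma_V_sq": 1.0,     # σ²_V
        "sigma_W_sq": 1.0,     # σ²_W
    },
}

# The annealing ladder implied by T_start, T_end, and the step size: 100.0, 99.5, ..., 1.0
cfg = experiment_setup["annealed_model"]
num_steps = int((cfg["T_start"] - cfg["T_end"]) / cfg["annealing_step"]) + 1
temperatures = [cfg["T_start"] - cfg["annealing_step"] * i for i in range(num_steps)]
```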