Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
On Unbiased Estimation for Partially Observed Diffusions
Authors: Jeremy Heng, Jeremie Houssineau, Ajay Jasra
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate various aspects of our method on an Ornstein-Uhlenbeck model, a logistic diffusion model for population dynamics, and a neural network model for grid cells. |
| Researcher Affiliation | Academia | Jeremy Heng (ESSEC Business School); Jeremie Houssineau (Division of Mathematical Sciences, Nanyang Technological University); Ajay Jasra (School of Data Science, Chinese University of Hong Kong, Shenzhen) |
| Pseudocode | Yes | Algorithm 1 Conditional particle filter (CPF) at parameter $\theta \in \Theta$ and discretization level $l \in \mathbb{N}_0$... Algorithm 2 Two coupled CPF (2-CCPF) at parameter $\theta \in \Theta$ and discretization level $l \in \mathbb{N}_0$... Algorithm 3 Maximal coupling of two resampling distributions $R(w^{1:N})$ and $R(\bar{w}^{1:N})$ (2-Maximal)... Algorithm 4 Four coupled CPF (4-CCPF) at parameter $\theta \in \Theta$ and discretization levels $l - 1$ and $l \in \mathbb{N}$... Algorithm 5 Maximal coupling of the maximal couplings $R(w^{l-1,1:N}, w^{l,1:N})$ and $R(\bar{w}^{l-1,1:N}, \bar{w}^{l,1:N})$ (4-Maximal)... Algorithm 6 Multilevel CPF (ML-CPF) at parameter $\theta \in \Theta$ and discretization levels $l - 1$ and $l \in \mathbb{N}$ |
| Open Source Code | Yes | An R package to reproduce all numerical results can be found at https://github.com/jeremyhengjm/UnbiasedScore. |
| Open Datasets | Yes | Next we consider an application from population ecology to model the dynamics of a population of red kangaroos (Macropus rufus) in New South Wales, Australia. Figure 5a displays data $y_{t_1}, \ldots, y_{t_P} \in \mathbb{N}_0^2$ from Caughley et al. (1987)... neural network model for single neurons to analyze grid cells spike data (https://www.ntnu.edu/kavli/research/grid-cell-data) recorded in the medial entorhinal cortex of rats that were running on a linear track (Hafting et al., 2008). |
| Dataset Splits | No | The paper uses either simulated data or real-world observed data (e.g., population dynamics, grid cell spike data) for parameter inference. It discusses the number of observations (e.g., T=25, P=41) but does not specify any explicit training, validation, or test dataset splits, as the goal is parameter inference on the entire dataset rather than model evaluation through splits. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory specifications) used to run the experiments. It focuses on algorithmic performance and outcomes without mentioning the underlying computational resources. |
| Software Dependencies | No | The paper mentions an "R package to reproduce all numerical results" (https://github.com/jeremyhengjm/UnbiasedScore), indicating that R is used. However, it does not specify the version of R or of any other software libraries or packages. |
| Experiment Setup | Yes | Section 5.1: "$N = 128$ particles", "$b$ = 90%-quantile($\tau_\theta^l$) at level $l = 3$ and $I = b$ (naive); $b$ = 90%-quantile($\tau_\theta^l$) at level $l = 3$ and $I = b$ (simple); and $b$ = 90%-quantile($\tau_\theta^l$) and $I = 10b$ (time-averaged)". Section 5.2: "$N = 256$ particles and adaptive resampling", "burn-in of $b$ = 90%-quantile($\tau_\theta^l$) at level $l = 3$", "prior distribution for the transformed parameters $(\theta_1, \log \theta_2, \log \theta_3, \log \theta_4)$ as $\mathcal{N}_{d_\theta}(\mu_0, \Sigma_0)$, with $\mu_0 = (0, 1, 1, 1)$ and $\Sigma_0 = \mathrm{diag}(5^2, 2^2, 2^2, 2^2)$", "learning rate in (5) be component-dependent by taking $\varepsilon_m = \mathrm{diag}((100 + m)^{-0.6}(10^{-2}, 10^{-2}, 10^{-4}, 10^{-2}))$". Section 5.3: "$N = 256$ particles and adaptive resampling", "burn-in of $b = 100$", "constant learning rate of $\varepsilon_m = 10^{-3}$". |
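
The component-dependent learning rate quoted for Section 5.2 can be made concrete with a short sketch. This is an illustration only, not code from the authors' R package. It assumes the exponents lost in extraction are negative, i.e. $\varepsilon_m = \mathrm{diag}((100 + m)^{-0.6}(10^{-2}, 10^{-2}, 10^{-4}, 10^{-2}))$, and that the update labelled (5) in the paper has the usual stochastic-gradient form $\theta_{m+1} = \theta_m + \varepsilon_m \hat{g}_m$, where $\hat{g}_m$ is an estimate of the score. The names `learning_rate`, `gradient_step`, and `BASE_RATES` are hypothetical.

```python
import numpy as np

# Per-component base rates read from the Section 5.2 quote
# (assumption: the garbled "10 2" and "10 4" are 10^-2 and 10^-4).
BASE_RATES = np.array([1e-2, 1e-2, 1e-4, 1e-2])


def learning_rate(m, offset=100.0, decay=0.6):
    """Return the diagonal matrix eps_m = diag((offset + m)^(-decay) * BASE_RATES)."""
    return np.diag((offset + m) ** (-decay) * BASE_RATES)


def gradient_step(theta, score_estimate, m):
    """One stochastic-gradient update (assumed form of the paper's equation (5))."""
    return theta + learning_rate(m) @ score_estimate


# Usage: the step sizes shrink slowly as the iteration counter m grows.
for m in (0, 100, 1000):
    print(m, np.diag(learning_rate(m)))
```

The offset of 100 keeps the first steps from being too large, while the 0.6 decay exponent falls in the range (0.5, 1] commonly required of Robbins-Monro step sizes.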