Particle Filter Recurrent Neural Networks

Authors: Xiao Ma, Peter Karkus, David Hsu, Wee Sun Lee

AAAI 2020, pp. 5101-5108

Reproducibility assessment. Each entry below gives the variable, the result, and the LLM response:
Research Type: Experimental. Experiments demonstrate that the proposed PF-RNNs outperform the corresponding standard gated RNNs on a synthetic robot localization dataset and on 10 real-world sequence prediction datasets for text classification, stock price prediction, etc.
Researcher Affiliation: Academia. Xiao Ma, Peter Karkus, David Hsu, and Wee Sun Lee are affiliated with the National University of Singapore ({xiao-ma, karkus, dyhsu, leews}@comp.nus.edu.sg).
Pseudocode: No. The paper does not contain any pseudocode or clearly labeled algorithm blocks; Figure 3 shows network architectures, but not algorithmic steps.
Open Source Code: Yes. The code is available at https://github.com/Yusufma03/pfrnns
Open Datasets: Yes. The paper cites the datasets it uses, implying public availability, e.g., NASDAQ (Qin et al. 2017), appliances energy prediction (AEP; Candanedo, Feldheim, and Deramaix 2017), and air quality prediction (AIR; De Vito et al. 2008, and PM; Liang et al. 2015). These are standard academic datasets with proper attribution.
Dataset Splits: Yes. We train models on a set of 10,000 trajectories. We evaluate and test on another 1,000 and 2,000 trajectories, respectively. (A minimal split sketch follows the table.)
Hardware Specification: No. The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies: No. The paper does not provide specific version numbers for any software dependencies or libraries used for implementation or experimentation.
Experiment Setup: Yes. Specifically, we use a latent state size of 64 and 30 particles for PF-LSTM and PF-GRU, and a latent state size of 80 for LSTM and 86 for GRU. [...] We perform a search over these hyper-parameters independently for all models and datasets, including learning rate, dropout rate, batch size, and gradient clipping value, and report the best-achieved result. [...] L(θ) = L_pred(θ) + β L_ELBO(θ), where β is a weight parameter. We use β = 1.0 in the experiments.
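
The combined objective quoted above is straightforward to express in code. Below is a minimal PyTorch-style sketch, assuming the model's forward pass produces scalar pred_loss and elbo_loss tensors; the function name and interface are assumptions for illustration, not the authors' implementation (their code is in the repository linked above).

```python
import torch

def pf_rnn_objective(pred_loss: torch.Tensor,
                     elbo_loss: torch.Tensor,
                     beta: float = 1.0) -> torch.Tensor:
    """Combined loss L(theta) = L_pred(theta) + beta * L_ELBO(theta).

    The paper reports beta = 1.0 in the experiments; the two loss
    terms are hypothetical outputs of a PF-RNN forward pass.
    """
    return pred_loss + beta * elbo_loss

# Example with dummy scalar losses:
loss = pf_rnn_objective(torch.tensor(0.5), torch.tensor(1.2))
print(loss)  # tensor(1.7000)
```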
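
The reported trajectory counts also pin down the dataset split exactly. Here is a minimal sketch of that split, assuming the trajectories fit in an indexable sequence; the placeholder data below stands in for the paper's simulator output.

```python
# Placeholder data standing in for the 13,000 simulated localization
# trajectories; the actual trajectories come from the paper's simulator.
trajectories = [f"trajectory_{i}" for i in range(13_000)]

# Split sizes reported in the paper: 10,000 train / 1,000 eval / 2,000 test.
n_train, n_val, n_test = 10_000, 1_000, 2_000

train_set = trajectories[:n_train]
val_set = trajectories[n_train:n_train + n_val]
test_set = trajectories[n_train + n_val:n_train + n_val + n_test]

assert (len(train_set), len(val_set), len(test_set)) == (n_train, n_val, n_test)
```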