Learning Predictive State Representations From Non-Uniform Sampling

Authors: Yuri Grinberg, Hossein Aboutalebi, Melanie Lyman-Abramovitch, Borja Balle, Doina Precup

AAAI 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Empirical evaluations on both synthetic and real datasets highlight the advantages of the proposed approach. We conducted several experiments on real and simulated environments in order to evaluate our denoising algorithm, in comparison to the standard PSR learning approach. |
| Researcher Affiliation | Collaboration | 1 National Research Council of Canada; 2 McGill University, Canada; 3 Amazon Research Cambridge, UK |
| Pseudocode | Yes | Algorithm 1: Denoising Algorithm. Data: $\hat{P}_{H,T} = U^{\top} S V \in \mathbb{R}^{m \times n}$, step size $\alpha$, threshold $\epsilon$, regularization parameters $\lambda_1, \lambda_2$. Let $f(Q, R) = \lVert \hat{P}_{H,T} - QR \rVert_{F,W}^{2} + \lambda_1 \left( \sum_{k}^{\lvert C \rvert} \sum_{i,j \in C_k} \left( q_{x_i}^{\top} r_{y_i} - q_{x_j}^{\top} r_{y_j} \right)^{2} \right) + \lambda_2 \sum_{k=1}^{l} \lVert \sum_{y \in Y_k} R_{:,y} - R_{:,j_k} \rVert^{2}$. Result: rank-$k$ matrix $R$. Initialize: $Q = U^{\top} S$; $R = V$. Until convergence: $Q = Q + \alpha \nabla_Q f(Q, R)$; $R = R + \alpha \nabla_R f(Q, R)$. |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The paper uses synthetic, simulated, and 'real data obtained from the robot cyclically exploring his environment using 24 sensors (Freire et al. 2009)'. The citation points to a paper describing how the robot data was gathered, not to a direct link, DOI, or repository giving public access to the dataset itself. |
| Dataset Splits | Yes | Performance was evaluated using 5-fold cross validation, such that each fold represents a new trajectory of the robot. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running the experiments. |
| Software Dependencies | No | The paper mentions software components like 'Fitted-Q iteration (FQI)' and 'Extremely randomized trees' but does not provide specific version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | In all experiments we set the Frobenius norm weights to be the number of samples used to estimate each entry as a proxy to the inverse variance. The SDM was constructed using histories up to length 3 and tests up to length 2. In all cases, we used rank-15 SDM, as it produced better performance for both the standard and denoising methods. A rank-40 SDM was used to compute PSR parameters, as increasing the rank did not have any notable effect on either of the methods. |
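
To make the Pseudocode and Experiment Setup rows more concrete, the following is a minimal Python sketch of the weighted low-rank fitting idea behind Algorithm 1. It is not the authors' implementation. It covers only the weighted Frobenius term of f(Q, R), so the two regularizers weighted by λ1 and λ2 are omitted. It also assumes that the weights are per-entry sample counts rescaled to at most 1, and it writes the update as an explicit descent step on the weighted reconstruction error. The function names `weighted_error` and `weighted_low_rank`, the step size, the tolerance, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def weighted_error(P_hat, W, Q, R):
    """Weighted squared Frobenius error: sum_ij W_ij * (P_hat_ij - (QR)_ij)^2."""
    resid = P_hat - Q @ R
    return float(np.sum(W * resid ** 2))


def weighted_low_rank(P_hat, W, rank, step=1e-2, tol=1e-10, max_iter=5000):
    """Fit P_hat ~ Q @ R by gradient descent on the weighted error only.

    Assumption: the lambda_1 and lambda_2 regularizers of Algorithm 1 are
    left out, so this is a simplified illustration, not the full method.
    """
    # Initialize from the rank-truncated SVD of the noisy matrix.
    U, s, Vt = np.linalg.svd(P_hat, full_matrices=False)
    Q = U[:, :rank] * s[:rank]   # shape (m, rank)
    R = Vt[:rank, :].copy()      # shape (rank, n)

    prev = weighted_error(P_hat, W, Q, R)
    for _ in range(max_iter):
        E = W * (P_hat - Q @ R)  # entrywise weighted residual
        grad_Q = -2.0 * E @ R.T  # gradient of the weighted error w.r.t. Q
        grad_R = -2.0 * Q.T @ E  # gradient of the weighted error w.r.t. R
        Q = Q - step * grad_Q
        R = R - step * grad_R
        cur = weighted_error(P_hat, W, Q, R)
        if abs(prev - cur) < tol:  # stop once the error stops changing
            break
        prev = cur
    return Q, R


# Toy data (hypothetical): a rank-2 matrix whose entries are estimated from
# different numbers of samples, so the noise level differs per entry.
rng = np.random.default_rng(0)
P_true = rng.random((6, 2)) @ rng.random((2, 5))
counts = rng.integers(1, 51, size=P_true.shape)
P_hat = P_true + 0.1 * rng.standard_normal(P_true.shape) / np.sqrt(counts)

# Sample counts act as a proxy for inverse variance; rescaled to keep the
# fixed step size stable (an assumption of this sketch).
W = counts / counts.max()

Q, R = weighted_low_rank(P_hat, W, rank=2)
print(weighted_error(P_hat, W, Q, R))
```

The unweighted truncated SVD treats every entry as equally reliable, whereas the weighted objective lets well-sampled entries dominate the fit. In this sketch the SVD only supplies the starting point, and the gradient steps then trade accuracy on rarely sampled entries for accuracy on frequently sampled ones.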