Online Bayesian Transfer Learning for Sequential Data Modeling

Authors: Priyank Jaini, Zhitang Chen, Pablo Carbajal, Edith Law, Laura Middleton, Kayla Regan, Mike Schaekermann, George Trimponias, James Tung, Pascal Poupart

ICLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This section describes experiments on three tasks from different domains: activity recognition, sleep cycle prediction among healthy individuals and patients suffering from Parkinson's disease, and packet flow prediction in telecommunication networks.
Researcher Affiliation | Collaboration | 1 David R. Cheriton School of Computer Science, University of Waterloo, Ontario, Canada; 2 Department of Kinesiology, University of Waterloo, Ontario, Canada; 3 Dept. of Mechanical and Mechatronics Engineering, University of Waterloo, Ontario, Canada; 4 Noah's Ark Laboratory, Huawei Technologies, Hong Kong, China
Pseudocode | Yes | Algorithm 1: Online Transfer Learning by Bayesian Moment Matching. (A hedged sketch of a moment-matching update appears after this table.)
Open Source Code | No | The paper states "Our implementation is based on the Theano library (Theano Development Team, 2016) in Python." but provides neither a link nor an explicit statement about releasing its own source code for the described method.
Open Datasets | No | The paper mentions a "publicly available dataset of real traffic from academic buildings" for flow direction prediction, but does not provide a specific link, DOI, formal citation with authors/year, or a repository name to access it.
Dataset Splits | Yes | "We report the results based on leave-one-out cross validation where the data of a different individual is left out in each round." For each task, every individual is treated as the target individual once. (A leave-one-subject-out sketch appears after this table.)
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments or train the models.
Software Dependencies | No | The paper states "Our implementation is based on the Theano library (Theano Development Team, 2016) in Python." but does not specify version numbers for Python or Theano, nor does it list other software dependencies with versions.
Experiment Setup | Yes | We perform grid search to select the best hyper-parameters for each setting. Training uses either Nesterov's accelerated gradient descent (Nesterov, 1983; Sutskever et al., 2013) with learning rates [0.001, 0.01, 0.1, 0.2] and momentum values [0, 0.2, 0.4, 0.6, 0.8, 0.9], or rmsprop (Tieleman & Hinton, 2012) with ε = 10^-4 and decay factor 0.9 (standard values), learning rates [0.00005, 0.0001, 0.0002, 0.001], and momentum values [0, 0.2, 0.4, 0.6, 0.8, 0.9]. The weight decay takes values from [0.001, 0.01, 0.1], and the number of LSTM units in the hidden layer takes values from [2, 4, 6, 9, 12]. (The sketch after this table enumerates this grid.)
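The paper's Algorithm 1 (Online Transfer Learning by Bayesian Moment Matching) is not reproduced in the quoted material, but the moment-matching step it builds on can be illustrated. Below is a minimal sketch, assuming a univariate Gaussian mixture with known component means and variances and a Dirichlet prior over the mixture weights; this simplification and all names are ours, not the paper's. After each observation, the exact posterior over the weights is a mixture of Dirichlets, which is projected back to a single Dirichlet by matching first and second moments.

    import numpy as np
    from scipy.stats import norm

    def bmm_weight_update(alpha, x, means, sds):
        """One online moment-matching step for the Dirichlet over mixture weights."""
        A = alpha.sum()
        # Posterior responsibility of each component for observation x.
        c = (alpha / A) * norm.pdf(x, means, sds)
        c /= c.sum()
        K = len(alpha)
        m = np.empty(K)  # first moments E[w_j] of the Dirichlet mixture
        s = np.empty(K)  # second moments E[w_j^2]
        for j in range(K):
            # Under posterior component k, the weights follow Dir(alpha + e_k),
            # so E[w_j | k] uses alpha_j + 1 when k == j and alpha_j otherwise.
            aj = alpha[j] + (np.arange(K) == j)
            m[j] = np.dot(c, aj) / (A + 1.0)
            s[j] = np.dot(c, aj * (aj + 1.0)) / ((A + 1.0) * (A + 2.0))
        # Project back to a single Dirichlet by matching E[w_0] and E[w_0^2].
        A_new = (m[0] - s[0]) / (s[0] - m[0] ** 2)
        return m * A_new

    # Toy stream drawn near the third component's mean.
    rng = np.random.default_rng(0)
    alpha = np.ones(3)                            # uniform Dirichlet prior
    means, sds = np.array([-2.0, 0.0, 2.0]), np.ones(3)
    for x in rng.normal(2.0, 1.0, size=500):
        alpha = bmm_weight_update(alpha, x, means, sds)
    print(alpha / alpha.sum())                    # weight mass should favor component 3

Matching E[w_j] and E[w_j^2] in closed form is what keeps the update constant-time per observation, which is the property that makes moment matching usable in the online setting the paper targets.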
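The split quoted under Dataset Splits maps directly onto a leave-one-subject-out loop. A minimal sketch, where train and evaluate are hypothetical placeholders for the paper's unreleased fitting and scoring routines:

    def leave_one_subject_out(data_by_subject, train, evaluate):
        """Hold out each individual once as the target; train on everyone else."""
        scores = {}
        for target, target_data in data_by_subject.items():
            source_data = [d for s, d in data_by_subject.items() if s != target]
            model = train(source_data)                # fit on source individuals only
            scores[target] = evaluate(model, target_data)
        return scores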
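The Experiment Setup row fully specifies the search space, so the grid can be enumerated explicitly. A minimal sketch; evaluating each configuration would require the paper's unreleased training code, so that step is left as a comment:

    from itertools import product

    nesterov_cfgs = [dict(opt="nesterov", lr=lr, momentum=mom)
                     for lr, mom in product([0.001, 0.01, 0.1, 0.2],
                                            [0, 0.2, 0.4, 0.6, 0.8, 0.9])]
    rmsprop_cfgs = [dict(opt="rmsprop", lr=lr, momentum=mom, eps=1e-4, decay=0.9)
                    for lr, mom in product([0.00005, 0.0001, 0.0002, 0.001],
                                           [0, 0.2, 0.4, 0.6, 0.8, 0.9])]

    grid = [dict(base, weight_decay=wd, n_lstm_units=n)
            for base in nesterov_cfgs + rmsprop_cfgs
            for wd, n in product([0.001, 0.01, 0.1], [2, 4, 6, 9, 12])]

    print(len(grid))  # (24 + 24) optimizer settings x 3 decays x 5 unit counts = 720
    # Each cfg would be passed to the training routine, keeping the
    # configuration with the best validation score.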