TODTLER: Two-Order-Deep Transfer Learning

Authors: Jan Van Haaren, Andrey Kolobov, Jesse Davis

AAAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental evaluation demonstrates that our approach outperforms existing transfer learning techniques in terms of accuracy and runtime. We present extensive empirical results for TODTLER on three domains: Yeast, Web KB, and Twitter. These results show that TODTLER's approximation outperforms the state-of-the-art deep transfer learning method DTM and the state-of-the-art first-order inductive learner LSM (Kok and Domingos 2010). In addition to learning more accurate models, TODTLER is also much faster than DTM.
Researcher Affiliation | Collaboration | Jan Van Haaren, Department of Computer Science, KU Leuven, Belgium (jan.vanhaaren@cs.kuleuven.be); Andrey Kolobov, Microsoft Research, Redmond, WA, USA (akolobov@microsoft.com); Jesse Davis, Department of Computer Science, KU Leuven, Belgium (jesse.davis@cs.kuleuven.be)
Pseudocode | Yes | Algorithm 1: The TODTLER framework. Algorithm 2: An approximation to TODTLER.
Open Source Code | Yes | Our implementation is available for download. (Footnote 1: http://dtai.cs.kuleuven.be/ml/systems/todtler)
Open Datasets | Yes | We use three datasets, of which the first two have been widely used and are publicly available. The Yeast protein dataset comes from the MIPS Comprehensive Yeast Genome Database (Mewes et al. 2000; Davis et al. 2005). The Web KB dataset consists of labeled web pages from the computer science departments of four universities (Craven and Slattery 2001). The Twitter dataset contains tweets about Belgian soccer matches. (Footnote 3: http://alchemy.cs.washington.edu. Footnote 5: http://dtai.cs.kuleuven.be/ml/systems/todtler)
Dataset Splits | No | The paper describes training on a subset of databases and testing on the remaining databases, but does not explicitly mention a dedicated 'validation' split or its size/proportion.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running experiments.
Software Dependencies | No | The paper mentions using 'Alchemy' and 'MC-SAT' but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | For DTM, we generated all clauses containing at most three literals and three object variables, and transferred five and ten second-order cliques to the target domain. Since DTM's refinement step can be computationally expensive, we limited its runtime to 100 hours per database. For TODTLER, we enumerated all second-order templates containing at most three literals and three object variables. We assumed a uniform prior distribution over the second-order templates in the source domain, which means TODTLER's p^0_T parameter was set to 0.5 for each template. After a burn-in of 1,000 samples, we computed the probabilities with the next 10,000 samples. We applied a pruning threshold of 0.05 on the weights of the clauses.
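
As a rough, self-contained illustration of the sampling schedule quoted above (a uniform prior of 0.5 per second-order template, 1,000 burn-in samples, 10,000 retained samples, and a clause-weight pruning threshold of 0.05), the Python sketch below estimates per-template inclusion probabilities with a toy Gibbs-style sampler. The integer template IDs and the toy_score likelihood stand-in are invented for illustration; this is neither the paper's actual model nor the released implementation.

import random

# Settings quoted from the paper's experiment setup; everything else
# below (templates as integer IDs, toy_score) is an invented stand-in.
PRIOR = 0.5             # uniform prior p^0_T per second-order template
BURN_IN = 1_000         # samples discarded before estimating probabilities
NUM_SAMPLES = 10_000    # samples used to compute the probabilities
PRUNE_THRESHOLD = 0.05  # clauses with |weight| below this are dropped

def toy_score(template_id, active):
    """Invented stand-in for how well a template explains the data."""
    return 1.0 if active else 0.6 + 0.05 * (template_id % 4)

def estimate_template_probabilities(num_templates):
    """Gibbs-style sampling of binary template-inclusion indicators."""
    state = [random.random() < PRIOR for _ in range(num_templates)]
    counts = [0] * num_templates
    for step in range(BURN_IN + NUM_SAMPLES):
        for t in range(num_templates):
            # Resample template t's indicator from its conditional,
            # combining the 0.5 prior with the (toy) data score.
            w_on = PRIOR * toy_score(t, True)
            w_off = (1.0 - PRIOR) * toy_score(t, False)
            state[t] = random.random() < w_on / (w_on + w_off)
        if step >= BURN_IN:  # keep counts only after the burn-in phase
            for t in range(num_templates):
                counts[t] += state[t]
    return [c / NUM_SAMPLES for c in counts]

def prune_clauses(weighted_clauses):
    """Drop clauses whose learned weight magnitude falls below 0.05."""
    return [(c, w) for c, w in weighted_clauses if abs(w) >= PRUNE_THRESHOLD]

if __name__ == "__main__":
    print(estimate_template_probabilities(num_templates=8))
    print(prune_clauses([("clause_a", 0.3), ("clause_b", 0.01)]))

The point of the sketch is only the bookkeeping implied by the quoted settings: a 0.5 prior folded into each conditional, the first 1,000 sweeps discarded, probabilities averaged over the next 10,000 samples, and low-weight clauses pruned at the end.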