Scalable Training of Markov Logic Networks Using Approximate Counting

Authors: Somdeb Sarkhel, Deepak Venugopal, Tuan Pham, Parag Singla, Vibhav Gogate

AAAI 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this paper, we propose principled weight learning algorithms for Markov logic networks that can easily scale to much larger datasets and application domains than existing algorithms. We demonstrate experimentally that they are orders of magnitude faster and achieve the same accuracy or better than existing approaches. We conducted our experiments on three datasets publicly available on the Alchemy website. Table 3 illustrates our results on the different benchmarks. For all these algorithms, on all four datasets, we performed fivefold cross-validation. In each fold's test set, we measured the conditional log-likelihood (CLL).
Researcher Affiliation | Academia | Somdeb Sarkhel (1), Deepak Venugopal (2), Tuan Anh Pham (1), Parag Singla (3), Vibhav Gogate (1). (1) Department of Computer Science, The University of Texas at Dallas, {sxs104721, txp112330, vxg112130}@utdallas.edu; (2) Department of Computer Science, The University of Memphis, dvngopal@memphis.edu; (3) Indian Institute of Technology Delhi, India, parags@cse.iitd.ac.in
Pseudocode | Yes | Algorithm 1: Scalable Contrastive Divergence (inputs: MLN M, i-bound for IJGP, number of samples N, dataset or world ω, learning rate η, update frequency K). (A hedged code sketch of this kind of update appears after the table.)
Open Source Code | No | The paper does not provide an explicit statement about the release of its source code or a link to a repository for the described methodology.
Open Datasets | Yes | We conducted our experiments on three datasets publicly available on the Alchemy website. We used three datasets: Web KB, Entity Resolution (ER) and Protein Interaction (Protein). As a sanity check, we added another dataset, called Smoker, that we generated randomly for the Friends and Smokers MLN in Alchemy. Table 2 shows the details of our datasets.
Dataset Splits | Yes | For all these algorithms, on all four datasets, we performed fivefold cross-validation. In each fold's test set, we measured the conditional log-likelihood (CLL).
Hardware Specification | No | The paper does not specify hardware details such as GPU models, CPU types, or memory used for the experiments; it only implies that experiments were conducted.
Software Dependencies | No | The paper mentions comparisons with "Alchemy (Kok et al. 2008)" and "Tuffy (Niu et al. 2011)" and uses IJGP as an inference algorithm, but it does not specify software dependencies with version numbers for the authors' own implementation.
Experiment Setup | Yes | Algorithm 1 includes parameters such as the learning rate η and the update frequency K. The methodology section states: "For approximate counting within our learning algorithms, we used the IJGP algorithm. Specifically, for any formula f and world ω, to estimate the number of satisfied groundings of f in ω, we run IJGP on the CSP encoding for (f, ω) for up to 10 iterations." It also mentions: "We collected 100,000 Gibbs samples to estimate each probability." (A sketch of an approximate grounding counter appears after the table.)
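To make the Pseudocode row more concrete, here is a minimal sketch of a contrastive-divergence-style weight update for an MLN in which the exact count of satisfied groundings is replaced by a caller-supplied approximate counter. This is not the paper's Algorithm 1; the names `cd_weight_update`, `gibbs_sweep`, and `approx_count` are hypothetical stand-ins for the components the paper describes (a Gibbs sampler and an IJGP-based counter).

```python
def cd_weight_update(formulas, weights, data_world, gibbs_sweep, approx_count,
                     eta=0.01, num_samples=10):
    """One CD-style step: w_i += eta * (count_i(data) - E_model[count_i]).

    `approx_count(f, world)` is assumed to return an (approximate) number of
    satisfied groundings of formula f in a world; `gibbs_sweep` performs one
    MCMC sweep under the current weights. Both are hypothetical hooks.
    """
    # Approximate counts of satisfied groundings in the observed world.
    data_counts = [approx_count(f, data_world) for f in formulas]

    # Estimate expected counts under the current model with a short chain
    # started at the data, as contrastive divergence prescribes.
    expected = [0.0] * len(formulas)
    world = dict(data_world)
    for _ in range(num_samples):
        world = gibbs_sweep(formulas, weights, world)
        for i, f in enumerate(formulas):
            expected[i] += approx_count(f, world) / num_samples

    # Gradient ascent on the approximate objective.
    return [w + eta * (dc - ec)
            for w, dc, ec in zip(weights, data_counts, expected)]
```

The design point this illustrates is the one the paper emphasizes: the learner only ever touches formula counts through the counting oracle, so swapping an exact counter for an approximate one leaves the update rule unchanged.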
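The Experiment Setup row quotes the paper's use of IJGP on a CSP encoding of (f, ω) to approximate the number of satisfied groundings. Reproducing IJGP is out of scope here, so the sketch below shows a much simpler Monte Carlo stand-in for that counter: sample grounding substitutions uniformly and scale the hit rate by the total number of groundings. The function and argument names are assumptions for illustration only.

```python
import random

def sampled_satisfied_groundings(is_satisfied, domains, world, num_samples=1000):
    """Estimate how many groundings of a formula are satisfied in `world`.

    `domains` is a list of object domains, one per logical variable in the
    formula; `is_satisfied(substitution, world)` checks a single grounding.
    This uniform-sampling estimator is a plain substitute for the paper's
    IJGP-based counter, not a reimplementation of it.
    """
    total_groundings = 1
    for domain in domains:
        total_groundings *= len(domain)

    hits = sum(
        1 for _ in range(num_samples)
        if is_satisfied(tuple(random.choice(d) for d in domains), world)
    )
    # Scale the observed satisfaction rate up to the full grounding count.
    return total_groundings * hits / num_samples
```

Such a counter could be passed directly as the `approx_count` hook in the update sketch above, which is exactly the modularity that makes approximate counting attractive for scaling weight learning.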