reproducibilityindex.ai

Learning Concept Embeddings for Query Expansion by Quantum Entropy Minimization

Authors: Alessandro Sordoni, Yoshua Bengio, Jian-Yun Nie

AAAI 2014

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental study. All our experiments were conducted using the open source Indri search engine (http://www.lemurproject.org). ... We choose to use the three set of topics of the TREC Web Track from 2010 to 2012 (topics 51-200). In addition to MAP, precision at top ranks is an important feature for query expansion models. Hence, we also report NDCG@10 and the recent ERR@10... The statistical significance of differences in the performance of tested methods is determined using a randomization test (Smucker, Allan, and Carterette 2007) evaluated at α < 0.05. ... Results Table 3 resumes all our experimental results.
Researcher Affiliation	Academia	Alessandro Sordoni, Yoshua Bengio and Jian-Yun Nie, DIRO, Université de Montréal, Montréal, Québec
Pseudocode	Yes	Figure 1: Algorithms for training (a) and testing (b) the hyperparameters Φ of the expansion models directly on MAP. (a) Training phase. Q ← train queries. For t = 1 ... n: 1. Φt ← Random(ΩΦ); 2. Mt ← Train(D, Φt); 3. QE ← Expand(Q, Mt); 4. λt ← Grid(QE, λ); 5. MAPΦt ← Search(QE, λt); 6. If MAPΦt > MAPΦ*: Φ* = Φt, λ* = λt. Return Φ*, λ*. (b) Testing phase. Q ← test queries. 1. M* ← Train(D, Φ*); 2. QE ← Expand(Q, M*); 3. MAPΦ* ← Search(QE, λ*); 4. Return MAPΦ*.
Open Source Code	No	All our experiments were conducted using the open source Indri search engine (http://www.lemurproject.org).
Open Datasets	Yes	We test the effectiveness of our approach on the ClueWeb09B collection, a noisy web collection containing 50,220,423 documents. ... For this paper, we built the anchor log from the high-quality Wikipedia collection (http://www.wikipedia.org).
Dataset Splits	Yes	We report the results obtained by performing 5-fold cross-validation.
Hardware Specification	No	No specific hardware details (e.g., CPU/GPU models, memory, or cluster specifications) were found.
Software Dependencies	No	All our experiments were conducted using the open source Indri search engine (http://www.lemurproject.org).
Experiment Setup	Yes	For all the embeddings model, we fix the number of latent dimensions to K = 100, the number of epochs to 3. For SSI, we cross-validate the gradient step, while for QEM we include also the margin m. ... Our procedure is depicted in Fig. 1. Given our anchor log D, we sample hyperparameters Φ from a uniform distribution over a fine-grained set of possible values ΩΦ. Clamping Φ, we train the model parameters (embeddings or translation probabilities) on the anchor log. We expand the original queries by selecting the top-10 concepts according to the parameterization discussed previously. Finally, we tune by grid-search the smoothing parameter λ. We repeat the process n = 50 times in order to have good chances to find minima of the hyperparameter space.
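
The tuning procedure quoted in the Pseudocode and Experiment Setup rows (random sampling of Φ, a grid search over λ, and keeping the configuration with the best MAP) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the function `random_search`, its callback parameters `train`, `expand` and `search_map`, the toy hyperparameter values, and the synthetic scoring function are all hypothetical stand-ins. In the paper, training runs on the anchor log and search runs through Indri.

```python
import random
from typing import Callable, Dict, Sequence, Tuple

Params = Dict[str, float]


def random_search(
    sample_space: Dict[str, Sequence[float]],    # hypothetical stand-in for the value set Omega_Phi
    lambda_grid: Sequence[float],                # candidate smoothing values for the grid search
    train: Callable[[Params], object],           # stand-in for Train(D, Phi_t)
    expand: Callable[[object], object],          # stand-in for Expand(Q, M_t)
    search_map: Callable[[object, float], float],  # stand-in for Search(QE, lambda), returns MAP
    n_trials: int = 50,                          # the quoted setup repeats the process n = 50 times
    seed: int = 0,
) -> Tuple[Params, float, float]:
    """Return the best (Phi, lambda, MAP) found over n_trials random draws."""
    rng = random.Random(seed)
    best_phi: Params = {}
    best_lam = float("nan")
    best_map = float("-inf")
    for _ in range(n_trials):
        # Step 1: draw each hyperparameter uniformly from its candidate values.
        phi = {name: rng.choice(list(values)) for name, values in sample_space.items()}
        # Steps 2-3: train with Phi clamped, then expand the training queries.
        model = train(phi)
        expanded = expand(model)
        # Steps 4-5: grid search over lambda, scoring each value by MAP.
        lam, score = max(
            ((l, search_map(expanded, l)) for l in lambda_grid),
            key=lambda pair: pair[1],
        )
        # Step 6: keep the best configuration seen so far.
        if score > best_map:
            best_phi, best_lam, best_map = phi, lam, score
    return best_phi, best_lam, best_map


# Toy usage with a synthetic score in place of a real retrieval run.
toy_space = {"gradient_step": [0.01, 0.1, 1.0], "margin": [0.5, 1.0]}
toy_lambdas = [0.1, 0.3, 0.5, 0.7, 0.9]


def toy_train(phi: Params) -> Params:
    return phi  # the "model" is just the sampled hyperparameters


def toy_expand(model: object) -> object:
    return model  # no real query expansion in this sketch


def toy_map(expanded: object, lam: float) -> float:
    phi = expanded  # type: ignore[assignment]
    # Synthetic surface that peaks at gradient_step = 0.1 and lambda = 0.5.
    return -((phi["gradient_step"] - 0.1) ** 2) - (lam - 0.5) ** 2


best_phi, best_lam, best_map = random_search(
    toy_space, toy_lambdas, toy_train, toy_expand, toy_map
)
```

At test time, following part (b) of the figure, the selected Φ and λ would be reused unchanged: the model is retrained with the chosen Φ, the test queries are expanded, and MAP is reported for the chosen λ.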