RLTM: An Efficient Neural IR Framework for Long Documents
Authors: Chen Zheng, Yu Sun, Shengxian Wan, Dianhai Yu
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments based on two datasets, a human-labeled dataset and a click-through dataset, and compare our framework with state-of-the-art IR models. Experimental results show that the RLTM framework not only achieves higher accuracy but also accomplishes lower computational cost compared to the baselines. |
| Researcher Affiliation | Industry | Chen Zheng, Yu Sun, Shengxian Wan, Dianhai Yu. Baidu Inc., Beijing, China. {zhengchen02, sunyu02, wanshengxian, yudianhai}@baidu.com |
| Pseudocode | Yes | Algorithm 1 Reinforced Long Text Matching (RLTM) |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for their methodology is open-source or publicly available. |
| Open Datasets | No | We conduct our experiments on two large-scale datasets, both collected from a Chinese search engine. The first, named the Human-Label dataset, is human-annotated. The second, named the Click-Through dataset, is sampled from the click-through search log. The paper does not provide public access information (links, repositories, or formal citations) for these datasets. |
| Dataset Splits | Yes | Queries for training: 81,922; validation: 6,228; test: 7,312 |
| Hardware Specification | No | The paper does not specify the hardware used for running the experiments. It only mentions 'TensorFlow' as the implementation framework. |
| Software Dependencies | No | We implemented all the models using TensorFlow. We used the stochastic gradient descent method Adam [Kingma and Ba, 2014] as our optimizer for training. The paper mentions TensorFlow and Adam but does not provide specific version numbers for any software or libraries. |
| Experiment Setup | Yes | We set the batch size to 32 and selected the learning rate from [1e-1, 1e-2, 1e-3, 1e-4, 1e-5]. For the reinforced sentence selection model, the fully connected hidden size is 128, and we choose the number of selected sentences from {1, 3, 5}. For MatchPyramid, we set the query window size to 2 and the sentence window size to 4, and the kernel size is 128. For K-NRM, we set the number of bins to 11. |
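
Taken together, the last two rows pin down the reported training configuration fairly precisely. The sketch below is a minimal, hypothetical reconstruction of that hyperparameter sweep in TensorFlow/Keras: the values (batch size 32, the learning-rate grid, the 128-unit fully connected selector, the candidate sentence counts, the MatchPyramid windows, and the 11 K-NRM bins) are taken from the paper, while the model body and all function names are stand-ins, since no official code is released.

```python
# Hypothetical reconstruction of the reported experiment setup.
# Only the hyperparameter values come from the paper; the model
# body below is a placeholder, not the authors' RLTM architecture.
import itertools
import tensorflow as tf

BATCH_SIZE = 32
LEARNING_RATES = [1e-1, 1e-2, 1e-3, 1e-4, 1e-5]
SELECTOR_HIDDEN = 128                # FC hidden size of the sentence selector
NUM_SENTENCES = [1, 3, 5]            # candidate numbers of selected sentences
MP_QUERY_WINDOW, MP_SENT_WINDOW, MP_KERNELS = 2, 4, 128  # MatchPyramid
KNRM_BINS = 11                       # number of bins for K-NRM

def build_placeholder_model(hidden=SELECTOR_HIDDEN):
    """Stand-in for RLTM: a 128-unit FC layer plus a relevance scorer."""
    return tf.keras.Sequential([
        tf.keras.layers.Dense(hidden, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

# Sweep the learning rate and the number of selected sentences, training
# each trial with Adam as the paper reports.
for lr, k in itertools.product(LEARNING_RATES, NUM_SENTENCES):
    model = build_placeholder_model()
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                  loss="binary_crossentropy")
    # model.fit(train_x, train_y, batch_size=BATCH_SIZE, ...)  # one trial
```

Because the paper pins neither a TensorFlow version nor the hardware used, an actual reproduction would still have to fix both independently of the settings above.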