TableRAG: Million-Token Table Understanding with Language Models
Authors: Si-An Chen, Lesly Miculicich, Julian Eisenschlos, Zifeng Wang, Zilong Wang, Yanfei Chen, Yasuhisa Fujii, Hsuan-Tien Lin, Chen-Yu Lee, Tomas Pfister
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Empirical Studies |
| Researcher Affiliation | Collaboration | National Taiwan University, Google Cloud AI Research, Google DeepMind, UC San Diego |
| Pseudocode | Yes | The pseudocode and an answering example on ArcadeQA can be found in Alg. 1 and Fig. 8, respectively. Algorithm 1: TableRAG Algorithm |
| Open Source Code | Yes | The implementation and dataset will be available at https://github.com/google-research/google-research/tree/master/table_rag. |
| Open Datasets | Yes | We build two new million-token benchmarks sourced from the real-world Arcade [26] and BIRD-SQL [7] datasets. Additionally, to assess performance across various scales, we generated synthetic data expanding tables from the TabFact dataset to larger sizes, while maintaining consistent questions and key table content for evaluation. |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits with percentages or counts in the main text. It mentions 'evaluation' and 'test' sets but no specific 'validation' split. |
| Hardware Specification | No | Our experiments employ GPT-3.5-turbo [1], Gemini-1.0-Pro [19], and Mistral-Nemo-Instruct-2407 as LM solvers. In the ablation study, we use GPT-3.5-turbo if not specified. We use OpenAI's text-embedding-3-large as the encoder for dense retrieval. |
| Software Dependencies | No | Our experiments employ GPT-3.5-turbo [1], Gemini-1.0-Pro [19], and Mistral-Nemo-Instruct-2407 as LM solvers. In the ablation study, we use GPT-3.5-turbo if not specified. We use OpenAI's text-embedding-3-large as the encoder for dense retrieval. For TableRAG, we set the cell encoding budget B = 10,000 and the retrieval limit K = 5. For RandRowSampling and RowColRetrieval, we increase the retrieval limit to K = 30. |
| Experiment Setup | Yes | Our experiments employ GPT-3.5-turbo [1], Gemini-1.0-Pro [19], and Mistral-Nemo-Instruct-2407 as LM solvers. In the ablation study, we use GPT-3.5-turbo if not specified. We use OpenAI's text-embedding-3-large as the encoder for dense retrieval. For TableRAG, we set the cell encoding budget B = 10,000 and the retrieval limit K = 5. For RandRowSampling and RowColRetrieval, we increase the retrieval limit to K = 30. Each experiment is conducted 10 times and evaluated by majority voting to ensure stability and consistency. The evaluation metric is exact-match accuracy if not specified. |
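The pseudocode row above points to Algorithm 1 in the paper. As a rough illustration of the retrieval flow it describes (schema retrieval plus cell retrieval under a cell encoding budget B and retrieval limit K), here is a minimal, self-contained Python sketch. The `embed` function is a toy stand-in for a real dense encoder such as text-embedding-3-large, all names are illustrative assumptions rather than the authors' implementation, and B is simplified to a cap on the number of distinct cells rather than a token budget.

```python
import math
from collections import Counter

def embed(text: str, dim: int = 64) -> list[float]:
    """Toy hash-based bag-of-words embedding; a real system would call a dense encoder."""
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def top_k(query: str, corpus: list[str], k: int) -> list[str]:
    """Return the k corpus strings most similar to the query by dot product."""
    q = embed(query)
    return sorted(corpus, key=lambda t: -sum(a * b for a, b in zip(q, embed(t))))[:k]

def table_rag_context(table: dict[str, list], query: str, B: int = 10_000, K: int = 5) -> str:
    # Schema retrieval: pick the K column names most relevant to the query.
    columns = top_k(query, list(table), K)
    # Cell retrieval: build distinct (column, value) entries, keep at most B of
    # the most frequent ones (a simplification of the paper's token budget),
    # then retrieve the K entries most relevant to the query.
    freq = Counter((col, str(val)) for col, vals in table.items() for val in vals)
    cells = [f"{col}: {val}" for (col, val), _ in freq.most_common(B)]
    return "\n".join(["Columns: " + ", ".join(columns)] + top_k(query, cells, K))

# Tiny usage example on a toy table.
table = {"country": ["US", "FR", "US"], "year": [2020, 2021, 2021], "sales": [10, 7, 12]}
print(table_rag_context(table, "total sales in the US", K=2))
```

The retrieved column names and cell entries would then be placed in the solver's prompt in place of the full table, which is how the method keeps million-token tables within the LM's context budget.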
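The experiment-setup row quotes a protocol of 10 runs per question, majority voting, and exact-match accuracy. A hedged sketch of that evaluation loop could look like the following, where `ask_model` is a hypothetical stand-in for the actual LM call and the answer normalization is an assumption, not something the paper specifies.

```python
from collections import Counter

def majority_vote(answers: list[str]) -> str:
    """Most frequent answer after light normalization (normalization is an assumption)."""
    normalized = [a.strip().lower() for a in answers]
    return Counter(normalized).most_common(1)[0][0]

def exact_match_accuracy(questions, gold, ask_model, runs: int = 10) -> float:
    """Query the solver `runs` times per question, majority-vote, then score exact match."""
    hits = 0
    for question, answer in zip(questions, gold):
        voted = majority_vote([ask_model(question) for _ in range(runs)])
        hits += voted == answer.strip().lower()
    return hits / len(questions)

# Usage with a deterministic stand-in model.
print(exact_match_accuracy(["2+2?"], ["4"], ask_model=lambda q: "4"))  # 1.0
```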