CABINET: Content Relevance-based Noise Reduction for Table Question Answering

Authors: Sohan Patnaik, Heril Changwal, Milan Aggarwal, Sumit Bhatia, Yaman Kumar, Balaji Krishnamurthy

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | CABINET significantly outperforms various tabular LLM baselines, as well as GPT3-based in-context learning methods, is more robust to noise, maintains outperformance on tables of varying sizes, and establishes new SoTA performance on the WikiTQ, FeTaQA, and WikiSQL datasets.
Researcher Affiliation | Collaboration | MDSR Lab, Adobe; IIT Kharagpur; IIT Roorkee. {sohanpatnaik106, changwalheril}@gmail.com; {milaggar, sumbhati, ykumar, kbalaji}@adobe.com
Pseudocode | No | The paper does not contain any sections or figures explicitly labeled 'Pseudocode' or 'Algorithm', nor does it present structured steps in a code-like format.
Open Source Code | Yes | We release our code and datasets here.
Open Datasets | Yes | We evaluate CABINET on three commonly used datasets: (i) WikiTableQuestion (WikiTQ) (Pasupat & Liang, 2015); (ii) FeTaQA (Nan et al., 2022); and (iii) WikiSQL (Zhong et al., 2017).
Dataset Splits | Yes | Table 9 (Dataset Statistics): WikiTQ - 11,321 train / 2,831 validation / 4,344 test; WikiSQL - 56,355 train / 8,421 validation / 15,878 test; FeTaQA - 7,326 train / 1,001 validation / 2,003 test.
Hardware Specification | Yes | We train CABINET and baselines (wherever needed) for 30 epochs on an effective batch size (BS) of 128 using 8 80GB A100 GPUs (BS of 8/GPU with gradient accumulation 2) using a learning rate of 1e-5 with cosine annealing (Loshchilov & Hutter, 2017) through the AdamW optimizer (Loshchilov & Hutter, 2019).
Software Dependencies | No | The paper mentions software such as BART-Large, Flan-T5-xl, and the AdamW optimizer, but does not provide specific version numbers for these dependencies, only citing their original papers.
Experiment Setup | Yes | We train CABINET and baselines (wherever needed) for 30 epochs on an effective batch size (BS) of 128 using 8 80GB A100 GPUs (BS of 8/GPU with gradient accumulation 2) using a learning rate of 1e-5 with cosine annealing (Loshchilov & Hutter, 2017) through the AdamW optimizer (Loshchilov & Hutter, 2019). We carry out hyper-parameter tuning based on the validation set to come up with the optimal values of learning rate (1e-5), scheduler (cosine annealing), batch size (8), gradient accumulation steps (2), and optimizer (AdamW).
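For reference, the reported optimization settings (30 epochs; per-GPU batch size 8 with gradient accumulation 2 across 8 A100s, i.e., an effective batch size of 8 x 2 x 8 = 128; learning rate 1e-5 with cosine annealing; AdamW) can be wired together as in the minimal PyTorch sketch below. This is not the authors' released training script: the Flan-T5-xl stand-in backbone, the toy (question, table, answer) data, and the single-process loop are illustrative assumptions, and CABINET's relevance-scoring module is omitted.

# Minimal sketch of the reported training configuration (not the official CABINET script).
# Assumptions: Flan-T5-xl stand-in backbone, toy data, single process (no 8-GPU data
# parallelism), and no CABINET relevance scorer.
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR
from torch.utils.data import DataLoader
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

NUM_EPOCHS = 30
PER_GPU_BS = 8      # batch size of 8 per GPU (reported)
GRAD_ACCUM = 2      # gradient accumulation steps (reported)
NUM_GPUS = 8        # 8 x 80GB A100 in the paper; effective BS = 8 * 2 * 8 = 128

# Swap in a smaller checkpoint (e.g., "google/flan-t5-small") for a quick local test.
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xl")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-xl")

def collate(examples):
    # Hypothetical format: (question, linearized table, answer) triples.
    batch = tokenizer([q + " " + t for q, t, _ in examples],
                      padding=True, truncation=True, return_tensors="pt")
    labels = tokenizer([a for _, _, a in examples],
                       padding=True, truncation=True, return_tensors="pt").input_ids
    labels[labels == tokenizer.pad_token_id] = -100   # ignore padding in the loss
    batch["labels"] = labels
    return batch

# Toy data standing in for WikiTQ / WikiSQL / FeTaQA examples.
train_data = [("which country hosted the 2004 games?",
               "year | host ; 2000 | Australia ; 2004 | Greece",
               "Greece")] * (PER_GPU_BS * GRAD_ACCUM)
loader = DataLoader(train_data, batch_size=PER_GPU_BS, shuffle=True, collate_fn=collate)

optimizer = AdamW(model.parameters(), lr=1e-5)                  # AdamW, lr = 1e-5 (reported)
scheduler = CosineAnnealingLR(optimizer,                        # cosine annealing (reported)
                              T_max=NUM_EPOCHS * len(loader) // GRAD_ACCUM)

model.train()
for epoch in range(NUM_EPOCHS):
    for step, inputs in enumerate(loader):
        loss = model(**inputs).loss / GRAD_ACCUM   # scale loss for accumulation
        loss.backward()
        if (step + 1) % GRAD_ACCUM == 0:
            optimizer.step()
            scheduler.step()
            optimizer.zero_grad()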