CABINET: Content Relevance-based Noise Reduction for Table Question Answering

Authors: Sohan Patnaik, Heril Changwal, Milan Aggarwal, Sumit Bhatia, Yaman Kumar, Balaji Krishnamurthy

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | CABINET significantly outperforms various tabular LLM baselines, as well as GPT3-based in-context learning methods, is more robust to noise, maintains outperformance on tables of varying sizes, and establishes new SoTA performance on the WikiTQ, FeTaQA, and WikiSQL datasets.
Researcher Affiliation | Collaboration | MDSR Lab, Adobe; IIT Kharagpur; IIT Roorkee. {sohanpatnaik106, changwalheril}@gmail.com; {milaggar, sumbhati, ykumar, kbalaji}@adobe.com
Pseudocode | No | The paper does not contain any sections or figures explicitly labeled 'Pseudocode' or 'Algorithm', nor does it present structured steps in a code-like format.
Open Source Code | Yes | We release our code and datasets here.
Open Datasets | Yes | We evaluate CABINET on three commonly used datasets: (i) WikiTableQuestion (WikiTQ) (Pasupat & Liang, 2015); (ii) FeTaQA (Nan et al., 2022); and (iii) WikiSQL (Zhong et al., 2017).
Dataset Splits | Yes | Table 9 (Dataset Statistics): WikiTQ - 11,321 train / 2,831 validation / 4,344 test; WikiSQL - 56,355 train / 8,421 validation / 15,878 test; FeTaQA - 7,326 train / 1,001 validation / 2,003 test.
Hardware Specification | Yes | We train CABINET and baselines (wherever needed) for 30 epochs on an effective batch size (BS) of 128 using 8 80GB A100 GPUs (BS of 8/GPU with gradient accumulation 2) using a learning rate of 1e-5 with cosine annealing (Loshchilov & Hutter, 2017) through the AdamW optimizer (Loshchilov & Hutter, 2019).
Software Dependencies | No | The paper mentions software such as BART-Large, Flan-T5-xl, and the AdamW optimizer, but does not provide specific version numbers for these dependencies, only citing their original papers.
Experiment Setup | Yes | We train CABINET and baselines (wherever needed) for 30 epochs on an effective batch size (BS) of 128 using 8 80GB A100 GPUs (BS of 8/GPU with gradient accumulation 2) using a learning rate of 1e-5 with cosine annealing (Loshchilov & Hutter, 2017) through the AdamW optimizer (Loshchilov & Hutter, 2019). We carry out hyper-parameter tuning based on the validation set to come up with the optimal values of learning rate (1e-5), scheduler (cosine annealing), batch size (8), gradient accumulation steps (2), and optimizer (AdamW).
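For reference, the reported optimization settings (30 epochs; per-GPU batch size 8 with gradient accumulation 2 across 8 A100s, i.e., an effective batch size of 8 x 2 x 8 = 128; learning rate 1e-5 with cosine annealing; AdamW) can be wired together as in the minimal PyTorch sketch below. This is not the authors' released training script: the Flan-T5-xl stand-in backbone, the toy (question, table, answer) data, and the single-process loop are illustrative assumptions, and CABINET's relevance-scoring module is omitted.

# Minimal sketch of the reported training configuration (not the official CABINET script).
# Assumptions: Flan-T5-xl stand-in backbone, toy data, single process (no 8-GPU data
# parallelism), and no CABINET relevance scorer.
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR
from torch.utils.data import DataLoader
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

NUM_EPOCHS = 30
PER_GPU_BS = 8      # batch size of 8 per GPU (reported)
GRAD_ACCUM = 2      # gradient accumulation steps (reported)
NUM_GPUS = 8        # 8 x 80GB A100 in the paper; effective BS = 8 * 2 * 8 = 128

# Swap in a smaller checkpoint (e.g., "google/flan-t5-small") for a quick local test.
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xl")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-xl")

def collate(examples):
    # Hypothetical format: (question, linearized table, answer) triples.
    batch = tokenizer([q + " " + t for q, t, _ in examples],
                      padding=True, truncation=True, return_tensors="pt")
    labels = tokenizer([a for _, _, a in examples],
                       padding=True, truncation=True, return_tensors="pt").input_ids
    labels[labels == tokenizer.pad_token_id] = -100   # ignore padding in the loss
    batch["labels"] = labels
    return batch

# Toy data standing in for WikiTQ / WikiSQL / FeTaQA examples.
train_data = [("which country hosted the 2004 games?",
               "year | host ; 2000 | Australia ; 2004 | Greece",
               "Greece")] * (PER_GPU_BS * GRAD_ACCUM)
loader = DataLoader(train_data, batch_size=PER_GPU_BS, shuffle=True, collate_fn=collate)

optimizer = AdamW(model.parameters(), lr=1e-5)                  # AdamW, lr = 1e-5 (reported)
scheduler = CosineAnnealingLR(optimizer,                        # cosine annealing (reported)
                              T_max=NUM_EPOCHS * len(loader) // GRAD_ACCUM)

model.train()
for epoch in range(NUM_EPOCHS):
    for step, inputs in enumerate(loader):
        loss = model(**inputs).loss / GRAD_ACCUM   # scale loss for accumulation
        loss.backward()
        if (step + 1) % GRAD_ACCUM == 0:
            optimizer.step()
            scheduler.step()
            optimizer.zero_grad()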