Rissanen Data Analysis: Examining Dataset Characteristics via Description Length
Authors: Ethan Perez, Douwe Kiela, Kyunghyun Cho
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We introduce a method to determine if a certain capability helps to achieve an accurate model of given data... we showcase its applicability on a wide variety of settings in NLP, ranging from evaluating the utility of generating subquestions before answering a question, to analyzing the value of rationales and explanations, to investigating the importance of different parts of speech, and uncovering dataset gender bias. |
| Researcher Affiliation | Collaboration | ¹New York University, ²Facebook AI Research, ³CIFAR Fellow in Learning in Machines & Brains. |
| Pseudocode | No | No pseudocode or clearly labeled algorithm block found in the paper. |
| Open Source Code | Yes | Code at https://github.com/ethanjperez/rda along with a script to conduct RDA on your own dataset. |
| Open Datasets | Yes | To this end, we use CLEVR (Johnson et al., 2017), an image-based question-answering (QA) dataset. |
| Dataset Splits | Yes | To train a model on the first s blocks, we split the available examples into train (90%) and dev (10%) sets, choosing hyperparameters and early stopping epoch using dev loss (codelength). |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments are explicitly mentioned in the paper. |
| Software Dependencies | No | The paper mentions software like Hugging Face Transformers, PyTorch Lightning, and FastText, but does not provide specific version numbers for these or other key software components. |
| Experiment Setup | Yes | We use S = 9 blocks where t_0 = 0 and t_1 = 64 < ... < t_S = N such that t_{s+1}/t_s is constant (log-uniform spacing). To train a model on the first s blocks, we split the available examples into train (90%) and dev (10%) sets, choosing hyperparameters and early stopping epoch using dev loss (codelength). We otherwise follow each model's training strategy and hyperparameter ranges as suggested by its original paper. We then evaluate the codelength of the (s + 1)-th block. |
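
The Experiment Setup row above describes the prequential (online) coding scheme behind RDA: log-uniformly spaced block boundaries, a 90/10 train/dev split over the blocks seen so far, and codelength evaluated on the next block. Below is a minimal sketch of that scheme, not the authors' implementation (which is in the linked repo); `train_model` and `neg_log_likelihood` are hypothetical placeholders for the model-specific training loop and per-example loss, and the uniform code for the first block assumes a binary label space.

```python
import math
import random

def block_boundaries(n_examples, n_blocks=9, t1=64):
    """Log-uniform boundaries t_0 = 0, t_1 = t1, ..., t_S = N with constant t_{s+1}/t_s."""
    ratio = (n_examples / t1) ** (1.0 / (n_blocks - 1))
    ts = [0] + [int(round(t1 * ratio ** s)) for s in range(n_blocks)]
    ts[-1] = n_examples  # ensure the final boundary lands exactly on N
    return ts

def prequential_codelength(examples, train_model, neg_log_likelihood,
                           n_blocks=9, t1=64):
    """Online codelength in nats, summed block by block.

    train_model(train, dev) -> model and neg_log_likelihood(model, example) -> float
    are placeholders for the model-specific pieces."""
    ts = block_boundaries(len(examples), n_blocks, t1)
    # First block: encode with a uniform code (log 2 nats per example, assuming binary labels).
    codelength = ts[1] * math.log(2)
    for s in range(1, len(ts) - 1):
        seen = examples[: ts[s]]
        random.shuffle(seen)
        cut = int(0.9 * len(seen))                # 90% train / 10% dev, as in the paper
        model = train_model(seen[:cut], seen[cut:])
        next_block = examples[ts[s]: ts[s + 1]]   # evaluate codelength of the (s+1)-th block
        codelength += sum(neg_log_likelihood(model, ex) for ex in next_block)
    return codelength
```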