A Theory of Unsupervised Translation Motivated by Understanding Animal Communication
Authors: Shafi Goldwasser, David Gruber, Adam Tauman Kalai, Orr Paradise
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We exemplify this theory with two stylized models of language, for which our framework provides bounds on necessary sample complexity; the bounds are formally proven and experimentally verified on synthetic data. |
| Researcher Affiliation | Collaboration | Shafi Goldwasser UC Berkeley & Project CETI shafi.goldwasser@berkeley.edu; David F. Gruber Project CETI david@projectceti.org; Adam Tauman Kalai Microsoft Research & Project CETI adam@kal.ai; Orr Paradise UC Berkeley & Project CETI orrp@eecs.berkeley.edu |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. It provides formal mathematical definitions and proofs. |
| Open Source Code | Yes | Code can be found at https://github.com/orrp/theory-of-umt. |
| Open Datasets | No | The paper states, "We validate our theorems generating synthetic data from randomly-generated languages according to each model". While the generation process is described and the code is available, the data itself is synthetic and not provided as a pre-existing public dataset with specific access information (URL, DOI, citation). |
| Dataset Splits | Yes | Number of validation data: 1000 (Figure 8). |
| Hardware Specification | Yes | The experiments were run in parallel on an AWS r6i.4xlarge. |
| Software Dependencies | No | The paper mentions using the GPT-3 API for certain examples, but it does not specify version numbers for any software, libraries, or dependencies used to run the experiments. |
| Experiment Setup | Yes | Figure 7: Parameters for experiments in the knowledge graph model (Figure 4). ... Figure 8: Parameters for experiments in the common nonsense model (Figure 5). (These figures list specific values for various parameters defining the experimental setup, such as number of nodes, edge density, agreement parameter, and number of samples.) |