Learning Partitions from Context
Author: Simon Buchholz
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we study the problem of learning the structure of a discrete set of N tokens based on their interactions with other tokens. We focus on a setting where the tokens can be partitioned into a small number of classes, and there exists a real-valued function f defined on certain sets of tokens. This function, which captures the interactions between tokens, depends only on the class memberships of its arguments. The goal is to recover the class memberships of all tokens from a finite number of samples of f. We begin by analyzing this problem from both complexity-theoretic and information-theoretic viewpoints. We prove that it is NP-complete in general, and for random instances, we show that on the order of N ln(N) samples, implying very sparse interactions, suffice to identify the partition. We then investigate the conditions under which gradient flow dynamics of token embeddings can reveal the class structure, finding that this is achievable in certain settings when given on the order of N^2 ln^2(N) samples. |
| Researcher Affiliation | Academia | Simon Buchholz, Department for Empirical Inference, Max Planck Institute for Intelligent Systems, Tübingen AI Center, Tübingen, Germany; sbuchholz@tue.mpg.de |
| Pseudocode | No | The paper discusses how the problem can be reformulated as a constraint satisfaction problem and mentions using SAT solvers, stating: 'We outline the construction in more detail in Appendix C.' It does not provide explicit pseudocode or an algorithm block. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | No | The paper introduces a theoretical framework and uses abstract 'samples' of a function 'f'. It does not refer to a publicly available or open dataset used for empirical training. |
| Dataset Splits | No | The paper is theoretical and describes simulations using model parameters, not empirical experiments with specified training, validation, and test splits. |
| Hardware Specification | No | The paper is theoretical and presents simulations to illustrate concepts, but it does not specify any particular hardware used for these simulations or analyses. |
| Software Dependencies | No | The paper is theoretical and focuses on mathematical analysis and theoretical properties. It does not mention any specific software dependencies with version numbers. |
| Experiment Setup | Yes | Figure 2: Simulation of the setting in Theorem 5 with N = 1000, K = I = 3, D = 2, λ = 0, S = 100,000. (left) trajectories of 50 randomly sampled tokens from 6 different classes. (right) Average distance of token embeddings within a class for different classes (colored) and average distance between all pairs of embeddings (black). |
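
To make the reported simulation setup concrete, here is a minimal, hypothetical sketch of the kind of experiment Figure 2 describes: tokens with a planted partition, interaction values that depend only on class membership, and gradient descent on low-dimensional token embeddings. The paper's code is not released, so none of this is the author's implementation; the restriction to pairwise interactions, the inner-product score, the squared loss, the per-token gradient normalization, and the reduced N and S are all assumptions made for a quick, self-contained run.

```python
import numpy as np

rng = np.random.default_rng(0)

# Problem size, deliberately smaller than Figure 2's N = 1000, S = 100,000.
N, K, D, S = 200, 3, 2, 5000
classes = rng.integers(0, K, size=N)    # planted class of each token (hidden from the learner)
F = rng.normal(size=(K, K))             # class-level interaction values
F = (F + F.T) / 2                       # symmetric interactions

# S random pairwise samples (i, j, f(i, j)); f depends only on the classes of i and j.
i_idx = rng.integers(0, N, size=S)
j_idx = rng.integers(0, N, size=S)
y = F[classes[i_idx], classes[j_idx]]

# Gradient descent on embeddings E in R^{N x D} with a squared loss, using the
# inner product <E_i, E_j> as a stand-in model for f (an assumption).
E = 0.1 * rng.normal(size=(N, D))
counts = np.bincount(i_idx, minlength=N) + np.bincount(j_idx, minlength=N)
counts = np.maximum(counts, 1)[:, None]  # per-token sample counts, guarding against zero
lr = 0.1
for _ in range(2000):
    pred = np.sum(E[i_idx] * E[j_idx], axis=1)
    resid = pred - y
    grad = np.zeros_like(E)
    np.add.at(grad, i_idx, resid[:, None] * E[j_idx])
    np.add.at(grad, j_idx, resid[:, None] * E[i_idx])
    E -= lr * grad / counts              # average the gradient over each token's samples

# Diagnostic in the spirit of Figure 2 (right panel): average within-class
# embedding distance versus the average distance over all pairs of tokens.
dists = np.linalg.norm(E[:, None, :] - E[None, :, :], axis=-1)
within = np.mean([dists[np.ix_(classes == k, classes == k)].mean() for k in range(K)])
print(f"avg within-class distance {within:.3f} vs overall {dists.mean():.3f}")
```

If the planted partition is recovered, the printed within-class distance should come out noticeably smaller than the overall average distance, which is the qualitative behavior plotted in the right panel of Figure 2.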