Clustering in Causal Attention Masking
Authors: Nikita Karagodin, Yury Polyanskiy, Philippe Rigollet
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This work is a combination of rigorous mathematical results and non-trivial predictions based on analytical insights and numerical simulations. |
| Researcher Affiliation | Academia | Nikita Karagodin (Laboratory for Information and Decision Systems, MIT, Cambridge, MA, USA); Yury Polyanskiy (Laboratory for Information and Decision Systems, MIT, Cambridge, MA, USA); Philippe Rigollet (Department of Mathematics, MIT, Cambridge, MA, USA) |
| Pseudocode | No | The paper presents mathematical equations and derivations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The NeurIPS Paper Checklist states for Question 5: "The answer NA means that paper does not include experiments requiring code." This indicates that the paper does not provide open-source code for its methodology. |
| Open Datasets | No | The paper describes theoretical models and numerical simulations of particle dynamics (e.g., "n = 32 particles initialized uniformly at random on the sphere") but does not utilize a specific, named dataset or provide access information for any dataset used for training or evaluation. |
| Dataset Splits | No | The paper describes theoretical models and numerical simulations but does not mention the use of validation splits. There is no indication of empirical evaluation on a dataset with standard splits. |
| Hardware Specification | No | The paper describes numerical simulations but does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used to run these simulations. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, or specialized solvers) used for its numerical simulations or derivations. |
| Experiment Setup | Yes | In all cases we take simple Query and Key matrices K = Q = Id, temperature β = 9 and final time T = 5000 for n = 32 particles initialized uniformly at random on the sphere. Positions of particles at time T are indicated by a red dot. ... Evolution of the system (CSA) with K = Q = V = I_2 with n = 200, d = 2, β = 64, strong Rényi centers (red) and Rényi centers (black) with δ = 4β^{-1/2}. |
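
Since the paper releases no code, the simulation described in the Experiment Setup row can only be approximated. The following is a minimal sketch, not the authors' implementation. It assumes the causally masked self-attention dynamics on the unit sphere, in which particle k attends only to particles j ≤ k and its velocity is projected onto the tangent space at its position. The quoted setup fixes K = Q = Id, β = 9, T = 5000 and n = 32; the choice V = Id, the dimension d = 3, the explicit Euler scheme with renormalization, the step size `dt`, the seed, and all function names are assumptions made here for illustration.

```python
import numpy as np


def sample_sphere(n, d, rng):
    """Draw n points uniformly at random on the unit sphere in R^d."""
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def causal_attention_velocity(x, beta, V=None):
    """Tangential velocity of each particle under causal attention.

    Assumes K = Q = Id, so the attention logits are beta * <x_k, x_j>.
    V defaults to the identity (an assumption, not stated for every case).
    """
    n, d = x.shape
    if V is None:
        V = np.eye(d)
    logits = beta * (x @ x.T)
    # Causal mask: particle k only sees particles j <= k.
    mask = np.tril(np.ones((n, n), dtype=bool))
    logits = np.where(mask, logits, -np.inf)
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    drift = weights @ (x @ V.T)
    # Remove the radial component so each particle stays on the sphere.
    radial = np.sum(drift * x, axis=1, keepdims=True)
    return drift - radial * x


def simulate(n=32, d=3, beta=9.0, T=5000.0, dt=0.1, seed=0):
    """Explicit Euler integration with renormalization after every step."""
    rng = np.random.default_rng(seed)
    x = sample_sphere(n, d, rng)
    for _ in range(int(round(T / dt))):
        x = x + dt * causal_attention_velocity(x, beta)
        x /= np.linalg.norm(x, axis=1, keepdims=True)
    return x
```

Calling `simulate()` returns the particle positions at time T under these assumptions. With V = Id the first particle sees only itself and therefore never moves, which gives a quick sanity check on the causal mask.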