Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space
Authors: Core Francisco Park, Maya Okawa, Andrew Lee, Ekdeep S Lubana, Hidenori Tanaka
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We propose analyzing a model's learning dynamics via a framework we call the concept space, where each axis represents an independent concept underlying the data-generating process. By characterizing learning dynamics in this space, we identify how the speed at which a concept is learned, and hence the order of concept learning, is controlled by properties of the data we term concept signal. Further, we observe moments of sudden turns in the direction of a model's learning dynamics in concept space. Surprisingly, these points precisely correspond to the emergence of hidden capabilities, i.e., where latent interventions show the model possesses the capability to manipulate a concept, but these capabilities cannot yet be elicited via naive input prompting. While our results focus on synthetically defined toy datasets, we hypothesize a general claim on emergence of hidden capabilities may hold: generative models possess latent capabilities that emerge suddenly and consistently during training, though a model might not exhibit these capabilities under naive input prompting. |
| Researcher Affiliation | Collaboration | Core Francisco Park 1,2,3, Maya Okawa 1,3, Andrew Lee 4, Hidenori Tanaka 1,3, Ekdeep Singh Lubana 1,3. 1 CBS-NTT Program in Physics of Intelligence, Harvard University; 2 Department of Physics, Harvard University; 3 Physics & Informatics Laboratories, NTT Research, Inc.; 4 EECS Department, University of Michigan, Ann Arbor |
| Pseudocode | No | The paper describes methods and processes using mathematical equations and descriptive text but does not include any clearly labeled pseudocode blocks or algorithm listings. |
| Open Source Code | Yes | Our code is available at https://github.com/cfpark00/concept-learning. |
| Open Datasets | Yes | Specifically, we use the CelebA dataset [98], which contains fine-grained attributes corresponding to concepts like Gender, With Hat, and Smiling, and analyze two settings. ... [98] Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. Deep learning face attributes in the wild. In Proceedings of International Conference on Computer Vision (ICCV), December 2015. |
| Dataset Splits | Yes | The classifier achieved a final accuracy of 95% and 97%, respectively, on the held-out validation set, which was 10% of the entire dataset. |
| Hardware Specification | Yes | The diffusion model was trained on four NVIDIA A100 GPUs and 64 CPUs for the data-generating process. A standard model run (e.g., in Sec. 4.2) took 20 minutes on a single NVIDIA A100 40GB GPU. The CelebA runs took 24 hours on the same GPU. |
| Software Dependencies | No | The paper mentions software like 'PyTorch [124]', 'AdamW [122]', and 'Adam optimizer [123]', but it does not specify explicit version numbers for these software packages (e.g., 'PyTorch 1.9'). |
| Experiment Setup | Yes | We use the AdamW [122] optimizer with learning rate 0.001 and weight decay 0.01 to optimize the parameters of our network. We train our networks for 20K gradient steps. We use the default values for the decay rates: β1 = 0.9, β2 = 0.999. ... We conducted a hyperparameter search, testing batch sizes from 32 to 256, number of channels per layer from 64 to 512, learning rates between 10^-4 and 10^-3, the number of steps in the diffusion process from 100 to 400, weight decay between 3×10^-3 and 5×10^-2, and model weight initialization scale between N(0, 0.003) and N(0, 1). |
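The reported training hyperparameters can be collected into a single configuration sketch. This is a minimal illustration for reproduction purposes, not the authors' actual code; the class and field names below are assumptions, while the values are those quoted in the Experiment Setup row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as reported in the paper (hypothetical field names)."""
    optimizer: str = "AdamW"       # AdamW [122]
    learning_rate: float = 1e-3    # reported learning rate 0.001
    weight_decay: float = 0.01     # reported weight decay
    beta1: float = 0.9             # AdamW default decay rate
    beta2: float = 0.999           # AdamW default decay rate
    gradient_steps: int = 20_000   # "20K gradient steps"


config = TrainConfig()
```

In a PyTorch reproduction, these fields would map directly onto `torch.optim.AdamW(model.parameters(), lr=config.learning_rate, betas=(config.beta1, config.beta2), weight_decay=config.weight_decay)`.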