Overcoming The Spectral Bias of Neural Value Approximation

Authors: Ge Yang, Anurag Ajay, Pulkit Agrawal

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | With just a single line change, our approach, the Fourier feature networks (FFN), produces state-of-the-art performance on challenging continuous control domains with only a fraction of the compute. ... We scale the use of FFN to high-dimensional continuous control tasks from the DeepMind Control Suite (Tassa et al., 2020) using soft actor-critic (SAC, Haarnoja et al. 2018) as the base algorithm. ... We provide extensive empirical analysis on eight common DMC domains and additional results with DDPG in Appendix A.9.
Researcher Affiliation | Academia | NSF AI Institute for Artificial Intelligence and Fundamental Interactions (IAIFI); Computer Science and Artificial Intelligence Laboratory (CSAIL); Improbable AI Lab; Massachusetts Institute of Technology
Pseudocode | Yes | Algorithm: Learned Fourier Features (LFF), reconstructed from the paper's listing:

    import numpy as np
    import torch
    from torch import nn
    from torch.nn import init

    class LFF(nn.Linear):
        """Learned Fourier features: a trainable linear map followed by a sine."""

        def __init__(self, in_features, out_features, b_scale):
            super().__init__(in_features, out_features)
            # Weights drawn from N(0, (b_scale / in_features)^2), bias from U(-1, 1).
            init.normal_(self.weight, std=b_scale / in_features)
            init.uniform_(self.bias, -1.0, 1.0)

        def forward(self, x):
            # Computes sin(pi * (Wx + b)); the bias acts as a phase in [-pi, pi].
            x = np.pi * super().forward(x)
            return torch.sin(x)
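As a usage note, a minimal sketch of the advertised "single line change": the input layer of an otherwise standard MLP critic is swapped for LFF. The 40*d input width follows the Experiment Setup row below; the `make_ffn_critic` name, the `b_scale` value, and the output head are illustrative assumptions, not the authors' exact code.

    def make_ffn_critic(in_dim, b_scale=1.0):
        # b_scale is a placeholder value; the paper studies its effect separately.
        return nn.Sequential(
            LFF(in_dim, 40 * in_dim, b_scale),   # <- the single line change
            nn.Linear(40 * in_dim, 1024), nn.ReLU(),
            nn.Linear(1024, 1024), nn.ReLU(),
            nn.Linear(1024, 1),
        )

    q = make_ffn_critic(in_dim=24)               # e.g. a concatenated state-action input
    print(q(torch.randn(8, 24)).shape)           # torch.Size([8, 1])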
Open Source Code | Yes | Code and analysis available at https://geyang.github.io/ffn.
Open Datasets | Yes | We scale the use of FFN to high-dimensional continuous control tasks from the DeepMind Control Suite (Tassa et al., 2020) using soft actor-critic (SAC, Haarnoja et al. 2018) as the base algorithm. ... We use the implementation from the OpenAI Gym (Brockman et al., 2016), and discretize the state space into 150 bins.
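The 150-bin discretization is stated without further detail; below is a minimal sketch of one plausible reading, uniform binning of a bounded scalar state. Only the bin count comes from the quote; the `to_bin` helper and the bounds are assumptions.

    import numpy as np

    def to_bin(x, low, high, n_bins=150):
        # Map a continuous state x in [low, high] to a bin index in 0..n_bins-1.
        frac = (x - low) / (high - low)
        return int(np.clip(frac * n_bins, 0, n_bins - 1))

    print(to_bin(-0.3, low=-1.2, high=0.6))  # -> 75, with assumed bounds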
Dataset Splits | No | The paper does not give explicit training, validation, and test splits (e.g., percentages or sample counts). It describes data generation for a toy MDP and reinforcement-learning experiments on the DeepMind Control Suite and OpenAI Gym, where data is collected through environment interaction rather than drawn from static splits.
Hardware Specification | No | The paper thanks the "MIT SuperCloud and Lincoln Laboratory Supercomputing Center for providing high performance computing resources" but does not name specific hardware (GPU or CPU models, or memory specifications).
Software Dependencies | No | The paper mentions building on the PyTorch codebase from Yarats & Kostrikov (2020) and on DrQ-v2 (Yarats et al., 2021), but it gives no version numbers for PyTorch or any other library, which a reproducible description would require.
Experiment Setup | Yes | Optimization details: We use a 4-layer MLP with ReLU activation, with 400 latent neurons. We use Adam optimization with a learning rate of 1e-4, and optimize for 400 epochs. We use gradient descent with a batch size of 200. ... If d is the input dimension, both MLP and FFN have [40d, 1024, 1024] as hidden dimensions for each of their layers.
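For concreteness, a hedged sketch of how the quoted optimization settings combine in a supervised fit. The model reflects one reading of "4-layer MLP with ReLU activation, 400 latent neurons"; the input width and the regression data are placeholders, not the paper's toy-MDP targets.

    import torch
    from torch import nn

    model = nn.Sequential(                                # 4-layer ReLU MLP, width 400
        nn.Linear(2, 400), nn.ReLU(),
        nn.Linear(400, 400), nn.ReLU(),
        nn.Linear(400, 400), nn.ReLU(),
        nn.Linear(400, 1),
    )
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)   # Adam, learning rate 1e-4
    xs, ys = torch.randn(1000, 2), torch.randn(1000, 1)   # placeholder data

    for epoch in range(400):                              # 400 epochs
        for i in range(0, len(xs), 200):                  # batch size 200
            loss = nn.functional.mse_loss(model(xs[i:i+200]), ys[i:i+200])
            opt.zero_grad()
            loss.backward()
            opt.step()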