Functional Regularization for Reinforcement Learning via Learned Fourier Features

Authors: Alexander Li, Deepak Pathak

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on standard state-based and image-based RL benchmarks show clear benefits of our architecture over the baselines.
Researcher Affiliation | Academia | Alexander C. Li, Carnegie Mellon University (alexanderli@cmu.edu); Deepak Pathak, Carnegie Mellon University (dpathak@cs.cmu.edu)
Pseudocode | Yes | Algorithm 1: LFF PyTorch-like pseudocode.

    class LFF():
        def __init__(self, input_size, output_size, n_hidden=1,
                     hidden_dim=256, sigma=1.0, f_dim=256):
            # create B
            b_shape = (input_size, f_dim // 2)
            self.B = Parameter(normal(zeros(*b_shape), sigma * ones(*b_shape)))
            # create rest of network
            self.mlp = MLP(in_dims=f_dim + input_size, out_dims=output_size,
                           n_hidden=n_hidden, hidden_dim=hidden_dim)

        def forward(self, x):
            proj = (2 * np.pi) * matmul(x, self.B)
            ff = cat([sin(proj), cos(proj), x], dim=-1)
            return self.mlp.forward(ff)
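To make the feature mapping in Algorithm 1 concrete, here is a minimal NumPy sketch of just the learned-Fourier-feature transform (the trailing MLP head is omitted, and the dimensions below are illustrative, not the paper's settings):

```python
import numpy as np

def lff_features(x, B):
    """Fourier feature mapping from Algorithm 1: x -> [sin(2*pi*xB), cos(2*pi*xB), x]."""
    proj = 2 * np.pi * (x @ B)                 # (batch, f_dim // 2)
    return np.concatenate([np.sin(proj), np.cos(proj), x], axis=-1)

rng = np.random.default_rng(0)
input_size, f_dim, sigma = 4, 16, 1.0          # illustrative sizes
# B would be a trainable Parameter in the real model; here it is fixed
B = rng.normal(0.0, sigma, size=(input_size, f_dim // 2))
x = rng.normal(size=(3, input_size))
ff = lff_features(x, B)
print(ff.shape)                                # (3, 20): f_dim + input_size columns
```

The output width f_dim + input_size matches the `in_dims` passed to the MLP in the pseudocode, since sin and cos each contribute f_dim // 2 features and the raw input x is concatenated alongside them.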
Open Source Code | Yes | Code available at https://github.com/alexlioralexli/learned-fourier-features
Open Datasets | Yes | We use soft actor-critic (SAC), an entropy-regularized off-policy RL algorithm [14], to learn 8 environments from the DeepMind Control Suite [43].
Dataset Splits | No | The paper evaluates performance on reinforcement learning environments (DeepMind Control Suite) but does not specify fixed training, validation, and test splits in the traditional supervised-learning sense. Data is generated through interaction with the environment, and performance is evaluated over episodes during training.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory, or cloud instances) used to run the experiments.
Software Dependencies | No | The paper mentions "PyTorch-like pseudocode" in Algorithm 1, implying the use of PyTorch, but it does not specify version numbers for PyTorch or any other software dependencies used in the experiments.
Experiment Setup | Yes | Our LFF architecture uses our learnable Fourier feature input layer, followed by 2 hidden layers of 1024 units. We use Fourier dimension d_fourier of size 1024. We initialize the entries of our trainable Fourier basis with B_ij ~ N(0, σ²), with σ = 0.01 for all environments except Cheetah, Walker, and Hopper, where we use σ = 0.001. [...] The 1x1 conv weights are initialized from N(0, σ²) with σ = 0.1 for Hopper and Cheetah and σ = 0.01 for Finger and Quadruped.
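The initialization scheme quoted above can be sketched as follows. This is a hedged illustration of B_ij ~ N(0, σ²) with the paper's reported d_fourier = 1024 and σ = 0.01; the input size of 24 is an assumed placeholder for a DeepMind Control state dimension, not a value from the paper:

```python
import numpy as np

# B_ij ~ N(0, sigma^2); sigma = 0.01 for most environments
# (0.001 for Cheetah, Walker, and Hopper per the paper's setup).
rng = np.random.default_rng(0)
input_size = 24                      # assumed example state dimension
d_fourier = 1024                     # Fourier dimension reported in the paper
sigma = 0.01
# Half of d_fourier, since sin and cos of the projection are concatenated.
B = rng.normal(0.0, sigma, size=(input_size, d_fourier // 2))
print(B.shape, round(float(B.std()), 4))
```

With such a small σ, the initial projection x @ B stays near zero, so the sin/cos features start out nearly linear in x; training then adapts the basis B along with the rest of the network.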