Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Hadamax Encoding: Elevating Performance in Model-Free Atari

Authors: Jacob Eeuwe Kooi, Zhao Yang, Vincent Francois-Lavet

NeurIPS 2025

Reproducibility Variable Result LLM Response
Research Type Experimental This work introduces a novel encoder architecture for pixel-based model-free reinforcement learning. The Hadamax (Hadamard max-pooling) encoder achieves state-of-the-art performance by max-pooling Hadamard products between GELU-activated parallel hidden layers. Based on the recent PQN algorithm, the Hadamax encoder achieves state-of-the-art model-free performance in the Atari-57 benchmark. Specifically, without applying any algorithmic hyperparameter modifications, Hadamax-PQN achieves an 80% performance gain over vanilla PQN and significantly surpasses Rainbow-DQN. For reproducibility, the full code is available on GitHub.
Researcher Affiliation Academia Jacob E. Kooi EMAIL Department of Computer Science Vrije Universiteit Amsterdam. Zhao Yang EMAIL Department of Computer Science Vrije Universiteit Amsterdam. Vincent François-Lavet EMAIL Department of Computer Science Vrije Universiteit Amsterdam.
Pseudocode Yes B Hadamax Encoder Code. We provide the full JAX-based code of the Hadamax encoder for reproducibility purposes.

    # Input = input_obs, a frame-stacked Atari observation
    x = jnp.transpose(input_obs, (0, 2, 3, 1))
    x = x / 255.0
    # First block
    x1 = nn.Conv(32, kernel_size=(8, 8), strides=(1, 1), padding="SAME",
                 kernel_init=nn.initializers.xavier_normal())(x)
    x2 = nn.Conv(32, kernel_size=(8, 8), strides=(1, 1), padding="SAME",
                 kernel_init=nn.initializers.xavier_normal())(x)
    x1 = normalize(x1)  # Normalize before activation
    x2 = normalize(x2)  # Normalize before activation
    x1 = nn.gelu(x1)  # Apply activation
    x2 = nn.gelu(x2)  # Apply activation
    x = x1 * x2  # Hadamard product
    x = max_pool(x, window_shape=(4, 4), strides=(4, 4), padding="SAME")
    # Second block
    x1 = nn.Conv(64, kernel_size=(4, 4), strides=(1, 1), padding="SAME",
                 kernel_init=nn.initializers.xavier_normal())(x)
    x2 = nn.Conv(64, kernel_size=(4, 4), strides=(1, 1), padding="SAME",
                 kernel_init=nn.initializers.xavier_normal())(x)
    x1 = normalize(x1)  # Normalize before activation
    x2 = normalize(x2)  # Normalize before activation
    x1 = nn.gelu(x1)  # Apply activation
    x2 = nn.gelu(x2)  # Apply activation
    x = x1 * x2  # Hadamard product
    x = max_pool(x, window_shape=(2, 2), strides=(2, 2), padding="SAME")
    # Third block
    x1 = nn.Conv(64, kernel_size=(3, 3), strides=(1, 1), padding="SAME",
                 kernel_init=nn.initializers.xavier_normal())(x)
    x2 = nn.Conv(64, kernel_size=(3, 3), strides=(1, 1), padding="SAME",
                 kernel_init=nn.initializers.xavier_normal())(x)
    x1 = normalize(x1)  # Normalize before activation
    x2 = normalize(x2)  # Normalize before activation
    x1 = nn.gelu(x1)  # Apply activation
    x2 = nn.gelu(x2)  # Apply activation
    x = x1 * x2  # Hadamard product
    x = max_pool(x, window_shape=(3, 3), strides=(1, 1), padding="SAME")
    # Flatten for MLP layer
    x = x.reshape((x.shape[0], -1))
    x = nn.Dense(512, kernel_init=nn.initializers.he_normal())(x)
    x = normalize(x)
    x = nn.gelu(x)
    x = nn.Dense(self.action_dim, name="action_dense")(x)  # Final Q-Values
Open Source Code Yes For reproducibility, the full code is available on GitHub.
Open Datasets Yes Environments: The full 57-game Atari domain [6] is used as a standardized benchmark for evaluating our algorithm's performance. ... More details on each game can be found at https://ale.farama.org. ... Additional experiments have been conducted on the pixel-based VizDoom environment [30].
Dataset Splits Yes To manage computational load, ablations are done on 40M frames, while comparison with baselines is done at the official 200M frame scores.
Hardware Specification Yes We run all our experiments on a HPC cluster equipped with A100 GPUs.
Software Dependencies No Our work builds upon PQN [17], which leverages EnvPool and PureJaxRL, achieving greater efficiency compared to conventional PyTorch-based implementations. With the Hadamax encoder, we further architecturally improve PQN to the point that it significantly surpasses Rainbow-DQN, while remaining more than an order of magnitude faster. ... Since the whole PQN codebase is in Jax, we implement the Hadamax encoder for PQN in Jax as well. As implementations of Rainbow, C51 and DQN from cleanrl are in PyTorch, we also implement the Hadamax encoder for these agents in PyTorch.
Experiment Setup Yes C.1 Hyperparameters. Table 4: Atari Hyperparameters for PQN, PQN (ResNet-15) and Hadamax-PQN. These hyperparameters are equal to the original hyperparameters from the PQN baseline [17].
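
To make the structure of the encoder listing quoted in the Pseudocode row easier to follow, here is a minimal sketch of its three convolutional blocks written in plain JAX, without Flax. It is an illustration under stated assumptions, not the authors' implementation: the 84x84 input with 4 stacked frames is assumed (the excerpt does not state the observation size), the normalization step is left out, Gaussian kernels stand in for the Xavier-initialized weights, and the helper names `conv_same`, `max_pool` and `hadamax_block` are hypothetical. Kernel sizes, channel counts, pooling windows and strides are read off the listing.

```python
import jax
import jax.numpy as jnp


def conv_same(x, kernel):
    """Stride-1 convolution with SAME padding (NHWC input, HWIO kernel)."""
    return jax.lax.conv_general_dilated(
        x, kernel, window_strides=(1, 1), padding="SAME",
        dimension_numbers=("NHWC", "HWIO", "NHWC"))


def max_pool(x, window, stride):
    """Max-pooling over the two spatial axes with SAME padding."""
    return jax.lax.reduce_window(
        x, -jnp.inf, jax.lax.max,
        window_dimensions=(1, window, window, 1),
        window_strides=(1, stride, stride, 1),
        padding="SAME")


def hadamax_block(x, k1, k2, window, stride):
    """Two parallel conv branches -> GELU -> Hadamard product -> max-pool.

    The normalization step of the quoted listing is omitted in this sketch.
    """
    x1 = jax.nn.gelu(conv_same(x, k1))
    x2 = jax.nn.gelu(conv_same(x, k2))
    return max_pool(x1 * x2, window, stride)


# (kernel size, output channels, pool window, pool stride) per block,
# taken from the quoted listing.
BLOCKS = [(8, 32, 4, 4), (4, 64, 2, 2), (3, 64, 3, 1)]

key = jax.random.PRNGKey(0)
key, obs_key = jax.random.split(key)
# Assumption: one 84x84 observation with 4 stacked frames, already in NHWC
# layout and scaled to [0, 1].
x = jax.random.uniform(obs_key, (1, 84, 84, 4))

shapes = []
for ksize, out_ch, window, stride in BLOCKS:
    key, key1, key2 = jax.random.split(key, 3)
    in_ch = x.shape[-1]
    # Assumption: plain Gaussian kernels replace the Xavier initialization.
    k1 = 0.05 * jax.random.normal(key1, (ksize, ksize, in_ch, out_ch))
    k2 = 0.05 * jax.random.normal(key2, (ksize, ksize, in_ch, out_ch))
    x = hadamax_block(x, k1, k2, window, stride)
    shapes.append(x.shape)

flat = x.reshape((x.shape[0], -1))
print(shapes)      # [(1, 21, 21, 32), (1, 11, 11, 64), (1, 11, 11, 64)]
print(flat.shape)  # (1, 7744)
```

Because every convolution uses stride 1 with SAME padding, all downsampling happens in the pooling steps: 84 becomes 21 after the 4x4 pool, 21 becomes 11 after the 2x2 pool, and the final 3x3 pool with stride 1 keeps 11. Under the assumed input size, the dense layer therefore receives 11 x 11 x 64 = 7744 features.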