Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Input

Authors: Andi Peng, Yuying Sun, Tianmin Shu, David Abel

ICML 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on linear bandit settings in both vision- and language-based domains. Results support the efficiency of our approach in quickly converging to accurate rewards with fewer comparisons vs. example-only labels. Finally, we validate the real-world applicability with a behavioral experiment on a mushroom foraging task. |
| Researcher Affiliation | Collaboration | Massachusetts Institute of Technology, Boston University, Johns Hopkins University, Google DeepMind |
| Pseudocode | Yes | Algorithm 1: Pragmatic Feature Preference Augmentation |
| Open Source Code | Yes | Code available at github.com/andipeng/feature-preference |
| Open Datasets | Yes | The original dataset can be found at github.com/jlin816/rewards-from-language. |
| Dataset Splits | No | The paper does not provide specific percentages or counts for training, validation, or test splits. It mentions 'training reward models' but lacks the detailed split information required for reproducibility. |
| Hardware Specification | No | The paper does not specify any hardware (e.g., CPU or GPU models, memory) used to run the experiments or train the models. |
| Software Dependencies | No | The paper mentions implementing reward models as 'linear networks' and prompting 'GPT-4', but it does not list the software libraries, frameworks, or version numbers needed for replication. |
| Experiment Setup | Yes | We implement all reward models as linear networks (single layer, no activations). Each feature predictor in the joint model is trained independently without sharing parameters, and their resulting outputs are concatenated and fed through a final layer for reward prediction. We swept possible β values and found 0.5 consistently achieved the best performance. |
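The Experiment Setup row describes the reward-model architecture: independent single-layer linear feature predictors (no activations, no shared parameters) whose outputs are concatenated and passed through a final layer that predicts reward. The sketch below is one way to realize that description; PyTorch, the Bradley-Terry pairwise-comparison loss, and all names and dimensions here are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch of the joint reward model described in the Experiment Setup row.
# Assumptions (not from the paper): PyTorch, a Bradley-Terry pairwise loss,
# and the hypothetical names/dimensions used below.
import torch
import torch.nn as nn


class JointFeatureRewardModel(nn.Module):
    """Independent linear feature predictors; their outputs are concatenated
    and fed through a final linear layer for reward prediction."""

    def __init__(self, input_dim: int, num_features: int):
        super().__init__()
        # One single-layer linear predictor per feature, no activations,
        # no parameter sharing between predictors.
        self.feature_predictors = nn.ModuleList(
            [nn.Linear(input_dim, 1) for _ in range(num_features)]
        )
        # Final layer maps the concatenated feature outputs to a scalar reward.
        self.reward_head = nn.Linear(num_features, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = torch.cat([p(x) for p in self.feature_predictors], dim=-1)
        return self.reward_head(feats).squeeze(-1)


def pairwise_preference_loss(model, x_preferred, x_rejected):
    """Bradley-Terry style objective: the preferred item should score higher.
    (The exact training objective is not specified in the excerpt above.)"""
    return -torch.nn.functional.logsigmoid(
        model(x_preferred) - model(x_rejected)
    ).mean()


# Hypothetical usage with random tensors, only to show the training-loop shape.
model = JointFeatureRewardModel(input_dim=16, num_features=4)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x_a, x_b = torch.randn(32, 16), torch.randn(32, 16)  # x_a assumed preferred
loss = pairwise_preference_loss(model, x_a, x_b)
loss.backward()
optimizer.step()
```

Training each feature predictor independently, as the paper states, would simply mean fitting each `nn.Linear(input_dim, 1)` against its own feature labels before (or alongside) the reward head; the loop above only illustrates the final reward-prediction stage.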