Learning Embeddings for Sequential Tasks Using Population of Agents

Authors: Mridul Mahajan, Georgios Tzannetos, Goran Radanovic, Adish Singla

IJCAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present an information-theoretic framework to learn fixed-dimensional embeddings for tasks in reinforcement learning. ... In addition to qualitative assessment, we empirically demonstrate the effectiveness of our techniques based on task embeddings by quantitative comparisons against strong baselines on two application scenarios: predicting an agent's performance on a new task by observing its performance on a small quiz of tasks, and selecting tasks with desired characteristics from a given set of options. See Sections 5 (Experiments: Visualizing Embedding Spaces), 6 (Experiments: Comparison with Prior Work), and 7 (Experiments: Application Scenarios).
Researcher Affiliation | Academia | Mridul Mahajan, Georgios Tzannetos, Goran Radanovic and Adish Singla; Max Planck Institute for Software Systems; {mrmahaja, gtzannet, gradanovic, adishs}@mpi-sws.org
Pseudocode | Yes | Algorithm 1: Learn the Task Embedding Function (fϕ)
Open Source Code | Yes | GitHub repository: https://github.com/machine-teaching-group/ijcai2024-task-embeddings-rl
Open Datasets | Yes | We evaluate our framework on environments with diverse characteristics... CARTPOLEVAR ... is a variation of the classic control task from OpenAI gym [Brockman et al., 2016]... KAREL from [Bunel et al., 2018]... BASICKAREL [Tzannetos et al., 2023]
Dataset Splits | Yes | We create benchmarks for this scenario by generating datasets for quiz sizes ranging from 1 to 20, with 5000 examples for both training and testing. Performance prediction techniques are evaluated by partitioning each dataset into 10 folds.
Hardware Specification | No | The paper does not mention the specific hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies | No | The paper mentions environments and frameworks (e.g., OpenAI gym, and neural networks, which imply a framework such as PyTorch or TensorFlow), but it does not specify version numbers for any of the software dependencies needed to reproduce the experiments.
Experiment Setup | Yes | Finally, we parameterize the embedding function fϕ(.) with a neural network, optimizing its parameters as described in Algorithm 1. Algorithm 1: Learn the Task Embedding Function (fϕ) ... N, Hyperparameter λ, Number of iterations M
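
The Dataset Splits entry above describes one dataset per quiz size (1 to 20), each with 5000 examples and partitioned into 10 folds. The following is a minimal sketch of such a partition, not the authors' code: the quoted text does not say how the folds are formed, so the shuffling, the seeding, and the names make_folds and folds_by_quiz_size are assumptions made for illustration.

```python
import random


def make_folds(num_examples, num_folds=10, seed=0):
    """Shuffle example indices and split them into num_folds near-equal folds.

    Hypothetical helper: the shuffle-then-stride scheme is an assumption,
    not a procedure taken from the paper.
    """
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every num_folds-th shuffled index, starting at position i.
    return [indices[i::num_folds] for i in range(num_folds)]


# One index partition per quiz size; seeding by quiz size is an assumption.
folds_by_quiz_size = {
    quiz_size: make_folds(5000, num_folds=10, seed=quiz_size)
    for quiz_size in range(1, 21)
}
```

Each fold here holds 500 indices and every example lands in exactly one fold, so any fold can serve as the held-out part while the other nine are used for fitting.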