Learning Embeddings for Sequential Tasks Using Population of Agents
Authors: Mridul Mahajan, Georgios Tzannetos, Goran Radanovic, Adish Singla
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present an information-theoretic framework to learn fixed-dimensional embeddings for tasks in reinforcement learning. ... In addition to qualitative assessment, we empirically demonstrate the effectiveness of our techniques based on task embeddings by quantitative comparisons against strong baselines on two application scenarios: predicting an agent's performance on a new task by observing its performance on a small quiz of tasks, and selecting tasks with desired characteristics from a given set of options. 5 Experiments: Visualizing Embedding Spaces, 6 Experiments: Comparison with Prior Work, and 7 Experiments: Application Scenarios. |
| Researcher Affiliation | Academia | Mridul Mahajan, Georgios Tzannetos, Goran Radanovic and Adish Singla; Max Planck Institute for Software Systems; {mrmahaja, gtzannet, gradanovic, adishs}@mpi-sws.org |
| Pseudocode | Yes | Algorithm 1 Learn the Task Embedding Function (fϕ) |
| Open Source Code | Yes | GitHub repository: https://github.com/machine-teaching-group/ijcai2024-task-embeddings-rl. |
| Open Datasets | Yes | We evaluate our framework on environments with diverse characteristics... CARTPOLEVAR ... is a variation of the classic control task from OpenAI Gym [Brockman et al., 2016]... KAREL from [Bunel et al., 2018]... BASICKAREL [Tzannetos et al., 2023] |
| Dataset Splits | Yes | We create benchmarks for this scenario by generating datasets for quiz sizes ranging from 1 to 20, with 5000 examples for both training and testing. Performance prediction techniques are evaluated by partitioning each dataset into 10 folds. |
| Hardware Specification | No | No specific hardware (e.g., GPU models, CPU types, memory) used for running experiments was mentioned in the paper. |
| Software Dependencies | No | The paper mentions environments and frameworks (e.g., OpenAI Gym, neural networks implying PyTorch/TensorFlow usage), but it does not specify version numbers for any software dependencies required to reproduce the experiments. |
| Experiment Setup | Yes | Finally, we parameterize the embedding function fϕ(.) with a neural network, optimizing its parameters as described in Algorithm 1. Algorithm 1 Learn the Task Embedding Function (fϕ) ... N, Hyperparameter λ, Number of iterations M |
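
The Dataset Splits row mentions partitioning each dataset into 10 folds. As a rough illustration only, the sketch below splits the indices of a 5000-example dataset into 10 disjoint folds. The function name `partition_into_folds`, the shuffling step, and the seed are assumptions made for this example; the paper excerpt does not say how folds were formed or used, and this is not the authors' code.

```python
import random


def partition_into_folds(n_examples, n_folds, seed=0):
    """Shuffle indices 0..n_examples-1 and split them into n_folds disjoint folds.

    Hypothetical helper for illustration; shuffling and the seed are assumptions.
    """
    if not 1 <= n_folds <= n_examples:
        raise ValueError("n_folds must be between 1 and n_examples")
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    # Striding by n_folds keeps fold sizes within one example of each other.
    return [indices[i::n_folds] for i in range(n_folds)]


# 5000 examples and 10 folds are the figures quoted in the table above.
folds = partition_into_folds(n_examples=5000, n_folds=10)
fold_sizes = [len(fold) for fold in folds]
```

Each fold can then serve once as the held-out part while the remaining nine are combined, which is the usual cross-validation pattern; whether the paper follows exactly that protocol is not stated in the quoted text.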