Introspective Planning: Aligning Robots' Uncertainty with Inherent Task Ambiguity

Authors: Kaiqu Liang, Zixu Zhang, Jaime Fisac

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Evaluations on three tasks, including a newly introduced safe mobile manipulation benchmark, demonstrate that introspection substantially improves both compliance and safety over state-of-the-art LLM-based planning methods. |
| Researcher Affiliation | Academia | Kaiqu Liang, Zixu Zhang, Jaime Fernández Fisac; Princeton University; {kl2471,zixuz,jfisac}@princeton.edu |
| Pseudocode | Yes | Algorithm 1: Knowledge Base Construction; Algorithm 2: Introspective Conformal Prediction |
| Open Source Code | Yes | The webpage and code are accessible at https://introplan.github.io. |
| Open Datasets | Yes | Mobile Manipulation: The original calibration or training dataset comprises 400 examples, while the test set includes 200 examples. ... The original dataset follows the same distribution of different types of examples as in KnowNo [31]... |
| Dataset Splits | Yes | Mobile Manipulation: The original calibration or training dataset comprises 400 examples, while the test set includes 200 examples. ... All calibration processes used the same dataset with 400 instances. |
| Hardware Specification | Yes | All experiments were conducted on a MacBook Pro laptop with an Apple Silicon M2 Pro chip and 16GB memory. |
| Software Dependencies | Yes | We implemented all tasks using OpenAI's GPT-3.5 (text-davinci-003) and GPT-4 Turbo (gpt-4-1106-preview). We employed Sentence-BERT [30] to encode instructions... |
| Experiment Setup | Yes | We used the default temperature of 0 to sample the LLM's response. ... We employed Sentence-BERT [30] to encode instructions, retrieving the top m = 3 text embeddings based on cosine similarity. ... The knowledge base and the calibration set contain 400 tasks each. |
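The table above mentions Algorithm 2 (Introspective Conformal Prediction) and a 400-example calibration set. The paper's exact algorithm is not reproduced here; the following is a minimal sketch of *standard* split conformal calibration, the family of methods it builds on: nonconformity scores from a held-out calibration set yield a quantile threshold, and any candidate (here, a candidate plan) scoring at or below the threshold enters the prediction set. The score values and set size are illustrative assumptions, not the paper's data.

```python
import numpy as np

def conformal_quantile(cal_scores, alpha=0.1):
    """Split conformal threshold from calibration nonconformity scores.

    Uses the standard ceil((n + 1) * (1 - alpha)) / n quantile so that
    prediction sets built with this threshold achieve >= 1 - alpha
    marginal coverage on exchangeable test data.
    """
    n = len(cal_scores)
    q_level = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(cal_scores, min(q_level, 1.0), method="higher")

def prediction_set(option_scores, qhat):
    """Keep every option whose nonconformity score is <= the threshold."""
    return [i for i, s in enumerate(option_scores) if s <= qhat]

# Illustrative run with synthetic scores; n = 400 matches the paper's
# calibration-set size, but the score values themselves are made up.
rng = np.random.default_rng(0)
cal = rng.uniform(0.0, 1.0, size=400)  # nonconformity of the true option per task
qhat = conformal_quantile(cal, alpha=0.1)
print(prediction_set([0.05, 0.4, 0.95], qhat))
```

A prediction set containing more than one plan signals ambiguity the planner cannot resolve on its own, which is the point where a method like this would ask the human for clarification.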
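The setup rows also describe retrieving the top m = 3 text embeddings by cosine similarity after encoding instructions with Sentence-BERT. A minimal sketch of that retrieval step is below, using plain NumPy on precomputed vectors; the Sentence-BERT encoding call itself is omitted, and the toy one-hot corpus is purely an assumption for illustration.

```python
import numpy as np

def top_m_by_cosine(query, corpus, m=3):
    """Indices of the m corpus embeddings most similar to the query.

    query:  1-D embedding vector.
    corpus: 2-D array, one embedding per row.
    """
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    sims = c @ q                      # cosine similarity per corpus row
    return np.argsort(-sims)[:m].tolist()

# Toy corpus of 5 orthogonal "embeddings"; a query identical to row 2
# should be retrieved first.
hits = top_m_by_cosine(np.array([0.0, 0.0, 1.0, 0.0, 0.0]), np.eye(5), m=3)
print(hits)
```

In the pipeline the report describes, the retrieved neighbors would be the knowledge-base entries whose reasoning is prepended to the prompt for the new task.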