Introspective Planning: Aligning Robots' Uncertainty with Inherent Task Ambiguity
Authors: Kaiqu Liang, Zixu Zhang, Jaime Fernández Fisac
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluations on three tasks, including a newly introduced safe mobile manipulation benchmark, demonstrate that introspection substantially improves both compliance and safety over state-of-the-art LLM-based planning methods. |
| Researcher Affiliation | Academia | Kaiqu Liang, Zixu Zhang, Jaime Fernández Fisac, Princeton University, {kl2471,zixuz,jfisac}@princeton.edu |
| Pseudocode | Yes | Algorithm 1: Knowledge Base Construction; Algorithm 2: Introspective Conformal Prediction (see the conformal calibration sketch after the table) |
| Open Source Code | Yes | The webpage and code are accessible at https://introplan.github.io. |
| Open Datasets | Yes | Mobile Manipulation: The original calibration or training dataset comprises 400 examples, while the test set includes 200 examples. ... The original dataset follows the same distribution of different types of examples as in KnowNo [31]... |
| Dataset Splits | Yes | Mobile Manipulation: The original calibration or training dataset comprises 400 examples, while the test set includes 200 examples. ... All calibration processes used the same dataset with 400 instances. |
| Hardware Specification | Yes | All experiments were conducted on a MacBook Pro laptop with an Apple Silicon M2 Pro chip and 16GB memory. |
| Software Dependencies | Yes | We implemented all tasks using OpenAI's GPT-3.5 (text-davinci-003) and GPT-4 Turbo (gpt-4-1106-preview). We employed Sentence-BERT [30] to encode instructions... |
| Experiment Setup | Yes | We used the default temperature of 0 to sample the LLM's response. ... We employed Sentence-BERT [30] to encode instructions, retrieving the top m = 3 text embeddings based on cosine similarity. ... The knowledge base and the calibration set contain 400 tasks each. (See the retrieval sketch after the table.) |
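
The paper's Algorithm 2 ("Introspective Conformal Prediction") is not reproduced here; the sketch below shows only the generic split conformal calibration step such methods build on, sized to the 400-example calibration set quoted above. The `alpha` value and the placeholder nonconformity scores are illustrative assumptions, not values from the paper.

```python
# Generic split conformal calibration sketch (not the paper's Algorithm 2):
# calibrate a threshold on nonconformity scores from the 400-example
# calibration set, then admit every candidate option scoring at or below it.
import numpy as np

alpha = 0.1                       # target miscoverage rate (assumed value)
cal_scores = np.random.rand(400)  # placeholder nonconformity scores

n = len(cal_scores)
# Finite-sample-corrected quantile level from the split conformal guarantee.
q_level = np.ceil((n + 1) * (1 - alpha)) / n
threshold = np.quantile(cal_scores, q_level, method="higher")

def prediction_set(option_scores):
    """Indices of candidate plans admitted into the prediction set."""
    return [i for i, s in enumerate(option_scores) if s <= threshold]

print(prediction_set([0.05, 0.4, 0.92]))  # contents depend on the scores drawn
```

With this calibration, the prediction set contains the true option with probability at least 1 - alpha on exchangeable data, which is the coverage guarantee conformal planners rely on.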
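For the retrieval step quoted in the Experiment Setup row, a minimal sketch follows, assuming the sentence-transformers library; the encoder model name and the knowledge-base strings are illustrative assumptions, not taken from the released code at https://introplan.github.io.

```python
# Sketch of the described retrieval: encode instructions with Sentence-BERT
# and fetch the top m = 3 knowledge-base entries by cosine similarity.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder choice

# Illustrative knowledge-base entries (the paper's base holds 400 tasks).
knowledge_base = [
    "Bring the user a soda from the fridge.",
    "Place the fragile glass on the top shelf.",
    "Sort the recyclables into the blue bin.",
]
kb_embeddings = model.encode(knowledge_base, convert_to_tensor=True)

query_embedding = model.encode("Get me something to drink.",
                               convert_to_tensor=True)

# util.semantic_search ranks entries by cosine similarity; top_k matches m = 3.
hits = util.semantic_search(query_embedding, kb_embeddings, top_k=3)[0]
for hit in hits:
    print(knowledge_base[hit["corpus_id"]], round(hit["score"], 3))
```

The retrieved examples would then be prepended to the planning prompt, matching the paper's use of the top m = 3 embeddings as in-context retrieval.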