Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Collaborative Planning with Encoding of Users’ High-Level Strategies
Authors: Joseph Kim, Christopher Banks, Julie Shah
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through human subject experimentation, we empirically demonstrate that this approach results in statistically significant improvements to plan quality, without substantially increasing computation time. |
| Researcher Affiliation | Academia | Joseph Kim, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139; Christopher J. Banks, Norfolk State University, 700 Park Ave, Norfolk, VA 23504; Julie A. Shah, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139 |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using existing tools like VAL4, LPRPG-P, and OPTIC, but does not provide any statement or link for the open-source code of its own described methodology. |
| Open Datasets | Yes | We selected problems from two domains: Zenotravel and Satellite, both presented during the third IPC. We used the following benchmark problems for Zenotravel: pfile15 (propositional), pfile13 (numerical) and pfile14 (temporal). For Satellite, we used pfile12 (p), pfile5 (n), and pfile9 (t). |
| Dataset Splits | No | The paper describes using benchmark problems from the International Planning Competitions and a human planning dataset, but it does not specify explicit training, validation, or test dataset splits for these problems or data. |
| Hardware Specification | Yes | All tests were performed using an Intel Xeon Processor (2.27 GHz, 12MB Cache, 16 cores) with 16GB of RAM. |
| Software Dependencies | No | The paper mentions specific software tools such as LPRPG-P, OPTIC, and VAL4, but does not provide their version numbers. |
| Experiment Setup | Yes | We ran the automated planners (anytime planners) and report the best-quality plans generated within 30 min of CPU time. [...] Penalty weights (λ) on encoded preferences were set to equal 20% of the estimated problem cost. |
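
The penalty-weight rule quoted in the Experiment Setup row can be illustrated with a minimal sketch. The function names, the additive form of the metric, and the example numbers are assumptions made for illustration. The summary states only that λ was set to 20% of the estimated problem cost, not how the planners combine it with plan cost.

```python
def penalty_weight(estimated_cost: float, fraction: float = 0.2) -> float:
    """Return lambda as a fixed fraction of the estimated problem cost.

    The 0.2 default mirrors the 20% figure quoted in the table; the
    function name and signature are hypothetical.
    """
    if estimated_cost < 0:
        raise ValueError("estimated_cost must be non-negative")
    return fraction * estimated_cost


def penalized_metric(plan_cost: float, violated_preferences: int, weight: float) -> float:
    """Assumed additive metric: plan cost plus lambda per violated preference."""
    if violated_preferences < 0:
        raise ValueError("violated_preferences must be non-negative")
    return plan_cost + weight * violated_preferences


# Hypothetical numbers: a problem with an estimated cost of 100 units.
lam = penalty_weight(100.0)
total = penalized_metric(plan_cost=100.0, violated_preferences=2, weight=lam)
```

Under these assumptions, each violated preference adds one fifth of the estimated problem cost to the metric, so a plan that ignores several encoded strategies is penalized noticeably without the penalty dominating the base cost.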