reproducibilityindex.ai

Commonsense Interpretation of Triangle Behavior

Authors: Andrew Gordon

AAAI 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate our approach using Triangle-COPA, a benchmark suite of 100 challenge problems based on an early social psychology experiment by Fritz Heider and Marianne Simmel. Commonsense knowledge of actions, social relationships, intentions, and emotions are encoded as defeasible axioms in first-order logic. We identify sets of assumptions that logically entail observed behaviors by backchaining with these axioms to a given depth, and order these sets by their joint probability assuming conditional independence. Our approach solves almost all (91) of the 100 questions in Triangle-COPA, and demonstrates a promising approach to robust behavior interpretation that integrates both logical and probabilistic reasoning.
Researcher Affiliation	Academia	Andrew S. Gordon Institute for Creative Technologies University of Southern California 12015 Waterfront Drive Los Angeles, California 90094 USA gordon@ict.usc.edu
Pseudocode	No	No pseudocode or algorithm blocks are present.
Open Source Code	Yes	Our software implementation accepts the knowledge base and a conjunction of observations as input... (Footnote 2: Available at https://github.com/asgordon/EtcAbductionPy)
Open Datasets	Yes	The Triangle Choice of Plausible Alternatives (Triangle-COPA) set of one hundred challenge problems is a recent attempt to overcome these two problems with the original Heider-Simmel movie (Maslan, Roemmele, and Gordon 2015)... (Footnote 1: Available at https://github.com/asgordon/Triangle COPA)
Dataset Splits	No	When assessing this result, it is important to remember that Triangle-COPA was designed as a development test set, not as a held-out test set for use in competitive evaluations.
Hardware Specification	No	No specific hardware details (GPU/CPU models, memory, etc.) are provided.
Software Dependencies	No	No specific software dependencies with version numbers are mentioned (e.g., Python version, specific libraries).
Experiment Setup	Yes	By keeping the backchaining depth parameter low (e.g. below 5) and the n-best list short (e.g. 10 solutions), our implementation is able to exhaustively search through millions of assumption sets for each Triangle-COPA problem in seconds. ... A backchaining depth of 3 was sufficient in all but 1 question, where a depth of 4 was necessary. An n-best list length of 10 was sufficient in all but 1 question, where a length of 27 was necessary.
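
The two parameters quoted in the Experiment Setup row (backchaining depth and n-best list length) can be illustrated with a minimal sketch. This is not the EtcAbductionPy code or its API: it is a simplified propositional stand-in (the paper uses first-order axioms), and the rules, literal names, prior values, and the functions `explain` and `n_best` are all hypothetical. It enumerates assumption sets by backchaining to a fixed depth and ranks them by joint probability, taken as the product of priors under the conditional independence assumption.

```python
from itertools import product
from math import prod

# Hypothetical toy knowledge base: consequent -> alternative antecedent lists.
# Literals listed in PRIORS are assumable and carry an assumed prior probability.
RULES = {
    "chase": [["angry", "etc_chase_angry"], ["playing", "etc_chase_play"]],
    "angry": [["etc_angry"]],
    "playing": [["etc_playing"]],
}
PRIORS = {
    "etc_chase_angry": 0.9,
    "etc_chase_play": 0.6,
    "etc_angry": 0.1,
    "etc_playing": 0.3,
}


def explain(literal, depth):
    """Return assumption sets that entail `literal` within `depth` backchaining steps."""
    if literal in PRIORS:
        # An assumable literal explains itself.
        return [frozenset([literal])]
    if depth == 0 or literal not in RULES:
        # Depth budget exhausted, or nothing in the knowledge base proves it.
        return []
    results = []
    for antecedents in RULES[literal]:
        # Every antecedent must be explained; combine one option per antecedent.
        options = [explain(a, depth - 1) for a in antecedents]
        for combo in product(*options):
            results.append(frozenset().union(*combo))
    return results


def n_best(observations, depth=3, n=10):
    """Rank assumption sets entailing all observations by joint probability."""
    per_observation = [explain(o, depth) for o in observations]
    candidates = {frozenset().union(*combo) for combo in product(*per_observation)}
    # Conditional independence: joint probability is the product of the priors.
    scored = [(prod(PRIORS[a] for a in s), sorted(s)) for s in candidates]
    return sorted(scored, key=lambda pair: -pair[0])[:n]


best = n_best(["chase"], depth=3, n=10)
```

In this toy setup a depth of 1 finds no explanation for `chase`, because the intermediate literals `angry` and `playing` need one more backchaining step. That mirrors, in miniature, why a question can require a larger depth than the default.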