Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Vocabulary Alignment in Openly Specified Interactions
Authors: Paula Daniela Chocron, Marco Schorlemmer
JAIR 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present two techniques that can be used either to learn an alignment from scratch or to repair an existent one, and we evaluate their performance experimentally. |
| Researcher Affiliation | Academia | Paula Chocron (EMAIL), Artificial Intelligence Research Institute, IIIA-CSIC, Bellaterra (Barcelona), Catalonia, Spain; Universitat Autònoma de Barcelona, Bellaterra (Barcelona), Catalonia, Spain. Marco Schorlemmer (EMAIL), Artificial Intelligence Research Institute, IIIA-CSIC, Bellaterra (Barcelona), Catalonia, Spain. |
| Pseudocode | No | The paper describes methods and techniques in prose and mathematical notation (e.g., definitions, equations) but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any specific links to source code repositories, an explicit statement of code release, or mention code in supplementary materials for the methodology described. |
| Open Datasets | No | We evaluate the techniques that we propose with randomly generated data, which allows us to abstract away from any implementation details. |
| Dataset Splits | No | A run of an experiment consists of two agents a1 and a2 with vocabularies V1 and V2 who are sequentially given pairs of protocols compatible under one same translation τ. ... We let agents interact 300 times. After each interaction, we measured how close agents were to the correct translation τ. |
| Hardware Specification | No | The vocabularies of 8 words that we used in the experiments were the larger ones for which the technique could be executed on our server. After that, it became prohibitively space-consuming, and was automatically killed. (No specific hardware details are provided, only a general reference to 'our server'). |
| Software Dependencies | No | We used the NuSMV model checker (Cimatti et al., 2002) to perform all the necessary satisfiability checks. (The tool NuSMV is mentioned, but no version number is specified.) |
| Experiment Setup | Yes | A run of an experiment consists of two agents a1 and a2 with vocabularies V1 and V2 who are sequentially given pairs of protocols compatible under one same translation τ. ... We let agents interact 300 times. After each interaction, we measured how close agents were to the correct translation τ. ... We used the values r = 0.3 for the punishment parameter of the simple strategy. Each experiment was repeated 10 times, and we averaged the results. ... We show here the experiments for a vocabulary of 12 words and four different protocol sizes. |
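The Experiment Setup row describes an evaluation harness: pairs of agents interact 300 times per run, the distance to the correct translation τ is measured after each interaction, and results are averaged over 10 repetitions. Since the paper releases no code, the sketch below is purely illustrative: `alignment_distance` and the per-interaction learning dynamic are hypothetical stand-ins for the paper's technique, kept only to show the shape of such an experiment (12-word vocabulary, 300 interactions, 10 averaged runs).

```python
import random

def alignment_distance(candidate, tau):
    """Fraction of vocabulary words whose mapping still disagrees with tau
    (a hypothetical closeness metric, not the paper's)."""
    wrong = sum(1 for w, t in tau.items() if candidate.get(w) != t)
    return wrong / len(tau)

def run_experiment(num_interactions=300, learn_prob=0.3, seed=0):
    """One run: an agent gradually acquires the hidden translation tau.

    The learning dynamic here (reveal one correct mapping with some
    probability per interaction) is a placeholder, not the paper's method.
    """
    rng = random.Random(seed)
    vocab = [f"w{i}" for i in range(12)]   # 12-word vocabulary, as in the experiments
    tau = {w: w.upper() for w in vocab}    # the hidden correct translation
    candidate = {}                         # agent's current belief about tau
    history = []
    for _ in range(num_interactions):
        w = rng.choice(vocab)
        if rng.random() < learn_prob:
            candidate[w] = tau[w]
        history.append(alignment_distance(candidate, tau))
    return history

# Repeat 10 times and average, as described in the experiment setup.
runs = [run_experiment(seed=s) for s in range(10)]
avg_final = sum(r[-1] for r in runs) / len(runs)
```

Under this stand-in dynamic the measured distance is non-increasing within a run, which mirrors the paper's report of agents converging toward the correct translation over repeated interactions.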