Formalizing Consistency and Coherence of Representation Learning

Authors: Harald Strömfelt, Luke Dickens, Artur Garcez, Alessandra Russo

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that relation-decoders that maintain consistency over unobserved regions of representation space retain coherence across domains, whilst achieving better transfer learning performance. In summary, the contributions of this paper are: A formal definition of consistency and coherence for sub-symbolic learners offering a practical evaluation score for concept coherence; A derived model implementation and PRT data set and experimental setup used to evaluate the interplay between concept coherence and concept transfer; A comprehensive critical evaluation of results and comparison of multiple relation-decoder models, showing that improvements in concept coherence, as defined in this paper, correspond with improved concept transfer. In this section, experimental results show that transfer learning performance is positively correlated with our measures for consistency and coherence.
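The reported positive correlation between the coherence measures and transfer performance can be checked with a short script. The sketch below is illustrative only: the array values are stand-ins, and the names coherence_scores and transfer_accuracies are ours, not taken from the paper's evaluation code.

    # Illustrative sketch: checking whether a coherence score tracks transfer
    # performance across relation-decoder models. The values are stand-ins;
    # the paper's actual scores come from its own evaluation pipeline.
    import numpy as np

    coherence_scores = np.array([0.41, 0.55, 0.62, 0.70, 0.78])     # hypothetical per-model scores
    transfer_accuracies = np.array([0.52, 0.58, 0.61, 0.66, 0.71])  # hypothetical transfer results

    # Pearson correlation; a clearly positive value is what the paper reports.
    r = np.corrcoef(coherence_scores, transfer_accuracies)[0, 1]
    print(f"Pearson r = {r:.3f}")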
Researcher Affiliation | Academia | Harald Strömfelt, Department of Computing, Imperial College London, London SW7 2AZ, h.stromfelt17@imperial.ac.uk; Luke Dickens, Department of Information Studies, University College London, London WC1E 6BT, l.dickens@ucl.ac.uk; Artur d'Avila Garcez, Department of Computer Science, City, University of London, London EC1V 0HB, a.garcez@city.ac.uk; Alessandra Russo, Department of Computing, Imperial College London, London SW7 2AZ, a.russo@imperial.ac.uk
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | The codebase for this paper can be found at https://github.com/HStromfelt/neurips22-FCA.
Open Datasets | Yes | For domain X we use the MNIST handwritten digits data set [24], and for domain Y we use the proposed Block Stacks data set, consisting of a single stack of multi-colored cubes of differing heights, each containing one randomly-positioned red cube (see Appendix B for details and examples).
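A minimal data-loading sketch for the two domains, assuming torchvision is available. MNIST is a standard torchvision data set; Block Stacks is released with the paper, so the ImageFolder call and the data/block_stacks path below are hypothetical placeholders for however the published code exposes it.

    import torchvision
    import torchvision.transforms as T

    transform = T.ToTensor()

    # Domain X: MNIST handwritten digits (downloaded if not already present).
    mnist_train = torchvision.datasets.MNIST(
        root="data/mnist", train=True, download=True, transform=transform
    )

    # Domain Y: Block Stacks images; the path and folder layout are assumptions,
    # since this data set ships with the paper's code rather than torchvision.
    block_stacks = torchvision.datasets.ImageFolder(
        root="data/block_stacks", transform=transform
    )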
Dataset Splits | No | The paper does not provide specific details on train/validation/test splits, such as percentages or sample counts for each split.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions software like PyTorch and CUDA generally, but does not provide specific version numbers for these or other software dependencies.
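Since versions are not pinned, anyone re-running the released code may want to record the versions they actually used; a small snippet such as the following (standard PyTorch calls, not from the paper) is enough.

    import torch

    # Log the framework and CUDA/cuDNN builds used for a run.
    print("PyTorch:", torch.__version__)
    print("CUDA (build):", torch.version.cuda)
    print("cuDNN:", torch.backends.cudnn.version())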
Experiment Setup | Yes | In the source domain, we explore β values in {1, 4, 8, 12} and set λ = 10^3. In the target domain, we first normalise losses and set β = 10^-4 and λ = 10^-2, as these produced good image reconstructions while optimising L_Y^σ. In all experiments we fix Z = R^10. We provide further details for all models, including training regimen, parameterization and implementation in Appendix D.
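The reported hyperparameters can be gathered into a single configuration for reference. The sketch below is a hypothetical layout (names and structure are ours, not the released code's); it simply restates the values quoted above.

    # Hypothetical configuration restating the reported hyperparameters.
    LATENT_DIM = 10  # Z = R^10 in all experiments

    source_domain = {
        "beta_grid": [1, 4, 8, 12],  # beta values explored in the source domain
        "lambda": 1e3,
    }

    target_domain = {
        # Losses are normalised in the target domain before weighting.
        "beta": 1e-4,
        "lambda": 1e-2,
    }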