Meimei: An Efficient Probabilistic Approach for Semantically Annotating Tables

Authors: Kunihiro Takeoka, Masafumi Oyamada, Shinji Nakadai, Takeshi Okadome (pp. 281-288)

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrated the superiority of the proposed approach over state-of-the-art approaches for semantic annotation of real data (183 human-annotated tables obtained from the UCI Machine Learning Repository).
Researcher Affiliation | Collaboration | Kunihiro Takeoka (NEC Corporation, k-takeoka@az.jp.nec.com), Masafumi Oyamada (NEC Corporation, m-oyamada@cq.jp.nec.com), Shinji Nakadai (NEC Corporation, s-nakadai@az.jp.nec.com), Takeshi Okadome (Kwansei Gakuin University, tokadome@acm.org)
Pseudocode | Yes | Algorithm 1: Approximate prediction with Gibbs sampling.
Open Source Code | No | The paper does not provide any explicit statement about releasing open-source code or a link to a code repository.
Open Datasets | Yes | "The dataset we used consists of 183 human-annotated tables (with 781 NE-columns and 4,109 literal-columns) obtained from the UCI Machine Learning repository (Dua and Karra Taniskidou 2017)."
Dataset Splits | No | The paper mentions using a 'training dataset' for optimizing parameters and for evaluation, but it does not specify particular splits (e.g., 80/10/10) for training, validation, and testing; it refers only to the 183 human-annotated tables.
Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments (e.g., CPU models, GPU types, memory).
Software Dependencies | No | The paper mentions software components such as Poincaré embedding and random forest classifiers, but does not specify their version numbers or other software dependencies required for replication.
Experiment Setup | Yes | "We set the number of iterations in Gibbs sampling to 300 because we observed the convergence at that point and further iterations did not affect the accuracy of the model."
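To illustrate the kind of fixed-iteration-budget Gibbs sampling the setup describes, here is a minimal, self-contained sketch. It is not the paper's annotation model: it samples a toy bivariate standard normal with correlation `rho` (a standard textbook example), and the function name and parameters are hypothetical. The only point carried over from the paper is running the chain for a fixed budget of 300 iterations.

```python
import random


def gibbs_bivariate_normal(rho, n_iter=300, seed=0):
    """Toy Gibbs sampler for a bivariate standard normal with correlation rho.

    Each full-conditional is Gaussian: x | y ~ N(rho * y, 1 - rho^2),
    and symmetrically for y | x. The chain is run for a fixed number of
    iterations, mirroring the paper's fixed budget of 300.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    cond_sd = (1.0 - rho ** 2) ** 0.5  # std. dev. of each full-conditional
    samples = []
    for _ in range(n_iter):
        x = rng.gauss(rho * y, cond_sd)  # resample x given current y
        y = rng.gauss(rho * x, cond_sd)  # resample y given updated x
        samples.append((x, y))
    return samples


samples = gibbs_bivariate_normal(rho=0.8, n_iter=300)
print(len(samples))  # 300 draws, matching the stated iteration budget
```

In practice one would monitor a convergence diagnostic (as the authors report doing) rather than trust a fixed count for a new model or dataset.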