Understanding Actors and Evaluating Personae with Gaussian Embeddings

Authors: Hannah Kim, Denys Katerenchuk, Daniel Billet, Jun Huan, Haesun Park, Boyang Li (pp. 6570-6577)

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We propose two actor-modeling tasks, cast prediction and versatility ranking, which can capture complementary aspects of the relation between actors and the characters they portray. For an actor model, we present a technique for embedding actors, movies, character roles, genres, and descriptive keywords as Gaussian distributions and translation vectors, where the Gaussian variance corresponds to actors' versatility. Empirical results indicate that (1) the technique considerably outperforms TransE (Bordes et al. 2013) and ablation baselines and (2) automatically identified persona topics (Bamman, O'Connor, and Smith 2013) yield statistically significant improvements in both tasks, whereas simplistic persona descriptors including age and gender perform inconsistently, validating prior research.
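The core modeling idea quoted above — each actor as a Gaussian whose variance tracks versatility, with movies and personae related through translation vectors — can be sketched as follows. This is a hypothetical illustration only: the function name, the diagonal-covariance log-density score, and the translation direction are assumptions, not the authors' published scoring function.

```python
import numpy as np

def gaussian_score(mu, sigma, movie_vec, persona_vec):
    """Score an (actor, persona, movie) triple as the log-density of the
    persona-translated movie vector under the actor's diagonal Gaussian.
    A larger sigma (a more 'versatile' actor) flattens the density, so the
    actor tolerates a wider range of roles at the cost of peak fit."""
    x = movie_vec + persona_vec            # translate movie by persona vector
    var = sigma ** 2
    return -0.5 * np.sum((x - mu) ** 2 / var + np.log(2 * np.pi * var))

rng = np.random.default_rng(0)
D = 40                                     # embedding dimension from the paper
mu = rng.normal(size=D)                    # actor mean
sigma = np.full(D, 1.0)                    # actor std dev (versatility proxy)
movie = rng.normal(size=D)
persona = rng.normal(size=D)
print(gaussian_score(mu, sigma, movie, persona))
```

Under this scoring, cast prediction amounts to ranking candidate actors by the score of the (actor, persona, movie) triple, and versatility ranking orders actors by the magnitude of sigma.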
Researcher Affiliation | Collaboration | Hannah Kim,1* Denys Katerenchuk,2* Daniel Billet,3 Jun Huan,4 Haesun Park,1 Boyang Li4* 1Georgia Institute of Technology, 2The City University of New York, 3Independent Actor, 4Big Data Lab, Baidu Research
Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found.
Open Source Code | Yes | We create a set of versatility rankings by four domain experts. The data and the model will be released.
Open Datasets | Yes | With permission from The Movie Database3, we collected the metadata of 335,037 movies with complete cast lists, genres, and user-supplied keywords. (3www.themoviedb.org)
Dataset Splits | Yes | The movie-persona-actor triples are randomly split into a 70% training set, a 15% validation set, and a 15% test set.
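The 70/15/15 random split described above can be sketched in a few lines. The helper name and the fixed seed are assumptions for illustration; the paper does not specify how the shuffle was seeded.

```python
import random

def split_triples(triples, seed=0):
    """Randomly split (movie, persona, actor) triples into
    70% train / 15% validation / 15% test partitions."""
    items = list(triples)
    random.Random(seed).shuffle(items)     # deterministic shuffle for the sketch
    n = len(items)
    n_train = int(0.70 * n)
    n_val = int(0.15 * n)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

train, val, test = split_triples(range(1000))
print(len(train), len(val), len(test))     # 700 150 150
```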
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments were mentioned.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9) were mentioned.
Experiment Setup | Yes | The dimension D is set to 40, margin φ is 4, batch size is 128, and dropout probability is 0.6. The learning rate starts at 0.15 and is reduced to 0.0001 over 600 epochs. The optimization is performed using RMSProp. For the AG version, σmin is set to 0.0001 and σmax is set to 25. For every other method, σmax is set to 100.
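The quoted hyperparameters can be collected into a single config, and the learning-rate reduction from 0.15 to 0.0001 sketched as a schedule. Note the decay shape is not stated in the excerpt, so the exponential interpolation below is an assumption, as are the config key names.

```python
# Hyperparameters quoted in the review; σmax = 25 applies to the AG
# version, while the other methods use σmax = 100.
CONFIG = {
    "dim": 40, "margin": 4.0, "batch_size": 128, "dropout": 0.6,
    "lr_start": 0.15, "lr_end": 0.0001, "epochs": 600,
    "sigma_min": 0.0001, "sigma_max": 25.0,
}

def lr_at(epoch, cfg=CONFIG):
    """Exponentially interpolate the learning rate from lr_start (epoch 0)
    down to lr_end (final epoch). The decay shape is an assumption."""
    frac = epoch / (cfg["epochs"] - 1)
    return cfg["lr_start"] * (cfg["lr_end"] / cfg["lr_start"]) ** frac

print(lr_at(0), lr_at(599))
```

With a framework optimizer such as `torch.optim.RMSprop`, the per-epoch rate would be applied by updating each parameter group's `lr` before the epoch's batches.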