Understanding Actors and Evaluating Personae with Gaussian Embeddings
Authors: Hannah Kim, Denys Katerenchuk, Daniel Billet, Jun Huan, Haesun Park, Boyang Li. Pages 6570–6577.
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We propose two actor-modeling tasks, cast prediction and versatility ranking, which can capture complementary aspects of the relation between actors and the characters they portray. For an actor model, we present a technique for embedding actors, movies, character roles, genres, and descriptive keywords as Gaussian distributions and translation vectors, where the Gaussian variance corresponds to actors' versatility. Empirical results indicate that (1) the technique considerably outperforms TransE (Bordes et al. 2013) and ablation baselines and (2) automatically identified persona topics (Bamman, O'Connor, and Smith 2013) yield statistically significant improvements in both tasks, whereas simplistic persona descriptors including age and gender perform inconsistently, validating prior research. |
| Researcher Affiliation | Collaboration | Hannah Kim,¹* Denys Katerenchuk,²* Daniel Billet,³ Jun Huan,⁴ Haesun Park,¹ Boyang Li⁴* (¹Georgia Institute of Technology, ²The City University of New York, ³Independent Actor, ⁴Big Data Lab, Baidu Research) |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found. |
| Open Source Code | Yes | We create a set of versatility rankings by four domain experts. The data and the model will be released. |
| Open Datasets | Yes | With permission from The Movie Database (www.themoviedb.org), we collected the metadata of 335,037 movies with complete cast lists, genres, and user-supplied keywords. |
| Dataset Splits | Yes | The movie-persona-actor triples are randomly split into a 70% training set, a 15% validation set, and a 15% test set. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments were mentioned. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9) were mentioned. |
| Experiment Setup | Yes | The dimension D is set to 40, margin φ is 4, batch size is 128, and dropout probability is 0.6. The learning rate starts at 0.15 and is reduced to 0.0001 over 600 epochs. The optimization is performed using RMSProp. For the AG version, σmin is set to 0.0001 and σmax is set to 25. For every other method, σmax is set to 100. |
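The quoted setup describes Gaussian embeddings joined by TransE-style translation vectors, with the variance clamped to [σmin, σmax]. The paper's exact energy function is not quoted above, so the sketch below is an illustrative reconstruction only: it uses the standard expected-likelihood similarity between Gaussians (as in Vilnis and McCallum's word-to-Gaussian embeddings) with the dimension and variance bounds from the experiment setup. All function names are hypothetical, not from the paper's released code.

```python
import numpy as np

D = 40                               # embedding dimension (from the setup)
SIGMA_MIN, SIGMA_MAX = 1e-4, 25.0    # variance clamp for the AG version

def clamp_variance(sigma):
    # Keep each diagonal variance inside [SIGMA_MIN, SIGMA_MAX],
    # mirroring the sigma bounds reported in the experiment setup.
    return np.clip(sigma, SIGMA_MIN, SIGMA_MAX)

def log_expected_likelihood(mu1, sigma1, mu2, sigma2):
    # log N(0; mu1 - mu2, Sigma1 + Sigma2) for diagonal covariances:
    # a common similarity between two Gaussian embeddings (an assumed
    # choice here, not necessarily the paper's energy function).
    diff = mu1 - mu2
    var = sigma1 + sigma2
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + diff ** 2 / var)

def score_triple(actor_mu, actor_sigma, relation_vec, movie_mu, movie_sigma):
    # TransE-style translation applied to the mean: actor + relation ≈ movie.
    # A larger (less negative) score indicates a more plausible triple.
    return log_expected_likelihood(actor_mu + relation_vec,
                                   clamp_variance(actor_sigma),
                                   movie_mu,
                                   clamp_variance(movie_sigma))
```

In a margin-based training loop with φ = 4 (as in the setup), a positive triple's score would be pushed to exceed a corrupted triple's score by at least the margin, e.g. `loss = max(0, 4.0 - score_pos + score_neg)`, optimized with RMSProp over 600 epochs.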