Generating all Possible Palindromes from Ngram Corpora

Authors: Alexandre Papadopoulos, Pierre Roy, Jean-Charles Régin, François Pachet

IJCAI 2015

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We applied our algorithm to several text corpora and tried to discover interesting palindromes. To improve their entertainment value, we tried to generate palindromes alluding to a particular topic, choosing corpora with a very distinctive topic. To counterbalance the drastic effect of the palindrome constraint, we enriched the Markov model with 2-grams from the Google Ngram Corpus with a frequency of occurrence higher than 50,000 (2.4M in total). From a combinatorial point of view, our approach has proved very tractable: it takes less than 3 seconds to build the palindrome graph, which has 2,459 vertices, and we can generate 10,000 palindromes in less than a second. |
| Researcher Affiliation | Collaboration | Alexandre Papadopoulos (1,2), Pierre Roy (1), Jean-Charles Régin (3), and François Pachet (1,2). (1) SONY CSL, 6 rue Amyot, 75005 Paris; (2) Sorbonne Universités, UPMC Univ Paris 06, UMR 7606, LIP6, F-75005, Paris, France; (3) Université Nice-Sophia Antipolis, I3S UMR 6070, CNRS, France |
| Pseudocode | Yes | Algorithm 1: Forward and backward graph creation; Algorithm 2: Odd Palindrome Graph |
| Open Source Code | No | The paper does not provide explicit statements or links to open-source code for the described methodology. |
| Open Datasets | Yes | The Google Books Ngram Corpus [Michel et al., 2011]; the King James Bible (the 1611 English translation of the Old and New Testaments); ten books on military art downloaded from Project Gutenberg |
| Dataset Splits | No | The paper describes using entire corpora (e.g., the King James Bible, the Google Ngram Corpus) to build models and generate palindromes, but it does not specify any explicit training, validation, or test splits. |
| Hardware Specification | No | The paper does not provide hardware details such as CPU/GPU models or memory specifications used for the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | No | The paper describes the algorithmic approach and generation process, but it does not include details such as hyperparameter values or other experimental configuration. |