Data Science for Social Good — 2014 KDD Highlights
Authors: Wei Wang
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The KDD conference typically has an emphasis on research motivated by real-world applications. The breadth of topics covered in the 2014 research program is truly comprehensive and nicely balanced among social and information networks, data mining for social good, graph mining, statistical techniques for big data, topic modeling, recommender systems, data streams, scalable methods, Web mining, clustering, feature selection, applications to health care and medicine, public safety, advertising, social analytics, personalization, workforce analytics, health, and many more. [...] Li and co-authors (Li 2014) investigated this issue in the context of topic modeling. Sampling is employed in topic modeling inference in order to associate latent variables with observations. Leveraging the sparsity property, they proposed an efficient algorithm that approximates a dense, slowly changing distribution by combining a Metropolis-Hastings step, the use of sparsity, and amortized constant-time sampling via Walker's alias method. It scales linearly with the number of instantiated topics in the document rather than the total number of topics, leading to an order of magnitude speedup. This algorithm is generic and has wide applications in statistical modeling. This paper was recognized with the Best Research Paper Award. |
| Researcher Affiliation | Academia | Department of Computer Science, University of California, Los Angeles weiwang@cs.ucla.edu |
| Pseudocode | No | No pseudocode or algorithm blocks are present in this paper. |
| Open Source Code | No | No statement or link indicating that open-source code for the content of this paper is available. |
| Open Datasets | No | The paper mentions 'Electronic health records (EHRs)' and other data types that were used by the summarized research papers, but provides no concrete access information (link, DOI, specific repository, or formal citation with author/year for public dataset) for any dataset. |
| Dataset Splits | No | No specific dataset split information (percentages, counts, or citations to predefined splits) is provided for any of the summarized experiments. |
| Hardware Specification | No | No specific hardware details (GPU/CPU models, processor types, or memory amounts) are mentioned for any of the summarized experiments. |
| Software Dependencies | No | No specific ancillary software details (library or solver names with version numbers) are mentioned for any of the summarized experiments. |
| Experiment Setup | No | No specific experimental setup details (hyperparameter values, training configurations, or system-level settings) are provided for any of the summarized experiments. |
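
The Research Type excerpt above mentions amortized constant-time sampling via Walker's alias method. As a rough illustration of that one building block only, here is a minimal Python sketch of an alias table for a fixed discrete distribution. It is not the algorithm of Li and co-authors: it omits the Metropolis-Hastings correction and the sparse/dense decomposition, and the function names `build_alias_table` and `alias_sample` are hypothetical, chosen for this example.

```python
import random


def build_alias_table(weights):
    """Build alias tables for a discrete distribution in O(n) time.

    `weights` are non-negative and need not sum to 1.
    Returns (prob, alias): bucket i keeps outcome i with probability
    prob[i] and otherwise yields outcome alias[i].
    """
    n = len(weights)
    total = sum(weights)
    # Rescale so that the average bucket weight is exactly 1.
    scaled = [w * n / total for w in weights]
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]
        alias[s] = l
        # Outcome l donates the mass needed to fill bucket s.
        scaled[l] -= 1.0 - scaled[s]
        if scaled[l] < 1.0:
            small.append(l)
        else:
            large.append(l)
    # Leftover buckets are full up to floating-point rounding.
    for i in small + large:
        prob[i] = 1.0
        alias[i] = i
    return prob, alias


def alias_sample(prob, alias, rng=random):
    """Draw one outcome in O(1): pick a bucket, then flip a biased coin."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


prob, alias = build_alias_table([0.5, 0.3, 0.2])
draws = [alias_sample(prob, alias) for _ in range(10)]
```

Building the table costs linear time, but each later draw costs constant time. That is why the method pays off when a distribution changes slowly and the same table can be reused for many draws, which is the setting the excerpt describes.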