arXiv (Cornell University)
Semantic Word Clouds with Background Corpus Normalization and t-distributed Stochastic Neighbor Embedding
August 2017 • Erich Schubert, Andreas Spitz, M. Weiler, Johanna Geiß, Michael Gertz
Many word clouds provide no semantics to the word placement, but use a random layout optimized solely for aesthetic purposes. We propose a novel approach to model word significance and word affinity within a document, and in comparison to a large background corpus. We demonstrate its usefulness for generating more meaningful word clouds as a visual summary of a given document. We then select keywords based on their significance and construct the word cloud based on the derived affinity. Based on a modified t-distr…
Computer Science
Artificial Intelligence
Word Embedding
Algorithm
Visualization (Graphics)
Database
Anthropology
Philosophy
Programming Language