Text Mining with R: Top Keywords of the useR! 2016

From June 27 to 30, the international R user and developer community will meet in Stanford, California, for the useR! 2016 conference. Right in the heart of Silicon Valley, gripping presentations and talks will cover a broad range of topics, from R-related computing issues to general statistical topics. If you are wondering what the most popular topics will be at this year’s useR!, we can help you out.

Growing anticipation and curiosity led us to run a brief text mining analysis to identify the top keywords of useR! 2016. We examined the abstracts of the contributed talks (http://schedule.user2016.org/) in R and created two word clouds illustrating the popularity and significance of the terms.

Unweighted word cloud with the top keywords of useR! 2016

The first word cloud is based solely on how often each term is mentioned. Unsurprisingly for this kind of event, R is one of the most frequently used words in the abstracts.
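The counting step behind a frequency-based word cloud can be sketched in a few lines. The original analysis was done in R; the following is only a minimal Python illustration of the idea, not the script from the analysis. The sample abstracts, the stop word list, and the helper names `tokenize` and `term_frequencies` are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical stand-ins for conference abstracts (not the real data set).
abstracts = [
    "R makes data analysis simple",
    "Shiny apps bring R models to the web",
    "Fast data pipelines with R and Spark",
]

# A tiny illustrative stop word list; a real analysis would use a fuller one.
STOP_WORDS = {"a", "and", "the", "to", "with"}


def tokenize(text):
    """Lowercase the text, split it into alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def term_frequencies(docs):
    """Count how often each term is mentioned across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts


freq = term_frequencies(abstracts)
top_terms = freq.most_common(2)  # the two most frequent terms with their counts
```

In an unweighted word cloud, the font size of each term would then be scaled by its count in `freq`.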

Weighted word cloud with the keywords of useR! 2016

The second word cloud additionally applies a tf-idf weighting, which reflects the importance of a word by, roughly speaking, taking into account the fraction of abstracts in which the term occurs. A term that shows up in many abstracts is weighted down, which is why R appears much smaller in the weighted word cloud than in the unweighted one.
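To make the effect of the weighting concrete, here is a small Python sketch of one common tf-idf variant: term frequency normalized by document length, multiplied by log2(N / document frequency). Which exact variant the original R script used is not stated here, so treat the formula, the sample abstracts, and the helper name `tfidf_totals` as assumptions for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stand-ins for conference abstracts (not the real data set).
abstracts = [
    "R makes data analysis simple",
    "Shiny apps bring R models to the web",
    "Fast data pipelines with R and Spark",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_totals(docs):
    """Sum each term's tf-idf weight over all documents.

    tf  = occurrences of the term in a document / tokens in that document
    idf = log2(number of documents / documents containing the term)
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    totals = defaultdict(float)
    for tokens in tokenized:
        for term, count in Counter(tokens).items():
            tf = count / len(tokens)
            idf = math.log2(n_docs / doc_freq[term])
            totals[term] += tf * idf
    return dict(totals)


weights = tfidf_totals(abstracts)
```

In this toy corpus, "r" occurs in every abstract, so its idf is log2(3/3) = 0 and its total weight drops to zero, while a term that appears in only one abstract keeps a positive weight.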

We also generated two bar charts illustrating the findings:

Top 10 most frequent terms from useR! 2016 abstracts.
Top 10 most frequent terms from useR! 2016 abstracts (tf-idf weighted)

This text mining analysis provides a good overview of the top terms at the upcoming useR! 2016. The results may not be surprising, but they might make you even more excited about the approaching start of the event.

If you want to take a look at the script and the data set used for the analysis, visit GitHub.



Published: 24. June 2016

Author

Christian Schreiner

Christian Schreiner is a marketing specialist at eoda GmbH. His responsibilities include data infrastructures and marketing solutions. In his spare time, he is interested in search engine optimization and trends in online communication.
