0:00 – Start
1:32 – Workshop Goals
3:50 – Introduction to Text Mining
14:18 – How to get the code for this workshop
15:07 – CODING BEGINS
15:30 – Tokenization
16:43 – unnest_tokens()
19:18 – Data cleaning
21:03 – Assign line numbers
22:22 – Tokenize
23:00 – Stop words
25:52 – Count word frequency
26:31 – Visualize word frequency
28:14 – Your turn
29:07 – Q/A
29:57 – Sentiment Analysis
36:04 – Visualize word frequency with a bar graph, e.g. the most frequent positive and negative words
36:28 – ggplot2::geom_col() to generate a bar graph
38:03 – Sentiment dictionaries
40:17 – Visualize sentiment using the AFINN sentiment dictionary
41:43 – Q/A part 2
Apply the lessons of _Text Mining with R_ by Silge & Robinson. Analyze public domain novels by Jane Austen, wrangle text data into submission, tokenize corpora, generate word clouds, and get an introduction to sentiment analysis.
This Rfun case study demonstrates the utility of R / Tidyverse workflows. You can use the Tidyverse as a universal, reproducible interface for your analysis projects.
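The steps covered in the workshop (assign line numbers, tokenize, remove stop words, count word frequency, tag sentiment) can be sketched roughly as follows. This is a minimal sketch and not the workshop's actual code: it assumes the dplyr, tidytext, and janeaustenr packages are installed, the object names (austen_lines, tidy_austen, word_counts, sentiment_counts) are made up for illustration, and it uses the Bing lexicon bundled with tidytext rather than AFINN, which needs a separate download.

```r
# Assumed packages: dplyr, tidytext, janeaustenr
library(dplyr)
library(tidytext)
library(janeaustenr)

# One row per line of text, with a line number within each book
austen_lines <- austen_books() %>%
  group_by(book) %>%
  mutate(linenumber = row_number()) %>%
  ungroup()

# Tokenize: one lowercased word per row
tidy_austen <- austen_lines %>%
  unnest_tokens(word, text)

# Drop common stop words, then count word frequency
word_counts <- tidy_austen %>%
  anti_join(stop_words, by = "word") %>%
  count(word, sort = TRUE)

# Tag words as positive or negative with the Bing lexicon
# (recent dplyr versions may warn about a many-to-many join here)
sentiment_counts <- tidy_austen %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  count(word, sentiment, sort = TRUE)

head(word_counts, 10)
head(sentiment_counts, 10)
```

Either result can be passed to ggplot2::geom_col() to draw the bar graphs shown in the video.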
More Rfun at
Part of the DVS Workshop Series:
LINKS
– Code for this workshop:
– Documentation: _Text Mining with R: A Tidy Approach_ by Julia Silge & David Robinson ::
– tidytext: Text mining using tidy tools ::
Very helpful, thx!
As a German, I have to add that it's not "farfegnugen" but "Fahrvergnügen". Who knows, maybe Jane Austen included it in her novels with the correct spelling 😉
Great class.
Keep up the good work.
Thank You,
Natasha Samuel