For reference
- Tidy Data. Hadley Wickham. Journal of Statistical Software, 2014, 59 (10): 1-23.
- Tricks for Cleaning your Data in R. Christine Zhang. Tutorial, 2017: 1-12.
- Data Wrangling Cheat Sheet and dplyr & tidyr Cheat Sheet. RStudio, DataCamp.
- Working with Data in the Tidyverse (Chapters 1 and 2 only)
Data for In-class Workshop
1. Gapminder exercise
Gapminder data: GM.csv
Choose a dataset to analyze for the exercise: Gapminder datasets
Gapminder R example script: R_Basics_II_Exercise_1_Example.R
Gapminder R exercise script: R_Basics_II_Exercise_1.R
2. Russian social media troll exercise
Russian IRA data: GitHub repo
Russian IRA dataset 1 of 13: Local data
Instructions:
- What are the most frequent and second-most frequent languages?
- What region had the most tweets received by followers?
- On average, how many followers did each tweet reach in each region?
- How many tweets are retweets in each language?
- How many tweets are not retweets in each language?
- How frequently are Trump and Clinton mentioned in the tweets?
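The questions above map onto a handful of dplyr verbs. The following is a minimal sketch on a small made-up data frame, not a solution for the real files. The column names (language, region, followers, retweet, content) are assumptions about the IRA data and should be checked against the actual CSV. The toy values are invented for illustration. Reading "tweets received by followers" as the sum of followers per region is also an assumption.

```r
library(dplyr)

# Toy stand-in for one IRA file. Column names and values are hypothetical;
# check them against the real CSV before reusing this code.
tweets <- data.frame(
  language  = c("English", "English", "Russian", "English", "German"),
  region    = c("United States", "United States", "Unknown",
                "United States", "Germany"),
  followers = c(100, 300, 50, 200, 10),
  retweet   = c(1, 0, 1, 0, 0),
  content   = c("Trump rally today", "Clinton and Trump debate",
                "Dobryi den", "Weather report", "Guten Tag"),
  stringsAsFactors = FALSE
)

# Languages ordered from most to least frequent
language_counts <- tweets %>%
  count(language, sort = TRUE)

# Total and average followers reached per tweet, by region
region_reach <- tweets %>%
  group_by(region) %>%
  summarise(
    total_followers = sum(followers),
    mean_followers  = mean(followers)
  ) %>%
  arrange(desc(total_followers))

# Retweets and non-retweets per language (assumes retweet is coded 0/1)
retweets_by_language <- tweets %>%
  group_by(language) %>%
  summarise(
    retweets     = sum(retweet == 1),
    not_retweets = sum(retweet == 0)
  )

# Number of tweets mentioning each name, ignoring case
mentions <- tweets %>%
  summarise(
    trump   = sum(grepl("trump", content, ignore.case = TRUE)),
    clinton = sum(grepl("clinton", content, ignore.case = TRUE))
  )
```

To apply the same steps to the real data, replace the toy data frame with the result of reading the downloaded file (for example with read.csv) and adjust the column names as needed.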
3. Visualization exercise
Class exercises: Visualization_Exercise.R