For reference

  1. Tidy Data
    Journal of Statistical Software, 2014, 59 (10): 1-23.
    Hadley Wickham

  2. Tricks for Cleaning your Data in R
    Tutorial, 2017: 1-12.
    Christine Zhang

  3. Data Wrangling Cheat Sheet
    dplyr & tidyr Cheat Sheet
    RStudio


DataCamp


Data for In-class Workshop

1. Gapminder exercise

Gapminder data: GM.csv

Choose a dataset to analyze for the exercise: Gapminder datasets

Gapminder R example script: R_Basics_II_Exercise_1_Example.R

Gapminder R exercise script: R_Basics_II_Exercise_1.R

2. Russian social media troll exercise

Russian IRA data: GitHub repo

Russian IRA dataset 1 of 13: Local data

Instructions:

  1. What are the most frequent and second-most frequent languages?
  2. In which region did tweets reach the most followers in total?
  3. On average, how many followers did each tweet reach in each region?
  4. How many tweets are retweets in each language?
  5. How many tweets are not retweets in each language?
  6. How frequently are Trump and Clinton mentioned in the tweets?
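
One way to approach these questions is sketched below in base R. This is a minimal sketch, not the course solution: the small `tweets` data frame is made-up toy data, and the column names (`content`, `region`, `language`, `followers`, `retweet`) are assumptions about how the IRA files are laid out, so check them against the actual data before reusing any of this.

```r
# Toy stand-in for one IRA file. The values are invented and the column
# names are assumed, not taken from the real dataset.
tweets <- data.frame(
  content   = c("Trump rally tonight", "Clinton emails again",
                "Good morning", "Trump vs Clinton debate", "Privet"),
  region    = c("United States", "United States", "Unknown",
                "United States", "Unknown"),
  language  = c("English", "English", "English", "English", "Russian"),
  followers = c(100, 300, 50, 200, 10),
  retweet   = c(1, 0, 0, 1, 0),
  stringsAsFactors = FALSE
)

# Question 1: tweet counts per language, most frequent first
lang_counts <- sort(table(tweets$language), decreasing = TRUE)

# Question 2: total followers reached per region
reach_by_region <- aggregate(followers ~ region, data = tweets, FUN = sum)

# Question 3: average followers reached per tweet in each region
mean_reach <- aggregate(followers ~ region, data = tweets, FUN = mean)

# Questions 4 and 5: retweets (1) and non-retweets (0) per language
retweets_by_lang <- table(tweets$language, tweets$retweet)

# Question 6: number of tweets mentioning each name, ignoring case
trump_mentions   <- sum(grepl("trump", tweets$content, ignore.case = TRUE))
clinton_mentions <- sum(grepl("clinton", tweets$content, ignore.case = TRUE))
```

With the real data, `tweets` would instead come from reading one of the downloaded CSV files with `read.csv()`. The same summaries can also be written with `group_by()` and `summarise()` from dplyr, which the cheat sheet listed above covers.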

3. Visualization exercise

Class exercises: Visualization_Exercise.R

4. Assignment to ensure you understand R (optional)

Assignment