Visualizing second language research data in R using ggplot2

I recently gave a workshop on visualizing data at Donuts and Distribution, a statistics reading group for the Second Language Studies program at Michigan State University. The presentation slides and workshop materials can be found on my RPubs page, Part 1 and Part 2. These were designed for a novice/intermediate audience. Part 1 …

Getting comfortable with the RStudio environment

I started using R relatively recently, but I see more and more people around me learning and using it. In this post, I offer a few small tips that might be helpful for beginners setting up their RStudio environment. It is the small things that make your life better, and if you don't already know, here are …

A guide to using the Stanford CoreNLP Tools for automatic text annotation

As the title suggests, this is a guide to automatically annotating raw texts using Stanford CoreNLP. This tool serves a similar function to the cleanNLP and spaCy combination that I discussed in a previous post. When working with CoreNLP, the annotation itself does not require using R and the annotated output is …

A basic guide to using NLP for corpus analysis with R (Part 2): Processing text files

If you're working with language data, you probably want to process text files rather than strings of words you type into an R script. Here is how to deal with files. Refer to the previous post for setting up the tools if needed. Again, please see the PDF version to see the R script output. …

A basic guide to using NLP for corpus analysis with R (Part 1): Installing Python, spaCy, and cleanNLP

This is Part 1 of a basic guide for setting up and using a natural language processing (NLP) tool with R. I specifically utilize the spaCy “industrial strength natural language processing” Python library, and an R wrapper called cleanNLP that provides tools for annotating texts and obtaining data tables. In this post, I will explain …