Category: R statistical package

H is for haven

The tidyverse includes many packages meant to make importing, wrangling, analyzing, and visualizing data easier. The haven package allows you to import files from other statistical software, such as SPSS,...continue reading.

G is for group_by

For the letter G, I’d like to introduce a very useful function: group_by. This function lets you group data by one or more variables. By itself, it may not seem...continue reading.

F is for filter

For the letter F – filters! Filters are incredibly useful, especially when combined with the main pipe %>%. I frequently use filters along with ggplot functions, to chart a specific...continue reading.

E is for Exposition Pipe

For the letter E, I want to talk about a set of operators provided by the tidyverse (specifically the magrittr package) that make for much prettier, easier-to-read code: pipes. The main...continue reading.

D is for dummy_cols

For the letter D, I’m going to talk about the dummy_cols function, which isn’t actually part of the tidyverse, but hey: my posts, my rules. This function is incredibly useful...continue reading.

C is for coalesce

For the letter C, we’ll talk about the coalesce function. If you’re familiar with SQL, you may have seen this function before. It combines two or more variables into a...continue reading.

B is for bind_rows

Moving on to the letter B, today we’ll talk about merging datasets that contain the same variables but add new cases. This is easily done with bind_rows. Let’s say I...continue reading.

A is for arrange

The arrange function allows you to sort a dataset by one or more variables, either ascending or descending. This function is especially helpful if you plan on aggregating your data...continue reading.

New Color Palette for R

As I was preparing some graphics for a presentation recently, I started digging into some of the different color palette options. My motivation was entirely about creating graphics that weren’t...continue reading.