## Quoting in R

Many R users appear to be big fans of “code capturing” or “non-standard evaluation” (NSE) interfaces. In this note we will discuss quoting and non-quoting interfaces in R. The...continue reading.

This note is just a quick follow-up to our last note on correcting the bias in estimated standard deviations for binomial experiments. For normal deviates there is, of course, a...continue reading.

This note is about attempting to remove the bias brought in by using sample standard deviation estimates to estimate an unknown true standard deviation of a population. We establish there...continue reading.

R is designed to make working with statistical models fast, succinct, and reliable. For instance, building a model is a one-liner: `model <- lm(Petal.Length ~ Sepal.Length, data = iris)` And...continue reading.

coalesce is a classic useful SQL operator that picks the first non-NULL value in a sequence of values. We thought we would share a nice version of it for picking...continue reading.
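To make the idea concrete, here is a minimal base-R sketch of an element-wise coalesce. It is not the version the note goes on to share: the helper name `coalesce_na` is hypothetical, and treating `NA` as the R analogue of SQL `NULL` is an assumption made for illustration.

```r
# Hypothetical helper (not the authors' implementation): element-wise
# coalesce that keeps the first non-NA value across equal-length vectors.
coalesce_na <- function(...) {
  args <- list(...)
  result <- args[[1]]
  for (candidate in args[-1]) {
    # Only positions that are still missing get filled from the next vector.
    missing_idx <- is.na(result)
    result[missing_idx] <- candidate[missing_idx]
  }
  result
}

a <- c(1, NA, NA)
b <- c(10, 20, NA)
fallback <- c(0, 0, 0)

# Each position takes the first non-NA value found in a, then b, then fallback.
filled <- coalesce_na(a, b, fallback)
print(filled)
```

Earlier arguments take priority, so a final vector with no missing entries acts as a default.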

We have our latest note on the theory of data wrangling up here. It discusses the roles of “block records” and “row records” in the cdata data transform tool. With...continue reading.

One of the concepts we teach in both Practical Data Science with R and in our theory of data shaping is the importance of identifying the roles of columns in...continue reading.

R is an interpreted programming language with vectorized data structures. This means a single R command can ask for very many arithmetic operations to be performed. This also means R...continue reading.

Authors: John Mount and Nina Zumel, 2018-10-25. As a follow-up to our previous post, this post goes a bit deeper into reasoning about data transforms using the cdata package. The...continue reading.

In August of 2003 Thomas Lumley added bquote() to R 1.8.1. This gave R and R users an explicit Lisp-style quasiquotation capability. bquote() and quasiquotation are actually quite powerful. Professor...continue reading.
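As a small illustration of what quasiquotation with `bquote()` looks like in practice, the sketch below builds an expression from values held in variables and then evaluates it against the built-in `iris` data. The variable names (`x_name`, `threshold`, `expr`, `n_rows`) are chosen here for illustration and do not come from the post.

```r
# bquote() quotes its argument, except for sub-expressions wrapped in .(),
# which are evaluated and spliced into the quoted result.
x_name <- as.name("Sepal.Length")  # a column name held as a symbol
threshold <- 5                     # an ordinary value to splice in

expr <- bquote(.(x_name) > .(threshold))
print(expr)

# The captured expression can be evaluated later, here using the columns
# of the built-in iris data frame as the evaluation environment.
n_rows <- sum(eval(expr, iris))
print(n_rows)
```

The point of the `.()` escape is that the column and cutoff are chosen by ordinary values at run time, while the rest of the expression stays unevaluated until `eval()` is called.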

In our wrapr pipe R Journal article we used piping into ggplot2 layers/geoms/items as an example. Being able to use the same pipe operator for data processing steps and for ggplot2...continue reading.

Saghir Bashir of ilustat recently shared a nice getting started with R and tidyverse guide. In addition they were generous enough to link to Dirk Eddelbuettel’s later adaptation of the...continue reading.

According to a KDD poll fewer respondents used only R in 2017 than in 2016. At the same time more respondents used only Python in 2017 than in 2016. And...continue reading.

Introduction: Let’s take a quick look at a very important and common experimental problem: checking if the difference in success rates of two Binomial experiments is statistically significant. This can...continue reading.

vtreat is a powerful R package for preparing messy real-world data for machine learning. We have further extended the package with a number of features including rquery/rqdatatable integration (allowing vtreat...continue reading.

Our interference from the environment issue was a bit subtle. But there are variations that can be a bit more insidious. Please consider the following. `library("dplyr")` # unrelated value that...continue reading.

It is no great secret: I like value oriented interfaces that preserve referential transparency. It is the side of the public debate I take in R programming. “One of the...continue reading.

I’ve ended up (almost accidentally) collecting a number of different solutions to the “use a column to choose values from other columns in R” problem. Please read on for a...continue reading.

We recently saw a great recurring R question: “how do you use one column to choose a different value for each row?” That is: how do you use a column...continue reading.
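One possible base-R answer to this kind of question, sketched under assumptions: the data frame `d`, its columns `x`, `y`, and `choice`, and the use of matrix indexing are all illustrative choices made here, not necessarily the solutions the posts describe.

```r
# Illustrative data: the "choice" column names which of the value columns
# (x or y) should be used for each row.
d <- data.frame(
  x = c(1, 2, 3),
  y = c(10, 20, 30),
  choice = c("x", "y", "x"),
  stringsAsFactors = FALSE
)

value_cols <- c("x", "y")

# Translate each row's choice into a column position among the value columns.
col_idx <- match(d$choice, value_cols)

# Index a matrix with a two-column (row, column) matrix to pick one cell per row.
m <- as.matrix(d[, value_cols])
d$chosen <- m[cbind(seq_len(nrow(d)), col_idx)]
print(d)
```

Matrix indexing avoids an explicit loop, but it does require the value columns to share a type, since `as.matrix()` coerces them to a common one.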

We are thrilled to announce our (my and Nina Zumel’s) paper on the dot-pipe has been accepted by the R Journal! A huge “thank you” to the reviewers and editors for...continue reading.