In this series of blog posts introducing tidy eval, we’ve been looking at why tidy eval is important, and at terms like “quotation” and “quasiquotation”. The next step is to look at how we can write our own dplyr-style functions in R. This post will look at the following terms and functions: quosures, quo(), and enquo().

What is a quosure?

Quosures are a topic that comes up frequently when talking about tidy eval.
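As a rough illustration of the two functions named above, here is a minimal sketch using the built-in iris data. The helper count_by() and its argument names are hypothetical, chosen only for this example: quo() captures an expression you type directly, while enquo() captures the expression a caller supplied to a function argument.

```r
library(dplyr)
library(rlang)

# quo() captures an expression together with the environment it was typed in
species_quo <- quo(Species)
is_quosure(species_quo)

# count_by() is a hypothetical helper, named here only for illustration.
# enquo() captures whatever the caller passed as group_col, and !! unquotes
# it again inside group_by().
count_by <- function(data, group_col) {
  group_col <- enquo(group_col)
  data %>%
    group_by(!!group_col) %>%
    summarise(n = n())
}

species_counts <- count_by(iris, Species)
species_counts
```

Calling count_by(iris, Species) with a bare column name is what makes the helper feel like a dplyr function.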
In a previous entry, I introduced the concept of tidy eval. If you’re completely new to tidy eval and haven’t read that post yet, I’d suggest you go back to it before continuing, as this post will build upon the concepts I discussed there. To recap, tidy eval refers to the ‘special’ type of evaluation used by dplyr functions. Whereas in base R, you have to refer to the data frame in question if you want to return particular rows, this is not the case with dplyr functions.
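To make that contrast concrete, here is a small sketch on the built-in iris data. The object names are only illustrative.

```r
library(dplyr)

# Base R: the data frame has to be named again inside the condition
setosa_base <- iris[iris$Species == "setosa", ]

# dplyr: the column is referred to by its bare name, and filter() looks it
# up inside the data frame for you
setosa_dplyr <- filter(iris, Species == "setosa")

nrow(setosa_base)
nrow(setosa_dplyr)
```

Both calls select the same rows; the difference is only in how the column is referenced.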
I’m going to begin this post somewhat backwards, and start with the conclusion: tidy eval is important to anyone who writes R functions and uses dplyr and/or tidyr. I’m going to load a couple of packages, and then show you exactly why.

library(dplyr)
library(rlang)

Data wrangling with base R

Here’s an example function I have written in base R. Its purpose is to take a data set and extract the values from a single column that match a specific value, with both input and output being in data frame format.
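For a sense of what a function of that kind can look like, here is a hypothetical sketch in base R. It is not the original code from the post; the name filter_column() and its arguments are assumptions made for illustration.

```r
# Hypothetical sketch: filter_column() and its argument names are assumed.
# It takes a data frame, a column name given as a string, and a value, and
# returns a one-column data frame holding the matching entries.
filter_column <- function(data, column, value) {
  # which() drops NA comparisons instead of returning NA rows
  keep <- which(data[[column]] == value)
  # drop = FALSE keeps the result as a data frame rather than a vector
  data[keep, column, drop = FALSE]
}

setosa_only <- filter_column(iris, "Species", "setosa")
str(setosa_only)
```

Note that the column has to be passed as a string here, which is one of the places where base R and dplyr-style functions differ.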
I previously blogged about using tidy eval with dplyr::mutate, and found that post handy to refer back to. I still haven’t got round to having an in-depth look at the principles of tidy eval, so instead I’m continuing to explore problems as and when they come up. In this post, I’ll be taking a look at using tidy eval with dplyr::filter. Once again, I’ll be using the iris dataset to create examples that should be simple to follow.
I recently attended rstudio::conf, with my favourite talks being those which taught me new things that I am going to use in my day-to-day work. I attended and enjoyed Hadley Wickham’s talk, ‘Tidy eval: programming with dplyr, tidyr, and ggplot2’, although I got sidetracked trying to keep up with typing whilst listening. When I’m delivering training courses, this is the one thing I advise all attendees not to do - it’s so easy to miss important points whilst running code.
When I started my career in data science, I was in the common position of having familiarity with technologies like R, Python, and SQL, but much less with big data technologies. I remember feeling intimidated by big data; there were lots of different technologies named after animals or making some sort of pun I wasn’t clued up enough to understand. Flash forward 18 months and, with experience, some parts of the big data landscape felt a bit more familiar.