One of the key steps in making “improvements” on your R journey is to write code that is more clear, succinct, concise and short; for me, this usually occurs when I need to create multiple plots of variables when exploring a dataset.
I find myself usually making simple bar plots using ggplot and geom_col() to count things. Instead of copying-and-pasting the same ggplot code and altering the the column names in the code like a newbie hack, I recently learned about double curly braces “{{ }}” which allow you to dynamically pass unquoted variable names within functions.
First, we’ll use the palmerpenguins package to make some bar plots (Yeah, I am fully aware that I should’ve worked with some Spotify/Apple Music Bay Area music datasets to make this blog post even more hyphy). I examine the penguins dataset to see which columns are categorical/discrete and numerical/continuous:
An easy first look at exploring a dataset is to simply count the number of items in a variable. I want to determine the counts across the categorical columns and make a bar plot for each. That happens to be the species, island and sex columns. Here’s the long way how to do it:
In the code demonstrated above, I realize that all I am just changing is the name of the columns (species, island, sex) to create the three different plots. The rule of thumb is to succint programming is to avoid duplication of code – twice is fine but thrice is too much! Here, we can attempt to create a function to shorten the number of lines written. We create a function (I named it geomcol_discrete) with the nifty use of the double curly brackets { column } to maintain a tidyverse work flow:
I also created a function to plot histograms for the continuous data geomhist_continuous:
Code
# function - continuous plotsgeomhist_continuous<-function(tbl, column){tbl%>%drop_na()%>%ggplot(aes(x ={{column}}, fill =species))+geom_histogram(alpha =0.8)}# now we quickly plotpenguins%>%geomhist_continuous(bill_length_mm)penguins%>%geomhist_continuous(bill_depth_mm)penguins%>%geomhist_continuous(flipper_length_mm)penguins%>%geomhist_continuous(body_mass_g)
The next step further would be to create a vector of column names so that I can loop the functions. Stay tuned for a future update. As the Bay Area hip-hop lingo would dictate, We out here tryin’ to function!