Reeeeeeeeeeeegex!!!
What is regex? The official definition from Wikipedia is a sequence of characters that defines a search pattern. In plain English, it’s a way of describing the structure of a character string, and it’s the most difficult thing to remember. So this is a short and ever-evolving post to store all the important regex, or regular expressions, I come across and expect to use often.
I use regex most often when I need to replace some string during some data cleaning. My current favourite way to do this is to use stringr::str_replace_all.
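Use \\ to escape (reference literally) special characters like *, \ and . in an R regex. Inside an R string, \\ stands for a single backslash, which is what the regex engine needs to see in front of the special character.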
library(stringr)
fake_word <- "Hell0, m! n@m* is T0ju"
str_replace(fake_word, "\\*", "X")
## [1] "Hell0, m! n@mX is T0ju"
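A slightly fuller sketch of the same idea, in case it helps. It assumes stringr is installed; file_name and win_path are made-up values for illustration only. It shows the difference between str_replace (first match only) and str_replace_all (every match), and a few ways of escaping.

```r
library(stringr)

fake_word <- "Hell0, m! n@m* is T0ju"

# str_replace() changes only the first match, str_replace_all() changes every match
first_only  <- str_replace(fake_word, "0", "o")
every_match <- str_replace_all(fake_word, "0", "o")

# An unescaped "." matches any character, so escape it to hit only literal dots
file_name <- "data.v1.csv"                      # hypothetical file name
no_dots   <- str_replace_all(file_name, "\\.", "_")

# A literal backslash needs four backslashes in the R pattern:
# the R string "\\\\" is two characters, which the regex reads as one escaped backslash
win_path   <- "C:\\temp"                        # hypothetical path, prints as C:\temp
unix_style <- str_replace_all(win_path, "\\\\", "/")

# fixed() skips regex interpretation entirely, so nothing needs escaping
no_star <- str_replace_all(fake_word, fixed("*"), "X")
```

Wrapping the pattern in fixed() is probably the easiest option when the whole pattern is meant literally.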
To remove everything from a certain pattern onwards (the pattern itself included):
str_replace(fake_word, "m.*", "")
## [1] "Hell0, "
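The "m.*" pattern above removes the first "m" as well, and because .* is greedy it runs to the end of the string. Two variations I find handy, sketched here on the same fake_word and assuming stringr's default regex engine (ICU), which supports lookbehind and lazy quantifiers:

```r
library(stringr)

fake_word <- "Hell0, m! n@m* is T0ju"

# Lookbehind: keep the "m!" itself and drop only what comes after it
keep_marker <- str_replace(fake_word, "(?<=m!).*", "")

# Lazy quantifier: ".*?" stops at the next "m" instead of running to the end
lazy_cut <- str_replace(fake_word, "m.*?m", "")
```

Here keep_marker should come out as "Hell0, m!" and lazy_cut as "Hell0, * is T0ju".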
My favourite resources
As always, here are some great resources I keep coming back to:
I also heard of an RStudio add-in that helps you with this and I thought I bookmarked the tweet but can’t find it right now.
And lastly, here’s Missy: