
Reeeeeeeeeeeegex!!!

What is regex? The official definition from Wikipedia is a sequence of characters that defines a search pattern. In plain English, it’s a way of describing the structure of a character string, and it’s the most difficult thing to remember. So this is a short, ever-evolving post to store all the important regex (regular expressions) I come across and expect to use often.

I use regex most often when I need to replace part of a string during data cleaning. My current favourite way to do this is stringr::str_replace_all, or str_replace when only the first match should change.
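To make the difference between the two functions concrete, here is a minimal sketch. The string `messy` is a made-up value for illustration, not taken from any real data set.

```r
library(stringr)

# Hypothetical messy value, invented for illustration
messy <- "2021_01_15"

# str_replace() swaps only the first match
str_replace(messy, "_", "-")
## [1] "2021-01_15"

# str_replace_all() swaps every match
str_replace_all(messy, "_", "-")
## [1] "2021-01-15"
```

For data cleaning the `_all` version is usually what you want, since stray characters rarely show up just once.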

Use \\ to escape special regex characters like *, \ and . so that they are matched literally.

library(stringr)

fake_word <- "Hell0, m! n@m* is T0ju"
str_replace(fake_word, "\\*", "X")
## [1] "Hell0, m! n@mX is T0ju"
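The same escaping applies to the other two characters mentioned above. This is a rough sketch rather than a complete guide; `file_name` and `win_path` are hypothetical values invented for the example.

```r
library(stringr)

# Hypothetical file name, invented for illustration
file_name <- "report.final.csv"

# Unescaped, "." matches any character, so the first character is replaced
str_replace(file_name, ".", "_")
## [1] "_eport.final.csv"

# Escaped, "\\." matches only a literal dot
str_replace_all(file_name, "\\.", "_")
## [1] "report_final_csv"

# A literal backslash needs four backslashes in the pattern:
# R reads "\\\\" as the two-character regex \\, which matches one backslash
win_path <- "C:\\data\\raw"
str_replace_all(win_path, "\\\\", "/")
## [1] "C:/data/raw"
```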

Or to replace everything from a certain pattern onwards (the match itself included):

str_replace(fake_word, "m.*", "")
## [1] "Hell0, "
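Note that the "m" itself disappears too. If you want to keep the pattern and only drop what follows it, one option is a lookbehind. This is a sketch under the assumption that your stringr version uses the ICU regex engine (the default), which supports `(?<=...)`.

```r
library(stringr)

fake_word <- "Hell0, m! n@m* is T0ju"

# str_remove() is shorthand for replacing the match with ""
str_remove(fake_word, "m.*")
## [1] "Hell0, "

# Lookbehind: match everything that comes after the first "m",
# but leave the "m" itself in place
str_replace(fake_word, "(?<=m).*", "")
## [1] "Hell0, m"
```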

My favourite resources

As always, here are some great resources I keep coming back to:

  1. RStudio cheatsheet

I also heard of an RStudio add-in that helps with this. I thought I had bookmarked the tweet, but I can’t find it right now.

And lastly, here’s Missy:
