R: substr - Getting a vector of positions
I recently found myself writing an R script to extract parts of a string based on a start and end index, which is reasonably easy using the substr function (https://stat.ethz.ch/R-manual/R-devel/library/base/html/substr.html):
> substr("mark loves graphs", 0, 4)
[1] "mark"
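A small aside on the indexes: R strings are 1-indexed, and substr treats a start position below 1 as 1, which is why starting at 0 still returns the first four characters. The sketch below just checks that the two calls agree; the variable names are mine, for illustration only.

```r
# Illustrative names; only substr() comes from base R.
text <- "mark loves graphs"

# A start of 0 is treated as 1, so both calls select characters 1 to 4.
from_zero <- substr(text, 0, 4)
from_one <- substr(text, 1, 4)

print(identical(from_zero, from_one))
```

Using 1 as the start position is probably the clearer habit, since it matches how R counts characters.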
But what if we have a vector of start and end positions?
> substr("mark loves graphs", c(0, 6), c(4, 10))
[1] "mark"
Hmmm, that didn’t work as I expected! It turns out we actually need to use the substring function instead (it is documented on the same page, https://stat.ethz.ch/R-manual/R-devel/library/base/html/substr.html), which wasn’t initially obvious to me on reading the documentation:
> substring("mark loves graphs", c(0, 6, 12), c(4, 10, 17))
[1] "mark"   "loves"  "graphs"
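As far as I can tell from the documentation, the difference is in the length of the result: substr returns one value per element of the input string vector, so with a single string only the first start/stop pair is used, whereas substring recycles the string to match the longest argument. Here is a minimal sketch of that behaviour; the variable names are my own, and the rep() workaround is just one way to get the same result out of substr.

```r
# Illustrative names; substr(), substring() and rep() are base R.
text <- "mark loves graphs"
starts <- c(1, 6, 12)
ends <- c(4, 10, 17)

# substr() returns a vector the same length as `text` (here 1),
# so only the first start/end pair is used.
first_only <- substr(text, starts, ends)

# substring() recycles `text` to the length of the longest argument,
# giving one piece per start/end pair.
pieces <- substring(text, starts, ends)

# The same result with substr(): repeat the string once per pair.
pieces_via_substr <- substr(rep(text, length(starts)), starts, ends)

print(first_only)
print(pieces)
print(pieces_via_substr)
```

If the start and end positions live in a data frame, passing the two columns straight to substring should work the same way.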
Easy when you know how!
About the author
I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.