R: Mapping over a list of lists
As part of the Coursera Data Analysis course, I had the following code to download and then read in a file:
> file <- "https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv"
> download.file(file, destfile="americancommunity.csv", method="curl")
> acomm <- read.csv("americancommunity.csv")
We then had to filter the data based on the values in a couple of columns and work out how many rows were returned in each case:
> one <- acomm[acomm$RMS == 4 & !is.na(acomm$RMS)
& acomm$BDS == 3 & !is.na(acomm$BDS), c("RMS")]
> one
[1] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
...
[137] 4 4 4 4 4 4 4 4 4 4 4 4
> two <- acomm[acomm$RMS == 5 & !is.na(acomm$RMS)
& acomm$BDS == 2 & !is.na(acomm$BDS), c("RMS")]
> two
[1] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
...
[375] 5 5 5 5 5 5 5 5 5 5 5 5
> three <- acomm[acomm$RMS == 7 & !is.na(acomm$RMS)
& acomm$BDS == 2 & !is.na(acomm$BDS), c("RMS")]
> three
[1] 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7
[36] 7 7 7 7 7 7 7 7 7 7 7 7 7 7
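The `!is.na()` checks in the filters above matter because comparing `NA` with `==` yields `NA`, and indexing with an `NA` condition returns an `NA` entry rather than dropping the row. A minimal sketch of that behaviour, assuming a small made-up data frame (`df` and its values are hypothetical, not the course data):

```r
# Made-up data frame standing in for the survey data (not the real file)
df <- data.frame(RMS = c(4, 4, NA, 5), BDS = c(3, NA, 3, 2))

# Without NA checks, rows whose condition is NA come back as NA entries
unguarded <- df[df$RMS == 4 & df$BDS == 3, c("RMS")]
length(unguarded)   # 3: one real match plus two NA entries

# With the NA checks, only the genuine match is kept
guarded <- df[df$RMS == 4 & !is.na(df$RMS)
              & df$BDS == 3 & !is.na(df$BDS), c("RMS")]
length(guarded)     # 1
```

Without the guards, the count would be inflated by the `NA` entries.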
So I needed to know how many values were in the variables one, two and three.
I thought I could probably put those vectors into a list and then use apply (http://www.math.montana.edu/Rweb/Rhelp/apply.html) or one of its variants to get the length of each one.
I usually use the c function (http://www.math.montana.edu/Rweb/Rhelp/c.html) to combine values, but it’s not helpful in this case because it concatenates all the values into one massive vector.
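To illustrate the flattening, here is a small sketch using made-up vectors (`a` and `b` are hypothetical stand-ins for the much longer filtered results):

```r
# Made-up stand-ins for the filtered vectors (not the course data)
a <- c(4, 4, 4)
b <- c(5, 5)

# c() concatenates its arguments into one flat vector
combined <- c(a, b)
combined            # 4 4 4 5 5
length(combined)    # 5: the boundary between a and b is lost
is.list(combined)   # FALSE: still an atomic vector, not a list
```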
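Calling lapply on that concatenated vector doesn’t have the intended outcome, because each of the 583 individual values is treated as its own element of length 1:
> lapply(c(one, two, three), length)
...
[[582]]
[1] 1
[[583]]
[1] 1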
Instead what we need is the list function (http://www.math.montana.edu/Rweb/Rhelp/list.html), which keeps each vector as a separate element:
> lapply(list(one, two, three), length)
[[1]]
[1] 148
[[2]]
[1] 386
[[3]]
[1] 49
Et voila!
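If a flat result is easier to read than a list, one of the lapply variants can help. This is only a sketch under the assumption that naming the list elements is acceptable; the vectors below are made up rather than taken from the course data:

```r
# Made-up stand-ins for one, two and three (not the course data)
groups <- list(one = c(4, 4, 4), two = c(5, 5), three = c(7))

# sapply() simplifies the result to a named vector instead of a list
counts <- sapply(groups, length)
counts              # one = 3, two = 2, three = 1

# lengths() (base R 3.2.0 and later) gives the same counts directly
lengths(groups)
```

The names on the list carry through to the result, which makes it easier to see which count belongs to which filter.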
The code is on GitHub as usual.