R: dplyr - mutate with strptime (incompatible size/wrong result size)
Having worked out how to translate a string into a date, or NA if it wasn't in the appropriate format, the next thing I wanted to do was store the result of the transformation in my data frame.
I started off with this:
> data = data.frame(x = c("2014-01-01", "2014-02-01", "foo"))
> data
           x
1 2014-01-01
2 2014-02-01
3        foo
When I tried to do the date translation, I ran into the following error:
> data %>% mutate(y = strptime(x, "%Y-%m-%d"))
Error: wrong result size (11), expected 3 or 1
As I understand it, this error is telling us that we are trying to put a value into the data frame which represents 11 rows rather than 3 rows or 1 row.
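One way to see where that unexpected size might come from is to look inside the value strptime returns. The following is a minimal sketch in base R: it assumes the size dplyr reports is the length of the list that a POSIXlt object is stored as internally. The number of components can vary with R version and platform, so no count is asserted here, and the variable names are only for illustration.

```r
# Parse the same three strings used in the data frame
dates <- strptime(c("2014-01-01", "2014-02-01", "foo"), "%Y-%m-%d")

class(dates)              # "POSIXlt" "POSIXt"

# Strip the class to look at the underlying list of date-time components
components <- unclass(dates)
names(components)         # includes sec, min, hour, mday, mon, year, ...
length(components)        # number of components, not number of dates

# Each individual component holds one value per input string
length(components$year)   # 3
```

So the object describes three dates, but it is laid out as one list entry per date-time field rather than one entry per row.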
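It turns out that storing a POSIXlt object, which is what strptime returns, in a data frame isn't such a good idea! Internally it is a list of date-time components rather than a vector with one element per row. In this case we can use the as.character function to create a character vector which can be stored in the data frame: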
> data %>% mutate(y = strptime(x, "%Y-%m-%d") %>% as.character())
           x          y
1 2014-01-01 2014-01-01
2 2014-02-01 2014-02-01
3        foo       <NA>
We can then get rid of the NA row by filtering with the is.na function:
> data %>% mutate(y = strptime(x, "%Y-%m-%d") %>% as.character()) %>% filter(!is.na(y))
           x          y
1 2014-01-01 2014-01-01
2 2014-02-01 2014-02-01
And a final tweak so that we have 100% pipelining goodness:
> data %>%
    mutate(y = x %>% strptime("%Y-%m-%d") %>% as.character()) %>%
    filter(!is.na(y))
           x          y
1 2014-01-01 2014-01-01
2 2014-02-01 2014-02-01
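The pipeline above leaves y as a character vector. If a real date column is wanted instead, one possible variation (not what I did above, just a sketch) is to wrap the strptime call in as.Date, which yields a plain Date vector with one element per row. The stringsAsFactors argument and the result variable are my additions for illustration.

```r
library(dplyr)

data <- data.frame(x = c("2014-01-01", "2014-02-01", "foo"),
                   stringsAsFactors = FALSE)

# as.Date() converts the POSIXlt result into a Date vector, so the
# list-based POSIXlt object is never stored in the data frame itself
result <- data %>%
  mutate(y = as.Date(strptime(x, "%Y-%m-%d"))) %>%
  filter(!is.na(y))

class(result$y)   # "Date"
```

Unparseable strings such as "foo" still come through as NA, so the same is.na filter applies.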
About the author
Mark Needham is a Developer Relations Engineer for Neo4j, the world's leading graph database.