R: dplyr - "Variables not shown"
I recently ran into a problem where the result of applying some operations to a data frame wasn’t being output the way I wanted.
I started with this data frame:
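library(dplyr)

# Build a vector of random lowercase words, each lengthOfWord characters long
words = function(numberOfWords, lengthOfWord) {
  w = character(numberOfWords)
  for(i in seq_len(numberOfWords)) {
    w[i] = paste(sample(letters, lengthOfWord, replace=TRUE), collapse = "")
  }
  w
}

numberOfRows = 100
# a and b hold only 10 values each, so data.frame() recycles them across all 100 rows
df = data.frame(a = sample(1:numberOfRows, 10, replace = TRUE),
                b = sample(1:numberOfRows, 10, replace = TRUE),
                name = words(numberOfRows, 10))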
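One detail of the setup is worth spelling out: a and b contain only 10 sampled values each while name has 100, so data.frame() recycles the shorter columns. That is why every (a, b) group ends up with at least 10 rows. Here is a minimal sketch of the same recycling on a smaller scale (the `small` data frame is made up for illustration and is not part of the original example):

```r
# `small` is a made-up example: `a` has 2 values, `name` has 6
small = data.frame(a = 1:2,
                   name = c("u", "v", "w", "x", "y", "z"))

# data.frame() repeats the shorter column until it matches the longest one
print(small$a)

# Each distinct value of `a` therefore appears 6 / 2 = 3 times
print(table(small$a))
```

Recycling only works when the longer length is an exact multiple of the shorter one, which holds for 100 and 10.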
I wanted to group the data frame by a and b and output a comma-separated list of the associated names. I started with this:
> df %>%
    group_by(a,b) %>%
    summarise(n = n(), words = paste(name, collapse = ",")) %>%
    arrange(desc(n)) %>%
    head(5)
Source: local data frame [5 x 4]
Groups: a
   a  b  n
1 19 90 10
2 24 36 10
3 29 20 10
4 29 80 10
5 62 54 10
Variables not shown: words (chr)
Unfortunately the words column has been excluded. I came across a Stack Overflow post which suggested that the print.tbl_df function is responsible: it only prints the columns that fit within the console width and lists the rest under "Variables not shown".
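To make the idea of width-based filtering concrete, here is a rough sketch in base R. The columns_that_fit helper and the `example` data frame are hypothetical, and this is not how dplyr implements it; it only illustrates why a very wide column like words gets dropped while a, b and n survive:

```r
# Hypothetical helper, NOT dplyr's code: which columns fit in `width`
# characters when printed side by side?
columns_that_fit = function(df, width = getOption("width")) {
  # Width of each column: the widest of its header and its values
  col_widths = vapply(names(df), function(col) {
    max(nchar(col), nchar(as.character(df[[col]])))
  }, integer(1))
  # Each column also costs one separating space
  used = cumsum(col_widths + 1)
  names(df)[used <= width]
}

# Made-up row shaped like the summarised output: words is 109 characters wide
example = data.frame(a = 19, b = 90, n = 10,
                     words = paste(rep("abcdefghij", 10), collapse = ","),
                     stringsAsFactors = FALSE)

print(columns_that_fit(example, width = 80))   # words does not fit
print(columns_that_fit(example, width = Inf))  # everything fits
```

With an infinite width nothing is ever filtered out, which is what the fixes described next rely on.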
Browsing the docs I found a couple of ways to override this behaviour:
> df %>%
    group_by(a,b) %>%
    summarise(n = n(), words = paste(name, collapse = ",")) %>%
    arrange(desc(n)) %>%
    head(5) %>%
    print(width = Inf)
or
> options(dplyr.width = Inf)
> df %>%
    group_by(a,b) %>%
    summarise(n = n(), words = paste(name, collapse = ",")) %>%
    arrange(desc(n)) %>%
    head(5)
And now we see this output instead:
Source: local data frame [5 x 4]
Groups: a
   a  b  n words
1 19 90 10 dfhtcgymxt,zpemxbpnri,rfmkksuavp,jxaarxzdzd,peydpxjizc,trdzchaxiy,arthnxbaeg,kjbpdvvghm,kpvsddlsua,xmysfcynxw
2 24 36 10 wtokzdfecx,eprsvpsdcp,kzgxtwnqli,jbyuicevrn,klriuenjzu,qzgtmkljoy,bonbhmqfaz,uauoybprrl,rzummfbkbx,icyeorwzxl
3 29 20 10 ebubytlosp,vtligdgvqw,ejlqonhuit,jwidjvtark,kmdzcalblg,qzrlewxcsr,eckfgjnkys,vfdaeqbfqi,rumblliqmn,fvezcdfiaz
4 29 80 10 wputpwgayx,lpawiyhzuh,ufykwguynu,nyqnwjallh,abaxicpixl,uirudflazn,wyynsikwcl,usescualww,bkvsowfaab,gfhyifzepx
5 62 54 10 beuegfzssp,gfmegjtrys,wkubhvnkkk,rkhgprxttb,cwsrzulnpo,hzkvjbiywc,gbmiupnlbw,gffovxwtok,uxadfrjvdn,aojjfhxygs
Much better!
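Another option, not mentioned in the Stack Overflow post as far as I can tell, is to convert the result back to a plain data frame with as.data.frame(). The result is then printed by base R's print.data.frame, which does not hide columns. A minimal sketch, assuming dplyr is installed and using a tiny made-up data frame (`tiny`) in place of df:

```r
library(dplyr)

# `tiny` is a made-up stand-in for the df used in this post
tiny = data.frame(a = c(1, 1, 2),
                  b = c(5, 5, 6),
                  name = c("x", "y", "z"))

result = tiny %>%
  group_by(a, b) %>%
  summarise(n = n(), words = paste(name, collapse = ","))

# Drop the dplyr table classes so base R's print.data.frame is used
plain = as.data.frame(result)
print(plain)
```

Since as.data.frame() is base R, this does not depend on dplyr's option names, which may differ between versions.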
About the author
I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.