Python: Equivalent to flatMap for flattening an array of arrays
I found myself wanting to flatten an array of arrays while writing some Python code earlier this afternoon and, being lazy, I first built the flattened array manually:
episodes = [
    {"id": 1, "topics": [1,2,3]},
    {"id": 2, "topics": [4,5,6]}
]
flattened_episodes = []
for episode in episodes:
    for topic in episode["topics"]:
        flattened_episodes.append({"id": episode["id"], "topic": topic})
for episode in flattened_episodes:
    print episode
If we run that we’ll see this output:
$ python flatten.py
{'topic': 1, 'id': 1}
{'topic': 2, 'id': 1}
{'topic': 3, 'id': 1}
{'topic': 4, 'id': 2}
{'topic': 5, 'id': 2}
{'topic': 6, 'id': 2}
What I was really looking for was the Python equivalent of the flatMap function, which I learnt can be achieved with a list comprehension like so:
flattened_episodes = [{"id": episode["id"], "topic": topic}
                      for episode in episodes
                      for topic in episode["topics"]]
for episode in flattened_episodes:
    print episode
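The order of the `for` clauses in the comprehension mirrors the nested loops from the first attempt: the outer loop is written first, the inner loop second. Here is a minimal sketch of that ordering, written in Python 3 syntax (an assumption, since the code above is Python 2) and using a hypothetical `nested` list that is not part of the original example:

```python
# Hypothetical data for illustration only; not from the episodes example.
nested = [[1, 2, 3], [4, 5, 6]]

# Clauses read left to right like nested for loops: outer first, inner second.
flat = [item for inner in nested for item in inner]

# The leading expression does the "map" part while the clauses do the "flatten".
doubled = [item * 2 for inner in nested for item in inner]

print(flat)
print(doubled)
```

Swapping the two `for` clauses would raise a `NameError` on a fresh interpreter, because `inner` would be referenced before it is bound.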
We could also choose to use itertools, in which case we’d have the following code:
from itertools import chain, imap
flattened_episodes = chain.from_iterable(
    imap(lambda episode: [{"id": episode["id"], "topic": topic}
                          for topic in episode["topics"]],
         episodes))
for episode in flattened_episodes:
    print episode
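One thing worth knowing about this version is that `chain.from_iterable` returns a lazy iterator rather than a list, so it can only be looped over once. The sketch below illustrates that behaviour. It is written in Python 3 syntax and swaps the `imap` and lambda pair for a generator expression, both of which are my assumptions rather than part of the code above:

```python
from itertools import chain

episodes = [
    {"id": 1, "topics": [1, 2, 3]},
    {"id": 2, "topics": [4, 5, 6]},
]

# chain.from_iterable yields items on demand; nothing is computed yet.
flattened = chain.from_iterable(
    [{"id": episode["id"], "topic": topic} for topic in episode["topics"]]
    for episode in episodes
)

first_pass = list(flattened)   # consumes the iterator
second_pass = list(flattened)  # the iterator is now exhausted

print(len(first_pass))
print(len(second_pass))
```

If the flattened result is needed more than once, wrapping the call in `list(...)` up front is probably the simplest option.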
We can then simplify this approach a little by wrapping it up in a 'flatmap' function:
def flatmap(f, items):
    return chain.from_iterable(imap(f, items))
flattened_episodes = flatmap(
    lambda episode: [{"id": episode["id"], "topic": topic} for topic in episode["topics"]], episodes)
for episode in flattened_episodes:
    print episode
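The code above targets Python 2. `itertools.imap` does not exist in Python 3, where the built-in `map` is already lazy, so a rough Python 3 sketch of the same helper might look like the following. The `list(...)` wrapper is my own addition, so that the result can be reused:

```python
from itertools import chain


def flatmap(f, items):
    # In Python 3 the built-in map is lazy, so it stands in for itertools.imap.
    return chain.from_iterable(map(f, items))


episodes = [
    {"id": 1, "topics": [1, 2, 3]},
    {"id": 2, "topics": [4, 5, 6]},
]

# Materialise the lazy result so it can be iterated more than once.
flattened_episodes = list(flatmap(
    lambda episode: [{"id": episode["id"], "topic": topic}
                     for topic in episode["topics"]],
    episodes))

for episode in flattened_episodes:
    print(episode)
```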
I think the list comprehension approach still works well here, but I need to look into itertools more - it looks like it could be useful for other list operations.
About the author
I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.