Python: Combinations of values on and off
In my continued exploration of Kaggle’s Spooky Authors competition, I wanted to run a GridSearch that turns different classifiers on and off to work out the best combination.
I therefore needed to generate combinations of 1s and 0s enabling different classifiers.
For example, if we had 3 classifiers we’d generate these combinations:
0 0 1
0 1 0
1 0 0
1 1 0
1 0 1
0 1 1
1 1 1
where...
- '0 0 1' means: classifier1 is disabled, classifier2 is disabled, classifier3 is enabled
- '0 1 0' means: classifier1 is disabled, classifier2 is enabled, classifier3 is disabled
- '1 1 0' means: classifier1 is enabled, classifier2 is enabled, classifier3 is disabled
- '1 1 1' means: classifier1 is enabled, classifier2 is enabled, classifier3 is enabled
...and so on. In other words, we need to generate the binary representation for all the values from 1 to 2^(number of classifiers) - 1. We start at 1 rather than 0 because 0 would mean that every classifier is disabled.
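To make that range concrete, here is a minimal sketch for 3 classifiers. The variable names are only for illustration; it lists the integers whose binary representations we are after and confirms there are 2^3 - 1 = 7 of them:

```python
num_classifiers = 3

# Every value from 1 up to and including 2^num_classifiers - 1
values = list(range(1, 2 ** num_classifiers))

print(values)       # [1, 2, 3, 4, 5, 6, 7]
print(len(values))  # 7
```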
We can write the following code fragments to calculate a 3 bit representation of different numbers:
>>> "{0:0b}".format(1).zfill(3)
'001'
>>> "{0:0b}".format(5).zfill(3)
'101'
>>> "{0:0b}".format(6).zfill(3)
'110'
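As an aside, the zero padding can also be folded into the format specification itself, which avoids the separate zfill call. This is a minimal sketch rather than the approach used in the rest of this post, and `to_bits` is a hypothetical helper name:

```python
def to_bits(value, width):
    # Build a spec such as "03b": binary, zero-padded to `width` characters
    return format(value, "0{}b".format(width))

print(to_bits(1, 3))  # 001
print(to_bits(5, 3))  # 101
print(to_bits(6, 3))  # 110
```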
We need an array of 0s and 1s rather than a string, so let’s use the list function to create our array and then cast each value to an integer:
>>> [int(x) for x in list("{0:0b}".format(1).zfill(3))]
[0, 0, 1]
Finally we can wrap that code inside a list comprehension:
def combinations_on_off(num_classifiers):
    return [[int(x) for x in list("{0:0b}".format(i).zfill(num_classifiers))]
            for i in range(1, 2 ** num_classifiers)]
And let’s check it works:
>>> for combination in combinations_on_off(3):
...     print(combination)
...
[0, 0, 1]
[0, 1, 0]
[0, 1, 1]
[1, 0, 0]
[1, 0, 1]
[1, 1, 0]
[1, 1, 1]
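As a cross-check on the output above, the same lists can be produced without any string formatting by using itertools.product from the standard library. This is a sketch of an alternative technique, not the approach taken in this post, and `combinations_on_off_product` is a hypothetical name:

```python
from itertools import product

def combinations_on_off_product(num_classifiers):
    # product([0, 1], repeat=n) yields tuples in the same order as counting
    # in binary; any(bits) skips the all-zeros tuple (everything disabled)
    return [list(bits)
            for bits in product([0, 1], repeat=num_classifiers)
            if any(bits)]

for combination in combinations_on_off_product(3):
    print(combination)
```

It should print the same seven lists, in the same order.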
What about if we have 4 classifiers?
>>> for combination in combinations_on_off(4):
...     print(combination)
...
[0, 0, 0, 1]
[0, 0, 1, 0]
[0, 0, 1, 1]
[0, 1, 0, 0]
[0, 1, 0, 1]
[0, 1, 1, 0]
[0, 1, 1, 1]
[1, 0, 0, 0]
[1, 0, 0, 1]
[1, 0, 1, 0]
[1, 0, 1, 1]
[1, 1, 0, 0]
[1, 1, 0, 1]
[1, 1, 1, 0]
[1, 1, 1, 1]
Perfect! We can now use this function to help work out which combinations of classifiers are needed.
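One way to apply each combination is to treat it as a mask over a list of classifiers. The sketch below assumes three made-up classifier labels (they are placeholders, not the ones from the competition) and uses itertools.compress to keep only the entries whose flag is 1:

```python
from itertools import compress

def combinations_on_off(num_classifiers):
    return [[int(x) for x in list("{0:0b}".format(i).zfill(num_classifiers))]
            for i in range(1, 2 ** num_classifiers)]

# Hypothetical classifier labels, used only for illustration
classifier_names = ["naive_bayes", "logistic_regression", "random_forest"]

# For each mask, keep the names whose corresponding flag is 1
selections = [list(compress(classifier_names, mask))
              for mask in combinations_on_off(len(classifier_names))]

for selection in selections:
    print(selection)
```

In a real grid search the strings would be replaced by the classifier objects themselves, or the masks passed along as parameters.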
About the author
I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.