Python: Scoping variables to use with timeit

I’ve been playing around with Python’s timeit library to help benchmark some Neo4j Cypher queries, but I ran into some problems when trying to give it access to variables in my program.

I had the following python script which I would call from the terminal using python top-away-scorers.py:

import query_profiler as qp

attempts = [
{"query": '''MATCH (player:Player)-[:played]->stats-[:in]->game, stats-[:for]->team
             WHERE game<-[:away_team]-team
             RETURN player.name, SUM(stats.goals) AS goals
             ORDER BY goals DESC
             LIMIT 10'''}
]

qp.profile(attempts, iterations=5, runs=3)

query_profiler initially read like this:

from py2neo import neo4j
import re
import timeit

graph_db = neo4j.GraphDatabaseService()

def run_query(query, params):
	query = neo4j.CypherQuery(graph_db, query)
	return query.execute(**params).data

def profile(attempts, iterations=10, runs=3):
	print ""

	for attempt in attempts:
		query = attempt["query"]
		potential_params = attempt.get("params")

		params = {} if potential_params is None else potential_params

		timings = timeit.repeat("run_query(query, params)", setup="from query_profiler import run_query", number=iterations, repeat=runs)

		print re.sub('\n[ \t]', '\n', re.sub('[ \t]+', ' ', query))
		print timings

but when I ran top-away-scorers.py I got the following exception:

$ python top-away-scorers.py

Traceback (most recent call last):
  File "top-away-scorers.py", line 11, in <module>
    qp.profile(attempts, iterations=5, runs=3)
  File "/Users/markhneedham/code/cypher-query-tuning/query_profiler.py", line 19, in profile
    timings = timeit.repeat("run_query(query, params)", setup="from query_profiler import run_query", number=iterations, repeat=runs)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 233, in repeat
    return Timer(stmt, setup, timer).repeat(repeat, number)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 221, in repeat
    t = self.timeit(number)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 194, in timeit
    timing = self.inner(it, self.timer)
  File "<timeit-src>", line 6, in inner
NameError: global name 'query' is not defined

As far as I understand, timeit doesn’t know what 'query' is because it compiles and runs the statement in its own namespace. It doesn’t automatically pick up values from the scope it’s called from, so any name we don’t bring in through setup is undefined.
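The same failure can be reproduced without Neo4j. The following is a minimal sketch: profile_locals and the query string are made-up names for illustration, and the statement being timed is just len(query).

```python
import timeit

def profile_locals():
    # 'query' is local to this function; timeit never sees it
    query = "MATCH (n) RETURN n"
    try:
        # The statement string is compiled and run inside timeit's own namespace
        timeit.timeit("len(query)", number=1)
    except NameError as e:
        return str(e)
    return None

error_message = profile_locals()
```

error_message ends up holding the NameError text, which names 'query' as undefined (the exact wording differs between Python 2 and Python 3).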

I tried adding query and params to the list of imports from query_profiler like so:

def profile(attempts, iterations=10, runs=3):
	print ""

	for attempt in attempts:
		query = attempt["query"]
		potential_params = attempt.get("params")

		params = {} if potential_params is None else potential_params

		timings = timeit.repeat("run_query(query, params)", setup="from query_profiler import run_query, query, params", number=iterations, repeat=runs)

		print re.sub('\n[ \t]', '\n', re.sub('[ \t]+', ' ', query))
		print timings

Unfortunately that didn’t work:

$ python top-away-scorers.py

Traceback (most recent call last):
  File "top-away-scorers.py", line 11, in <module>
    qp.profile(attempts, iterations=5, runs=3)
  File "/Users/markhneedham/code/cypher-query-tuning/query_profiler.py", line 21, in profile
    timings = timeit.repeat("run_query(query, params)", setup="from query_profiler import run_query, query, params", number=iterations, repeat=runs)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 233, in repeat
    return Timer(stmt, setup, timer).repeat(repeat, number)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 221, in repeat
    t = self.timeit(number)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 194, in timeit
    timing = self.inner(it, self.timer)
  File "<timeit-src>", line 3, in inner
ImportError: cannot import name query
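My reading of this error is that from ... import only finds names defined at module level, and query only exists as a local variable inside profile. Here is a small sketch of that behaviour using a throwaway in-memory module; demo_profiler is a hypothetical name standing in for query_profiler.

```python
import sys
import types

# Build a throwaway module in memory (hypothetical name "demo_profiler")
demo = types.ModuleType("demo_profiler")
source = (
    "def profile():\n"
    "    query = 'MATCH (n) RETURN n'  # local to profile()\n"
    "    return query\n"
)
exec(source, demo.__dict__)
sys.modules["demo_profiler"] = demo

# Calling the function does not create a module-level 'query'
demo.profile()

try:
    from demo_profiler import query
    import_error = None
except ImportError as e:
    import_error = str(e)

has_module_level_query = hasattr(demo, "query")
```

The import fails because demo_profiler has no attribute called query, which hasattr confirms.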

I eventually came across the global keyword, which makes query and params module-level names in query_profiler so that the setup statement can import them:

def profile(attempts, iterations=10, runs=3):
	print ""

	for attempt in attempts:
		global query
		query = attempt["query"]
		potential_params = attempt.get("params")

		global params
		params = {} if potential_params is None else potential_params

		timings = timeit.repeat("run_query(query, params)", setup="from query_profiler import run_query, query, params", number=iterations, repeat=runs)

		print re.sub('\n[ \t]', '\n', re.sub('[ \t]+', ' ', query))
		print timings

I’m generally wary of using anything global, but in this case it seems necessary... or I’ve completely misunderstood how you’re meant to use timeit.
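One possible way to avoid the globals is to hand timeit a callable instead of a string, since timeit.repeat accepts either. The following is only a sketch of that idea: run_query here is a hypothetical stand-in that doesn’t talk to Neo4j, and profile returns its timings rather than printing them.

```python
import timeit
from functools import partial

def run_query(query, params):
    # Hypothetical stand-in for the real py2neo call
    return (query, sorted(params.items()))

def profile(attempts, iterations=10, runs=3):
    results = []
    for attempt in attempts:
        query = attempt["query"]
        params = attempt.get("params") or {}

        # partial binds the local values, so no setup string or globals are needed
        timed_call = partial(run_query, query, params)
        timings = timeit.repeat(timed_call, number=iterations, repeat=runs)

        results.append((query, timings))
    return results

results = profile([{"query": "MATCH (n) RETURN count(n)"}], iterations=5, runs=3)
```

Each entry in results pairs a query with one timing per run. Going through a callable adds a little function-call overhead to every iteration, which should be negligible next to a database round trip. On Python 3.5 and later, the timeit functions also take a globals argument that sets the namespace the statement string runs in.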

Any Pythonistas able to shed some light?
