Tuesday, September 20, 2011

Python: range() vs. xrange()

I recently discovered a really cool Python module called timeit. It let's you efficiently time blocks of Python code. So I decided to see the comparisons between range() and xrange(). I decided to look at a few aspects between them - the time to create them, compute the length (len()), and to iterate over them. As a note, each data point was achieved by running the block of code 1,000 times.

As many other websites state, there are cases when to use both. This is not a tutorial of what case is best for both, but rather a peek inside to help better understand them.

 We can clearly see that using range() increases linearly, algorithmically this is O(n). Using xrange() runs in constant time, O(1). Understanding the inner workings of the two reveals why. When using range, it goes ahead and computes all the numbers in the list and stores that array. This will cause it to use more memory since it has to store all the values. xrange is different in that it will compute the next number in the list when it is requested, not when it is created.

Computing the length using the len() function can be approximated that it runs in linear time. Since the time to compute the length is fractions of a second, the consistency between testings varies (I ran this several times and nothing seemed to be very consistent). As a note, the creation of the range and xrange where not timed in the computing of the length.

Most of the time these are used in a loop, but this tested just the basic iteration. Again the creation of the range and xrange were not timed. Since the iteration runs in linear time, we can assume that single element access to them run in a constant time, which is what is expected for both. Since normal iteration in code usually involves creating the range or xrange in the for statement, it is best to use xrange, especially for large values.

The script to create this data can be downloaded here. (Python 2.7 was used, but may work under different versions.)

As a side note, it is faster to run "a,b = b,a" than to run "t = a; a = b; b = t" to swap variables!