Trying to understand Python csv .next()

A csv.reader object is an iterator. It reads one line from the CSV file each time .next() is called. Here's the documentation: http://docs.python.org/2/library/csv.html. Because an iterator produces values one at a time, it can return values from a source that is too big to read all at once. Using a for loop with an iterator effectively calls .next() on each pass through the loop.


The csv.reader object is an iterator. An iterator is an object with a next() method that returns the next available value, or raises StopIteration when no values are left. The csv.reader returns its values row by row.
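To make that definition concrete, here is a minimal sketch of a hand-written iterator. The class name CountDown is made up for illustration and has nothing to do with the csv module. It defines __next__ (the Python 3 name) and aliases it to next (the Python 2 name used in this question), and it is driven with the built-in next() function, which works in both versions.

```python
class CountDown(object):
    """Hypothetical iterator that yields n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # Raise StopIteration once there is nothing left to return.
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value

    next = __next__  # Python 2 spelling of the same method


countdown = CountDown(3)
first = next(countdown)    # 3
second = next(countdown)   # 2
third = next(countdown)    # 1

try:
    next(countdown)        # no values left
    exhausted = False
except StopIteration:
    exhausted = True
```

Each call hands back exactly one value, and the fourth call raises StopIteration, which is the signal a for loop uses to stop.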

Iterators are how Python implements the for loop. At the beginning of the loop, the __iter__ method of the object being looped over is called; it must return an iterator. Then the next method of that iterator is called repeatedly, with each value stored in the loop variable, until next raises a StopIteration exception.
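The steps above can be sketched as a hand-written loop. This is only an approximation of what the interpreter does, and manual_for is a hypothetical helper name, but it shows the same sequence of calls: iter() once, then next() until StopIteration.

```python
def manual_for(iterable):
    """Collect values in the order a for loop would visit them."""
    seen = []
    iterator = iter(iterable)        # calls iterable.__iter__()
    while True:
        try:
            value = next(iterator)   # calls the iterator's next method
        except StopIteration:
            break                    # the for loop ends here
        seen.append(value)           # stands in for the loop body
    return seen


result = manual_for([0, 1, 2])       # [0, 1, 2]
```

Because the loop only ever asks for the next value, anything already consumed before the loop starts is never seen by it.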

In your example, by calling next before using the variable in the for loop, you remove the first value from the stream of values returned by the iterator.

You can see the same effect with simpler iterators:

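# Python 2 syntax; the comments show what each print statement outputs
iterator = iter([0, 1, 2, 3, 4, 5])
value = iterator.next()   # consumes the first item, 0
for v in iterator:
    print v,              # prints: 1 2 3 4 5
print                     # end the line started by "print v,"
print value               # prints: 0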

The header row is "skipped" as a result of calling next(). That's how iterators work.

When you loop over an iterator, its next() method is called each time. Each call advances the iterator. When the for loop starts, the iterator is already at the second row, and it goes from there on.
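Here is a small sketch of the same thing with csv.reader. The rows are made-up sample data, passed as a list of strings instead of an open file so the snippet is self-contained (csv.reader accepts any iterable of lines). It uses the built-in next(), which works in Python 2.6+ and Python 3.

```python
import csv

# Made-up sample data standing in for the lines of a CSV file.
lines = ["name,age", "alice,30", "bob,25"]

reader = csv.reader(lines)
header = next(reader)             # consumes the first row: ['name', 'age']

# The loop starts from wherever the iterator currently is,
# which is now the second row.
rows = [row for row in reader]    # [['alice', '30'], ['bob', '25']]
```

Note that the values come back as strings; the csv module does not convert them.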

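The next() method is documented under "Iterator Types" in the Python library reference, and the csv module documentation describes it for reader objects as csvreader.next().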

What's important is that csv.reader objects are iterators, just like the file object returned by open(). You can iterate over them, but they don't contain all of the lines (or any of the lines) at any given moment.
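One way to see this for yourself is to feed csv.reader from a generator that records each line as it is requested. This is an illustrative sketch: line_source and pulled are hypothetical names, and the data is made up.

```python
import csv

pulled = []  # records each line at the moment the reader asks for it


def line_source():
    """Hypothetical stand-in for a file: yields one line at a time."""
    for line in ["a,b", "1,2", "3,4"]:
        pulled.append(line)
        yield line


reader = csv.reader(line_source())
pulled_before = len(pulled)       # 0: creating the reader reads nothing

first_row = next(reader)          # ['a', 'b']
pulled_after_one = len(pulled)    # 1: only one line has been requested

remaining = list(reader)          # [['1', '2'], ['3', '4']]
```

The reader pulls a line from its source only when a row is asked for, which is why it can work through files that would not fit in memory.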