What does grep do when it's not running the CPU?

When grep is not using the CPU, it is most likely waiting for disk I/O, and how long it waits is quite often a matter of the page cache.

The first time, the data has to be read (physically) from the disk.

The second time, the data is likely to still be sitting in the page cache (provided the file is not too big).

So you could first run a command like cat(1) to bring the (not too big) file into the page cache (i.e. into RAM); a subsequent grep(1) (or any other program reading the file) would then generally run faster.
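To see the effect yourself, here is a minimal sketch in Python that reads the same file twice and times each pass. The names `timed_read` and `compare_reads` are made up for illustration. Note that the first pass is only "cold" if the file has not been read or written recently; a file you just created is probably already in the page cache.

```python
import sys
import time


def timed_read(path, chunk_size=1 << 20):
    """Read the whole file sequentially; return (bytes_read, seconds)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def compare_reads(path):
    """Read the file twice; the second pass is usually served from the page cache."""
    first = timed_read(path)
    second = timed_read(path)
    return first, second


if __name__ == "__main__" and len(sys.argv) > 1:
    (size, t1), (_, t2) = compare_reads(sys.argv[1])
    print(f"{size} bytes: first read {t1:.4f}s, second read {t2:.4f}s")
```

On a file that was not cached, the second time is typically much smaller than the first, though the exact numbers depend on your disk, the file size, and how much RAM is free.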

(However, the data still has to be read from the disk at some point.)

See also readahead(2) and posix_fadvise(2), and perhaps madvise(2), sync(2) and fsync(2). These are sometimes useful in your application programs, but rarely in practice.
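As an illustration of posix_fadvise(2), here is a hedged sketch using Python's `os.posix_fadvise` wrapper to hint that a file will be needed soon. The helper name `prefetch` is hypothetical. The call is only advice: the kernel is free to ignore it, and the wrapper is not available on every platform, so the sketch checks for it first.

```python
import os


def prefetch(path):
    """Ask the kernel to start loading the file into the page cache.

    Returns True if the hint was issued, False if this platform
    does not provide posix_fadvise.
    """
    if not hasattr(os, "posix_fadvise"):
        return False
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset 0 and length 0 mean "from the start to the end of the file"
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
    finally:
        os.close(fd)
    return True
```

Calling `prefetch("some_file")` a little before the file is actually read gives the kernel a chance to overlap the disk I/O with other work; whether that helps in practice should be measured rather than assumed.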

Also read LinuxAteMyRAM.

BTW, this is why it is recommended to run a program several times when benchmarking it. It is also why buying more RAM can be useful, even if your programs do not need all of it for their own data: the kernel uses the spare RAM for the page cache.
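For the benchmarking advice, a minimal sketch of running a command several times and keeping every timing could look like this. The helper name `time_runs` is made up, and the grep invocation in the usage note is only an example.

```python
import subprocess
import time


def time_runs(cmd, runs=5):
    """Run cmd (a list of arguments) several times; return the wall-clock time of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded so that printing to the terminal does not skew the timing.
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        timings.append(time.perf_counter() - start)
    return timings
```

For example, `time_runs(["grep", "-c", "pattern", "some_file"])` returns a list of five timings. If the first one is clearly larger than the rest, the difference is probably the cost of reading the file from disk rather than anything grep itself does.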

If you want to understand more, read a book such as Operating Systems: Three Easy Pieces.