gnuplot slow when plotting large data set as animation

I suspect gnuplot is reading the whole file every time you plot, as opposite to read up to the line in question, then next line, then next, etc. One possible strategy is to separate your particles trajectory into different files, but specially it could help to remove the plot for by simply a plot plus a block selection with every, where instead of selecting the column for the particle you have your particles positions for the same time step in the same block.

Now your data looks something like this:

x1 y1 x2 y2 x3 y3 # Time step 1
x1 y1 x2 y2 x3 y3 # Time step 2

And gnuplot needs to read the file once for every time step and particle. If you structure the file as follows (note one blank line between blocks):

# Time step 1
x1 y1
x2 y2
x3 y3

# Time step 2
x1 y1
x2 y2
x3 y3

Then you don't need the plot for, instead just select the corresponding block with all the particles by inserting one extra semicolon in every:

set terminal wxt size 1000,600
k=999999
#N = 999 you don't need this anymore!
do for [i=0:k] {
plot "pos.txt" every :::i::i
}

The code above reads the file for every time step, rather than every time step and particle, and plots all the particles at once.


If performance is very critical, you may consider using a completely different data format. Although changing the format of the ASCII file gives a huge improvement, it scales badly, because gnuplot must always scan from the beginning of the data file in order to determine the position where to start at. I did some testing, and to plot the first 1000 frames it took me 60s, whereas the points 9000 to 10000 took 600s to plot.

You would need a data format which allows you to seek at any data set in constant time. In my thesis I saved all my experimental data (huge data sets) with hdf5, and then you can use the external utility h5totxt to extract the desired data set. Here, the position of the requested data set can be calculated without scanning the whole file, and the access time is independent of the frame number.

For testing I used the following python script to generate a test data file points.h5:

from numpy import random
import h5py
P = random.normal(size=(10000,1000,2))
f = h5py.File('points.h5', 'w')
f.create_dataset('points', data=P)

The gnuplot script for plotting is

set terminal wxt size 1000,600
k=9999
do for [i=0:9999]{
  plot sprintf("< h5totxt -s ' ' -x %d points.h5", i) using 1:2 ls 1 pt 7 ps 2 title sprintf("%d", i)
}

Now, plotting of 1000 frames takes 40s, no matter which frames you take (0-1000 or 9000-10000).