How can I optimize pgrouting for speed?

When faced with tasks like this, your primary objective is to be rational. Don't change parameters based on 'gut feeling'. While the gut seems to work for Hollywood, it does not for those of us who live in the real world. Well, at least not my gut ;-).

You should:

  1. establish a usable and repeatable metric (like the time required by a pgrouting query)

  2. save metric results in a spreadsheet and average them (discard best and worst). This will tell you if the changes you are making are going in the right direction

  3. monitor your server using top and vmstat (assuming you're on *nix) while queries are running and look for significant patterns: lots of I/O, high CPU, swapping, etc. If the CPU is waiting for I/O, try to improve disk performance (this should be easy, see below). If the CPU is instead at 100% without any significant disk activity, you have to find a way to improve the query (this is probably going to be harder).
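Steps 1 and 2 can be scripted instead of done by hand. Here is a minimal sketch in Python, assuming you collect the wall-clock timings yourself (for example with psql's \timing). The function name and the sample numbers are made up for illustration; they are not measurements.

```python
from statistics import mean


def trimmed_average(timings_ms):
    """Average the timings after discarding the single best and worst run."""
    if len(timings_ms) < 3:
        raise ValueError("need at least 3 runs to discard best and worst")
    ordered = sorted(timings_ms)
    return mean(ordered[1:-1])


# Made-up timings (milliseconds) for the same query, before and after a change.
before = [2950.0, 3100.0, 3020.0, 4800.0, 2990.0]
after = [2100.0, 2050.0, 2600.0, 2080.0, 1900.0]

print(f"before: {trimmed_average(before):.1f} ms")
print(f"after:  {trimmed_average(after):.1f} ms")
```

Run the same query several times per configuration change, so that one cold-cache outlier does not decide the comparison.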

For the sake of simplicity I assume network is not playing any significant role here.

Improving database performance

Upgrade to the latest Postgres version. Version 9 is so much better than previous versions. It is free, so you have no reason not to.

Read the book I recommended already here.

You really should read it. I believe the relevant chapters for this case are 5, 6, 10 and 11.

Improving disk performance

  1. Get an SSD drive and put the whole database on it. Read performance will most likely quadruple, and write performance should also improve radically.

  2. Assign more memory to Postgres. Ideally you should assign enough memory that the whole database (or at least the hottest part of it) can be cached in memory, but not so much that swapping occurs. Swapping is very bad. This is covered in the book cited in the previous paragraph.

  3. Disable atime on all the disks (add the noatime option to fstab).

Improving query performance

Use the tools described in the book cited above to trace your queries and find spots that are worth optimizing.

Update

After the comments I have looked at the source code for the stored procedure

https://github.com/pgRouting/pgrouting/blob/master/core/src/astar.c

and it seems that once the query has been tuned there is not much more room for improvement, as the algorithm runs completely in memory (and, unfortunately, on only one CPU). I'm afraid your only solution is to find a better/faster algorithm, or one that can run multithreaded, and then integrate it with Postgres, either by creating a library like pgrouting or by using some middleware to retrieve the data (and cache it, maybe) and feed it to the algorithm.

HTH


I have just the same problem and was about to ask on mailing lists, so thanks to everybody!

I am using Shooting Star with a million and a half rows on the routing table. It takes almost ten seconds to calculate it. With 20k rows it takes almost three seconds. I need Shooting Star because I need the turn restrictions.

Here are some ideas I'm trying to implement:

  • In the SQL that pgRouting uses to fetch the ways, use st_buffer so it doesn't fetch all ways, but just the "nearby" ones:

    select *
    from shortest_path_shooting_star(
        'SELECT rout.* FROM routing rout,
           (select st_buffer(st_envelope(st_collect(geometry)), 4) as geometry
              from routing where id = ' || source || ' or id = ' || target || ') e
          WHERE rout.geometry && e.geometry',
        source, target, true, true);

It improved the performance, but if the way needs to go outside the buffer, it can return a "no path found" error, so... big buffer? several calls increasing the buffer until it finds a way?
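The "several calls" idea could look roughly like this. It is only a sketch in Python: `find_route` is a hypothetical callable standing in for whatever runs the buffered query above with a given buffer size, and it is assumed to turn the "no path found" error into an empty list. The starting buffer, growth factor and upper limit are arbitrary assumptions.

```python
def route_with_growing_buffer(find_route, source, target,
                              start_buffer=4.0, factor=2.0, max_buffer=64.0):
    """Retry the routing query with a larger buffer until a path is found.

    find_route(source, target, buffer) is a hypothetical callable supplied by
    the caller; it must return a list of route rows, empty if no path exists
    inside the buffered area.
    """
    buffer = start_buffer
    while buffer <= max_buffer:
        rows = find_route(source, target, buffer)
        if rows:
            return rows, buffer
        buffer *= factor  # widen the search area and try again
    return [], None  # no path found within max_buffer
```

Each failed attempt still costs a full routing call, so whether this beats one big buffer probably depends on how often routes actually leave the small one. That is something to measure rather than guess.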

  • Fast routes cached

Like dassouki suggested, I will cache some "useful" routes so if the distance is too long, it can go through these fast routes and just have to find the way in and out of them.

  • Partition table by gis index

But I suppose that, if it goes to memory, it doesn't really matter... Should test it, anyway.

Please, keep posting if you find another idea.

Also, do you know if there is a compiled pgRouting build for Postgres 9?


We have just created a branch in git for a turn restricted shortest path @ https://github.com/pgRouting/pgrouting/tree/trsp

Sorry, no documentation yet, but if you ask questions on the pgRouting list, I hang out there and will respond. This code runs much faster than Shooting Star and is based on the Dijkstra algorithm.

-Steve