Maintaining Order and Repetition of Cell Names using PostGIS?

For get all the visited grid cells in order, even if the grid cell is revisited, I wrote the SQL query below :

SELECT tr.tr_id,
ce.grid_id
FROM cells ce, traj tr
ORDER BY ST_LineLocatePoint(
    tr.traj_path,
    ST_CENTROID(
        (ST_DUMP(
            ST_Intersection(ce.coordinates, tr.traj_path)
        )).geom
    )
);

Explanations :

The big part is in the ORDER clause, the rest is for the selection.

  • ST_INTERSECTION (doc) : for return traj line layer geometry clipped by cells polygon layer geometries.

  • ST_DUMP (doc) with .geom : for return the geometry part of the ST_DUMP set, for have only single part geometries. A trajectory that revisit many times a grid cell, clipped by the grid cells, gives only one multi-part geometry.

  • ST_CENTROID (doc) : for return the centroid of the clipped line part, the latter being in the grid cell (pay attention to the trajectories that follow the cell boundaries)

  • ST_LineLocatePoint (doc) : for return the proportion of the distance of the location of the closest point on the trajectory of the clipped part centroid.

EDIT

Following the good remark from @ThingumaBob, for avoiding repetitions of grid cells which contains self intersections of a trajectory (ST_INTERSECTION split the trajectory if it self intersects), I rewrote the SQL code :

WITH t1 AS (
  SELECT tr.tr_id,
  ce.grid_id,
  ST_LineLocatePoint(
      tr.traj_path,
      ST_CENTROID(
          (ST_DUMP(
              ST_Intersection(ce.coordinates, tr.traj_path)
          )).geom
      )
  ) AS distance
  FROM cells ce, traj tr
),

t2 AS (
  SELECT t1.tr_id,
  t1.grid_id,
  COALESCE(LEAD(t1.grid_id) OVER(ORDER BY t1.tr_id, t1.distance), -1) AS next_grid_id
  FROM t1
)

SELECT t2.tr_id,
t2.grid_id
FROM t2
WHERE t2.grid_id <> t2.next_grid_id
;

Explanations 2 :

  • Subquery t1 : this is the query explained above, the ORDER part is now in the SELECT for use in other query parts.

  • Subquery t2 (from t1) (doc) : with the window function LEAD, this subquery returns the next grid cell id in the distance order. And the last trajectory part has no grid cell next id, hence the use of COALESCE for replace the NULL value with -1 (which is, normally, an impossible id, but not NULL).

  • Final select from t2 : returns all records that have a grid cell id different from the next grid cell id (an even grid cell id means a same trajectory part in the same grid cell, means a trajectory self-intersection).


You should be able to order by the fraction of line-length at which each cells centroid projects onto the line:

SELECT cell_names
FROM   (
  SELECT ce.cell_names,
         ST_LineLocatePoint(tr.traj_path, ST_Centroid(ce.coordinates)) AS frac
  FROM   cells AS ce
  JOIN   traj AS tr
    ON   ST_Intersects(ce.coordinates, tr.traj_path)
) q
ORDER BY
       frac
;

Those cells need to be a regular grid. There may be rare edge cases, though, where this fails.