Finding pseudo nodes in free GIS software?

Here is a generic solution that you can implement with PostGIS or any other OGC-compliant software.

NOTE: as I said before, a key concept in FOSS and GIS is standardization: the best solutions adopt standards, such as the OGC ones.


Your problem is to "find pseudo nodes", but I think it goes a little further: "find non-pseudo nodes and join the lines that meet at pseudo nodes". My solution can be used for both.

OGC standards offer:

  • ST_Boundary(geom): to detect the nodes of the lines

  • ST_Dump(geom): to put each single node in a SQL table record.

  • ST_DWithin, ST_Equals, ST_SnapToGrid and ST_Snap: can be used to control the tolerance. I am using ST_DWithin.
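
To make the role of these functions concrete, here is a small sketch on literal geometries (the coordinates and the 0.01 tolerance are made up for illustration). It shows how ST_Boundary plus ST_Dump turns one line into its two end nodes, and how ST_DWithin differs from ST_Equals when two nodes almost coincide.

```sql
-- End nodes of a single line: one row per node.
SELECT ST_AsText(
         (ST_Dump(ST_Boundary('LINESTRING(0 0, 5 5, 10 0)'::geometry))).geom
       ) AS node;
-- returns two rows: POINT(0 0) and POINT(10 0)

-- Exact match versus match within a tolerance (0.01 is an arbitrary example value).
SELECT ST_Equals(a, b)        AS same_point,
       ST_DWithin(a, b, 0.01) AS within_tolerance
FROM (SELECT 'POINT(0 0)'::geometry     AS a,
             'POINT(0 0.001)'::geometry AS b) AS t;
-- returns: same_point = false, within_tolerance = true
```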

We can assume that your problem can be specified with these objects and properties:

  • there are only line segments (in a table linesegment), each represented by a LINESTRING geometry. I have not tested with MULTILINESTRING; if you have geometrytype=MULTILINESTRING, you can split and cast them to LINESTRINGs with ST_Dump and ST_LineMerge;

  • each line segment has a gid (geometry ID) and an idline (color ID).

So, the first step is to obtain the nodes that come from joining lines:
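
If your data is stored as MULTILINESTRING, the split mentioned above could look like the following sketch. The source table name multilines_src is hypothetical, and I assume it carries idline and the_geom columns; LATERAL needs PostgreSQL 9.3 or later. I have not run this against real data, so treat it as a starting point.

```sql
-- Hypothetical source table: multilines_src(idline, the_geom MULTILINESTRING).
-- ST_LineMerge sews touching parts together; ST_Dump then emits one
-- row per remaining single geometry.
CREATE TABLE linesegment AS
  SELECT m.idline, d.geom AS the_geom
  FROM multilines_src AS m
  CROSS JOIN LATERAL ST_Dump(ST_LineMerge(m.the_geom)) AS d;

-- Give every resulting segment its own geometry ID.
ALTER TABLE linesegment ADD COLUMN gid serial PRIMARY KEY;
```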

CREATE TABLE cache_bounds AS
  SELECT gid AS gid_seg, (ST_Dump(ST_Boundary(the_geom))).geom AS the_geom,
         gid AS color
         -- gid is a fallback: use your "color label" column here if the lines have one.
  FROM linesegment;
ALTER TABLE cache_bounds ADD COLUMN gid serial PRIMARY KEY;

-- array_distinct is not a built-in PostgreSQL function; define it first.
CREATE FUNCTION array_distinct(anyarray) RETURNS anyarray AS $$
  SELECT ARRAY(SELECT DISTINCT unnest($1))
$$ LANGUAGE sql IMMUTABLE;

CREATE TABLE cache_joinnodes AS
  -- Use your TOLERANCE instead of "1" in ST_DWithin and ST_Buffer.
  SELECT *, array_length(colors,1) AS ncolors FROM (
   SELECT gid, array_distinct(array_cat(a_colors,b_colors)) AS colors, the_geom FROM (
    SELECT 
      a.gid, array_agg(a.color) AS a_colors, array_agg(b.color) AS b_colors
      , ST_Buffer(a.the_geom,1) AS the_geom -- any geometry will do to represent the join point.
    FROM cache_bounds a, cache_bounds b 
    WHERE a.gid>b.gid AND ST_DWithin(a.the_geom,b.the_geom,1)
    -- use ST_Equals(a.the_geom,b.the_geom) if no tolerance is needed.
    GROUP BY a.gid, a.the_geom
   ) AS t
  ) AS t2;

NOTE: I am using cache tables because they are faster than views. Use "EXPLAIN SELECT ..." to check the estimated cost before running the query, because it can take a long time.
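
As a sketch of that check, the statements below add a spatial index and then ask the planner for its estimate of the self-join. The index name is hypothetical, and the tolerance "1" is the same placeholder used above. EXPLAIN alone only estimates; EXPLAIN ANALYZE would actually execute the query and report real timings.

```sql
-- Hypothetical index name. A GiST index lets ST_DWithin use an
-- index scan instead of comparing every pair of nodes.
CREATE INDEX cache_bounds_geom_idx ON cache_bounds USING gist (the_geom);
ANALYZE cache_bounds;

-- Show the plan and estimated cost without running the join.
EXPLAIN
SELECT a.gid, b.gid
FROM cache_bounds a, cache_bounds b
WHERE a.gid > b.gid AND ST_DWithin(a.the_geom, b.the_geom, 1);
```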

Here cycles and continuous (same color) lines are detected as ncolors=1 points, and the pseudo nodes as ncolors=2 points, so you get a layer with those points.

Your table of "good nodes" contains the original "bounding points" without the "pseudo nodes".
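
To look at the result, something like the following sketch may help. The view name vw_pseudo_nodes is hypothetical; the rest uses only the cache_joinnodes columns built above.

```sql
-- How many join points fall in each class.
SELECT ncolors, count(*) AS n_points
FROM cache_joinnodes
GROUP BY ncolors
ORDER BY ncolors;

-- The candidate pseudo nodes as their own layer (hypothetical view name).
CREATE VIEW vw_pseudo_nodes AS
  SELECT gid, colors, the_geom
  FROM cache_joinnodes
  WHERE ncolors = 2;
```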

CREATE VIEW vw_joinnodes_full AS
  SELECT b.*, j.ncolors
  FROM cache_joinnodes j INNER JOIN cache_bounds b 
       ON j.gid=b.gid;

CREATE TABLE cache_good_nodes AS
  SELECT *  
  FROM vw_joinnodes_full 
  WHERE ncolors=1 OR ncolors>2;

-- IF NEEDED ... CREATE VIEW vw_correct_linesegment AS ... 

Refractions Research has made a Line Cleaner tool that seems to do what you want.

Line Cleaner cleanses networks by simplifying complex, cyclical, very short and zero-length geometries, and removing pseudo-nodes and insignificant vertexes. Most significantly, in the cleansing phase, it is able to ensure that feature matches can be considered automatically


The source code can be found at GitHub.


Non-Free solution: FME + MRF + SmartCleaner transformer

Free solution: GRASS v.clean (the latest QGIS, 1.8.0, with GRASS tools is the easiest way to use it) and other topology cleaning tools