How to most efficiently find out if a record has child records?

Don't forget that Postgres has a boolean datatype. The following is the most succinct way to express the query:

select
  parent_id,
  name,
  exists (select from child where parent_id = p.parent_id) as has_children
from parent p;

https://dbfiddle.uk/?rdbms=postgres_10&fiddle=86748ba18ba8c0f31f1b77a74230f67b


Methods

Aggregate Method

The popular way we'll call it the aggregate method. Note bool_or(child_id IS NOT NULL) also works but was not any faster.

SELECT parent_id, count(*)>1 AS has_children
FROM parent
LEFT OUTER JOIN children
  USING (parent_id)
GROUP BY parent_id;

LEFT JOIN LATERAL with limit

But you may also try this, with LEFT JOIN LATERAL() like this..

SELECT parent_id, has_children
FROM parent AS p
LEFT JOIN LATERAL (
  SELECT true
  FROM children AS c
  WHERE c.parent_id = p.parent_id
  FETCH FIRST ROW ONLY
) AS t(has_children)
  ON (true);

EXISTS

Just FYI, you can use CROSS JOIN LATERAL with EXISTS too (which is I believe how it's planned). We'll call it the EXISTS method.

SELECT parent_id, has_children
FROM parent AS p
CROSS JOIN LATERAL (
  SELECT EXISTS(
    SELECT 
    FROM children AS c
    WHERE c.parent_id = p.parent_id
  )
) AS t(has_children);

Which is the same as,

SELECT parent_id, EXISTS(
    SELECT 
    FROM children AS c
    WHERE c.parent_id = p.parent_id
) AS has_children
FROM parent AS p;

Benchmarks

Sample dataset

1000000 children, 2500 parents. Our sims get it done.

CREATE TABLE parent (
  parent_id int PRIMARY KEY
);
INSERT INTO parent
  SELECT x
  FROM generate_series(1,1e4,4) AS gs(x);
CREATE TABLE children (
  child_id int PRIMARY KEY,
  parent_id int REFERENCES parent
);
INSERT INTO children
  SELECT x, 1 + (x::int%1e4)::int/4*4
  FROM generate_series(1,1e6) AS gs(x);

VACUUM FULL ANALYZE children;
VACUUM FULL ANALYZE parent;

Results (pt1)

  • Aggregate method: 450ms,
  • LEFT JOIN LATERAL ( FETCH FIRST ROW ONLY ): 850ms
  • EXISTS method: 850ms

Results (adding an index and running again)

Now let's add an index

CREATE INDEX ON children (parent_id);
ANALYZE children;

Now the timing profile is totally different,

  • Aggregate method: 450ms,
  • LEFT JOIN LATERAL ( FETCH FIRST ROW ONLY ): 30ms
  • EXISTS method: 30ms

This is how I would do in SQL server (I don't have postgresql- I'm guessing it would be similiar)

SELECT p.parent_id,
CASE WHEN EXISTS (SELECT 1 FROM Child c WHERE c.ParentId=p.ParentId)
                THEN 'Yes'
                ELSE 'No'
                END as has_child,
FROM Parent p
--WHERE EXISTS (SELECT 1 FROM Child c WHERE c.ParentId=p.ParentId)