SPARQL query and distinct count

If you're using Java and Jena's ARQ, you can use ARQ's extensions for aggregates. Your query would look something like:

SELECT (COUNT(DISTINCT ?tag) AS ?count)
WHERE {
    ?r ns9:taggedWithTag ?tagresource.
    ?tagresource ns9:name ?tag
}
LIMIT 5000
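
If what you actually want is a count per tag rather than one overall number, a GROUP BY variant along these lines may help. This is only a sketch: it reuses the ns9: prefix from the query above (its PREFIX declaration isn't shown here and would have to be added), and counting distinct ?r per tag is my assumption about what the count should mean.

```sparql
# Sketch: number of distinct resources carrying each tag name.
# Assumes ns9: is declared with a PREFIX line, as in the query above.
SELECT ?tag (COUNT(DISTINCT ?r) AS ?count)
WHERE {
    ?r ns9:taggedWithTag ?tagresource .
    ?tagresource ns9:name ?tag
}
GROUP BY ?tag
ORDER BY DESC(?count)
LIMIT 5000
```

Here the LIMIT caps the number of tag rows returned after grouping; it does not limit how many triples are counted.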

The original SPARQL specification from 2008 didn't include aggregates, but the current version, 1.1, from 2013 does.


Using COUNT(), MIN(), MAX(), SUM(), or AVG() with GROUP BY produces summary values for groups of query solutions. Note that aggregates and GROUP BY are part of SPARQL 1.1; SPARQL 1.0 does not define them.

For example, this query sums ?value for each ?category:

SELECT ?category (SUM(?value) as ?valueSum)
WHERE
{
  ?s ?category ?value .
}
GROUP BY ?category

This one counts how many times each predicate ?p is used:

SELECT ?p (COUNT(?p) as ?pCount)
WHERE
{
  ?s ?p ?o .
}
GROUP BY ?p
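
Since the question is about distinct counts, the same pattern can probably be extended with COUNT(DISTINCT ...) inside each group. A minimal sketch, where the output names ?uses and ?distinctObjects are just made up for illustration:

```sparql
# Sketch: per predicate, total number of triples and number of distinct objects.
SELECT ?p (COUNT(*) AS ?uses) (COUNT(DISTINCT ?o) AS ?distinctObjects)
WHERE
{
  ?s ?p ?o .
}
GROUP BY ?p
ORDER BY DESC(?uses)
```

COUNT(*) counts every solution in the group, while COUNT(DISTINCT ?o) counts each object value only once per predicate.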

These examples are inspired by material from Bob DuCharme (2011), "Learning SPARQL". O’Reilly Media, Sebastopol, CA, USA; see http://www.learningsparql.com/

To avoid the error "Bad aggregate" when using GROUP BY:

  1. Every plain (non-aggregated) variable in the SELECT must also appear in the GROUP BY (?category in the first example).
  2. Every other item in the SELECT must be an aggregate expression that yields a single value per group ((SUM(?value) as ?valueSum) in the first example).
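
To illustrate the two rules above, here is a sketch based on the first example. The commented-out SELECT breaks the rules because ?s is neither grouped nor aggregated; the exact error text will depend on the engine. One possible fix, assuming any single subject per group is good enough, is to wrap ?s in the SPARQL 1.1 SAMPLE() aggregate (the name ?anySubject is made up for illustration):

```sparql
# Invalid: ?s is selected but is neither in GROUP BY nor inside an aggregate.
# SELECT ?s ?category (SUM(?value) AS ?valueSum)

# Valid: ?category is the grouping variable, everything else is aggregated.
SELECT ?category (SAMPLE(?s) AS ?anySubject) (SUM(?value) AS ?valueSum)
WHERE
{
  ?s ?category ?value .
}
GROUP BY ?category
```

If you need every subject rather than an arbitrary one, adding ?s to the GROUP BY is the other option, at the cost of getting one row per (?s, ?category) pair.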
