Difference between PromQL "by" and "without" unclear

All of these examples are aggregating incorrectly, as you're averaging an average. You want:

  sum without (path,host) (
    rate(request_duration_sum{status_code=~"2.*"}[5m])
  )
/
  sum without (path,host) (
    rate(request_duration_count{status_code=~"2.*"}[5m])
  )

Which will return the average latency per status_code plus any other remaining labels.


  • The by modifier groups aggregate function results by labels enumerated inside by(...).
  • The without modifier groups aggregate function results by all the labels except those enumerated inside without(...).

For example, suppose process_resident_memory_bytes metric exists with job, instance and datacenter labels:

process_resident_memory_bytes{job="job1",instance="host1",datacenter="dc1"} N1
process_resident_memory_bytes{job="job1",instance="host2",datacenter="dc1"} N2
process_resident_memory_bytes{job="job1",instance="host1",datacenter="dc2"} N3
process_resident_memory_bytes{job="job2",instance="host1",datacenter="dc1"} N4

Then sum(process_resident_memory_bytes) by (datacenter) would return summary per-datacenter memory usage, while sum(process_resident_memory_bytes) without (instance) would return summary per-job per-datacenter memory usage.