How to build an efficient Kafka broker health check?

I strongly recommend Yahoo Kafka Manager, which shows detailed information about your Kafka setup (e.g. bytes sent/consumed over a time interval). It can also be used to manage your Kafka cluster.

It also exposes a RESTful API that you can consume from your own application if needed. The project is here:

https://github.com/yahoo/kafka-manager


If you want to build your own health check, this is a current (January 2020) list of KIPs covering health checks:

  • KIP-143: Controller Health Metrics
  • KIP-188: Add new metrics to support health checks
  • KIP-237: More Controller Health Metrics
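As a rough illustration of what a home-grown check built on broker metrics could look like, here is a minimal sketch that reads three long-standing gauges over JMX and applies simple cluster-wide rules. It does not use the metrics those KIPs add; the class and method names (BrokerJmxHealth, readGauges, findProblems) are hypothetical, and it assumes a ZooKeeper-based cluster with remote JMX enabled on each broker and no JMX authentication.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

/** Hypothetical helper: reads a few broker gauges over JMX and applies simple health rules. */
final class BrokerJmxHealth {
    // Long-standing broker gauges; each exposes its reading in the "Value" attribute.
    static final String UNDER_REPLICATED =
            "kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions";
    static final String OFFLINE_PARTITIONS =
            "kafka.controller:type=KafkaController,name=OfflinePartitionsCount";
    static final String ACTIVE_CONTROLLER =
            "kafka.controller:type=KafkaController,name=ActiveControllerCount";

    /** Reads the three gauges from one broker. Assumes remote JMX is enabled on jmxPort. */
    static Map<String, Long> readGauges(String host, int jmxPort) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":" + jmxPort + "/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbeans = connector.getMBeanServerConnection();
            Map<String, Long> gauges = new LinkedHashMap<>();
            for (String name : List.of(UNDER_REPLICATED, OFFLINE_PARTITIONS, ACTIVE_CONTROLLER)) {
                Number value = (Number) mbeans.getAttribute(new ObjectName(name), "Value");
                gauges.put(name, value.longValue());
            }
            return gauges;
        }
    }

    /** Sums the gauges collected from every broker and returns a list of problems (empty = healthy). */
    static List<String> findProblems(List<Map<String, Long>> gaugesPerBroker) {
        long controllers = 0, offline = 0, underReplicated = 0;
        for (Map<String, Long> gauges : gaugesPerBroker) {
            controllers += gauges.getOrDefault(ACTIVE_CONTROLLER, 0L);
            offline += gauges.getOrDefault(OFFLINE_PARTITIONS, 0L);
            underReplicated += gauges.getOrDefault(UNDER_REPLICATED, 0L);
        }
        List<String> problems = new ArrayList<>();
        // Exactly one broker in the cluster should report itself as the active controller.
        if (controllers != 1) {
            problems.add("expected exactly 1 active controller, found " + controllers);
        }
        if (offline > 0) {
            problems.add(offline + " offline partition(s)");
        }
        if (underReplicated > 0) {
            problems.add(underReplicated + " under-replicated partition(s)");
        }
        return problems;
    }
}
```

You would call readGauges once per broker, pass the collected maps to findProblems, and treat an empty result as healthy. The rules are deliberately simple; which thresholds make sense depends on your cluster.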

Regarding Harvinder Singh's currently accepted answer:

Kafka Manager is great, but it's evolving slowly. There is also Confluent Control Center, part of Confluent Platform, but you'll need a license for it. (Confluent is the company founded by the team that originally built Apache Kafka.) I've also heard about AKHQ, formerly KafkaHQ (Hacker News story).

Here's the list of management consoles maintained on the Apache Kafka Confluence page (check the URLs there):

  • Kafka Manager - A tool for managing Apache Kafka.
  • kafkat - Simplified command-line administration for Kafka brokers.
  • Kafka Web Console - Displays information about your Kafka cluster including which nodes are up and what topics they host data for.
  • Kafka Offset Monitor - Displays the state of all consumers and how far behind the head of the stream they are.
  • Capillary - Displays the state and deltas of Kafka-based Apache Storm topologies. Supports Kafka >= 0.8. It also provides an API for fetching this information for monitoring purposes.
  • Doctor Kafka - Service for cluster auto healing and workload balancing.
  • Cruise Control - Fully automates the dynamic workload rebalance and self-healing of a Kafka cluster.
  • Burrow - Monitoring companion that provides consumer lag checking as a service without the need for specifying thresholds.
  • Chaperone - An audit system that monitors the completeness and latency of data streams.

If you don't need a GUI, there are also:

  • https://github.com/andreas-schroeder/kafka-health-check
  • and its fork https://github.com/ustream/kafka-health-check

On a ZooKeeper-based cluster, you can also use the ZooKeeper API to get the list of registered brokers, as follows:

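    // Body of a method returning List<Map>. Needs org.apache.zookeeper.ZooKeeper and Jackson's ObjectMapper.
    // KafkaContextLookupUtil is my own helper that supplies the ZooKeeper connect string.
    ZooKeeper zk = new ZooKeeper(KafkaContextLookupUtil.getZookeeperConnect().getZkConnect(), 10000, event -> {});
    try {
        // Each live broker registers an ephemeral znode under /brokers/ids.
        List<String> ids = zk.getChildren("/brokers/ids", false);
        List<Map> brokerList = new ArrayList<>();
        ObjectMapper objectMapper = new ObjectMapper();

        for (String id : ids) {
            // The znode data is a JSON document describing the broker (host, port, endpoints, ...).
            Map map = objectMapper.readValue(zk.getData("/brokers/ids/" + id, false, null), Map.class);
            brokerList.add(map);
        }
        return brokerList;
    } finally {
        // Always release the session, even if a read fails.
        zk.close();
    }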
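
One way to turn that broker list into an actual health check is to compare the ids registered under /brokers/ids with the ids you expect to be running. Because broker registrations are ephemeral znodes, a broker whose ZooKeeper session has ended drops out of the list. The sketch below shows only the comparison step. The class and method names are hypothetical, and it assumes you already know the expected broker ids (for example from your deployment config).

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: compares the brokers registered in ZooKeeper with the ones you expect. */
final class BrokerRegistrationCheck {
    /**
     * @param expectedIds   broker ids that should be running (assumed known from your own config)
     * @param registeredIds children of /brokers/ids, e.g. the result of zk.getChildren(...)
     * @return expected ids that have no registration, sorted as strings; empty means all are present
     */
    static Set<String> missingBrokers(Collection<String> expectedIds,
                                      Collection<String> registeredIds) {
        Set<String> missing = new TreeSet<>(expectedIds);
        missing.removeAll(registeredIds);
        return missing;
    }
}
```

A non-empty result only tells you a broker is not registered. It says nothing about whether the registered brokers are serving requests properly, so it is best treated as a liveness signal rather than a full health check.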