How Logstash is different than Kafka

Logstash is a tool that can be used to collect, process and forward events and log messages. Collection is accomplished through a number of input plugins. You can use Kafka as an input plugin, where it will read events from a Kafka topic. Once an input plugin has collected data it can be processed by any number of filters which modify and annotate the event data. Finally events are routed to output plugins which can forward the events to a variety of external programs including Elasticsearch.
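To make the input, filter, output flow concrete, here is a minimal sketch in plain Python. It is not Logstash code or configuration; the names `read_lines`, `add_level` and `run_pipeline` are made up for illustration, and a list stands in for an output such as Elasticsearch.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Event = Dict[str, str]


def read_lines(lines: Iterable[str]) -> Iterator[Event]:
    """Input stage: wrap each raw line in an event (stands in for an input plugin)."""
    for line in lines:
        yield {"message": line}


def add_level(event: Event) -> Event:
    """Filter stage: annotate the event with a level taken from its first word."""
    first_word = event["message"].split(" ", 1)[0]
    known = {"INFO", "WARN", "ERROR"}
    event["level"] = first_word if first_word in known else "UNKNOWN"
    return event


def run_pipeline(
    source: Iterable[Event],
    filters: List[Callable[[Event], Event]],
    output: Callable[[Event], None],
) -> None:
    """Pass every event through all filters in order, then hand it to the output."""
    for event in source:
        for apply_filter in filters:
            event = apply_filter(event)
        output(event)


# A plain list plays the role of the output destination.
collected: List[Event] = []
run_pipeline(read_lines(["ERROR disk full", "INFO started"]), [add_level], collected.append)
```

In real Logstash each stage is a configured plugin rather than a function, but the shape of the data flow is the same.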

Kafka, by contrast, is messaging software that persists messages, expires them after a configurable retention period (TTL), and has the notion of consumers that pull data out of Kafka. Some of its uses are:

  • Stream Processing
  • Website Activity Tracking
  • Metrics Collection and Monitoring
  • Log Aggregation
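The three Kafka properties mentioned above (persisted messages, a TTL, and consumers that pull) can be illustrated with a toy in-memory log. This is only a sketch under simplifying assumptions: `ToyTopic` is a hypothetical class, not a Kafka API, it models a single partition, and time is passed in explicitly instead of read from a clock.

```python
from typing import Any, List, Tuple


class ToyTopic:
    """In-memory stand-in for a single-partition topic (hypothetical, not Kafka's API)."""

    def __init__(self, retention_seconds: float) -> None:
        self.retention_seconds = retention_seconds
        self.records: List[Tuple[int, float, Any]] = []  # (offset, timestamp, value)
        self.next_offset = 0

    def append(self, value: Any, now: float) -> None:
        """Persist a record; it stays readable until retention expires it."""
        self.records.append((self.next_offset, now, value))
        self.next_offset += 1

    def expire(self, now: float) -> None:
        """Drop records whose age has reached the retention period (the TTL)."""
        self.records = [r for r in self.records if now - r[1] < self.retention_seconds]

    def poll(self, offset: int, max_records: int = 10) -> List[Tuple[int, Any]]:
        """Consumers pull: return records at or after the offset the caller asks for."""
        return [(o, v) for o, _, v in self.records if o >= offset][:max_records]


topic = ToyTopic(retention_seconds=60)
topic.append("a", now=0)
topic.append("b", now=30)
topic.append("c", now=50)

# Reading does not remove anything, so a second consumer can read the same data.
first_read = topic.poll(0)

# At t=70 record "a" is 70 seconds old and is expired; "b" and "c" survive.
topic.expire(now=70)
late_read = topic.poll(0)
```

The point of the sketch is that the broker keeps data independently of who has read it, and each consumer decides where to read from.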

Put simply, each of them has its own advantages and disadvantages, and which one to use depends entirely on your requirements.


Kafka is much more powerful than Logstash. For syncing data from a source such as PostgreSQL to Elasticsearch, Kafka connectors can do work similar to what Logstash does.

One key difference is that Kafka runs as a cluster, while Logstash is basically a single instance. You can run multiple Logstash instances, but they are not aware of each other: if one instance goes down, the others will not take over its work. Kafka handles a node failure automatically. And if you set up Kafka connectors to run in distributed mode, the remaining workers can take over the work of a connector whose worker has gone down.
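The "others take over" behaviour can be pictured as reassigning units of work across whichever workers are still alive. The sketch below uses a simple round-robin rule chosen for illustration; it is not Kafka's actual assignment logic, and `assign` and the worker names are hypothetical.

```python
from typing import Dict, Iterable, List


def assign(partitions: Iterable[int], workers: Iterable[str]) -> Dict[str, List[int]]:
    """Spread partitions round-robin over the live workers (sorted for determinism)."""
    live = sorted(workers)
    if not live:
        raise ValueError("no live workers to assign work to")
    assignment: Dict[str, List[int]] = {w: [] for w in live}
    for i, partition in enumerate(sorted(partitions)):
        assignment[live[i % len(live)]].append(partition)
    return assignment


partitions = [0, 1, 2, 3, 4, 5]

# All three workers are up.
before = assign(partitions, {"w1", "w2", "w3"})

# "w2" goes down: its partitions are redistributed to the survivors.
after = assign(partitions, {"w1", "w3"})
```

With independent Logstash instances there is no equivalent of the second `assign` call: nothing redistributes the work of the instance that went down.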

Kafka and Logstash can also work together. For example, you can run a Logstash instance on every node to collect logs and send them to Kafka, and then write Kafka consumer code to do whatever handling you want.


In addition, I want to add a few things through scenarios:

Scenario 1: Event Spikes

The app you deployed has a bad bug where information is logged excessively, flooding your logging infrastructure. Such a spike or burst of data is fairly common in other multi-tenant use cases as well, for example, in the gaming and e-commerce industries. A message broker like Kafka is used in this scenario to protect Logstash and Elasticsearch from the surge.
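A small simulation shows why a buffer in the middle protects the downstream systems. It assumes, purely for illustration, that the downstream can accept a fixed number of events per tick; `simulate` and the numbers are made up.

```python
from typing import List, Tuple


def simulate(arrivals: List[int], drain_rate: int) -> Tuple[List[int], List[int]]:
    """Return (buffer depth after each tick, events delivered downstream in each tick).

    arrivals[i] is the number of events produced in tick i.
    drain_rate is the most the downstream can accept per tick (an assumption).
    """
    buffered = 0
    depths: List[int] = []
    delivered: List[int] = []
    for produced in arrivals:
        buffered += produced                # the broker absorbs everything produced
        taken = min(drain_rate, buffered)   # the downstream pulls at its own pace
        buffered -= taken
        delivered.append(taken)
        depths.append(buffered)
    return depths, delivered


# Normal load is 5 events per tick; tick 2 is the buggy burst of 50.
depths, delivered = simulate([5, 5, 50, 5, 5, 5, 5, 5], drain_rate=10)
```

The burst shows up as a temporary backlog in the buffer, while the downstream never sees more than `drain_rate` events in a tick and the backlog shrinks once the load returns to normal.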


Scenario 2: Elasticsearch not reachable

If you have a number of data sources streaming into Elasticsearch and you can't afford to stop them when Elasticsearch is not reachable, a message broker like Kafka can help. If you use the Logstash shipper and indexer architecture with Kafka, you can continue to stream your data from edge nodes and hold it temporarily in Kafka. When Elasticsearch comes back up, Logstash will continue where it left off and help you catch up on the backlog of data.
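"Continue where it left off" comes down to remembering how far the indexer has read and only advancing that position after a successful delivery. The following is a rough sketch of that idea with plain lists; `drain`, `sink_up` and the event names are hypothetical, and real offset handling in Kafka and Logstash is more involved.

```python
from typing import List


def drain(log: List[str], committed: int, sink_up: bool, sink: List[str]) -> int:
    """Deliver log[committed:] to the sink if it is reachable; return the new position.

    When the sink is down nothing is delivered and the position does not move,
    so no events are skipped.
    """
    if not sink_up:
        return committed
    for record in log[committed:]:
        sink.append(record)
        committed += 1
    return committed


log: List[str] = []      # stands in for the data held in Kafka
indexed: List[str] = []  # stands in for Elasticsearch
committed = 0

log += ["e1", "e2"]
committed = drain(log, committed, sink_up=True, sink=indexed)

# Outage: producers keep writing, the indexer cannot deliver.
log += ["e3", "e4", "e5"]
committed = drain(log, committed, sink_up=False, sink=indexed)
position_during_outage = committed

# Recovery: the backlog is delivered from the remembered position.
log += ["e6"]
committed = drain(log, committed, sink_up=True, sink=indexed)
```

The edge nodes never had to stop: the log kept growing during the outage and the indexer caught up afterwards.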

The whole blog post about the use cases of Logstash and Kafka is here.