Kafka Messages vs. REST Calls

Several posts make it easy to understand Kafka's role in microservices:

microservices-apache-kafka-domain-driven-design

journey-to-event-driven

building-a-microservices-ecosystem-with-kafka

build-services-backbone-events

If you need any further help let me know. Happy to help.


Microservices architecture advocates independent, autonomous services that can operate on their own. Let's look at why we need message queues.

HTTP protocol is sync

There is a widespread misconception that HTTP is async. HTTP is a synchronous request/response protocol, although your client can handle it asynchronously. For example, when you call a service over HTTP, your HTTP client may schedule the call on a background thread (async). However, the call itself waits until it either times out or the response comes back, and during all that time the HTTP call chain is waiting synchronously. If you have hundreds of requests at a time, you can imagine how many HTTP calls are pending at once, and you may run out of sockets.
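A rough way to see this with nothing but Python's standard library: `slow_service` below is a hypothetical stand-in for a remote HTTP call (it only sleeps), and the pool size of 2 and the 0.1 s latency are arbitrary assumptions. Dispatching looks asynchronous to the caller, yet every in-flight call still holds a worker until its response arrives.

```python
import time
from concurrent.futures import ThreadPoolExecutor

CALL_SECONDS = 0.1  # assumed latency of the downstream service


def slow_service(request_id):
    """Hypothetical stand-in for a blocking HTTP call to another service."""
    time.sleep(CALL_SECONDS)  # the worker is held for the whole wait
    return f"response-{request_id}"


start = time.monotonic()
with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns immediately, so the caller sees this as async ...
    futures = [pool.submit(slow_service, i) for i in range(4)]
    # ... but only 2 calls can be in flight; the other 2 wait for a free worker.
    responses = [f.result() for f in futures]
elapsed = time.monotonic() - start
```

With 4 calls and 2 workers, the total time is roughly two call durations rather than one. The same effect, with sockets instead of threads, is what limits long HTTP call chains under load.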

AMQP

In a microservices architecture we prefer AMQP (Advanced Message Queuing Protocol). The service drops the message in a queue and forgets about it. This is a truly asynchronous transport: your service is done once it has dropped the message in the queue, and interested services pick it up from there.

This type of protocol is preferred because you can scale without worrying about other services being down; they will eventually receive the message/event/data once they are back up.
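The "drop it and forget it" idea can be sketched in-process with Python's standard `queue` module. This is only an illustration: `order_queue`, `place_order` and `inventory_worker` are hypothetical names, and a real system would use a broker rather than an in-memory queue. The producer finishes while the consumer is still "down", and the consumer catches up once it starts.

```python
import queue
import threading

# In-memory stand-in for a broker queue (assumption: a real broker would persist messages).
order_queue = queue.Queue()
processed = []


def place_order(order_id):
    """Producer: drop the message in the queue and return immediately."""
    order_queue.put({"order_id": order_id})
    return "accepted"


def inventory_worker():
    """Consumer: drains whatever is waiting once it is running."""
    while True:
        message = order_queue.get()
        if message is None:  # sentinel used here to stop the worker
            break
        processed.append(message["order_id"])


# The producer does its work while the consumer is not running yet.
statuses = [place_order(i) for i in range(3)]

# The consumer comes up later and still receives every message, in order.
worker = threading.Thread(target=inventory_worker)
worker.start()
order_queue.put(None)
worker.join()
```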

So it really depends on your particular case. HTTP calls are easy to implement, but they don't scale well. Messaging comes with its own challenges, such as message ordering and managing workers, but it makes the architecture scalable and is the preferred way. For write operations, always prefer a queue. For read operations you can use HTTP, but make sure you are not building a long chain where one service calls another, which in turn calls another.

Hope that helps!


Gist (for those who want just the gist)

    • Kafka - Publish & Subscribe (just process the pipeline, will notify once the job is done)

    • REST - Request & Await response (on-demand)


    • Kafka - Publish once - Subscribe n times (by n components).

    • REST - Request once, get the response once. Deal over.


    • Kafka - Data is stored in a topic. Seek back & forth (offsets) whenever you want, for as long as the topic's data is retained.

    • REST - Once the response is over, it is over. Manually employ a database to store the processed data.


    • Kafka - Split the processing, have intermediate data stored in intermediate topics (for speed and fault-tolerance)

    • REST - Take the data, process it all at once OR if you wish to break it down, don't forget to take care of your OWN intermediate data stores.


    • Kafka - The one who makes the request typically is not interested in a response (except an acknowledgement that the message was sent)

    • REST - I am making the request means I typically expect a response (not just a response that you have received the request, but something that is meaningful to me, some computed result for example!)
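To make the "publish once, subscribe n times" and offset points concrete, here is a toy sketch. `Topic` and `Consumer` are hypothetical in-memory classes written for this illustration, not part of any Kafka client library; they only mimic the idea of an append-only log where each consumer tracks its own offset.

```python
class Topic:
    """Hypothetical in-memory append-only log; not a Kafka client."""

    def __init__(self):
        self._log = []

    def publish(self, record):
        self._log.append(record)
        return len(self._log) - 1  # offset of the record just appended

    def read_from(self, offset):
        return self._log[offset:]


class Consumer:
    """Each consumer keeps its own position, so reads never remove data."""

    def __init__(self, topic):
        self.topic = topic
        self.offset = 0

    def poll(self):
        records = self.topic.read_from(self.offset)
        self.offset += len(records)
        return records

    def seek(self, offset):
        self.offset = offset


topic = Topic()
for item in ["item-a", "item-b", "item-c"]:
    topic.publish(item)  # published once

billing, audit = Consumer(topic), Consumer(topic)
billing_first = billing.poll()  # every consumer sees all records
audit_first = audit.poll()
billing_again = billing.poll()  # nothing new since the last poll

audit.seek(1)                   # rewind and replay from offset 1
audit_replay = audit.poll()
```

A REST response has no equivalent of `seek`: once it has been returned, replaying it means calling the endpoint again or storing the result yourself.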

Q&A style

Is your data streaming?
If the data keeps on coming and you have a pipeline to execute, Kafka is best.

Do you need a request-response model?
If the user requests something and waits for a response, then REST is best.

Kafka (or any other streaming platform) is typically used for pipelines, i.e. where we have a forward flow of data.

Data comes to Kafka and from there it goes through component1, component2 and so on and finally (typically) lands in a database.

To get the information on-demand we need a data store (a database) where we can query and get it. In such a case we provide a REST interface which the user can invoke and get the data they want.


Regarding your example,

Everyday vendor service calls the vendor API to get new items and these need to be moved into inventory service

Questions & Answers

Is your vendor API using REST?

Then you need to pull the data and push to Kafka. From there your inventory service (or any other service thereafter) will subscribe to that topic and execute their processing logic.

The advantage here is that you can add any other service which requires vendor data as a consumer to the vendor topic.

Moreover, the vendor data is always there for you even after your inventory service processed it.

If you use REST for this, you need to call the vendor API once for every component that requires vendor data. With Kafka this becomes trivial: each component simply subscribes to the vendor topic.
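As a rough sketch of that flow: the vendor API is pulled once per day and the result is shared by every consumer. All names here (`fetch_vendor_items`, `vendor_topic`, `inventory_service`, `out_of_stock_alerts`) and the record fields are hypothetical, and a plain list stands in for the Kafka topic.

```python
vendor_api_calls = 0


def fetch_vendor_items():
    """Hypothetical stand-in for the vendor's REST API."""
    global vendor_api_calls
    vendor_api_calls += 1
    return [{"sku": "A1", "qty": 5}, {"sku": "B2", "qty": 0}]


vendor_topic = []  # a list standing in for a "vendor items" topic


def daily_pull():
    """Pull from the vendor API once and publish the records to the topic."""
    vendor_topic.extend(fetch_vendor_items())


def inventory_service(records):
    """First consumer: builds the stock levels."""
    return {r["sku"]: r["qty"] for r in records}


def out_of_stock_alerts(records):
    """Second consumer, added later without touching the vendor API."""
    return [r["sku"] for r in records if r["qty"] == 0]


daily_pull()
inventory = inventory_service(vendor_topic)
alerts = out_of_stock_alerts(vendor_topic)
```

Both consumers read the same records, and the vendor API was called only once.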

Do you want the inventory to be queried?

Store it in a database after processing it through Kafka and provide a REST API on top of that. This is needed because Kafka is essentially a log; to make the data queryable you need a database.
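A minimal sketch of that last step, assuming the processed records arrive as `(sku, qty)` pairs: they are written to a table, and a lookup function plays the role of the REST handler. The table layout, the sample records and `get_inventory` are made up for illustration, and an in-memory SQLite database stands in for a real one.

```python
import sqlite3

# Records as they might come out of the pipeline; a later record for a SKU wins.
processed_items = [("A1", 5), ("B2", 0), ("A1", 7)]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE inventory (sku TEXT PRIMARY KEY, qty INTEGER)")

for sku, qty in processed_items:
    # Keep only the latest quantity per SKU, so the table holds current state.
    db.execute("INSERT OR REPLACE INTO inventory (sku, qty) VALUES (?, ?)", (sku, qty))
db.commit()


def get_inventory(sku):
    """What a hypothetical GET /inventory/<sku> handler would return."""
    row = db.execute("SELECT qty FROM inventory WHERE sku = ?", (sku,)).fetchone()
    return None if row is None else {"sku": sku, "qty": row[0]}
```

The log keeps the full history of records, while the table answers "what is the stock right now?" on demand.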