Message Queue vs Task Queue difference

I asked a similar question in some developer community groups on Facebook. It was not about Google App Engine specifically; I asked in a more general sense, to understand the use cases for RabbitMQ versus Celery. Here are the responses I got, which I think are relevant to the topic and clarify the difference between a message queue and a task queue fairly well.

I asked:

Will it be appropriate to say that "Celery is a QueueWrapper/QueueFramework which takes away the complexity of having to manage the internal queueManagement/queueAdministration activities etc"?

I understand the textbook language which says "Celery is a task queue" and "RabbitMQ is a message broker". However, it is a little confusing for a first-time Celery user, because we have always known RabbitMQ to be the "queue".

Please help explain what Celery does in contrast with RabbitMQ.

A response I got from Abu Ashraf Masnun

Task Queue and Message Queue. RabbitMQ is a "MQ". It receives messages and delivers messages.

Celery is a Task Queue. It receives tasks with their related data, runs them and delivers the results.

Let's forget Celery for a moment. Let's talk about RabbitMQ. What would we usually do? Our Django/Flask app would send a message to a queue. We would have some workers running, waiting for new messages in certain queues. When a new message arrives, a worker picks it up and processes the task it describes.
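To make the "without Celery" workflow concrete, here is a minimal sketch of the producer and worker we would have to write by hand. It assumes an in-process queue.Queue as a stand-in for a RabbitMQ queue so that it runs without a broker; the names publish, worker and email_queue are made up for illustration.

```python
import json
import queue
import threading

# Stand-in for a queue hosted by a broker such as RabbitMQ (assumption:
# an in-process queue.Queue, so the sketch runs without any broker).
email_queue = queue.Queue()
processed = []

def publish(recipient):
    """What the web app does: serialize a message and enqueue it."""
    email_queue.put(json.dumps({"to": recipient}))

def worker():
    """What we must write ourselves: wait for messages, decode, process."""
    while True:
        raw = email_queue.get()   # blocks until a message arrives
        if raw is None:           # sentinel chosen here to stop the worker
            break
        message = json.loads(raw)
        processed.append("sent mail to " + message["to"])

thread = threading.Thread(target=worker)
thread.start()
publish("alice@example.com")
publish("bob@example.com")
email_queue.put(None)             # ask the worker to shut down
thread.join()
print(processed)
```

The queue only moves messages around. Deciding what a message means and running the right code for it is left entirely to the worker loop.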

Celery manages this entire process beautifully. We no longer need to learn or worry about the details of AMQP or RabbitMQ. We can use Redis or even a database (MySQL for example) as a message broker. Celery allows us to define "Tasks" containing our worker code. When we need to do something in the background (or even in the foreground), we can just call the task (for instant execution) or schedule it for delayed processing. Celery handles the message passing and runs the tasks. It launches workers which know how to run your defined tasks and store the results, so you can later query the task result or even the task progress when needed.
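A toy sketch of the kind of layer a task queue adds on top of a message queue may help here. This is not Celery's implementation or API: the task decorator, the registry and the in-memory broker and results objects are all assumptions made for illustration, and only the method name delay is borrowed from Celery's vocabulary.

```python
import json
import queue
import uuid

broker = queue.Queue()   # stand-in for RabbitMQ/Redis (assumption)
results = {}             # stand-in for a result backend (assumption)
registry = {}            # task name -> function, so workers can look tasks up

class Task:
    """Wraps a function so it can run now or be queued for a worker."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__
        registry[self.name] = func

    def __call__(self, *args):
        # Foreground call: run immediately in the current process.
        return self.func(*args)

    def delay(self, *args):
        # Background call: turn the call into a message and enqueue it.
        task_id = str(uuid.uuid4())
        message = {"id": task_id, "task": self.name, "args": list(args)}
        broker.put(json.dumps(message))
        return task_id

def task(func):
    return Task(func)

@task
def add(x, y):
    return x + y

def run_worker_once():
    """Take one message, find the matching task, run it, store the result."""
    message = json.loads(broker.get())
    func = registry[message["task"]]
    results[message["id"]] = func(*message["args"])

task_id = add.delay(2, 3)   # returns at once; nothing has run yet
run_worker_once()           # a worker process would normally do this
print(results[task_id])     # 5
```

The application code only sees add and add.delay. The message format, the lookup of the function by name and the storage of results are hidden behind the framework, which is the part described above as something we no longer need to worry about.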

You can use Celery as an alternative for cron job too (though I don't really like it)!

Another response I got from Juan Francisco Calderon Zumba

My understanding is that Celery is just a very high-level abstraction for implementing the producer/consumer of events. It takes away several painful things you need to do when working with, for example, RabbitMQ. Celery itself is not the queue. The event queues are stored in the system of your choice; Celery helps you work with those events without having to write the producer/consumer from scratch.

Eventually, here is what I took home as my final learning:

Celery is a queue wrapper/framework which takes away the complexity of having to manage the underlying AMQP mechanisms/architecture that come with operating RabbitMQ directly.


GAE's Task Queues are a means for allowing an application to do background processing, and they are not going to serve the same purpose as a Message Queue. They are very different things that serve different functions.

A Message Queue is a mechanism for sharing information between processes, threads, or systems.

An AppEngine Task Queue is a way for an AppEngine application to say to itself, "I need to do this, but I am going to do it later, outside of the context of a client request."


Might differ depending on the context, but below is my understanding:

Message queue

A message queue is the message broker part - a queue data structure implementation that:

  1. Lets you enqueue/produce/push/send a message to it (the terms differ by platform, but they refer to the same thing).
  2. Lets you dequeue/consume/pull/receive a message from it.
  3. Provides FIFO ordering.

Task queue

A task queue, on the other hand, exists to process tasks:

  1. At a desired pace - how many tasks can your system handle at the same time? Perhaps determined by the number of CPU cores on your machine, or if you're on Kubernetes, number of nodes and their size. It's about concurrency control, or the less-cool term, "buffering".
  2. In an async way - non-blocking task processing. Tasks are processed in the background, so your main process can go do other stuff after kicking off a task. A server API over HTTP is a popular use case, where you want to respond quickly to the client because an HTTP request usually has a short timeout (<= 30s), especially when your API is triggered by an end user (humans are impatient). If your task takes longer than a few seconds, you should consider moving it to the background and giving an API response like "OK I received your request, I'll process it when I have time".
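Both points can be sketched with Python's standard library. This is only an illustration under assumptions: a thread pool stands in for the worker fleet, MAX_WORKERS and slow_task are made-up names, and the sleep simulates real work.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 2          # the "desired pace": at most 2 tasks run at once
lock = threading.Lock()
running = 0              # tasks executing right now
peak = 0                 # highest concurrency observed

def slow_task(n):
    """Simulated long-running task that records how many run in parallel."""
    global running, peak
    with lock:
        running += 1
        peak = max(peak, running)
    time.sleep(0.05)     # pretend to do real work
    with lock:
        running -= 1
    return n * n

with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
    # submit() returns immediately; extra tasks wait in the pool's buffer.
    futures = [pool.submit(slow_task, n) for n in range(6)]
    # An HTTP handler could already answer "OK, received" at this point.
    squares = [f.result() for f in futures]

print(squares)           # [0, 1, 4, 9, 16, 25]
print(peak <= MAX_WORKERS)
```

Six tasks are accepted at once, but no more than MAX_WORKERS of them execute at the same time. The rest sit buffered until a worker is free.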

Their difference

As you can see, message queues and task queues focus on different aspects. They can overlap, but they don't have to.

An example of a task queue that is not a message queue - if your tasks don't care about ordering, because no task depends on another, then you don't need a "queue" in the sense of a FIFO data structure. You can use one, but you don't have to. You just need a place to store the buffered tasks, like a pool; a simple SQL/NoSQL database or even S3 might suffice.
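As a rough sketch of such a pool, the following keeps pending tasks in a SQLite table and lets a worker claim any pending row, with no ordering guarantee. The table layout and the helper names add_task and claim_any_task are assumptions for illustration, and the claim step is only safe with a single worker as written.

```python
import sqlite3

# In-memory database so the sketch is self-contained (assumption: a real
# setup would use a shared SQL/NoSQL store reachable by all workers).
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE tasks ("
    "id INTEGER PRIMARY KEY, payload TEXT, status TEXT DEFAULT 'pending')"
)

def add_task(payload):
    """Buffer a task; no position in any queue is implied."""
    db.execute("INSERT INTO tasks (payload) VALUES (?)", (payload,))
    db.commit()

def claim_any_task():
    """Pick any pending task and mark it done.

    Single-worker sketch: with several workers the select and update
    would have to be made atomic.
    """
    row = db.execute(
        "SELECT id, payload FROM tasks WHERE status = 'pending' LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    db.execute("UPDATE tasks SET status = 'done' WHERE id = ?", (row[0],))
    db.commit()
    return row[1]

for name in ("resize-a.png", "resize-b.png", "resize-c.png"):
    add_task(name)

handled = set()          # a set, because the order does not matter here
while True:
    payload = claim_any_task()
    if payload is None:
        break
    handled.add(payload)

print(sorted(handled))
```

Nothing here behaves like a FIFO structure, yet it still buffers work and lets a worker drain it at its own pace.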

An opposite example is push notifications. You use a message queue but not necessarily a task queue. The server generates events/notifications and wants to deliver them to the client. The server pushes notifications into the queue. The client consumes/pulls notifications from the queue when it is ready to do so. Products like GCP Pub/Sub or AWS SNS can be used for this.

Takeaway

A task queue is usually more complicated than a message queue because of the concurrency control, not to mention horizontal scaling, such as distributing workers across nodes to optimize concurrency.

Tools like Celery are task queue + message queue baked into one. As far as I know there aren't many tools like Celery that do both, and I guess that's why it's so popular (alternatives are Bull or Bee in Node.js; if you know more, please let me know!).

My company recently had to implement a task queue. While googling for the proper tool, these two terms confused me a lot, because I kind of knew what I wanted but didn't know what people call it or what keywords I should search for.

I personally haven't used AppEngine much, so I cannot answer for it specifically, but you can always check it against the points above to see if it satisfies your requirements.