Message Queues: Basics of highly scalable asynchronous communication for apps
Grady Booch, the much-respected software architect, asserts that “at some level of abstraction all complex systems are message passing systems”. He means all systems, from cell biology to plate tectonics, not just software systems.
Software systems are, in many ways, inherently state management systems in which functions act upon data. State, in turn, has to be passed between functions and larger modules. Object Oriented Design and Programming were built on this simple idea.
If you observe carefully, all complex systems, both organic (e.g. protein synthesis) and inorganic (e.g. payment systems), are essentially abstractions of message passing systems. This is why Object Oriented Programming sticks, except perhaps in bare metal code, regardless of dogmatism from functional programmers and OO haters.
Message Queues
When an application or a smaller microservice wants to pass state, a request or a response to another application or service, it has two ways to do so (assuming, as is true of most applications built today, that it is built for the web).
- Synchronous: Make an HTTP request using a standard HTTP method such as POST, PUT or GET.
- Asynchronous: Place your request on a message queue.
HTTP requests are blocking because the requestor waits for the response. That’s why they are called synchronous. MQs, on the other hand, let the requestor drop the message on a queue and get back to doing other work, so it operates asynchronously.
Messages are usually XML or JSON, to keep to a data interchange standard, with JSON being more popular today. But a message can carry any type of payload.
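The asynchronous style can be sketched with Python's standard library alone. This is only an in-process illustration: `queue.Queue` stands in for a real message broker, and the `run_demo` function and the `order_id` field are hypothetical names chosen for the example.

```python
import json
import queue
import threading


def run_demo(order_ids):
    """Send each order as a JSON message and return the ids the consumer saw."""
    mq = queue.Queue()   # in-process stand-in for a message queue
    processed = []

    def consumer():
        # Pull messages until the None sentinel arrives.
        while True:
            raw = mq.get()
            if raw is None:
                break
            processed.append(json.loads(raw)["order_id"])

    worker = threading.Thread(target=consumer)
    worker.start()

    for order_id in order_ids:
        # put() returns immediately: the producer does not wait for the
        # consumer to process the message.
        mq.put(json.dumps({"order_id": order_id}))

    mq.put(None)    # tell the consumer to stop
    worker.join()   # wait here only so the demo can return the result
    return processed
```

With a synchronous HTTP call, the producer would instead sit idle inside each request until the other side had responded.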
When to use MQs
MQs help decouple services and apps that need to exchange data. If the message producer and consumer don’t need to be bound by a tight agreement and must be able to change independently of each other, MQs are the ideal way to communicate.
An example of a situation where this won’t work is the interaction between a web app and its Operational Data Store: coupling them with interdependencies is the only way to go. Introducing an MQ in between to decouple them would delay critical response times between the systems. (Well, not always. Microservices sometimes solve this problem differently through a concept called Domain Driven Design.)
MQ concepts
MQs can be parallel MQs or FIFO (First In First Out) MQs. FIFO is harder to implement because it must make sure the consumer reads messages in a specific order. This limits the number of messages allowed per second.
AWS Simple Queue Service (SQS), for instance, is by default a parallel MQ and can handle billions of messages. SQS is probably the simplest messaging system out there. SQS also offers FIFO Queues, but, as expected, they come with certain limits.
Broadcasting of messages by a producer to many Queues is termed Fan Out. It is fairly simple to implement compared to Fan In, where a consumer receives messages from multiple Queues. One problem with Fan In is identifying messages that belong to the same category but come from different Queues or producers. This is where Topics are helpful.
Producers put messages into a topic (imagine a header on a notice board). Most MQ products available today let you create a topic. Consumers subscribe to that topic regardless of where the messages originate.
The ideal design pattern for MQs goes like this:
1. Producers publish messages to a topic.
2. Have a separate Queue for each receiver.
3. Link each Queue to the topic that its receiver is interested in.
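The three steps above can be sketched as a toy in-memory broker. This is an illustration under stated assumptions, not the API of any MQ product: the `Broker` class, its method names and the `billing`/`shipping` receivers are all hypothetical.

```python
from collections import defaultdict, deque


class Broker:
    """Toy in-memory broker: a topic fans out to one Queue per receiver."""

    def __init__(self):
        self.queues = {}                         # receiver name -> its own Queue
        self.subscriptions = defaultdict(list)   # topic -> receiver names

    def create_queue(self, receiver):
        # Step 2: a separate Queue for each receiver.
        self.queues[receiver] = deque()

    def subscribe(self, receiver, topic):
        # Step 3: link the receiver's Queue to the topic it cares about.
        self.subscriptions[topic].append(receiver)

    def publish(self, topic, message):
        # Step 1: producers publish to a topic, never to a Queue directly.
        for receiver in self.subscriptions[topic]:
            self.queues[receiver].append(message)

    def receive(self, receiver):
        # Each receiver polls only its own Queue; None means it is empty.
        own_queue = self.queues[receiver]
        return own_queue.popleft() if own_queue else None


broker = Broker()
for name in ("billing", "shipping"):
    broker.create_queue(name)
    broker.subscribe(name, "orders")

broker.publish("orders", {"order_id": 42})
```

After the single `publish` call, both `billing` and `shipping` find the same message in their own Queue, and neither needs to know who produced it.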
SQS is capable of doing step 2 of this pattern, but to publish to a topic, and to distribute messages from a topic to the SQS Queues subscribed to it, we use the Simple Notification Service (SNS) on AWS.
SNS can broadcast a message to multiple endpoints (subscribers). For instance, let’s say you have an SQS notification configured for new file landings on an S3 location. The SQS messages will get picked up by a consumer, say a cloud database. But if you want the same S3 file arrival notification to go to another consumer as well, SNS has to be configured.
Step 2 (a separate Queue per receiver) removes the need to poll multiple Queues, which leads to a simpler subscriber architecture. Note that subscribers are difficult to implement, as they need exception handling mechanisms for ill-formatted messages, malicious payloads inside messages, Dead Letter Queues and so on. A DLQ is an exception handling Queue to which messages that failed processing are moved; it can be hooked to an alarm so we can investigate. Also, from a producer’s perspective subscribers are clients, so the moral high road (aka avoiding technical debt) is to make things easy for subscribers.
A popular software development pattern for implementing message passing is called publish/subscribe, or pub/sub. It is usually implemented either by the subscriber polling the Queue through an HTTP request, or through webhooks.
A webhook is simply an HTTP POST request from publisher to subscriber to notify it of a message. It is a user-defined callback (in this case defined on the publisher). When an event happens, say a message being pushed, the pre-configured callback (aka webhook) notifies the subscriber through an HTTP POST request to the pre-configured URI.
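A minimal sketch of the Dead Letter Queue idea follows, using plain `deque` objects as stand-ins for real Queues. The retry budget `MAX_ATTEMPTS`, the required `order_id` field and the function names are assumptions made for the example; managed services usually let you configure the equivalent behaviour rather than code it by hand.

```python
import json
from collections import deque

MAX_ATTEMPTS = 3  # assumed retry budget before a message is dead-lettered


def handle(body):
    """Process one message body; raise ValueError if it is unusable."""
    message = json.loads(body)   # JSONDecodeError is a subclass of ValueError
    if not isinstance(message, dict) or "order_id" not in message:
        raise ValueError("missing order_id")
    return message["order_id"]


def drain(main_queue, dead_letter_queue):
    """Consume (body, attempts) pairs, moving repeated failures to the DLQ."""
    handled = []
    while main_queue:
        body, attempts = main_queue.popleft()
        try:
            handled.append(handle(body))
        except ValueError:
            if attempts + 1 >= MAX_ATTEMPTS:
                # Out of retries: park the message for later investigation.
                dead_letter_queue.append(body)
            else:
                # Put it back at the end of the Queue for another attempt.
                main_queue.append((body, attempts + 1))
    return handled
```

The point of the design is that one bad message never blocks the good ones behind it, and nothing is silently dropped: whatever lands in `dead_letter_queue` is what the alarm should report.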
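Both sides of a webhook can be sketched with the standard library. This is a local demonstration only: the `/hooks/orders` path, the event fields and the function names are hypothetical, and a production subscriber would also authenticate the caller and validate the payload.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # messages the subscriber has been notified about


class WebhookHandler(BaseHTTPRequestHandler):
    """Subscriber side: accepts POST requests on the pre-configured URI."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        received.append(json.loads(self.rfile.read(length)))
        self.send_response(204)   # acknowledge, no response body
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def notify(url, message):
    """Publisher side: invoke the webhook with an HTTP POST."""
    request = urllib.request.Request(
        url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An empty ProxyHandler bypasses any system proxy for this local demo.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return response.status


def demo():
    server = HTTPServer(("127.0.0.1", 0), WebhookHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/hooks/orders"
        return notify(url, {"event": "order_created", "order_id": 42})
    finally:
        server.shutdown()
        server.server_close()
```

Note the direction of the call: the subscriber runs the HTTP server and the publisher is the HTTP client, which is the reverse of polling.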
Other MQ tools
Apart from the SNS-SQS combo, AWS offers EventBridge, which introduces the concept of an event bus with no separation of topics and Queues. All messages have a structured format and all of them pass through EventBridge. The topic is embedded inside the message and used to channel it to subscribers. This unified pattern makes it possible to define more rules on the event bus and allows advanced features on the bus such as archival, monitoring, alerting and replay.
Perhaps the more popular tools are Kafka and RabbitMQ. Kafka messages are called events. Events are published to a topic. A topic is split into partitions, and an event’s key decides which partition it lands in. Each partition is ordered FIFO. Consumers pull events by polling, and Kafka is widely used for event streaming applications like IoT and web analytics. The number of partitions, which are spread across servers (brokers), determines how far Kafka can distribute load. Kafka is known to be difficult to install and maintain, and its managed services are more expensive than SQS/SNS.
RabbitMQ is an open source tool, like Kafka. It has direct support for Queues and messages. Queues are FIFO by default. It is a much easier tool to install and maintain, but it isn’t suitable for streaming applications that require very good horizontal scalability.
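The event bus idea can be illustrated with a toy in-memory version. This is not the EventBridge API: the `EventBus` class and its methods are hypothetical, and the rule format (each field mapped to a list of accepted values) is only loosely modelled on the way event patterns are written.

```python
class EventBus:
    """Toy event bus: one entry point, routing by rules on message content."""

    def __init__(self):
        self.rules = []     # (pattern, target) pairs
        self.archive = []   # every event seen, kept so it can be replayed

    def add_rule(self, pattern, target):
        # pattern maps a field name to the list of values that match.
        self.rules.append((pattern, target))

    def put_event(self, event):
        self.archive.append(event)
        self._route(event)

    def replay(self):
        # Re-deliver archived events through the current rules.
        for event in list(self.archive):
            self._route(event)

    def _route(self, event):
        for pattern, target in self.rules:
            if all(event.get(field) in allowed for field, allowed in pattern.items()):
                target(event)


bus = EventBus()
order_events = []
bus.add_rule({"source": ["shop.orders"]}, order_events.append)

bus.put_event({"source": "shop.orders", "detail-type": "OrderCreated", "detail": {"order_id": 42}})
bus.put_event({"source": "shop.billing", "detail-type": "InvoicePaid", "detail": {"invoice": 7}})
```

Because routing depends only on what is inside the message, features such as archival and replay can be added on the bus itself without touching producers or subscribers.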
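Key-based partitioning can be sketched as follows. This is a simplified model, not Kafka client code: `zlib.crc32` is swapped in as a stable hash purely for illustration (Kafka's own partitioner uses a different hash function), lists stand in for partitions, and the sensor names are hypothetical.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a key to a partition index; the same key always maps to the same one."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def publish(partitions, key, value):
    """Append the event to the partition chosen by its key and return the index."""
    index = partition_for(key, len(partitions))
    partitions[index].append((key, value))   # append keeps per-partition FIFO order
    return index


partitions = [[] for _ in range(3)]   # three partitions of one topic
first = publish(partitions, "sensor-1", 20.5)
publish(partitions, "sensor-2", 18.0)
second = publish(partitions, "sensor-1", 21.0)
```

Both `sensor-1` readings land in the same partition, in the order they were published. Ordering is therefore guaranteed per key, not across the whole topic, which is what lets the partitions be spread over several servers.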
Final note on messages themselves
An often overlooked theme in building enterprise messaging is a data serialization standard that unifies the enterprise’s data interchange format. It is overlooked because we often start small, and things stay manageable only until the system is no longer small.
JSON, being very popular today, solves this for most enterprises. But intensive messaging systems may need to look for more performant serialization techniques. A great example is the open-sourced Protocol Buffers, developed at Google by the venerable Jeff Dean, among others, for the company’s internal message passing.
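To give a rough feel for why a schema-based binary format can be more compact than JSON, here is a small comparison using Python's `struct` module. It is a stand-in, not Protocol Buffers: the fixed three-field schema, the field names and the helper functions are assumptions made for the example, and Protocol Buffers uses its own wire encoding and generated code.

```python
import json
import struct

# Hypothetical fixed schema, little-endian:
# order_id (unsigned 32-bit), amount_cents (unsigned 32-bit), timestamp (unsigned 64-bit)
ORDER_FORMAT = "<IIQ"
FIELDS = ("order_id", "amount_cents", "timestamp")


def to_json(order):
    """Self-describing text encoding: field names travel with every message."""
    return json.dumps(order).encode("utf-8")


def to_binary(order):
    """Schema-based encoding: only the values travel, in an agreed order."""
    return struct.pack(ORDER_FORMAT, *(order[name] for name in FIELDS))


def from_binary(payload):
    """Decode using the same agreed schema."""
    return dict(zip(FIELDS, struct.unpack(ORDER_FORMAT, payload)))


order = {"order_id": 42, "amount_cents": 1999, "timestamp": 1700000000}
json_size = len(to_json(order))
binary_size = len(to_binary(order))   # 4 + 4 + 8 = 16 bytes for this schema
```

The saving comes at a price: producer and consumer must share the schema, which is exactly why agreeing on a serialization standard early matters.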