Comparing MQTT Brokers for the Industrial IoT

The Unified Namespace, also known as event-driven architecture, is an approach in which all data is published, regardless of whether there is a consumer for it. MQTT is a popular protocol for transmitting data between machines and is often used to build a Unified Namespace.

When choosing an MQTT broker, it's common for companies to ask about comparisons between different options, such as "HiveMQ vs mosquitto" or "What is the best MQTT broker?"

However, it's important to take a step back and consider the individual requirements of each company before making a decision. By understanding what is needed, the process of choosing a system architecture becomes much simpler and questions like "HiveMQ vs OSIsoft PI" no longer clutter the discussion.

You can skip the following explanations if you just want to hear more about MQTT brokers. However, we strongly recommend reading them, as they give you a better picture of what the decision is actually about.

TLDR: We often see larger enterprises choosing HiveMQ as their MQTT broker, but this may not be the best choice for every company. It's important to consider individual requirements before making a decision.

Amendment (2023-01-18): This article presents a qualitative analysis of MQTT brokers: we place them in the general IT system landscape, derive qualitative requirements from that, and then evaluate the brokers against these requirements. It does not include quantitative benchmarks or detailed feature lists that compare the brokers to each other. In our opinion, the brokers are similar in terms of features. In terms of performance, there are some analyses in the scientific literature [1] [2]. Here too, the tendency is that all brokers perform within roughly the same order of magnitude, although we would like to verify this ourselves in a future article. In our experience, however, decision-makers in manufacturing companies are more influenced by the qualitative arguments for why a company chooses broker A or B.

Three main requirements for your IT / OT architecture

To make the best decision, it's important to combine IT and OT best practices.

Traditional OT tools are not well-suited for handling large amounts of data, so other approaches are needed. By looking at companies like Netflix, Google, and Facebook, we can see how they handle data-intensive applications.

Data-intensive applications are not limited by CPU power, but rather by the amount of data, the complexity of the data, and the rate at which the data changes. In manufacturing, load can refer to the number of sensors sending data, their frequency, and the size of their messages.

"Designing Data-Intensive Applications" is an important book that explains how these companies handle large-scale IT applications. The OT equivalent is the ISA-95 model, also known as the "automation pyramid".

Designing Data-Intensive Applications by Martin Kleppmann

That book names three main requirements for a data-intensive application:

Reliability

Reliability refers to "making the systems work correctly, even when faults occur". This can include:

  • Hardware faults: Equipment failures, such as broken sensors, cables, or edge devices, or electrical spikes caused by nearby electric motors. These can be mitigated by choosing hardware with a high MTTF (mean time to failure) and introducing hardware redundancy, such as hot-swappable disks, multiple cables, and multiple sensors.
  • Software faults: Software failures, such as important programs not receiving enough CPU and RAM resources, or software bugs that cause programs to hang. These can be mitigated by continuously monitoring each program and restarting it if necessary, as well as introducing software redundancy, failovers, and self-healing.
  • Human faults: Failures caused by improper use of the system, such as a machine operator accidentally unplugging cables or interacting with the system in the wrong way. These can be mitigated by good management practices and the same technologies used for hardware and software faults.

Scalability

Scalability refers to a system's ability to cope with increased load. In manufacturing, this could mean the ability to add more data sources from which a machine learning model can determine the state of equipment. Data from sensors can vary greatly, with vibration data producing many small messages and camera data producing fewer, larger messages. In some cases, it's also important that data is consistently analyzed in real-time.
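
To make the notion of load more tangible, here is a small back-of-the-envelope sketch in Python. All sensor counts, rates, and payload sizes are hypothetical assumptions chosen for illustration, not measurements. It compares a group of vibration sensors (many small messages) with a group of cameras (few large messages).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorGroup:
    """A group of identical sensors (all numbers used below are hypothetical)."""

    name: str
    count: int             # number of sensors in the group
    messages_per_s: float  # messages each sensor publishes per second
    payload_bytes: int     # payload size of a single message in bytes

    def message_rate(self) -> float:
        """Messages per second that this group sends to the broker."""
        return self.count * self.messages_per_s

    def byte_rate(self) -> float:
        """Payload bytes per second that this group sends to the broker."""
        return self.message_rate() * self.payload_bytes


# Assumed example values: 200 vibration sensors at 100 Hz with 64-byte payloads,
# and 10 cameras sending one 500 kB image per second.
vibration = SensorGroup("vibration", count=200, messages_per_s=100, payload_bytes=64)
cameras = SensorGroup("camera", count=10, messages_per_s=1, payload_bytes=500_000)

for group in (vibration, cameras):
    print(f"{group.name}: {group.message_rate():.0f} msg/s, "
          f"{group.byte_rate() / 1e6:.2f} MB/s")
# prints:
# vibration: 20000 msg/s, 1.28 MB/s
# camera: 10 msg/s, 5.00 MB/s
```

With these assumed numbers, the vibration sensors dominate the message rate while the cameras dominate the bandwidth, which is why "load" has to be described along more than one dimension.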

Maintainability

Maintainability consists of three sub-goals:

  • Operability: The ease of keeping the system running, such as being able to troubleshoot issues easily and receiving notifications when there are problems with the system.
  • Simplicity: The ease of understanding the system, such as being able to quickly learn and work with the system as a new process engineer.
  • Evolvability: The ease of making changes to the system, such as being able to add new sensors or data sources without disrupting the system.
Note
Whenever we discuss concrete broker features from now on, we will highlight in italic text whether the feature belongs to Reliability, Scalability, or Maintainability.

We will write another article that dives deeper into these requirements. For now, we can use them to derive our system architecture (which will very likely have a message broker in the middle of it, as you would otherwise not have clicked on this article).

Message brokers and MQTT

Once you have considered the requirements for your overall IT / OT architecture, you can proceed to the next step: designing your system architecture.

System architecture

The book "Designing Data-Intensive Applications" describes four potential building blocks from which a data-intensive application is built:

  1. Long-lived databases to store data
  2. Short-lived caches to speed up expensive operations
  3. Stream processing blocks to continuously process and share data
  4. Batch processing blocks to periodically process batches of data

To connect these building blocks, there are three common architecture approaches:

  1. Dataflow through databases
  2. Dataflow through service calls
  3. Dataflow through asynchronous message passing
Option 1: Dataflow through databases. No real-time stream processing possible. The database could be something like a Historian.
Option 2: Dataflow through service calls. Can cause spaghetti diagrams when scaled up and not properly documented.

The third approach, "asynchronous message passing", introduces a fifth building block: the message broker. This approach is sometimes also called “Pub/Sub” or “Unified Namespace”.

Option 3: Dataflow through asynchronous message passing. Introduces a message broker such as Apache Kafka, HiveMQ, or RabbitMQ.

In manufacturing applications, which often run for 10-20 years, it is important to have the ability to easily plug in new components or remove existing ones. This helps prevent spaghetti diagrams and allows for real-time data processing. For these reasons, using a message broker or “Unified Namespace” is often the best choice.
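
The plug-in idea can be illustrated with a minimal in-memory publish/subscribe sketch in Python. It is not a real MQTT broker: `InMemoryBroker`, the topic names, and the payloads are hypothetical, and `topic_matches` is a simplified version of MQTT-style topic filters (`+` for one level, `#` for all remaining levels) without validation or special handling of `$` topics.

```python
from typing import Callable, List, Tuple

Handler = Callable[[str, str], None]


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT-style topic filter matches a topic.

    '+' matches exactly one level, '#' matches all remaining levels.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


class InMemoryBroker:
    """Toy broker: publishers and subscribers only know topics, not each other."""

    def __init__(self) -> None:
        self._subscriptions: List[Tuple[str, Handler]] = []

    def subscribe(self, topic_filter: str, handler: Handler) -> None:
        self._subscriptions.append((topic_filter, handler))

    def publish(self, topic: str, payload: str) -> int:
        """Deliver to every matching subscriber; return how many received it."""
        delivered = 0
        for topic_filter, handler in self._subscriptions:
            if topic_matches(topic_filter, topic):
                handler(topic, payload)
                delivered += 1
        return delivered


broker = InMemoryBroker()
historian: List[str] = []
alerts: List[str] = []

broker.subscribe("plant/#", lambda t, p: historian.append(f"{t}={p}"))
broker.publish("plant/line1/temperature", "71.5")

# A new consumer is plugged in later without touching any publisher.
broker.subscribe("plant/+/temperature", lambda t, p: alerts.append(p))
broker.publish("plant/line2/temperature", "90.1")
broker.publish("plant/line2/pressure", "3.2")

# Publishing works even when nobody listens to the topic.
unheard = broker.publish("office/temperature", "21.0")

print(len(historian), alerts, unheard)  # prints: 3 ['90.1'] 0
```

The publisher code never changes when the second subscriber is added, which is the property that keeps long-lived architectures from turning into point-to-point spaghetti.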

However, looking back at our requirements, this is a tradeoff between Maintainability (simplicity) and Maintainability (evolvability). You add a new component to the overall stack, which increases the likelihood of a failure, but you gain the flexibility to exchange individual building blocks.

In enterprise IT, there are many different message brokers to choose from, such as Apache Kafka, RabbitMQ, NATS, and more. They are designed to be the backbone for large companies and typically run in the cloud or in server farms.
MQTT, which you as a reader very likely already know, is rarely mentioned when talking about message brokers. Why is that?

How does MQTT fit into the overall picture?

MQ Telemetry Transport (MQTT)

“MQTT (originally an initialism of MQ Telemetry Transport) is a lightweight, publish-subscribe, machine to machine network protocol for Message queue/Message queuing service. It is designed for connections with remote locations that have devices with resource constraints or limited network bandwidth.”
- Wikipedia

Compared to traditional IT message brokers, MQTT is designed for unreliable connections. One of the first use cases was in oil and gas, where it was used to monitor pipelines and retrieve telemetry data via unstable satellite connections. Today it is also used, for example, in connected cars and manufacturing. With this background in mind, the advantages and disadvantages of MQTT are easy to understand:

  • Pro: Works via unreliable connections
  • Pro: Can handle millions of connected devices and topics at the same time
  • Pro: Very simple protocol, which enables embedded use cases
  • Con: Not suited for stream processing (also called “data contextualization”, “data enrichment”, etc.). It guarantees only the delivery of messages, not their processing. This means it is not well suited to transport important information like “Start a new order”, as there is no guarantee that the PLC actually started production. This can be partially mitigated by using Kafka. See also our blog article

To summarize, we need an MQTT broker because we want to:

  • have a flexible system, where building blocks can be added and removed easily (“Unified Namespace”)
  • retrieve data from a lot of devices via unstable connections
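
The difference between delivery and processing can be sketched with a small simulation in Python. This is a deliberately simplified model of "resend until acknowledged" delivery over a flaky link, in the spirit of MQTT's at-least-once quality of service, not the real protocol state machine. `Receiver`, `publish_at_least_once`, and the link pattern are hypothetical names and values.

```python
from typing import Iterator, List


class Receiver:
    """Stands in for a subscriber such as a PLC gateway (hypothetical)."""

    def __init__(self) -> None:
        self.received: List[str] = []

    def on_message(self, payload: str) -> None:
        # Only receipt is recorded here. Whether the machine actually acts
        # on the message is outside of what the transport can guarantee.
        self.received.append(payload)


def publish_at_least_once(
    payload: str,
    receiver: Receiver,
    link_ok: Iterator[bool],
    max_attempts: int = 10,
) -> int:
    """Resend until an acknowledgement arrives; return the attempts used.

    link_ok yields one value per transmission (message or acknowledgement):
    True means it got through, False means it was lost. Once the iterator
    is exhausted, the link is assumed to be healthy.
    """
    for attempt in range(1, max_attempts + 1):
        if not next(link_ok, True):
            continue  # message lost on the way, try again
        receiver.on_message(payload)
        if next(link_ok, True):
            return attempt  # acknowledgement made it back
        # Acknowledgement lost: the sender cannot tell, so it resends.
    raise TimeoutError(f"no acknowledgement after {max_attempts} attempts")


plc = Receiver()
# Assumed link behaviour: message lost, then message delivered but
# acknowledgement lost, then both transmissions succeed.
link = iter([False, True, False, True, True])
attempts = publish_at_least_once("start order 4711", plc, link)

print(attempts, plc.received)
# prints: 3 ['start order 4711', 'start order 4711']
```

The sender ends up knowing that the message arrived, and the receiver may even see it twice, but nothing in this exchange says whether the order was actually started. That confirmation has to come from the application, for example as a separate status message.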

Now we can take a look at the reason why you clicked on the article: comparing different MQTT brokers!

MQTT Broker comparison

When looking at pure MQTT brokers, there are only a few to choose from:

  1. Mosquitto
  2. VerneMQ
  3. HiveMQ
  4. EMQx
  5. Others (traditional enterprise message brokers like Apache Kafka or Solace; MQTT (somewhat) compatible brokers like Azure IoT Hub or AWS IoT Core)
Side-fact
Interestingly, most of them originate from Europe: VerneMQ, HiveMQ, and Cedalo (Mosquitto) are all based in Europe.

Let’s go through them, broker by broker!

Mosquitto

Eclipse Mosquitto is an open source (EPL/EDL licensed) message broker that implements the MQTT protocol versions 5.0, 3.1.1, and 3.1. It is lightweight and suitable for use on all devices, from low power single board computers to full servers.

It is the default choice for many developers starting with MQTT and is often used in Internet of Things (IoT) and Industrial IoT (IIoT) projects.

However, it does not come with enterprise features such as high availability (Reliability) or clustering (Scalability) by default, which may make it less suitable for use in larger or more complex environments.

The new company Cedalo now offers enterprise support for the open-source version of Mosquitto as well as a more advanced version called "Mosquitto Pro" that includes high availability and a configuration panel with market-standard enterprise features such as RBAC and monitoring (Maintainability).

The Mosquitto Pro version includes an active/passive failover feature, which allows for a secondary instance to take over in the event that the primary instance fails. While this helps to ensure Reliability, it does not allow for horizontal scaling (Scalability). Cedalo is reportedly planning to introduce clustering in the future, which would allow messages to be sent to any broker and passed around, enabling more Scalability.
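
The general idea of active/passive failover can be sketched in a few lines of Python. This is a generic illustration under assumed names (`select_active`, "primary", "standby"), not a description of how Mosquitto Pro implements it.

```python
from typing import Dict, Optional, Sequence


def select_active(brokers: Sequence[str], healthy: Dict[str, bool]) -> Optional[str]:
    """Return the first healthy broker in priority order, or None if all are down."""
    for name in brokers:
        if healthy.get(name, False):
            return name
    return None


brokers = ["primary", "standby"]  # hypothetical instance names, in priority order

print(select_active(brokers, {"primary": True, "standby": True}))    # prints: primary
print(select_active(brokers, {"primary": False, "standby": True}))   # prints: standby
print(select_active(brokers, {"primary": False, "standby": False}))  # prints: None
```

At any point in time exactly one instance carries all traffic. That improves availability, but the total capacity is still that of a single broker, which is why failover alone does not provide horizontal scaling.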

When we first learned about Cedalo's plans to offer a "pro" version of the open-source Mosquitto broker, we were sceptical. Developing reliable distributed systems is highly complex and we were concerned that Cedalo had simply scrapped the Mosquitto codebase and started from scratch (Reliability).

However, upon further investigation, we were relieved to learn that Cedalo is still leveraging the open-source version of Mosquitto and has simply added some features, such as active/passive failover, to make it more suitable for enterprise use. We were also encouraged to see that Cedalo is offering support for the open-source version (Maintainability), given that the developer of Mosquitto, Roger Light, is now working at their company.

While we are still looking closely at the future of this enterprise offering and whether it will be a viable option for companies, we are encouraged by the technical promise of the Mosquitto Pro version and the fact that Cedalo is providing support for the open-source version. It's also worth noting that Cedalo is VC funded, which may be a factor to consider when evaluating the long-term stability of the company.

According to Stefan Loelkes, co-founder of Cedalo, the open-source version of Mosquitto is responsible for approximately 80% of the worldwide MQTT traffic and is therefore a highly reliable and battle-tested option. Cedalo is currently keeping the open-source version as it is and using approaches such as sharding/partitioning, active/passive failover, and a configuration panel to allow enterprises to scale out with Mosquitto.
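
Sharding or partitioning, in general terms, means assigning each data source to exactly one of several independent brokers. The following Python sketch shows one common way to do this with a hash. The function name `shard_for`, the broker names, and the choice of key are assumptions for illustration and do not describe Cedalo's actual approach.

```python
import hashlib
from typing import Sequence


def shard_for(key: str, brokers: Sequence[str]) -> str:
    """Deterministically map a key (e.g. a site or device ID) to one broker."""
    if not brokers:
        raise ValueError("at least one broker is required")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(brokers)
    return brokers[index]


# Hypothetical broker instances and device IDs.
brokers = ["broker-a", "broker-b", "broker-c"]
for device_id in ("press-01", "press-02", "oven-17"):
    print(device_id, "->", shard_for(device_id, brokers))
```

Each device always lands on the same broker, so the brokers do not need to talk to each other. The tradeoff is that a consumer interested in all data must connect to every shard, and with this simple modulo scheme changing the number of brokers reassigns many keys.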

VerneMQ

VerneMQ is a high-performance, distributed MQTT broker. It scales horizontally and vertically on commodity hardware to support a high number of concurrent publishers and consumers while maintaining low latency and fault tolerance.

VerneMQ is the go-to broker (and currently the only option) for those seeking High Availability (Reliability) and Clustering (Scalability) in an open-source MQTT broker. We have extensively tested VerneMQ as part of the United Manufacturing Hub for the past three years and have found it to be a reliable and performant option.

One notable aspect of VerneMQ is its licensing model (Maintainability). It is open-source, which means that the source code is freely available for anyone to use, modify, and distribute. However, the company behind VerneMQ, Octavo Labs, has implemented a monetization model based on pre-built packages, such as Docker containers. These packages are only available for development environments and it is not permitted to use them in production environments without a paid subscription from Octavo Labs.

VerneMQ also uses LevelDB as its internal database, which has a reputation for being prone to corruption. This may be a concern for those planning to use VerneMQ on edge devices or in large-scale deployments (Reliability).

VerneMQ has been on the market since 2012, but the company behind it appears to be relatively small. This may indicate that VerneMQ has lower commercial adoption compared to other MQTT brokers (Maintainability).

Despite this, VerneMQ is the only viable open-source MQTT broker with clustering currently available.

HiveMQ

HiveMQ's MQTT broker makes it easy to move data to and from connected devices in an efficient, fast and reliable manner. We make it possible to build connected products that enable new digital businesses.

HiveMQ is a popular MQTT broker that has gained widespread adoption in enterprise environments, thanks to its extensive feature set and strong backing by investors (almost 50 million USD now!).

It offers High Availability Clustering, which allows for failover and seamless recovery in the event of a server outage. It also features automatic scalability, which enables the broker to adjust to changing workloads without requiring manual intervention (Reliability & Scalability).

Additionally, HiveMQ offers a range of security features, including TLS, authentication, and authorization, as well as integration with third party security systems (Maintainability).

HiveMQ is widely used in a variety of applications, including connected cars and manufacturing. Its comprehensive documentation suggests that it has been designed to account for a wide range of edge cases. The broker is written in Java and configured using XML files, which can give it the feel of legacy software. However, this may also be seen as a positive attribute for a critical part of an IT/OT infrastructure, as it can be seen as a battle-proven solution (Reliability).

Note that the open-source community edition of HiveMQ does not offer clustering or High Availability, which require an enterprise license.

"When companies need more than what an open source broker can provide - enterprise-grade features like the plug-and-play integration with enterprise and cloud platforms, reliable scalability to millions of connections, or observability into their MQTT messages - that’s when HiveMQ becomes the clear choice. Over 130 customers, including many Fortune 500, trust the HiveMQ MQTT Platform to move data for business critical use cases in connected cars, logistics, connected products and Industry 4.0.”

- Kudzai Manditereza, Developer Advocate, HiveMQ

EMQ

Connect, move, process, and analyze your IoT data in real-time from edge to cloud to multi-cloud.

EMQx is also an open-source broker and is used by, for example, AWS.

This is a difficult topic (but at the same time a really funny one!), so we will try to keep it short:

We did not compare its technical features, as we struggled to trust the company behind it. At first glance, it looks like a typical US startup. Then you realize that it actually comes from China and that the “offices around the world” listed on their website are actually mailbox companies in office complexes or in Swedish single-family homes (2022-11-25).

You see a picture of “EMQ's global R&D center” (quote from their website), Mazarinvägen 36 Sköndal, Stockholm, Sweden.

This mismatch between what they say and the reality eroded our trust. But check it out and decide for yourself! If they decide to take down these statements after seeing this article, we are happy to provide screenshots :)

Other brokers

Other brokers, such as NATS, RabbitMQ and Apache Kafka, are all good options (and often necessary) for environments where data needs to be heavily processed, but is already "safe" (typically in a server farm).

In our experience, Apache Kafka has the largest enterprise adoption, but others could also be feasible.

For more information on scalable data processing in Industrial IoT, check out our blog article Tools and Techniques for Scalable Data Processing in Industrial IoT

Summary

In this article, we have compared four popular MQTT brokers: Eclipse Mosquitto, VerneMQ, EMQx and HiveMQ.

When considering which broker to use, it's important to consider factors such as scalability, reliability, and maintainability:

  • Mosquitto is a lightweight, open-source broker that is popular for use in IoT and IIoT projects. However, the open-source version does not provide High Availability or Clustering by default. High Availability in the form of active/passive failover is only available in the "pro" version offered by Cedalo.
  • VerneMQ is a high-performance, distributed broker that offers High Availability in the open-source version, but may not be suitable for use on edge devices due to its use of LevelDB as an internal database.
  • HiveMQ is a broker with a high level of commercial adoption and offers a range of features for enterprise use, including High Availability Clustering and automatic scalability. However, clustering and High Availability are not available in the open-source community edition.
  • EMQx is out because, in our opinion, the company's statements at the time we wrote this article were quite shady.

We hope this article has been helpful in your search for the right MQTT broker for your needs.

If you have any questions or want to join the conversation, don't hesitate to join our Discord channel.