Building Production-Grade MQTT Infrastructure for Industrial IoT
MQTT has become the standard messaging protocol for industrial IoT, yet running it reliably at factory scale remains surprisingly difficult. This post covers the architecture patterns we have battle-tested across multiple sites with hundreds of PLCs, sensors, and edge gateways.
Why Standard MQTT Setups Fail in Industry
Consumer-grade broker configurations assume stable networks and well-behaved clients. Industrial environments deliver the opposite: high latency, frequent disconnections, and devices that publish at irregular intervals during shift changes or maintenance windows.
We learned early that message loss is rarely the broker’s fault. It is almost always caused by clients that reconnect aggressively, retrying in lockstep the moment a link drops, or by queues that grow without bound when the WAN link flaps.
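Both failure modes can be handled on the client or edge side. The following is a minimal sketch, not our production code: `backoff_delays` and `BoundedBuffer` are hypothetical names, and the base delay, cap, and buffer size are assumed values you would tune per site. It shows capped exponential backoff with full jitter, so that clients do not all reconnect at the same instant, and a fixed-size buffer that drops the oldest message instead of growing forever.

```python
import random
from collections import deque


def backoff_delays(base=1.0, cap=60.0, attempts=8, rng=None):
    """Yield reconnect delays in seconds: capped exponential backoff with full jitter.

    base, cap and attempts are illustrative defaults, not recommendations.
    """
    if rng is None:
        rng = random.Random()
    for attempt in range(attempts):
        # The ceiling doubles on each attempt until it reaches the cap.
        ceiling = min(cap, base * (2 ** attempt))
        # Full jitter: pick a random point below the ceiling so that
        # many clients do not retry at the same moment.
        yield rng.uniform(0.0, ceiling)


class BoundedBuffer:
    """Fixed-size message buffer that evicts the oldest entry when full."""

    def __init__(self, max_messages):
        self._queue = deque(maxlen=max_messages)
        self.dropped = 0  # number of messages evicted so far

    def push(self, message):
        if len(self._queue) == self._queue.maxlen:
            # deque(maxlen=...) discards the oldest item on append;
            # count it so the loss is visible to monitoring.
            self.dropped += 1
        self._queue.append(message)

    def drain(self):
        """Yield buffered messages oldest-first, emptying the buffer."""
        while self._queue:
            yield self._queue.popleft()
```

A reconnect loop would sleep for each yielded delay before the next attempt, and an edge gateway would call `drain()` once the uplink is back. Whether dropping the oldest or the newest message is the right policy depends on the data: for telemetry the latest reading usually matters most, which is the assumption made here.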
Production Architecture We Deploy
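- Redundant broker cluster (EMQX, or Mosquitto behind an external high-availability layer, since open-source Mosquitto has no built-in clustering) with shared persistent storage on NVMe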
- Mutual TLS everywhere with short-lived client certificates rotated automatically
- Local edge brokers (one per production cell) that buffer during outages
- Strict client ID naming convention that encodes site, line, and device type
- Real-time monitoring of inflight messages and session queue depth
This combination has allowed us to maintain delivery rates above 99.9%, even on sites with notoriously unreliable network infrastructure.
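To make the client ID convention from the list above concrete, here is a small sketch of how such IDs could be built and validated. The exact format is an assumption: the post only says the ID encodes site, line, and device type, so the hyphen separator, the lowercase alphanumeric fields, and the trailing `serial` field (added to keep IDs unique per device) are all hypothetical, as are the names `ClientId`, `build_client_id`, and `parse_client_id`.

```python
import re
from typing import NamedTuple

# Assumed layout: <site>-<line>-<device_type>-<serial>, lowercase alphanumerics only.
CLIENT_ID_PATTERN = re.compile(
    r"^(?P<site>[a-z0-9]+)-(?P<line>[a-z0-9]+)-"
    r"(?P<device_type>[a-z0-9]+)-(?P<serial>[a-z0-9]+)$"
)


class ClientId(NamedTuple):
    site: str
    line: str
    device_type: str
    serial: str  # hypothetical field, not part of the convention described in the post


def build_client_id(site, line, device_type, serial):
    """Join the fields into a client ID and verify it matches the convention."""
    raw = "-".join([site, line, device_type, serial])
    parse_client_id(raw)  # raises ValueError if any field is malformed
    return raw


def parse_client_id(raw):
    """Split a client ID into its fields, rejecting IDs that break the convention."""
    match = CLIENT_ID_PATTERN.match(raw)
    if match is None:
        raise ValueError(f"client ID does not follow the naming convention: {raw!r}")
    return ClientId(**match.groupdict())
```

For example, `parse_client_id("muc1-line04-plc-0017")` yields a `ClientId` with `site="muc1"` and `device_type="plc"`, which can then drive access rules or dashboard grouping. Check the length and character set your broker accepts before settling on a format: MQTT 3.1.1 only obliges servers to accept IDs of up to 23 alphanumeric characters, although many brokers allow more.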