Running Lightweight ML Models at the Industrial Edge with...

Cloud-based analytics are valuable, but many industrial use cases require decisions in under 50 milliseconds with no dependency on internet connectivity. Running models at the edge solves both problems.

Our Preferred Stack

We standardize on ONNX Runtime because it offers excellent performance on both ARM and x86 edge hardware while keeping model size reasonable. A well-optimized vision model for defect detection can fit in under 80 MB and still deliver 30+ FPS on modest Jetson or industrial PC hardware.
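
To make the latency and FPS figures concrete, here is a minimal Python sketch of loading a model with ONNX Runtime and timing it on the target device. The file name defect_detector.onnx, the 1x3x224x224 float32 input, and the helpers load_session and measure_latency_ms are assumptions for illustration, not our production code. The inference part needs the onnxruntime and numpy packages installed.

```python
import time
from statistics import median


def load_session(model_path):
    """Create an ONNX Runtime session on the CPU (hypothetical helper)."""
    # Imported lazily so the timing helper below works without onnxruntime.
    import onnxruntime as ort

    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])


def measure_latency_ms(infer, frame, warmup=3, runs=20):
    """Return the median latency in milliseconds of calling infer(frame)."""
    for _ in range(warmup):
        infer(frame)  # warm-up calls are not timed
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        infer(frame)
        samples.append((time.perf_counter() - start) * 1000.0)
    return median(samples)


def main(model_path="defect_detector.onnx"):  # hypothetical file name
    import numpy as np

    session = load_session(model_path)
    input_meta = session.get_inputs()[0]
    # Assumed input layout: one 224x224 RGB image in NCHW float32.
    frame = np.zeros((1, 3, 224, 224), dtype=np.float32)

    def infer(x):
        return session.run(None, {input_meta.name: x})

    latency = measure_latency_ms(infer, frame)
    print(f"median latency: {latency:.1f} ms (~{1000.0 / latency:.0f} FPS)")
```

Calling main() on the edge device prints the measured median latency, which can then be compared against the 50 millisecond budget. Timing on the actual hardware matters because results from a development workstation rarely transfer to a Jetson or an industrial PC.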

Around the runtime, we rely on a few practices:
Quantization to INT8 where accuracy permits
Strict input validation and fallback rules when confidence is low
Local result buffering and eventual forwarding to the plant historian
Model versioning and A/B testing without touching the production line
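
Input validation, the low-confidence fallback, and local buffering from the list above can be sketched in a few lines of Python using only the standard library. Everything named here is an assumption for illustration: the 0.80 confidence threshold, the expected 3x224x224 input normalized to [0, 1], the "manual_review" fallback label, the ResultBuffer class, and the send callable that stands in for whatever client actually writes to the plant historian.

```python
import json
import sqlite3
import time

CONFIDENCE_THRESHOLD = 0.80  # assumed value; tune per line and per model
EXPECTED_SHAPE = (3, 224, 224)  # assumed channels, height, width


def validate_frame(shape, min_value, max_value):
    """Reject frames whose shape or value range the model does not expect."""
    if tuple(shape) != EXPECTED_SHAPE:
        return False
    return 0.0 <= min_value and max_value <= 1.0


def decide(label, confidence):
    """Apply the fallback rule: low-confidence results go to manual review."""
    if confidence < CONFIDENCE_THRESHOLD:
        return "manual_review"
    return label


class ResultBuffer:
    """Store results locally until the historian is reachable."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS results "
            "(id INTEGER PRIMARY KEY, ts REAL, payload TEXT)"
        )

    def add(self, result):
        self.db.execute(
            "INSERT INTO results (ts, payload) VALUES (?, ?)",
            (time.time(), json.dumps(result)),
        )
        self.db.commit()

    def flush(self, send):
        """Forward buffered rows in order; keep any row that fails to send."""
        rows = self.db.execute(
            "SELECT id, payload FROM results ORDER BY id"
        ).fetchall()
        sent = 0
        for row_id, payload in rows:
            try:
                send(json.loads(payload))
            except OSError:
                break  # historian unreachable; retry on the next flush
            self.db.execute("DELETE FROM results WHERE id = ?", (row_id,))
            sent += 1
        self.db.commit()
        return sent
```

In a real deployment the buffer would point at a file on local disk rather than ":memory:", so that results survive a restart. A periodic task would call flush with the historian client, and rows are deleted only after a successful send, so a network outage delays forwarding without losing results.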

The results have been dramatic: one automotive supplier reduced false positives on their vision inspection system by 87% after moving the model from the cloud to the edge.
