edgeX
Edge-first infrastructure & platform — https://edgex.it.com

Latency optimizations for edge inference

How to reduce end-to-end latency for on-device ML: batching, model quantization, and near-source routing.

The full article and reproducible artifacts are available in the docs and repo.
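Of the techniques listed above, model quantization is the simplest to illustrate in isolation. The following is a minimal sketch of symmetric post-training int8 quantization using plain Python lists rather than a real tensor library; the helper names are hypothetical, not part of the edgeX codebase.

```python
def quantize_int8(weights):
    """Symmetrically quantize float weights to int8 with one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.52, -1.30, 0.07, 0.98]
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
# int8 storage is 4x smaller than float32; per-weight error is at most scale/2
```

The latency win comes from moving 4x less data and using integer arithmetic on the device; the cost is a bounded rounding error per weight, which is why quantized models are usually validated against a held-out accuracy target.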