EDGE AI

On-Device Neural Inference

Compressed neural models running on embedded compute hardware — detecting, classifying, and tracking at mission-relevant latency without any uplink requirement. Optimized for the SWaP constraints of tactical unmanned platforms.

< 50 ms

End-to-end inference latency

< 15 W

Total compute envelope

INT8

Quantization target

3

Supported hardware targets

INFERENCE PIPELINE

Model compression to hardware deployment.

The inference stack processes neural models through a deterministic optimization pipeline before deployment — ensuring each model's latency and power budget is characterized before it reaches a platform.

Base Model

FP32 PyTorch / ONNX model from training pipeline

Compression

Structured pruning at 40% sparsity, INT8 post-training quantization

Characterization

Latency, power draw, and accuracy measured on target hardware

Runtime Deployment

TensorRT / HailoRT / custom FPGA bitstream — hardware-specific runtime

HARDWARE TARGETS

Three compute substrates. Different SWaP points.

Jetson Orin NX

NVIDIA embedded GPU / NPU

AI Performance 70–275 TOPS
TDP 10–25 W (configurable)
Inference framework TensorRT 10
Best for Multi-modal fusion

Hailo-8

Purpose-built AI inference chip

AI Performance 26 TOPS
TDP 2.5 W (full load)
Inference framework HailoRT SDK
Best for Micro-UAS / tight SWaP

Custom FPGA

AMD Zynq UltraScale+ / programmable logic

Latency Deterministic
TDP 3–8 W (design-dependent)
Inference framework FINN / Vitis AI / custom HLS
Best for Hard real-time / SIL-3 latency guarantee

LATENCY COMPARISON

Object detection inference time by hardware target.

YOLOv8n INT8 — 640×640 input — single-frame inference

10 ms 20 ms 30 ms FPGA ~9 ms Hailo-8 ~14 ms Jetson Orin ~22 ms 50ms budget

ENGAGE

Characterizing edge compute for your platform's SWaP budget?

Latency, power draw, and accuracy are interlinked — and they depend on your specific sensor resolution, object classes, and frame rate requirements. Hardware target selection is a program-specific decision. We walk through the tradeoff space in a structured technical briefing. Kestrelsense does not build weapons systems; we build the inference substrate that informs them.

Request Technical Briefing