Raspberry Pi AI Kit (Hailo-8L)
The Raspberry Pi AI Kit bundles a Hailo-8L M.2 AI accelerator, rated at 13 TOPS, with an M.2 HAT+ board for the Raspberry Pi 5. It turns a Pi 5 into an inference machine that runs YOLOv8 and MobileNet at 30+ FPS, and it integrates natively with rpicam-apps for plug-and-play camera inference pipelines.
The best way to add serious AI acceleration to a Raspberry Pi 5. Skip it if you need training capability or more than 13 TOPS.
Pros
- 13 TOPS inference performance — 3x the Coral USB Accelerator's 4 TOPS
- Native Pi 5 integration via PCIe M.2 — lower latency than USB accelerators
- Supports YOLO v8, MobileNet, EfficientNet, and custom ONNX models
- rpicam-apps integration enables plug-and-play camera inference pipelines
- Backed by Raspberry Pi's official support and documentation
Cons
- Requires a Raspberry Pi 5 — does not work with Pi 4 or earlier models
- Occupies the Pi 5's single PCIe slot — cannot use NVMe SSD simultaneously without a multiplexer
- Inference only — no on-device model training capability
- Hailo model conversion toolchain is less mature than NVIDIA TensorRT
- 13 TOPS is well below the Jetson Orin Nano's 40 TOPS for demanding models
13 TOPS Performance in Practice
The Hailo-8L's 13 TOPS of INT8 inference performance translates to real-world model throughput that significantly exceeds USB-based accelerators. Running YOLO v8 nano for object detection, the AI Kit achieves 30-40 FPS at 640x640 input resolution — fast enough for real-time video processing.
More complex models scale predictably. EfficientDet-Lite achieves 15-25 FPS depending on input size. Multi-model pipelines (detection plus classification) run at 10-15 FPS. The PCIe 2.0 x1 connection to the Pi 5 provides consistent low-latency data transfer without the overhead and jitter of USB.
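To put the frame rates and the PCIe link in perspective, here is a rough back-of-the-envelope sketch. It assumes uncompressed 8-bit RGB input frames and treats about 500 MB/s as the ceiling of a PCIe 2.0 x1 link. Both are simplifications: protocol overhead and the output tensors travelling back to the Pi are ignored. The function names are illustrative only.

```python
# Rough budget for a single camera stream feeding the accelerator.
# Assumption: ~500 MB/s is the theoretical ceiling of a PCIe 2.0 x1 link.
PCIE_GEN2_X1_MBPS = 500.0


def frame_time_ms(fps: float) -> float:
    """Time available to process one frame at the given frame rate."""
    return 1000.0 / fps


def stream_bandwidth_mbps(width: int, height: int, fps: float, channels: int = 3) -> float:
    """Bandwidth of uncompressed 8-bit frames in MB/s (1 MB = 1e6 bytes)."""
    return width * height * channels * fps / 1e6


for fps in (30, 40):
    bandwidth = stream_bandwidth_mbps(640, 640, fps)
    share = 100.0 * bandwidth / PCIE_GEN2_X1_MBPS
    print(f"{fps} FPS: {frame_time_ms(fps):.1f} ms per frame, "
          f"{bandwidth:.1f} MB/s of input ({share:.1f}% of the link)")
```

Under these assumptions a single 640x640 stream uses well under a tenth of the link, which is consistent with the bus only becoming a concern for several high-resolution streams at once.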
Compared to the Coral USB Accelerator's 4 TOPS, the Hailo-8L runs the same models roughly 3x faster. Compared to the Jetson Orin Nano's 40 TOPS, the AI Kit handles simpler models competitively but falls behind on larger architectures like YOLO v8 medium or large.
Power efficiency is a key advantage. The Hailo-8L draws 1-2.5W under load, delivering roughly 5-8 TOPS per watt. The Coral USB is less efficient at 2-3 TOPS per watt, and the Jetson Orin Nano delivers 3-5 TOPS per watt at 7-15W total draw. For always-on applications like home security NVRs, where the system runs 24/7, the AI Kit's low power draw adds up to lower electricity use over a year compared to the Jetson.
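The efficiency and running-cost arithmetic can be sketched as follows. The TOPS and wattage figures are the ones quoted in this article, and efficiency is taken at the upper end of each power range, so it is a conservative lower bound. The system-level wattages (10W for a Pi 5 plus AI Kit, 15W for the Jetson) are assumed worst cases, and `price_per_kwh` is a hypothetical placeholder you should replace with your own tariff.

```python
HOURS_PER_YEAR = 24 * 365


def tops_per_watt(tops: float, watts: float) -> float:
    """Efficiency at a given power draw."""
    return tops / watts


def annual_kwh(watts: float) -> float:
    """Energy used in a year of continuous operation."""
    return watts * HOURS_PER_YEAR / 1000.0


# (TOPS, peak watts) as quoted in the article.
accelerators = {
    "Hailo-8L": (13, 2.5),
    "Coral USB": (4, 2.0),
    "Jetson Orin Nano": (40, 15.0),
}
for name, (tops, watts) in accelerators.items():
    print(f"{name}: at least {tops_per_watt(tops, watts):.1f} TOPS/W at peak draw")

# Assumed worst-case whole-system draw for a 24/7 deployment.
pi5_with_ai_kit_w = 10.0
jetson_w = 15.0
price_per_kwh = 0.30  # hypothetical tariff, in your local currency

saved_kwh = annual_kwh(jetson_w) - annual_kwh(pi5_with_ai_kit_w)
print(f"Energy saved per year: {saved_kwh:.1f} kWh "
      f"(about {saved_kwh * price_per_kwh:.2f} at the assumed tariff)")
```

The absolute saving depends heavily on how close each system actually runs to its peak draw, so treat the result as an order-of-magnitude estimate.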
Pi 5 Integration and the PCIe Trade-off
The AI Kit connects to the Pi 5's single PCIe FPC connector through the included M.2 HAT+, which provides the M.2 M-Key slot for the Hailo-8L. Installation is straightforward: attach the HAT+ to the Pi 5's GPIO header and PCIe FPC connector, insert the Hailo-8L M.2 module, and install the Hailo runtime from Raspberry Pi's apt repository.
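Once the runtime is installed, it is worth confirming that the module is actually visible before debugging anything else. The sketch below assumes the HailoRT command-line tool `hailortcli` is on the PATH and that its `fw-control identify` subcommand reports the attached device. The helper name `identify_hailo` is hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def identify_hailo(executable: str = "hailortcli") -> Optional[str]:
    """Return the output of `<executable> fw-control identify`, or None on failure."""
    path = shutil.which(executable)
    if path is None:
        return None  # runtime not installed, or not on PATH
    try:
        result = subprocess.run(
            [path, "fw-control", "identify"],
            capture_output=True, text=True, timeout=30,
        )
    except (subprocess.TimeoutExpired, OSError):
        return None  # tool could not be run to completion
    if result.returncode != 0:
        return None  # tool ran, but no device answered
    return result.stdout


info = identify_hailo()
print(info if info else "No Hailo device detected; check the FPC ribbon cable and the runtime install.")
```

If this returns nothing on a freshly assembled kit, the usual suspects are a loose PCIe ribbon cable or a reboot that has not happened since the runtime was installed.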
The trade-off is that the Pi 5 has exactly one PCIe lane. Using the AI Kit means you cannot simultaneously use an NVMe SSD in the same slot. For projects needing both fast storage and AI inference, you would need a PCIe switch/multiplexer board, which adds cost and complexity. Alternatively, a fast USB 3.0 SSD provides reasonable storage speeds while the PCIe slot serves the Hailo-8L.
The rpicam-apps framework from Raspberry Pi provides built-in Hailo inference stages. A single command can start a camera preview with real-time YOLO v8 detection overlays, including bounding boxes and confidence scores. This dramatically reduces the code needed for common vision tasks.
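As an illustration of the single-command workflow, the sketch below assembles an `rpicam-hello` invocation that loads a Hailo post-processing configuration. The `--post-process-file` and `-t` options are standard rpicam-apps options. The assets directory and the `hailo_yolov8_inference.json` file name are assumptions based on the current Raspberry Pi OS packaging and may differ on your install. The script only prints the command so you can inspect it before running it on the Pi.

```python
import shlex

# Assumed install location of the post-processing configs; adjust if your
# rpicam-apps package places them elsewhere.
ASSETS_DIR = "/usr/share/rpi-camera-assets"


def detection_preview_command(config_name: str = "hailo_yolov8_inference.json",
                              timeout_ms: int = 0) -> list[str]:
    """Build the argument list for a camera preview with detection overlays."""
    return [
        "rpicam-hello",
        "-t", str(timeout_ms),  # 0 keeps the preview running until interrupted
        "--post-process-file", f"{ASSETS_DIR}/{config_name}",
    ]


print(shlex.join(detection_preview_command()))
```

Keeping the command as a list makes it easy to pass to `subprocess.run` later, or to swap in a different post-processing JSON for another model.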
Hailo-8L vs Coral vs Jetson
Three AI accelerators dominate the edge inference market, and the choice comes down to TOPS, power, framework flexibility, and what hardware you already own.
The Hailo-8L in the Pi AI Kit delivers 13 TOPS of INT8 inference at 1-2.5W. It connects via PCIe to the Pi 5, giving it lower latency and higher bandwidth than USB-attached alternatives. The Hailo Dataflow Compiler accepts ONNX, TensorFlow, and TFLite models, converting them to optimized Hailo binaries. The supported model zoo covers YOLO v5/v7/v8, EfficientNet, MobileNet, ResNet, and SSD architectures. Running YOLOv8-nano, the Hailo-8L achieves 30-40 FPS at 640x640 — fast enough for real-time single-camera video processing.
The Google Coral USB Accelerator provides 4 TOPS via its Edge TPU at under 2W, connected over USB 3.0. Its advantage is universal compatibility — it plugs into any Linux computer, Raspberry Pi 3/4/5, or even macOS. The limitation is strict TFLite-only model support: models must be quantized to INT8 and compiled with Google's Edge TPU Compiler. Custom operations fall back to CPU. For simple classification and detection with pre-built TFLite models, the Coral is the cheapest path to hardware-accelerated inference. For anything beyond the TFLite model zoo, the 4 TOPS ceiling and framework lock-in become constraints.
The NVIDIA Jetson Orin Nano sits at the top with 40 TOPS, 1024 CUDA cores, and 8GB LPDDR5 unified memory at 7-15W. It runs TensorFlow, PyTorch, ONNX, TensorRT, and custom CUDA kernels without framework restrictions. It handles multi-camera pipelines, on-device transfer learning, and models too large for Hailo or Coral. The trade-off is cost (3-4x the Pi AI Kit), power consumption (7-15W for the module versus 1-2.5W for the Hailo-8L), and complexity (it is a full Linux computer that you set up and maintain in its own right).
The Pi AI Kit's sweet spot is users who already own a Raspberry Pi 5 and want to add meaningful AI acceleration without buying a separate computer. The 13 TOPS performance handles most single-camera vision tasks. The native rpicam-apps integration means you can go from unboxing to real-time object detection in under an hour. If your model fits within 13 TOPS and one camera stream, the AI Kit delivers the best value per TOPS in the lineup.
Model Conversion and Developer Workflow
Getting a custom model running on the Hailo-8L requires converting it through the Hailo Dataflow Compiler (DFC), which runs on an x86 Linux workstation — not on the Pi 5 itself. The workflow is: train your model in TensorFlow or PyTorch on a workstation, export to ONNX or TFLite, run it through the DFC to produce a Hailo Executable Format (.hef) file, then copy the .hef to the Pi 5 for inference.
The DFC handles quantization, layer fusion, and memory scheduling automatically. It supports INT8 and INT16 quantization with a calibration dataset for accuracy optimization. The compiler's model zoo includes pre-optimized .hef files for popular architectures: YOLOv5s, YOLOv7-tiny, YOLOv8n/s/m, EfficientDet, MobileNet v1/v2, ResNet-50, and SSD MobileNet. For most users, downloading a pre-compiled .hef from the model zoo and plugging it into rpicam-apps is the fastest path to working inference.
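To make the role of the calibration dataset concrete, here is a minimal sketch of generic affine INT8 quantization. It is not Hailo's actual algorithm, which the DFC does not expose in this form. It only illustrates why representative calibration samples matter: they set the value range that gets mapped onto the 256 available integer levels. All function names are illustrative.

```python
def calibrate(values):
    """Derive an affine INT8 scale and zero-point from calibration samples."""
    lo = min(min(values), 0.0)  # the range must contain 0 so it maps exactly
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(-128 - lo / scale)
    return scale, zero_point


def quantize(x, scale, zero_point):
    """Map a float to a signed 8-bit integer, clamping out-of-range values."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))


def dequantize(q, scale, zero_point):
    """Recover the approximate float value from its INT8 code."""
    return (q - zero_point) * scale


samples = [-1.0, 0.5, 3.0]  # stand-in for activations seen during calibration
scale, zero_point = calibrate(samples)
for x in samples + [10.0]:  # 10.0 lies outside the calibrated range
    q = quantize(x, scale, zero_point)
    print(f"{x:5.2f} -> {q:4d} -> {dequantize(q, scale, zero_point):.4f}")
```

Values inside the calibrated range round-trip with an error below one quantization step, while the out-of-range value saturates at the top code. That is why a calibration set that does not resemble real inputs costs accuracy.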
Compared to NVIDIA's TensorRT, the Hailo DFC is less mature but simpler for common models. TensorRT handles a wider range of model architectures and custom layers, supports FP16 and FP32 in addition to INT8, and runs on the target device itself. The Hailo DFC requires a separate compilation step on an x86 machine, which adds friction to the development loop but produces well-optimized binaries for the supported architectures. For standard vision models, the rpicam-apps integration on the Pi 5 provides a developer experience that is arguably smoother than the Jetson's — one command to start a camera preview with real-time detection overlays, versus configuring DeepStream pipelines.
Full Specifications
Processor
| Specification | Value |
|---|---|
| AI Accelerator | Hailo-8L (13 TOPS) [1] |
| AI Performance | 13 TOPS [1] |
| Host Requirement | Raspberry Pi 5 (sold separately) [1] |
I/O & Interfaces
| Specification | Value |
|---|---|
| Interface | M.2 HAT+ (PCIe 2.0 x1) [2] |
| Frameworks | TensorFlow Lite, ONNX, Hailo Model Zoo [2] |
| Camera Support | Uses Pi 5's MIPI CSI-2 cameras [2] |
Power
| Specification | Value |
|---|---|
| Input Voltage | Powered by Pi 5 [1] |
| Power Draw | ~3 W [1] |
Physical
| Specification | Value |
|---|---|
| Dimensions | Pi HAT+ form factor [2] |
| Form Factor | M.2 module + HAT+ adapter (stacks on Pi 5) [2] |
Who Should Buy This
13 TOPS runs YOLO v8 nano at 30+ FPS through rpicam-apps. The Pi Camera Module 3 connects via CSI for low-latency video. Pi 5 handles recording, notifications, and streaming while the Hailo-8L handles detection. Official Raspberry Pi support ensures long-term compatibility.
The Pi 5 has a single PCIe lane feeding the Hailo-8L at 500MB/s. Multi-camera streams at high resolution can saturate this bandwidth. The Jetson Orin Nano has 40 TOPS, 8GB unified memory, and handles 4+ camera streams natively.
Better alternative: NVIDIA Jetson Orin Nano Super Developer Kit (8GB)
The M.2 HAT+ stacks cleanly on a Pi 5. The Hailo-8L handles inference while the Pi 5 CPU remains free for application logic. rpicam-apps integration means detection overlays work with a few command-line arguments.
The AI Kit is Pi 5-specific. The Coral USB Accelerator works with any computer via USB 3.0 — Linux, macOS, Windows, even a Jetson. At 4 TOPS it is less powerful, but universally compatible.
Better alternative: Google Coral USB Accelerator
The Hailo-8L is inference-only with no training capability. Model training requires a GPU. The Jetson Orin Nano supports on-device training with CUDA and 8GB unified memory for transfer learning workflows.
Better alternative: NVIDIA Jetson Orin Nano Super Developer Kit (8GB)
Ecosystem & Community
The Pi AI Kit combines the massive Raspberry Pi ecosystem with Hailo's 13 TOPS accelerator. It integrates natively with rpicam-apps for camera inference pipelines and with Frigate NVR for home security AI. The Pi 5's PCIe bus lets the Hailo-8L run at its full 13 TOPS without USB bottlenecks.
What to Build First
Install Frigate on a Pi 5 with the AI Kit, connect IP cameras, and run real-time person/vehicle/animal detection on multiple camera feeds simultaneously. The Hailo-8L handles all inference while the Pi 5 manages recording, UI, and Home Assistant integration.
Tutorials & Resources
- Frigate NVR Documentation: Complete setup guide for AI-powered home security with Hailo acceleration (docs)
- Raspberry Pi AI Kit Documentation: Official installation guide and rpicam-apps integration (docs)
- Jeff Geerling: Raspberry Pi AI Kit Review: In-depth benchmarks of Hailo-8L performance and integration with Pi 5 (review)
- Hailo RPi5 Examples: Official example code for object detection, segmentation, and pose estimation (github)
Frequently Asked Questions
Does the Raspberry Pi AI Kit work with Raspberry Pi 4?
No. The AI Kit requires the Raspberry Pi 5's PCIe interface, which the Pi 4 does not have. The Coral USB Accelerator is the best option for adding AI to a Pi 4, as it connects via USB 3.0.
Raspberry Pi AI Kit vs Coral USB Accelerator?
The AI Kit delivers 13 TOPS via PCIe with lower latency. The Coral USB provides 4 TOPS via USB but works with any computer. The AI Kit is 3x faster but Pi 5-only. The Coral is universal but slower.
Can I use an NVMe SSD and the AI Kit at the same time?
Not in the standard configuration — both use the Pi 5's single M.2 PCIe slot. You would need a third-party PCIe multiplexer board, or use a USB 3.0 SSD for storage while the PCIe slot serves the Hailo-8L module.
What ML frameworks does the Hailo-8L support?
The Hailo-8L runs models converted through the Hailo Dataflow Compiler, which accepts ONNX, TensorFlow, and TFLite formats. The rpicam-apps integration provides pre-built pipelines for common models like YOLO v8 and MobileNet SSD.
Raspberry Pi AI Kit vs NVIDIA Jetson Orin Nano?
The AI Kit adds 13 TOPS to a Pi 5 at a fraction of the Jetson's cost. The Jetson provides 40 TOPS, 8GB unified RAM, CUDA support, and handles training. Choose the AI Kit for simple detection tasks; choose the Jetson for complex multi-model AI workloads.
How much power does the AI Kit add to Pi 5 consumption?
The Hailo-8L draws approximately 1-2.5W under inference load. Combined with the Pi 5's 4-7W, total system power is 5-10W. This is lower than the Jetson Orin Nano's 7-15W, though higher than the Coral Dev Board's 2-4W.
Can the AI Kit run large language models?
No. The Hailo-8L is designed for vision inference models (detection, classification, segmentation). LLMs require far more memory and compute than 13 TOPS and the Pi 5's RAM can provide. The Jetson Orin Nano can run small LLMs with its 8GB unified memory.