Hailo Support
Deploying to Hailo chips requires compiling ONNX models to the Hailo-ONNX format. Because of the nature and complexity of the compilation process, it is prohibitively complicated to automate in the cloud. Hence, we provide a workaround where the user compiles the model locally, then uploads the compiled Hailo-ONNX file to the Nx AI Cloud.
Compiling an ONNX model
Requirements
Python 3.8
Python environment
Hailo Dataflow Compiler and HailoRT Python API installed in that environment
A set of calibration images (similar to images used to train the model)
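The compilation example below expects the calibration images as a NumPy array. A minimal, non-authoritative sketch of preparing such an array from a folder of images is shown here; the folder name, input size, and preprocessing are assumptions that should match your model's training pipeline.
import glob
import numpy as np
from PIL import Image

# Assumed folder and input size -- match your model's preprocessing.
paths = sorted(glob.glob("calibration_images/*.jpg"))
images = [
    np.asarray(Image.open(p).convert("RGB").resize((416, 416)), dtype=np.float32)
    for p in paths
]
calib_data = np.stack(images)  # shape: (N, 416, 416, 3)
np.save("calibration_data.npy", calib_data)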
Example
In this example, we'll go over the compilation steps for a Yolov4-tiny model trained on the COCO dataset. Most of the instructions mentioned here apply to all kinds of ONNX models, though some require changes depending on the model.
To compile the model, you can run the Python script below after changing the ONNX path.
The code performs the following tasks:
it transpiles the ONNX model to another format optimized for Hailo,
it quantizes and optimizes the model using the supplied set of calibration images,
it compiles the model to a HEF and embeds it inside an ONNX file as an operator, while keeping the pre-processing and post-processing intact. Please note that this step returns an ONNX file that can have different input & output names and shapes.
Finally, a new metadata field named chip is injected into the ONNX file to record which Hailo chip the model was optimized for. This is needed by the Nx AI Cloud to determine the target chip of the model.
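For illustration, a minimal sketch of these steps using the Hailo Dataflow Compiler's Python API (hailo_sdk_client.ClientRunner) is shown below. This is not the official script: the file names, network name, and chip value are assumptions to adjust for your model, and the step that wraps the HEF inside an ONNX operator is left to the tooling rather than reproduced here.
import numpy as np
import onnx
from hailo_sdk_client import ClientRunner  # part of the Hailo Dataflow Compiler

# Assumed file names -- replace with your own.
ONNX_PATH = "yolov4_tiny.onnx"
CALIB_PATH = "calibration_data.npy"  # NumPy array of calibration images
HEF_PATH = "yolov4_tiny.hef"

# 1. Transpile the ONNX model into Hailo's internal representation.
runner = ClientRunner(hw_arch="hailo8")  # use "hailo8l" for a Hailo-8L chip
runner.translate_onnx_model(ONNX_PATH, "yolov4_tiny")

# 2. Quantize and optimize the model with the calibration images.
calib_data = np.load(CALIB_PATH)
runner.optimize(calib_data)

# 3. Compile the model to a HEF binary.
hef = runner.compile()
with open(HEF_PATH, "wb") as f:
    f.write(hef)

# 4. Embedding the HEF inside an ONNX file as an operator (keeping the
#    pre-processing and post-processing intact) is handled by the tooling
#    described above and is not reproduced here; assume it produced
#    "yolov4_tiny.hailo.onnx".

# 5. Inject the "chip" metadata field so the Nx AI Cloud knows the target chip.
wrapped = onnx.load("yolov4_tiny.hailo.onnx")
prop = wrapped.metadata_props.add()
prop.key = "chip"
prop.value = "hailo8"  # the exact value expected by the Nx AI Cloud may differ
onnx.save(wrapped, "yolov4_tiny.hailo.onnx")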
After all the aforementioned steps are executed, a new ONNX file is generated. This file needs to have its I/O metadata (names & shapes) adjusted. The Python script below is used for that purpose: it creates a new ONNX model with the adjusted inputs and outputs, making sure the generated ONNX conforms to Nx's ONNX requirements.
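As a non-authoritative sketch, such an adjustment can be done with the onnx Python package; the file names, the new input name, and the shape below are assumptions and should be replaced with the names and shapes Nx requires.
import onnx

model = onnx.load("yolov4_tiny.hailo.onnx")  # assumed input file

old_name = model.graph.input[0].name
new_name = "image_input"       # the input name you want to expose
new_shape = [1, 3, 416, 416]   # the input shape you want to expose

# Rename the graph input and overwrite its shape.
graph_input = model.graph.input[0]
graph_input.name = new_name
for dim, value in zip(graph_input.type.tensor_type.shape.dim, new_shape):
    dim.dim_value = value

# Update every node that still references the old input name.
for node in model.graph.node:
    for i, name in enumerate(node.input):
        if name == old_name:
            node.input[i] = new_name

# Outputs can be renamed and reshaped the same way via model.graph.output.
onnx.save(model, "yolov4_tiny.hailo.adjusted.onnx")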
Deploying on a machine with Hailo-8 or Hailo-8L chips
The first step is to verify that you have a compatible HailoRT driver installed. Please check this table to determine if your driver version is supported. For general Hailo driver installation, see here (you will need to register for the Hailo Dev Zone). For the Raspberry Pi AI HAT+, see here for install instructions. For the Raspberry Pi AI Kit, see here.
Next, install the Nx AI plugin by following these instructions.
If all is well, you should be able to select the Hailo runtime when enabling the Nx plugin as shown below:
After the installation is finished, the plugin interface will look something like this:
To manually verify that the Hailo runtime is downloaded and set up, check the contents of the bin folder of the AI Manager and make sure it contains these files:
libhailort.so.4.xx.0 (xx is the minor version of the library)
libonnxruntime_providers_hailo.so
libonnxruntime_providers_shared.so
libRuntimeLibrary.so
ubuntu@ThinkStation-P360-Tower:~$ ls /opt/networkoptix-metavms/mediaserver/bin/plugins/nxai_plugin/nxai_manager/bin/
installed_runtime.txt libhailort.so.4.17.0 libonnxruntime_providers_hailo.so libonnxruntime_providers_shared.so libRuntimeLibrary.so sclbld sclblmod
Finally, to deploy a model that can be accelerated on the Hailo chip, make sure that it has an application/x-onnx; device=hailo or application/x-onnx; device=hailo-8l model file (depending on the Hailo chip, Hailo-8 or Hailo-8L) in the Nx AI Cloud:
If that is not the case, you'll need to manually compile the ONNX model and upload the generated model to the cloud as illustrated in the example above.
Limitations
Number of parallel models
The Nx AI Manager offers the ability to run multiple AI models at the same time. This flexibility allows you to efficiently manage resources and optimize the performance of your AI applications. However, it's important to consider the limitations of the hardware you're using. Specifically, each Hailo chip, whether it's the Hailo-8 or the Hailo-8L, can run only one model at a given time. Therefore, the number of models you can run concurrently on a single machine directly corresponds to the number of Hailo chips installed in that machine. Failing to account for this limitation may cause the AI Manager to fail.
Monitoring
How to Enable Hailo Monitoring with hailortcli monitor
To monitor Hailo usage with the hailortcli monitor command, you need to set a specific environment variable. Follow these steps:
Edit the Media Server Service Configuration:
Add the following line to the /etc/systemd/system/networkoptix-metavms-mediaserver.service file to set the necessary environment variable:
Environment="HAILO_MONITOR=1"
The updated configuration file should look like this:
[Unit]
Description=Network Optix Media Server
After=network.target local-fs.target remote-fs.target
Requires=networkoptix-metavms-root-tool.service

[Service]
PermissionsStartOnly=true
ExecStartPre=/opt/networkoptix-metavms/mediaserver/lib/scripts/systemd_mediaserver_pre_start.sh
ExecStart=/opt/networkoptix-metavms/mediaserver/lib/scripts/systemd_mediaserver_start.sh
User=networkoptix-metavms
Group=networkoptix-metavms
Restart=always
TimeoutStopSec=120
KillMode=process
TasksMax=8191
LimitCORE=infinity
Environment="HAILO_MONITOR=1"

[Install]
WantedBy=multi-user.target
Restart the NX Media Server:
After updating the configuration file, restart the Network Optix Media Server for the changes to take effect. You can do this by running one of the following commands:
sudo systemctl restart networkoptix-metavms-mediaserver.service
or
sudo service networkoptix-metavms-mediaserver restart
Run hailortcli monitor:
HAILO_MONITOR=1 hailortcli monitor

PCIe descriptor page size error
If you encounter the following error (actual page size may vary), it indicates that your host does not support the specified PCIe descriptor page size:
[HailoRT] [error] CHECK_AS_EXPECTED failed - max_desc_page_size given 16384 is bigger than hw max desc page size 4096
This issue is common on ARM64 devices like the Raspberry Pi AI Kit. To resolve it, add the following configuration to /etc/modprobe.d/hailo_pci.conf. If the file does not exist, create it and set the max_desc_page_size to the value mentioned in the error (e.g., 4096):
options hailo_pci force_desc_page_size=4096
You can add this configuration by running the following command:
echo 'options hailo_pci force_desc_page_size=4096' | sudo tee -a /etc/modprobe.d/hailo_pci.conf
Reboot the machine for the changes to take effect. Alternatively, you can reload the driver without rebooting by executing these commands:
sudo modprobe -r hailo_pci
sudo modprobe hailo_pci
Experimental .ini setting
Explicitly setting multiple Nx AI runtime engines is controlled by an .ini file. This .ini file does not exist by default and must be created by the user.
Create an empty file by running:
sudo mkdir -p /home/networkoptix-metavms/.config/nx_ini
sudo touch /home/networkoptix-metavms/.config/nx_ini/nxai_plugin.ini
sudo chmod 666 /home/networkoptix-metavms/.config/nx_ini/nxai_plugin.ini
Then restart the mediaserver:
sudo service networkoptix-metavms-mediaserver restart
Once the mediaserver is restarted, the .ini file should be filled with defaults. Each setting should have a description in the .ini file.
Now, you can set multiple runtimes per model through the runtimesPerModel setting in the .ini file:
#enableOutput=false
# If enabled, the NxAI Plugin, NXAI Manager, and Inference Engine logs will be logged to the console. Default: false
#logToConsole=false
# If enabled, the NxAI Plugin will send frames to the AI Manager even when inactive. Default: false
#sendFramesOverride=false
# The path of the socket file the AI Manager listens on for messages. Default: "/tmp/nxai_manager.sock"
#AIManagerSocketPath="/tmp/nxai_manager.sock"
# The path of the socket file the plugin will listen on for messages. Default: "/tmp/nxai_plugin.sock"
#PluginSocketPath="/tmp/nxai_plugin.sock"
# The URL of the NXAI cloud. Default: "https://api.sclbl.nxvms.com"
#CloudAPI="https://api.sclbl.nxvms.com"
# The AI Manager will spawn this many runtimes per model. Default: 1
runtimesPerModel=2