Introduction to OAAX

The Open AI Accelerator Standard (OAAX) is an initiative to standardize the Application Programming Interfaces (APIs) and Application Binary Interfaces (ABIs) used to integrate hardware accelerators. It is primarily used when deploying AI models within applications, and it aims to make that integration uniform and efficient.

Within OAAX, deploying an AI model is treated as a two-stage pipeline. The first stage covers the steps needed to transform a trained AI model into an optimized format or binary. These steps may include, but are not limited to, validation, parsing, optimization, and conversion. The resulting optimized model is then passed to the second stage, which takes the produced model file and executes it on the designated hardware XPU. The first stage is performed by an "Operation Conversion Toolchain," while the second stage is carried out by an "XPU Operation Runtime."
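The two-stage split can be pictured with a minimal Python sketch. Everything in it is an illustrative assumption rather than part of OAAX: the names `OptimizedModel`, `conversion_toolchain`, and `xpu_runtime` are hypothetical, the "optimization" is a stand-in, and the "XPU" is simulated by returning the inputs unchanged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OptimizedModel:
    """Artifact handed from stage 1 to stage 2 (hypothetical shape)."""
    target_xpu: str
    payload: bytes


def conversion_toolchain(trained_model: bytes, target_xpu: str) -> OptimizedModel:
    """Stage 1 stand-in: validate, parse, optimize, and convert."""
    if not trained_model.strip():
        raise ValueError("empty model file")            # validation
    parsed = trained_model.strip()                      # parsing (stand-in)
    optimized = parsed                                  # optimization (no-op stand-in)
    payload = target_xpu.encode() + b":" + optimized    # conversion (stand-in)
    return OptimizedModel(target_xpu=target_xpu, payload=payload)


def xpu_runtime(model: OptimizedModel, inputs: list) -> list:
    """Stage 2 stand-in: 'execute' the optimized model (simulated as identity)."""
    if not model.payload:
        raise ValueError("optimized model is empty")
    return list(inputs)


# Stage 1 runs ahead of time; stage 2 runs on the device.
artifact = conversion_toolchain(b"  trained-model-bytes  ", "example-xpu")
outputs = xpu_runtime(artifact, [1.0, 2.0])
```

The point of the sketch is only the hand-off: the toolchain produces a single artifact, and the runtime consumes that artifact without repeating any of the conversion steps.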

Note that a conversion toolchain and a runtime are interdependent components within the OAAX framework. The artifacts generated by a conversion toolchain are intended exclusively for use by its corresponding runtime. This pairing ensures that the optimized model files produced by the conversion toolchain can be used directly by the target runtime for inference, without running heavy processing on the device that would consume large amounts of resources and time.
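One way to picture this pairing is a runtime that refuses artifacts from any toolchain other than its own. The sketch below is a hypothetical illustration, not an OAAX mechanism: the `toolchain_id` tag, the `Artifact` and `Runtime` classes, and the identifiers `vendor-a` and `vendor-b` are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    toolchain_id: str   # hypothetical tag written by the conversion toolchain
    payload: bytes      # optimized model bytes


class Runtime:
    """Hypothetical runtime that only accepts artifacts from its paired toolchain."""

    def __init__(self, paired_toolchain_id: str) -> None:
        self.paired_toolchain_id = paired_toolchain_id

    def load(self, artifact: Artifact) -> bytes:
        # Reject artifacts produced by a different toolchain.
        if artifact.toolchain_id != self.paired_toolchain_id:
            raise ValueError(
                f"artifact from {artifact.toolchain_id!r} cannot be loaded by "
                f"a runtime paired with {self.paired_toolchain_id!r}"
            )
        # No conversion work happens here: the payload is used as produced.
        return artifact.payload


runtime = Runtime(paired_toolchain_id="vendor-a")
loaded = runtime.load(Artifact(toolchain_id="vendor-a", payload=b"optimized"))
```

Loading an artifact tagged `vendor-b` into this runtime would raise `ValueError`, which mirrors the rule that artifacts are meant only for their corresponding runtime.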

In the context of the Nx toolkit, the conversion toolchain is expected to be a Docker container that communicates with an API that manages it, and the runtime is expected to be a C shared library that other software applications load dynamically as needed.
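As a rough illustration of what "dynamically loaded" means for the runtime, the sketch below uses Python's standard `ctypes` module to open a shared library at run time and check whether it exports a given symbol. The helper names `load_runtime` and `exports`, and any library path or symbol name passed to them, are assumptions for this example. They do not describe the interface an OAAX runtime actually exposes.

```python
import ctypes


def load_runtime(library_path: str) -> ctypes.CDLL:
    """Open a C shared library at run time (hypothetical helper)."""
    try:
        return ctypes.CDLL(library_path)
    except OSError as exc:
        # Raised when the file is missing or is not a loadable library.
        raise RuntimeError(f"could not load runtime library {library_path!r}") from exc


def exports(library: ctypes.CDLL, symbol: str) -> bool:
    """Return True if the loaded library exposes the named symbol."""
    # ctypes resolves symbols lazily; a missing one raises AttributeError,
    # which hasattr() turns into False.
    return hasattr(library, symbol)
```

A host application would call `load_runtime` with the path of the runtime built for its XPU, confirm the entry points it needs with `exports`, and only then call into the library. Because loading happens at run time, the same application can switch runtimes by changing the path rather than by being rebuilt.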
