ONNX (Open Neural Network Exchange) is an open format for representing machine learning models as portable computation graphs, enabling interoperability between frameworks such as PyTorch and TensorFlow. ONNX Runtime is Microsoft's high-performance inference engine that executes those graphs on CPUs, GPUs, mobile devices, and in the browser. The core value proposition is a clean separation between training and deployment: export once, then run on CUDA, TensorRT, DirectML, CoreML, QNN, or plain WebAssembly. The key mental model is that performance is layered: the model graph itself can be optimized (graph fusion, quantization, FP16 conversion), the execution provider determines the hardware backend, and session/run options fine-tune threading and memory. These three levers operate independently and can be combined.
What This Cheat Sheet Covers
This topic spans 19 focused tables and 139 indexed concepts. The table-by-table outline below runs from foundational concepts through advanced details.
Table 1: ONNX Model Format and Core Concepts
The ONNX format stores models as serialized Protocol Buffers messages, typically in `.onnx` files, with the schema defined in `onnx.proto`. Understanding how ModelProto, opsets, and domains fit together is essential before exporting or loading any model.
| Concept | Example | Description |
|---|---|---|
| ModelProto | `model = onnx.load("model.onnx"); print(model.ir_version)` | Top-level container holding the computation graph, opset imports, model version, and metadata. |
| GraphProto | `graph = model.graph; print(len(graph.node))` | Directed acyclic graph of NodeProtos; contains node, input, output, and initializer fields. |
| NodeProto | `node = graph.node[0]; print(node.op_type, node.input, node.output)` | Represents a single operator invocation; references input/output tensors by string name. |
| Initializer (TensorProto) | `init = graph.initializer[0]; arr = numpy_helper.to_array(init)` | Stores constant tensor values (weights/biases) embedded directly in the model. |
| ValueInfoProto | `vi = helper.make_tensor_value_info('X', TensorProto.FLOAT, [None, 3])` | Describes shape and dtype of a graph edge (input, output, or intermediate value). |