ONNX and ONNX Runtime Cheat Sheet

Updated 2026-05-21

ONNX (Open Neural Network Exchange) is an open format for representing machine learning models as portable computation graphs, enabling interoperability between frameworks like PyTorch and TensorFlow. ONNX Runtime is Microsoft's high-performance inference engine that executes those graphs across CPUs, GPUs, mobile, and browser runtimes. The core value proposition is a clean separation between training and deployment: export once, run anywhere, whether on CUDA, TensorRT, DirectML, CoreML, QNN, or plain WebAssembly. The key mental model is that performance is layered: the model graph itself can be optimized (graph fusion, quantization, FP16 conversion), the execution provider determines the hardware backend, and session/run options fine-tune threading and memory. These three levers operate independently and can be combined.

What This Cheat Sheet Covers

This topic spans 19 focused tables and 139 indexed concepts. The table-by-table outline below runs from foundational concepts through advanced details.

  • Table 1: ONNX Model Format and Core Concepts
  • Table 2: ONNX Opsets and Operator Versioning
  • Table 3: Exporting from PyTorch
  • Table 4: Exporting from TensorFlow / Keras (tf2onnx)
  • Table 5: Converting scikit-learn Models (sklearn-onnx / skl2onnx)
  • Table 6: ONNX Model Verification and Shape Inference
  • Table 7: Building ONNX Models Programmatically
  • Table 8: ONNX Runtime InferenceSession
  • Table 9: SessionOptions Configuration
  • Table 10: Graph Optimization Levels
  • Table 11: Execution Providers
  • Table 12: Execution Provider Configuration: CUDA and TensorRT
  • Table 13: IOBinding for Zero-Copy Inference
  • Table 14: Quantization
  • Table 15: FP16 and Mixed Precision Conversion
  • Table 16: Transformer Model Optimizer
  • Table 17: ONNX Runtime Web (Browser)
  • Table 18: ONNX Runtime Mobile and ORT Format
  • Table 19: Profiling and Performance Diagnosis

Table 1: ONNX Model Format and Core Concepts

The ONNX format stores models as serialized Protocol Buffers messages, conventionally saved as .onnx files; the message schema itself is defined in onnx.proto. Understanding how ModelProto, opsets, and domains fit together is essential before exporting or loading any model.

Examples assume `import onnx` and `from onnx import TensorProto, helper, numpy_helper`.

| Concept | Example | Description |
|---|---|---|
| ModelProto | `model = onnx.load("model.onnx")`; `print(model.ir_version)` | Top-level container holding the computation graph, opset imports, model version, and metadata. |
| GraphProto | `graph = model.graph`; `print(len(graph.node))` | Directed acyclic graph of NodeProtos; contains node, input, output, and initializer fields. |
| NodeProto | `node = graph.node[0]`; `print(node.op_type, node.input, node.output)` | Represents a single operator invocation; references input/output tensors by string name. |
| TensorProto | `init = graph.initializer[0]`; `arr = numpy_helper.to_array(init)` | Stores constant tensor values (weights/biases) embedded directly in the model. |
| ValueInfoProto | `vi = helper.make_tensor_value_info('X', TensorProto.FLOAT, [None, 3])` | Describes shape and dtype of a graph edge (input, output, or intermediate value). |
