Edge AI and TinyML (Tiny Machine Learning) bring machine learning inference directly to resource-constrained devices like microcontrollers, embedded systems, and IoT endpoints. Edge AI runs on moderately powerful edge devices (~100 mW to several watts), while TinyML pushes ML capabilities onto ultra-low-power microcontrollers operating at milliwatt-level consumption (often <1 mW idle). The key innovation is deploying optimized neural networks directly on-device rather than relying on cloud servers, enabling real-time inference with enhanced privacy, reduced latency, and minimal connectivity dependence.

Successful Edge AI deployment hinges on aggressive model optimization (quantization, pruning, knowledge distillation), understanding hardware accelerator capabilities (NPU, DSP, GPU delegation), and navigating the tradeoff triangle of accuracy, latency, and power consumption.
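To make the first of those optimizations concrete, here is a minimal sketch of affine (asymmetric) int8 quantization, the scheme commonly used for weight compression on microcontrollers. This is an illustrative NumPy implementation, not any particular framework's API; the function names and the per-tensor (rather than per-channel) granularity are simplifying assumptions.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Affine quantization: map the float range [w_min, w_max] onto int8 [-128, 127].

    Illustrative per-tensor sketch; real toolchains often quantize per-channel.
    """
    w_min, w_max = float(weights.min()), float(weights.max())
    # Guard against a degenerate all-constant tensor (range of zero).
    scale = (w_max - w_min) / 255.0 or 1.0
    # zero_point is the int8 value that represents float 0.0's offset:
    # chosen so that w_min quantizes to -128.
    zero_point = int(round(-128 - w_min / scale))
    q = np.clip(np.round(weights / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Recover approximate float32 values from int8 codes."""
    return (q.astype(np.float32) - zero_point) * scale

# float32 -> int8 gives a 4x storage reduction at the cost of rounding error.
w = np.random.default_rng(0).normal(size=(64, 64)).astype(np.float32)
q, s, z = quantize_int8(w)
w_hat = dequantize(q, s, z)
max_err = float(np.abs(w - w_hat).max())  # bounded by roughly one scale step
```

The storage win (4 bytes down to 1 per weight) is what lets models fit in the few hundred kilobytes of flash typical of TinyML targets, and integer arithmetic is also what NPU and DSP accelerators execute efficiently.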