Session

AI at the Edge: ONNX Inference in WASM on Featherweight k0s

This session explores deploying ONNX machine learning models via WebAssembly on lightweight k0s Kubernetes clusters—enabling fast, secure AI inference in resource-constrained environments. The talk demonstrates running pre-trained models from frameworks like PyTorch and TensorFlow through ONNX, executed inside WebAssembly’s sandboxed runtime on minimal Kubernetes.
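
To give a flavor of the approach, here is a minimal sketch of ONNX inference through ONNX Runtime's WebAssembly build, using the onnxruntime-web package in TypeScript. The model file name, input shape, and packaging are assumptions for illustration only; the session's actual stack on k0s (for example a server-side Wasm runtime or WASI-NN) may differ.

    // Sketch only: assumes an exported "model.onnx" and the onnxruntime-web
    // package, which ships a WebAssembly build of ONNX Runtime.
    import * as ort from "onnxruntime-web";

    async function infer(): Promise<void> {
      // Create a session backed by the Wasm execution provider (sandboxed, no native deps).
      const session = await ort.InferenceSession.create("model.onnx", {
        executionProviders: ["wasm"],
      });

      // Dummy 1x3x224x224 float input; real input names and shapes depend on the exported model.
      const input = new ort.Tensor(
        "float32",
        new Float32Array(1 * 3 * 224 * 224),
        [1, 3, 224, 224],
      );
      const outputs = await session.run({ [session.inputNames[0]]: input });

      console.log(outputs[session.outputNames[0]].dims);
    }

    infer().catch(console.error);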

Topics include:
– ONNX Runtime with WebAssembly on k0s
– Using WASI to access and run models
– Optimizing performance for Wasm-based inference
– Model loading, caching, and scaling strategies
– CI/CD integration for Wasm ML pipelines

Attendees will see real-world benchmarks comparing Wasm-based vs. containerized inference, focusing on latency, memory usage, cold-start time, and throughput. This approach reduces infrastructure cost, improves isolation, and unlocks edge AI use cases where traditional containers fall short.

Prashant Ramhit

Mirantis Inc. | Platform Engineer | Senior DevOps Advocate | Open Source Developer

Dubai, United Arab Emirates
