Millisecond AI: LLM Inference at the 5G Edge

With 5G’s rollout, delivering AI services in milliseconds has never been more critical.

This session shows how to deploy large language models (LLMs) on edge infrastructure, leveraging Open5GS for a virtualized 5G core and Ollama for local inference, to slash latency and cloud costs.

We’ll walk through a production-grade architecture and demo a simulated 5G device calling an edge-hosted AI endpoint. You’ll learn how to optimize workload placement, enable CPU-only inference, and balance reliability with resource constraints when integrating AI into telecom networks.
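As a taste of the demo, the sketch below shows what calling an edge-hosted Ollama endpoint might look like. The host name `edge-node.local` and the model name are illustrative placeholders; the `/api/generate` route and default port 11434 are Ollama's standard HTTP API.

```python
import json
import urllib.request

# Hypothetical edge node running Ollama; 11434 is Ollama's default port.
EDGE_HOST = "http://edge-node.local:11434"

def build_request(prompt: str, model: str = "llama3.2") -> urllib.request.Request:
    """Build a non-streaming generate request for an Ollama endpoint."""
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # one JSON reply instead of a token stream
    }).encode("utf-8")
    return urllib.request.Request(
        f"{EDGE_HOST}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

def ask_edge(prompt: str) -> str:
    """Send the prompt to the edge-hosted model and return its reply text."""
    with urllib.request.urlopen(build_request(prompt), timeout=5) as resp:
        return json.loads(resp.read())["response"]
```

Because inference runs on the edge node rather than a distant cloud region, the round trip stays within the 5G user plane, which is where the latency savings come from.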

By the end, you’ll have a practical blueprint for bringing real-time intelligence to users and unlocking new edge-driven innovation in the AI & Data Innovations track.

Prakash Rao

AIG Technologies, Principal Cloud Engineer

Tokyo, Japan
