Session
Build Your Pocket Brain: Custom On-Device LLMs via MediaPipe on Android
The era of cloud-dependent generative AI presents challenges in latency, cost, and user privacy. The next frontier is on-device AI, and MediaPipe's new LLM Inference API is leading the charge for Android developers. Beyond simply running pre-trained models like Gemma on a device, we will dive deep into the power of customisation using Low-Rank Adaptation (LoRA), a parameter-efficient fine-tuning (PEFT) technique.
You'll learn the end-to-end workflow for creating a specialised, on-device LLM tailored to your app's unique domain. We will cover how to take a base model (like Gemma or Phi-2), fine-tune it with your own dataset using the PEFT library in Python, convert both the base model and the LoRA weights into the MediaPipe-compatible FlatBuffer format, and finally, integrate this custom-tuned model into an Android application.
We will demonstrate how to configure LlmInferenceOptions in Kotlin to load both the base model and the .tflite LoRA file, unlocking hyper-personalised AI experiences that are fast, offline-capable, and completely private.
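As a taste of what we will build, here is a minimal sketch of that configuration using the MediaPipe tasks-genai library. The file paths are illustrative placeholders, and exact builder methods can shift between releases:

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference
import com.google.mediapipe.tasks.genai.llminference.LlmInference.LlmInferenceOptions

// Builds an LlmInference engine that pairs a base model with
// custom-tuned LoRA weights. Paths below are illustrative placeholders.
fun createCustomLlm(context: Context): LlmInference {
    val options = LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/llm/gemma-2b-it-gpu-int4.bin") // base model
        .setLoraPath("/data/local/tmp/llm/lora_weights.tflite")       // LoRA adapter
        .setMaxTokens(1024)
        .build()
    return LlmInference.createFromOptions(context, options)
}
```

At the time of writing, MediaPipe's LoRA support targets the GPU backend, so the base model should be a GPU-compatible variant.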
Key Takeaways
* Understanding when to choose MediaPipe LLM Inference versus Gemini Nano for on-device generative AI.
* Setting up and configuring MediaPipe LLM Inference for on-device generative AI.
* Expertise in LoRA fine-tuning to adapt LLMs like Gemma-2B or Phi-2 to specific use cases cost-effectively.
* A dive into configuration options and multimodal prompting (see the sketch after this list).
* Knowledge of deployment workflows, GPU-accelerated LoRA inference, and ethical AI practices.
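To make the inference side of these takeaways concrete, the following streaming sketch follows the tasks-genai listener API; the prompt, paths, and the onToken callback are hypothetical placeholders rather than a definitive implementation:

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference
import com.google.mediapipe.tasks.genai.llminference.LlmInference.LlmInferenceOptions

// Streams a response token-by-token. The onToken callback is a hypothetical
// UI hook; partial results keep arriving on the listener until `done` is true.
fun streamAnswer(context: Context, prompt: String, onToken: (String, Boolean) -> Unit) {
    val options = LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/llm/gemma-2b-it-gpu-int4.bin") // illustrative path
        .setResultListener { partialResult, done -> onToken(partialResult, done) }
        .build()
    val llm = LlmInference.createFromOptions(context, options)
    llm.generateResponseAsync(prompt)
}
```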
Session to be presented at Droidcon Abu Dhabi 2025 on 13 December.
Dinoy Raj
Product Engineer – Android @ Strollby | Droidcon Uganda '25 & Droidcon Abu Dhabi '25 Speaker
Thiruvananthapuram, India