Using the Gemini API to generate images & audio

Many APIs and products can generate audio and images, but most of them aren't free, and that includes Google AI products like Veo and Imagen. What if you just want to experiment at no cost and don't need high-quality output? This is a good use case for Gemini, a robust, full-featured LLM (large language model) with these capabilities. In this session, learn how to use the Gemini API to generate images as well as spoken-voice audio files using the Gemini 2.0 Flash Experimental model. Also get a peek at the capabilities of the Gemini 2.5 Flash Image (Nano Banana) model!

Intended for software developers building generative AI apps or guiding the development of vibe-coded apps or agents.

Wesley Chun

AI TPgM | Technical Consultant | Google Developer Expert (GCP, GWS) | ex-Google engineer

