Session
Create a Multi-Speaker Podcast with Gemini 2.0 & Text-to-Speech
This session demonstrates how to use the Gemini API in Vertex AI to generate an engaging multi-speaker podcast using studio voices in the Text-to-Speech API.
This can be useful for creating interviews, interactive storytelling, video games, e-learning platforms, and accessibility solutions.
The steps performed include:
Load a PDF file from a Google Cloud Storage bucket or public URL
Summarize the content using Gemini 2.0 Flash
Return a pre-defined JSON schema using Controlled Generation
Create a multi-speaker conversation from the JSON script using Text-to-Speech
Generate the audio as an MP3 file
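
The middle steps above (a pre-defined JSON schema, then turning the JSON script into speaker turns) can be sketched offline as follows. This is a minimal illustration, not the session's actual code: the schema fields (`title`, `turns`, `speaker`, `text`), the helper names, and the speaker labels `R`, `S`, `T`, `U` are assumptions chosen for the example. The calls to Gemini and to the Text-to-Speech API are deliberately left out; the sketch only validates a script and prepares the turns that would be passed to a speech synthesis request.

```python
import json

# Hypothetical OpenAPI-style schema that could be supplied as the response
# schema for Controlled Generation. Field names are assumptions.
PODCAST_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "turns": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "speaker": {"type": "string"},
                    "text": {"type": "string"},
                },
                "required": ["speaker", "text"],
            },
        },
    },
    "required": ["title", "turns"],
}


def parse_script(raw_json):
    """Parse the model's JSON output and check the fields this sketch relies on."""
    script = json.loads(raw_json)
    for key in PODCAST_SCHEMA["required"]:
        if key not in script:
            raise ValueError(f"missing top-level field: {key}")
    for index, turn in enumerate(script["turns"]):
        for key in ("speaker", "text"):
            value = turn.get(key)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"turn {index} has no usable '{key}'")
    return script


def assign_voice_labels(script, labels=("R", "S", "T", "U")):
    """Map each distinct speaker name to a voice label, in order of appearance.

    The default labels are an assumption about what a multi-speaker voice
    accepts; adjust them to match the voice actually used.
    """
    mapping = {}
    turns = []
    for turn in script["turns"]:
        name = turn["speaker"]
        if name not in mapping:
            if len(mapping) >= len(labels):
                raise ValueError("more speakers than available voice labels")
            mapping[name] = labels[len(mapping)]
        turns.append({"speaker": mapping[name], "text": turn["text"].strip()})
    return turns
```

Given a script with the speakers "Host" and "Guest", `assign_voice_labels(parse_script(raw))` returns a list of `{"speaker", "text"}` dictionaries in which "Host" and "Guest" are replaced by the first two labels. Keeping validation separate from label assignment means a malformed model response fails before any audio is requested.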

Arthur Kaza
Head of Tech Support-Mentorship & Data Analytics @Akieni (Yao Corp)
Kinshasa, Democratic Republic of the Congo