Create a Multi-Speaker Podcast with Gemini 2.0 & Text-to-Speech

This session demonstrates how to use the Gemini API in Vertex AI to generate an engaging multi-speaker podcast using studio voices in the Text-to-Speech API.

This approach can be useful for producing interviews and interactive storytelling, and for adding spoken dialogue to video games, e-learning platforms, and accessibility solutions.

The steps performed include:

Load a PDF file from a Google Cloud Storage bucket or a public URL.
Summarize the content using Gemini 2.0 Flash.
Return the summary as a script that follows a pre-defined JSON schema, using Controlled Generation.
Create a multi-speaker conversation from the JSON script using Text-to-Speech.
Generate the audio as an MP3 file.
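
The schema and conversion steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the speaker's actual code: the schema shape, the "Host" and "Guest" roles, the "R" and "S" voice labels, and the names PODCAST_SCHEMA, SPEAKER_LABELS, and script_to_turns are all hypothetical. The calls to Gemini and to the Text-to-Speech API are deliberately left out; the sketch only shows the kind of schema that could be passed to Controlled Generation and how the returned JSON might be turned into ordered speaker turns.

```python
import json

# Hypothetical response schema for the podcast script, written in the
# OpenAPI-style form that controlled generation accepts (an assumption
# about the shape; the session may use a different one).
PODCAST_SCHEMA = {
    "type": "ARRAY",
    "items": {
        "type": "OBJECT",
        "properties": {
            "speaker": {"type": "STRING", "enum": ["Host", "Guest"]},
            "line": {"type": "STRING"},
        },
        "required": ["speaker", "line"],
    },
}

# Assumed mapping from script roles to the voice labels used in the
# speech request; the real labels depend on the chosen voice.
SPEAKER_LABELS = {"Host": "R", "Guest": "S"}


def script_to_turns(raw_json: str) -> list[dict[str, str]]:
    """Parse the model's JSON reply into ordered speaker turns."""
    turns = []
    for i, entry in enumerate(json.loads(raw_json)):
        speaker = entry.get("speaker")
        text = (entry.get("line") or "").strip()
        if speaker not in SPEAKER_LABELS:
            raise ValueError(f"entry {i}: unknown speaker {speaker!r}")
        if not text:
            continue  # skip empty lines instead of synthesizing silence
        turns.append({"speaker": SPEAKER_LABELS[speaker], "text": text})
    return turns
```

Each resulting turn pairs a voice label with the text to speak, which is the information a multi-speaker synthesis request needs. Validating the speaker field here catches a malformed script before any audio is requested.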

Arthur Kaza

Sr. Manager, Data Analytics & Automation @Equity Bank

Kinshasa, Democratic Republic of the Congo
