Video Session Summarization with ASR-based & Visual Highlight

A video summarization pipeline that converts long-form videos into readable articles. The system uses an open-source Automatic Speech Recognition (ASR) model to transcribe audio into text with timestamps marking when each speaker talks (single- or multi-speaker), applies a large language model for summarization and topic segmentation, and uses key-phrase or speaker-based importance detection to extract visual highlights (frames) aligned with the narrative.
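
As a rough illustration of the key-phrase and speaker-based importance step, the sketch below scores timestamped ASR segments and returns the timestamps at which highlight frames might be grabbed. It is not the session's actual implementation: the `Segment` structure, the `score_segment` and `pick_highlights` helpers, the phrase-count scoring, and the speaker weights are all assumptions made for illustration, and the ASR, LLM, and frame-extraction steps are left out.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One ASR segment (hypothetical shape): times in seconds, speaker label, text."""
    start: float
    end: float
    speaker: str
    text: str


def score_segment(seg, key_phrases, speaker_weights):
    """Count key-phrase occurrences, scaled by a per-speaker weight (default 1.0)."""
    text = seg.text.lower()
    hits = sum(text.count(phrase.lower()) for phrase in key_phrases)
    return hits * speaker_weights.get(seg.speaker, 1.0)


def pick_highlights(segments, key_phrases, speaker_weights=None, top_k=3):
    """Return midpoint timestamps of the top_k scoring segments, in time order."""
    speaker_weights = speaker_weights or {}
    scored = [(score_segment(s, key_phrases, speaker_weights), s) for s in segments]
    # Drop segments that mention no key phrase at all.
    scored = [(score, s) for score, s in scored if score > 0]
    # Sort on the score only, so Segment objects are never compared directly.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return sorted((s.start + s.end) / 2 for _, s in scored[:top_k])


# Assumed example transcript; a real one would come from the ASR model.
segments = [
    Segment(0, 10, "host", "Welcome to the show"),
    Segment(10, 30, "guest", "Speech recognition turns audio into text"),
    Segment(30, 50, "host", "Summarization follows speech recognition"),
    Segment(50, 60, "host", "Thanks for watching"),
]
timestamps = pick_highlights(
    segments,
    key_phrases=["speech recognition", "summarization"],
    speaker_weights={"guest": 3.0},
    top_k=2,
)
print(timestamps)
```

Each returned timestamp could then be handed to a video tool to grab the frame shown at that moment, which keeps the selected visuals tied to the passages the summary draws on.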

Applications
- Automated video blogging or podcast summarization
- Lecture and meeting note generation with visual context

Witthawin Sripheanpol

AI Researcher

Bangkok, Thailand
