From Ingestion to Insights: Building Robust Data Pipelines in AWS

Organizations must handle large and varied data sources while still delivering timely insights. This session explores how to design and implement scalable, resilient data pipelines on AWS, from raw data ingestion through to business-ready insights. We’ll walk through real-world architecture patterns using AWS-native services such as Kinesis, Glue, EMR, Athena, and Redshift, with a focus on modular design, data governance, cost optimization, and performance.

Attendees will learn how to handle both batch and streaming ingestion, orchestrate complex workflows, and manage data across different lifecycle stages (raw, refined, curated) using scalable storage solutions like Amazon S3. We'll also dive into transformation strategies using PySpark and SQL, techniques for metadata management and schema evolution, and tools for observability and access control.

Whether you're building your first pipeline or optimizing existing systems, this talk will equip you with practical strategies, architecture blueprints, and lessons learned from real-world implementations. By the end of the session, you’ll understand how to create data pipelines that are robust, cost-effective, and ready to scale, so your teams can move faster from data to decisions on AWS.

Santosh Durgam

Morningstar Investments Inc, Manager of Software Engineering, Data Engineering & Analytics Leader

Chicago, Illinois, United States
