Speaker

Mitesh Mangaonkar

Tech Lead Data Engineer at Airbnb

Seattle, Washington, United States

Mitesh is an accomplished Data Engineer and Architect with a proven track record in the dynamic realm of information technology and services. With a robust skill set encompassing Databases, Big Data technologies, Cloud Computing, Software Development Life Cycle (SDLC), and Hadoop, he has consistently delivered innovative solutions that drive businesses forward.

Mitesh's expertise is anchored by a Master's in Management Information Systems, Business Intelligence, and Data Analytics from the prestigious Texas Tech University - Rawls College of Business. His academic prowess and hands-on experience position him as a driving force in the ever-evolving field of data engineering.

As a forward-thinker and problem-solver, Mitesh is dedicated to harnessing the power of data to transform organizations. His passion for technology and ability to translate complex data into actionable insights make him a sought-after professional in the industry.

Area of Expertise

  • Business & Management
  • Information & Communications Technology
  • Transports & Logistics
  • Consumer Goods & Services

Topics

  • Data Engineering
  • Data Engineering Pipelines
  • Data Engineering with Python
  • Data Privacy
  • Data Governance
  • Big Data Machine Learning AI and Analytics
  • Data Architecture
  • Cloud Computing
  • Cloud strategy
  • Database and Cloud
  • Cloud Technology
  • Cloud & DevOps
  • Database
  • Big Data
  • Artificial Intelligence
  • Machine Learning
  • Amazon Web Services
  • Security & Compliance
  • AWS Data
  • Data Operations
  • Big Data Analytics
  • Data Observability
  • Data Organizations
  • Working with time series/IoT data on a data lake
  • Data Orchestration
  • All things data
  • Machine Learning & AI
  • Machine Learning and Artificial Intelligence
  • Data Quality
  • Modern Data Warehouse

Harnessing Large Language Models in Enterprise Data Engineering: An On-Call Revolution

Data engineering teams encounter challenges like data quality issues and pipeline failures, especially in enterprise environments. Addressing this, our approach combines the linguistic prowess of models like GPT-4 with data engineering tasks. We autonomously identify and rectify data quality issues, transform anomaly detection paradigms, and automate recovery tasks. Our methodology achieves reduced resolution times, fine-tuned anomaly detectors, and minimized downtime.

Key Takeaways:
Innovative Use of GPT-4: Leveraging large language models like GPT-4 can revolutionize traditional data engineering tasks and offer autonomous solutions.
Efficient Issue Resolution: The approach significantly reduces resolution times, allowing engineers to focus on intricate challenges.
Empirical Validation: Our case studies validate improved metrics and overall system stability, suggesting tangible benefits of integrating AI in on-call data engineering.

Unified Data Layer

Data fragmentation remains a significant challenge, affecting the efficiency and reliability of data analytics in large organizations. Despite advancements in central data governance, most organizations continue to suffer from the proliferation of redundant, poorly organized, and seldom-used data tables. This presentation introduces a new Unified Data Layer (UDL) approach to address these challenges. UDL aims to serve as a flexible, consolidated layer above existing data warehouses, designed to eliminate data silos and encourage reusability. By employing a structural model inspired by Object-Oriented Programming (OOP), the UDL creates a collection of source and processed views, effectively serving as encapsulated units of relevant data and related operations.

Key Takeaways:

The current data landscape is littered with fragmented, poorly maintained tables that hinder efficient data discovery and analysis.
UDL offers a modular approach to organizing and unifying data, inspired by principles from the evolution of programming paradigms, notably OOP.
Implementing UDL can significantly reduce the number of ad-hoc pipelines and redundant tables, thereby improving data discoverability and reusability.
With UDL, multiple teams can contribute data related to a single entity, which can be joined from various sources to present a comprehensive view, leading to more effective decision-making.
By implementing UDL, organizations can overcome some of the most pressing challenges in data management, paving the way for a more organized, efficient, and effective utilization of data assets.
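
The abstract does not describe how UDL is implemented, so the following is only a minimal sketch of the idea in plain Python, under stated assumptions. The names SourceView, EntityView, contribute, and materialize, and the customer data, are hypothetical and are not taken from the talk. The sketch shows one entity acting as an encapsulated unit: several teams contribute source views keyed by the same id, and the entity joins them and applies its own derived fields.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class SourceView:
    """Hypothetical team-owned view: rows keyed by an entity id."""
    name: str
    owner: str
    key: str
    rows: List[Row]


@dataclass
class EntityView:
    """Hypothetical encapsulated unit: all data and operations for one entity."""
    entity: str
    key: str
    sources: List[SourceView] = field(default_factory=list)
    derived: Dict[str, Callable[[Row], object]] = field(default_factory=dict)

    def contribute(self, source: SourceView) -> None:
        # Only views keyed by the entity id can be joined into this entity.
        if source.key != self.key:
            raise ValueError(f"{source.name} is not keyed by {self.key}")
        self.sources.append(source)

    def materialize(self) -> List[Row]:
        # Outer-join every contributed view on the entity key.
        merged: Dict[object, Row] = {}
        for source in self.sources:
            for row in source.rows:
                entry = merged.setdefault(row[self.key], {})
                entry.update(row)
        # Apply the derived fields that live alongside the data.
        result = []
        for row in merged.values():
            out = dict(row)
            for name, fn in self.derived.items():
                out[name] = fn(row)
            result.append(out)
        return result


# Two teams contribute data about the same entity (illustrative values).
customer = EntityView(entity="customer", key="customer_id")
customer.contribute(SourceView("billing", "billing-team", "customer_id", [
    {"customer_id": 1, "plan": "pro"},
    {"customer_id": 2, "plan": "free"},
]))
customer.contribute(SourceView("support", "support-team", "customer_id", [
    {"customer_id": 1, "open_tickets": 3},
]))
customer.derived["has_open_tickets"] = lambda row: row.get("open_tickets", 0) > 0

rows = customer.materialize()
```

In a warehouse setting the same shape would more likely be expressed as SQL views over existing tables; the in-memory version here only illustrates the encapsulation and the join on a shared entity key.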

Harnessing Large Language Models in Enterprise Data Engineering: An On-Call Revolution

Data engineering teams encounter challenges like data quality issues and pipeline failures, especially in enterprise environments. Addressing this, our approach combines the linguistic prowess of models like GPT-4 with data engineering tasks. We autonomously identify and rectify data quality issues, transform anomaly detection paradigms, and automate recovery tasks. Our methodology achieves reduced resolution times, fine-tuned anomaly detectors, and minimized downtime. Empirical evidence showcases enhanced metrics such as reduced MTTR and fewer false positives, advocating a future where AI plays a pivotal role in on-call data engineering.

Key Takeaways:
1. Innovative Use of GPT-4: Leveraging large language models like GPT-4 can revolutionize traditional data engineering tasks, offering autonomous solutions.
2. Improved Anomaly Detection: By analyzing historical data, our system provides optimized thresholds for anomaly detectors, balancing alert sensitivity and accuracy.
3. Efficient Issue Resolution: The approach significantly reduces resolution times, allowing engineers to focus on intricate challenges.
4. Empirical Validation: Our case studies validate improved metrics and overall system stability, suggesting tangible benefits of integrating AI in on-call data engineering.
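
The second takeaway, choosing detector thresholds from historical data, can be illustrated with a small sketch. This is not the system described in the talk and involves no language model: it assumes a simple z-score detector, a history of metric values with labels marking which points were real incidents, and F1 as the way to balance sensitivity against accuracy. The function names and the sample data are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def f1_at_threshold(z_scores: Sequence[float], labels: Sequence[bool],
                    threshold: float) -> float:
    """F1 of a detector that alerts when |z| >= threshold."""
    pairs = list(zip(z_scores, labels))
    tp = sum(1 for z, y in pairs if abs(z) >= threshold and y)
    fp = sum(1 for z, y in pairs if abs(z) >= threshold and not y)
    fn = sum(1 for z, y in pairs if abs(z) < threshold and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def tune_threshold(values: Sequence[float], labels: Sequence[bool],
                   candidates: Sequence[float]) -> Tuple[float, float]:
    """Return the candidate threshold with the best F1 on labeled history."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("history has no variance; z-scores are undefined")
    z_scores: List[float] = [(v - mu) / sigma for v in values]
    best = max(candidates, key=lambda t: f1_at_threshold(z_scores, labels, t))
    return best, f1_at_threshold(z_scores, labels, best)


# Illustrative daily row counts; the spike and the drop were real incidents.
history = [100, 102, 98, 101, 99, 160, 100, 97, 103, 40]
incident = [False, False, False, False, False, True, False, False, False, True]

best_threshold, best_f1 = tune_threshold(history, incident, [0.05, 1.0, 3.0])
```

On this sample the loosest threshold fires on ordinary noise and the strictest misses both incidents, so the middle candidate wins. Real pipelines usually need seasonality handling and far more history than ten points.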

DeveloperWeek 2024 Sessionize Event

February 2024 Oakland, California, United States

Global AI Conference 2023 Sessionize Event

December 2023
