
Pankaj Koti
Astronomer, Software Engineer
Pune, India
Pankaj Koti is an experienced Data Engineer who has worked with Airflow for over five years. In recognition of his understanding of Airflow's capabilities and his contributions to the project, he has been named an Airflow committer. As a member of the Airflow OSS Engineering team at Astronomer, Pankaj contributes to innovation in the data orchestration space. He enjoys helping users and contributors navigate Airflow's complexities and advance data engineering practices.
Optimizing Airflow Performance: Strategies, Techniques, and Best Practices
Airflow, an open-source platform for orchestrating complex data workflows, is widely adopted for its flexibility and scalability. However, as workflows grow in complexity and scale, optimizing Airflow performance becomes crucial for efficient execution and resource utilization. This session explains why optimizing Airflow performance matters and provides strategies, techniques, and best practices to speed up workflow execution, reduce resource consumption, and improve system efficiency. Attendees will gain insights into identifying performance bottlenecks, fine-tuning workflow configurations, leveraging advanced features, and implementing optimization strategies to maximize pipeline throughput. Whether you're a seasoned Airflow user or just getting started, this session equips you with the knowledge and tools needed to tune your Airflow deployments for performance and scalability. We'll also explore topics such as DAG writing best practices, monitoring and updating Airflow configurations, and database performance optimization, including unused indexes, missing indexes, and minimizing table and index bloat.
Overcoming Performance Hurdles in Integrating dbt with Airflow
The integration between dbt and Airflow is a popular topic in the community: at previous editions of Airflow Summit, at Coalesce, and in the #airflow-dbt Slack channel.
Astronomer Cosmos (https://github.com/astronomer/astronomer-cosmos/) stands out as one of the libraries that strive to enhance this integration, with over 300k downloads per month.
During its development, we've encountered various performance challenges in terms of scheduling and task execution. While we've managed to address some, others remain to be resolved.
This talk describes how Cosmos works, the improvements made over the last 1.5 years, and the roadmap. It also aims to collect feedback from the community on how we can further improve the experience of running dbt in Airflow.