Session

Boosting Data Processing: Performance Tune Pandas

This talk will be an in-depth exploration of techniques to enhance the performance of Pandas, the powerful data analysis library in Python. It will cover strategies, tips, and best practices for optimizing data processing workflows, leading to faster and more efficient analysis.

Anyone who works with big data knows that Pandas can struggle with large datasets. This talk delves into effective strategies for accelerating Pandas operations, whether they are simple data transformations or imports and exports to and from databases. I will also cover supporting libraries and simple modifications to existing code that speed up execution.

We will have a live demo with code snippets that show how to use these techniques and compare their performance against the traditional methods in widespread use.

This session aims to address 4 key points:

Why Pandas is slow when handling big data
Slight modifications to existing Pandas code
Using other libraries to speed up execution, such as SQLAlchemy and NVIDIA's RAPIDS cuDF
Performance comparison between the proposed and existing methods

Drawing from personal experience, I'll share tried-and-tested methods to optimize Python scripts using Pandas and reduce pipeline execution time, ultimately enhancing resource efficiency.
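
The abstract does not say which code modifications the talk demonstrates, so the following is only an illustrative sketch of the kind of small change it might mean. It assumes two common ones: replacing a row-by-row loop with a vectorized column operation, and storing a repetitive text column as a categorical. The column names (`price`, `qty`, `city`) and the helper functions are hypothetical.

```python
import numpy as np
import pandas as pd


def total_rowwise(df: pd.DataFrame) -> pd.Series:
    """Slow pattern: a Python-level loop that visits every row."""
    totals = []
    for _, row in df.iterrows():
        totals.append(row["price"] * row["qty"])
    return pd.Series(totals, index=df.index)


def total_vectorized(df: pd.DataFrame) -> pd.Series:
    """Same result, computed on whole columns at once."""
    return df["price"] * df["qty"]


# Small synthetic dataset (hypothetical columns, fixed seed for repeatability).
rng = np.random.default_rng(0)
n = 2_000
df = pd.DataFrame(
    {
        "price": rng.random(n),
        "qty": rng.integers(1, 10, n),
        "city": rng.choice(["Bengaluru", "Mumbai", "Delhi"], n),
    }
)

slow = total_rowwise(df)
fast = total_vectorized(df)

# A column with few distinct values usually takes less memory as a categorical.
mem_object = df["city"].memory_usage(deep=True)
mem_category = df["city"].astype("category").memory_usage(deep=True)
```

Both functions return the same values; the vectorized version avoids building a Series object per row, which is typically where the loop spends its time. Actual speedups depend on data size and hardware, so they are best measured on your own workload.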

Asha Holla

Analytics Engineer @ Bloom Value | AI, Automation, BI | Data Nerd

Bengaluru, India
