
Abdul Rehman Zafar
Senior Solutions Architect at Ververica
Berlin, Germany
Abdul is a Senior Solutions Architect at Ververica with expertise in real-time streaming analytics. He is a strategic technical advisor at Ververica, helping customers solve complex data engineering challenges. Before joining Ververica, he worked at Amazon Web Services as a Solutions Architect, specialising in cloud computing and streaming analytics. At AWS, he helped startups and enterprises on their journey to the cloud and big data by building petabyte-scale data pipelines. He has over 15 years of experience in diverse roles, from startups to enterprises, solving data and distributed-systems challenges.
Topics
Real-time Clickstream Analytics on E-commerce Website Data using Ververica Cloud
This presentation will discuss how Ververica Cloud addresses different use cases for real-time clickstream analytics on e-commerce websites. We will explore how VERA (Ververica Runtime Assembly) supports streamlined operations, resource efficiency, and enhanced productivity for Apache Flink applications running on Ververica Cloud. Finally, we will show how easy it is to get started with real-time clickstream analytics on e-commerce website data using Flink SQL on Ververica Cloud.
Scaling Kafka and Flink with Private Networking: VPC Peering, Private Link, and Bring Your Own Cloud
Deploying Apache Flink and Kafka at scale requires careful networking design to ensure high performance, security, and compliance. Public cloud networking can introduce unnecessary latency, security risks, and egress costs, making private networking solutions a critical consideration for enterprises operating in regulated industries or multi-cloud environments.
In this talk, we will explore how to architect a high-performance private networking setup using:
• VPC Peering and Private Link: When to use each approach for low-latency, secure data exchange between Flink and Kafka clusters.
• Bring Your Own Cloud (BYOC) Deployments: Best practices for running Redpanda and Ververica’s Flink platform in an isolated environment while maintaining full control over networking.
• Security and Compliance Advantages: How private networking ensures data sovereignty, regulatory compliance (GDPR, HIPAA), and protection against unauthorized access.
• Performance Gains and Cost Optimizations: Reducing cross-region traffic costs, eliminating NAT bottlenecks, and ensuring predictable throughput in production.
• Implementation Challenges and Lessons Learned: Common pitfalls when setting up private networking in AWS, GCP, and Azure, and how to troubleshoot networking issues in Flink-Kafka pipelines.
By the end of this session, attendees will gain a clear roadmap for setting up private networking solutions for large-scale, real-time streaming architectures. Whether you’re an infrastructure engineer, a data platform architect, or part of a security-conscious enterprise, this talk will provide actionable insights on running Kafka and Flink securely, efficiently, and at scale in your own cloud environment.
Increasing Flink Performance to 4x: Optimising Flink SQL for High-Performance Streaming Workloads
Apache Flink has become the go-to stream processing engine for large-scale real-time analytics, but operating it at petabyte scale introduces unique challenges. How can you ensure that your Flink SQL jobs are fast, resource-efficient, and cost-effective?
In this talk, we will dive into the core principles behind how we optimised Flink SQL queries for a European bank, reaching 4x the original performance with simple techniques. Drawing on insights from real-world workloads and best practices, we will explore:
• Understanding Flink SQL Execution: How Flink translates SQL queries into execution plans and why performance bottlenecks occur.
• Optimising Joins and Aggregations: Strategies for handling stateful operations efficiently, minimising data shuffling, and leveraging Flink’s internal optimisation techniques.
• Memory and State Tuning: How to fine-tune Flink’s state backend, memory management, and parallelism to prevent excessive checkpointing overhead.
• Practical Techniques for Performance Tuning: Real-world debugging workflows using Flink’s Web UI, metrics, and profiling tools.
• Lessons from Scaling to Petabytes: How leading companies optimise Flink SQL in production, ensuring reliability and cost efficiency.
This session is aimed at data engineers, Flink practitioners, and architects looking to push the boundaries of Flink SQL performance. By the end of the talk, attendees will walk away with actionable strategies to reduce job latency, optimise resource usage, and scale Flink SQL jobs efficiently in production.
Whether you’re operating Flink for real-time analytics, ETL pipelines, or streaming machine learning, this talk will help you write faster, more efficient Flink SQL queries at scale.
Building a Modern Streaming Data Pipeline with Apache Flink, Iceberg and Paimon
Streaming data architectures have evolved beyond traditional batch ETL pipelines. With the rise of streaming data lakes, enterprises can now build real-time, scalable, and cost-efficient data processing systems that seamlessly join event streams from Kafka with transactional data (MySQL), aggregate results, and store them in modern table formats like Apache Iceberg and Apache Paimon.
This talk will walk through how to architect a robust, end-to-end streaming data pipeline, covering:
• Consuming Real-Time Data from Kafka: Best practices for handling high-throughput streaming data.
• Joining with MySQL: Using Flink SQL to enrich streaming events with transactional data.
• Aggregating and Transforming Data: Efficient stateful processing techniques to handle large-scale real-time analytics.
• Apache Iceberg vs. Apache Paimon: Key features, trade-offs, and when to use each for a scalable, queryable streaming data lake.
• Real-World Use Cases: How companies are adopting streaming data lake architectures to improve reporting, machine learning, and real-time operational analytics.
• Comparing with Traditional Architectures: Why moving from batch ETL and traditional data warehouses to a streaming-first approach improves latency, cost efficiency, and data freshness.
By the end of this session, attendees will understand how to build a scalable, real-time streaming pipeline that integrates Kafka, MySQL, Flink, and Apache Iceberg/Paimon to power low-latency, high-throughput analytics. This talk is perfect for data engineers, architects, and platform teams looking to modernize their data stack with real-time data lake architectures.