Muhammed Mizaj's Speaker Profile @ Sessionize

The SQLite Moment for Analytics: Building Lightweight Data Pipelines with DuckDB

Point of the talk: To show developers how to eliminate costly, complex cloud data warehouses for medium-scale datasets by embedding DuckDB directly into Python applications for serverless, zero-copy data processing.

Duration: 45 Minutes

Detailed Breakdown: Many data pipelines spin up heavy distributed clusters (such as Spark) to process 10–50 GB datasets, which adds cost and latency that workloads of this size rarely need. This talk introduces DuckDB as an in-process, columnar analytical engine that runs inside your Python runtime. We will explore how to run fast SQL directly over local or remote Parquet, CSV, and JSON files. Attendees will learn how zero-copy memory sharing between DuckDB and PyArrow/Polars works, so that data can be handed off without serialization overhead. Finally, we will build a local-first dashboard pipeline that runs entirely on a single machine and aims to match the speed of a cloud warehouse at this data scale.

Semantic Data Enrichment: Harnessing LLMs and Vector Search in Python Pipelines

Point of the talk: To show how to upgrade traditional string-matching enrichment pipelines by injecting unstructured contextual data using LLM structured outputs and vector databases.

Duration: 45 Minutes

Detailed Breakdown: Traditional enrichment relies on exact key matching (e.g., matching IDs or exact words). Modern pipelines require semantic enrichment: categorizing customer feedback, sentiment-tagging tickets, or merging disparate data based on intent. This talk outlines a production-ready architecture using Pydantic (for strict schema validation of LLM outputs) and vector databases for fast similarity lookups. We will discuss how to minimize LLM token costs through semantic caching and how to handle schema drift safely.
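As a rough illustration of the semantic caching idea, the sketch below skips the model call when a sufficiently similar text has already been enriched. It makes several assumptions that are not part of the talk: `embed` is a toy bag-of-words stand-in for a real embedding model, the linear scan stands in for a vector database lookup, and `call_llm` is a hypothetical callable supplied by the caller.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Returns a cached result when a stored text is similar enough."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.entries = []  # list of (vector, cached result)

    def get(self, text):
        query = embed(text)
        best_score, best_value = 0.0, None
        # Linear scan; a vector database would replace this lookup.
        for vector, value in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_value = score, value
        return best_value if best_score >= self.threshold else None

    def put(self, text, value):
        self.entries.append((embed(text), value))


def enrich(text, cache, call_llm):
    """Call the (hypothetical) LLM only on a cache miss."""
    cached = cache.get(text)
    if cached is not None:
        return cached
    result = call_llm(text)
    cache.put(text, result)
    return result
```

The threshold is the main tuning knob in a design like this: set too low, unrelated texts share a cached answer; set too high, the cache rarely hits.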

High-Throughput Data Enrichment

Point of the talk: To teach developers how to design highly parallelized data enrichment workers that fetch external API data or microservice context without hitting rate limits or blocking downstream consumers.

Duration: 45 Minutes

Detailed Breakdown: Data enrichment often requires stitching incoming telemetry with external user profiles, geocoding APIs, or CRM data. Doing this synchronously stalls the pipeline, because every record waits on a network round trip. This talk covers how to build non-blocking enrichment workers using FastStream and asyncio. We will explore throttling strategies, including token-bucket algorithms that keep workers within API rate limits, and how to use Redis as a high-speed caching layer to prevent redundant external network requests. Attendees will walk away with a resilient blueprint for handling thousands of requests per second.
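A token bucket can be sketched in a few lines of standard-library Python. The class and function names below are illustrative assumptions rather than code from the talk, and the clock is injectable only so the behavior can be checked deterministically.

```python
import asyncio
import time


class TokenBucket:
    """Allows `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last update, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def try_acquire(self):
        """Take one token if available; never blocks."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def wait_time(self):
        """Seconds until the next token becomes available."""
        self._refill()
        return max(0.0, (1 - self.tokens) / self.rate)


async def acquire(bucket):
    """Suspend the calling coroutine (not the event loop) until a token is free."""
    while not bucket.try_acquire():
        await asyncio.sleep(bucket.wait_time())
```

A worker would `await acquire(bucket)` before each external API call, so throttled workers yield to the event loop instead of blocking it.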

Building High-Throughput Event Pipelines with Python and NATS

Duration: 25 minutes

Prerequisites: Basic Python knowledge, familiarity with APIs and distributed systems concepts. No prior experience with NATS required.

Talk Overview

Modern applications are no longer built as isolated services. They communicate through streams of events, real-time updates, and asynchronous workflows that demand reliability, scalability, and low latency. While traditional message brokers often introduce operational complexity, NATS offers a lightweight, high-performance approach to event-driven architecture.

In this talk, we'll explore how to build high-throughput event pipelines using Python and NATS. Starting with the challenges of synchronous communication, we'll examine why event-driven systems have become the backbone of modern distributed architectures. Through practical examples, attendees will learn how producers, consumers, subjects, queues, and streams work together to process large volumes of events efficiently.

We'll build a complete event pipeline in Python, explore JetStream for durability and replayability, implement consumer scaling patterns, and discuss strategies for handling failures, backpressure, and observability in production systems.

By the end of the session, attendees will understand how to design, build, and operate event-driven applications capable of processing thousands of messages per second with minimal complexity.

Detailed Outline

1. Why Event-Driven Architecture? (3 minutes)

* Challenges of synchronous service-to-service communication
* Tight coupling and scalability limitations
* Introduction to event-driven systems
* Common use cases:
  * Real-time notifications
  * Analytics pipelines
  * Microservices communication
  * Background processing

Key takeaway: Events enable scalable, loosely coupled architectures.

2. Understanding NATS Fundamentals (4 minutes)

* What is NATS?
* Core architecture and design philosophy
* Subjects and publish-subscribe messaging
* Request-reply patterns
* Queue groups for workload distribution

Live demonstration:

* Publishing and consuming messages using Python

Key takeaway: NATS provides simple primitives that enable powerful messaging patterns.
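To make the difference between plain subscriptions and queue groups concrete, here is a toy in-memory model. It is not the NATS client API: `ToyBroker` is a hypothetical class, subjects are matched exactly, and round-robin selection is used only to keep the behavior deterministic (a real server makes its own choice of group member).

```python
from collections import defaultdict


class ToyBroker:
    """In-memory model of publish-subscribe with queue groups (not a NATS client)."""

    def __init__(self):
        self._subs = defaultdict(list)  # subject -> callbacks
        self._groups = defaultdict(lambda: defaultdict(list))  # subject -> group -> callbacks
        self._next = defaultdict(int)  # (subject, group) -> round-robin index

    def subscribe(self, subject, callback, queue=None):
        if queue is None:
            self._subs[subject].append(callback)
        else:
            self._groups[subject][queue].append(callback)

    def publish(self, subject, data):
        # Every plain subscriber receives every message.
        for callback in self._subs[subject]:
            callback(data)
        # Exactly one member of each queue group receives the message.
        for group, members in self._groups[subject].items():
            index = self._next[(subject, group)]
            self._next[(subject, group)] = (index + 1) % len(members)
            members[index](data)


broker = ToyBroker()
audit, worker_1, worker_2 = [], [], []
broker.subscribe("orders.created", audit.append)
broker.subscribe("orders.created", worker_1.append, queue="workers")
broker.subscribe("orders.created", worker_2.append, queue="workers")
for order_number in range(4):
    broker.publish("orders.created", order_number)
# audit saw all four messages; the two workers split them between themselves.
```

Adding a third callback to the `workers` group would spread the same messages across three consumers without any change to the publisher, which is the scaling property queue groups provide.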

3. Building an Event Pipeline in Python (6 minutes)

* Setting up NATS clients using Python
* Creating producers and consumers
* Designing event contracts
* Event serialization strategies
* Handling concurrent consumers

Live coding:

* Building a simple order-processing pipeline
* Publishing events from one service and consuming them in another

Key takeaway: Event-driven workflows can be implemented with minimal code and infrastructure.
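An event contract and its serialization might look like the following sketch. The `OrderCreated` fields and the version check are assumptions made for illustration; the talk's actual order-processing events may differ.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class OrderCreated:
    """Hypothetical event contract shared by producer and consumer."""

    event_id: str
    order_id: str
    amount_cents: int
    schema_version: int = 1


def encode(event):
    """Serialize an event to bytes, the form a message broker carries."""
    return json.dumps(asdict(event), sort_keys=True).encode("utf-8")


def decode(payload):
    """Parse bytes back into an event, rejecting unknown schema versions."""
    fields = json.loads(payload.decode("utf-8"))
    if fields.get("schema_version") != 1:
        raise ValueError(f"unsupported schema_version: {fields.get('schema_version')!r}")
    return OrderCreated(**fields)


event = OrderCreated(event_id="evt-1", order_id="ord-42", amount_cents=1999)
payload = encode(event)    # what the producer publishes
received = decode(payload)  # what the consumer reconstructs
```

Carrying an explicit `schema_version` lets consumers fail loudly on a contract they do not understand instead of silently misreading fields.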

4. Scaling with JetStream (5 minutes)

* Why durability matters
* Streams and consumers
* Message persistence
* Acknowledgements
* Replay and recovery mechanisms

Demonstration:

* Recovering from consumer failures
* Replaying historical events

Key takeaway: JetStream extends NATS from lightweight messaging to resilient event streaming.
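The interplay of persistence, acknowledgements, and replay can be modeled with a toy append-only log. This is a simplified illustration, not JetStream's API: `ToyStream` and `DurableConsumer` are hypothetical names, and a single acknowledged-sequence cursor is a simplification of real per-message acknowledgement.

```python
class ToyStream:
    """Append-only log; sequence numbers start at 1."""

    def __init__(self):
        self._log = []

    def publish(self, data):
        self._log.append(data)
        return len(self._log)  # sequence number of the stored message

    def read_from(self, seq):
        """Replay every stored message from sequence `seq` onward."""
        return list(enumerate(self._log[seq - 1:], start=seq))


class DurableConsumer:
    """Remembers the last acknowledged sequence, so it can resume after a failure."""

    def __init__(self, stream):
        self.stream = stream
        self.acked = 0

    def process(self, handler):
        for seq, data in self.stream.read_from(self.acked + 1):
            try:
                handler(data)
            except Exception:
                # No acknowledgement: this message is delivered again on the next run.
                break
            self.acked = seq  # acknowledge only after successful handling
```

Because the cursor advances only after the handler succeeds, a crash between handling and acknowledging causes a redelivery, which is why consumers in such a design need to tolerate duplicates.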

5. Production Considerations (4 minutes)

* Backpressure management
* Consumer scaling patterns
* Idempotency and duplicate handling
* Monitoring and observability
* Performance tuning techniques

Real-world lessons learned from operating event-driven systems.

Key takeaway: Reliability comes from architecture and operational discipline, not just technology.
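One common way to apply backpressure inside a Python process is a bounded queue between stages. The sketch below uses only `asyncio` and is an assumption about how the idea could be demonstrated, not the talk's implementation; `None` is an arbitrary end-of-stream sentinel.

```python
import asyncio


async def producer(queue, items):
    for item in items:
        await queue.put(item)  # suspends here whenever the queue is full
    await queue.put(None)      # sentinel: no more items


async def consumer(queue, results, depths):
    while True:
        depths.append(queue.qsize())  # record how much work is buffered
        item = await queue.get()
        if item is None:
            break
        await asyncio.sleep(0)  # stand-in for real processing work
        results.append(item)


async def run_pipeline(items, maxsize=2):
    queue = asyncio.Queue(maxsize=maxsize)
    results, depths = [], []
    await asyncio.gather(producer(queue, items), consumer(queue, results, depths))
    return results, max(depths)
```

Calling `asyncio.run(run_pipeline(list(range(10))))` processes every item in order while the buffered backlog never exceeds `maxsize`, so a slow consumer slows the producer down instead of letting memory grow without bound.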

6. Closing Thoughts & Q&A (3 minutes)

* When to choose NATS
* Comparing NATS with Kafka and RabbitMQ
* Common architectural patterns
* Future of event-driven systems

Learning Outcomes

By the end of this talk, attendees will:

* Understand the principles of event-driven architecture.
* Learn the core concepts of NATS and JetStream.
* Build event producers and consumers using Python.
* Design scalable and resilient event pipelines.
* Handle failures, retries, and message durability effectively.
* Understand how high-throughput messaging systems operate in production environments.
* Gain practical knowledge that can be applied immediately in microservices and distributed systems.

Streaming Protocols for Conversational AI

Modern conversational AI is no longer limited to simple request-response interactions. Today's AI assistants stream tokens in real time, process voice conversations with minimal latency, invoke external tools, and coordinate with other services to deliver intelligent experiences.

In this talk, we'll dive into the communication protocols and architectural patterns that make real-time conversational AI possible. Starting with Server-Sent Events (SSE) for token streaming, we'll explore when to use WebSockets for bidirectional communication, how WebRTC enables voice-based AI interactions, and how emerging standards such as Model Context Protocol (MCP) and Agent-to-Agent (A2A) communication are shaping the next generation of AI systems.

Using Python and modern frameworks such as FastAPI and asyncio, we'll examine practical implementation patterns, discuss trade-offs between different protocols, and explore how event-driven architectures can be used to build scalable AI applications.

Attendees will learn:
• How token streaming works in modern LLM applications
• When to choose SSE, WebSockets, WebRTC, or gRPC
• How MCP enables AI agents to interact with tools and external systems
• How agent-to-agent communication enables collaborative AI workflows
• Best practices for building low-latency conversational AI systems in Python
• Real-world architecture patterns for production-scale AI applications
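As a small illustration of token streaming over SSE, the sketch below frames each token as a Server-Sent Events message. The token source `fake_llm_tokens`, the event names, and the `[DONE]` marker are assumptions chosen for the example; only the `event:`/`data:` line format with a blank-line terminator comes from the SSE wire format.

```python
import asyncio
import json


def format_sse(data, event=None):
    """Frame a payload as one SSE message; multi-line data gets one `data:` line per line."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    for line in data.split("\n"):
        lines.append(f"data: {line}")
    return "\n".join(lines) + "\n\n"  # a blank line terminates the message


async def fake_llm_tokens(prompt):
    """Hypothetical stand-in for a model that yields tokens as they are generated."""
    for token in ["Hello", ",", " world"]:
        await asyncio.sleep(0)
        yield token


async def sse_stream(prompt):
    """Async generator of SSE frames, one per token, followed by a completion marker."""
    async for token in fake_llm_tokens(prompt):
        yield format_sse(json.dumps({"token": token}), event="token")
    yield format_sse("[DONE]", event="done")


async def collect(prompt):
    return [frame async for frame in sse_stream(prompt)]
```

In a FastAPI application, an async generator like `sse_stream` could be returned through `StreamingResponse` with `media_type="text/event-stream"`; that wiring is left out here to keep the sketch free of third-party imports.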

From Generators to Event Loops: Understanding Python Async Internals

Duration: 25 minutes

Audience Level: Intermediate Python developers

Prerequisites: Familiarity with Python functions, iterators, and basic networking concepts.

Talk Overview:

Python's `async` and `await` keywords have become fundamental tools for building scalable applications, yet many developers use them without understanding the machinery operating underneath. This talk aims to remove the mystery around Python's asynchronous programming model by rebuilding the core ideas behind `asyncio` from scratch using only the Python standard library.

Rather than treating asyncio as a black box, attendees will progressively construct a miniature version of Python's async runtime, gaining a practical understanding of event loops, coroutine scheduling, non-blocking I/O, tasks, and futures.

The session combines live coding, visual explanations, and architectural walkthroughs to connect high-level async code with the low-level mechanisms that make it work.

Detailed Outline:

1. The Problem with Blocking I/O (3 minutes)

* Demonstrate a simple blocking socket operation.
* Show how a single blocking call prevents other work from progressing.
* Briefly discuss traditional concurrency approaches such as threads and processes.
* Introduce cooperative multitasking and event-driven programming as an alternative.

Key takeaway: Concurrency problems often stem from waiting on I/O rather than CPU limitations.

2. Generators: The Hidden Foundation of Async (4 minutes)

* Review Python generators and the `yield` statement.
* Demonstrate pausing and resuming execution.
* Build a minimal scheduler capable of interleaving multiple generator-based tasks.
* Explain how coroutines evolved from generator mechanics.

Key takeaway: The foundations of async programming already exist within Python's generator model.
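A minimal round-robin scheduler of the kind described above might look like this. The function names are illustrative and the talk's own version may differ.

```python
from collections import deque


def worker(name, steps, log):
    """A task written as a generator: each `yield` is a voluntary pause."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # hand control back to the scheduler


def run(tasks):
    """Round-robin scheduler: resume each task in turn until all have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)  # run the task up to its next `yield`
        except StopIteration:
            continue    # task finished; do not reschedule it
        ready.append(task)


log = []
run([worker("a", 2, log), worker("b", 3, log)])
# log is now ["a:0", "b:0", "a:1", "b:1", "b:2"]
```

The two tasks interleave even though nothing runs in parallel: each one runs only until it chooses to pause, which is the essence of cooperative multitasking.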

3. Building an Event Loop from Scratch (6 minutes)

* Introduce the `selectors` module from the standard library.
* Explain readiness-based I/O.
* Build a minimal event loop that:
  * Tracks pending operations
  * Waits for I/O readiness
  * Resumes suspended coroutines
* Visualize how scheduling occurs.

Key takeaway: The event loop is the central coordinator of asynchronous execution.
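The three responsibilities listed above fit in a short sketch built on `selectors`. Everything here is an assumption made for illustration: `MiniLoop` is a hypothetical name, coroutines are plain generators, and a coroutine yields a socket to mean "resume me when this is readable" or `None` to simply give up its turn.

```python
import selectors
import socket
from collections import deque


class MiniLoop:
    """Toy event loop: generators yield a socket to wait on, or None to just pause."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()
        self.ready = deque()  # coroutines that can run right now
        self.waiting = 0      # coroutines parked until a socket is readable

    def spawn(self, coro):
        self.ready.append(coro)

    def run(self):
        while self.ready or self.waiting:
            while not self.ready:
                # Nothing is runnable: block until a registered socket is readable.
                for key, _ in self.selector.select():
                    self.selector.unregister(key.fileobj)
                    self.waiting -= 1
                    self.ready.append(key.data)
            coro = self.ready.popleft()
            try:
                wait_for = next(coro)  # run until the coroutine suspends
            except StopIteration:
                continue               # coroutine finished
            if wait_for is None:
                self.ready.append(coro)  # plain pause: reschedule immediately
            else:
                self.selector.register(wait_for, selectors.EVENT_READ, data=coro)
                self.waiting += 1
        self.selector.close()


def reader(sock, out):
    yield sock  # suspend until the socket has data
    out.append(sock.recv(1024))


def writer(sock, payload):
    yield  # let other coroutines run first
    sock.sendall(payload)


def demo():
    a, b = socket.socketpair()
    out = []
    loop = MiniLoop()
    loop.spawn(reader(a, out))
    loop.spawn(writer(b, b"ping"))
    loop.run()
    a.close()
    b.close()
    return out
```

The reader never calls `recv` until the selector reports the socket as readable, so the call returns at once and no coroutine blocks the others.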

4. Tasks, Futures, and Scheduling (5 minutes)

* Create lightweight task wrappers around coroutines.
* Implement simple future-like objects.
* Build scheduling logic similar to:
  * `asyncio.run()`
  * `asyncio.create_task()`
* Demonstrate task lifecycle management.

Key takeaway: Asyncio's higher-level abstractions are built on relatively simple primitives.
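A stripped-down version of these primitives could look like the sketch below. It is an illustrative model, not asyncio's implementation: `Future`, `Task`, `create_task`, and `run` are hypothetical re-creations, and coroutines are generators that `yield` the future they are waiting on.

```python
from collections import deque

_ready = deque()  # (task, value to send into it) pairs waiting to run


class Future:
    """A placeholder for a result that will arrive later."""

    def __init__(self):
        self.done = False
        self.result = None
        self._callbacks = []

    def add_done_callback(self, fn):
        if self.done:
            fn(self)
        else:
            self._callbacks.append(fn)

    def set_result(self, value):
        self.done, self.result = True, value
        for fn in self._callbacks:
            fn(self)


class Task(Future):
    """Drives a generator coroutine; the task is itself a future for its return value."""

    def __init__(self, coro):
        super().__init__()
        self.coro = coro

    def step(self, value):
        try:
            awaited = self.coro.send(value)  # run until the coroutine waits on a future
        except StopIteration as stop:
            self.set_result(stop.value)      # coroutine returned: resolve the task
            return
        # Reschedule this task once the awaited future has a result.
        awaited.add_done_callback(lambda fut: _ready.append((self, fut.result)))


def create_task(coro):
    task = Task(coro)
    _ready.append((task, None))
    return task


def run(coro):
    root = create_task(coro)
    while _ready:
        task, value = _ready.popleft()
        task.step(value)
    return root.result


def child(n):
    fut = Future()
    fut.set_result(n * n)  # stand-in for an I/O result that has already arrived
    value = yield fut      # plays the role of `await fut`
    return value + 1


def main():
    first = create_task(child(2))
    second = create_task(child(3))
    a = yield first
    b = yield second
    return a + b
```

Here `run(main())` returns 15: the two child tasks resolve to 5 and 10, and `main` resumes each time one of the tasks it waits on completes.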

5. Async/Await Syntax Demystified (3 minutes)

* Examine how Python compiles `async def`.
* Use the `inspect` and `dis` modules to explore generated bytecode.
* Show the relationship between:
  * Generators
  * Coroutines
  * `await`
* Connect language syntax back to the runtime built earlier.

Key takeaway: `await` is elegant syntax built on understandable underlying concepts.
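The generator heritage of `await` can be seen by driving a coroutine by hand, without any event loop. In this sketch `suspend`, `greet`, and `drive` are illustrative names; `types.coroutine` and the `send` protocol are standard library behavior.

```python
import inspect
import types


@types.coroutine
def suspend(value):
    """Generator-based awaitable: hands `value` to whoever is driving the coroutine."""
    reply = yield value
    return reply


async def greet(name):
    reply = await suspend(f"need greeting for {name}")
    return f"{reply}, {name}!"


def drive():
    coro = greet("world")
    is_coroutine = inspect.iscoroutine(coro)
    request = coro.send(None)  # run to the first suspension point
    try:
        coro.send("Hello")     # resume; the sent value becomes the result of `await`
    except StopIteration as stop:
        return is_coroutine, request, stop.value  # the return value rides on StopIteration
```

Running `dis.dis(greet)` on the same function prints its bytecode, which is a convenient way to see where the coroutine suspends; the exact opcodes vary between Python versions.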

6. Building a Real Async HTTP Proxy (3 minutes)

* Demonstrate a working asynchronous HTTP proxy powered by the custom event loop.
* Handle multiple concurrent client connections.
* Compare behavior against the earlier blocking implementation.

Key takeaway: The concepts discussed throughout the talk can power real-world applications.

7. Closing Thoughts and Modern Context (1 minute)

* Connect the implementation to modern frameworks and libraries.
* Discuss where concepts such as structured concurrency and production event loops extend the ideas presented.
* Highlight practical benefits of understanding async internals.

Learning Outcomes:

By the end of the session, attendees will:

* Understand how Python implements asynchronous programming.
* Explain the relationship between generators, coroutines, and async/await.
* Understand the role of the event loop in coordinating concurrency.
* Develop an accurate mental model of tasks and scheduling.
* Be better equipped to debug, optimize, and reason about asynchronous applications.

Database Migrations at Scale: DAGs, Automation, and AI in Python

Schema changes are inevitable, but performing them safely in production is one of the most challenging aspects of backend engineering. As systems grow, migrations must handle dependency ordering, rollback strategies, long-running operations, and coordination across services without causing downtime or data corruption.

In this session, I'll present the architecture and implementation of a migration platform built in Python for large-scale MongoDB applications. The system uses MongoEngine for data modeling, directed acyclic graphs (DAGs) to manage migration dependencies, and AI-assisted tooling to generate and validate migration steps.

We'll explore:
- Designing migration workflows for safety and reliability
- Representing migrations as DAGs to enforce execution order
- Handling failures, retries, and rollback scenarios
- Using AI to reduce manual migration authoring effort
- Operational considerations: observability, auditing, and deployment strategies
- Lessons learned from running migrations in production environments
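Representing migrations as a DAG can be sketched with the standard library's `graphlib`. The migration names and the `run_migrations` helper are hypothetical, and the platform described in the session may order and track migrations differently.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical migrations: name -> set of migrations that must run first.
MIGRATIONS = {
    "001_add_email_field": set(),
    "002_backfill_email": {"001_add_email_field"},
    "003_add_email_index": {"002_backfill_email"},
    "004_drop_legacy_contact": {"002_backfill_email"},
}


def execution_order(graph):
    """Return an order in which every migration runs after its dependencies.

    Raises CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(graph).static_order())


def run_migrations(graph, apply, already_applied=()):
    """Apply pending migrations in dependency order; returns the names applied now."""
    applied_now = []
    for name in execution_order(graph):
        if name in already_applied:
            continue  # skip work recorded by an earlier run
        apply(name)   # an exception here stops the run before dependents execute
        applied_now.append(name)
    return applied_now
```

Because a failing `apply` raises before any dependent migration runs, a later invocation with `already_applied` set to the recorded successes resumes from the point of failure.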

Building Analytics Pipelines in Python with DuckDB and Parquet

Many Python applications start with operational databases and CSV exports for reporting and analytics. While these approaches work initially, they quickly become slow, inefficient, and difficult to scale as data volumes grow.

In this talk, we'll explore a modern analytics workflow built around Python, Parquet, and DuckDB. We'll examine why traditional CSV-based pipelines become bottlenecks, how columnar storage formats dramatically improve performance, and how DuckDB enables fast analytical queries directly on files without requiring a dedicated data warehouse.

Through practical examples, we'll walk through the process of extracting data, transforming it into Parquet datasets, and running analytical workloads using familiar SQL from within Python applications. We'll also compare performance characteristics across different approaches and discuss when these tools are the right choice for production systems.

Topics covered include:
• Why CSV-based analytics pipelines struggle at scale
• Understanding Parquet and columnar storage fundamentals
• Querying Parquet files directly with DuckDB
• Building efficient analytics workflows in Python
• Leveraging predicate pushdown and column pruning
• Performance comparisons and benchmarking
• Choosing the right architecture for analytical workloads
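Querying a Parquet file directly from Python might look like the following sketch. It assumes the third-party `duckdb` package is installed; the table contents, column names, and filter are made up for illustration.

```python
import os
import tempfile


def demo():
    import duckdb  # third-party dependency: pip install duckdb

    con = duckdb.connect()  # in-memory database, no server process
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "orders.parquet")
        # Write a small Parquet file to query (stand-in for an extracted dataset).
        con.execute(
            f"""
            COPY (
                SELECT i AS order_id, i % 3 AS region_id, i * 10 AS amount
                FROM range(1, 101) AS t(i)
            ) TO '{path}' (FORMAT PARQUET)
            """
        )
        # Only region_id and amount are referenced, and rows are filtered on amount,
        # so the scan can prune the unused column and push the filter down.
        rows = con.execute(
            """
            SELECT region_id, sum(amount) AS revenue
            FROM read_parquet(?)
            WHERE amount >= 500
            GROUP BY region_id
            ORDER BY region_id
            """,
            [path],
        ).fetchall()
    con.close()
    return rows
```

The query runs against the file itself, with no load step into a warehouse, which is the workflow the topics above describe.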

Attendees will leave with a practical understanding of modern analytics tooling and learn how to build lightweight, high-performance analytics pipelines using Python without introducing complex infrastructure.

Key Takeaways
• Understand the limitations of traditional CSV workflows
• Learn how Parquet improves storage and query performance
• Use DuckDB for fast analytical processing directly from Python
• Build scalable analytics pipelines with minimal operational overhead
• Apply modern data engineering techniques to existing Python applications

Building an Event-Driven Knowledge Graph Platform in Python

Knowledge within engineering organizations is often scattered across repositories, pull requests, issue trackers, CI/CD systems, and documentation. While each system contains valuable information, understanding the relationships between people, projects, services, and decisions remains a significant challenge.

In this talk, I'll demonstrate how to build a production-ready knowledge graph platform using Python and event-driven architecture principles. We'll explore how streams of events—such as Git commits, pull requests, code reviews, deployments, and tickets—can be transformed into entities and relationships that continuously evolve a graph representation of organizational knowledge.

The session will walk through the complete architecture, including event ingestion, stream processing, graph modeling, incremental updates, and query patterns. We'll discuss practical design decisions, scalability considerations, and lessons learned from building systems that process large volumes of events while maintaining an accurate and up-to-date graph.

Topics covered include:
• Designing event-driven pipelines with Python
• Modeling entities and relationships for knowledge graphs
• Processing real-time events using message brokers
• Incremental graph updates and consistency challenges
• Querying relationships across people, projects, and systems
• Using knowledge graphs to enhance search, recommendations, and AI applications
• Production lessons learned from building graph-powered systems
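The step from events to graph edges can be illustrated with a small in-memory model. The event schema, relation names, and `KnowledgeGraph` class are assumptions for this sketch; a production system would use a message broker and a graph database instead.

```python
from collections import defaultdict


class KnowledgeGraph:
    """In-memory edge store: (source, relation) -> set of targets."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add_edge(self, source, relation, target):
        # Sets make updates idempotent: replaying an event adds nothing new.
        self.edges[(source, relation)].add(target)

    def neighbors(self, source, relation):
        return set(self.edges.get((source, relation), set()))


def apply_event(graph, event):
    """Translate one event (hypothetical schema) into entities and relationships."""
    kind = event["type"]
    if kind == "commit":
        graph.add_edge(event["author"], "COMMITTED_TO", event["repo"])
    elif kind == "review":
        graph.add_edge(event["reviewer"], "REVIEWED", event["pull_request"])
        graph.add_edge(event["pull_request"], "TARGETS", event["repo"])
    elif kind == "deployment":
        graph.add_edge(event["repo"], "DEPLOYED_AS", event["service"])


def services_touched_by(graph, person):
    """Two-hop query: person -> repositories -> deployed services."""
    services = set()
    for repo in graph.neighbors(person, "COMMITTED_TO"):
        services |= graph.neighbors(repo, "DEPLOYED_AS")
    return services


graph = KnowledgeGraph()
events = [
    {"type": "commit", "author": "alice", "repo": "payments"},
    {"type": "deployment", "repo": "payments", "service": "checkout"},
    {"type": "review", "reviewer": "bob", "pull_request": "payments#12", "repo": "payments"},
]
for event in events:
    apply_event(graph, event)  # the graph is updated incrementally, one event at a time
```

The two-hop query answers a question that no single source system holds: which running services a given person has contributed code to.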

Attendees will leave with a practical understanding of how Python can be used to build scalable knowledge graph platforms and how graph-based approaches can unlock insights that are difficult to discover using traditional databases alone.

Speaker

Muhammed Mizaj
