Dunith Dhanushka
Senior Developer Advocate at Redpanda
Manchester, United Kingdom
Dunith avidly enjoys designing, building, and operating large-scale real-time, event-driven architectures. He has 10+ years of experience doing so and loves to share what he has learned through blogging, videos, and public speaking.
Dunith works at Redpanda as a Senior Developer Advocate, where he spends much time educating developers about building event-driven applications with Redpanda.
Trash Talk: Using Langchain4J to Build Intelligent Waste Sorting Chatbots
Last week, I had a close call with my municipal council—nearly facing a fine for tossing the remains of ground coffee into the wrong bin. What seemed like a simple mistake quickly spiraled into an hour-long quest through the council’s website, desperately trying to decipher their waste sorting guidelines. After much frustration, I finally discovered that ground coffee belongs in the food waste bin! It hit me then—what if there was an easier way? What if I could just ask a question like, “Where does this go?” and get an instant answer? That’s when the idea for this chatbot was born—a smart assistant to take the guesswork out of waste sorting.
In this talk, I’ll guide you through the architecture of an AI-powered chatbot that does exactly that. Using Langchain4J, we’ll transform unstructured text from municipal websites into a fast, intuitive AI tool that helps users sort waste correctly. We’ll look at how the chatbot processes natural language queries, turning them into embeddings that are matched against a vector database in Pinecone. Using Langchain4J’s abstractions, such as ChatLanguageModel, EmbeddingModel, and EmbeddingStore, we’ll show how this solution provides near-instant results, offering a seamless user experience.
For Java developers, this session will be a deep dive into integrating AI into your projects using Langchain4J. You’ll learn how to leverage language models, vector-based searches, and scalable architectures to bring smart, real-time functionality into your applications. Whether you’re interested in chatbots, recommendation engines, or other AI-powered tools, you’ll walk away with the knowledge to drive real innovation in your own development workflow.
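To make the retrieval idea concrete, here is a minimal, self-contained sketch of embedding-based matching. The class, the hand-made three-dimensional vectors, and the bin names are invented for illustration; in the real application the vectors would come from an embedding model via Langchain4J and live in Pinecone rather than a `Map`.

```java
import java.util.*;

public class WasteLookup {
    // Toy "embeddings": stand-ins for vectors an embedding model would produce.
    static final Map<String, double[]> bins = Map.of(
        "food waste", new double[]{0.9, 0.1, 0.0},
        "recycling",  new double[]{0.1, 0.9, 0.0},
        "general",    new double[]{0.1, 0.1, 0.9});

    // Cosine similarity between two vectors of equal length.
    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Return the bin whose vector is most similar to the query vector,
    // mirroring what a vector-store similarity search does at scale.
    static String bestBin(double[] query) {
        return bins.entrySet().stream()
            .max(Comparator.comparingDouble(e -> cosine(query, e.getValue())))
            .get().getKey();
    }

    public static void main(String[] args) {
        // A query like "ground coffee" embeds close to the food-waste vector.
        double[] groundCoffee = {0.8, 0.2, 0.1};
        System.out.println(bestBin(groundCoffee)); // prints "food waste"
    }
}
```

The real system replaces the toy map with a Pinecone-backed store, but the nearest-neighbour lookup at the core is the same.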
Takeaways:
- Discover the power of Langchain4J in building AI-driven applications for Java developers.
- Learn how to create conversational interfaces that harness embeddings and vector databases for real-world solutions.
- Understand how this AI-powered chatbot can make waste sorting (and other use cases) easier and more accurate with natural language processing and fast retrieval.
By the end, you’ll see how AI can transform even the most mundane tasks—and how Java developers are perfectly positioned to lead this wave of intelligent applications.
Game, Set, Match: Transforming Live Sports with AI-Driven Commentary
We both enjoy BBC's live text commentary for sports. It often includes novel insights as well as summaries of recent events. Imagine enhancing it with an AI co-pilot for greater efficiency.
In our session, we'll introduce this AI co-pilot, which leverages Redpanda, ClickHouse, Flink, and a Large Language Model. It involves feeding a stream of events into Redpanda, windowing them in Flink by game or time, and storing them in ClickHouse alongside historical data.
The LLM receives recent events and historical data, suggesting text commentary. Commentators can use, edit, or ignore these AI-generated suggestions, enhancing the live commentary experience.
This mechanism applies to similar use cases such as live auctions, traffic updates, weather predictions, and any other scenario that combines real-time data with human commentary.
Many Faces of Real-time Analytics
Real-time analytics systems derive meaningful insights from continuous streams of data, enabling organizations to make swift decisions and react fast. However, not all real-time analytics systems are created equal. While they ultimately share the same goal, they differ in how they achieve it.
This talk aims to classify real-time analytics systems into four main groups based on five characteristics, discuss popular use cases for them, and identify the best technology choice for implementing them in production.
The first half of the talk introduces the five characteristics that you can use to assess any real-time analytics system: data freshness, query latency, concurrency, query complexity, and access to historical data. Then, we classify real-time analytics systems into four groups based on those characteristics and discuss their use cases while taking a fictitious train company as a reference.
The second half of the talk explores the best technology choices to implement for each group, including stream processors, streaming databases, and real-time OLAP databases. Finally, we draw a real-time analytics landscape for the above train company to achieve different analytical needs.
This talk would be a guide for beginner practitioners in the data analytics domain to identify, assess, and find the right technology stack for their real-time analytics use cases.
Unbundling the Modern Streaming Stack
Event-first thinking and streaming help organizations transition from followers to leaders in the market. A reliable, scalable, and economical streaming architecture helps them get there.
This talk first explores the "classic streaming stack," based on the Lambda architecture, its origin, and why it didn't catch on among data-driven organizations. The modern streaming stack (MSS) is a lean, cloud-native, and economical alternative to classic streaming architectures, aiming to make event-driven real-time applications viable for organizations.
The second half of the talk explores the MSS in detail, including its core components, their purposes, and how Kappa architecture has influenced it. Moreover, the talk lays out a few considerations before planning a new streaming application within an organization. The talk concludes by discussing the challenges in the streaming world and how vendors are trying to overcome them in the future.
The Real-Time Analytics Stack
Although Real-Time Analytics may seem like a recent trend, there have been attempts at it stretching back several decades.
This talk first explores the "classic streaming stack," based on the Lambda architecture, its origin, and why it didn't catch on among data-driven organizations. The Real-Time Analytics (RTA) Stack is a lean, cloud-native, and economical alternative to classic streaming architectures, which aims to make event-driven real-time applications viable for organizations.
The second half of the talk explores the RTA Stack in detail, including its core components, their purposes, and how the Kappa architecture influenced them. We will also outline a few considerations before planning a new streaming application within an organization.
The talk concludes by discussing the challenges in the streaming world and how vendors are trying to overcome them in the future.
Streaming vs. Eventing: Differences and Co-existence
Streaming and eventing are two architectural styles based on event-driven architecture. Although both contribute to building asynchronous, scalable, and decoupled applications, a few differences exist in their approaches.
The first half of this talk compares and contrasts eventing and streaming across a few dimensions, including event delivery semantics, retention, selective event subscription, and processing patterns. Additionally, several real-world use cases will be discussed for each.
While streaming and eventing have their differences, they can co-exist to fill each other's gaps, enabling you to build even better solutions by combining them. The second half of the talk discusses a practical use case related to event-driven Microservices, which combines both the eventing and streaming features to build a scalable and reliable operational system.
Building a User-facing Analytics Dashboard with Airbyte, Apache Pinot, and Streamlit
When you hear "decision-maker," it's natural to think of "C-suite" or "executive."
But these days, we're all decision-makers. Restaurant owners, bloggers, big-box shoppers, and diners all have important decisions to make and need instant, actionable insights. Businesses need access to fast, fresh analytics to provide these insights to end users like us.
In this session, we will learn how to build our own real-time analytics application on top of a streaming data source using Apache Kafka, Apache Pinot, and Streamlit. Kafka is a distributed, open-source pub-sub messaging and streaming platform for real-time workloads; Pinot is an OLAP database designed for ultra-low latency analytics. Streamlit is a Python-based tool that makes it easy to build data-driven apps.
After introducing these tools, we'll use Airbyte to build an ETL pipeline that captures MySQL database changes and moves them to Apache Pinot for further analysis. Once we've done that, we'll bring everything together with an auto-refreshing Streamlit dashboard that queries Apache Pinot to generate a beautiful dashboard.
Don’t drop it; handle it: Reliable Message Reprocessing Patterns for Kafka
Failures are inevitable in distributed systems. We often come across unreliable networks, botched-up downstream systems, and rogue message payloads, forcing our applications to detect and handle failures as gracefully as possible.
Once Kafka accepts a message, it stores it durably, allowing consumers to process it at will. From that point on, the consumer is responsible for processing the message reliably and handling failures efficiently.
This talk discusses several error-handling patterns you can implement in Kafka consumer applications. We will explore different approaches to handling transient and non-transient errors and highlight the use of dead letter topics in Kafka for message reprocessing. Finally, we will walk through a Spring Kafka application code to showcase blocking and non-blocking message retry scenarios.
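The retry-then-dead-letter pattern described above can be sketched without any Kafka dependency. This is a minimal illustration of the pattern only: the class name, the in-memory list standing in for a dead letter topic, and the handler are all invented for this example, not Spring Kafka APIs.

```java
import java.util.*;
import java.util.function.Consumer;

public class RetryWithDlt {
    // Messages that exhaust their retries land here; in a real consumer
    // this would be a publish to a Kafka dead letter topic.
    static final List<String> deadLetters = new ArrayList<>();

    // Attempt to process a message up to maxAttempts times. Transient
    // failures are retried; after the final failure, the message is routed
    // to the dead-letter list instead of being dropped.
    static void process(String message, int maxAttempts, Consumer<String> handler) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                handler.accept(message);
                return; // processed successfully
            } catch (RuntimeException e) {
                // transient error: fall through and retry
            }
        }
        deadLetters.add(message); // still failing after maxAttempts
    }

    public static void main(String[] args) {
        Consumer<String> handler = msg -> {
            if (msg.contains("bad")) throw new RuntimeException("rogue payload");
        };
        process("good-payload", 3, handler);
        process("bad-payload", 3, handler);
        System.out.println(deadLetters); // [bad-payload]
    }
}
```

Spring Kafka packages this idea as retry topics and dead letter publishing, including the non-blocking variant where retries happen on separate topics so the main partition keeps moving.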
Transforming Your APIs Into Business Gold – Architecting a Real-Time API Usage Analytics Platform
In today's hyper-connected digital landscape, real-time API usage analysis and billing have become paramount. With APIs at the heart of modern applications and services, gaining real-time insights into their usage is often mission-critical, and the value of those insights depends on factors such as latency, throughput, freshness, and correctness.
This talk discusses the architecture of a real-time API usage analytics system composed of Redpanda, Apache Flink, and Apache Pinot. Redpanda, as a scalable streaming data platform, enables high-volume, low-latency ingestion of API usage data from API gateways in real time. The ingested data is streamed through Flink for streaming ETL, enabling operations like joins, aggregations, and transformations, with the final output of the pipeline fed to Apache Pinot for serving analytics at scale. We will also use Flink to rate-limit incoming API requests.
Having such a system enables businesses to make informed, agile decisions, addressing performance bottlenecks, security threats, and resource allocation issues in the API infrastructure promptly. Also, it ensures seamless user experiences by identifying usage patterns, issues, and opportunities as they occur, proactively enhancing the quality of your product.
Whether you're a data engineer, an application developer, or an architect, this talk promises to equip you with the knowledge and tools to build a real-time analytics solution that empowers your organization with the insights it needs to stay ahead in a data-driven world.
Real-Time Analytics Summit 2024
Open Source Analytics Conference 2023
Kafka Summit London 2023
Current 2022: The Next Generation of Kafka Summit