Streaming Machine Learning with Flink, Pulsar, and Iceberg
Discord is the place to talk online, whether that’s one-on-one, in small groups, or in larger communities organized around shared interests. In this talk, we'll show how Discord uses Apache Flink to power real-time machine learning applications that fight abuse at scale and keep over 150M active users safe. We'll share the how and why of our migration from Google Pub/Sub to Pulsar, and how we pair Pulsar with Apache Iceberg to create a data layer capable of seamless historical and real-time serving. Together, the three technologies enable faster feature engineering and backfilling, preserve point-in-time accuracy, and minimize offline-online skew, making this architecture compelling for practical real-time ML in production.