Speaker

Antonio Murgia

Data Architect @ Agile Lab

Bologna, Italy

The spirit of a SW Engineer in the body of a Data Engineer: on a never-ending quest for painless data operationalisation

Area of Expertise

  • Information & Communications Technology

Topics

  • Data Platform
  • Data Management
  • data mesh
  • Big Data
  • Analytics and Big Data
  • data engineering
  • All things data
  • Big Data Machine Learning AI and Analytics
  • data masking

Navigating Icebergs: Don't Be the Next Titanic!

The advantages that Apache Iceberg brings to the Data Lake ecosystem are undeniable. However, it's important to remember that behind the scenes, immutable Parquet files reside in your lake. Consequently, the usual culprits are ever-present, ready to cause trouble; data skew and data fragmentation are just a couple of examples.

With that in mind, wouldn't it be great to have a plug-and-play solution for monitoring the health of your tables, one that integrates seamlessly with your OpenTelemetry-compliant observability platform?

In this presentation, we'll introduce a straightforward yet invaluable (and free) solution to keep your Iceberg tables on your radar!

Syncing the Iceberg: Real-Time sailing at Terabyte Latitudes

Iceberg, along with other table formats, promises ACID properties atop read-optimized and open file formats like Apache Parquet. But is achieving this promise feasible when synchronising tables in near real-time? Will optimistic concurrency remain the optimal choice? What trade-offs will we encounter? Let's embark on a journey across glacial seas and find out!

DataOps in action with Nessie, Iceberg and Great Expectations

In this talk, I will present how we used Nessie, Iceberg, and Great Expectations to build a DataOps pipeline that ensures data quality and avoids “Datastrophes”.

Comet shines, Photon flies!

In the fast-evolving world of data engineering, speed is king, and query accelerators for Apache Spark are at the forefront of this race. This talk dives deep into two major contenders: Apache DataFusion Comet, an open-source accelerator, and Databricks Photon, a proprietary solution exclusive to the Databricks platform.

We’ll explore their architectures, performance benchmarks, and cost implications to answer the big question: is Photon’s proprietary edge worth the investment, or can Comet’s open-source approach deliver comparable acceleration and total cost savings? Whether you’re a Spark enthusiast, a cost-conscious engineer, or a decision-maker evaluating data platforms, this talk will illuminate the trade-offs and help you choose the right path for your organization’s data acceleration needs.

Join us to discover who truly leads the race—and where the value lies for your data strategy.

Subsurface LIVE 2024 Sessionize Event

May 2024

Subsurface LIVE 2023 Sessionize Event

March 2023
