Cijo Thomas
Principal Software Engineer at Microsoft
Seattle, Washington, United States
Cijo is a Software Engineer at Microsoft specializing in Observability. He has been deeply involved with the OpenTelemetry project since its inception and is a core maintainer for the OpenTelemetry .NET and OpenTelemetry Rust implementations. He is also an Approver for the OpenTelemetry Specification and the OpenTelemetry Arrow project. Beyond OpenTelemetry, he maintains various other telemetry solutions within Microsoft.
Topics
High-Volume Logging Without High Cost: Flight Recorder for OpenTelemetry Logs
Organizations face escalating costs from high-volume application logs—most never used. In response, teams aggressively filter logs, risking incident blind spots.
This talk introduces the Flight Recorder pattern for OTel Logs: instead of continuously exporting logs via OTLP, applications export them to ring buffers, ideally backed by OS-native mechanisms like Windows ETW or Linux Tracepoints, but also implementable in user space. Logs remain local and are continuously overwritten until a trigger (a critical error or an operator request) causes the buffered logs to be exported via OTLP to the OTel Collector, maintaining existing workflows.
Once triggered, the system also enters a temporary streaming mode, exporting all new logs continuously for a brief period. This approach eliminates unnecessary logging costs while ensuring logs are readily available when it matters. We’ll present a design and working prototype that shows how this model delivers substantial cost savings without sacrificing observability.
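The pattern described above can be illustrated with a minimal user-space sketch in Python. This is not the prototype from the talk: the `FlightRecorder` class, the `export` callback (a stand-in for an OTLP exporter), and the `stream_window` parameter are all hypothetical names, and the "brief period" of streaming is approximated here by a fixed number of records rather than a time interval.

```python
from collections import deque


class FlightRecorder:
    """Illustrative user-space ring buffer for log records (hypothetical API)."""

    def __init__(self, capacity, export, stream_window=3):
        # A deque with maxlen drops the oldest record once capacity is reached.
        self._buffer = deque(maxlen=capacity)
        # Assumption: `export` takes a list of records (stand-in for OTLP export).
        self._export = export
        # Assumption: streaming mode lasts for a fixed number of records.
        self._stream_window = stream_window
        self._streaming_left = 0

    def log(self, severity, message):
        record = (severity, message)
        if self._streaming_left > 0:
            # Streaming mode: export each new record immediately.
            self._export([record])
            self._streaming_left -= 1
            return
        self._buffer.append(record)
        if severity == "ERROR":
            # A critical error acts as a trigger.
            self.trigger()

    def trigger(self):
        """Flush the buffered records and enter temporary streaming mode."""
        self._export(list(self._buffer))
        self._buffer.clear()
        self._streaming_left = self._stream_window
```

With `capacity=3`, logging five INFO records exports nothing; a following ERROR record flushes only the three most recent records (including the error itself), and the next `stream_window` records are exported as they arrive. An operator request would call `trigger()` directly.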
Beyond OTLP: Unlocking the potential of OS-native tracing
OTLP has become the de facto standard for exporting telemetry in OpenTelemetry, but its reliance on networking (TCP) and batching introduces challenges: CPU contention, memory overhead, potential data loss during crashes, and complex retry and back-pressure handling.
This session explores a powerful alternative: leveraging OS-native tracing mechanisms—ETW (Event Tracing for Windows) and Linux user_events—to export telemetry synchronously via kernel-backed buffers. These mechanisms eliminate batching, network stack overhead, and retry logic, offering exceptional performance and reliability. Moreover, they are naturally suited for dynamic, on-demand telemetry, allowing telemetry to be enabled or disabled without restarts or redeployments—ideal for large-scale systems needing minimal overhead when telemetry is off.
Attendees will learn how ETW and user_events work, how they integrate with OpenTelemetry (without needing re-instrumentation), and how event listeners (e.g., the OTel Collector) can forward data via OTLP—preserving existing pipelines while gaining advanced observability features like capturing call stacks or triggering actions (e.g., process dumps) on specific events.
Behind the Code: Design Choices in OpenTelemetry .NET Metrics API and SDK
In the field of observability, metrics play a key role in monitoring system behavior and identifying potential issues. A reliable, high-performing, and always-running Metrics SDK is essential for monitoring the health of software systems.
This session will explore the OpenTelemetry .NET Metrics SDK, highlighting its design principles aimed at performance, predictable and bounded memory usage, and memory efficiency. We will delve into the engineering challenges faced by the team and the various design trade-offs considered. We'll briefly cover performance testing techniques.
A notable feature of the SDK is its ability to record measurements in as little as 10 nanoseconds, and at most a few hundred nanoseconds, all with zero heap allocation. Moreover, the SDK includes memory limit safeguards that keep memory usage bounded regardless of the input, so that the telemetry system itself does not become an attack vector.
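One common way to keep metric memory bounded regardless of input is to cap the number of distinct attribute combinations and fold everything past the cap into a single overflow series. The Python sketch below illustrates that general idea only; it is not the OpenTelemetry .NET implementation, and `BoundedCounter`, `max_series`, and the `overflow` key are hypothetical names chosen for illustration.

```python
class BoundedCounter:
    """Illustrative counter whose memory is capped (hypothetical API)."""

    # Assumption: a single reserved key collects all measurements past the cap.
    OVERFLOW = (("overflow", True),)

    def __init__(self, max_series):
        self._max_series = max_series
        self._sums = {}

    def add(self, value, **attributes):
        # Sort attributes so the same set always maps to the same key.
        key = tuple(sorted(attributes.items()))
        if key not in self._sums and len(self._sums) >= self._max_series:
            # Cap reached: do not create a new series for unseen attributes.
            key = self.OVERFLOW
        self._sums[key] = self._sums.get(key, 0) + value

    def snapshot(self):
        # Holds at most max_series + 1 entries (the extra one is the overflow series).
        return dict(self._sums)
```

However many distinct attribute values a caller supplies, the dictionary never grows beyond `max_series + 1` entries, so hostile or buggy input cannot grow memory without limit. The cost is that measurements past the cap lose their individual attributes.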
Join us for an overview of how OpenTelemetry .NET Metrics strives to meet its performance and efficiency objectives. We'll conclude the session with actionable insights for end-users to effectively use OpenTelemetry .NET Metrics!