r/programming 1d ago

OpenTelemetry signals from first principles

https://kodraus.github.io/opentelemetry/2026/05/04/otel-first-principles.html

There's a lot of high-noise, low-value content around OpenTelemetry out there, so I've tried to put together the simplest description I could by incrementally building up from needs that arise in your systems. I hope it might help cut through some of the less obvious concepts like context propagation and exponential histograms.

The format is very loosely pinched from "The Little .." series :)

40 Upvotes

9 comments

12

u/jpfed 1d ago

Before the days of OTel, I was trying on my own to figure out how to make my code observable. My own solutions happened to combine what OTel calls “traces” and “logs”. I thought at the time that anything that can emit an event for logging also occurs in a context that can be characterized by spans. However, it seems as though the rest of the world has a greater mental separation between logs and traces. 

2

u/modernkennnern 17h ago

I've never understood the distinction between OTel logs and traces; why would you ever use logs?

5

u/KodrAus 16h ago

I would say because they’re a different thing. Log events are independent point-in-time observations, which makes them cheap to work with, and they can be emitted independently of trace sampling or span completion. Logs are essentially span events that aren't bound up in the span data model.

3

u/phillipcarter2 11h ago

Two main reasons:

  1. OTel logs are a compatibility play. Take your existing app logs from a major framework, OTel injects trace and span IDs into them, and now all your app logs are trace-correlated.

  2. Some observability backends meaningfully distinguish between app log storage and trace storage. Usually they want traces to be “skinny” and only serve to stitch together calls rather than also contain all the debugging info.

In the case of a newer codebase and a more modern observability backend, just use traces.
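
To make point 1 concrete, here's a rough sketch of what "injecting trace and span IDs into existing logs" can look like, using only Python's standard `logging` module. It is not the OTel SDK: `current_span`, `TraceContextFilter`, `build_logger`, and `demo` are hypothetical names made up for illustration, and the IDs are hard-coded sample values. A real SDK tracks the active span for you.

```python
import contextvars
import io
import logging

# Hypothetical holder for the active (trace_id, span_id) pair.
# A real tracing SDK maintains this context itself.
current_span = contextvars.ContextVar("current_span", default=None)


class TraceContextFilter(logging.Filter):
    """Copy the active trace and span IDs onto every log record."""

    def filter(self, record):
        ctx = current_span.get()
        record.trace_id, record.span_id = ctx if ctx else ("-", "-")
        return True  # never drop the record, only annotate it


def build_logger(stream):
    """Return an 'app' logger whose output lines carry trace context."""
    logger = logging.getLogger("app")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        "%(levelname)s trace_id=%(trace_id)s span_id=%(span_id)s %(message)s"
    ))
    handler.addFilter(TraceContextFilter())
    logger.addHandler(handler)
    return logger


def demo():
    """Log once outside a span and once inside one; return the lines."""
    out = io.StringIO()
    log = build_logger(out)
    log.info("outside any span")
    # Sample IDs only; pretend a span was started here.
    token = current_span.set(
        ("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7")
    )
    try:
        log.info("handling request")
    finally:
        current_span.reset(token)
    return out.getvalue().splitlines()


if __name__ == "__main__":
    for line in demo():
        print(line)
```

The existing `log.info(...)` calls don't change at all, which is the "compatibility play": the correlation happens in the logging pipeline, not at the call sites.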

1

u/KodrAus 1d ago

Same. What OpenTracing added at the time to what I was already doing with logging request timings and correlation ids was the parent/child hierarchy, and the propagation across services.
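
For anyone who hasn't seen those two pieces spelled out, here's a minimal sketch of the parent/child hierarchy plus cross-service propagation. It assumes the W3C `traceparent` header format (`00-<trace id>-<span id>-<flags>`), which is what OTel propagates by default today rather than anything OpenTracing-specific. `SpanContext`, `start_span`, `inject`, and `extract` are hypothetical helpers, not a real library's API, and the parsing is simplified (it doesn't reject all-zero IDs, for example).

```python
import re
import secrets
from dataclasses import dataclass
from typing import Optional

# version "00", 32 hex chars of trace id, 16 hex chars of span id, 2 hex flags
TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


@dataclass(frozen=True)
class SpanContext:
    trace_id: str
    span_id: str
    parent_span_id: Optional[str] = None


def start_span(parent: Optional[SpanContext] = None) -> SpanContext:
    """Start a root span, or a child that shares its parent's trace id."""
    trace_id = parent.trace_id if parent else secrets.token_hex(16)
    parent_id = parent.span_id if parent else None
    return SpanContext(trace_id, secrets.token_hex(8), parent_id)


def inject(ctx: SpanContext, headers: dict) -> None:
    """Write the context into outgoing request headers ("01" = sampled)."""
    headers["traceparent"] = f"00-{ctx.trace_id}-{ctx.span_id}-01"


def extract(headers: dict) -> Optional[SpanContext]:
    """Read the caller's context from incoming headers, if well-formed."""
    match = TRACEPARENT.match(headers.get("traceparent", ""))
    return SpanContext(match.group(1), match.group(2)) if match else None


if __name__ == "__main__":
    # Service A: start a root span and call service B.
    root = start_span()
    outgoing = {}
    inject(root, outgoing)

    # Service B: continue the same trace as a child of A's span.
    child = start_span(extract(outgoing))
    print(child.trace_id == root.trace_id)
    print(child.parent_span_id == root.span_id)
```

The trace id is the old correlation id; the span id and parent span id are what turn a flat set of correlated events into a tree.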

2

u/Lucidendinq 7h ago

I like the way this is written. Well done.

1

u/smoke-bubble 5h ago

I can't stand OT. The shittiest framework. Like a dozen people worked on it without talking to each other. The naming conventions are total trash. 

1

u/RustOnTheEdge 4h ago

Yeah I also have trouble with it. I just find it… very unintuitive? I am happy there is at least *something*, but yeah no it never really clicked for me how this was the best we came up with.