π Describe the bug
When OpenTelemetry tracing is enabled (--otlp-traces-endpoint), the Dynamo bytecode transform span emitted from torch.compile instrumentation reports a wildly incorrect duration β tens of thousands of days instead of the actual compile time (seconds).
Example span observed in a trace backend:
Dynamo bytecode transform
Service: llm
Duration: 20622d 20h
Start Time: 0ΞΌs (01:51:02.838)
Kind: internal
Status: unset
Library Name: vllm.tracing.otel
Span attributes
dynamo.time_seconds 17.16094771400094
The dynamo.time_seconds attribute (17.16s) is correct, but the span's own start/end timestamps used for the rendered Duration are off by ~56 years (~20622 days), which matches the gap between the Unix epoch and "now".
Root cause
In vllm/compilation/backends.py (around line 1145-1156):
from .monitor import torch_compile_start_time
dynamo_time = time.perf_counter() - torch_compile_start_time
...
start_time = int(torch_compile_start_time * 1e9)
attributes = {"dynamo.time_seconds": dynamo_time}
instrument_manual("Dynamo bytecode transform", start_time, None, attributes)
torch_compile_start_time is set in vllm/compilation/monitor.py:
torch_compile_start_time: float = 0.0
...
def ...():
global torch_compile_start_time
torch_compile_start_time = time.perf_counter()
time.perf_counter() returns a value from a monotonic clock with an arbitrary, implementation-defined reference point (e.g. system boot time on Linux). It is only meaningful for measuring elapsed intervals via subtraction β it is not wall-clock/Unix-epoch time.
However, instrument_manual forwards start_time directly into OpenTelemetry's tracer.start_span(start_time=...) (see vllm/tracing/otel.py::manual_instrument_otel), which expects nanoseconds since the Unix epoch:
span_kwargs: dict[str, Any] = {
"name": span_name,
"context": ctx,
"start_time": start_time,
}
...
span = tracer.start_span(**span_kwargs)
...
span.end() # end_time is None here, so OTel uses the real wall-clock "now"
Because end_time is None, span.end() stamps the end time using the real wall-clock time.time_ns(). So the span ends up with:
start_time β perf_counter() value (small, interpreted as if it were close to the 1970 epoch)
end_time β actual wall-clock now
The resulting duration is therefore roughly "now minus 1970", i.e. tens of thousands of days, regardless of how long Dynamo actually took to compile.
Suggested fix
Anchor the span timestamps to wall-clock time instead of the monotonic clock, e.g. capture both a monotonic start (for the elapsed delta) and a wall-clock reference in vllm/compilation/monitor.py, and compute the OTel start_time from wall-clock time at the call site in vllm/compilation/backends.py, for example:
# monitor.py
torch_compile_start_time: float = 0.0 # time.perf_counter()
torch_compile_start_time_wall: float = 0.0 # time.time()
# backends.py
dynamo_time = time.perf_counter() - torch_compile_start_time
start_time_wall = torch_compile_start_time_wall # seconds, epoch-relative
start_time = int(start_time_wall * 1e9)
end_time = int((start_time_wall + dynamo_time) * 1e9)
instrument_manual("Dynamo bytecode transform", start_time, end_time, attributes)
This would also let other callers of instrument_manual avoid the same pitfall if they pass a time.perf_counter()-derived value by mistake.
Impact
Any OTel trace viewer (Jaeger, Tempo, etc.) shows the Dynamo bytecode transform span with a nonsensical multi-decade duration, which is confusing/misleading for anyone using tracing to debug compile-time overhead, and can visually break trace waterfalls/timelines in the UI (span extends far outside the rest of the trace).
Environment
Found via source inspection on vllm at commit 3b03a2cf4772838da622d81315941bb41bcc03ff (main, 2026-06-11). The bug is not environment-specific β it reproduces whenever --otlp-traces-endpoint is set and a torch.compile-triggered Dynamo bytecode transform occurs.
Before submitting a new issue...
Root cause and analysis above were investigated with AI assistance (Claude); the underlying source locations and reasoning were reviewed by me before filing.
π Describe the bug
When OpenTelemetry tracing is enabled (
--otlp-traces-endpoint), theDynamo bytecode transformspan emitted fromtorch.compileinstrumentation reports a wildly incorrect duration β tens of thousands of days instead of the actual compile time (seconds).Example span observed in a trace backend:
The
dynamo.time_secondsattribute (17.16s) is correct, but the span's own start/end timestamps used for the renderedDurationare off by ~56 years (~20622 days), which matches the gap between the Unix epoch and "now".Root cause
In
vllm/compilation/backends.py(around line 1145-1156):torch_compile_start_timeis set invllm/compilation/monitor.py:time.perf_counter()returns a value from a monotonic clock with an arbitrary, implementation-defined reference point (e.g. system boot time on Linux). It is only meaningful for measuring elapsed intervals via subtraction β it is not wall-clock/Unix-epoch time.However,
instrument_manualforwardsstart_timedirectly into OpenTelemetry'stracer.start_span(start_time=...)(seevllm/tracing/otel.py::manual_instrument_otel), which expects nanoseconds since the Unix epoch:Because
end_timeisNone,span.end()stamps the end time using the real wall-clocktime.time_ns(). So the span ends up with:start_timeβperf_counter()value (small, interpreted as if it were close to the 1970 epoch)end_timeβ actual wall-clock nowThe resulting duration is therefore roughly "now minus 1970", i.e. tens of thousands of days, regardless of how long Dynamo actually took to compile.
Suggested fix
Anchor the span timestamps to wall-clock time instead of the monotonic clock, e.g. capture both a monotonic start (for the elapsed delta) and a wall-clock reference in
vllm/compilation/monitor.py, and compute the OTelstart_timefrom wall-clock time at the call site invllm/compilation/backends.py, for example:This would also let other callers of
instrument_manualavoid the same pitfall if they pass atime.perf_counter()-derived value by mistake.Impact
Any OTel trace viewer (Jaeger, Tempo, etc.) shows the
Dynamo bytecode transformspan with a nonsensical multi-decade duration, which is confusing/misleading for anyone using tracing to debug compile-time overhead, and can visually break trace waterfalls/timelines in the UI (span extends far outside the rest of the trace).Environment
Found via source inspection on
vllmat commit3b03a2cf4772838da622d81315941bb41bcc03ff(main, 2026-06-11). The bug is not environment-specific β it reproduces whenever--otlp-traces-endpointis set and atorch.compile-triggered Dynamo bytecode transform occurs.Before submitting a new issue...
Root cause and analysis above were investigated with AI assistance (Claude); the underlying source locations and reasoning were reviewed by me before filing.