Under the Hood: How vLLM Parses Reasoning Traces and Tool Calls Token by Token
Reasoning models and tool-calling agents are now central to production LLM systems, but the infrastructure inside vLLM that enables them is rarely documented outside the codebase. This talk opens that black box.
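To make the topic concrete, here is a minimal illustrative sketch (not vLLM's actual implementation) of the core problem the talk covers: a streaming parser that consumes model output token by token and separates reasoning text wrapped in `<think>...</think>` tags from the final answer, even when a tag is split across token boundaries. The class name and tag strings are assumptions for illustration only.

```python
# Illustrative sketch only -- vLLM's real reasoning parsers are more general.
# Separates <think>...</think> reasoning text from normal output while
# consuming the stream one token at a time; tags may span token boundaries.
class ReasoningStreamParser:
    START, END = "<think>", "</think>"

    def __init__(self):
        self.buffer = ""           # unclassified text, may end mid-tag
        self.in_reasoning = False  # are we currently inside <think>?
        self.reasoning = []        # collected reasoning chunks
        self.content = []          # collected normal-output chunks

    def feed(self, token: str) -> None:
        self.buffer += token
        while True:
            tag = self.END if self.in_reasoning else self.START
            idx = self.buffer.find(tag)
            if idx == -1:
                # No complete tag yet: emit all but a possible partial tag
                # at the end of the buffer, then wait for more tokens.
                safe = len(self.buffer) - (len(tag) - 1)
                if safe > 0:
                    self._emit(self.buffer[:safe])
                    self.buffer = self.buffer[safe:]
                return
            # Emit text before the tag, drop the tag, flip state, repeat.
            self._emit(self.buffer[:idx])
            self.buffer = self.buffer[idx + len(tag):]
            self.in_reasoning = not self.in_reasoning

    def _emit(self, text: str) -> None:
        if text:
            (self.reasoning if self.in_reasoning else self.content).append(text)

    def finish(self) -> tuple[str, str]:
        # Flush whatever is left once the stream ends.
        self._emit(self.buffer)
        self.buffer = ""
        return "".join(self.reasoning), "".join(self.content)
```

Feeding it tokens that split the tags mid-way, e.g. `["<th", "ink>pla", "n steps</th", "ink>Answer: 42"]`, still yields the reasoning trace `"plan steps"` and the content `"Answer: 42"`; handling exactly this kind of boundary case incrementally is what the talk's token-by-token parsing refers to.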
XingYan Jiang
DaoCloud, Software Engineer, Cloud Native Enthusiast