Under the Hood: How vLLM Parses Reasoning Traces and Tool Calls Token by Token
Reasoning models and tool-calling agents are now central to production LLM systems, but the infrastructure inside vLLM that enables them is rarely documented outside the codebase. This talk opens that black box.
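To make the topic concrete, here is a minimal illustrative sketch (not vLLM's actual implementation) of the core problem the talk covers: a streaming parser that consumes model output token by token and separates reasoning text wrapped in `<think>...</think>` tags from the final answer, even when a tag is split across token boundaries. The class name and tag strings are assumptions for illustration only.

```python
# Illustrative sketch only -- vLLM's real reasoning parsers are more general.
# Separates <think>...</think> reasoning text from normal output while
# consuming the stream one token at a time; tags may span token boundaries.
class ReasoningStreamParser:
    START, END = "<think>", "</think>"

    def __init__(self):
        self.buffer = ""           # unclassified text, may end mid-tag
        self.in_reasoning = False  # are we currently inside <think>?
        self.reasoning = []        # collected reasoning chunks
        self.content = []          # collected normal-output chunks

    def feed(self, token: str) -> None:
        self.buffer += token
        while True:
            tag = self.END if self.in_reasoning else self.START
            idx = self.buffer.find(tag)
            if idx == -1:
                # No complete tag yet: emit all but a possible partial tag
                # at the end of the buffer, then wait for more tokens.
                safe = len(self.buffer) - (len(tag) - 1)
                if safe > 0:
                    self._emit(self.buffer[:safe])
                    self.buffer = self.buffer[safe:]
                return
            # Emit text before the tag, drop the tag, flip state, repeat.
            self._emit(self.buffer[:idx])
            self.buffer = self.buffer[idx + len(tag):]
            self.in_reasoning = not self.in_reasoning

    def _emit(self, text: str) -> None:
        if text:
            (self.reasoning if self.in_reasoning else self.content).append(text)

    def finish(self) -> tuple[str, str]:
        # Flush whatever is left once the stream ends.
        self._emit(self.buffer)
        self.buffer = ""
        return "".join(self.reasoning), "".join(self.content)
```

Feeding it tokens that split the tags mid-way, e.g. `["<th", "ink>pla", "n steps</th", "ink>Answer: 42"]`, still yields the reasoning trace `"plan steps"` and the content `"Answer: 42"`; handling exactly this kind of boundary case incrementally is what the talk's token-by-token parsing refers to.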
XingYan Jiang
DaoCloud, Software Engineer, Cloud Native Enthusiast