Jingyu Liu (刘镜宇)
Publications
arxiv
Not All Prefills Are Equal: PPD Disaggregation for Multi-turn LLM Serving
Prefill-Decode (PD) disaggregation has become the standard architecture for modern LLM inference engines, which alleviates the …
Zongze Li, Jingyu Liu, Zach Xu, Yineng Zhang, Tahseen Rabbani, Ce Zhang
Scaling Beyond Masked Diffusion Language Models
Diffusion language models are a promising alternative to autoregressive models due to their potential for faster generation. Among …
Subham Sekhar Sahoo, Jean-Marie Lemercier, Zhihan Yang, Justin Deschenaux, Jingyu Liu, John Thickstun, Ante Jukic
HAMburger: Accelerating LLM Inference via Token Smashing
The growing demand for efficient Large Language Model (LLM) inference requires a holistic optimization on algorithms, systems, and …
Jingyu Liu, Ce Zhang
Speculative Prefill: Turbocharging TTFT with Lightweight and Training-Free Token Importance Estimation
Improving time-to-first-token (TTFT) is an essentially important objective in modern large language model (LLM) inference engines. …
Jingyu Liu, Beidi Chen, Ce Zhang