Jingyu Liu (刘镜宇)
arxiv
HAMburger: Accelerating LLM Inference via Token Smashing
The growing demand for efficient Large Language Model (LLM) inference requires holistic optimization across algorithms, systems, and …
Jingyu Liu, Ce Zhang
Speculative Prefill: Turbocharging TTFT with Lightweight and Training-Free Token Importance Estimation
Improving time-to-first-token (TTFT) is an essential objective in modern large language model (LLM) inference engines. …
Jingyu Liu, Beidi Chen, Ce Zhang