Efficient-DLM: From Autoregressive to Diffusion Language Models, and Beyond in Speed
Diffusion language models (dLMs) have emerged as a promising paradigm that enables parallel, non-autoregressive generation, but their …
Effective Long-Context Scaling of Foundation Models
We present a series of long-context LLMs that support effective context windows of up to 32,768 tokens. Our model series are built …