Compute-Skipping Policies for Diffusion LLMs (dLLM-v2) (2026)
Two compute-skipping policies for diffusion LLMs (Fast-dLLM v2) that reuse stable hidden states across denoising steps to cut FLOPs: a layer-level cache-reuse policy and a stability-aware token-level policy that recomputes only the least-similar tokens.
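A minimal sketch of the token-level idea, under stated assumptions: the summary above does not say which states are compared or how many tokens are refreshed, so this toy version assumes cosine similarity between each token's layer input at the previous and current denoising step, and a fixed `recompute_fraction`. The names `least_similar_tokens`, `token_level_step`, `layer_fn`, and `recompute_fraction` are hypothetical and are not taken from the report or from Fast-dLLM v2.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def least_similar_tokens(prev_states, curr_states, recompute_fraction):
    """Return sorted indices of the tokens whose states changed most between steps."""
    sims = [cosine_similarity(p, c) for p, c in zip(prev_states, curr_states)]
    # Assumption: refresh a fixed fraction of tokens per step (rounded up).
    k = math.ceil(recompute_fraction * len(sims))
    ranked = sorted(range(len(sims)), key=lambda i: sims[i])  # ascending similarity
    return sorted(ranked[:k])


def token_level_step(cached_outputs, prev_inputs, curr_inputs, layer_fn, recompute_fraction):
    """Recompute layer_fn only for the least-similar tokens; reuse the cache elsewhere.

    layer_fn is a stand-in for a per-token computation. In a real transformer,
    attention couples tokens, so this per-token call is a simplification.
    """
    to_recompute = least_similar_tokens(prev_inputs, curr_inputs, recompute_fraction)
    outputs = list(cached_outputs)  # start from the cached (stable) outputs
    for i in to_recompute:
        outputs[i] = layer_fn(curr_inputs[i])
    return outputs, to_recompute
```

With `recompute_fraction=0.5` on a four-token block, only the two tokens whose inputs drifted most are passed through `layer_fn`; the other two keep their cached outputs. A layer-level policy would make the same reuse-or-recompute decision once per layer rather than once per token.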
Report (PDF)





