<span class="embed-block embed-speakerdeck"><iframe src="https://speakerdeck.com/player/70468114b3ab42c3a357ba6ee04c24c3" scrolling="no" allowfullscreen allow="encrypted-media" loading="lazy"></iframe></span>
<p data-line="1" class="code-line">Are autoregressive models the only way to build powerful LLMs? The paper "Large Language Diffusion Models" explores an alternative, and I've put together a presentation that walks through its findings. The authors introduce LLaDA, a non-autoregressive model that generates text with a diffusion process instead of predicting tokens one at a time from left to right. My slides cover how this approach works and highlight the main results reported in the paper: strong performance on a range of benchmarks and an advantage on reversal reasoning tasks. Check them out for a summary of the research.</p>


[Paper Introduction] Large Language Diffusion Models
