Behold LLaDA (Large Language Diffusion with mAsking), a diffusion model for text at an unprecedented 8B scale, trained entirely from scratch and rivaling autoregressive models (ARMs) such as LLaMA3 8B.