Blogs - Basavasagar Patil

Home

Training a byte-level discrete diffusion language model to generate palindromes, comparing it to autoregressive approaches.

Understanding per-sample gradients and their applications in data attribution and influence functions.