Gradient Langevin dynamics is an algorithm that extends conventional gradient descent by adding properly scaled Gaussian noise to each iteration. It is also a key algorithm behind many diffusion models, where it drives the sampling procedure. Beyond that, it is useful in general ML optimization problems: the injected noise helps the iterates escape local minima and, under suitable conditions, reach the global minimum of the loss function.
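To make the update concrete, here is a minimal sketch of the gradient Langevin dynamics iteration, x_{k+1} = x_k - eta * grad f(x_k) + sqrt(2 * eta) * xi_k with xi_k ~ N(0, I). The function names, step size, and toy objective below are illustrative assumptions, not taken from any particular paper or implementation.

```python
import numpy as np

def langevin_gradient_descent(grad_f, x0, step_size=1e-3, n_steps=10_000, rng=None):
    """Gradient descent with Langevin noise:
    x <- x - eta * grad_f(x) + sqrt(2 * eta) * xi,  where xi ~ N(0, I).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - step_size * grad_f(x) + np.sqrt(2.0 * step_size) * noise
    return x

# Toy usage on a double-well potential f(x) = x^4 - 2x^2, whose gradient is 4x^3 - 4x.
# Plain gradient descent can get stuck in one well; the noise lets the iterate hop between them.
grad_f = lambda x: 4.0 * x**3 - 4.0 * x
x_out = langevin_gradient_descent(grad_f, x0=np.array([1.5]))
```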
Anchored Langevin Dynamics
In practice, loss functions that are non-differentiable at finitely many points are common, such as Lasso regression, losses with SCAD or MCP penalties, and especially deep neural networks with ReLU activations in multiple layers. At the non-differentiable points the gradient is simply unavailable, so standard Langevin dynamics cannot be applied directly. Together with my advisor and other co-authors in the field, I am currently working on our newly developed algorithm, Anchored Langevin Dynamics, whose key idea is to replace the loss function of the model with a better-behaved reference potential so that the dynamics can be run despite the non-differentiability.
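As a toy illustration of the reference-potential idea only (this is a hypothetical sketch of the general principle, not the Anchored Langevin Dynamics algorithm from our work), one could replace the non-differentiable L1 penalty in a Lasso objective with a smooth pseudo-Huber surrogate and run the same Langevin update as above on that surrogate:

```python
import numpy as np

def smoothed_lasso_grad(x, A, b, lam=0.1, eps=1e-3):
    """Gradient of a smoothed Lasso potential (illustrative stand-in for a reference potential):
    0.5 * ||A x - b||^2 + lam * sum_i sqrt(x_i^2 + eps^2)
    The pseudo-Huber term sqrt(x^2 + eps^2) is differentiable everywhere,
    unlike the original |x| penalty, which is non-differentiable at 0.
    """
    return A.T @ (A @ x - b) + lam * x / np.sqrt(x**2 + eps**2)

# Hypothetical usage with random data; the loop is the same Langevin update as above.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
b = rng.standard_normal(50)

x = np.zeros(10)
eta = 1e-4
for _ in range(20_000):
    x = x - eta * smoothed_lasso_grad(x, A, b) + np.sqrt(2.0 * eta) * rng.standard_normal(x.shape)
```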