A Complete Recipe for Stochastic Gradient MCMC

Many recent Markov chain Monte Carlo (MCMC) samplers leverage continuous dynamics to define a transition kernel that efficiently explores a target distribution. In tandem, much work has focused on devising scalable variants that subsample the data and use stochastic gradients in place of full-data gradients in the dynamic simulations. However, such stochastic gradient MCMC samplers have lagged behind their full-data counterparts in the complexity of the dynamics considered, since proving convergence in the presence of stochastic gradient noise is non-trivial. In this paper, we provide a general recipe for constructing MCMC samplers, including stochastic gradient versions, based on continuous Markov processes. We prove that this framework is complete: any continuous Markov process that provides samples from the target distribution can be written in our framework. We show how previous continuous-dynamic samplers can be trivially reinvented within our framework, avoiding complicated sampler-specific proofs. We also use our recipe to straightforwardly propose a new state-adaptive sampler: stochastic gradient Riemann Hamiltonian Monte Carlo (SGRHMC). Our experiments on simulated data and a streaming Wikipedia analysis demonstrate that the proposed SGRHMC sampler inherits the advantages of Riemann HMC while retaining the scalability of stochastic gradient methods.

