Deriving the expectation-maximization algorithm, and the beginnings of its application to LDA. Once the derivation is finished, the algorithm's intimate connection to variational inference is apparent.
Stochastic maximum likelihood, contrastive divergence, noise contrastive estimation and negative sampling for improving or avoiding the computation of the gradient of the log-partition function. (Oof, that's a mouthful.)
A pedantic walk through Boltzmann machines, with focus on the computational thorn-in-side of the partition function.
Introducing the RBF kernel, and motivating its ubiquitous use in Gaussian processes.
A thorough, straightforward, un-intimidating introduction to Gaussian processes in NumPy.
The higher education paradigm is changing. Motivation, logistics and strategic insight re: designing the "Open-Source Masters" for yourself.
Convolutional variational autoencoders for emoji generation and Siamese text-question-emoji-answer models. Keras, bidirectional LSTMs and snarky tweets @united within.
Coupling nimble probabilistic models with neural architectures in Edward and Keras: "what worked and what didn't," a conceptual overview of random effects, and directions for further exploration.
Exploring generative vs. discriminative models, and sampling and variational methods for approximate inference through the lens of Bayes' theorem.