Statistical underpinnings of the machine learning models we know and love. A walk through random variables, entropy, exponential family distributions, generalized linear models, maximum likelihood estimation, cross entropy, KL divergence, maximum a posteriori estimation and going "fully Bayesian."

Autoencoding airports via variational autoencoders to improve flight delay prediction. Additionally, a principled look at variational inference itself and its connections to machine learning.

Deriving the softmax from first conditional probabilistic principles, and how this framework extends naturally to define the softmax regression, conditional random fields, naive Bayes and hidden Markov models.

In this post, we look to beat the performance of Implicit Matrix Factorization on a recommendation task using 5 different neural network architectures.