Parallelization

Distributed Second-Order Optimization Using Kronecker-Factored Approximate Curvature

A recent publication tried to parallelize K-FAC across multiple processes to speed up convergence. In this blog post I want to summarize its main contribution and offer a little more insight.