Fitting larger networks into memory
When training deep feed-forward neural networks consisting of n layers, we can reduce the memory consumption to O(sqrt(n)) in this way, at the cost of performing one additional forward pass (see e.g. Training Deep Nets with Sublinear Memory Cost, by Chen et al. (2016)). This repository provides an implementation of this functionality in Tensorflow, using the Tensorflow graph editor to automatically rewrite the computation graph of the backward pass.