The distributed version of TensorFlow has been released, but the documentation is not great. As with other open-source libraries like this, the most effective way to learn it is to read the sample code. In the case of TensorFlow, the only sample available is the model trained on CIFAR. Technically, it isn’t really a sample of distributed TensorFlow; it just trains on multiple GPUs. Moreover, the CIFAR sample is quite hard to follow, and it doesn’t clearly demonstrate how to write a program that uses a TensorFlow cluster.
So this is my attempt to demonstrate that. Check it out and leave me a comment if you run into any issues.