From DistBelief to TensorFlow
Google has been using deep learning internally for years. The predecessor system, DistBelief, is designed to train large-scale neural networks by distributing computation across thousands of machines. DistBelief works, but its architecture is rigid: adding new layer types or experimenting with different architectures requires deep changes to the system's code. The Google Brain team decides to redesign the framework from scratch, aiming for a system that is flexible, performant and (an unprecedented decision for Google) public.
In November 2015 TensorFlow is released under the Apache 2.0 licence. The framework Google uses internally for Search, Gmail, Google Translate and speech recognition becomes available to everyone.
Static computational graphs
TensorFlow’s architecture is built on the concept of a static computational graph. The programmer defines a graph of operations — additions, matrix multiplications, convolutions, activation functions — that describes the entire computation flow of the neural network. Graph nodes represent operations, edges represent the tensors (multi-dimensional arrays) that flow between operations.
The graph is defined in a first phase (define) and then executed in a second phase (run) within a session. This separation allows the system to analyse the entire graph before execution, apply optimisations — operation fusion, efficient memory allocation, parallelisation — and distribute computation across different devices.
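The define-then-run separation can be illustrated with a toy sketch in plain Python (this is not TensorFlow's API, just a minimal stand-in for the idea): nodes are created without computing anything, and evaluation happens only in a separate run phase, mirroring how a TF 1.x graph is executed inside a session.

```python
# Toy illustration of define-then-run (not TensorFlow's API).

class Node:
    def __init__(self, op, inputs=()):
        self.op = op          # callable that computes this node's value
        self.inputs = inputs  # upstream nodes whose outputs it consumes

def constant(value):
    return Node(lambda: value)

def add(a, b):
    return Node(lambda x, y: x + y, (a, b))

def mul(a, b):                # scalar stand-in for a matrix multiply
    return Node(lambda x, y: x * y, (a, b))

def run(node):
    """Execution phase: walk the graph and evaluate each node."""
    args = [run(n) for n in node.inputs]
    return node.op(*args)

# Define phase: build the graph; nothing is computed yet.
x = constant(3.0)
y = constant(4.0)
z = add(mul(x, y), constant(1.0))   # z = x * y + 1

# Run phase: only now does the computation actually execute.
print(run(z))  # 13.0
```

Because the full graph exists before `run` is called, a real runtime can inspect it as a whole, which is precisely what enables the optimisations described above.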
CPU, GPU and clusters
TensorFlow abstracts the execution device: the same graph can be run on a CPU, on a GPU via NVIDIA’s CUDA, or distributed across a cluster of machines. The developer annotates which operations should run on which device, and the runtime handles communication and synchronisation.
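A hedged sketch of the idea, again in plain Python rather than TensorFlow's actual runtime: each operation carries a device annotation (TF 1.x uses `with tf.device('/gpu:0'):` blocks for this), and a tiny "runtime" records where each operation would be dispatched.

```python
# Toy sketch of device placement (not TensorFlow's runtime).

def placed_op(name, device, fn):
    """An operation annotated with the device it should run on."""
    return {"name": name, "device": device, "fn": fn}

def run_graph(ops, inputs):
    """Execute ops in order, tracking which device handles each one."""
    placement = {}
    value = inputs
    for op in ops:
        placement[op["name"]] = op["device"]  # a real runtime dispatches here
        value = op["fn"](value)
    return value, placement

ops = [
    placed_op("scale", "/gpu:0", lambda v: [x * 2 for x in v]),
    placed_op("sum",   "/cpu:0", lambda v: sum(v)),
]

result, placement = run_graph(ops, [1, 2, 3])
print(result)     # 12
print(placement)  # {'scale': '/gpu:0', 'sum': '/cpu:0'}
```

In TensorFlow itself, operations not explicitly annotated are placed automatically, and the runtime inserts the transfers needed when a tensor crosses a device boundary.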
Distributed training allows scaling to dataset sizes that a single machine could not handle. Gradient computation is parallelised across workers, each operating on its own shard of the data, while parameter servers coordinate the updates to the network's shared weights.
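The parameter-server pattern can be sketched as follows (an illustrative toy, not TensorFlow's distributed API): workers compute gradients on their own data shard, and the parameter server averages those gradients and applies the update to the shared weight. The model, data and learning rate here are invented for the example.

```python
# Toy sketch of data-parallel training with a parameter server.
# Model: y = w * x, trained on targets y = 2 * x, so w should approach 2.

def gradient(w, shard):
    """Worker step: gradient of mean squared error on one data shard."""
    return sum(2 * (w * x - 2 * x) * x for x in shard) / len(shard)

def train(shards, w=0.0, lr=0.05, steps=100):
    for _ in range(steps):
        # Each worker computes its gradient in parallel (sequential here).
        grads = [gradient(w, shard) for shard in shards]
        # Parameter server: average the gradients, update the shared weight.
        w -= lr * sum(grads) / len(grads)
    return w

shards = [[1.0, 2.0], [3.0, 4.0]]  # each worker holds one shard
w = train(shards)
print(round(w, 3))  # converges to 2.0
```

Real systems add the complications this sketch omits: asynchronous updates from stale workers, sharding the parameters themselves across several servers, and fault tolerance.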
TensorBoard and tooling
TensorBoard is the integrated visualisation tool: it allows inspecting the structure of the computational graph, monitoring metrics during training — loss, accuracy, weight distributions — and visualising high-dimensional embeddings. For teams training complex models over days or weeks, the ability to observe training in real time is essential for diagnosing problems and tuning hyperparameters.
TensorFlow makes available to researchers, startups and companies the same level of deep learning infrastructure that until then had been accessible only within large research laboratories.
Link: tensorflow.org
