The deep learning barrier
By 2015 deep learning is producing notable results in image recognition, machine translation and natural language processing. The available frameworks for building and training neural networks — Theano, Caffe, early versions of TensorFlow — are powerful but impose a steep learning curve. Defining a model requires explicitly managing tensors, computational graphs, symbolic derivatives and low-level optimisations. For a researcher or engineer who wants to experiment rapidly with different architectures, the distance between the idea and working code is too great.
François Chollet, an engineer and researcher, publishes Keras with a clear objective: to make deep learning accessible without sacrificing flexibility.
An API designed for humans
Keras presents itself as a high-level Python API for building neural networks. The guiding principle is the reduction of cognitive load: every common operation should require the fewest possible steps, and user errors should produce clear, actionable messages.
A Sequential model is built by stacking layers one on top of another: Dense for fully connected layers, Conv2D for convolutions, LSTM for temporal sequences, Dropout for regularisation. Each layer explicitly declares its output dimension and activation function. The functional API allows building more complex architectures — models with multiple inputs, multiple outputs, residual connections — by explicitly connecting layers as nodes of a graph.
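The two styles can be sketched side by side. This is a minimal illustration (the layer sizes, activations and the specific skip connection are illustrative choices, not taken from the original text):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Sequential API: layers stacked one after another.
sequential_model = keras.Sequential([
    keras.Input(shape=(784,)),
    layers.Dense(64, activation="relu"),   # fully connected layer
    layers.Dropout(0.5),                   # regularisation
    layers.Dense(10, activation="softmax"),
])

# Functional API: layers connected explicitly as nodes of a graph,
# here with a simple residual-style skip connection.
inputs = keras.Input(shape=(784,))
x = layers.Dense(64, activation="relu")(inputs)
y = layers.Dense(64, activation="relu")(x)
added = layers.Add()([x, y])               # residual connection
outputs = layers.Dense(10, activation="softmax")(added)
functional_model = keras.Model(inputs=inputs, outputs=outputs)
```

The Sequential form covers the common single-path case; the functional form is needed as soon as the architecture branches or merges.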
Compile, fit, evaluate
The workflow in Keras follows three distinct steps. Compile configures the model for training: one specifies the loss function (cross-entropy, mean squared error), the optimiser (SGD, Adam, RMSprop) and the metrics to monitor (accuracy, precision). Fit starts training on the data: Keras automatically handles splitting into batches, epochs, data shuffling and validation. Evaluate measures the model’s performance on a separate test dataset.
This linear flow allows moving from architecture definition to experimental results in a few lines of code, without having to manually manage training loops.
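The three steps look like this in practice. A hedged sketch on synthetic data (the random dataset, the binary task and every hyperparameter here are illustrative):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Synthetic data for a toy binary classification task.
x_train = np.random.rand(256, 20).astype("float32")
y_train = np.random.randint(0, 2, size=(256,))
x_test = np.random.rand(64, 20).astype("float32")
y_test = np.random.randint(0, 2, size=(64,))

model = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

# Compile: loss function, optimiser, metrics to monitor.
model.compile(loss="binary_crossentropy", optimizer="adam",
              metrics=["accuracy"])

# Fit: batching, epochs, shuffling and validation handled automatically.
model.fit(x_train, y_train, batch_size=32, epochs=2,
          validation_split=0.2, verbose=0)

# Evaluate: loss and metrics on held-out data.
loss, acc = model.evaluate(x_test, y_test, verbose=0)
```

No hand-written training loop appears anywhere: the loop over batches and epochs lives inside fit.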
Swappable backend
A distinctive architectural choice of Keras is the swappable backend. Keras does not perform numerical computations directly: it delegates tensor operations to an underlying computation engine. At the time of release, the primary backend is Theano, the symbolic computation framework developed at MILA in Montreal. TensorFlow is supported as an alternative backend shortly after its own release. This separation allows changing the computation engine without modifying model code, protecting the investment in architecture definitions.
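In the multi-backend era, the engine was selected outside the model code, typically through the configuration file at ~/.keras/keras.json (or the KERAS_BACKEND environment variable). A sketch of that historical configuration file; the exact set of fields varied across versions:

```json
{
    "backend": "theano",
    "floatx": "float32",
    "epsilon": 1e-07,
    "image_data_format": "channels_last"
}
```

Changing "backend" to "tensorflow" and re-importing Keras switched engines without touching any model definition.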
Link: keras.io
