Hugging Face Transformers 4.0: the library and hub of open AI models

Hugging Face Transformers 4.0 (November 2020) by Thomas Wolf et al.: unified Python library for transformer models, PyTorch/TensorFlow/JAX support, Hub with thousands of pre-trained models, democratisation of modern AI.


From chatbot to AI foundation

Hugging Face, founded in New York in 2016 by Clément Delangue, Julien Chaumond and Thomas Wolf, started out as a consumer chatbot startup. In 2018 it pivoted toward open source AI tooling with a library first published as pytorch-pretrained-bert, then renamed pytorch-transformers and finally just transformers.

The Transformers library began as a unified PyTorch port of transformer models (BERT, GPT-2, T5, BART, etc.), exposed through one consistent API for fine-tuning and inference. It is released under the Apache 2.0 licence.
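To make the "consistent API" point concrete, here is a minimal sketch that builds two different architectures through the same entry point and calls them the same way. The tiny layer sizes and the token ids are assumptions chosen only so the models build quickly with random weights and no download; they do not correspond to any published checkpoint.

```python
import torch
from transformers import AutoModel, BertConfig, GPT2Config

# Hypothetical, deliberately tiny configurations (not real checkpoints).
configs = {
    "bert": BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=2,
                       num_attention_heads=2, intermediate_size=64),
    "gpt2": GPT2Config(vocab_size=100, n_embd=32, n_layer=2, n_head=2,
                       n_positions=64),
}

# Arbitrary token ids, all below vocab_size: a batch of 1 sequence of 5 tokens.
input_ids = torch.tensor([[1, 2, 3, 4, 5]])

shapes = {}
for name, config in configs.items():
    # The same factory call works for both architectures; weights are random.
    model = AutoModel.from_config(config)
    model.eval()
    with torch.no_grad():
        outputs = model(input_ids=input_ids)
    # Both models expose their hidden states under the same attribute name.
    shapes[name] = tuple(outputs.last_hidden_state.shape)
    print(name, shapes[name])
```

Both models return a tensor of shape (batch, sequence length, hidden size), so downstream code does not need to know which architecture produced it.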

Version 4.0, released on 19 November 2020, consolidates the modern offering: PyTorch, TensorFlow and JAX support, a high-level pipeline() for common tasks (classification, QA, summarisation, translation), and model distribution through the Hub.

The Hub

Hugging Face Hub is the GitHub of AI models: a repository of versioned pre-trained models, datasets and Spaces (Gradio/Streamlit demo apps). By 2021 it already hosts thousands of models.

from transformers import pipeline
classifier = pipeline("sentiment-analysis")
classifier("I love open source")
# [{'label': 'POSITIVE', 'score': 0.9998}]

Three lines for an SST-2 fine-tuned, BERT-based classifier (a DistilBERT checkpoint by default), downloaded from the Hub on first use.
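What gets shared through the Hub is essentially the set of files that save_pretrained() writes, and from_pretrained() accepts either a Hub repository id or a local directory. The sketch below shows that round trip locally, assuming a tiny randomly initialised classifier (hypothetical sizes, not a real checkpoint) so that nothing has to be downloaded.

```python
import tempfile
from pathlib import Path

import torch
from transformers import BertConfig, BertForSequenceClassification

# Hypothetical tiny classifier with random weights, for illustration only.
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=64, num_labels=2)
model = BertForSequenceClassification(config)
model.eval()

input_ids = torch.tensor([[1, 2, 3, 4]])  # arbitrary token ids

with tempfile.TemporaryDirectory() as tmp:
    # Writes config.json plus a weights file into the directory.
    model.save_pretrained(tmp)
    saved_files = sorted(p.name for p in Path(tmp).iterdir())

    # Loading from a local directory uses the same call as loading from the Hub.
    reloaded = BertForSequenceClassification.from_pretrained(tmp)
    reloaded.eval()

    with torch.no_grad():
        same = torch.allclose(model(input_ids).logits,
                              reloaded(input_ids).logits)

print(saved_files)
print("identical logits after reload:", same)
```

The name of the weights file depends on the installed library version, which is why the sketch prints the directory listing rather than assuming it.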

Ecosystem models

Transformers supports out of the box (several of these arrived in releases after 4.0):

  • BERT, RoBERTa, ALBERT, DistilBERT
  • GPT-2, GPT-Neo, GPT-NeoX
  • T5, BART, Pegasus
  • ViT, CLIP, Wav2Vec2

Companion libraries: datasets (dataset loading and streaming), tokenizers (fast Rust-based tokenisation), accelerate (distributed training).
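As an illustration of the tokenizers library mentioned above, the following sketch trains a small BPE tokenizer on an in-memory corpus. The three sentences, the vocabulary size and the "[UNK]" token name are assumptions made for the example; a real tokenizer would be trained on far more text.

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import BpeTrainer

# Toy corpus, invented for this example.
corpus = [
    "open source models are shared on the hub",
    "transformer models need fast tokenizers",
    "the hub hosts open models and datasets",
]

# Byte-pair encoding model with an explicit unknown token.
tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
# Split on whitespace and punctuation before learning merges.
tokenizer.pre_tokenizer = Whitespace()

trainer = BpeTrainer(vocab_size=100, special_tokens=["[UNK]"],
                     show_progress=False)
tokenizer.train_from_iterator(corpus, trainer=trainer)

encoding = tokenizer.encode("open source models")
print(encoding.tokens)
print(encoding.ids)
```

The exact sub-word split depends on the merges learned from the corpus, so the printed tokens will change if the corpus or vocabulary size changes.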

Impact

Hugging Face is making modern AI accessible to every Python developer: the library and the Hub enable rapid adoption of pre-trained models in production.

In the Italian context

The library is widely used across Italian AI/ML teams. Many Italian startups and research groups base products and experiments on Hub models (Italian BERT variants, fine-tuning on domain-specific tasks).


References: Hugging Face Transformers 4.0 (19 November 2020). Thomas Wolf, Julien Chaumond, Clément Delangue. Apache 2.0 licence. Hugging Face Hub. Ecosystem: datasets, tokenizers, accelerate.
