pandas limits

pandas (Wes McKinney, 2008) is the dominant Python DataFrame library, but it shows its limits at modern data scales:

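Single-threaded by default: most operations do not use multiple CPU cores
Eager evaluation: every operation runs immediately, so a chain of operations is never optimised as a whole
Memory: the whole DataFrame is loaded into RAM, which causes problems when the data is larger than available memory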

Polars, started in 2020 by the Dutch engineer Ritchie Vink, answers these limits with a DataFrame library written from scratch in Rust:

Rust core built on the Apache Arrow columnar memory format
Lazy evaluation: the chain of operations is optimised before execution
Multi-threaded by default
Streaming: datasets larger than RAM can be processed in batches
Query optimiser: predicate pushdown, projection pushdown

Polars is released under the MIT licence. Versions 0.14 and 0.15 (autumn 2022) consolidated its production maturity.

API

Polars has two modes:

Eager (familiar from pandas):

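import polars as pl

# Eager: read_csv loads the whole file, then each step runs immediately.
# Note: group_by was spelled groupby in Polars releases before 0.19.
df = pl.read_csv("data.csv")
result = df.filter(pl.col("age") > 30).group_by("country").agg(pl.col("salary").mean())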

Lazy (optimised):

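import polars as pl

# Lazy: scan_csv only declares the source; nothing is read until collect().
result = (
    pl.scan_csv("data.csv")
      .filter(pl.col("age") > 30)       # candidate for predicate pushdown
      .group_by("country")
      .agg(pl.col("salary").mean())     # only the columns used need reading
      .collect()                        # optimise the plan, then execute it
)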

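In lazy mode, scan_csv reads nothing up front: Polars builds a query plan, optimises it (for example by reading only the columns and rows the query needs), and executes it only when collect() is called.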

Performance

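Public benchmarks (TPC-H queries, the H2O.ai db-benchmark) report Polars as roughly 10-100x faster than pandas on medium to large datasets, depending on the query. On a single node it is competitive with Spark; for distributed workloads, Dask or Ray remain the reference.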

Interoperability

Polars integrates with:

pandas: df.to_pandas() and pl.from_pandas()
NumPy
Arrow (shared data format with Spark, DuckDB, others)
Parquet, CSV, JSON, Avro
PyArrow and, through it, Arrow Flight (pyarrow.flight)

In the Italian context

Polars is starting to attract interest among Italian data teams for workloads where pandas is too slow but Spark would be overkill.


Polars: DataFrames in Rust, pandas successor
