Tutorial: OpenHands in Docker for reproducible research experiments

OpenHands (formerly OpenDevin) running in a Docker container with a read-only mount on the input data: configurable model endpoint, workspace persistence, and network constraints for isolation.

Open Source · AI · Agentic · Tutorial · OpenHands · OpenDevin · Docker · Research

Preliminary notes

Content provided “as-is”. Before using OpenHands on real data:

  • Work inside a container with targeted volume mounts: the agent has full access to the container’s file system.
  • Back up dataset and code before every session.
  • Never place secrets in environment variables passed to the container unless strictly required. Prefer config files mounted read-only and rotated after use.
  • OpenHands is an autonomous agent that runs code: the Docker perimeter is the last line of defence. --privileged or --network host defeat the isolation; avoid them.
  • For regulated data (personal, health), use a local model and disable container network access once images are pulled.

What OpenHands is

OpenHands (github.com/All-Hands-AI/OpenHands), originally released as OpenDevin on 12 March 2024 and renamed OpenHands on 26 August 2024 (announced by Graham Neubig, v0.9.0), is an MIT-licensed open-source project providing a platform for agents able to write code, navigate file systems, use browsers and terminals. The architecture runs the agent inside a sandbox, typically a Docker container, with a web frontend (or CLI) where the operator observes and steers.

The operational difference from an IDE assistant is the degree of autonomy: OpenHands can plan a sequence of steps and execute them without per-step approval, within the configured sandbox. For a research context the paradigm is attractive if paired with a strict perimeter.

Use case: first cleaning pass on a scientific dataset in a sandbox

A researcher receives a 5 GB CSV dataset with heterogeneous schema (numeric columns with inconsistent separators, dates in multiple formats). They want a repeatable, documented first cleaning pass, with no risk of altering the originals.

1. Project layout

~/exp-openhands/
├── input/              # source data, mounted read-only
│   └── measurements.csv
├── workspace/          # agent working dir, writable
└── logs/               # session transcripts and reports
mkdir -p ~/exp-openhands/{input,workspace,logs}
cp /data/raw/measurements.csv ~/exp-openhands/input/
chmod -R a-w ~/exp-openhands/input      # also read-only inside the container
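To confirm the `chmod -R a-w` actually took effect before starting a session, a small stdlib check can walk the input tree and flag anything that still carries a write bit (the helper name `writable_paths` is illustrative, not part of OpenHands):

```python
import os

def writable_paths(root: str) -> list[str]:
    """Return every path under root (root included) with any write bit set."""
    offenders = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for path in [dirpath] + [os.path.join(dirpath, f) for f in filenames]:
            if os.stat(path).st_mode & 0o222:  # owner/group/other write bits
                offenders.append(path)
    return offenders

# An empty list means `chmod -R a-w` covered the whole tree:
# print(writable_paths(os.path.expanduser("~/exp-openhands/input")))
```

Note this checks permission bits, not effective access: a root process inside the container can still write regardless, which is why the bind mount's `:ro` flag in step 2 is the stronger guarantee.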

2. Launch OpenHands in Docker

Indicative syntax — the current OpenHands release may change flag names; refer to the docs for the installed version.

docker run --rm -it \
  --name openhands \
  --network openhands-net \
  -p 3000:3000 \
  -v ~/exp-openhands/input:/workspace/input:ro \
  -v ~/exp-openhands/workspace:/workspace/work \
  -v ~/exp-openhands/logs:/workspace/logs \
  -e LLM_BASE_URL=http://ollama:11434/v1 \
  -e LLM_API_KEY=ollama \
  -e LLM_MODEL=openai/deepseek-coder-v2:16b \
  docker.all-hands.dev/all-hands-ai/openhands:latest

Configuration highlights:

  • input mounted :ro — the agent cannot overwrite source data.
  • workspace is the only writable directory: scripts, derived datasets, plots end up here.
  • logs kept separate for session artifacts — easy to wipe if they contain PII.
  • Model via local endpoint (Ollama on the dedicated openhands-net Docker network): no outbound traffic to external providers.
  • No --privileged and no --network host: the agent lives in a dedicated Docker network.

3. Session prompt

In the web UI at http://localhost:3000:

“/workspace/input/measurements.csv (read-only) contains a tabular dataset. Create a script /workspace/work/clean.py that: (a) reads the file auto-detecting the separator; (b) normalises dates to ISO 8601; (c) converts numeric columns stripping thousands separators; (d) writes the result as /workspace/work/measurements_cleaned.parquet; (e) writes to /workspace/logs/cleaning_report.md a summary with input rows, rejected rows and reasons. Do not change anything in /workspace/input.”

OpenHands plans the execution. While it acts, monitor the side panel for commands issued and files being created.
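What the agent actually produces will vary from run to run. As orientation, the core transformations the prompt asks for (separator sniffing, date normalisation, thousands-separator stripping) can be sketched with the standard library alone; the date formats and helper names here are assumptions, and a real clean.py would add the Parquet output via pandas/pyarrow:

```python
import csv
from datetime import datetime

# Assumed candidate formats; day-first variants are tried before month-first.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y", "%m/%d/%Y")

def sniff_delimiter(sample: str) -> str:
    """Auto-detect the field separator from a text sample."""
    return csv.Sniffer().sniff(sample, delimiters=",;\t|").delimiter

def to_iso_date(value: str) -> str:
    """Normalise a date string to ISO 8601; raise ValueError if nothing matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {value!r}")

def to_float(value: str) -> float:
    """Parse a number that may use thousands separators ('1.234,5' or '1,234.5')."""
    v = value.strip()
    if "," in v and "." in v:
        # The rightmost symbol is the decimal mark; the other separates thousands.
        if v.rfind(",") > v.rfind("."):
            v = v.replace(".", "").replace(",", ".")
        else:
            v = v.replace(",", "")
    elif "," in v:
        # A lone comma is treated as a decimal mark; genuinely ambiguous
        # inputs ('1,234') need a per-column decision.
        v = v.replace(",", ".")
    return float(v)
```

Rows that fail either conversion are exactly what the prompt asks the agent to count as “rejected rows and reasons” in cleaning_report.md.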

4. Result verification

After completion:

ls -la ~/exp-openhands/workspace/
cat ~/exp-openhands/logs/cleaning_report.md
python3 -c "import pandas as pd; print(pd.read_parquet('$HOME/exp-openhands/workspace/measurements_cleaned.parquet').info())"

The resulting clean.py is the reproducible artifact. Commit it to the project repo: cleaning is no longer “session magic”, it is a documented step.
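A cheap cross-check, run outside the agent's session: count the data rows of the original CSV independently and compare with the “input rows” figure in cleaning_report.md. A stdlib-only sketch (the function name is illustrative; paths follow this tutorial's layout):

```python
import csv
from pathlib import Path

def count_data_rows(csv_path: str, encoding: str = "utf-8") -> int:
    """Count non-empty data rows (header excluded), independent of pandas."""
    with open(csv_path, newline="", encoding=encoding) as fh:
        sample = fh.read(8192)          # sniff the delimiter on a prefix
        fh.seek(0)
        delimiter = csv.Sniffer().sniff(sample, delimiters=",;\t|").delimiter
        reader = csv.reader(fh, delimiter=delimiter)
        next(reader, None)              # skip header
        return sum(1 for row in reader if any(cell.strip() for cell in row))

# Compare against the "input rows" figure in cleaning_report.md:
# count_data_rows(str(Path.home() / "exp-openhands/input/measurements.csv"))
```

If the numbers disagree, inspect clean.py before trusting the cleaned Parquet file.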

5. Closing the session

docker stop openhands
# Mounted volumes remain: results are on the host filesystem.
# If logs contain PII, wipe them:
#   shred -u ~/exp-openhands/logs/*

Limits and caveats

  • The sandbox is not complete confinement: Docker is a pragmatic sandbox, not an isolated VM. An agent with access to the Docker socket or a network shared with other services expands the surface. For higher risk, run OpenHands in a dedicated VM or in an ephemeral CI runner.
  • The model can produce “plausible but wrong” code: the cleaning_report.md summary is not ground truth. Compare counts against an independent query on the original dataset before accepting the cleaning.
  • Local 7B–14B models are modest at multi-step data cleaning: complex datasets may require several iterations with manual corrections to the script. It is not fire-and-forget automation.
  • Fast-moving versions: OpenHands flag names, environment variables and container structure changed more than once during 2024. Pin an explicit image tag (avoid :latest for sessions that should be reproducible months later).

Link: github.com/All-Hands-AI/OpenHands · all-hands.dev


Stefano Noferi — Founder and CEO/CTO of noze
Tech Entrepreneur — AI Governance & Security Architect
