Tutorial: Goose with a read-only MCP server on anonymised FHIR metadata

This tutorial configures Block's Goose with a custom MCP server that exposes read-only metadata from an anonymised FHIR bundle. It relies on a local LLM via Ollama, directory isolation and test datasets only.


Preliminary notes

Content is provided “as-is” and is not suitable for real clinical data without a DPIA (Data Protection Impact Assessment). Before any trial:

  • Work only on test datasets or data anonymised under a documented and validated procedure.
  • Back up the working folder before every session.
  • No cloud endpoints on PHI (Protected Health Information): the model must be local (Ollama, vLLM) even for trials. A misconfiguration on real data is a breach.
  • Never put secrets in prompts or in version-controlled config files.
  • MCP and Goose are fast-moving technologies: check the official docs of the version you have, especially for server syntax.
  • This integration is not medical software. Do not use it for diagnostic, triage or clinical decision support purposes.

What Goose is and what MCP is

Goose (block.github.io/goose) is an open-source desktop and CLI agent released by Block in January 2025 under the Apache 2.0 licence. The key trait here is its support for the Model Context Protocol (MCP), the interoperability standard between LLM agents and tools/data sources published by Anthropic in late 2024. An MCP server exposes capabilities (resources, tools, prompts); Goose consumes them over JSON-RPC.
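To make the JSON-RPC exchange concrete, the sketch below builds the kind of request an MCP client such as Goose would send to invoke a tool. The `tools/call` method and its `name`/`arguments` parameters follow the MCP specification; the helper name `make_tool_call`, the request id and the example tool name and filename are illustrative assumptions, not part of Goose.

```python
import json

def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialise an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,            # chosen by the client, echoed in the response
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)

if __name__ == "__main__":
    # Hypothetical tool name and filename, for illustration only.
    print(make_tool_call(1, "bundle_metadata", {"filename": "synthetic-bundle.json"}))
```

In practice the client library builds these messages; seeing one spelled out mainly helps when reading server logs.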

For a healthcare organisation or a vendor building medical-software products, the combination opens a concrete possibility: experiment with an agent on top of their own data without the data leaving the infrastructure, and without reimplementing ad-hoc integrations.

Use case: exploring an anonymised sample FHIR bundle

A development team is designing a module that queries a folder of anonymised test FHIR bundles. We want to see whether an agent can help a developer ask exploratory questions (counts by resource type, observation distribution, conformance to a minimal schema) without first writing a custom tool.
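As a rough illustration of what "conformance to a minimal schema" could mean here, the sketch below checks three structural properties of a bundle. The rules themselves (top-level `resourceType` equal to `Bundle`, `entry` being a list, each entry carrying `resource.resourceType`) are an assumption made for this example, not a FHIR validation profile, and `minimal_schema_problems` is a hypothetical helper name.

```python
def minimal_schema_problems(bundle: dict) -> list[str]:
    """Return human-readable problems; an empty list means the bundle conforms."""
    problems = []
    if bundle.get("resourceType") != "Bundle":
        problems.append("resourceType is not 'Bundle'")
    entries = bundle.get("entry", [])
    if not isinstance(entries, list):
        return problems + ["entry is not a list"]
    for i, entry in enumerate(entries):
        resource = entry.get("resource") if isinstance(entry, dict) else None
        # Only the presence of resourceType is checked; no clinical field is read.
        if not isinstance(resource, dict) or "resourceType" not in resource:
            problems.append(f"entry[{i}] has no resource.resourceType")
    return problems
```

A check like this is a baseline a developer can compare the agent's answers against; full conformance testing would need a proper FHIR validator.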

1. Preparing test data

Use synthetic or publicly-released development data only. The Synthea project (Apache 2.0, synthetichealth.github.io/synthea) generates FHIR bundles of fictional patients. Alternatively, adapt your own datasets using an internally certified anonymisation procedure.

mkdir -p ~/fhir-lab/input ~/fhir-lab/reports
# Place only test or synthetic JSON here:
cp /path/to/synthetic-bundle.json ~/fhir-lab/input/
# Read-only for safety:
chmod 444 ~/fhir-lab/input/*.json
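
Before each session it may be worth confirming that the `chmod` above actually took effect. The following is a minimal sketch, assuming a POSIX filesystem; `writable_inputs` is a hypothetical helper name. It inspects the permission bits directly rather than using `os.access`, which would report files as writable when run as root.

```python
import stat
from pathlib import Path

def writable_inputs(input_dir: Path) -> list[str]:
    """Names of .json files in input_dir that still have any write bit set."""
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    return sorted(
        p.name for p in input_dir.glob("*.json")
        if p.stat().st_mode & write_bits
    )

if __name__ == "__main__":
    leftovers = writable_inputs(Path.home() / "fhir-lab" / "input")
    print("writable files:", leftovers or "none")
```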

2. Minimal read-only MCP server in Python

A minimal MCP server exposing only two tools: list_bundles (lists the JSON files in the input directory) and bundle_metadata (returns resource-type counts for a bundle, with no clinical content). The server parses each bundle in full but returns only aggregate counts, so demographic fields never reach the model.

# ~/fhir-lab/fhir_mcp_server.py
import json, os
from pathlib import Path
from mcp.server.fastmcp import FastMCP  # depends on SDK version

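# Required, no risky default; resolved so later path comparisons are reliable.
BASE = Path(os.environ["FHIR_LAB_DIR"]).resolve()
if not BASE.is_dir():  # explicit check: assert is skipped under `python -O`
    raise SystemExit("FHIR_LAB_DIR invalid")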

app = FastMCP("fhir-lab-readonly")

@app.tool()
def list_bundles() -> list[str]:
    """List .json files in the input directory."""
    return sorted(p.name for p in (BASE / "input").glob("*.json"))

@app.tool()
def bundle_metadata(filename: str) -> dict:
    """Aggregated resourceType counts in a bundle. No clinical content."""
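    input_dir = (BASE / "input").resolve()
    p = (input_dir / filename).resolve()
    # Reject anything resolving outside the input directory ("..", absolute paths, symlinks).
    if input_dir not in p.parents:
        raise ValueError("path traversal")
    bundle = json.loads(p.read_text(encoding="utf-8"))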
    counts = {}
    for entry in bundle.get("entry", []):
        rt = entry.get("resource", {}).get("resourceType", "Unknown")
        counts[rt] = counts.get(rt, 0) + 1
    return {"filename": filename, "totalEntries": sum(counts.values()), "byType": counts}

if __name__ == "__main__":
    app.run()

Three elementary security properties:

  • BASE required via env: no hardcoded paths, no fallback.
  • Path-traversal check: filename cannot escape the declared directory.
  • Read-only functions: no mutating operation exposed to the model.
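
The path-traversal property is easy to get subtly wrong, so it can help to isolate it and test it on its own. The sketch below is one way to do that, under the assumption that both sides of the comparison must be fully resolved; `resolve_inside` is a hypothetical helper, not part of the MCP SDK.

```python
from pathlib import Path

def resolve_inside(input_dir: Path, filename: str) -> Path:
    """Resolve filename under input_dir, rejecting anything that escapes it."""
    root = input_dir.resolve()
    # resolve() collapses ".." and follows symlinks; joining with an absolute
    # filename discards root entirely, which the parents check then catches.
    candidate = (root / filename).resolve()
    if root not in candidate.parents:
        raise ValueError(f"path traversal rejected: {filename!r}")
    return candidate
```

Comparing a resolved path against an unresolved base directory can reject every request when the base is a symlink or a relative path, which is why both are resolved here.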

Install dependencies in an isolated virtualenv:

python3 -m venv ~/fhir-lab/.venv
source ~/fhir-lab/.venv/bin/activate
pip install "mcp[cli]"   # package name depends on the release

3. Configure Goose to use the server and a local model

# Local provider via Ollama:
ollama pull qwen2.5:14b   # or another appropriate model

# Goose configuration (the exact command may vary between versions):
goose configure
# Pick: provider = Ollama; model = qwen2.5:14b

Attach the MCP server to the Goose profile. The syntax below is indicative: adapt it to the config.yaml format of the installed version, where key names may differ (for example cmd and envs rather than command and env):

extensions:
  fhir-lab:
    type: stdio
    command: /home/user/fhir-lab/.venv/bin/python
    args: ["/home/user/fhir-lab/fhir_mcp_server.py"]
    env:
      FHIR_LAB_DIR: "/home/user/fhir-lab"

4. Session with Goose

cd ~/fhir-lab
goose session

Useful prompts in the REPL:

  • “List available bundles using the list_bundles tool.”
  • “For each bundle, show resourceType counts using bundle_metadata. Summarise in a table.”
  • “Which bundles do not contain Observation? Return the list.”

The agent answers by calling the MCP tools. No clinical data enters the prompt, only the metadata the server deliberately exposes: filenames and resource-type counts.
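
Because a local model can misreport tool output, it is reasonable to cross-check an answer such as "which bundles do not contain Observation" with a few lines of plain Python. This sketch assumes the input is a list of dictionaries shaped like the return value of bundle_metadata; `bundles_without` and the sample filenames are hypothetical.

```python
def bundles_without(resource_type: str, metadata: list[dict]) -> list[str]:
    """Filenames whose byType counts lack resource_type (or report zero)."""
    return sorted(
        m["filename"] for m in metadata
        if m.get("byType", {}).get(resource_type, 0) == 0
    )

if __name__ == "__main__":
    # Made-up metadata, shaped like bundle_metadata's return value.
    sample = [
        {"filename": "a.json", "totalEntries": 3, "byType": {"Patient": 1, "Observation": 2}},
        {"filename": "b.json", "totalEntries": 1, "byType": {"Patient": 1}},
    ]
    print(bundles_without("Observation", sample))
```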

Limits and caveats

  • Anonymisation ≠ security: an “anonymised” dataset can still carry PHI via indirect re-identification. Use documented procedures (pseudonymisation, generalisation, k-anonymity or equivalents) before considering it usable in an AI trial.
  • The MCP server is part of your trusted computing base (TCB): a bug bypassing the path-traversal check, or a mutating tool added for convenience, opens an attack surface on sensitive data. Treat the server code as you would code that writes to clinical databases.
  • A local model is not “risk-free”: traces of Goose sessions (user home, shell history, swap) can still contain queried metadata. Protect the workstation as you would a clinical client.
  • Not medical software: even seemingly harmless examples (type counts) are not clinically validated. Any diagnostic or care-pathway use requires a regulatory route (MDR, ISO 13485, IEC 62304) outside the scope of this tutorial.
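
The k-anonymity criterion mentioned in the first caveat can be expressed in a few lines: k is the size of the smallest group of records sharing the same quasi-identifier values. The sketch below assumes flat records (one dictionary per person) and hypothetical field names; it is a sanity check, not a substitute for a documented anonymisation procedure.

```python
from collections import Counter

def k_anonymity(records: list[dict], quasi_identifiers: list[str]) -> int:
    """Size of the smallest group sharing the same quasi-identifier values."""
    if not records:
        return 0
    groups = Counter(
        tuple(r.get(q) for q in quasi_identifiers) for r in records
    )
    return min(groups.values())

if __name__ == "__main__":
    # Made-up records; birthYear and postcodePrefix are illustrative field names.
    rows = [
        {"birthYear": 1980, "postcodePrefix": "501"},
        {"birthYear": 1980, "postcodePrefix": "501"},
        {"birthYear": 1975, "postcodePrefix": "561"},
    ]
    print(k_anonymity(rows, ["birthYear", "postcodePrefix"]))
```

A result of 1 means at least one record is unique on the chosen quasi-identifiers and is therefore a candidate for re-identification.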

Links: block.github.io/goose, modelcontextprotocol.io, synthetichealth.github.io/synthea


Stefano Noferi, Founder and CEO/CTO of noze
Tech Entrepreneur, AI Governance & Security Architect
