Preliminary notes
This content is provided “as-is” and is not suitable for real clinical data without a Data Protection Impact Assessment (DPIA). Before any trial:
- Work only on test datasets or data anonymised under a documented and validated procedure.
- Back up the working folder before every session.
- No cloud endpoints on PHI (Protected Health Information): the model must be local (Ollama, vLLM) even for trials. A misconfiguration on real data is a breach.
- Never place secrets into prompts or versioned config files.
- MCP and Goose are fast-moving technologies: check the official docs of the version you have, especially for server syntax.
- This integration is not medical software. Do not use it for diagnostic, triage or clinical decision support purposes.
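The "no cloud endpoints" rule above can be turned into a small pre-flight check. The sketch below takes the strictest reading of "local" (the model runs on the same machine) and only inspects the URL string; the function name is hypothetical, and an on-premises inference host would need an explicit allow-list instead.

```python
import ipaddress
from urllib.parse import urlparse


def is_loopback_endpoint(url: str) -> bool:
    """Return True only if the URL's host is localhost or a loopback IP."""
    host = urlparse(url).hostname
    if host is None:
        return False  # not a parseable URL: refuse rather than guess
    if host == "localhost":
        return True  # assumption: localhost resolves to loopback on this machine
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # a DNS name other than localhost is treated as remote


# 11434 is the default Ollama port; the second URL is a made-up remote endpoint.
for candidate in ["http://127.0.0.1:11434", "https://api.example.com/v1"]:
    print(candidate, is_loopback_endpoint(candidate))
```

A check like this does not replace network-level controls; it only catches the most obvious misconfiguration before a session starts.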
What Goose is and what MCP is
Goose (block.github.io/goose) is an open-source desktop and CLI agent released by Block in January 2025, Apache 2.0. The key trait here is its adherence to the Model Context Protocol (MCP), the interoperability standard between LLM agents and tools/data sources published by Anthropic in late 2024. An MCP server exposes capabilities (resources, tools, prompts); Goose consumes them over JSON-RPC.
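To make "over JSON-RPC" concrete, here is a minimal sketch of the kind of request an MCP client sends when it wants a server to run a tool. The method name tools/call and the params shape follow the MCP specification; the helper function, the request id and the example arguments are illustrative only, and in practice Goose builds these messages itself.

```python
import json


def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialise a JSON-RPC 2.0 request asking an MCP server to run one tool."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Illustrative call: the tool name matches the server defined later in this tutorial.
request = make_tool_call(1, "bundle_metadata", {"filename": "synthetic-bundle.json"})
print(request)
```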
For a healthcare organisation or a vendor building medical-software products, the combination opens a concrete possibility: experiment with an agent on top of their own data without the data leaving the infrastructure, and without reimplementing ad-hoc integrations.
Use case: exploring an anonymised sample FHIR bundle
A development team is designing a module that queries a folder of anonymised test FHIR bundles. We want to see whether an agent can help a developer ask exploratory questions (counts by resource type, observation distribution, conformance to a minimal schema) without first writing a custom tool.
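As an illustration of what "conformance to a minimal schema" could mean, the sketch below checks only the structure that the rest of this tutorial relies on. The rules are an assumption made for this example, not a FHIR validation: a real project would use a proper FHIR validator against its chosen profiles.

```python
def check_minimal_bundle(bundle: dict) -> list[str]:
    """Return a list of structural problems; an empty list means the checks passed."""
    problems = []
    if bundle.get("resourceType") != "Bundle":
        problems.append("top-level resourceType is not 'Bundle'")
    entries = bundle.get("entry", [])
    if not isinstance(entries, list):
        return problems + ["'entry' is not a list"]
    for i, entry in enumerate(entries):
        resource = entry.get("resource") if isinstance(entry, dict) else None
        if not isinstance(resource, dict):
            problems.append(f"entry[{i}] has no resource object")
        elif not isinstance(resource.get("resourceType"), str):
            problems.append(f"entry[{i}].resource has no resourceType")
    return problems


# Hand-written toy input: one well-formed entry and two malformed ones.
sample = {
    "resourceType": "Bundle",
    "entry": [{"resource": {"resourceType": "Patient"}}, {"resource": {}}, {}],
}
for problem in check_minimal_bundle(sample):
    print(problem)
```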
1. Preparing test data
Use synthetic or publicly-released development data only. The Synthea project (Apache 2.0, synthetichealth.github.io/synthea) generates FHIR bundles of fictional patients. Alternatively, adapt your own datasets using an internally certified anonymisation procedure.
mkdir -p ~/fhir-lab/input ~/fhir-lab/reports
# Place only test or synthetic JSON here:
cp /path/to/synthetic-bundle.json ~/fhir-lab/input/
# Read-only for safety:
chmod 444 ~/fhir-lab/input/*.json
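If no Synthea output is at hand yet, a hand-written toy bundle is enough to smoke-test the setup. The sketch below writes a deliberately skeletal bundle and makes it read-only, mirroring the chmod 444 step; the function name, ids and file name are invented for the example, and the resources are not checked against any FHIR profile.

```python
import json
import stat
import tempfile
from pathlib import Path


def write_toy_bundle(target: Path) -> Path:
    """Write a skeletal, entirely fictional bundle and mark the file read-only."""
    bundle = {
        "resourceType": "Bundle",
        "type": "collection",
        "entry": [
            {"resource": {"resourceType": "Patient", "id": "toy-patient-1"}},
            {"resource": {"resourceType": "Observation", "id": "toy-obs-1"}},
            {"resource": {"resourceType": "Observation", "id": "toy-obs-2"}},
        ],
    }
    target.write_text(json.dumps(bundle, indent=2), encoding="utf-8")
    target.chmod(0o444)  # same effect as chmod 444 on the command line
    return target


# Demonstration in a throwaway directory, so nothing under ~/fhir-lab is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = write_toy_bundle(Path(tmp) / "toy-bundle.json")
    demo_mode = stat.S_IMODE(path.stat().st_mode)
    demo_entries = len(json.loads(path.read_text(encoding="utf-8"))["entry"])
    print(path.name, oct(demo_mode), demo_entries)
```

To use it for real, call write_toy_bundle with a path inside ~/fhir-lab/input instead of the temporary directory.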
2. Minimal read-only MCP server in Python
The MCP server below exposes only two tools: list_bundles, which lists the JSON files in the input directory, and bundle_metadata, which returns resource-type counts for a bundle and no clinical content. It parses each file in order to count entries, but it never extracts or returns demographic fields.
# ~/fhir-lab/fhir_mcp_server.py
import json
import os
from pathlib import Path

from mcp.server.fastmcp import FastMCP  # import path depends on the SDK version

# Required: a missing variable raises KeyError, so there is no risky default.
BASE = Path(os.environ["FHIR_LAB_DIR"]).resolve()
if not BASE.is_dir():
    raise SystemExit("FHIR_LAB_DIR invalid")
# Resolved once, so the traversal check compares like with like.
INPUT_DIR = (BASE / "input").resolve()

app = FastMCP("fhir-lab-readonly")


@app.tool()
def list_bundles() -> list[str]:
    """List .json files in the input directory."""
    return sorted(p.name for p in INPUT_DIR.glob("*.json"))


@app.tool()
def bundle_metadata(filename: str) -> dict:
    """Aggregated resourceType counts in a bundle. No clinical content."""
    p = (INPUT_DIR / filename).resolve()
    # Reject anything that resolves outside INPUT_DIR ("..", absolute paths, symlinks).
    if INPUT_DIR not in p.parents:
        raise ValueError("path traversal")
    bundle = json.loads(p.read_text(encoding="utf-8"))
    counts = {}
    for entry in bundle.get("entry", []):
        rt = entry.get("resource", {}).get("resourceType", "Unknown")
        counts[rt] = counts.get(rt, 0) + 1
    return {"filename": filename, "totalEntries": sum(counts.values()), "byType": counts}


if __name__ == "__main__":
    app.run()
Three elementary security properties:
- BASE required via env: no hardcoded paths, no fallback.
- Path-traversal check: filename cannot escape the declared directory.
- Read-only functions: no mutating operation exposed to the model.
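The path-traversal property can be exercised in isolation, without the MCP SDK installed. The sketch below reproduces the same resolve-then-compare idea in a hypothetical helper, safe_input_path, and tries a few hostile file names against a throwaway directory; the names tested are examples, not an exhaustive list.

```python
import tempfile
from pathlib import Path


def safe_input_path(input_dir: Path, filename: str) -> Path:
    """Resolve filename under input_dir, refusing anything that lands outside it."""
    root = input_dir.resolve()
    candidate = (root / filename).resolve()
    if root not in candidate.parents:
        raise ValueError("path traversal")
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    input_dir = Path(tmp) / "input"
    input_dir.mkdir()
    results = {}
    for name in ["bundle.json", "../secret.json", "/etc/passwd", "sub/../../x.json", ""]:
        try:
            safe_input_path(input_dir, name)
            results[name] = "accepted"
        except ValueError:
            results[name] = "rejected"

for name, outcome in results.items():
    print(repr(name), outcome)
```

Both sides of the comparison are resolved first; comparing a resolved candidate against an unresolved root would wrongly reject every file whenever the root contains a symlink or a relative segment.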
Install dependencies in an isolated virtualenv:
python3 -m venv ~/fhir-lab/.venv
source ~/fhir-lab/.venv/bin/activate
pip install "mcp[cli]" # package name depends on the release
3. Configure Goose to use the server and a local model
# Local provider via Ollama:
ollama pull qwen2.5:14b # or another appropriate model
# Goose configuration (the exact command may vary between versions):
goose configure
# Pick: provider = Ollama; model = qwen2.5:14b
Attach the MCP server to the Goose profile (indicative syntax — adapt to the config.yaml format of the installed version):
extensions:
  fhir-lab:
    type: stdio
    command: /home/user/fhir-lab/.venv/bin/python
    args: ["/home/user/fhir-lab/fhir_mcp_server.py"]
    env:
      FHIR_LAB_DIR: "/home/user/fhir-lab"
4. Session with Goose
cd ~/fhir-lab
goose session
Useful prompts in the REPL:
- “List available bundles using the list_bundles tool.”
- “For each bundle, show resourceType counts using bundle_metadata. Summarise in a table.”
- “Which bundles do not contain Observation? Return the list.”
The agent answers by calling the MCP tools, so no clinical data enters the prompt: the model sees only the file names and type counts that the server deliberately exposes.
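The third prompt shows why aggregated metadata is enough for this kind of question. As a sketch, assuming a list of dictionaries in the shape returned by bundle_metadata (the file names and counts below are invented), the answer can be derived without opening any bundle:

```python
def bundles_missing(metadata: list[dict], resource_type: str) -> list[str]:
    """Return the file names whose byType counts lack the given resource type."""
    return sorted(
        m["filename"] for m in metadata if m["byType"].get(resource_type, 0) == 0
    )


# Invented sample in the same shape as the bundle_metadata return value.
sample_metadata = [
    {"filename": "a.json", "totalEntries": 3, "byType": {"Patient": 1, "Observation": 2}},
    {"filename": "b.json", "totalEntries": 2, "byType": {"Patient": 1, "Encounter": 1}},
]
print(bundles_missing(sample_metadata, "Observation"))
```

This is roughly the reasoning the agent has to perform over the tool results; running the same logic by hand is a cheap way to cross-check its answer.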
Limits and caveats
- Anonymisation ≠ security: an “anonymised” dataset can still carry PHI via indirect re-identification. Use documented procedures (pseudonymisation, generalisation, k-anonymity or equivalents) before considering it usable in an AI trial.
- The MCP server is part of your trusted computing base (TCB): a bug that bypasses the path-traversal check, or a mutating tool added for convenience, opens an attack surface on sensitive data. Treat the server code as you would treat code that writes to clinical databases.
- A local model is not “risk-free”: traces of Goose sessions (user home, shell history, swap) can still contain queried metadata. Protect the workstation as you would a clinical client.
- Not medical software: even seemingly harmless examples (type counts) are not clinically validated. Any diagnostic or care-pathway use requires a regulatory route (MDR, ISO 13485, IEC 62304) outside the scope of this tutorial.
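To illustrate the k-anonymity criterion mentioned in the first caveat, the sketch below computes k for a table of records and a chosen set of quasi-identifiers: k is the size of the smallest group of records sharing the same quasi-identifier values. The field names and records are invented, and a low k is only one indicator of re-identification risk, not a complete assessment.

```python
from collections import Counter


def k_anonymity(records: list[dict], quasi_identifiers: list[str]) -> int:
    """Size of the smallest group sharing the same quasi-identifier values (0 if empty)."""
    groups = Counter(tuple(r.get(q) for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


# Invented, already generalised records (birth year and postcode prefix only).
records = [
    {"birth_year": 1980, "postcode_prefix": "201", "sex": "F"},
    {"birth_year": 1980, "postcode_prefix": "201", "sex": "F"},
    {"birth_year": 1975, "postcode_prefix": "201", "sex": "M"},
]
k_all = k_anonymity(records, ["birth_year", "postcode_prefix", "sex"])
k_postcode = k_anonymity(records, ["postcode_prefix"])
print(k_all, k_postcode)
```

Here the third record is unique on the full set of quasi-identifiers, so the dataset is only 1-anonymous until that combination is generalised further or suppressed.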
Links: block.github.io/goose, modelcontextprotocol.io, synthetichealth.github.io/synthea
