AI Agents in
Statistical Genetics Research

A practical guide for the lab

Yuxuan Wang  ·  Broad Institute  ·  May 2026
slides: yuxuanwang.org/musings.html

Does this actually save time?

Typical day in our lab

  • GWAS QC scripts, variant filters, HPC submissions
  • ggplot2 tweaks, file format conversions, debugging
  • Writing the same SLURM header for the 50th time
These tasks are mechanical. Hand the mechanical parts to AI → you focus on the science: choosing models, interpreting results, designing the next experiment.

LLM ≠ Agent

The distinction that matters

LLM
text in → text out
stateless, no tools
+
🔧 Tools (read files, run code)
🧠 Memory (project context)
🔁 Action loop (plan → act → observe)
📡 Orchestration (multi-agent)
=
Agent
GitHub Copilot
in VS Code
Copilot is the agent wrapper. The underlying model — Claude Sonnet, GPT-4o, Gemini — is selected automatically or by you. Cost = tokens (a model property). The agent wrapper is what can act on your code.

GitHub Copilot: one tool, many models

Before AI: RStudio. Now: VS Code + Copilot — the only setup compliant at both Broad and MGB.

Copilot capability      | What it gives you                      | Models available
Inline completion       | Tab-complete as you type               | GPT-4o, Claude Haiku
Copilot Chat            | Multi-turn conversation, file-aware    | Claude Sonnet, GPT-4o
Autopilot / Plan agent  | Multi-step agentic tasks end-to-end    | Claude Opus, GPT-5.5
Local models            | Private data, zero API cost            | Qwen, custom endpoints
MCP integrations        | PubMed, gnomAD, external tools inline  | Any model
Only tool approved at both Broad and MGB — covered by Microsoft enterprise agreement.

GitHub Copilot in VS Code

More capable than you might think

  • Autopilot mode: runs multi-step tasks without constant approval — configure "approve all" for uninterrupted sessions
  • Auto model: self-selects Claude, Codex, or Gemini based on task complexity; claims ~10% cost reduction
  • Local models: connect Qwen or a custom endpoint — OSS models are competitive for routine tasks
⚠️ June 1, 2026: AI Credits replace premium requests — monthly plans auto-migrate; annual enterprise plans stay on PRU billing until renewal. Check with your institution's IT team.

The single most impactful thing

Write a .github/copilot-instructions.md for your project

Without it

  • Agent uses docker → your cluster uses podman
  • Writes SLURM headers → you use UGER
  • Uses gene symbols → you need ENSG IDs
  • Re-explains setup every session

With it

  • Agent knows your scheduler, runtime, data format
  • Writes correct code from prompt 1
  • Persistent across sessions
  • 20 min to write → saves hours
Same idea for other agents: CLAUDE.md (Claude Code)  ·  AGENTS.md (generic)

A real example

From an active project — works in .github/copilot-instructions.md

## Scientific Focus
- Prioritize statistical + biological correctness over style

## HPC Environment (UGER / Grid Engine)
- Scheduler: qsub / qrsh
- GPU jobs require: -l gpu=1  -l os=RedHat8  -hard
- Container runtime: podman (not docker)

## Data Conventions
- Always use ENSG IDs (not gene symbols)

## Code Standards
- R primary; use data.table and apply over loops
- Commit messages ≤ 5 words

Now every generated job script targets UGER, uses podman, and references ENSG IDs — without you saying it each time.
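Concretely, a job script generated under these instructions might look like the following hypothetical sketch (the job name, image, paths, and script names are made up; the scheduler directives mirror the example instructions above):

```bash
#!/bin/bash
# Hypothetical UGER job script following the project instructions
#$ -N gwas_qc
#$ -l gpu=1
#$ -l os=RedHat8
#$ -hard
#$ -cwd
#$ -j y
#$ -o logs/$JOB_NAME.$JOB_ID.log

# podman, not docker, per the container-runtime convention
podman run --rm -v "$PWD:/work" r-genetics:latest \
    Rscript /work/scripts/qc_filter.R --gene-ids ensg
```

Submit with `qsub job.sh`. The point is that the scheduler flags and runtime come from the instructions file, not from each prompt.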

The workflow that works

Using GitHub Copilot in VS Code

1. Explore: find what already exists. Don't reinvent the wheel.
2. Plan: design before coding. Review the plan; catch bad assumptions early.
3. Implement: specific prompts. The agent handles the boilerplate.
4. Commit: review the diff. Clean commit message. No leaked paths.
Use Copilot's Plan agent to enforce this — it describes every step and waits for your sign-off before touching anything.

Approval settings

Define where the agent acts freely — and where it stops and asks

chat.permissions.default

  • default — pauses before terminal commands & file edits ← start here
  • autopilot — auto-approves all, runs to completion
  • autoApprove — bypasses everything, avoid in research

Always require approval

  • Any deletion (rm, overwrite)
  • Git push / force-push
  • Writes to protected data dirs
  • HPC job submission — misfired array = expensive
  • Any external network call
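In VS Code these policies live in settings.json. A minimal hypothetical sketch (the key name comes from this deck; verify it against your Copilot version's documentation before relying on it):

```jsonc
// .vscode/settings.json — hypothetical sketch, key name per this deck
{
  "chat.permissions.default": "default"  // pause before terminal commands & file edits
}
```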

Git worktrees

Multiple branches checked out simultaneously

The problem

  • Cluster job running on main
  • You want to develop a new feature in parallel
  • git stash / branch switching loses your context

The solution

# check out a second branch
# without touching your current tree
git worktree add ../project-dev dev-branch

# now you have two directories:
# /project       → main (cluster runs here)
# /project-dev   → dev  (you code here)
Copilot respects worktree boundaries — each directory gets its own agent context. Submit jobs from main, develop in dev-branch, no interference.
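Housekeeping for the setup above uses standard git subcommands (nothing hypothetical here):

```shell
# list all worktrees attached to this repository
git worktree list

# when the feature is merged, remove the extra checkout
git worktree remove ../project-dev

# prune stale metadata if a worktree directory was deleted manually
git worktree prune
```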

Specificity is everything

Prompts that get good statistical code

❌  "Run a burden test"
✓  "Run STAAR-O with MAF < 0.01, adjusting for age, sex, and the first 10 PCs. Loop over all annotated protein-coding genes on chr22. Use ENSG IDs. Read variants from /data/topmed/wgs_chr22.pgen. Output to results/chr22_staar.rds."
  • Include: method, covariates, thresholds, input paths, output format
  • The agent can implement any method you name — you choose the method

Right model for the task

Copilot Auto mode handles this — or override manually

Task                                         | Model tier                  | Why
Exploration, planning, statistical reasoning | High — Claude Opus 4.7      | Best reasoning, worth the cost
Implementation, refactoring                  | Medium — Claude Sonnet 4.6  | Fast, sufficient for coding
Boilerplate, format conversions              | Low — GPT-4o / Haiku        | Instant, nearly free
Copilot Auto mode picks the right tier automatically for each request. To override: click the model selector in the Copilot Chat panel.

Sensitive genomic data

The cardinal rule: never let the agent read your data files

❌  "Here's my phenotype file,
write a REGENIE script"
(data leaves your environment)
✓  You run head -3 pheno.txt
✓  You paste: column names, format, paths
✓  Agent writes the code
✓  You execute on the cluster
Genetic / health data → DUA restrictions → IRB requirements. The metadata (column names, file structure) is usually fine. The data itself is not.
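The safe pattern is to share structure, never rows. For example (pheno.txt is a stand-in name; the second command assumes a tab-delimited header):

```shell
# inspect locally; paste ONLY the header line into the chat
head -1 pheno.txt

# or list column names one per line for easier pasting
head -1 pheno.txt | tr '\t' '\n'
```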

Copilot cost: heavier models cost more

Billing transitions to AI Credits on June 1, 2026 — the principle stays the same

Tier                   | Cost weight | Examples                              | Use for
Included / Lightweight | Free / low  | GPT-4o, GPT-5 mini, Claude Haiku 4.5  | Everyday coding, QC scripts, plots
Premium                | Medium      | Claude Sonnet 4.6, Gemini 2.5 Pro     | Planning, architecture, tricky bugs
High-tier / Agentic    | High        | Claude Opus 4.7, GPT-5.5              | Complex multi-step pipelines only
⚠️ June 1, 2026: premium requests → GitHub AI Credits (token-based). Pro: 1,000 credits/mo · Pro+: 3,900 credits/mo · Enterprise: $19/user in credits. Annual plan users stay on PRU billing until renewal.
  • Default to Included / Auto model — sufficient for most research coding
  • Keep context lean — every turn draws from your allowance
  • Check: GitHub Settings → Billing → Copilot

Honest assessment

✓ Good at
  • Syntactically correct R / Python / bash
  • Implementing a method you specify
  • Boilerplate (UGER scripts, argument parsers)
  • Refactoring messy scripts
  • Explaining error messages
✗ Not good at
  • Choosing the right statistical method
  • Knowing your cluster without being told
  • Validating scientific results
  • Catching batch artifacts
  • Replacing domain expertise
Always read every line of generated statistical code. A wrong sign or dropped covariate produces wrong results silently.

Three things to do this week

  1. Enable Copilot in VS Code — check with your institution's IT team; many research institutes cover it under an enterprise license
  2. Write a .github/copilot-instructions.md for your project — scheduler, container runtime, data format, package preferences
  3. Try the workflow on one task you've been putting off — Explore → Plan → Implement → Commit

OpenClaw

A different paradigm: local-first, always-on

  • Runs on your own device (not cloud-hosted)
  • Answers via WhatsApp, Slack, Discord, voice
  • Multi-model: Claude, GPT-4o, local Qwen
  • Skills registry (like Claude Code)
When this beats Claude Code:
Monitoring a cluster job while away from your desk.
Quick questions via Slack.
Hands-free via voice on mobile.

github.com/openclaw/openclaw

MCP: agents that fetch for you

Model Context Protocol — open standard for agent ↔ tool connections

Without MCP

  • Open browser → search PubMed → copy abstract → paste into chat
  • Go to gnomAD → look up allele frequency → switch back
  • Context switch every few minutes

With MCP

  • Agent searches PubMed inline
  • Agent queries gnomAD directly
  • You stay in the coding session
Relevant servers for statistical genetics:
PubMed  ·  bioRxiv  ·  gnomAD  ·  ClinicalTrials.gov  ·  ChEMBL
VS Code Copilot: MCP Servers panel  ·  Registry: registry.mcphub.io
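In VS Code, MCP servers are declared per-project in a small JSON file. A hypothetical sketch (the server package names below are placeholders, not real packages; check the registry for actual servers):

```jsonc
// .vscode/mcp.json — hypothetical sketch; package names are placeholders
{
  "servers": {
    "pubmed": {
      "command": "uvx",
      "args": ["pubmed-mcp-server"]   // placeholder package name
    },
    "gnomad": {
      "command": "npx",
      "args": ["-y", "gnomad-mcp"]    // placeholder package name
    }
  }
}
```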

Questions?

Blog post & slides:
yuxuanwang.org/musings.html

github.com/openclaw/openclaw
docs.github.com/copilot · AI Credits transition (June 1, 2026)