# BRC Annual Meeting Talk — Outline (BRC-Analytics-tuned, 30+15 format)

## Context

**Event:** BRC Annual Meeting, Day 2
**Date:** Wednesday, May 6, 2026
**Location:** 308 Life Sciences Building, Penn State University (hybrid)
**Speaker:** Steven Weaver (Temple University)
**Title:** _From Browser to Autonomous Agents: WebAssembly-Powered Evolutionary Analysis for BRC.Analytics_ (tentative)
**Length:** **45-min slot** = ~30 min spoken + ~15 min Q&A. The long Q&A is a feature, not a buffer — this audience will engage. The talk has to _open the conversation_, not exhaust it.
**Audience mix:** NIAID program officers (Wiriya Rutvisuttinunt, Liliana Brown), BRC Analytics team, BV-BRC and PDN representatives, DMID leadership. Mix of program/policy and technical.

This is the BRC Analytics flagship technical/AI message of the meeting. The talk has to land three things: (1) the WASM-in-browser engineering shift, (2) the agentic science direction (NIAID + UX working group priority), and (3) Datamonkey 3 as a concrete instance of the BRC.Analytics thesis — community-owned tools, infrastructure-flexible execution, agent-accessible APIs, no gatekeeping.

A 15-min Q&A means you should leave 2–3 deliberate "hooks" in the talk — concrete claims or roadmap items that invite questions — rather than tying every loose end on stage.

---

## Narrative arc (memorize this, not the slides)

> Two-thirds of the world's pathogen researchers don't have a SLURM cluster. The other third has one but waits in queue behind everyone else. Datamonkey 3 dissolves that asymmetry: HyPhy compiled to WebAssembly runs the same analysis on a Chromebook in Lagos and a workstation at Penn State, with a single toggle to burst to TACC when the dataset gets large. The same engine is also exposed via MCP, so an LLM agent — or a Galaxy workflow, or a BV-BRC handoff — drives the analysis without a human in the loop. This is what BRC.Analytics looks like at the method level: community-owned tools, infrastructure-flexible execution, agent-accessible APIs, no gatekeeping. Today I'll show both halves running live, walk through a worked spike-protein case study, and outline where this plugs into BV-BRC, PDN, and Galaxy.

**Three numbers to anchor:** 13 methods · ~2.3× single-thread native overhead · 0 bytes leave the browser by default.

**One framing line for the program officers:** _"Every analytical capability we put in a browser is one less reason a researcher has to wait, pay, or share data they shouldn't have to share."_

**Hooks to leave for Q&A** (don't fully resolve these on stage):

- How does this scale to _very_ large alignments (10k+ sequences)?
- What's the security/sandbox story for running an agent that calls scientific tools?
- How does PDN/BV-BRC ingestion actually work end-to-end?
- What about reproducibility and provenance when an agent makes the analysis choices?

---

## Time budget (target — 30 min spoken)

| Act                 | Slides         | Time     | Purpose                                                                       |
| ------------------- | -------------- | -------- | ----------------------------------------------------------------------------- |
| 1. Setup            | 1–4            | ~6 min   | Frame analytical inequality, position Datamonkey 3 as BRC.Analytics shape     |
| 2. Browser engine   | 5–9 + Demo 1   | ~10 min  | Architecture, methods, benchmarks, live in-browser run                        |
| 3. Agentic pivot    | 10–12 + Demo 2 | ~8.5 min | MCP as the GA4GH-for-agents analog, agent walkthrough, the no-promise insight |
| 4. Case study + BRC | 13–15          | ~4 min   | Real result, BRC handshake diagram, NIAID alignment                           |
| 5. Roadmap + close  | 16–18          | ~2 min   | Concrete FY2027 deliverables, try it, thanks                                  |
| Q&A                 | —              | ~15 min  | In-slot — see Q&A prep below                                                  |

**Hard rule:** the extra 5 spoken minutes vs. the 25-min version go into Acts 2 and 4 (more time on the architecture explanation and a fuller case study) — _not_ into more slides. Prefer fewer, deeper slides to more, shallower ones.

---

## Slide-by-slide outline (18 slides + 2 demo segments)

### ACT 1 — Setup (~6 min)

#### Slide 1 — Title (~30s)

- Title, name, Temple (Pond Lab), BRC Analytics
- Co-credits in small text: Sergei Pond (PI), Anton Nekrutenko, BRC Analytics team
- Subtitle previews the arc: _"Putting HyPhy in every browser — and every agent."_

#### Slide 2 — Why this matters: analytical inequality (~2 min)

- Pathogen evolution analysis is decision-relevant infrastructure: drug resistance, immune escape, host adaptation, recombination
- But the _people_ who most need it — public-health labs in low-resource settings, clinical labs with embargoed data, undergraduate classrooms — are the least likely to have an HPC allocation
- The BRC.Analytics proposal calls this "analytical inequality." Datamonkey 3 is a concrete attempt to dissolve one corner of it
- Outbreak speed is the consequence, not the goal — every hour between sequence and interpretation is a decision a clinician or epidemiologist can't make yet (callback on Slide 15)
- _With the extra time:_ one concrete vignette — pick a real example. SARS-CoV-2 sequencing from Durban running MEME in 2020, or HIV alignments from a low-resource setting using FUBAR. The narration script has good candidates. Make the inequality concrete with one face, one place.
- _Speaker note:_ This is the slide that gets program officers to lean in. Don't rush past "analytical inequality" — name it, then ground it in a real example.

#### Slide 3 — HyPhy + Datamonkey: 15 years of selection analysis (~90s)

- Brief retrospective on HyPhy as the standard for codon-level selection inference
- Datamonkey 1.0/2.0 served the community for 15+ years — over a million jobs run (per the narration script)
- But the architecture was server-bound: upload → queue → wait → email → download
- Cite the Datamonkey 2.0 paper context (`Datamonkey_2_0/Datamonkey-2.0.tex`) — this isn't a teardown, it's a generational handoff
- _Speaker note:_ Acknowledge the legacy work warmly; you're standing on it.

#### Slide 4 — Datamonkey 3: the reset (~2 min)

- One image: a laptop browser window running a HyPhy analysis with no network panel activity
- Live at **v3.datamonkey.org** (currently 0.1.0-beta.27)
- Three principles, big text:
  1. **Browser-native** — compute runs where the user is
  2. **Privacy by default** — sequences never leave the device
  3. **Agent-ready** — same engine exposed as MCP tools
- One line under the three principles: _"Same shape as BRC.Analytics: community-owned tool, infrastructure-flexible execution, GUI + API as equal citizens."_
- _With the extra time:_ spend 30 sec on what changed structurally between v2 and v3 — three-tab workflow (Data, Analyze, Results), persistent IndexedDB results, no login. This is the slide where the reset becomes concrete.
- This is the slide that the rest of the talk justifies.

---

### ACT 2 — The browser engine (~10 min)

#### Slide 5 — Architecture at a glance (~2 min)

- Diagram: **three** parallel paths into a single HyPhy 2.5.94 engine
  - `Browser UI → WASM (Aioli) → HyPhy`
  - `MCP Clients → Socket.IO backend → HyPhy`
  - `Galaxy tool wrapper → Conda/BioContainers → HyPhy`
- Same code, three front doors. Critical for what comes later in Acts 3 and 4.
- This is the BRC.Analytics distribution model from §3.2.2 of the proposal (BioConda → BioContainers → ToolShed → Galaxy) at the method level.
- Frontend: SvelteKit. WASM loader: `@biowasm/aioli`. Backend: Node + Socket.IO.
- _With the extra time:_ explicitly walk through one method's path through all three doors — e.g., FEL — so the room sees that "same engine" isn't marketing.

#### Slide 6 — How HyPhy ends up in your browser (~2.5 min)

- The compilation pipeline (a minimal code sketch follows this slide's notes):
  1. HyPhy 2.5.94 C++ codebase → **Emscripten** → WebAssembly + JS shim
  2. **Aioli** (BioWASM) loads the WASM, mounts a virtual POSIX filesystem at `/shared/data/`
  3. SvelteKit app writes user inputs (alignment, tree) into the virtual FS
  4. HyPhy runs on the main thread; results are read back as JSON
- Why this works: Emscripten's POSIX shim is good enough that HyPhy needed minimal source changes
- Trade-offs we accepted: no SIMD, no OpenMP threads (yet), no native IO speed
- _With the extra time:_ a sidebar on the slide or a speaker note on what didn't work the first time — a real engineering anecdote about porting HyPhy to WASM lands well with the BioWASM-curious half of the room and signals technical depth without showing off.
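If a concrete snippet helps in the speaker notes for this slide, here is a minimal sketch of steps 2–4, assuming a custom-built HyPhy module registered with Aioli. The module id, file names, and CLI flags are illustrative placeholders, not the repo's actual code.

```javascript
// Minimal sketch: load a HyPhy WASM build via Aioli, stage inputs on the
// virtual filesystem, run FEL, and read the JSON result back.
// The "hyphy/2.5.94" module id, file names, and CLI flags are illustrative.
import Aioli from "@biowasm/aioli";

async function runFelInBrowser(alignmentFasta, treeNewick) {
  // Load the Emscripten-compiled HyPhy module (assumed registered with Aioli).
  const CLI = await new Aioli(["hyphy/2.5.94"]);

  // Write the user's inputs into the virtual POSIX filesystem at /shared/data/.
  await CLI.mount([
    { name: "alignment.fasta", data: alignmentFasta },
    { name: "tree.nwk", data: treeNewick },
  ]);

  // Run the analysis entirely client-side; no sequence data touches the network.
  await CLI.exec(
    "hyphy fel --alignment /shared/data/alignment.fasta " +
      "--tree /shared/data/tree.nwk --output /shared/data/fel.json"
  );

  // Read the JSON results back out of the virtual filesystem.
  return JSON.parse(await CLI.cat("/shared/data/fel.json"));
}
```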

#### Slide 7 — What you can run today: 13 methods (~90s)

- A clean grid of all 13: FEL, MEME, SLAC, FUBAR, B-STILL, aBSREL, BUSTED, RELAX, Contrast-FEL, GARD, BGM, PRIME, MULTI-HIT
- Group them visually:
  - **Site-level selection:** FEL, MEME, SLAC, FUBAR, B-STILL
  - **Branch-level / lineage:** aBSREL, BUSTED, RELAX, Contrast-FEL
  - **Beyond pointwise selection:** GARD (recombination), BGM (covariation), PRIME (property-dependent), MULTI-HIT
- Note the interactive branch-tagging UI for RELAX / Contrast-FEL (a real UX win over CLI HyPhy)

#### Slide 8 — Performance: honest numbers (~2 min)

- Benchmarks table, with cluster row added to make the dual-execution story land:

  | Dataset  | Seqs | Sites | Native CLI | WASM     | Cluster (SLURM) | Take              |
  | -------- | ---- | ----- | ---------- | -------- | --------------- | ----------------- |
  | bglobin  | 17   | 432   | 33.7 ms    | 88.3 ms  | —               | Browser           |
  | lysozyme | 19   | 390   | 8.3 ms     | 17.0 ms  | —               | Browser           |
  | adh      | 23   | 762   | 62.0 ms    | 148.8 ms | —               | Browser           |
  | camelid  | 212  | 288   | 269.6 ms   | 539.7 ms | —               | Browser           |
  | spike    | ~5k  | 3,822 | —          | slow     | minutes         | Toggle to cluster |

- ~2.3× slower than single-threaded native HyPhy on small/medium data; ~9× vs OpenMP+SIMD native (because WASM has neither, yet)
- Headline: _"Browser handles the long tail of analyses. Cluster handles the head. The user picks with one toggle."_
- This is the dual-execution thesis from the narration — same interface, only the location of the computation changes
- WASM SIMD + threads will close most of the remaining gap on the browser side — already being prototyped
- _With the extra time:_ one extra beat on _why_ WASM is fast enough for the use cases that matter. A 2.3× overhead on a 100ms analysis is invisible to the user; the same overhead on a 10-hour analysis isn't, which is why the toggle exists. **This is a Q&A hook** — leave the cluster details light; let someone ask.

#### Slide 9 — Privacy & accessibility (~75s)

- Sequences never leave the browser by default — and here's what that unlocks:
  - **Pre-publication / embargoed data** — analyzed without crossing an institutional boundary
  - **Clinical / IRB-restricted alignments** — never leave the device
  - **Teaching contexts** — works on a Chromebook in a classroom with no cluster account
  - **Field deployments** — works offline after first load (this is real for outbreak responders)
- IndexedDB persistence, results portable as JSON / shareable ZIP (a minimal persistence sketch follows this slide's notes)
- No accounts, no quotas, no per-user infrastructure
- _Speaker note:_ This is the slide for Wiriya and Liliana. Each bullet is one line, but each bullet maps onto a concrete thing they care about.
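If it helps to make the persistence point concrete during Q&A, a minimal sketch of local result storage in plain IndexedDB follows; the database name, store name, and record shape are illustrative, not the app's actual schema.

```javascript
// Minimal sketch: persist a completed analysis result in IndexedDB so it
// survives reloads with no login, no quota, and no server round-trip.
// Database/store names and record fields are illustrative.
function openResultsDb() {
  return new Promise((resolve, reject) => {
    const request = indexedDB.open("datamonkey-results", 1);
    request.onupgradeneeded = () =>
      request.result.createObjectStore("analyses", { keyPath: "id" });
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

async function saveResult(method, resultJson) {
  const db = await openResultsDb();
  const record = {
    id: crypto.randomUUID(),
    method,                  // e.g. "FEL"
    completedAt: Date.now(),
    result: resultJson,      // the HyPhy JSON output, kept on-device
  };
  await new Promise((resolve, reject) => {
    const tx = db.transaction("analyses", "readwrite");
    tx.objectStore("analyses").put(record);
    tx.oncomplete = resolve;
    tx.onerror = () => reject(tx.error);
  });
  return record.id;
}
```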

#### **DEMO 1 — In-browser analysis (~3 min)**

- **Goal:** show the full Data → Analyze → Results loop on something a BRC user would actually do
- **Suggested script:**
  1. Open `v3.datamonkey.org` (already loaded in another tab as backup)
  2. Drop in a small SARS-CoV-2 spike codon alignment + tree (have it pre-staged)
  3. Pick FEL or aBSREL; show the parameter UI briefly
  4. Run the analysis — narrate the progress bar; this is a believability moment
  5. Open hyphy-eye visualization in a new tab; point at sites under positive selection
- **Backup:** pre-recorded screen capture (the `video-frames/` assets already exist from manuscript work — reuse them)
- **Wi-Fi insurance:** because compute is local, only the _initial app load_ needs network. Once it's loaded, you're safe even if the room Wi-Fi dies mid-talk. Make this point on stage — it's a great moment.

---

### ACT 3 — From browser to agent (~8.5 min)

#### Slide 10 — Pivot: are you a website or a tool? (~75s)

- "Modern science is scientist + AI assistant. The question for a tool builder is: are you a website, or are you a callable tool?"
- Datamonkey 3 is **both** — same engine, three front doors (callback to Slide 5)
- The website is the on-ramp. The MCP and Galaxy surfaces are what make it scale across the BRC ecosystem.

#### Slide 11 — MCP architecture (~2.5 min)

- One-screen diagram of the MCP surface:
  - MCP client (Claude / Claude Code / Cursor / custom agent)
  - → MCP transport
  - → Socket.IO bridge
  - → HyPhy engine (13+ tool-callable methods)
- Each method registers as an MCP tool with: name, description, parameter schema (alignment, tree, genetic code, branch sets, method-specific options), result schema (JSON); a registration sketch follows this slide's notes
- Reference: `PRIME-backend-implementation.md` — the per-method socket protocol that MCP wraps (`prime:spawn`, `prime:check`, `prime:resubscribe`, etc.)
- Same protocol whether the agent is talking to the WASM build or the backend — the agent doesn't need to care
- **Frame for this room (1):** _"GA4GH gives us standard APIs for systems to talk to BRC services. MCP gives us a standard API for agents to talk to BRC tools. Same principle, different consumer."_
- **Frame for this room (2) — the unexpected freedom:** A normal API is a _promise_ to developers. They write code against it, deploy it, walk away. Break the API, you break their code, and someone gets woken up at 2am. So we evolve APIs slowly, deprecate carefully, and accumulate years of backwards-compatibility debt. **MCP is different.** The consumer is an LLM that re-reads the tool spec on every call. There's no committed code on the other end — there's an agent that figures it out fresh each time. So we haven't made any promises, and we don't have to keep any. **In practice this means our MCP server already exposes capabilities our REST API doesn't have yet** — because there's no compatibility cost to shipping them. We can change the MCP surface tomorrow and the agent will adapt. That's a different relationship between tool and consumer, and it changes how fast we can move.
- _Speaker note:_ This is the line that gets the engineers in the room nodding and the program officers leaning in for different reasons. Engineers recognize the deprecation tax they pay every day; program officers hear "we can iterate faster on the agentic surface than on the traditional one." Land both beats, then move on — don't over-explain.
- _Speaker note 2:_ The GA4GH framing aligns Datamonkey 3 with the BRC.Analytics interoperability roadmap. The promise/no-promise framing explains _why the agentic surface is moving faster than the API surface_ — which sets up Slide 16's roadmap honestly.
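A minimal sketch of per-method tool registration, using the MCP SDK for Node plus a Socket.IO bridge in the shape of the per-method protocol above; the tool name, parameter schema, port, and `fel:*` event names are assumptions for illustration, not the actual Datamonkey 3 server code.

```javascript
// Minimal sketch: expose one HyPhy method (FEL) as an MCP tool.
// The tool name, parameter shape, backend URL, and fel:* event names are
// illustrative; only the overall pattern (tool schema -> socket bridge ->
// JSON result) mirrors the architecture described on this slide.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import { io } from "socket.io-client";

const socket = io("http://localhost:7015"); // HyPhy backend bridge (illustrative URL)
const server = new McpServer({ name: "datamonkey", version: "0.1.0" });

server.tool(
  "run_fel",
  "Run FEL (site-level dN/dS selection) on a codon alignment and tree",
  {
    alignment: z.string().describe("Codon alignment, FASTA"),
    tree: z.string().describe("Newick tree"),
    geneticCode: z.string().default("Universal"),
    pValue: z.number().default(0.1),
  },
  async (params) => {
    // Forward to the per-method socket protocol and wait for completion.
    const result = await new Promise((resolve, reject) => {
      socket.emit("fel:spawn", params);
      socket.once("fel:done", resolve);
      socket.once("fel:error", reject);
    });
    // Return the HyPhy JSON so the agent can interpret it or chain analyses.
    return { content: [{ type: "text", text: JSON.stringify(result) }] };
  }
);

await server.connect(new StdioServerTransport());
```

The same handler body could call the WASM build instead of the socket bridge, which is the "the agent doesn't need to care" point in the bullets above.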

#### Slide 12 — What an agent can actually do (~75s)

- Capabilities, framed as verbs an agent can take autonomously:
  - **Pick** the right method for a hypothesis (FEL for site-level dN/dS, GARD for recombination, RELAX for shifts)
  - **Configure** parameters (genetic code, branch tagging from a tree, p-value thresholds)
  - **Spawn** the analysis and **monitor** progress
  - **Interpret** the JSON result (sites, branches, statistical significance)
  - **Chain** follow-on analyses based on what it found
- Cite the MCP-aware client list explicitly: Claude, Claude Code, Cursor, "or your own agent"
- **Q&A hook:** don't preempt the obvious questions about reproducibility, provenance, or "what if the agent picks the wrong method?" — let those come up.

#### **DEMO 2 — Agent-driven analysis (~3–4 min, recorded with live tail)**

- **Goal:** show an LLM autonomously picking and running an analysis — this is the slide that earns the talk title
- **Recommended format for this audience:** pre-recorded clean run, then a brief live tail to prove it isn't faked. The BRC Analytics room contains people who have personally watched LLM demos go sideways at conferences, and "agent picks the wrong method on stage" lands worse here than it would in front of a general audience.
- **Suggested script (same for recording and live):**
  1. Open Claude (desktop app or Claude Code) with the Datamonkey MCP server already configured
  2. Prompt: _"I have a SARS-CoV-2 spike alignment from variants of concern. Identify sites under positive selection, then check whether recombination is plausible in this dataset."_
  3. Watch the agent: list available tools → pick FEL → call it → read the JSON → call GARD → summarize findings in plain language
  4. Highlight: the agent chained two analyses _without_ you specifying the second one
- **Live tail (~30s):** ask the agent one follow-up question on stage — "now run RELAX comparing Omicron vs. pre-Omicron branches" — to demonstrate it's responding in real time
- **Risk callout:** if the live tail stalls or picks a wrong method, narrate it — that's an honest moment that the program officers will appreciate. Don't pretend it's deterministic.

---

### ACT 4 — Case study + BRC integration (~4 min)

#### Slide 13 — Worked example: SARS-CoV-2 spike (~2 min)

- Pick one real result you've already produced (likely from manuscript figures — `scripts/compose-figure5.sh` output is a candidate)
- Show: alignment summary, FEL or MEME positive-selection sites overlaid on the spike structure or RBD region, brief biological interpretation (e.g., RBD residues under positive selection consistent with known immune-escape sites)
- One sentence on time-to-result: _"From sequence to interpretation: under [X] minutes, no server, no login, agent-driven."_
- _With the extra time:_ this slide now has room to be a real case study, not a blink-and-miss-it figure. Walk through the biological question, what method you picked and why, what the result actually says, and how an agent could have made the same chain of decisions. This is the slide where the room sees the science, not just the engineering. Land it.
- _Speaker note:_ If this was the cut-first slide in the 25-min version, in the 30-min version it's load-bearing. Don't cut it.

#### Slide 14 — BRC interoperability: the handshake diagram (~75s)

- One diagram showing four flows, each marked **prototype** or **roadmap**:
  1. **BV-BRC → Datamonkey 3** _(roadmap):_ user pulls a curated alignment from BV-BRC; Datamonkey 3 reads it directly via API; analysis runs in browser
  2. **PDN → Datamonkey 3** _(roadmap):_ PDN-curated phylogeny becomes a first-class input; agent can request "the latest PDN tree for this lineage" and run RELAX or Contrast-FEL on it
  3. **Galaxy → Datamonkey 3** _(prototype):_ HyPhy methods registered as Galaxy tool wrappers; workflow chaining via Galaxy; cluster execution via Pulsar to TACC
  4. **Datamonkey 3 → anything** _(prototype):_ results as JSON, agent-readable, re-ingestible by Galaxy, BV-BRC dashboards, ObservableHQ notebooks
- The shared interface is MCP for agents and standard APIs for systems — a stable contract any BRC component can target without bespoke integration
- _Speaker note:_ Honesty about what's prototype vs. roadmap matters here. Anton and Sergei will be in the room. Don't oversell. **Q&A hook:** the BV-BRC and PDN reps may want to drill into ingestion details — leave room for it.

#### Slide 15 — What BRC.Analytics gets from this (~75s)

- **Reduces analytical inequality:** browser-native methods work without a cluster account — directly answers §1.1.3 of the proposal
- **Closes the AI-agent gap:** HyPhy methods are now agent-callable, which Galaxy and BV-BRC can leverage too
- **Strengthens emergency posture:** during a surge, queue-free in-browser analysis + agent automation is a parallel pipeline that doesn't compete with the cluster — addresses Key Element 4 of the RFA
- **Aligns with the FY2027 BRC.Analytics roadmap** — callback to Wednesday morning's roadmap segment
- _Speaker note:_ This slide is for the program officers, but it's also the slide that the rest of the BRC team will quote later. Make it quotable.

---

### ACT 5 — Close (~2 min)

#### Slide 16 — Roadmap: concrete FY2027 deliverables (~60s)

- **BV-BRC alignment ingestion via MCP** — agent can request "give me the spike alignment from BV-BRC for variant X" and run selection analysis without a manual download. _MCP first, REST API to follow once the shape stabilizes — the no-promise principle from Slide 11 in action._
- **Galaxy tool wrappers parity** — all 13 WASM-backed methods also available as Galaxy tools by Q2 FY2027
- **PDN tree input as first-class** — RELAX / Contrast-FEL workflows that take PDN-curated tree IDs directly
- **Methods:** FADE, NRM landing next; broader HyPhy parity over the next year
- **WASM:** SIMD + threads when the runtime ecosystem stabilizes — projected ~2× speedup
- _Speaker note:_ Concrete commitments are what the BSC and program officers remember. Vague ones aren't. The "MCP first, REST to follow" framing is honest — it's how we're actually working — and it ties this slide back to the no-promise insight from Slide 11.

#### Slide 17 — Try it / get involved (~45s)

- **v3.datamonkey.org** — open in your browser right now (the live URL is the call to action)
- GitHub: github.com/veg/datamonkey3 (verify exact org before slide is final)
- MCP server config: documented in the repo (an example client config entry follows this slide's notes)
- Contact: weaverst@gmail.com
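For demo prep and the repo docs pointer above, a typical MCP client config entry in Claude Desktop's `claude_desktop_config.json` format; the server name and path are placeholders, not the repo's documented values.

```json
{
  "mcpServers": {
    "datamonkey": {
      "command": "node",
      "args": ["/path/to/datamonkey3/mcp-server/index.js"]
    }
  }
}
```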

#### Slide 18 — Acknowledgments + Q&A (~30s)

- Sergei Pond (Temple, PI), Anton Nekrutenko, the BRC Analytics team (Penn State, JHU, Temple, UCSC, TACC)
- BioWASM / Aioli authors
- HyPhy contributors
- NIAID OGAT for ongoing support
- "Happy to take questions — and I'll keep the live site open if anyone wants to try it during Q&A."
- _Speaker note:_ You have 15 minutes for Q&A. That's a lot. Don't be afraid of pauses — let questions develop. If the room is quiet for 10 seconds at the start, seed it with: _"One thing I'm curious to hear from the BV-BRC team is how you'd want alignment ingestion to work in practice."_

---

## Q&A prep (15 min is real — prepare for it)

A 15-min Q&A in front of this audience is its own deliverable. Prepare 10–11 likely questions with crisp answers; expect to use 5–6 of them.

### Likely technical questions

1. **"What's the upper limit on alignment size you can run in the browser?"**
   Practical answer: 200–500 sequences for selection methods, depending on sites and method. Above that, the cluster toggle exists. The honest version: _"The point isn't to run everything in the browser; it's that the user shouldn't have to think about where it runs."_

2. **"How does WASM HyPhy stay in sync with native HyPhy as new methods land?"**
   The compilation pipeline is automated against HyPhy main; new methods get a WASM build as part of the HyPhy release process. (Verify this is actually true before the talk.)

3. **"What about reproducibility when an agent picks the method?"**
   Every agent-driven analysis logs the tool call, parameters, and JSON result. Reproducibility is at the _call_ level, not the _prompt_ level. This is honest and defensible.

4. **"How does this differ from a REST API?"**
   You'll have answered the structural part on Slide 11 (LLMs re-read the spec on every call; no committed code on the consumer side; no compatibility tax). The follow-up question is usually about _why this matters in practice_. Answer: _"It means we ship capabilities to MCP first and back-port to REST later, once the shape has stabilized. The MCP surface gets to be a moving target. The REST API still has to be a promise."_ That's an honest description of how Datamonkey 3 is actually being developed.

5. **"Doesn't 'we don't keep promises' worry you from a reproducibility standpoint?"** _(This is the natural follow-up to the no-promise framing — be ready for it.)_
   Good question, and the answer separates two things. **Method semantics** are governed by HyPhy itself — FEL is FEL, MEME is MEME, the statistical methods are versioned in HyPhy releases and don't change because the MCP wrapper changed. **The MCP surface** — tool names, parameter shapes, how the agent invokes things — that's what we treat as fluid. So a published analysis is reproducible by version-pinning the HyPhy engine and the JSON result schema, not by version-pinning the MCP tool descriptor. We're moving fast on the _interface to the tools_, not on the _tools themselves_. That distinction is what makes the no-promise approach defensible.

6. **"Security — what stops an agent from running arbitrary HyPhy?"**
   The MCP server only exposes the registered method tools. There's no shell-out, no arbitrary file access. The sandbox is the WASM runtime in the browser case, and the explicit tool surface in the backend case.

### Likely BRC-strategy questions

7. **"How does this fit with VEuPathDB's Galaxy instance?"**
   The Galaxy tool wrappers work in any Galaxy instance, including VEuPathDB's. The MCP surface is independent. Both paths are open.

8. **"What's the sustainability story past this funding cycle?"**
   Same as HyPhy: open source, community-driven, no infrastructure dependencies the community can't replicate. The browser version literally runs on the user's machine — there's no server to keep alive.

9. **"Who maintains the MCP server and the WASM build?"**
   Currently the Pond Lab. Roadmap is to fold this into the broader HyPhy release process so it's not a single-person dependency.

### Likely program-officer questions

10. **"How do you measure impact?"**
    Job counts (Datamonkey has a historical baseline of ~2,200 jobs/week), citation tracking, and — new for v3 — agent-driven session counts. Open question on whether browser-only sessions are countable; flag this honestly.

11. **"What's the risk if MCP as a standard doesn't take off?"**
    The Socket.IO surface underneath is independent of MCP. If MCP fades, the same engine is still callable via standard web APIs. The architecture isn't betting on MCP specifically — it's betting on the agent-callable-tool pattern.

### Questions to redirect rather than answer

- **"Should BV-BRC adopt this?"** — that's a conversation for the BSC, not the talk. Redirect to Anton or the BRC Analytics roadmap session.
- **"Is this replacing Galaxy?"** — no, it's a tool _in_ Galaxy and a parallel surface alongside it. Don't take the bait.

---

## Files / sources behind each slide

- Methods + execution flow: `src/lib/MethodSelector.svelte`, `src/lib/services/WasmAnalysisRunner.js`, `src/routes/+page.svelte`
- Benchmarks: `benchmark-results/comparison-summary.md`, `benchmark-results/aggregated-results.json`
- MCP / agent surface: `PRIME-backend-implementation.md`, `e2e/video-frames.spec.js` (architecture frames already drawn)
- Visualization: `src/lib/utils/hyphyEyeIntegration.js`
- Pre-existing manuscript figures + recordings: `scripts/compose-figure5.sh`, `scripts/screenshot-absrel-results.js`, `video-frames/`
- Manuscript context for Slide 3: `Datamonkey_2_0/Datamonkey-2.0.tex`

The video frames and figure-5 composition assets were originally built for the Datamonkey 3 manuscript. Reusing them keeps the visual language consistent with the paper and saves prep time.

---

## Demo prep checklist

**Demo 1 (in-browser, live):**

- [ ] Stage SARS-CoV-2 spike codon alignment + tree (small, ~30 sequences) on the laptop
- [ ] Pre-load `v3.datamonkey.org` in two browser tabs (one armed for demo, one as backup)
- [ ] Pre-record the same flow as a 90s screen capture; have it on the same machine
- [ ] Verify hyphy-eye opens and renders cleanly with the chosen dataset
- [ ] Test on conference Wi-Fi at LSB 308 the day before — confirm the _initial_ asset load works (after that, offline-safe)

**Demo 2 (agent, recorded + live tail):**

- [ ] Configure Datamonkey MCP server in Claude desktop / Claude Code on the demo laptop
- [ ] Record the clean 2-min flow in advance — this is the primary asset
- [ ] Test the live-tail prompt twice end-to-end the day before — note the typical wall-clock duration
- [ ] Have a fallback plain prompt ready in case the agent gets confused during the live tail
- [ ] Decide in advance: if the live tail wanders, do you narrate live or end on the recording? Pick now, don't decide on stage.

**Both demos:**

- [ ] Increase font sizes in browser + terminal beforehand
- [ ] Disable notifications (Slack, Mail, calendar)
- [ ] Close anything visually distracting — single window, clean dock

**Q&A prep:**

- [ ] Print the 11-question list above; have it on the lectern
- [ ] Have the live `v3.datamonkey.org` window ready to use as a visual aid during answers
- [ ] If a question is hostile or off-topic, redirect to "happy to follow up offline" — you don't need to win every exchange

---

## Cut order if you run long during the talk

1. Trim Slide 6 (compilation detail) from 2.5 min to 90 sec
2. Compress Slide 8 (benchmarks) — keep the table, drop the SIMD/threads aside
3. Tighten Slide 13 (case study) from 2 min to 75 sec — but don't drop it; in this version it's load-bearing

Do **not** cut: Slide 2 reframe, Slide 14 BRC handshake, either demo.

---

## Verification (how to know the talk works)

- **Time it standing up, twice.** 30-min spoken slots run long. Aim for 28:00 in rehearsal with 2-min buffer. With a 15-min Q&A you have margin if you go over by a minute or two — but don't plan for it.
- **Dry run with one BRC Analytics colleague** (Sergei or Anton) to sanity-check the analytical-inequality framing and the Slide 14 handshake diagram
- **Dry run with one non-specialist** (Donna, or anyone in the program-officer profile) to check that the WASM compilation slide and the MCP architecture slide actually land for a non-engineer
- **Q&A rehearsal:** have a colleague throw 5 of the 11 questions at you cold. If you stumble, refine the answer in your notes.
- **Wi-Fi check at LSB 308** at least 24h before — confirm `v3.datamonkey.org` loads and the MCP demo path works on the conference network
- **Pre-record both demos** regardless of whether you plan to run them live — the recordings double as deck assets you can hand off to anyone who asks for the slides afterward

---

## Open questions before lock

1. **Slide 13 case study — confirm the spike-protein result you'll use.** This slide is now load-bearing, not optional. If you don't have a real result locked, swap to a different real result (HIV env, influenza HA — anything with a known biological story) rather than holding a placeholder.
2. **Speaker notes:** want full near-verbatim phrasing for each slide as a follow-on?
3. **Q&A list:** want me to expand the 11-question list further, or draft fuller answers for the top 5?
