BRC-Analytics | Explore, Analyze & Share Genomic Data
Ebola Evolutionary Surveillance

Real-Time Evolutionary Surveillance:
Ebola Virus Disease

Tools for Tomorrow NIAID BRCs Infectious Disease Webinar Series

Sergei Pond, Anton Nekrutenko, and the BRC-analytics team

June 12, 2026
02 // SURVEILLANCE REPORTS

Surveillance Reports: Virological.org

Real-time genomic updates from the international research community:

Genome Sequencing
Uganda CPHL sequenced sample CL023114. 99% genome coverage (Pathoplexus: PP_006XCJJ).
tMRCA & Circulation
tMRCA estimated to March 1–15, 2026. Suggests ~2 months of undetected community transmission.
Evolutionary Origin
Long branch indicates a new zoonotic spillover (14-year reservoir evolution) rather than survivor latency.
BDBV Phylogeny and tMRCA Timeline

BDBV phylogeny and MCMC molecular clock emergence timeline (Cuomo-Dannenburg & Ghafari, Virological.org, June 2026).

03 // SURVEILLANCE DYNAMICS

Surveillance Dynamics: Tracking Outbreak Spikes

Datamonkey.org submissions track EBOV outbreaks in real time:

Ebola Virus (EBOV) Datamonkey Submission Spikes
Sporadic Outbreak Signal (n=110 total hits)
Late 2025: Endemic Baseline (scattered submissions)
Early 2026: Routine Surveillance (low baseline activity)
Mid May '26: Pre-Alert Gap (0 submissions)
Late May '26: Bundibugyo Outbreak Alert (27 hits, spike)
Outbreak Tracking Signal
Surges in Datamonkey server submissions serve as early indicators of outbreak activity. Low baseline sequence counts during quiet periods limit standard molecular clock and selection analyses.
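
A surge of this kind can be flagged automatically. The sketch below is one illustrative approach, not the method behind the chart above: it compares each period's submission count with a trailing baseline and flags counts above baseline + k * sqrt(baseline). The function name `flag_spikes`, the window size, the multiplier `k`, and the weekly counts are all assumptions; only the 0 and 27 values echo the figures on this slide.

```python
from statistics import mean


def flag_spikes(counts, window=4, k=3.0):
    """Return indices of periods whose count exceeds a trailing-baseline threshold.

    Threshold = baseline mean + k * sqrt(max(baseline mean, 1)), a Poisson-style
    rule that stays usable when the baseline is at or near zero.
    """
    spikes = []
    for i in range(window, len(counts)):
        baseline = mean(counts[i - window:i])
        threshold = baseline + k * max(baseline, 1.0) ** 0.5
        if counts[i] > threshold:
            spikes.append(i)
    return spikes


# Hypothetical weekly submission counts (illustrative only).
weekly = [3, 2, 4, 1, 2, 0, 27]
print(flag_spikes(weekly))  # [6]
```

The square-root term keeps the rule from firing on small fluctuations around a low baseline, which matters when quiet periods contribute only a handful of submissions.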
04 // DATA FOUNDATION

Pathogen Data Integration

1. Evolutionary Question
Did 14 years of unobserved reservoir evolution alter selective pressures in 2026?
Study Design
2. Data Sources
68 curated genomes (≥18kb) from NCBI RefSeq and Pathoplexus/PDN.
Pathogen Registry
3. Workflows & Tools
CAPHEINE pipeline (cawlign, IQ-TREE, HyPhy selection suite) on Galaxy/BRC-analytics.
Galaxy Toolset
4. Interpretation
Quantifying selection intensity shifts (RELAX) and lineage-specific codon shifts (Contrast-FEL).
Molecular Reports
05 // DATA CURATION

Programmatic Data Curation

The agent programmatically queries the Pathoplexus API to extract, clean, and format genomic surveillance sequences:

1. LAPIS API Query
Queries Pathoplexus for BDBV genomes matching quality thresholds (length ≥ 18kb, coverage ≥ 90%).
2. Quality Curation
Removes lab-passaged clones, duplicate sequences, and short gene fragments to preserve wild-type signal.
3. Dataset Partitioning
Groups and caches 68 genomes into 34 foreground (May 2026) and 34 background (historical) cohorts.
Data accessed programmatically and curated directly from the Pathogen Data Network (PDN) / Pathoplexus.
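
The filtering and partitioning steps above can be sketched as follows. This is a minimal illustration that assumes the records have already been downloaded; it does not call the Pathoplexus API. The field names (`accession`, `length`, `coverage`, `collection_date`, `lab_passaged`), the demo records, and the May 1 cutoff for the foreground cohort are hypothetical.

```python
from datetime import date

MIN_LENGTH = 18_000                # "length >= 18kb" threshold from the slide
MIN_COVERAGE = 0.90                # "coverage >= 90%" threshold from the slide
OUTBREAK_START = date(2026, 5, 1)  # assumed cutoff for the May 2026 foreground


def passes_quality(rec):
    """Keep near-complete, well-covered genomes that are not lab-passaged."""
    return (
        rec["length"] >= MIN_LENGTH
        and rec["coverage"] >= MIN_COVERAGE
        and not rec.get("lab_passaged", False)
    )


def partition(records):
    """Split passing records into (foreground, background) accession lists."""
    foreground, background = [], []
    for rec in records:
        if not passes_quality(rec):
            continue
        cohort = foreground if rec["collection_date"] >= OUTBREAK_START else background
        cohort.append(rec["accession"])
    return foreground, background


# Hypothetical records for illustration.
records = [
    {"accession": "demo-1", "length": 18_900, "coverage": 0.99, "collection_date": date(2026, 5, 20)},
    {"accession": "demo-2", "length": 18_950, "coverage": 0.97, "collection_date": date(2012, 8, 1)},
    {"accession": "demo-3", "length": 1_200, "coverage": 0.99, "collection_date": date(2026, 5, 22)},
    {"accession": "demo-4", "length": 18_900, "coverage": 0.80, "collection_date": date(2007, 11, 3)},
]
foreground, background = partition(records)
print(foreground, background)  # ['demo-1'] ['demo-2']
```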
06 // AUTOMATED ANALYSES

Automated Selection Analyses

CAPHEINE is a Galaxy workflow that automates codon-aware alignment, tree reconstruction, and the full suite of HyPhy selection tests on usegalaxy.org. Status: Usable directly on Galaxy via the Intergalactic Workflow Commission (IWC); integration on BRC-analytics (featuring a dedicated configuration stepper) is coming soon.

1. BioBlend Preparation
Curated genomes are pushed to Galaxy via BioBlend APIs, automatically initializing dataset collections.
2. CAPHEINE Execution
Launches the 113-job Galaxy-native workflow, executing the full suite of HyPhy selection models in parallel.
3. Agentic Monitoring
The agent programmatically polls execution status, dynamically resolves payload errors, and caches JSON results.
Automated evolutionary pipeline running on public Galaxy infrastructure (usegalaxy.org).
Galaxy History
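
The monitoring step can be pictured as a polling loop with backoff. The sketch below is deliberately generic: `get_state` is a hypothetical callable that returns the current run state as a string (for instance, a status lookup made through BioBlend, not shown here), and the state labels, intervals, and timeout are assumptions rather than the agent's actual settings.

```python
import time

TERMINAL_STATES = {"ok", "error", "failed", "cancelled"}  # assumed state labels


def wait_for_terminal_state(get_state, interval=30.0, max_interval=300.0,
                            timeout=86_400.0, sleep=time.sleep):
    """Poll `get_state()` until it reports a terminal state or `timeout` elapses.

    The pause between polls doubles after each check, capped at `max_interval`,
    to avoid hammering a shared public server during long runs.
    """
    waited = 0.0
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if waited >= timeout:
            raise TimeoutError(f"still '{state}' after {waited:.0f} s")
        sleep(interval)
        waited += interval
        interval = min(interval * 2, max_interval)


# Usage with a stand-in status source and a no-op sleep.
states = iter(["queued", "running", "running", "ok"])
print(wait_for_terminal_state(lambda: next(states), sleep=lambda s: None))  # ok
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.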
07 // PHYLOGENETIC TREES

Phylogenetic trees are reconstructed for each of the 7 EBOV protein-coding regions. Outbreak isolates are highlighted to show their relationship to historical genomes.

Note: Trees include only unique sequence variants to remove identical genomes and highlight evolutionary paths.
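
Collapsing identical sequences per gene can be sketched as below. This is an assumed approach, not the pipeline's own code: the function name `collapse_identical` and the isolate names are hypothetical. It keeps one representative per distinct sequence and records which isolates share it, so outbreak isolates can still be highlighted after collapsing.

```python
def collapse_identical(seqs):
    """Map one representative isolate name to all isolates sharing its sequence.

    `seqs` is a dict of isolate name -> sequence for a single gene.
    Comparison is case-insensitive; the first name seen becomes the representative.
    """
    groups = {}
    for name, seq in seqs.items():
        groups.setdefault(seq.upper(), []).append(name)
    return {names[0]: names for names in groups.values()}


# Hypothetical isolates for illustration.
gene_seqs = {"iso_A": "ATGGCC", "iso_B": "atggcc", "iso_C": "ATGGCA"}
print(collapse_identical(gene_seqs))
# {'iso_A': ['iso_A', 'iso_B'], 'iso_C': ['iso_C']}
```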
Legend
2026 Outbreak Isolates
MRCA Clade (2026 Isolates)
Historical Background
MRCA Branch Analysis
08 // SELECTION MODULES

Site-level evolutionary analyses mapping selection and recombination signals across the genome:

EBOV Genome Selection Mapping (BDBV 2026 Outbreak)
Legend: Differential Selection (Contrast-FEL, p≤0.10); Pervasive Purifying Selection (FEL, p≤0.05); AA Different from Reference (consensus); Polymorphic Site in Outbreak
Labeled sites by gene: NP 406, 537, 707; VP40 271; GP 196, 213, 344, 509; L (RNA Polymerase) 440, 642, 1341/2, 1345, 1417, 1681, 1714, 1738, 2042; none labeled in VP35, VP30, or VP24
Pervasive Purifying Selection
Most amino acid sites are under strong purifying constraints (FEL, p≤0.05). Essential structural proteins (VP40, NP) show high sequence stability, maintaining virus functional integrity.
Differential Selection Pressures
Contrast-FEL (p≤0.10) identifies 7 selective differences between 2026 and background lineages: 1 in NP (site 537), 2 in GP (sites 213, 344), and 4 in L (sites 1341, 1342, 1681, 1738).
Outbreak Variable Sites Summary
Out of 47 total variable amino acid sites in the 2026 outbreak (41 different from reference, 6 polymorphic), Contrast-FEL finds 7 (15%) under significantly different selection pressure (p≤0.10).

Note: The extremely low genetic diversity among the 2026 outbreak isolates (mean genetic distance 0.56%) severely limits the statistical power of standard selection analyses. Consequently, a threshold of p≤0.10 is used to identify potential selective differences.
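
The counts above can be cross-checked with a short tally. The site lists and totals are taken from this slide; the variable names are illustrative.

```python
# Contrast-FEL sites (p <= 0.10) as listed on this slide, grouped by gene.
CONTRAST_FEL_SITES = {
    "NP": [537],
    "GP": [213, 344],
    "L": [1341, 1342, 1681, 1738],
}
DIFFERENT_FROM_REFERENCE = 41
POLYMORPHIC = 6

variable_sites = DIFFERENT_FROM_REFERENCE + POLYMORPHIC
n_differential = sum(len(sites) for sites in CONTRAST_FEL_SITES.values())
fraction = n_differential / variable_sites

for gene, sites in CONTRAST_FEL_SITES.items():
    print(f"{gene}: {len(sites)} site(s)")
print(f"{n_differential}/{variable_sites} = {fraction:.1%}")  # 7/47 = 14.9%
```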

09 // BDBV SELECTION DYNAMICS

BDBV Joint RELAX Analysis

Testing for selection shifts on 6 protein-coding regions (NP, VP35, VP40, GP, VP24, L) relative to historical human outbreaks (2007, 2012):

Run A: Spillover Event (Ancestral MRCA)
Tests for changes in evolutionary pressure when the virus jumped to humans:
  • Result: No statistically significant shift in selection intensity.
  • Meaning: No evidence that the virus faced unusual adaptive pressures during spillover.
Run B: Human-to-Human Spread (Outbreak Clade)
Tests for changes in evolutionary pressure during active transmission within the 2026 outbreak:
  • Result: No statistically significant shift in selection intensity.
  • Meaning: No evidence of relaxed constraint; purifying selection appears to remain in force.
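
For context, RELAX fits a selection intensity parameter k that rescales the reference-branch ω classes to ω^k on the test branches: k < 1 pulls ω toward 1 (relaxation), k > 1 pushes ω away from 1 (intensification), and the null model fixes k = 1. Significance comes from a likelihood-ratio test with one degree of freedom. The sketch below only illustrates that arithmetic; HyPhy performs the actual model fitting, and the ω values and log-likelihoods here are hypothetical, not results from this analysis.

```python
import math


def scale_omegas(omegas, k):
    """Apply the RELAX transformation omega -> omega**k to each rate class."""
    return [w ** k for w in omegas]


def lrt_pvalue_1df(lnl_null, lnl_alt):
    """p-value of a 1-df likelihood-ratio test.

    For one degree of freedom, the chi-square survival function at x
    equals erfc(sqrt(x / 2)), so no external statistics library is needed.
    """
    lr = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return math.erfc(math.sqrt(lr / 2.0))


# Hypothetical reference omega classes: purifying, neutral, positive.
print(scale_omegas([0.1, 1.0, 4.0], k=0.5))  # every class moves toward 1

# Hypothetical log-likelihoods: a small improvement gives a non-significant p.
p = lrt_pvalue_1df(lnl_null=-10234.6, lnl_alt=-10234.1)
print(f"p = {p:.2f}")
```

A "no significant shift" outcome corresponds to a p-value above the chosen threshold, meaning the data do not reject k = 1.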

Biological Implications

Contextualizing selection intensity shifts preceding emergence:

Conserved Purifying Selection
Both the spillover event and the subsequent outbreak branches exhibit strong purifying selection. The virus remained under constraint to preserve functional protein structures, indicating a lack of evolutionary release.
Evolutionary Stability
The 2026 outbreak shows no detectable difference in selection regime from the 2007 and 2012 outbreaks. This stability is consistent with emergence from a similar, naturally circulating reservoir and matches the patterns observed in other zoonotic spillovers.
10 // BRC-ANALYTICS WORKFLOWS
Consensus Assembly
Reference-guided assembly of raw FASTQ sequencing reads to reconstruct high-quality viral genomes.
Variant Calling
Low-frequency variant analysis to detect sub-clonal iSNVs and minor variants in clinical hosts.
Evolutionary Selection
Comparative genomics using cawlign, IQ-TREE, and HyPhy to detect selective differences between lineages.
Annotation & RNA-Seq
In-frame CDS annotation via the BRC-analytics VIGOR4 workflow and host transcriptomics to monitor immune evasion profiles.
11 // AGENTIC INFRASTRUCTURE

Agentic Analysis on Public Infrastructure

BRC-analytics tools are designed for programmatic orchestration, enabling AI agents to autonomously manage complex genomic runs on public Galaxy servers:

1. Programmatic API Control
Agents use BioBlend and Galaxy APIs to push sequences, construct dataset collections, and instantiate complex pipelines without human GUI interaction.
2. Self-Correction Loops
Agentic execution engines programmatically poll history states, analyze execution errors, and adjust configuration payloads to resolve run-time faults.
3. Public Cloud Orchestration
Standardized definitions on hubs like Galaxy's IWC, Dockstore, and WorkflowHub allow agents to dynamically deploy and replicate identical workflows across global public servers.
Robust, automated, and scalable genomic intelligence utilizing free, public scientific compute resources.
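
A self-correction loop of the kind described above might look like the following sketch. It is a generic pattern under stated assumptions, not the agent's implementation: `submit` and `diagnose` are hypothetical callables (one launches a run and reports success or an error message, the other proposes a revised payload or gives up), and the demo error format and placeholder dataset ids are invented for illustration.

```python
def run_with_self_correction(submit, diagnose, payload, max_attempts=3):
    """Submit a payload; on failure, let `diagnose` propose a revised payload.

    Returns (final_payload, attempt_log). Raises RuntimeError if attempts run
    out or `diagnose` cannot suggest a change.
    """
    log = []
    for attempt in range(1, max_attempts + 1):
        ok, error = submit(payload)
        log.append((attempt, dict(payload), error))
        if ok:
            return payload, log
        revised = diagnose(payload, error)
        if revised is None or revised == payload:
            break  # nothing new to try
        payload = revised
    raise RuntimeError(f"run failed after {len(log)} attempt(s): {log[-1][2]}")


# Stand-in callables: the run fails until a 'tree' input is supplied.
def demo_submit(payload):
    if "tree" not in payload:
        return False, "missing input: tree"
    return True, None


def demo_diagnose(payload, error):
    if error.startswith("missing input: "):
        name = error.split(": ", 1)[1]
        return {**payload, name: f"<{name} dataset id>"}
    return None


final, log = run_with_self_correction(
    demo_submit, demo_diagnose, {"alignment": "<alignment dataset id>"}
)
print(len(log), sorted(final))  # 2 ['alignment', 'tree']
```

Capping attempts and stopping when the diagnosis proposes no change keeps a faulty correction rule from resubmitting the same failing job to shared public servers.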
veg/bdbv-selection-pipeline