BRC Analytics Logo BRC-Analytics | Positive Selection Server
Platform Spotlight

Real-Time Evolutionary Surveillance:

Tracking Viral Outbreak Upticks via Positive Selection Server Usage

Sergei Kosakovsky Pond & Anton Nekrutenko

May 22, 2026

Evolutionary Tracking of Viral Outbreaks • May 22, 2026

BRC-Analytics: Analysis Workflow

A high-level overview of the sequence analysis pipeline, from input data processing to selection detection and downstream ecosystem integration.

Workflow Pipeline
The schematic illustrates the process: inputting alignments, running analyses via Datamonkey servers or client browsers, and exporting outputs (JSON/CSV) to downstream platforms like Galaxy for visualization.

Datamonkey: Platforms for Molecular Evolution

Providing statistically rigorous inference frameworks for detecting natural selection and recombination. The platform supports sequence analysis for diverse viral, bacterial, and eukaryotic pathogens and processes over 50,000 runs annually. Each run is a comparative analysis of a multiple sequence alignment and its phylogenetic tree.

Datamonkey 2.0 (Server-Side)

  • Centralized Computing: Runs complex, resource-intensive evolutionary models on shared server clusters.
  • Queue-Based Workflow: Processes large-scale datasets reliably, though jobs may experience wait times during peak usage.

Web portal: datamonkey.org

Datamonkey 3.0 (Client-Side)

  • Browser-Based Computing: Performs analysis directly on the user's computer, eliminating server queue wait times.
  • Immediate Processing: Minimizes wait times and supports data privacy, but is bounded by local browser memory limits.

Web portal: v3.datamonkey.org

Ecosystem Integration
Both versions operate as standalone portals for targeted analysis and integrate with automated platforms like Galaxy to support reproducible genomic pipelines.

Programmatic Integration: Automated Workflows

Exposing evolutionary analysis pipelines programmatically using the Model Context Protocol (via the Datamonkey MCP Server) to integrate selection inference directly into digital research systems.

Workflow Automation

  • Unified Tool Access: Connects evolutionary tools directly to existing analysis pipelines, removing manual data transfer steps.
  • Automated Research: Integrates with digital research assistants to perform selection analyses as part of larger computational runs.

MCP service: mcp.datamonkey.org

Operational Management

  • Pre-Flight Verification: Automatically validates input datasets and configurations prior to execution to reduce runtime failures.
  • Process Coordination: Oversees the analysis lifecycle, handling data submission, queue monitoring, and result retrieval.
Operational Impact
Programmatic access supports automated evolutionary tracking, enabling selective pressure monitoring to be integrated into viral surveillance workflows.
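
As an illustration of what a pre-flight check on a codon alignment could look like, here is a minimal sketch. It is not the Datamonkey MCP Server's actual validation logic: the function name `preflight_check`, the choice of checks, and the use of the standard genetic code's stop codons are all assumptions made for this example.

```python
# Hypothetical pre-flight check for a codon alignment (illustration only).
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def preflight_check(alignment: dict[str, str]) -> list[str]:
    """Return a list of problems found; an empty list means the input passed."""
    problems = []
    if len(alignment) < 2:
        problems.append("need at least two sequences")
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) > 1:
        problems.append(f"sequences differ in length: {sorted(lengths)}")
    for name, seq in alignment.items():
        if len(seq) % 3 != 0:
            problems.append(f"{name}: length {len(seq)} is not a multiple of 3")
        # Split into complete codons; any trailing partial codon is ignored here.
        codons = [seq[i:i + 3].upper() for i in range(0, len(seq) - 2, 3)]
        # A terminal stop is tolerated; stops before the last codon are flagged.
        for index, codon in enumerate(codons[:-1], start=1):
            if codon in STOP_CODONS:
                problems.append(f"{name}: internal stop codon {codon} at codon {index}")
    return problems


if __name__ == "__main__":
    print(preflight_check({"a": "ATGTAAGCC", "b": "ATGGC"}))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue before a job is queued.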

Outbreak Analysis: Modular Selection Suite

Analyzing selection and recombination through complementary, non-linear evolutionary inquiries.

Control Module
Recombination Screening (GARD)
Identifies breakpoints to control for phylogenetic conflict.
  • Pre-flight Check: Reduces false positive selection signals.
  • Pathogen Context: Commonly used for high-recombination pathogens (HIV, SARS-CoV-2); less relevant for non-recombining pathogens (Hantavirus).
When? (Episodic)
Temporal Selection Spikes
Detects transient evolutionary pressure restricted to a subset of branches or sites.
  • BUSTED: Gene-wide test for episodic selection spikes.
  • MEME: Pinpoints individual sites under episodic pressure.
Where? (Sites & Branches)
Pervasive & Localized Pressure
Maps selective constraint and adaptation across specific loci or branches.
  • FEL & FUBAR: Codon-level pervasive purifying or diversifying selection.
  • MEME: Individual sites under episodic selection spikes.
  • aBSREL: Identifies specific branches undergoing diversifying selection.
What Differences? (Comparative)
Selective Shifts & Tuning
Compares evolutionary pressures between host species or groups.
  • Contrast-FEL: Maps host-specific selection shifts (e.g. Human vs. Rodent).
  • RELAX: Tests for selection relaxation or intensification.
Integration Strategy
Rather than running a single linear pipeline, researchers select the module suited to their biological question. Recombination screening (GARD) is recommended for recombining pathogens, while direct selection analysis is suitable for non-recombining viruses like Hantavirus.
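
The question-driven selection described above can be sketched as a small lookup. The method names come from this slide; the keys (`when`, `where`, `what_differences`) and the function `plan_analyses` are hypothetical names chosen for the example, not part of any Datamonkey API.

```python
# Map each biological question on the slide to the methods listed under it.
METHODS_BY_QUESTION = {
    "when": ["BUSTED", "MEME"],                     # episodic selection
    "where": ["FEL", "FUBAR", "MEME", "aBSREL"],    # sites and branches
    "what_differences": ["Contrast-FEL", "RELAX"],  # comparative
}


def plan_analyses(question: str, recombining: bool) -> list[str]:
    """Return the methods to run, with GARD screening first for recombining pathogens."""
    if question not in METHODS_BY_QUESTION:
        raise ValueError(f"unknown question: {question!r}")
    plan = ["GARD"] if recombining else []
    return plan + METHODS_BY_QUESTION[question]


if __name__ == "__main__":
    # Non-recombining virus, comparative question: no GARD step.
    print(plan_analyses("what_differences", recombining=False))
    # Recombining pathogen, episodic question: GARD runs first.
    print(plan_analyses("when", recombining=True))
```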

Outbreak Analysis: Why Evolutionary Profiling Matters

Connecting phylogenetic patterns to biological mechanisms, structural characterization, and outbreak surveillance.

🎯
Outbreak & Escape Tracking
Tracing Receptor Binding & Antigenic Drift
  • Evasion Signals: Separates neutral genetic drift from adaptive pressure, tracing structural changes associated with potential immune escape.
  • Host Adaptation: Contrast-FEL maps host-specific selection shifts, identifying differences in selection pressures between host groups.
Conservation Analysis
Locating Conserved Residues
  • Mutational Tolerance: B-STILL identifies residues under purifying selection where mutations are less frequently observed.
  • Functional Domains: Identifying conserved residues (e.g., fusion loops, transmembrane domains) helps characterize potential structural constraints.
Surveillance Objective
Integrating selection workflows supports surveillance by going beyond cataloging mutations: it provides information on whether new lineages are adapting to hosts or experiencing potential structural constraints.

Surveillance Dynamics: Tracking Outbreak Spikes

Analyzing Datamonkey server usage signals for Hantavirus and Ebola virus under sequence availability constraints.

Hantavirus Submission Patterns
Active Monitoring (59.6% of weeks, n=994)
  • Oct 2025: 174 hits, Peak Submissions (Southern Cone Spring Onset)
  • Jan 2026: 76 hits, Uptick (Post-PAHO Alert)
  • Mar 2026: 148 hits, Increased Submissions (Late-Summer Peak)
  • May 2026: 98 hits, Active Submissions (MV Hondius Outbreak)
Ebola Virus (EBOV) Signal
Sporadic Signal (30.8% of weeks, n=110)
  • Late 2025: Sporadic, Low Baseline (Endemic Monitoring)
  • Early 2026: Scattered, Low Baseline (Routine Surveillance)
  • Mid May '26: 0 hits, Complete Gap (Pre-Alert Pause)
  • Late May '26: 27 hits, Increased Submissions (Bundibugyo Outbreak)
Surveillance Insight
Submission surges can serve as indicators of outbreak activity. Low sequence counts can limit standard selection analysis power, suggesting the use of small-sample models.
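
A "percent of weeks active" figure like the ones above could be derived from submission timestamps roughly as follows. This is a sketch under stated assumptions: a week is taken to run Monday to Sunday and to count as active if it has at least one submission, which may not match how the slide's percentages were computed. The function names and the dates in the usage example are invented for illustration.

```python
from collections import Counter
from datetime import date, timedelta


def week_start(day: date) -> date:
    """Monday of the week containing `day`."""
    return day - timedelta(days=day.weekday())


def weekly_activity(submissions: list[date], first: date, last: date):
    """Return (hits per week, fraction of weeks with at least one hit) over [first, last]."""
    counts = Counter(week_start(d) for d in submissions if first <= d <= last)
    weeks = []
    current = week_start(first)
    while current <= last:
        weeks.append(current)
        current += timedelta(weeks=1)
    hits = {w: counts.get(w, 0) for w in weeks}
    active = sum(1 for n in hits.values() if n > 0)
    return hits, active / len(weeks)


if __name__ == "__main__":
    # Made-up submission dates, for illustration only.
    demo = [date(2026, 5, 5), date(2026, 5, 6), date(2026, 5, 20)]
    hits, fraction = weekly_activity(demo, date(2026, 5, 4), date(2026, 5, 24))
    print(hits, round(fraction, 3))
```

Weeks with zero hits are kept in the result so that gaps, such as the mid-May pause in the Ebola track, remain visible instead of being silently dropped.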

Case Study: Andes Hantavirus Glycoprotein (M Segment)

Characterizing selective constraints, host adaptation, and co-evolution across 104 isolates.

Why Analyze Glycoprotein M?
Understanding Andes Hantavirus evolutionary constraints across the host-vector barrier.
  • Entry & Transmission: Glycoprotein M mediates cell entry and membrane fusion, making it a target of interest for neutralizing antibodies.
  • Reservoir-to-Host Spillover: Analyzing constraints between Rodent reservoirs and Human hosts can help identify residues associated with adaptation.
  • Outbreak Tracking: Useful for tracking transmission dynamics in outbreaks, including the recent MV Hondius cruise ship outbreak.
Collaborator-Driven Study
Driven by collaborators L. D. González Vázquez (Univ. of Vigo), C. Mavian (Stellenbosch Univ.), and D. Martin. BRC-Analytics provided computational resources, software tools, and workflow guidance.

Global Dynamics & Host Transitions

Gene-wide analysis of selective pressure changes during the transition from rodent reservoir to human host.

Global Selection Pressures
The glycoprotein is under purifying selection (dN/dS ≪ 1), suggesting functional constraint.
Human Branches dN/dS: 0.0224
Rodent Branches dN/dS: 0.0227
Background Branches dN/dS: 0.0127
Selection Intensity Analysis
Tests if transmission to humans relaxes or intensifies selective constraints on the Glycoprotein relative to rodents.
Selection Intensity (K): 1.07
LRT Statistic: 1.0289
P-value (Threshold 0.05): 0.3104 (NS)
Conclusion: There is no significant change in selection intensity (K ≈ 1). The host transition does not suggest generalized evolutionary relaxation or intensification.
💡 Interpretation
This constraint aligns with the structural and functional conservation typical of the Hantavirus Glycoprotein (M Segment). The stability of global selection intensity suggests selection pressures remain similar overall in both hosts, with potential shifts localized to specific sites rather than gene-wide relaxation.
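
The reported p-value can be checked from the LRT statistic. The sketch below assumes the test compares the statistic against a chi-square distribution with 1 degree of freedom (one extra parameter, K, in the alternative model). For 1 degree of freedom the upper tail has the closed form erfc(sqrt(x / 2)), so only the standard library is needed. The function name is chosen for this example.

```python
import math


def lrt_p_value_1df(lrt_statistic: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(lrt_statistic / 2.0))


if __name__ == "__main__":
    p = lrt_p_value_1df(1.0289)  # LRT statistic from the slide
    print(f"p = {p:.4f}, significant at 0.05: {p < 0.05}")
```

Under that assumption the result agrees with the slide's 0.3104 to four decimal places, so the data do not reject the null hypothesis K = 1.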

Host-Specific Adaptation: Site-Level Shifts

Contrast-FEL identifies specific residues experiencing different evolutionary constraints between host groups.

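[Figure: M-segment glycoprotein domain map (Gn Head 1–512, Gn Stalk, Gc Head 652–1000, Gc Stem, TMD; axis ticks at 1, 512, 648, 1000, 1110, 1137) with Contrast-FEL sites 71, 217, 598, 649, 899, and 1051 marked.]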
Host Constraint Shift (Human vs. Rodent)
Contrast-FEL identifies 6 host-specific sites. In these results, all 6 sites exhibit variation in the rodent reservoir (higher dN) but are conserved in human infections (dN = 0), which is consistent with potential host transmission bottlenecks:
Site & Region      Human   Rodent
71 (Gn Head)       T       T, L
217 (Gn Head)      T       T, V
598 (Gn Stalk)     G       G, N
649 (Cleavage)     A       A, V
899 (Gc Head)      G       G, H
1051 (Gc Stem)     T       T, S, M
Cleavage Motif Host Tuning (Site 649)
Site 649 resides in the conserved proteolytic cleavage motif W-A-A-S-A (glycoprotein processing):
W - A - A - S - A
Site 649 (Middle Alanine)
  • Human branches: Purifying selection (dN ≈ 0) — conserved for Alanine (A).
  • Rodent branches: Relaxed selection (dN = 0.64) — tolerates Valine (V) (50% Alanine [A], 50% Valine [V] in non-gapped sequences).
Biological Significance: Suggests host-specific differences in cellular protease interactions or processing machinery are associated with tighter constraint in human infections compared to the reservoir.
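
Per-host residue percentages such as "50% Alanine, 50% Valine in non-gapped sequences" can be tallied from one alignment column as sketched below. The function name, the set of characters treated as gaps, and the toy sequences are assumptions for illustration. The toy data are not the study's isolates.

```python
from collections import Counter

GAP_CHARS = {"-", "?", "X"}  # assumption: characters treated as missing data


def residue_frequencies(column: dict[str, str], hosts: dict[str, str]) -> dict[str, dict[str, float]]:
    """Residue frequencies per host group at one site, ignoring gapped sequences."""
    counts: dict[str, Counter] = {}
    for name, residue in column.items():
        if residue in GAP_CHARS:
            continue
        counts.setdefault(hosts[name], Counter())[residue] += 1
    return {
        host: {aa: n / sum(tally.values()) for aa, n in tally.items()}
        for host, tally in counts.items()
    }


if __name__ == "__main__":
    # Toy column for a single site; sequence names and hosts are made up.
    column = {"r1": "A", "r2": "V", "r3": "A", "r4": "V", "r5": "-", "h1": "A", "h2": "A"}
    hosts = {"r1": "Rodent", "r2": "Rodent", "r3": "Rodent", "r4": "Rodent",
             "r5": "Rodent", "h1": "Human", "h2": "Human"}
    print(residue_frequencies(column, hosts))
```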

Diversifying Selection: Episodic Selection Pressures

Tracking localized selection pressures in the Glycoprotein.

[Figure: M-segment glycoprotein domain map (Gn Head 1–512, Gn Stalk, Gc Head 652–1000, Gc Stem, TMD; axis ticks at 1, 512, 648, 1000, 1110, 1137) with MEME sites 307, 499, 823, 994, 996, and 1055 marked.]
Host Variation at Adaptive Sites
Site & Region      Human   Rodent   MEME p
307 (Gn Head)      S       S        0.051
499 (Gn Head)      I, V    I, V     0.074
823 (Gc Head)      A       A        0.082
994 (Gc Head)      T       T        0.017
996 (Gc Head)      T       T, V     0.094
1055 (Gc Stem)     S, T    S        0.080
Selection Summary & Interpretation
  • Gene-Wide Selection (BUSTED-E): Whole-tree selection is borderline significant (p = 0.054). No global enrichment on human lineages (p = 0.21).
  • Site-Specific Selection (MEME): Identified 6 sites under episodic selection (p ≤ 0.10) across key domains: Gn head (307, 499), Gc head (823, 994, 996), and Gc stem (1055).
  • Hondius Outbreak Variants: All 5 isolates from the May 2026 outbreak carry the minority variant at two selected sites: Valine at site 499 and Threonine at site 1055 (each present in 100% of outbreak isolates).
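
The grouping of MEME sites by domain can be reproduced from the p-values in the table above. In this sketch the inclusive domain boundaries are an assumption inferred from the figure's axis ticks (1, 512, 648, 1000, 1110, 1137) and the Gc Head label (652–1000). Sites 649–651 are left unassigned because the slides label site 649 only as "Cleavage". The function names are chosen for this example.

```python
# MEME p-values as listed on the slide.
MEME_P_VALUES = {307: 0.051, 499: 0.074, 823: 0.082, 994: 0.017, 996: 0.094, 1055: 0.080}

# Assumed inclusive boundaries, inferred from the figure's axis ticks.
DOMAINS = [
    ("Gn Head", 1, 512),
    ("Gn Stalk", 513, 648),
    ("Gc Head", 652, 1000),
    ("Gc Stem", 1001, 1110),
    ("TMD", 1111, 1137),
]


def domain_of(site: int) -> str:
    """Name of the domain containing `site`, or 'unassigned' if none matches."""
    for name, start, end in DOMAINS:
        if start <= site <= end:
            return name
    return "unassigned"


def sites_by_domain(threshold: float) -> dict[str, list[int]]:
    """Group sites with p <= threshold by domain, in ascending site order."""
    grouped: dict[str, list[int]] = {}
    for site in sorted(MEME_P_VALUES):
        if MEME_P_VALUES[site] <= threshold:
            grouped.setdefault(domain_of(site), []).append(site)
    return grouped


if __name__ == "__main__":
    print(sites_by_domain(0.10))  # the slide's reporting threshold
    print(sites_by_domain(0.05))  # a stricter cutoff
```

With these values, tightening the cutoff from 0.10 to 0.05 leaves only site 994, which is worth keeping in mind when weighing the other five sites.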

Co-Evolutionary Networks (BGM)

Identifying interacting codon networks and structural linkages.

Bayesian Graphical Models (BGM)
BGM maps epistatic interactions by detecting co-varying codon positions:
  • Network Density: 25 co-evolving pairs (posterior probability > 0.50) across the M segment.
  • Compensatory Selection: Links (prob ≥ 0.90) in Gn outer shell and Gc fusion loop show structural constraint.
  • Synonymous Association: Co-variation at synonymous sites suggests RNA-level constraint.
Co-evolving Codon Pairs (Prob ≥ 0.90)
Interaction links and structural context:
Codon Pair   Posterior Prob   Type     Structural Context
126 & 168    0.97             Nonsyn   Gn Head outer shell co-variation (V/I vs S/N)
117 & 311    0.95             Nonsyn   Compensatory hydrophobic shifts (P/S/A vs T/A/S)
669 & 670    0.95             Nonsyn   Adjacent residues in Gc DI fusion loop (E vs I/V)
743 & 785    0.90             Syn      Gc Head co-variation (CAG/CAA vs GTT/GTG/GTA/GTC)
🔗 RNA-Level vs. Protein-Level Co-evolution
Codon-level BGM detects both nonsynonymous compensatory links (e.g., adjacent sites 669/670 in Gc) and synonymous co-evolving pairs (e.g., 743 & 785). Synonymous association suggests potential selection at the RNA level, which may conserve secondary structures or translation speed.

Highly Conserved Sites & Summary

B-STILL identifies highly conserved residues suggesting potential functional constraint.

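[Figure: M-segment glycoprotein domain map (Gn Head 1–512, Gn Stalk, Gc Head 652–1000, Gc Stem, TMD; axis ticks at 1, 512, 648, 1000, 1110, 1137) with B-STILL conserved codons marked: 22 (ACC), 685 (TCA), 735 (GCA), 978 (ACA), 992 (ACA), 1116 (GTG).]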
B-STILL Strongly Conserved Loci (SCLs)
B-STILL identifies 6 strongly conserved loci (sites 22, 685, 735, 978, 992, and 1116) under purifying constraint (empirical Bayes factor, EBF > 10). These sites represent positions with low mutational tolerance across the glycoprotein sequence.

These sites are unusual in the context of a gene that is already quite conserved: only 7.3% of the codons (83/1137 sites with >50% non-gap data) are perfectly conserved across all 104 isolates.

Summary Takeaways

  • Conserved Regions: Highly conserved sites such as Site 735 (fusion loop) and Site 1116 (TMD) exhibit very low mutational tolerance, which may guide future immunogen design.
  • Host-Vector Monitoring: Monitoring host-specific selection shifts (such as the proteolytic processing motif at site 649) provides information relevant to vector-to-human spillover analysis.
📊 Evolutionary Summary
By integrating B-STILL, MEME, and Contrast-FEL, we map both conserved loci and host-vector adaptations in the viral envelope, helping to characterize evolutionary constraints.