AxoMeme Selection Analyzer

Alignment Input (FASTA/NEXUS)

Provide a multiple sequence alignment in FASTA or NEXUS format. Nucleotide sequences must be codon-aligned, so the alignment length must be a multiple of 3. Gzipped files (.gz) are also accepted and upload faster.
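The codon-alignment requirement can be checked before uploading. The following is a minimal sketch, not the analyzer's own parser: the function names `read_fasta` and `check_codon_aligned` are hypothetical, only FASTA is handled, and gzip input is detected by its magic bytes.

```python
import gzip

def read_fasta(data: bytes) -> dict:
    """Parse FASTA bytes (optionally gzip-compressed) into {name: sequence}."""
    # Gzip streams start with the two magic bytes 0x1f 0x8b.
    if data[:2] == b"\x1f\x8b":
        data = gzip.decompress(data)
    records = {}
    name = None
    for line in data.decode("utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records

def check_codon_aligned(records: dict) -> list:
    """Return a list of problems; an empty list means the alignment passes."""
    problems = []
    # All rows of an alignment must have the same length.
    if len({len(seq) for seq in records.values()}) > 1:
        problems.append("sequences differ in length")
    # Codon-aligned input has a length that is a multiple of 3.
    for name, seq in records.items():
        if len(seq) % 3 != 0:
            problems.append(f"{name}: length {len(seq)} is not a multiple of 3")
    return problems
```

Calling `check_codon_aligned(read_fasta(open(path, "rb").read()))` on a local file returns an empty list when every sequence has the same length and that length is divisible by 3.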

Drag & drop alignment file here

Supports FASTA (.fasta, .fa) or NEXUS (.nex, .nexus), optionally gzipped (.gz)

No file selected

Reference Sequence

Select the sequence in your alignment that serves as the coordinate reference. All predicted selection sites (e.g. Site 42) are reported in this sequence's coordinates, ignoring gaps.
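To illustrate what gap-ignoring reference coordinates mean, here is a small sketch that maps each codon column of the alignment to a 1-based codon index in the reference. The function name `reference_site_numbers` is hypothetical, and the rule that a column counts only when the reference codon is not entirely gap characters is an assumption rather than the tool's documented behavior.

```python
def reference_site_numbers(ref_aligned: str, gap_chars: str = "-.") -> list:
    """For each codon column, return the 1-based reference codon index,
    or None where the reference row is all gaps in that column."""
    sites = []
    count = 0
    usable = len(ref_aligned) - len(ref_aligned) % 3  # ignore a trailing partial codon
    for start in range(0, usable, 3):
        codon = ref_aligned[start:start + 3]
        if all(ch in gap_chars for ch in codon):
            # Assumption: columns where the reference is gapped get no site number.
            sites.append(None)
        else:
            count += 1
            sites.append(count)
    return sites
```

For a reference row `ATG---AAA`, the three codon columns map to sites 1, none, and 2, so the third alignment column is reported as Site 2.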

Phylogenetic Tree Input

Provide an optional phylogenetic tree in Newick or NEXUS format. If the tree does not contain branch lengths, HyPhy WebAssembly estimates them client-side using the HKY85 model. If no tree is provided and the alignment file does not embed one, a flat star-like topology is assumed.
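The two tree conditions described above (a tree without branch lengths, and no tree at all) can be sketched as follows. This is an illustration under stated assumptions: `has_branch_lengths` and `star_newick` are hypothetical helpers, the branch-length test is a simple pattern match on the Newick string rather than a full parser, and taxon names are assumed to need no quoting.

```python
import re

def has_branch_lengths(newick: str) -> bool:
    """Heuristically report whether a Newick string carries branch lengths."""
    # Drop quoted labels and bracketed comments so colons inside them are ignored.
    stripped = re.sub(r"'(?:[^']|'')*'", "", newick)
    stripped = re.sub(r"\[[^\]]*\]", "", stripped)
    # A branch length is written as ':' followed by a number.
    return re.search(r":\s*[-+]?(?:\d|\.\d)", stripped) is not None

def star_newick(names: list) -> str:
    """Build a flat, star-like Newick tree with every taxon attached to the root."""
    return "(" + ",".join(names) + ");"
```

A tree such as `((A,B),C);` would be flagged as lacking branch lengths, while `star_newick(["A", "B", "C"])` yields `(A,B,C);` as a stand-in topology.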

Drag & drop tree file here (optional)

Newick/NEXUS format. If omitted, uses embedded tree or flat topology.

No file selected

Max Species Cap (max_species)

Controls the maximum number of sequences included in the analysis.

If your alignment contains more species than this cap, the pipeline preserves the selected reference sequence and prunes the tree/alignment down to this threshold based on the taxa order.

Key Impacts:

Selection Mechanism: The user-selected reference sequence is placed first in the list to guarantee its inclusion. To maximize phylogenetic diversity and preserve the overall tree span and length, the remaining sequences are chosen with an exact greedy Phylogenetic Diversity (Faith's PD) maximization algorithm. At each step it deterministically selects the species that adds the most branch length to the existing subtree, capturing the deepest splits and the broadest taxonomic span. If no tree is provided, it falls back to evenly spaced (systematic) sampling along the alignment.
Computational Complexity: A lower cap reduces the time needed for client-side calculations, such as the O(N^3) Classical MDS coordinate extraction and HyPhy branch estimation. A hard upper limit of 512 is enforced to prevent the web browser from running out of memory (OOM) and crashing.
Phylogenetic Resolution: Dropping species can prune away key intermediate nodes or sister lineages, potentially reducing the model's statistical sensitivity to detect episodic diversifying selection at certain sites.
Inference Fidelity: The surrogate model's learned phylogenetic bias is optimized for balanced input taxonomic density. The default cap of 256 offers a robust trade-off between client-side latency and evolutionary signal depth.
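The selection mechanism described above can be sketched as follows. This is an illustration, not the pipeline's code: the tree is assumed to be given as two hypothetical dictionaries (`parent` maps each node to its parent, `length` maps each node to the length of the branch above it), the rooted form of Faith's PD is assumed (the path from the reference to the root counts as already covered), and ties are broken by input order. The even-spacing fallback is one plausible reading of "systematic even-spacing sampling".

```python
def greedy_pd_subset(parent, length, leaves, reference, cap):
    """Pick up to `cap` leaves, reference first, greedily maximizing added branch length."""
    covered = set()  # nodes whose branch to the parent is already in the subtree

    def uncovered_path(leaf):
        # Walk toward the root until reaching a node already in the subtree.
        path, node = [], leaf
        while node is not None and node not in covered:
            path.append(node)
            node = parent.get(node)
        return path

    def gain(leaf):
        # Branch length this leaf would add to the current subtree.
        return sum(length.get(n, 0.0) for n in uncovered_path(leaf))

    chosen = [reference]
    covered.update(uncovered_path(reference))
    candidates = [leaf for leaf in leaves if leaf != reference]
    while len(chosen) < cap and candidates:
        best = max(candidates, key=gain)  # first maximum wins, so ties follow input order
        chosen.append(best)
        covered.update(uncovered_path(best))
        candidates.remove(best)
    return chosen

def even_spaced_subset(names, reference, cap):
    """Fallback without a tree: keep the reference, then sample the rest at even intervals."""
    others = [n for n in names if n != reference]
    k = max(cap - 1, 0)
    if k >= len(others):
        return [reference] + others
    return [reference] + [others[(i * len(others)) // k] for i in range(k)]

# Example tree: ((A:1,B:1)X:1,(C:3,D:0.5)Y:1)R;
parent = {"A": "X", "B": "X", "C": "Y", "D": "Y", "X": "R", "Y": "R", "R": None}
length = {"A": 1.0, "B": 1.0, "C": 3.0, "D": 0.5, "X": 1.0, "Y": 1.0}
subset = greedy_pd_subset(parent, length, ["A", "B", "C", "D"], "A", 3)
```

In the example, with A as the reference, C is picked first because it adds 4.0 units of branch length (its own branch plus the branch above Y), and B is picked next because it adds 1.0 against 0.5 for D.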

On submission, if the tree lacks branch lengths, the engine runs HyPhy WebAssembly client-side to fit the HKY85 model and estimate them automatically.

Inference Processing Pipeline

Decompressing & Parsing Alignment...

Parsing & Pruning Phylogenetic Tree...

Running HyPhy WASM (Branch Optimization)...

Calculating Patristic Distances & MDS...

Running Neural Network Inference...

Post-processing LRT & Calling Selection Tiers...

Inference Results

Codon Sites -

Species Analyzed -

Tier 1 (High) Sites -

Tier 2 (Medium) Sites -

Codon Sites Table

Codon Site	Reference State	AA Composition	Variable	Log(LRT)	Predicted LRT	Local Z-Score	Local Percentile	Call


Goal of the Model

Phylogenetic Axial Transformer

A. Joint Codon-AA Embedding

B. Learnable Phylogenetic Attention Bias

Regression Training Pipeline

A. Continuous Selection Target Formulation

B. Weighted Huber Loss

C. Data Splits & Metrics

D. Selection Call Classification (Tiers)