SARS-CoV-2 Multisite analysis

This analysis examines "interesting" sites that show evidence of evolving under natural selection. {Link to observable}. Sequences were taken from daily downloads from the GISAID portal. These SARS-CoV-2 sequences were aligned with a codon-aware method, and the following genes were analyzed: S, M, N, ORF1a, ORF3a, ORF7a, and ORF8. From the "compressed" alignments, the site number and the codon at that position were recorded for every sequence available on that day. For the time-series analysis, each day represents the cumulative dataset downloaded from GISAID up to that day. These site/codon "codes" were combined into a haplotype (a mutation sequence, or sequence fingerprint) for each sequence. The graphs below show the observed frequency of each gene's haplotypes on each day.
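As a rough illustration of the site/codon "code" idea, the sketch below builds a haplotype string from a codon-aligned nucleotide sequence and tallies haplotype frequencies for one day's set of sequences. The function names, the `site:codon` label format, and the toy sequences are assumptions made for this example; they are not taken from the actual pipeline.

```python
from collections import Counter

def codon_at(seq, site):
    """Return the codon at a 1-based codon site of a codon-aligned sequence."""
    start = (site - 1) * 3
    return seq[start:start + 3]

def haplotype(seq, sites):
    """Join site/codon codes into one fingerprint (hypothetical label format)."""
    return "|".join(f"{s}:{codon_at(seq, s)}" for s in sites)

def haplotype_frequencies(seqs, sites):
    """Fraction of sequences carrying each haplotype."""
    counts = Counter(haplotype(s, sites) for s in seqs)
    total = sum(counts.values())
    return {h: c / total for h, c in counts.items()}

# Toy cumulative dataset for a single day (made-up sequences, 4 codons each).
seqs = ["ATGGCTTTTAAA", "ATGGCTTTTAAA", "ATGGCTTTTAAG", "ATGGTTTTTAAA"]
freqs = haplotype_frequencies(seqs, sites=[2, 4])
for hap, freq in sorted(freqs.items()):
    print(hap, freq)
```

Repeating the calculation on each day's cumulative set of sequences would give one frequency per haplotype per day, which is the kind of series the charts plot.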

Gene   SARS-CoV-2 "Interesting" Sites
M 30,69,109,132,141,143
N 13,20,24,55,67,108,119,139,144,151,173,207,208,210,218,253,291,305,318,360,377,393,397,417
ORF1a 16,84,86,92,166,210,265,274,286,333,398,428,454,447,486,519,549,580,609,650,652,717,748,872,892,894,924,1100,
ORF3a 42,43,57,68,99,125,131,240
ORF7a 70,92,121
ORF8 8,84,121
S 29,61,138,231,294,598,614,620,623,675,723,731,824,934,936,1044,1100

A note on the charts below: double-clicking a "haplotype" displays/highlights the data for that sequence only. Double-click anywhere on the graph to return to the full view.

A note on the ORF1a charts below: on dates where the 'haplotype' frequencies sum to less than 100%, this is because unique ORF1a haplotypes with a frequency below 1% were filtered out.
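The effect of the 1% filter can be seen in a small sketch. The haplotype labels and frequencies below are invented for illustration, and `filter_rare` is a hypothetical helper rather than the code used to produce the charts.

```python
def filter_rare(freqs, min_freq=0.01):
    """Keep only haplotypes whose frequency is at least min_freq."""
    return {h: f for h, f in freqs.items() if f >= min_freq}

# Made-up frequencies for one day; they sum to 100% before filtering.
day = {"hapA": 0.900, "hapB": 0.092, "hapC": 0.005, "hapD": 0.003}

kept = filter_rare(day)
shown_total = sum(kept.values())
print(sorted(kept), round(shown_total, 3))
```

Here the two rare haplotypes are dropped, so the plotted frequencies for that day add up to less than 100%.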