Multiple sequence alignment (MSA) lies at the heart of comparative genomics, structural biology and evolutionary inference. By arranging three or more nucleotide or amino acid sequences in a matrix, ...
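To make the matrix view concrete, here is a minimal sketch that assumes an alignment is stored as equal-length strings, with `-` marking a gap. The toy sequences and the helper names `column_consensus` and `conserved_columns` are hypothetical illustrations, not taken from any particular tool or from the work described here.

```python
from collections import Counter

# Toy alignment (hypothetical sequences): one row per sequence,
# one column per aligned position, '-' marks a gap.
alignment = [
    "ACG-TA",
    "ACGTTA",
    "AC--TA",
]


def column_consensus(rows):
    """Return the most common non-gap residue per column ('-' if all gaps)."""
    if len({len(r) for r in rows}) != 1:
        raise ValueError("all rows of an alignment must have equal length")
    consensus = []
    for column in zip(*rows):  # zip(*rows) walks the matrix column by column
        counts = Counter(c for c in column if c != "-")
        consensus.append(counts.most_common(1)[0][0] if counts else "-")
    return "".join(consensus)


def conserved_columns(rows):
    """Return indices of gap-free columns in which every row has the same residue."""
    return [
        i
        for i, column in enumerate(zip(*rows))
        if "-" not in column and len(set(column)) == 1
    ]


print(column_consensus(alignment))   # consensus string across columns
print(conserved_columns(alignment))  # positions identical in every sequence
```

Reading the matrix column by column is what lets downstream analyses treat each position as a set of putatively homologous residues; real pipelines would typically read alignments from FASTA or Stockholm files rather than hard-coded strings.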
Protein language models have demonstrated remarkable performance in predicting the effects of missense variants, but DNA language models have not yet shown a competitive edge for complex genomes such ...