A new perspective from Information Theory on genetic sequences

With Omri Tal, Max Planck Institute for Mathematics in the Sciences

I demonstrate conceptual links between features of population genetic samples and a core information-theoretic property. In essence, long stretches of genetic variants may be captured as ‘typical sequences’ of a nonstationary source modelled on the source population. I will introduce the concepts of typical genotypes, population entropy rate and mutual typicality, and their relation to the asymptotic equipartition property. The interplay of typical genotypes from different populations, and their geometric properties in high-dimensional space, will provide motivation for constructing simple typicality-based population assignment schemes, where the communication channel can be likened to an inferential channel. I will then highlight a surprising resilience of such schemes to noise in allele frequency estimates. Finally, I will discuss the prospects for further applications of ideas from information theory to the interpretation of population genetic data.
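
As a rough illustration of what a typicality-based assignment could look like, the sketch below compares a genotype's per-locus negative log-probability under each population with that population's entropy rate, and assigns the genotype to the population where the two agree most closely. This is not the speaker's scheme: it assumes haploid genotypes coded 0/1 at independent biallelic loci, allele frequencies known and strictly between 0 and 1, and all function names are hypothetical.

```python
import math


def locus_entropy(p):
    """Binary entropy, in bits, of a biallelic locus with allele frequency p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def entropy_rate(freqs):
    """Average per-locus entropy of a population (loci assumed independent)."""
    return sum(locus_entropy(p) for p in freqs) / len(freqs)


def neg_log_prob_rate(genotype, freqs):
    """Per-locus negative log2-probability of a 0/1 genotype under freqs.

    Assumes every frequency lies strictly between 0 and 1.
    """
    total = 0.0
    for allele, p in zip(genotype, freqs):
        total -= math.log2(p if allele == 1 else 1 - p)
    return total / len(freqs)


def typicality_gap(genotype, freqs):
    """Distance from typicality: small when the genotype is typical of freqs."""
    return abs(neg_log_prob_rate(genotype, freqs) - entropy_rate(freqs))


def assign(genotype, populations):
    """Return the name of the population with the smallest typicality gap."""
    return min(populations, key=lambda name: typicality_gap(genotype, populations[name]))


if __name__ == "__main__":
    # Toy data (assumed, not from the talk): two populations, 100 loci each.
    populations = {"A": [0.9] * 100, "B": [0.1] * 100}
    genotype = [1] * 90 + [0] * 10
    for name, freqs in populations.items():
        print(name, round(typicality_gap(genotype, freqs), 3))
    print("assigned to", assign(genotype, populations))
```

By the asymptotic equipartition property, the gap for the true source population shrinks as the number of loci grows, which is what makes such a comparison usable as a classifier in this simplified setting.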
