Repetitive DNA, especially that derived from transposable elements (TEs), makes up a large fraction of most eukaryotic genomes. Over 58% of the human genome is derived from the transposition and duplication of these prototypically selfish sequences. Although only 1 in 20 humans carries a new germline TE insertion, somatic TE activity in tumors, during neural development in the brain, or during the creation of pluripotent stem cells has a dramatic impact on the genome and potentially on human health.
We have developed two of the leading software packages used in the study of TEs: RepeatMasker (http://www.repeatmasker.org) for the high-quality annotation of TE copies in a genomic sequence, and RepeatModeler for the de novo discovery of TE families in newly sequenced genomes. In addition, we are developing the Dfam resource (https://www.dfam.org): a comprehensive database of TE families, sequence models, and genome annotations for eukaryotic genomes.
Current Project Leads:
Image attributed to John Kauffman