
Arun Seetharam

Bioinformatician

Email Twitter Github

Current Projects

Overview

I work on many different projects; the primary ones I am currently working on are listed below. See the publications and resources pages for my contributions to other projects.

PanAnd - Harnessing convergence and constraint to predict adaptations to abiotic stress for maize and sorghum

The Andropogoneae tribe of grasses contains a thousand species that collectively represent over a billion years of evolutionary history. It has used NADP-C4 photosynthesis and a wide range of adaptations to become a dominant clade on earth. This project will use the diversity and evolution across this tribe to understand the rules of adaptive convergence and constraint in plant genomes. The project team will sample and analyze the worldwide spectrum of genetic diversity in Andropogoneae to develop detailed models testing whether (1) quantitative estimates of evolutionary constraint improve predictions of fitness-related traits, and (2) convergent environmental adaptations shared across the Andropogoneae explain a substantial proportion of total adaptive variance. These hypotheses will be tested by assembling the gene and regulatory content of 57 species as well as whole genome sequencing of another 700 species. For eight species, diversity across their natural range of adaptation will be surveyed at the sequence level. Evolutionary and machine learning models will be used to quantify the disruptive impact of a mutation in every ancestral genomic element. The inter- and intra-specific surveys will also permit an estimation of the prevalence of convergent evolution. This project addresses two key elements of the genotype-to-phenotype problem - how to quantify the disruptive impact of mutations and how to determine whether adaptive solutions to environmental stresses are convergently shared across species.

PI’s:

NSF Award Page can be found here.

Previous projects

Previous projects that I have worked on are listed here. The associated publications are in the publications tab.

Whole genome assembly of the maize NAM founders

Maize is an important crop and model organism for plant genetics. However, nearly all forms of sequence analysis are currently referenced to the single B73 inbred. Beyond B73, the most extensively researched maize lines are the core set of 25 inbreds known as the NAM founder lines, which represent a broad cross section of modern maize diversity. Prior data show that gene content can differ by more than 5% across lines and that as much as half of the functional genetic information lies outside of genes in highly variable intergenic spaces. To capture and utilize this variation, the NAM founder inbreds and a twenty-sixth line containing abnormal chromosome 10 will be sequenced and assembled using a mate-pair strategy. Scaffolds will be validated by BioNano optical mapping, and ordered and oriented using linkage data. RNA-seq data from multiple tissues will be used to annotate each genome, and assemblies and annotations will be released with genome browser support through MaizeGDB, NCBI, and CyVerse. Comparative genomic tools will be used to identify and catalog the maize pangenome, and to assess the role of structural variation such as presence-absence variation and copy number variation in the determination of agronomic traits. Results will be disseminated through a project web site and a CyVerse/Gramene/MaizeCODE Workshop at the annual Maize Genetics Conference.

PI’s:

NSF Award Page can be found here.

Project page is here.

Orphan Genes: An Untapped Genetic Reservoir of Novel Traits Driving Evolutionary Adaptation and Crop Improvement

The premise that new genes can arise from non-genic DNA sequences is borne out by massive DNA and RNA sequencing data. This concept sharply contrasts with the long-accepted view that novel gene functions primarily arise from a slow process of accumulated mutations and rearrangements of already-established genes. A hypothesis is that a major role of orphan genes is to regulate the defense and metabolic responses that enable evolutionary adaptation to new environments. This research will identify orphan genes of major agronomic species, focusing first on maize and Brassica. These results will inform a systematic analysis of orphan genes at the level of subspecies, thus categorizing orphan genes in the context of the adaptation and selection that has occurred as the result of human intervention for improved agronomic traits. Based on the resultant data, specific orphan genes will be selected for experimental functional analysis. Data will be integrated into community databases, and code will be available to the public. New computer game modules will be targeted to high school and early undergraduate students. The goal is to develop data and computational tools that facilitate predictive understanding of the function of orphan genes in driving evolutionary adaptation, to harness these resources for improving crops, and to disseminate the information to researchers and students. These capabilities will empower researchers to explore the significance of recently emerged orphan genes, and transform fundamental knowledge into innovative solutions that improve crop traits.

PI’s:

NSF Award Page can be found here.