October 1, 2017 at 4:30 pm

MCB 7410 Seminar | Evolutionary Analysis of Essential Genes in Arabidopsis, Oct. 31

The MCB 7410 Seminar presents Yuan Zhang discussing “From Genotype to Phenotype: Evolutionary Analysis of Essential Genes in Arabidopsis” on Tuesday, Oct. 31, at 4:35 p.m. in Porter 104.

Zhang is a graduate student in Environmental & Plant Biology at Ohio University.

Refreshments are provided!

Abstract: Large-scale functional genomics datasets and available mutant libraries provide useful resources for understanding gene functions in model organisms. However, large-scale genotype-to-phenotype studies in higher model organisms are still lacking, mainly because of the time and resources required to cultivate and phenotype genetic mutants. High gene duplication rates further complicate the problem. In 2010, Lloyd and Meinke built a database of Arabidopsis genes with documented mutant phenotypes by gathering published phenotype information online, using the TAIR database and keyword searches on NCBI. They discovered that only around 15% of Arabidopsis genes are connected to well-curated phenotypes. Examining the Arabidopsis phenotype database raised two questions: 1) what are the gene features of plant essential genes; and 2) can these gene features be used to predict essential genes in Arabidopsis and other organisms? To answer these questions, Lloyd and colleagues collected 3443 essential genes of Arabidopsis (12.7% of protein-coding genes) from literature and database searches. A gene is considered essential if its mutant shows phenotypic changes. By comparing genes in the phenotype dataset with genes not in the dataset (undocumented genes), they found that essential genes are more often single-copy, date from older whole-genome duplication (WGD) events, are highly expressed, are more conserved, and are more likely to be co-expressed with other genes. Based on these features, Lloyd’s research group developed a machine learning framework and used it to predict essential genes among 23,763 undocumented Arabidopsis genes. They found that 1970 of the undocumented genes in Arabidopsis are likely essential. Finally, they further optimized and trained their model for cross-species gene prediction. Prediction rates of around 80% and 75% were achieved for Arabidopsis-rice and Arabidopsis-yeast gene prediction, respectively.
Taken together, the findings of this study further our understanding of the relationship between DNA sequence and gene function. The gene prediction model developed in this study provides a useful tool for functional studies and gene annotation of newly sequenced genomes.


  1. Lloyd et al. Plant Cell. 2015; 27: 2133-2147
  2. Meinke et al. Trends Plant Sci. 2008; 13: 483-491
  3. Lloyd et al. Plant Physiol. 2012; 158: 1115-1129
