Major databases for studying gene expression in plants (A case study: Rice)

Document Type : Review Paper

Authors

1 Ph.D. Graduate, Dept. of Agricultural Biotechnology, Faculty of Agricultural Sciences, University of Guilan, Rasht, Iran

2 Prof., Dept. of Agronomy and Plant Breeding, Faculty of Agricultural Sciences, University of Guilan, Rasht, Iran

Abstract

Rice, one of the most important crops, has the smallest genome among the cereals and is considered a model plant for genetic studies. The small size of its genome has encouraged comprehensive studies, which have produced large amounts of data. It is very important to collect and store these data properly and to manage data from different experiments in databases, so that researchers can access them, avoid duplicating work, and compare their results with those of other researchers. Bioinformatics can be a great help in achieving this. The creation and development of specialized databases, and the use of bioinformatics tools for data processing, efficient organization, analysis and visualization, are therefore necessary. In this paper, the major databases for studying gene expression in rice at three levels (RNA, protein and metabolome) are reviewed, and the characteristics of the databases at each level are discussed.
