- Short report
- Open Access
Extrapolative microRNA precursor based SSR mining from tea EST database in respect to agronomic traits
BMC Research Notes volume 10, Article number: 261 (2017)
Tea (Camellia sinensis (L.) Kuntze) is considered the most popular drink in the world and is widely consumed for its health benefits. These beneficial traits depend primarily on the regulatory networks of different metabolic pathways. Microsatellite markers developed from conserved genomic regions are valuable for assessing the genetic diversity of closely related or self-pollinated species. Although several SSR markers have been reported in tea, trait-specific simple sequence repeat (SSR) markers that would be useful in marker-assisted breeding are yet to be identified. MicroRNAs (miRNAs) are short, non-coding RNA molecules involved in post-transcriptional gene regulation and thereby affect the related phenotypes. The present study identifies microsatellite motifs within reported and predicted miRNA precursors and then designs primers from the SSR flanking regions for PCR validation. In addition to earlier reports, two new miRNAs are predicted here from the tea expressed sequence tag (EST) database. Furthermore, 18 SSR motifs were found in 13 of the 33 predicted miRNAs. Trinucleotide motifs were the most abundant, followed by dinucleotides. Since miRNA-based SSR markers have been shown to play a significant role in genetic fingerprinting studies, these outcomes would pave the way for developing novel markers for tagging tea-specific agronomic traits and for supporting non-conventional breeding programmes.
Tea (Camellia sinensis) is the most popular beverage consumed across the world and, being a cash crop, receives much attention from the scientific community. This research interest stems largely from its strong demand among health-conscious people, owing to its antioxidant and broad-spectrum therapeutic potential [1, 2]. China is the largest tea producer and exporter, followed by India and Sri Lanka. Fermented tea, or black tea, is the most commonly consumed type, although antioxidant and other health-promoting properties are highest in non-fermented green tea. As a result, demand for quality tea has increased considerably among end users and tea planters. Identifying superior cultivars with better agronomic traits remains a challenge of continuing interest to researchers.
Marker-assisted breeding for selecting or developing cultivars with a desired trait from a large population is well established. Among the different markers used in crop improvement and molecular breeding, microsatellite markers are widely used because they are reliable and time-saving. Moreover, being co-dominant, abundant, hyper-variable and amenable to high-throughput analysis, microsatellite markers are considered ideal for plant genetic linkage mapping, physical mapping, population studies, genotype identification and crop improvement. Prominent markers of this kind, such as SSR, ISSR and EST-SSR, have been used effectively in several crop improvement programmes [7,8,9,10,11]. MicroRNA precursor-based SSR markers are a very recent addition to this field and have mostly been used in marker-trait association analysis in several species [12,13,14,15]. MicroRNAs are short, non-coding RNA molecules involved in post-transcriptional gene regulation and thereby affect the related phenotypes [17, 18]. A large number of molecular markers are available for tea so far; however, very few are reported to be linked with a specific trait. Therefore, the exploration and characterization of novel and already available markers are of prime interest. Considering the above, the present study aims to identify microsatellite motifs within reported and predicted miRNA precursors and then to design primers for PCR validation.
Materials and methods
Retrieval of data, filtering and trimming
Previously predicted tea miRNA candidates were collected from the available literature [21,22,23]. Furthermore, to screen for additional pre-miRNAs in the tea EST database as updated to November 2016, a standard miRNA screening methodology was followed with minor customization. All reported miRNAs of Viridiplantae were retrieved from the online repository miRBase v21.0, and the entire EST collection of tea was retrieved from NCBI dbEST; redundant sequences were then eliminated and poly(A) tails trimmed using PRINSEQ v0.20.4.
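The preprocessing step can be illustrated with a minimal Python sketch. It is not PRINSEQ and does not reproduce its options; the function names, the toy sequences and the minimum tail length of 5 are assumptions made only for illustration.

```python
import re


def trim_poly_a(seq, min_tail=5):
    """Remove a trailing run of at least `min_tail` A residues (assumed threshold)."""
    return re.sub(r"A{%d,}$" % min_tail, "", seq.upper())


def deduplicate(records):
    """Keep the first record for each exact sequence and drop later duplicates."""
    seen = set()
    unique = {}
    for name, seq in records.items():
        if seq not in seen:
            seen.add(seq)
            unique[name] = seq
    return unique


# Toy ESTs (hypothetical): est1 equals est2 once its poly(A) tail is trimmed.
ests = {
    "est1": "ATGGCGTACGTTAGCAAAAAAAAAA",
    "est2": "ATGGCGTACGTTAGC",
    "est3": "GGCTTAGCCATG",
}
trimmed = {name: trim_poly_a(seq) for name, seq in ests.items()}
non_redundant = deduplicate(trimmed)
```

In this sketch trimming is applied before deduplication, so sequences that differ only in their tails collapse into one record; the order used in the actual pipeline is not stated in the text.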
Prediction of miRNAs
The set of published miRNAs was used for a homology search against the tea EST collection. The best hits, with an aligned length of 18 to 26 nucleotides and no more than 3 mismatches, were taken for further analysis. After elimination of protein-coding transcripts using BLASTx, the remaining candidates were subjected to stem-loop structure prediction with Mfold to check whether they could exist as pre-miRNAs. Potential miRNAs were selected using the following criteria: (a) the mature miRNA lies on one arm of the hairpin; (b) at least 14 residues of the miRNA are paired and no more than 5 are unpaired; (c) the miRNA contains at most 5 G–U pairs; (d) the maximum bulge size is 3 nt; (e) the minimal folding free energy (MFE) is low (≤−18 kcal/mol); and (f) the minimal folding free energy index (MFEI = [(MFE/length of the RNA sequence) × 100]/(G+C)%) is high (>0.85) [30,31,32].
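Criteria (b) to (f) can be written as a short Python sketch. The function names and the toy precursor are hypothetical, and the structural counts (paired and unpaired residues, G–U pairs, bulge size) are assumed to have been read from the Mfold output beforehand. The magnitude of the MFE is used in the index, on the assumption that the >0.85 threshold refers to a positive value.

```python
def gc_percent(seq):
    """Percentage of G and C residues in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def mfei(mfe_kcal, seq):
    """MFEI = [(|MFE| / length) * 100] / (G+C)%; the absolute value is an assumption."""
    adjusted_mfe = abs(mfe_kcal) / len(seq) * 100.0
    return adjusted_mfe / gc_percent(seq)


def passes_filters(mfe_kcal, seq, paired, unpaired, gu_pairs, max_bulge):
    """Apply criteria (b) to (f); criterion (a) needs the structure itself."""
    return (
        paired >= 14
        and unpaired <= 5
        and gu_pairs <= 5
        and max_bulge <= 3
        and mfe_kcal <= -18.0
        and mfei(mfe_kcal, seq) > 0.85
    )


# Toy precursor (hypothetical): 100 nt with 40% G+C.
toy_seq = "GC" * 20 + "AT" * 30
strong = passes_filters(-40.0, toy_seq, paired=18, unpaired=3, gu_pairs=2, max_bulge=2)
weak = passes_filters(-20.0, toy_seq, paired=18, unpaired=3, gu_pairs=2, max_bulge=2)
```

For the toy sequence, an MFE of −40 kcal/mol gives an MFEI of about 1.0 and passes, whereas −20 kcal/mol gives about 0.5 and fails the index threshold even though it meets the MFE cutoff.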
MicroRNA target predictions and their function
The miRNAs predicted exclusively in this study were analysed for their putative target genes using the psRNATarget server with default parameters. Subsequently, to identify the functions of the predicted targets, they were searched with BLAST at NCBI. Finally, a complete list was prepared of all previously and presently reported tea miRNAs with their putative functions.
Exploration of SSRs within predicted microRNAs
The simple sequence repeat motifs within all available and reported pre-miRNA sequences were identified with the WebSat online program. The parameters were set to identify perfect di-, tri-, tetra-, penta- and hexa-nucleotide motifs with minimum repeat numbers of 6, 4, 3, 3 and 3, respectively.
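The search for perfect repeats under these thresholds can be sketched with regular expressions in Python. This is an illustration of the thresholds only, not WebSat's own algorithm; the function name and the toy sequence are hypothetical.

```python
import re

# Minimum repeat numbers for di- to hexa-nucleotide motifs, as stated in the text.
MIN_REPEATS = {2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return (motif, repeat_count, start, end) for each perfect SSR found."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A unit of fixed length followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are copies of a shorter unit (AAA, ATAT, GAAGAA).
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            repeats = len(match.group(0)) // unit_len
            hits.append((motif, repeats, match.start(), match.end()))
    return sorted(hits, key=lambda hit: hit[2])


# Toy sequence (hypothetical): (AT)6 and (GAA)4 embedded in flanking bases.
toy = "GGC" + "AT" * 6 + "CCG" + "GAA" * 4 + "TT"
ssrs = find_ssrs(toy)
```

Positions are 0-based with an exclusive end, so the two motifs in the toy sequence are reported at 3 to 15 and 18 to 30. The check on shorter units keeps (AT)6 from also being counted as (ATAT)3.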
Designing primers from SSR flanking region
Primer pairs from the SSR flanking regions were designed with the BatchPrimer3 server. The parameters were set as follows: primer length of 18–23 nucleotides with 21 as the optimum; PCR product size of 100–400 bp; optimum annealing temperature of 55 °C; and GC content of 40–60% with 50% as the optimum.
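As a rough illustration, the length, GC content and product size limits above can be expressed as a simple acceptance check in Python. This is only a sketch of the stated ranges and does not reproduce BatchPrimer3; the function names and primer sequences are hypothetical, and the annealing temperature is not evaluated because the text does not state how it was calculated.

```python
def gc_content(primer):
    """Percentage of G and C bases in a primer."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def within_design_limits(forward, reverse, product_size):
    """Check primer length (18-23 nt), GC content (40-60%) and product size (100-400 bp)."""
    for primer in (forward, reverse):
        if not 18 <= len(primer) <= 23:
            return False
        if not 40.0 <= gc_content(primer) <= 60.0:
            return False
    return 100 <= product_size <= 400


# Hypothetical primer pair used only to exercise the checks.
fwd = "ATGGCGTACGTTAGCCTAGCA"   # 21 nt, 11 G/C
rev = "CCTTGAGCATTCGGATCA"      # 18 nt, 9 G/C
accepted = within_design_limits(fwd, rev, 250)
too_long = within_design_limits(fwd, rev, 450)
```

A pair that satisfies these limits would still need its melting temperature and specificity checked in a real design run.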
Results and discussion
In the present study, previously reported miRNA sequences were used to find their homologs in tea, because mature plant miRNAs are known to be highly conserved within the plant kingdom, and miRNA genes in one species may exist as orthologs or homologs in other species [30, 37]. On this basis, known miRNAs were used to discover novel potential miRNAs in tea. All 8442 miRNAs reported from Viridiplantae so far were used; after elimination of 3676 exact duplicates, the non-redundant collection was taken for further analysis. Similarly, the tea EST collection of 49,670 sequences was reduced to 40,686 non-redundant sequences. The BLAST search retrieved 52 ESTs with the required level of homology, i.e. a match of at least 18 nt with no more than 3 mismatches. Fifteen of them had already been reported as pre-miRNAs [21,22,23]. The remaining candidates were analysed with Mfold, because miRNA precursors should be able to form a stem-loop hairpin in their secondary structure for processing by the Dicer enzyme, and likely false miRNA precursors were then removed manually. In the present study, two more potential pre-miRNAs (Fig. 1) were identified, with accessions JK478587.1 and FS955851.1, containing miR1533 and miR8002-3p, respectively.
The identified pre-miRNAs, belonging to two families, had sequence lengths of 419 and 787 bp, respectively (Table 1). Length variation was also evident in previous reports [22, 39, 40]. In addition, a commonly studied parameter in miRNA prediction is the minimum free energy (MFE), which indicates the stability of the RNA secondary structure; longer pre-miRNA sequences generally have lower MFEs. Here, the MFE values calculated by the Mfold server were −89.99 and −155.67 kcal/mol (Table 1). The MFE index (MFEI) was also calculated to distinguish miRNAs from other RNAs more precisely [41, 42]. Plant pre-miRNAs should have an MFEI greater than 0.85, whereas mRNAs, tRNAs and rRNAs have lower MFEI values. In this study, both identified candidates had MFEI values above 1, supporting their likely existence. Moreover, sequence alignment of the new tea miRNAs with their reported homologs showed that only the first nucleotide was missing in miR1533, while a few nucleotides were missing from both ends of miR8002-3p, which also had a single mismatch along its length; this strengthens the findings of this computational prediction.
Plant microRNAs regulate transcript expression during growth, development and stress responses by altering leaf morphology and polarity, organ development, lateral root formation, hormone signalling, cell death, signal transduction, cell differentiation and proliferation, the transition from the juvenile to the adult vegetative phase and from the vegetative to the flowering phase, flowering time, floral organ identity and reproduction [41, 43,44,45,46]. Because miRNAs have been reported to regulate gene expression through mRNA cleavage or translational inhibition, obtaining information about the target genes of the potential microRNAs was an essential part of the study. Since full genomic information is lacking for tea, the DFCI Gene Indices of Arabidopsis thaliana, Glycine max, Zea mays, Solanum tuberosum and others were used as target databases. Of the two microRNAs reported here, miR1533 was not found to have any target within the threshold values of the input parameters. The other, miR8002-3p, was predicted to cleave transcripts of a general transcription factor 2-related zinc finger protein (target accession: AT1G42710.1) and phosphoglycerate mutase (target accession: TC194811). In addition, the target genes of the other predicted tea miRNAs were gathered from the literature to elucidate their role in pre-miRNA SSR-based polymorphism assessment. Zhu and Luo reported that their predicted miRNA target genes encoded transcription factors and proteins involved in stress response, transmembrane transport, signal transduction and transcription regulation. Target genes encoding transcription factors and cell integrity maintenance machinery during stress response were also mentioned by Prabu and Mandal. A single miRNA with multiple targets forms a systems-biology network when it controls the expression of different transcription factors, which in turn regulate specific genes for different metabolic processes.
Microsatellite markers are a widely used tool for estimating genetic variation, and are especially used for constructing linkage maps, understanding marker-trait associations and identifying disease-resistance loci [14, 49]. Simple sequence repeats (SSRs) in the genome are inherently unstable and therefore highly polymorphic. It has been suggested that the mutation rate rises with increases in repeat unit number and repeat tract length. Some reports have established that SSR expansions or contractions within genome sequences can affect the functions of these sequences and even lead to phenotypic changes. The effect of SSR unit variation within protein-coding regions has been demonstrated. However, its consequences in non-coding transcripts are less studied. Very few reports have demonstrated the significance of analysing SSRs in non-coding miRNAs [12,13,14,15, 49]. In the current study, 13 of the 33 predicted pre-miRNAs had one or more SSR motifs (Table 1). Nine sequences had an SSR motif in a single region, 3 sequences contained SSR motifs at two locations and 1 sequence had 3 different SSR motifs. Trinucleotide motifs were the most abundant, followed by dinucleotides. There was one each of the penta- and hexa-nucleotide motifs. Forward and reverse primers could be generated from the flanking regions of every microsatellite motif except one, where the SSR motif lay at the terminal end of the sequence. The microsatellite motifs may be conserved within a group or serve as a signature, which might be used in genetic fingerprinting studies of tea. Some traits depend on specific microsatellite repeats and their numbers. This has been exploited efficiently by rice researchers, who used miRNA-SSR markers to differentiate salt-tolerant and salt-susceptible genotypes. They found more repeat variation in the salt-responsive miRNA genes among the susceptible rice genotypes than among the tolerant ones. Similar work can be applied in tea to distinguish cultivars with varying agronomic traits on the basis of miRNA-SSR polymorphism.
Finally, the newly predicted microRNAs in tea would enrich the existing collection in the absence of whole-genome information for tea and would facilitate subsequent experimental validation. Some of them might be related to certain metabolic functions and thereby to phenotypes as well. Mining microsatellite motifs from predicted microRNA precursors and designing primers from the SSR flanking regions would pave the way for developing novel markers for tagging tea-specific agronomic traits as well as for accelerating non-conventional breeding programmes. This can readily be followed by genetic diversity assessment of tea cultivars with varying characters.
Hayat K, Iqbal H, Malik U, Bilal U, Mushtaq S. Tea and its consumption: benefits and risks. Crit Rev Food Sci Nutr. 2015;55(7):939–54.
Khan N, Mukhtar H. Tea and health: studies in humans. Curr Pharm Des. 2013;19(34):6141–7.
Basu Majumder A, Bera B, Rajan A. Tea statistics: global scenario. Int J Tea Sci. 2010;8(1):121–4.
Carloni P, Tiano L, Padella L, Bacchetti T, Customu C, Kay A, et al. Antioxidant activity of white, green and black tea obtained from the same tea cultivar. Food Res Int. 2013;53(2):900–8.
Bang H, Kim S, Leskovar D, King S. Development of a codominant CAPS marker for allelic selection between canary yellow and red watermelon based on SNP in lycopene β-cyclase (LCYB) gene. Mol Breed. 2007;20(1):63–72.
Morgante M, Olivieri A. PCR-amplified microsatellites as markers in plant genetics. Plant J. 1993;3(1):175–82.
Roychowdhury R, Taoutaou A, Hakeem KR, Gawwad MRA, Tah J. Molecular marker-assisted technologies for crop improvement. In: Roychowdhury R, editor. Crop improvement in the era of climate change; 2013. p. 241–58.
Kumar S, Rajendran K, Kumar J, Hamwieh A, Baum M. Current knowledge in lentil genomics and its application for crop improvement. In: Kumar S, editor. Crop breeding: bioinformatics and preparing for climate change. USA: CRC Press; 2016. p. 309–27.
Varshney RK, Graner A, Sorrells ME. Genic microsatellite markers in plants: features and applications. Trends Biotechnol. 2005;23(1):48–55.
Singh RB, Srivastava S, Rastogi J, Gupta GN, Tiwari NN, Singh B, et al. Molecular markers exploited in crop improvement practices. Res Environ Life Sci. 2014;7(4):223–32.
Kesawat MS, Kumar BD. Molecular markers: it’s application in crop improvement. J Crop Sci Biotechnol. 2009;12(4):169–81.
Wang X, Gui S, Pan L, Hu J, Ding Y. Development and characterization of polymorphic microRNA-based microsatellite markers in Nelumbo nucifera (Nelumbonaceae). Appl Plant Sci. 2016;4(1):1500091.
Nithin C, Patwa N, Thomas A, Bahadur RP, Basak J. Computational prediction of miRNAs and their targets in Phaseolus vulgaris using simple sequence repeat signatures. BMC Plant Biol. 2015;15(1):140.
Ganie SA, Mondal TK. Genome-wide development of novel miRNA-based microsatellite markers of rice (Oryza sativa) for genotyping applications. Mol Breed. 2015;35(1):51.
Mondal TK, Ganie SA. Identification and characterization of salt responsive miRNA-SSR markers in rice (Oryza sativa). Gene. 2014;535(2):204–9.
Großhans H, Filipowicz W. Molecular biology: the expanding world of small RNAs. Nature. 2008;451(7177):414–6.
Fondon JW, Garner HR. Molecular origins of rapid and continuous morphological evolution. Proc Natl Acad Sci. 2004;101(52):18058–63.
Kashi Y, King DG. Simple sequence repeats as advantageous mutators in evolution. Trends Genet. 2006;22(5):253–9.
Mukhopadhyay M, Mondal TK, Chand PK. Biotechnological advances in tea (Camellia sinensis [L.] O. Kuntze): a review. Plant Cell Rep. 2016;35(2):255–87.
Elangbam M, Misra A. Development of CAPS markers to identify Indian tea (Camellia sinensis) clones with high catechin content. Genet Mol Res. 2016;15(2):1–13.
Prabu G, Mandal A. Computational identification of miRNAs and their target genes from expressed sequence tags of tea (Camellia sinensis). Genom Proteom Bioinform. 2010;8(2):113–21.
Das A, Mondal TK. Computational identification of conserved microRNAs and their targets in tea (Camellia sinensis). Am J Plant Sci. 2010;1(02):77.
Q-w Zhu, Y-p Luo. Identification of miRNAs and their targets in tea (Camellia sinensis). J Zhejiang Univ Sci B. 2013;14(10):916–23.
Zhang B, Pan X, Anderson TA. Identification of 188 conserved maize microRNAs and their targets. FEBS Lett. 2006;580(15):3753–62.
Griffiths-Jones S, Saini HK, van Dongen S, Enright AJ. miRBase: tools for microRNA genomics. Nucleic Acids Res. 2008;36(suppl 1):D154–8.
Boguski MS, Lowe TM, Tolstoshev CM. dbEST—database for “expressed sequence tags”. Nat Genet. 1993;4(4):332–3.
Schmieder R, Edwards R. Quality control and preprocessing of metagenomic datasets. Bioinformatics. 2011;27(6):863–4.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10.
Zuker M. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 2003;31(13):3406–15.
Zhang B, Pan X, Cannon CH, Cobb GP, Anderson TA. Conservation and divergence of plant microRNA genes. Plant J. 2006;46(2):243–59.
Li X, Hou Y, Zhang L, Zhang W, Quan C, Cui Y, et al. Computational identification of conserved microRNAs and their targets from expression sequence tags of blueberry (Vaccinium corybosum). Plant Signal Behav. 2014;9(9):e29462.
Zhang B, Pan X, Cox S, Cobb G, Anderson T. Evidence that miRNAs are different from other RNAs. Cell Mol Life Sci. 2006;63(2):246–54.
Dai X, Zhao PX. psRNATarget: a plant small RNA target analysis server. Nucleic Acids Res. 2011;39(suppl 2):W155–9.
Martins WS, Lucas DCS, Neves KdS, Bertioli DJ. WebSat—a web software for microsatellite marker development. Bioinformation. 2009;3(6):282–3.
You FM, Huo N, Gu YQ, M-c Luo, Ma Y, Hane D, et al. BatchPrimer3: a high throughput web application for PCR and sequencing primer design. BMC Bioinform. 2008;9(1):253.
Yu Y, Yuan D, Liang S, Li X, Wang X, Lin Z, et al. Genome structure of cotton revealed by a genome-wide SSR genetic map constructed from a BC 1 population between Gossypium hirsutum and G. barbadense. BMC Genom. 2011;12(1):15.
Weber MJ. New human and mouse microRNA genes found by homology search. FEBS J. 2005;272(1):59–73.
Kurihara Y, Watanabe Y. Arabidopsis micro-RNA biogenesis through Dicer-like 1 protein functions. Proc Natl Acad Sci USA. 2004;101(34):12753–8.
Biswas S, Hazra S, Chattopadhyay S. Identification of conserved miRNAs and their putative target genes in Podophyllum hexandrum (Himalayan Mayapple). Plant Gene. 2016;6:82–9.
Patanun O, Lertpanyasampatha M, Sojikul P, Viboonjun U, Narangajavana J. Computational identification of microRNAs and their targets in cassava (Manihot esculenta Crantz.). Mol Biotechnol. 2013;53(3):257–69.
Bonnet E, Wuyts J, Rouzé P, Van de Peer Y. Evidence that microRNA precursors, unlike other non-coding RNAs, have lower folding free energies than random sequences. Bioinformatics. 2004;20(17):2911–7.
Zhang BH, Pan XP, Wang QL, George PC, Anderson TA. Identification and characterization of new plant microRNAs using EST analysis. Cell Res. 2005;15(5):336–60.
Pandey B, Gupta OP, Pandey DM, Sharma I, Sharma P. Identification of new stress-induced microRNA and their targets in wheat using computational approach. Plant Signal Behav. 2013;8(5):e23932.
Mallory AC, Vaucheret H. Functions of microRNAs and related small RNAs in plants. Nat Genet. 2006;38:S31–6.
Wang X-J, Reyes JL, Chua N-H, Gaasterland T. Prediction and identification of Arabidopsis thaliana microRNAs and their mRNA targets. Genome Biol. 2004;5(9):R65.
Chen X. Small RNAs and their roles in plant development. Annu Rev Cell Develop. 2009;25:21–44.
Chen R, Hu Z, Zhang H. Identification of microRNAs in wild soybean (Glycine soja). J Integr Plant Biol. 2009;51(12):1071–9.
Zhang B, Pan X, Cobb GP, Anderson TA. Plant microRNA: a small regulatory molecule with big impact. Develop Biol. 2006;289(1):3–16.
Chen M, Tan Z, Zeng G, Peng J. Comprehensive analysis of simple sequence repeats in pre-miRNAs. Mol Biol Evol. 2010;27(10):2227–32.
Heesacker A, Kishore VK, Gao W, Tang S, Kolkman JM, Gingle A, et al. SSRs and INDELs mined from the sunflower EST database: abundance, polymorphisms, and cross-taxa utility. Theor Appl Genet. 2008;117(7):1021–9.
Katti MV, Ranjekar PK, Gupta VS. Differential distribution of simple sequence repeats in eukaryotic genome sequences. Mol Biol Evol. 2001;18(7):1161–7.
AH and SD conceived the work, designed the methodology, interpreted the data and wrote the manuscript. NDG and CS participated in data interpretation and manuscript writing. All authors read and approved the final manuscript.
AH is thankful to the National Tea Research Foundation, Tea Board, India, for providing a research fellowship.
The authors declare that they have no competing interests.
Availability of data and materials
The data were retrieved from public databases; no new data were deposited.
National Tea Research Foundation, Tea Board, India.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Hazra, A., Dasgupta, N., Sengupta, C. et al. Extrapolative microRNA precursor based SSR mining from tea EST database in respect to agronomic traits. BMC Res Notes 10, 261 (2017) doi:10.1186/s13104-017-2577-x
- Micro RNA
- Simple sequence repeats
- Tea quality
- Trait specific marker