Tropical Medicine and Health
Online ISSN : 1349-4147
Print ISSN : 1348-8945
ISSN-L : 1348-8945
Original articles
Evidence for Genetic Reassortment between Human Rotaviruses by Full Genome Sequencing of G3P[4] and G2P[4] Strains Co-circulating in India
T. N. Hoa Tran, Toyoko Nakagomi, Osamu Nakagomi

2013 Volume 41 Issue 1 Pages 13-20

Abstract

Rotavirus A causes severe diarrhoea in infants and young children worldwide. Many unusual combinations of G and P genotypes have been observed in rotaviruses circulating in developing countries. Mixed infection of a single individual with more than one strain is a mechanism by which genetic reassortants with unusual G and P combinations are formed. However, few studies have provided direct evidence that such unusual strains arise from co-infection with co-circulating strains. Here, we used full-genome sequencing to re-analyze a G3P[4] strain (107E1B) and a G2P[4] strain (116E3D) detected in India in 1993 and showed that the nucleotide sequence of 107E1B was virtually identical to that of 116E3D, except for the VP7 gene. Phylogenetic analysis revealed that the 107E1B VP7 gene was of typical human rotavirus origin, with a 99.3% nucleotide sequence identity with another Indian G3 VP7 gene. Thus, this study provided robust evidence for the formation of the G3P[4] strain through genetic reassortment in which a G2P[4] strain with a typical DS-1 genogroup background acquired the VP7 gene from a co-circulating G3 human rotavirus strain. This study established a basis on which to facilitate full genome sequence analysis of an increasing number of G3P[4] strains in China and elsewhere in the world.

Introduction

Rotavirus A, a member of the genus Rotavirus within the family Reoviridae [1], is a major etiological agent of acute gastroenteritis in infants and young children worldwide and claims an estimated 453,000 lives each year among children less than five years of age, primarily in developing countries [2]. The genome of rotavirus consists of 11 segments of double-stranded RNA, and each genome segment codes for either a single structural or a single non-structural protein, except genome segment 11, which codes for the two out-of-phase non-structural proteins NSP5 and NSP6 [1]. VP7 and VP4 have attracted particular attention because these outer-capsid proteins were shown to be independently involved in viral neutralization [1]. Since VP7 is a glycoprotein, the genotype defined by VP7 is referred to as the G type, while the genotype defined by VP4 is referred to as the P type because VP4 is sensitive to protease [3]. This binary nomenclature has been used extensively to describe strain distribution in many molecular epidemiological studies, and it has been established that five genotype combinations, i.e., G1P[8], G2P[4], G3P[8], G4P[8], and G9P[8], account for more than 80% of genotypes carried by human rotaviruses worldwide [4–7].

More recently, reflecting the progress of full genome characterization of rotavirus strains, a nucleotide sequence-based, complete genome classification system was proposed by the Rotavirus Classification Working Group and is now widely used. The genome of individual rotavirus strains is given the complete descriptor of Gx-P[x]-Ix-Rx-Cx-Mx-Ax-Nx-Tx-Ex-Hx (“x” representing the genotype number) for denoting the VP7-VP4-VP6-VP1-VP2-VP3-NSP1-NSP2-NSP3-NSP4-NSP5 genes [8, 9]. Under this classification system, what was previously described by RNA-RNA hybridization assays as two major and one minor genogroups of human rotaviruses, the Wa, the DS-1 and the AU-1 genogroups, are now described as three genotype constellations, G1-P[8]-I1-R1-C1-M1-A1-N1-T1-E1-H1, G2-P[4]-I2-R2-C2-M2-A2-N2-T2-E2-H2, and G3-P[9]-I3-R3-C3-M3-A3-N3-T3-E3-H3, respectively [10].

Like other viruses carrying a segmented genome, rotavirus undergoes frequent reassortment upon co-infection of a single cell with two different virus strains, and this genetic reassortment has been regarded as a major mechanism of the evolution of rotaviruses under natural conditions. Since an unusual combination of G and P genotypes is more easily detected in molecular epidemiological studies, quite a few reports have identified intergenogroup reassortants by genogrouping using the RNA-RNA hybridization or by full genome analysis [11–18]. However, it was unclear in many cases whether these intergenogroup reassortants had been created in the distant past or formed between locally co-circulating strains.

Among candidate intergenogroup reassortants, G3P[4] strains deserve mention because they sometimes appeared as multiple strains in single studies. For example, one study conducted in Ghana showed that 64% (n = 32) of the strains circulating in 1998 had G3P[4] [19], and another study in Ghana showed that 11% (n = 27) of strains circulating during the 1998–2000 period possessed G3P[4] [20]. More recently in China, 91 specimens, accounting for 8.6% of total rotavirus-positive stool specimens, contained G3P[4] rotaviruses [21]. In addition, abundant co-circulating strains carrying G2P[4] or G3P[8], which were thought to be the parental strains of the G3P[4] reassortants, were described [19, 21–23]. However, there have been very few reports providing direct evidence that genetic reassortants between two human rotavirus genogroups were formed between locally co-circulating strains under natural conditions.

In this regard, Nakagomi et al. [24] reported molecular analysis by co-electrophoresis and RNA-RNA hybridization of G2P[4] and G3P[4] human rotavirus strains detected in New Delhi, India in 1993, carrying an apparently identical electropherotype, and suggested that the G3P[4] strain resulted from the acquisition of G3 VP7 gene by a co-circulating G2P[4] strain. The present study was undertaken to show by full genome sequencing analysis that a G3P[4] strain (107E1B) and another, locally co-circulating G2P[4] strain (116E3D) had virtually identical nucleotide sequences between all cognate genome segments except the VP7 genes and, thereby, to provide the first definitive, molecular evidence for the occurrence of reassortment between concurrently circulating human rotavirus strains.

Materials and Methods

RNA extraction, reverse transcription-PCR, and nucleotide sequencing

The genomic RNAs were extracted from the seed stocks of strains 107E1B (G3P[4]) and 116E3D (G2P[4]) [24] using a QIAamp Viral RNA Mini kit (Qiagen Sciences, Germantown, MD, USA) according to the manufacturer’s instructions. Complementary DNA was synthesized from the extracted RNA using random hexamers (SuperScript III First-Strand Synthesis System, Invitrogen Corp., Carlsbad, CA, USA).

The polymerase chain reactions (PCR) were conducted with specific primers [8, 25, 26] using GoTaq Green Master Mix (Promega Corporation, Madison, WI, USA) to amplify all gene segments except the VP7 gene, the complete nucleotide sequence of which is available in the DNA databases (accession numbers AB081594 for 107E1B and AB081593 for 116E3D).

The resulting PCR products were subjected to nucleotide sequencing reactions using the BigDye Terminator Cycle Sequencing Ready Reaction Kit, version 3.1 (Applied Biosystems, Foster City, CA, USA), and the sequences were determined with an automated 3730 DNA sequencer (Applied Biosystems).

Nucleotide sequence analysis

Nucleotide sequences were assembled using the Lasergene 8 software package (DNAstar, Inc. Madison, WI, USA). Pair-wise nucleotide sequence identities were calculated with the MEGA v.5.0 package [27], and multiple sequence alignment was carried out by the CLUSTALW program. Genetic distances were calculated by the Kimura two-parameter method, and phylogenetic trees were then constructed by the neighbor-joining method. The bootstrap probability at a branching point was calculated with 1,000 pseudo-replicate datasets.
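The distance measures named above can be illustrated with a minimal Python sketch. It is not the MEGA implementation; it assumes two pre-aligned sequences of equal length, skips any site with a gap or ambiguous base (pairwise deletion), and uses the standard Kimura two-parameter formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions. The function names are illustrative only.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_sites(seq_a, seq_b):
    """Return (compared sites, transitions, transversions) for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: gaps and ambiguous bases are ignored
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    return sites, transitions, transversions


def p_distance(seq_a, seq_b):
    """Proportion of compared sites at which the two sequences differ."""
    sites, ts, tv = classify_sites(seq_a, seq_b)
    return (ts + tv) / sites


def percent_identity(seq_a, seq_b):
    """Pairwise identity expressed as (1 - p-distance) x 100."""
    return (1.0 - p_distance(seq_a, seq_b)) * 100.0


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance from transition and transversion proportions."""
    sites, ts, tv = classify_sites(seq_a, seq_b)
    p = ts / sites
    q = tv / sites
    return -0.5 * math.log((1.0 - 2.0 * p - q) * math.sqrt(1.0 - 2.0 * q))
```

For closely related sequences such as the segments compared in this study, the Kimura distance and the p-distance are nearly equal; the correction matters mainly for divergent pairs such as G2 versus G3 VP7 genes.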

Identification of genotype constellation

The genotype constellation of each strain was determined using RotaC v2.0, an on-line classification tool for group A rotaviruses [28].

Accession numbers

The nucleotide sequence data of gene segments of strains 107E1B and 116E3D were deposited in DDBJ under the accession numbers AB763956 to AB763975.

Results

We determined the nucleotide sequences of 90–100% of the coding region of 10 genes of 107E1B and 116E3D, except for the VP6 gene of 116E3D, for which the 5’ 200 nucleotides could not be determined, resulting in a coverage of 83% of the coding region (Table 1). The remaining gene, the VP7 gene of each strain, was determined in the previous study [24] and is available in the DNA databases. Applying the whole genome-based genotyping system [8, 9, 28], we assigned the genotype constellation of the VP7-VP4-VP6-VP1-VP2-VP3-NSP1-NSP2-NSP3-NSP4-NSP5 genes of 107E1B to G3-P[4]-I2-R2-C2-M2-A2-N2-T2-E2-H2 and that of 116E3D to G2-P[4]-I2-R2-C2-M2-A2-N2-T2-E2-H2. Thus, 107E1B possessed the background of the DS-1 genogroup except for the VP7 gene, while 116E3D possessed the DS-1 genogroup background, indicating that 107E1B is a single VP7 intergenogroup reassortant with the DS-1 genogroup background. When we compared the nucleotide sequence of each genome segment between 107E1B and 116E3D, except the VP7 gene (encoding G3 and G2, respectively), the identity ranged from 99.83% to 100%, with no more than two nucleotide mismatches in any segment (Table 1).
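The comparison of the two genotype constellations can be sketched in a few lines of Python. This is only an illustration of how the 11-position descriptor maps onto genes; the helper names are hypothetical and the genotype assignment itself was done with RotaC, not with this code.

```python
# Gene order of the complete descriptor Gx-P[x]-Ix-Rx-Cx-Mx-Ax-Nx-Tx-Ex-Hx
GENES = ["VP7", "VP4", "VP6", "VP1", "VP2", "VP3",
         "NSP1", "NSP2", "NSP3", "NSP4", "NSP5"]


def parse_constellation(descriptor):
    """Map each gene to its genotype, e.g. {'VP7': 'G3', 'VP4': 'P[4]', ...}."""
    genotypes = descriptor.split("-")
    if len(genotypes) != len(GENES):
        raise ValueError("expected 11 genotypes, got %d" % len(genotypes))
    return dict(zip(GENES, genotypes))


def differing_genes(descriptor_a, descriptor_b):
    """List the genes whose genotypes differ between two constellations."""
    a = parse_constellation(descriptor_a)
    b = parse_constellation(descriptor_b)
    return [gene for gene in GENES if a[gene] != b[gene]]


# Constellations assigned to the two Indian strains in this study
STRAIN_107E1B = "G3-P[4]-I2-R2-C2-M2-A2-N2-T2-E2-H2"
STRAIN_116E3D = "G2-P[4]-I2-R2-C2-M2-A2-N2-T2-E2-H2"

print(differing_genes(STRAIN_107E1B, STRAIN_116E3D))  # ['VP7']
```

A genotype-level comparison of this kind only shows that the two strains differ at VP7; the nucleotide-level identities in Table 1 are what distinguish a recent reassortment event from a shared genogroup background.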

Table 1 The lengths of nucleotide sequences of the 11 genome segments of reference strain DS-1, 107E1B, and 116E3D, and the percent nucleotide sequence identity between the cognate genome segments of 107E1B (G3P[4]) and 116E3D (G2P[4])
Columns: gene segment; length of the entire coding region* of DS-1 as reference (nucleotides); length of the 107E1B sequence determined in this study; length of the 116E3D sequence determined in this study; percent nucleotide sequence identity between 107E1B and 116E3D; number of nucleotide mismatches between 107E1B and 116E3D.
Gene segment   DS-1 (nt)   107E1B (nt)   116E3D (nt)   Identity (%)   Mismatches
VP7**          981         981           981           73.0           264
VP4            2328        2227          2202          99.91          2
VP6            1194        1108          993           99.9           1
VP1            3267        3225          3186          99.94          2
VP2            2640        2481          2478          100            0
VP3            2508        2487          2485          100            0
NSP1           1482        1347          1392          100            0
NSP2           954         898           884           100            0
NSP3           942         842           905           99.88          1
NSP4           528         528           509           100            0
NSP5           603         603           603           99.83          1

* The coding region includes the stop codon.

** The nucleotide sequences of the VP7 gene are available in the DNA databases (accession numbers AB081594 for 107E1B and AB081593 for 116E3D)
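The relationship between the mismatch counts and the percent identities in Table 1 can be checked with a short calculation. The sketch below assumes that the number of compared positions for each segment equals the shorter of the two determined sequence lengths; the table does not state the aligned length, so this is an assumption rather than the authors' procedure.

```python
# Values transcribed from Table 1:
# gene -> (107E1B length, 116E3D length, mismatches, reported % identity)
TABLE_1 = {
    "VP7": (981, 981, 264, 73.0),
    "VP4": (2227, 2202, 2, 99.91),
    "VP6": (1108, 993, 1, 99.9),
    "VP1": (3225, 3186, 2, 99.94),
    "VP2": (2481, 2478, 0, 100.0),
    "VP3": (2487, 2485, 0, 100.0),
    "NSP1": (1347, 1392, 0, 100.0),
    "NSP2": (898, 884, 0, 100.0),
    "NSP3": (842, 905, 1, 99.88),
    "NSP4": (528, 509, 0, 100.0),
    "NSP5": (603, 603, 1, 99.83),
}


def identity_from_mismatches(len_a, len_b, mismatches):
    """Percent identity, assuming the overlap is the shorter sequence length."""
    compared = min(len_a, len_b)  # assumption: not stated in the table
    return (1.0 - mismatches / compared) * 100.0


for gene, (len_a, len_b, mismatches, reported) in TABLE_1.items():
    computed = identity_from_mismatches(len_a, len_b, mismatches)
    print(f"{gene:5s} computed={computed:7.3f} reported={reported}")
```

Under this assumption the computed values agree with the reported ones to within 0.01 percentage points for every segment except VP7, where the calculation gives about 73.1% against the reported 73.0%; the small difference may reflect rounding or a different aligned length.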

Since the G3 genotype is known to occur in rotavirus strains originating from many animal host species, we performed a comprehensive phylogenetic analysis of G3 VP7 genes available in the DNA databases to identify the host-species origin of the 107E1B VP7 gene. The 107E1B VP7 sequence was used as the query for a BLAST search (blast.ncbi.nlm.nih.gov/) with the maximum number of aligned sequences to display set to 1000. These sequences were subjected to RotaC v2.0, an on-line classification tool for rotavirus genotyping, and this resulted in the identification of a total of 647 sequences as G3. Of these, sequences were selected if they satisfied the following criteria: (i) the sequence contained the entire coding region (981 nucleotides in length), and (ii) the information on the host species and the country of origin was available. The year of detection was not added to the selection criteria but was used whenever available, and three porcine rotavirus sequences lacking the 3’ 91 nucleotides were also included. Thus, a total of 486 sequences were used, of which 405 were from human rotaviruses and 81 from various animal rotaviruses.
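The selection criteria described above amount to a simple filter over sequence records. The following Python sketch shows one way to express them; the record structure, field names, and accession strings are hypothetical placeholders, not the actual data or tooling used in this study.

```python
from dataclasses import dataclass
from typing import Optional

CODING_LENGTH = 981  # entire VP7 coding region, including the stop codon


@dataclass
class Vp7Record:
    accession: str           # hypothetical identifier
    genotype: str            # G genotype as assigned by a typing tool
    length: int              # number of coding-region nucleotides available
    host: Optional[str]      # host species, None if not recorded
    country: Optional[str]   # country of origin, None if not recorded
    year: Optional[int] = None  # used when available, not a criterion


def is_selected(record, allowed_short=frozenset()):
    """Apply criteria (i) and (ii), with explicit exceptions for short sequences."""
    if record.genotype != "G3":
        return False
    if record.host is None or record.country is None:
        return False  # criterion (ii): host and country must be known
    if record.length == CODING_LENGTH:
        return True   # criterion (i): entire coding region present
    return record.accession in allowed_short  # e.g. sequences lacking the 3' end


# Hypothetical records for illustration only
records = [
    Vp7Record("HYPO_0001", "G3", 981, "human", "India", 1993),
    Vp7Record("HYPO_0002", "G3", 981, "human", None),
    Vp7Record("HYPO_0003", "G3", 890, "pig", "Thailand"),
    Vp7Record("HYPO_0004", "G1", 981, "human", "Japan"),
    Vp7Record("HYPO_0005", "G3", 700, "dog", "Italy"),
]
selected = [r.accession for r in records
            if is_selected(r, allowed_short={"HYPO_0003"})]
print(selected)  # ['HYPO_0001', 'HYPO_0003']
```

Passing the exceptions as an explicit set keeps the two stated criteria strict while still recording which shorter sequences were admitted by hand.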

With the exception of a G3 simian rotavirus strain, SA11, isolated in South Africa in 1958, the G3 strains analyzed in this study circulated between 1971 and 2012. In the phylogenetic tree (Fig. 1), the Indian strain 107E1B was clustered into the major lineage of G3 human rotavirus strains that contained 309 isolates accounting for 76.3% of the total G3 human rotavirus strains. Within this large lineage, the 107E1B sequence was found to be 99.3% identical with the VP7 sequence of a human rotavirus strain, RMC437, detected in Manipur, India (deposited in 2004). In addition, the 107E1B sequence was 98.1% identical with the VP7 gene of MP126, another human rotavirus strain detected in Mysore, India in 1994–1996. When we examined how frequently this level of nucleotide sequence identity was observed between 107E1B and other human rotaviruses analyzed in this study, 290 human rotavirus strains shared the nucleotide sequence identity at a rate of 98 to 99% (Fig. 2), indicating that 107E1B possessed a typical human rotavirus G3 sequence.

Fig. 1.

An abbreviated phylogenetic tree for the G3 VP7 genes from human and animal rotavirus strains available in the DNA databases. This phylogenetic tree was constructed by the neighbor-joining method, the genetic distances were computed according to the Kimura 2-parameter model, bootstrap values were obtained after 1000 replicate trials, and the VP7 gene of the strain 116E3D (G2P[4]) was used as the outgroup. Bootstrap values lower than 70% were omitted. The phylogenetic tree was constructed from 486 sequences of the entire coding region (981 nucleotides in length including the stop codon) of the VP7 gene of G3 strains, with the exception of the sequences of three porcine strains (*) that were 91 nucleotides shorter at the 3’ end. In total, there were 405 and 81 sequences from G3 human and animal rotavirus strains, respectively.

Fig. 2.

Distribution of pairwise nucleotide sequence identity (%) for the coding region of the VP7 gene between 107E1B and other G3 human and animal rotavirus strains. Pairwise nucleotide sequence identity was calculated as (1 – p-distance) × 100, and p-distance was computed with the aid of MEGA v.5.0 software [27].

Discussion

Full genome sequencing analysis of a G3P[4] strain and a G2P[4] strain with an apparently identical electropherotype convincingly showed that the G3P[4] strain was formed through genetic reassortment in which a G2P[4] strain with the typical DS-1 genogroup background acquired the VP7 gene from a locally co-circulating G3 human rotavirus strain. The robustness of the evidence is derived from the following two observations. First, the nucleotide sequences of the corresponding genome segments of the two strains, except the VP7 gene, which was involved in the intergenogroup reassortment, had very high sequence identities of ≥99.83% with a maximum of two mismatches in any genome segment. This level of nucleotide sequence identity was previously reported only for two G3P[6] strains detected in Belgium and reported to be of clonal origin [17]. Thus, we conclude that the parental strain of 107E1B, the G3P[4] intergenogroup reassortant, was of the same clone as 116E3D, the G2P[4] strain. Second, although the donor strain of the VP7 gene could not be determined, the human rotavirus origin of the VP7 gene of 107E1B was clearly revealed by comprehensive phylogenetic analysis of 486 G3 VP7 genes from both humans and animals available in the DNA databases. It clustered into the largest lineage that contained only G3 rotaviruses of human origin, and it showed 98–99% sequence identity with the VP7 genes of 72% of 405 human rotavirus strains available in the DNA databases, while it showed only 90.2% sequence identity with the closest feline and canine rotavirus VP7 genes of four strains and 73–80% sequence identity with the VP7 genes of 72% of the 81 animal rotavirus strains analyzed in this study. In addition, there was one G3 VP7 sequence from a human rotavirus strain (RMC437) detected in Manipur, India that showed only seven nucleotide mismatches with the 107E1B VP7 gene.

It has long been held that genetic reassortants are formed between concurrently-circulating strains. Very recently, Heylen et al. [17] described detection of multiple G3P[6] rotavirus strains with the DS-1 genogroup background in Belgium and demonstrated by full genome sequencing analysis that these G3P[6] rotavirus strains were most likely formed by genetic reassortment in which G2P[6] rotavirus strains prevalent in the United States of America two years before had acquired the G3 VP7 gene. When the nucleotide sequences of the cognate genome segments were compared between BEL/FF01322 or BEL/BE1498 (G3P[6]) and USA/06-242 (G2P[6]), the nucleotide sequence identity (except the VP7 gene) ranged from 97.9 to 99.6% with mismatches between 49 (the VP4 gene) and five (the VP6 and NSP4 genes) [17]. The sequence identity was not as high as that between the Indian strains we reported in this study, but the levels of divergence may be explained by the fact that the two Belgian strains were detected two years after detection of the United States strain. However, the origin of the G3 VP7 gene was not clarified.

Also worthy of mention from the perspective of genetic reassortment between co-circulating human rotavirus strains is the study by Watanabe et al., who made an extensive comparison of the electropherotypes of rotavirus strains collected in a defined area of Japan over a period of six years to determine whether any electropherotype (the migration pattern of the 11 genome segments on a polyacrylamide gel) could be explained by a combination of the electropherotypes of concurrently-circulating strains [29]. They showed that two (5.4%) out of 37 electropherotypes were reassortants between co-circulating strains. While their study was the first to provide direct evidence for the occurrence of genetic reassortment between co-circulating strains with the aid of partial sequencing and electrophoresis under three different gel concentrations, it still lacked robust evidence at the level of nucleotide sequences.

Since the first step of genetic reassortment between co-circulating strains is the infection of the same individual with more than one strain, the frequency of mixed infection is a key factor. In developing countries including India, the frequency of mixed infection as defined by the presence of more than one G or P type in a single stool specimen was reported to be a median (IQR) of 8% (2–15) [4, 23] as compared with the frequency in Europe which was reported to be 4% [6]. Indeed, there were molecular epidemiological studies describing the simultaneous detection of G2 and G3 in combination with P[4], albeit in a single specimen in most studies [19, 22, 23]. Similarly, in Europe the extensive analysis by Iturriza-Gómara et al. revealed a proportion of 0.04% for the specimens containing mixed infection of G2, G3 and P[4] [6]. In contrast, the literature is much more abundant for the detection of G3P[4] strains in children with diarrhea in various parts of the world including India [19, 21–24, 30–38], with detection rates ranging from 0.04% in Europe [6] to 64% in Ghana [19]. A total of nearly 200 stool specimens that contained G3P[4] rotavirus were reported from China with a median detection rate of 5.4% [21, 34–38]. However, none of these reports provided information on the genetic background of the G3P[4] strains, except our previous study describing the molecular characterization of 107E1B by RNA-RNA hybridization [24]. Ghanaian G3P[4] strains that accounted for 64% of the circulating genotypes were reported to have a short RNA electropherotype (a feature generally, but not always, exhibited by the DS-1 genogroup strains) [19], while Brazilian G3P[4] strains that accounted for 31% of the circulating genotypes were reported to have long RNA patterns (a feature generally, but not always, exhibited by the Wa genogroup strains) [33]. Thus, further studies are needed to determine what genotype constellation imparted selective advantages to some G3P[4] strains and not to other G3P[4] strains.

In conclusion, this study provided, for the first time, robust molecular evidence for the occurrence of a G3P[4] reassortant in which a typical G2P[4] strain had acquired the VP7 gene from a concurrently-circulating G3 human rotavirus strain. Thus, this study established a basis on which to conduct full genome sequence analysis of an increasing number of G3P[4] strains in China and elsewhere in the world.

Conflicts of Interest

There is no conflict of interest for any author to declare regarding this study.

Acknowledgements

T. N. Hoa Tran is a PhD student supported by a scholarship from the Ministry of Education, Culture, Sports, Science and Technology, Japan.

This study was supported in part by the Global Center of Excellence Program on Integrated Global Control Strategy for Tropical and Emerging Infectious Diseases from the Ministry of Education, Culture, Sports, Science and Technology, Japan. This study was also supported in part by the grants in-aid for scientific research from the Ministry of Health, Labour, and Welfare, Japan.

References
 
© 2013 Japanese Society of Tropical Medicine