Historically, the taxonomy and nomenclature of Japanese salmon have been in a state of confusion. Masu, amago and biwa salmon have been variously classified as distinct species, subspecies, or often conflicting or overlapping combinations of the two. In particular, the taxonomy of masu and amago salmon is obscured by their similarity in ecological and morphological traits. Here, DNA sequence analyses of nuclear and mitochondrial DNA were applied to clarify the genetic relationship between masu and amago salmon. No fixed differences were detected in the mitochondrial ND3 gene and control (D-loop) region, or in the nuclear growth hormone type-2 gene (GH2). However, the frequency of single nucleotide substitution alleles within GH2 intron C and size variants at a microsatellite locus nested within intron D differed markedly, providing genetic evidence to support a taxonomic distinction between the two types. The genetic data were related to previous mitochondrial DNA sequence analyses and alternative classification schemes for masu and amago salmon. The best-supported scheme arranges the two types as subspecies: masu as Oncorhynchus masou masou Brevoort and amago as Oncorhynchus masou ishikawae Jordan and McGregor.
INTRODUCTION
The genus Oncorhynchus contains eight types of Pacific salmon and the recently re-classified rainbow, cutthroat and allied trout species (Smith and Stearley, 1989). Five types of salmon, sockeye, pink, chum, chinook and coho, occur on both sides of the northern Pacific Ocean. Each exhibits marked morphological and ecological differences that have made it possible to assign unambiguous species status. This group of salmon is believed to have descended from a single common ancestor that diverged from other Pacific salmon and trout lineages at least 10 million years ago (McKay et al., 1996). Three types of salmon that occur only in Asia represent the masu lineage: masu (sakuramasu), amago (satsukimasu) and biwa (biwamasu) salmon. Two classification schemes are in current use for this group of salmon. One assigns specific status to masu (O. masou) and groups amago and biwa together as O. rhodurus (Kato, 1985, 1991), while the other groups masu (O. masou masou), amago (O. masou ishikawae) and biwa (O. masou spp.) as subspecies (Kimura, 1990).
The root of their names in the Japanese vernacular is “masu”, which means trout. Unlike the North American Pacific salmon, this group has retained more primitive, trout-like life history traits: sea-run forms, particularly satsukimasu, do not venture as far into the ocean, and land-locked forms do not always die after spawning. The trout-like character of these fish is consistent with their basal position in inferred phylogenetic trees for Oncorhynchus (Stearley and Smith, 1993; McKay et al., 1996; Murata et al., 1996; Oohara et al., 1997).
The geographic range of masu salmon (Table 1; hereafter collectively referring to the land-locked form, yamame, and the anadromous form, sakuramasu) stretches northward as far as the Kamchatka Peninsula and as far south as Taiwan (Formosa). The distribution of amago (Table 1; collectively referring to the land-locked form, amago, and the anadromous form, satsukimasu) is more restricted. Amago occurs primarily on the Pacific side of southern Japan, but also around the Seto Inland Sea of Japan, including part of Kyushu. Biwa is native only to Lake Biwa and associated drainages. The range of biwa salmon is completely within that of amago, but masu does not currently occur sympatrically with either of the other types (Oshima, 1957; Kimura, 1989). Historically, marked similarity in morphological and meristic characters and vague descriptions of original type specimens (Jordan and McGregor, 1925) have led to confusion in their taxonomy and nomenclature (Table 1). Differences in scale morphology and the presence of red spots above and below the lateral line of juvenile and adult fish are diagnostic characters for distinguishing between the three types. DNA sequence analysis of the mitochondrial genome demonstrated that the lacustrine biwa salmon is probably the oldest lineage of the O. masou species complex (Oohara and Okazaki, 1996). However, molecular differences between the masu and amago types are less pronounced; much of their mitochondrial genomes are nearly identical in sequence (McKay et al., 1996; Oohara and Okazaki, 1996).
Table 1
Outline of the Oncorhynchus masou species complex
In this study, we examined additional mitochondrial DNA sequence from the ND3 gene and the control (D-loop) region, where both interspecific (Thomas and Beckenbach, 1989; Shedlock et al., 1992) and intraspecific (Beckenbach et al., 1990; Park et al., 1993) variation in Oncorhynchus have previously been observed. Very little DNA sequence variation was detected among mitochondrial sequences of masu and amago. However, analysis of intronic sequences of the nuclear growth hormone type-2 (GH2) gene revealed considerable variation within and between types, providing evidence that masu and amago are genetically divergent.
MATERIALS AND METHODS
DNA extraction, gene amplification and sequence analysis
Strains, sample origins and abbreviations are listed in Table 2. Samples of liver or fin tissue were shipped from Japan in 70% ethanol and stored at ambient temperature until use. DNA was isolated from tissue samples using Proteinase K digestion followed by extraction with organic solvents as described previously (Devlin et al., 1991). Polymerase chain reaction (PCR; Saiki et al., 1988) amplification was performed on 200-500 ng of genomic DNA template with either Ultratherm (Bio/Can Scientific) or Taq (Bethesda Research Laboratories-BRL) DNA Polymerase using the reagents and instructions provided by the manufacturer. Typically, the thermal profile of a PCR consisted of 2-4 min incubation at 94° C, followed by 30 cycles of 30 sec at 94° C, 30 sec at 55° C, 60 sec at 72° C, followed by a 4 min incubation at 72° C. PCR amplification products were prepared for sequencing by purification with Wizard PCR-Prep or DNA Clean-Up kits (Promega). Where necessary, multiple amplification products were separated by electrophoresis in low-melting-point agarose using standard methods (Sambrook et al., 1989). Amplification products were sequenced directly using either the Sequenase v2.0 or Thermosequenase sequencing kits (Amersham-United States Biochemicals). Sequencing, electrophoresis and autoradiography were performed according to the manufacturer's instructions.
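The typical thermal profile described above can be written down as a small data structure, which makes the programmed hold time easy to total. This is only an illustrative sketch: the names `Step` and `total_hold_seconds` are hypothetical, and ramp times between temperatures, which depend on the thermal cycler, are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int   # block temperature in degrees Celsius
    seconds: int  # programmed hold time in seconds


def total_hold_seconds(initial, cycle, n_cycles, final):
    """Sum programmed hold times; ramping between steps is not modelled."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


# One cycle as described in the text: denature, anneal, extend.
CYCLE = [Step(94, 30), Step(55, 30), Step(72, 60)]
FINAL = Step(72, 4 * 60)

# The initial denaturation is given as a range of 2-4 min.
shortest = total_hold_seconds(Step(94, 2 * 60), CYCLE, 30, FINAL)
longest = total_hold_seconds(Step(94, 4 * 60), CYCLE, 30, FINAL)
print(shortest / 60, longest / 60)  # programmed minutes, excluding ramping
```

Under these assumptions the programmed holds add up to roughly 66 to 68 minutes per run.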
Table 2
The names and geographic origins of strains used in this study
PCR and sequencing primers
A portion of the mitochondrial control region was amplified using the F+ (5′-TTCCTGTCAAACCCCTAAACCAGG-3′) and F– (5′-CCATCTTAACAGCTTCAG-3′) primer pair described in Shedlock et al. (1992). A 185 nt segment of DNA sequence, corresponding to the 3′ end of the aligned sequence reported by Shedlock et al. (1992), was obtained.
Two portions of the GH2 gene were amplified (Fig. 1). Primers GH41 (5′-ATGGAAAACCAACGGCT-3′) and GH28 (5′-GTCTGGCTAGGGTACTCCCA-3′) were used to amplify a segment containing introns B, C and flanking regions. This primer combination produced two amplification products corresponding to GH1 and GH2. The GH2 product was identified by comparison with sequences from sockeye salmon GH1 and GH2 genes (Devlin, 1993), from which primers GH41 and GH28 were designed. The entire 451 nt intron C sequence was determined using primer GH28 and the opposing primer GH27 (5′-ATATTCCTGCTGGACTTCTG-3′).
The second portion of the gene was obtained with primers GH57 (5′-GCTCATCAAGGTAATGGTCA-3′) and GH7 (5′-CTTATGCATGTCCTTCTTGAA-3′), which specifically amplify a segment of GH2 containing intron D and exon 5 (McKay et al., 1997). The same segment plus the extreme 3′ end of exon 4 was also amplified from both the GH1 and GH2 genes using primers GH7 and GH56 (5′-AAGCTCAOKGCGACCTCAAAGT-3′). Sequence from the 3′ end was obtained using primer GH57 (GH2) or GH56 (GH1 and GH2). In some cases, the opposite strand was read using primer GH7 or GH16 (5′-TTGTTAATCTTTGTGAAAA-3′).
Direct sequencing of PCR products from heterozygous individuals
Direct sequencing of amplification products from individuals heterozygous at variable positions in the GH2 gene produced sequence ambiguities (Fig. 2A). Two bands of equal intensity occurring at the same position in the sequence were interpreted as having resulted from amplification of two alleles differing at that position. Such ambiguities never involved more than two nucleotides at one position. The possibility that the two-fold ambiguities were amplification artifacts resulting from misincorporation of nucleotides by Taq DNA Polymerase was ruled out for two reasons: 1) the site and type of virtually all observed sequence ambiguities were the same in several individuals, each of which represented independent DNA extractions, PCR amplifications, and sequencing experiments, and 2) in the case of intron D, two independent PCR amplifications with different primer pairs (GH56/7 vs. GH57/7) from six fish produced identical sequences, including the position and nature of each ambiguity.
A second type of heterozygote was observed in GH2 intron D (Fig. 2B). A four nt microsatellite repeat varied between three and five iterations (discussed below). Direct PCR sequencing from heterozygous individuals produced clean sequence upstream of the repeat region. The region immediately downstream of the heterozygous repeat produced two superimposed sequences, one being shifted out of register by either four or eight nucleotides (one or two iterations of the repeat unit). Alleles were scored by counting the number of iterations of the repeat unit, then observing the displacement of easily identified sequence motifs downstream of the repeat, such as the run of five A's shown in Fig. 2B. The reliability of this scoring method was confirmed by reproducing the results in some cases by sequencing both strands. In addition, the genotypes scored by sequence analysis were confirmed in 24 individuals by denaturing polyacrylamide gel electrophoresis of full-length 32P-dATP-labelled PCR products (not shown).
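The scoring logic can be illustrated with a short sketch. It counts uninterrupted iterations of the repeat unit in each allele and checks that a downstream landmark motif is displaced by the expected number of nucleotides. The flanking sequences and function names below are hypothetical placeholders, not the actual intron D sequence, and the sketch operates on separated allele sequences rather than on the superimposed autoradiograph ladders that were read in practice.

```python
def count_repeats(seq, unit="GATT"):
    """Return (start, n) for the longest uninterrupted tandem run of `unit`."""
    best_start, best_n = -1, 0
    for i in range(len(seq) - len(unit) + 1):
        n = 0
        while seq.startswith(unit, i + n * len(unit)):
            n += 1
        if n > best_n:
            best_start, best_n = i, n
    return best_start, best_n


def register_shift(n_a, n_b, unit_len=4):
    """Nucleotides by which downstream sequence is displaced between alleles."""
    return abs(n_a - n_b) * unit_len


# Hypothetical flanks; only the repeat number differs between the alleles.
UPSTREAM, DOWNSTREAM = "TTCAG", "CCAAAAAGT"
allele_3 = UPSTREAM + "GATT" * 3 + DOWNSTREAM
allele_5 = UPSTREAM + "GATT" * 5 + DOWNSTREAM

n3 = count_repeats(allele_3)[1]
n5 = count_repeats(allele_5)[1]

# The run of five A's serves as the landmark motif downstream of the repeat.
observed_shift = allele_5.find("AAAAA") - allele_3.find("AAAAA")
print(n3, n5, register_shift(n3, n5), observed_shift)
```

A (GATT) 3 /(GATT) 5 heterozygote therefore shows the landmark motif twice, eight nucleotides apart, while a heterozygote whose alleles differ by one iteration shows a four-nucleotide displacement.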
RESULTS
Mitochondrial DNA sequence analysis
Overall, the ND3 gene has a relatively high substitution rate in salmonid fishes (Thomas and Beckenbach, 1989; McKay et al., 1996). However, the complete sequence of the ND3 gene (351 nt) was found to be identical between a masu sampled in Hokkaido and an amago from southern Honshu. With the exception of a silent substitution in one masu individual (Fig. 3A), complete sequence identity in the ND3 gene was also observed among an additional three masu and three amago sampled from the same locations. Two additional haplotypes (Oohara and Okazaki, 1996) that differ by single silent substitutions were not observed among the individuals sampled in this study (Fig. 3A).
Similar results were obtained with the mitochondrial control region. The 3' end of this region varies considerably among salmonid fishes (Shedlock et al., 1992), but very little variation was detected among masu and amago individuals. A 185 nt region was sequenced from 14 amago and 6 masu (Fig. 3B) and two haplotypes, differing by a single transition, were observed. The most common haplotype was present in all but one fish. The haplotypes observed in this study differ from the masu sequence reported by Shedlock et al. (1992) by a single nucleotide substitution, as well as several single-nucleotide gaps. As was observed with the ND3 genes, the most commonly observed haplotypes were found in both masu and amago salmon, providing no evidence for a genetic distinction between the two types.
Variation of intronic sequences of the GH2 gene
The complete DNA sequence of intron C from masu and amago individuals was determined. A total of 16 fish were sampled, with two from each of four geographically isolated populations (Table 2) represented in each sample group. To avoid confusion about geographic origin, only wild strains from known sampling locations were analyzed. Considerable variation was observed in intron C, both within and between the two types (Table 3). Seven nucleotide positions varied among individuals. Comparison of variation within types revealed that the amago sample group was more genetically heterogeneous, as reflected by the higher degree of heterozygosity with respect to the masu sample group.
Table 3
Variable positions within GH2 intron C of wild masu and amago salmon
Although no fixed differences were observed between masu and amago, particular nucleotides at variable positions were more common within one type than the other. For example, an “A” occurred at position 269 with a frequency of 0.875 (14/16 haploid genomes) in masu, but only 0.375 in the amago sample group. In addition, polymorphisms at positions 140 and 182 were confined to masu, and variation at position 425 was specific to amago. These observations suggest that masu and amago are genetically divergent.
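The allele frequencies quoted here follow directly from counting haploid genomes: each diploid fish contributes two copies of the locus, so eight fish per sample group give 16 genomes. A minimal sketch of that arithmetic is shown below. The function name is hypothetical, and the amago count of six copies is inferred from the reported frequency of 0.375 rather than stated in the text.

```python
from fractions import Fraction


def allele_frequency(allele_copies, n_individuals, ploidy=2):
    """Frequency of an allele among all gene copies in a diploid sample."""
    total_copies = ploidy * n_individuals
    if not 0 <= allele_copies <= total_copies:
        raise ValueError("allele count exceeds the number of gene copies")
    return Fraction(allele_copies, total_copies)


# Position 269 "A" in masu: 14 of 16 haploid genomes (8 fish).
masu_a269 = allele_frequency(14, 8)
print(float(masu_a269))  # 0.875

# Amago: 6 copies is an inference from the reported frequency (0.375 x 16).
amago_a269 = allele_frequency(6, 8)
print(float(amago_a269))  # 0.375
```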
The sequences of GH2 intron D from single masu and amago salmon have been reported previously (McKay et al., 1996). In this study, analysis of the 5' end of intron D from 44 amago and 52 masu salmon revealed a variable microsatellite locus nested within the intron. A direct tandem repeat, four nucleotides in length, was found to vary between three and five iterations (Fig. 2B). Genetic heterogeneity at this locus was high, with more than half of the individuals tested being heterozygous. Sequences of the same region of the paralogous GH1 gene were also obtained from three masu and three amago individuals. Similar variation was not detected within this gene: the (GATT) repeat sequence was present in only two iterations in each of the six individuals tested.
In addition to variation in the number of (GATT) repeat units in GH2 intron D, two (G<–>A) transitional substitutions at positions 206 and 224 of the aligned intron sequence reported by McKay et al. (1996) were found to vary within and among the masu and amago sample groups. A “G” was observed rarely at position 206 (G 206), with an overall frequency of 0.08 (14/188 haploid genomes). G 206 is likely physically linked on the same chromosome as a (GATT) 4 allele; 14/14 individuals with a G 206 allele also had at least one copy of the (GATT) 4 variant, which was either homozygous, or heterozygous with (GATT) 3 or (GATT) 5. A “G” occurred more commonly at position 224, with a frequency of 0.28 and is almost certainly linked to the (GATT) 3 variant. In 43/43 individuals with G 224, the (GATT) 3 variant was also present. In addition, homozygous G 224 /G 224 individuals were always homozygous (GATT) 3 /(GATT) 3. A causal relationship between these nucleotide substitutions and the number of repeat iterations is unlikely, as the same locus (nested within GH2) in other Oncorhynchus species varies from two to five iterations of the (GATT) repeat while having an “A” at positions 206 and 224. The association of nucleotide substitutions with particular length alleles indicates that they occurred sometime after variation in repeat number was established in this lineage.
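The association argument can be expressed as two simple checks on unphased genotype records: every carrier of the substitution must also carry the associated repeat allele, and every homozygote for the substitution must be homozygous for that repeat allele. The sketch below uses hypothetical genotype records and names purely for illustration; it does not reproduce the study data, and co-occurrence in unphased genotypes is consistent with linkage on the same chromosome without proving it.

```python
from typing import NamedTuple


class Genotype(NamedTuple):
    pos224: tuple   # the two bases at position 224, e.g. ("G", "A")
    repeats: tuple  # the two (GATT) repeat counts, e.g. (3, 5)


def always_cooccur(genotypes, base, repeat_n):
    """True if every carrier of `base` also carries a `repeat_n` allele."""
    carriers = [g for g in genotypes if base in g.pos224]
    return bool(carriers) and all(repeat_n in g.repeats for g in carriers)


def homozygotes_consistent(genotypes, base, repeat_n):
    """True if every base/base homozygote is repeat_n/repeat_n."""
    homozygotes = [g for g in genotypes if g.pos224 == (base, base)]
    return all(g.repeats == (repeat_n, repeat_n) for g in homozygotes)


# Hypothetical records, not the study data.
sample = [
    Genotype(("G", "A"), (3, 4)),
    Genotype(("G", "G"), (3, 3)),
    Genotype(("A", "A"), (4, 5)),
    Genotype(("A", "A"), (3, 5)),
]

print(always_cooccur(sample, "G", 3), homozygotes_consistent(sample, "G", 3))
```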
Microsatellite allele frequencies differ between masu and amago salmon
Because salmon in Japan have a history of being transplanted, and many of the sampled individuals were of uncertain parentage, sample populations were divided into two categories. Wild fish (or their descendants), taken from known geographic locations, were analyzed separately from cultured or hatchery-reared fish of unknown geographic origin, hereafter referred to collectively as “cultured”. Treating the two categories separately revealed that allele frequencies differ markedly between wild and cultured fish (Fig. 4). As was observed with the single nucleotide substitutions in GH2 intron C, the distribution of the (GATT) n alleles of the microsatellite locus within intron D is not equal between masu and amago (Fig. 4A). Taken overall, the (GATT) 3 allele is more common in masu, while the frequency of the (GATT) 5 allele is higher in amago than in masu. The observed differences in total allele frequencies were statistically significant (χ² = 7.44, d.f. = 2, p = 0.024). Among wild fish, the (GATT) 5 allele is clearly the most common in amago (n = 19), and the (GATT) 3 allele was observed only in a single heterozygous individual. In wild masu (n = 26), the three allele frequencies are more similar, with (GATT) 4 slightly more common than the others. The overall difference in allele frequencies between wild masu and amago was highly significant (χ² = 15.9, d.f. = 2, p = 0.0003).
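A comparison of this kind can be sketched as a chi-square test of homogeneity on a 2 x 3 table of allele counts (two types, three repeat alleles). The text does not state which software or exact procedure was used, so the following is an assumed, minimal version. With 2 degrees of freedom the upper-tail probability has the closed form exp(-χ²/2), which gives values close to the reported p-values when applied to the reported statistics. The allele counts in `hypothetical_counts` are invented for illustration, apart from the single (GATT) 3 copy in wild amago.

```python
import math


def chi_square_homogeneity(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected == 0:
                raise ValueError("empty row or column in table")
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


def p_value_df2(stat):
    """Upper-tail probability for a chi-square variable with exactly 2 d.f."""
    return math.exp(-stat / 2)


# Reported statistics from the text.
print(round(p_value_df2(7.44), 3))  # 0.024
print(p_value_df2(15.9))            # on the order of 3.5e-4

# Hypothetical counts of (GATT)3, (GATT)4, (GATT)5 gene copies:
# wild masu (26 fish, 52 copies) and wild amago (19 fish, 38 copies).
hypothetical_counts = [[16, 20, 16], [1, 9, 28]]
stat, dof = chi_square_homogeneity(hypothetical_counts)
print(stat, dof, p_value_df2(stat))
```

The closed-form p-value applies only when d.f. = 2; tables of other dimensions need the general chi-square survival function.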
In contrast, (GATT) 4 was the least common variant among the cultured masu (n = 24) samples (Fig. 4). The (GATT) 3 allele was the most common among both cultured masu and cultured amago (n = 25). Unlike in the wild fish, the three alleles were more equally represented among cultured amago. The overall differences in allele frequencies between the two types of cultured fish were not statistically significant. Masu and amago are known to hybridize readily under hatchery conditions (Oshima, 1955), and produce viable offspring. The markedly higher incidence of the (GATT) 3 allele among cultured vs. wild amago salmon suggests that introgression of this allele from masu to amago may have occurred among captive populations. Alternatively, the higher frequency of this allele among captive amago may be the result of genetic drift.
DISCUSSION
Mitochondrial DNA sequence analysis
Because it tends to evolve relatively rapidly (Brown et al., 1979), the mitochondrial genome is commonly used to study relationships among conspecific populations. While a pronounced difference between biwa salmon and both masu and amago was supported by mitochondrial sequence data (Oohara and Okazaki, 1996), the lack of apparent fixed differences between the mitochondrial genomes of the latter two types is consistent with Imanishi's (1951) assertion that these two types are simply morphs of the same species.
Variation in the GH2 gene supports a genetic distinction
Substantial variation was observed within intronic sequences from the GH2 gene among and between masu and amago salmon. None of the observed differences were fixed between types, but masu and amago clearly differed in patterns of single nucleotide substitutions in intron C (Table 3) and in allele frequencies at the (GATT) microsatellite locus nested within intron D (Fig. 4). While the overall frequencies of the three observed microsatellite alleles were similar, among wild fish the (GATT) 5 allele was much more common in amago salmon, while the (GATT) 3 form was extremely rare. These allele frequency differences provide evidence that the two types are genetically divergent, and that recent interbreeding has likely not occurred in nature. Since masu samples representing more southerly populations in Japan were not obtained, it was not possible to assess the alternative possibility that the differing allele frequencies are due to a north-south cline. However, the two types do not generally occur sympatrically, which renders widespread gene flow between populations of masu and amago unlikely.
An additional argument against recent hybridization is provided by a GH pseudogene. A GH2-derived pseudogene present on the Y chromosome of several Oncorhynchus species (Du et al., 1993) was detected in masu but not amago salmon (Nakayama et al., in preparation). The presence of this difference suggests that the pseudogene was lost in amago sometime after the two lineages split, and that it has not been reintroduced via introgression into amago.
Microsatellite allele frequencies differ between cultured and wild fish
Allele frequencies at the microsatellite locus were markedly different between cultured and wild sample groups (Fig. 4). Unlike the wild fish, the differences in allele frequencies between cultured sample groups of each type were not found to be statistically significant. The (GATT) 3 allele was very rare among wild amago but was the most common among the cultured fish. The distribution of mitochondrial haplotypes was also found to vary between wild and cultured sample groups (Oohara and Okazaki, 1996), with cultured amago and masu more similar than wild amago and masu. The differing frequencies could be the result of a founder effect. Alternatively, our observations and those of Oohara and Okazaki (1996) are consistent with recent introgressive hybridization between captive masu and amago; however, lack of information on the geographic origin and history of cultured strains precludes resolution of this question.
Recent history of the GH2 microsatellite locus
Although the three microsatellite alleles (Figs. 3 and 4) were scored only by the number of iterations of the (GATT) repeat, there are at least five alleles if one considers the (G<–>A) substitutions at positions 206 and 224 of intron D. G 206 was always associated with the (GATT) 4 variant. A (GATT) 4 allele was also observed with an “A” at that position, which indicates there are at least two (GATT) 4 alleles. Similarly, G 224 was always associated with the (GATT) 3 variant, but (GATT) 3 was also observed with an “A” at that position.
This information can be used to infer patterns of evolutionary change at the microsatellite locus. For example, G 224-(GATT) 3 is the more common (GATT) 3 allele. Since an association between a “G” at position 224 and (GATT) 4 was not observed in the sampled population (n = 94), recent expansion of G 224-(GATT) 3 to G 224-(GATT) 4 or G 224-(GATT) 5 has probably not occurred. Likewise, evidence of expansion of G 206-(GATT) 4 to G 206-(GATT) 5 was not observed. The rare A 224-(GATT) 3 allele could be the result of contraction of A 224-(GATT) 4 or A 224-(GATT) 5 but could also have resulted from interallelic recombination between the two variable positions. Overall, it was possible to infer that the microsatellite alleles have probably not undergone recent expansion from (GATT) 3 to (GATT) 4 or (GATT) 5 or from (GATT) 4 to (GATT) 5, but recent contraction of alleles by loss of one or more repeat iterations could not be ruled out.
Evaluation of alternative classification schemes
In the classification scheme reviewed by Kato (1991), masu and amago are the distinct species O. masou and O. rhodurus, respectively. This is consistent with the fact that masu and amago have consistent differences in coloration and differing scale morphology (Kimura, 1990). However, the strong similarity in mitochondrial DNA sequences between masu and amago is unlike observed differences between other closely-related species pairs in Oncorhynchus. For example, the smallest distance observed in the ND3 gene between species pairs was that of rainbow and cutthroat trout, which differ by 5.7% (McKay et al., 1996). These two species also differ by 6.2% in the portion of the mitochondrial control region analyzed in this study (Shedlock et al., 1992). Since other related pairs of species in Oncorhynchus have accumulated measurable differences in the DNA of their mitochondrial genomes, it is reasonable to expect at least some differences between the mitochondrial genomes of masu and amago if they are distinct species. The observation of no type-specific sequence divergence in the ND3 gene and mitochondrial control region argues that these two types diverged from each other much later than any of the other salmon and trout that have undisputed species status.
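The percentage differences quoted above are uncorrected pairwise distances: the proportion of aligned sites at which two sequences differ. A minimal sketch of that calculation follows. The function name and the choice to skip gapped sites are assumptions made for illustration, and the cited studies may have treated gaps differently.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ("-") are excluded from the
    comparison; this gap handling is an assumption of the sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


print(p_distance("ACGTACGT", "ACGTACGA"))  # 0.125

# For scale: in a 351 nt alignment, the length of the masu ND3 gene,
# 20 substitutions would correspond to about 5.7% divergence.
print(round(100 * 20 / 351, 1))  # 5.7
```

By the same measure, the ND3 sequences of masu and amago compared in this study have a distance of zero apart from one silent substitution in a single masu individual.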
On first inspection, failure to detect type-specific differences in the mitochondrial genome, while considerable genetic heterogeneity was observed in a nuclear gene, appears contradictory. Similar results were obtained in a comparison of Atlantic salmon (Salmo salar) populations in North Wales (O'Connell et al., 1996). Differences in mitochondrial DNA sequence suggest that the biwa salmon forms a sister-group to the masu/amago lineage (Oohara et al., 1996). If the common ancestor of the biwa and masu/amago lineages shared a restricted geographic range similar to that of modern biwa salmon, a recent expansion through Asia by a masu/amago progenitor could explain the lack of mitochondrial variation. A relatively recent divergence would have allowed insufficient time for a substantial number of differences to accumulate between the mitochondrial genomes. Since most known GH2 alleles are shared between masu and amago, much of the existing polymorphism in this locus likely predates such an expansion. If a genetic bottleneck preceding the expansion of masu throughout Asia were responsible for the observed lack of variation in the mitochondrial genome, it was not severe enough to eliminate diversity of nuclear genotypes.
In contrast to that of the mitochondrion, the nuclear genome is inherited in a diploid and bi-parental manner, which allows more potential for polymorphism among related individuals. Because the mitochondrial genome is hemizygous and inherited only from the maternal parent, its effective population size is only 1/4 that of alleles of nuclear genes. A less severe bottleneck could have allowed a single mitochondrial haplotype to drift to fixation in a recent masu/amago ancestor that maintained polymorphism in the GH2 locus.
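The factor of 1/4 follows from counting transmissible gene copies in an idealized population. The sketch below assumes an equal sex ratio, strictly maternal inheritance and one effective mitochondrial haplotype per female; the function name and the population size are hypothetical.

```python
def gene_copies(n_females, n_males):
    """Transmissible copies of a nuclear autosomal locus and of the mtDNA.

    Every diploid individual carries two nuclear copies; only females
    pass on their single (effectively haploid) mitochondrial genome.
    """
    nuclear = 2 * (n_females + n_males)
    mitochondrial = n_females
    return nuclear, mitochondrial


nuclear, mito = gene_copies(500, 500)  # hypothetical population
print(mito / nuclear)  # 0.25
```

With fewer copies in the population, drift acts faster on mitochondrial haplotypes, which is why a bottleneck can fix a single haplotype while nuclear polymorphism persists. A skewed sex ratio changes the ratio, so 1/4 should be read as an idealized value.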
Despite an apparently recent separation, enough time has elapsed for genetic drift between wild masu and amago populations to produce the dissimilar allele frequencies observed for the GH2 gene and its nested microsatellite locus. In isolation, genetic similarity does not provide a convincing argument for a taxonomic distinction between the two types. However, combined with the non-overlapping geographic ranges and subtle but fixed differences in coloration and scale morphology, the genetic data are consistent with Kimura's (1990) assertion that masu and amago salmon be recognized as distinct subspecies within an O. masou complex.
Acknowledgments
We would like to thank Nobuhisa Koide for generously providing masu samples from Hokkaido. This work was supported in part by a Natural Sciences and Engineering Research Council (NSERC) operating grant to MJS, Fisheries and Oceans Canada (RHD), an NSERC postgraduate scholarship to SJM and a British Columbia Science Council Graduate Research in Engineering and Technology (GREAT) award to SJM. We also thank an anonymous reviewer for helpful comments on the manuscript.