Hexamerins are high molecular-weight proteins found in the hemolymph of insects and have been proposed to function as storage proteins. In previous studies, two Musca domestica hexamerins, designated Hex-L and Hex-F, were characterized. Hex-L is synthesized exclusively by the larval fat bodies, is secreted into the hemolymph and likely provides a source of amino acids and energy during metamorphosis. Hex-F synthesis is induced by a proteinaceous meal and occurs only in the adult insect fat bodies. Hex-F is also secreted into the hemolymph, and it has been suggested that in females it may be an amino acid reservoir to be used during the final stages of egg formation. Genomic clones containing full-length copies of the genes MdHexL1 and MdHexF1, encoding subunits of the larval and the adult female hexamerin, respectively, were isolated. Complete nucleotide sequences, including the 5′-end untranscribed regions, were determined and analyzed for each of the genes. Comparisons of the conceptual translation products of the cloned genes indicated that MdHexL1 and MdHexF1 are related to the larval serum proteins (LSP) 1 and 2 of Calliphora vicina and Drosophila melanogaster. DNA fragments containing the putative promoters of the two hexamerin genes were compared and cloned into a plasmid vector so as to drive the expression of the GFP reporter gene. The constructs were assayed in vitro in transfected D. melanogaster S2 cells, demonstrating that the cloned M. domestica DNA fragments exhibit promoter activity.
Introduction
The insect fat body participates in multiple biochemical and physiological functions including intermediate metabolism, detoxification, communication and immune responses, and it is a major site for synthesis and storage of carbohydrate, lipid and nitrogenous components (Keeley, 1985). Although the demands for these functions remain the same throughout the insect life cycle, some fat body functions are specific to a particular developmental stage. These functions are hormonally-regulated and lead to distinct gene expression patterns that accommodate the needs of each developmental stage. Synthesis and accumulation of reserves during larval feeding stages, which are used during metamorphosis, and the prompt synthesis of large amounts of vitellogenin during reproductive periods in adult insects are examples of regulated functions fat bodies perform (Haunerland and Shirk, 1995; Raikhel et al., 1997; Miller et al., 2002). Hexamerins are the major products secreted by fat bodies into the insect hemolymph during larval development. These proteins have been identified in a wide range of insects and consist of high molecular-weight hexamers composed of homologous or heterologous subunits with average molecular weights of 80 kDa (Kanost et al., 1990; Telfer and Kunkel, 1991; Burmester et al., 1998). More than one type of hexamerin is found frequently in the hemolymph of insects, and in spite of the conserved quaternary structure, amino acid composition and sequence may differ among the types, possibly indicating that each one of them may fulfill a different developmental function.
Two groups of hexamerins have been characterized in the Diptera. The first group, represented by the prototypical Calliphorin of Calliphora vicina (Munn and Greville, 1969) and the D. melanogaster larval serum protein 1 (LSP-1) (Wolfe et al., 1977), is expressed exclusively during the larval stage and its major function appears to be to supply amino acids for the synthesis of adult tissues during metamorphosis (Levenbook and Bauer, 1984; Telfer and Kunkel, 1991). The second group, represented by the prototype D. melanogaster LSP-2 (Akam et al., 1978), is expressed in larvae, and it has been assumed that this protein performs a function similar to that of LSP-1 during the larval developmental stage. Unlike LSP-1, LSP-2 is also expressed at lower levels in adult flies, but its function during this stage is not known (Benes et al., 1996).
Two hexamerins have been identified in the house fly M. domestica (de Bianchi et al., 1983; Capurro et al., 1997). However, unlike what is observed in other insects, in which both hexamerins are expressed in the larval stage, the M. domestica larval hexamerin (Hex-L) is expressed exclusively in larvae while a second hexamerin, designated female hexamerin (Hex-F), is expressed exclusively during the vitellogenic stages of adult females (Capurro et al., 2000). Native Hex-L and Hex-F are hexamers composed of multiple subunit types encoded by multigenic families (Capurro et al., 2000; Moreira et al., 2003).
We describe here the cloning and sequencing of two M. domestica hexamerin genes, one encoding a Hex-L subunit and the other encoding a Hex-F subunit. Their DNA sequences as well as their conceptual translation products were analyzed and compared to other dipteran hexamerin genes and proteins. Several putative regulatory sequences present at the 5′ untranscribed portion of the genes were identified and functional assays demonstrated that these DNA sequences exhibit promoter function.
Materials and Methods
Insects
Musca domestica of the UCR strain (Mullens, 1985) was obtained from the Department of Entomology, University of California, Riverside. Larvae were fed with a mixture of oat, alfalfa, powdered non-fat milk and yeast (4:2.5:0.5:0.005, w/w/w/w). Adults were fed with 10% (w/v) sucrose, and powdered non-fat milk and sugar (1:1, w/w). Insects were maintained at 22°C, under a 12 h light-dark period.
Nucleic acid isolation and labeling protocols
Genomic DNA was isolated from individual adult flies by gently homogenizing them in 500 µl of TENT buffer (10 mM Tris-HCl, pH 7.4, containing 25 mM EDTA, 10 mM NaCl and 0.5% (v/v) Triton X-100). Samples were centrifuged at 5,000 × g for 5 min at 4°C. Supernatants were discarded and the pellets were resuspended in 500 µl of TENT and centrifuged again at 5,000 × g for 5 min at 4°C. Supernatants were discarded and the pellets were resuspended in 500 µl of TEN buffer (10 mM Tris-HCl, pH 7.4, containing 25 mM EDTA and 10 mM NaCl). Fifty µl of 10% SDS (w/v) and 25 µl of 25 mg/ml proteinase K were added and the samples were incubated at 37°C for 4 h. The DNA was extracted once with an equal volume of phenol, once with an equal volume of phenol, chloroform, isoamyl alcohol (24:24:1, v/v/v), and once with an equal volume of chloroform, isoamyl alcohol (24:1, v/v). The DNA was precipitated by the addition of sodium chloride to a final concentration of 0.3 M and 2.5 volumes of 100% ethanol, and incubated for 16 h at −20°C. The samples were centrifuged at 10,000 × g for 30 minutes at 4°C and the pellets were washed with 70% ethanol. The DNA was dried in a Speed Vac (Savant, www.savec.com/apr01/savant/sav-home.htm) and dissolved in TE buffer. Phage DNA was purified with the Lambda DNA maxi-prep kit (QIAGEN Genomics, www.qiagen.com).
For the screening of the library, 25 ng of double-stranded DNA were labeled by the random primers method (Feinberg and Vogelstein, 1983) with 10 µCi of [32P]-dCTP (6,000 Ci/mmol, Amersham Pharmacia Biotech, www.apbiotech.com), using the Megaprime kit (Amersham Pharmacia Biotech). For the primer extension experiments, 10 pmol of oligonucleotides were incubated with 30 µCi of [32P]-ATP (6,000 Ci/mmol, Amersham Pharmacia Biotech) and 10 U of T4 polynucleotide kinase (Promega, www.promega.com), in the appropriate buffer, for 1 h at 37°C.
Genomic library construction and screening
A genomic library of 4 × 10⁵ recombinant clones was constructed with DNA from the UCR strain houseflies in the FixII/Xho I partial fill-in vector (Stratagene, La Jolla, USA). The library was screened with 32P-labeled cDNAs that encode the larval or adult M. domestica hexamerin subunits. The radiolabeled probes for the screening were either a mixture of 7 MdHexL-encoding cDNAs (Moreira et al., 2003, cDNAs 1-7) or a mixture of 4 MdHexF-encoding cDNAs (Capurro et al., 1997, cDNAs F0, F1, F2 and F3). Growth of phage and transfer to nylon membranes were performed as described by Sambrook et al. (1989). The membranes were prehybridized for 5 min at 65°C in Church buffer (Church and Gilbert, 1984), and hybridized for 16 h at 60°C with 32P-labeled cDNAs. The membranes were washed twice for 15 min at 60°C in 20 mM Na2HPO4 containing 0.05% (v/v) orthophosphoric acid and 10% (w/v) SDS and exposed to X-ray film (Hyperfilm, Amersham Pharmacia Biotech).
DNA sequencing and analysis
DNA was sequenced on both strands by the method of Sanger et al. (1977) using the Thermo Sequenase radiolabeled terminator cycle sequencing kit (Amersham Pharmacia Biotech). Five hundred nanograms of double-stranded phage DNA, 0.225 µCi of [33P]-ddNTP (1,500 Ci/mmol, Amersham Pharmacia Biotech) and 2.5 pmol of custom oligonucleotides were used in each reaction.
Protein and DNA database searches were performed using BLAST (NCBI; Altschul et al., 1997). All alignments were obtained by the Clustal W method using Megalign (DNAstar Inc., www.dnastar.com). The SignalP program (Nielsen et al., 1997) was used to determine the signal peptide cleavage site of the gene products. Searches for glycosylation and phosphorylation sites were conducted at PROSITE (Bairoch et al., 1997).
Primer extension analysis
The primer extension reactions were carried out with the Primer extension system – AMV reverse transcriptase (Promega). Ten µg of total RNA isolated from fat body of third instar larvae at feeding stage, or from fat body of adult females at ovary stage S10 (Adams, 1974), were annealed with 200 fmols of the 32P-end labeled oligonucleotides HexL-ext (5′-TCTTTGGATTTTAGACTTC-3′) or HexF-ext (5′-GAAGCACTTCATCTTGCG-3′), respectively. DNA from the genomic clones R43a1 and R42a1 was used as template in the sequencing reactions with HexL-ext and HexF-ext oligonucleotides, as described above. The primer extension and sequencing reaction products were submitted to electrophoresis through 8% (w/v) polyacrylamide gel, containing 7 M urea, in 1X TBE. The gel was dried and exposed to X-ray film.
DNA constructs
An 878 base pair (bp) fragment carrying the 5′-region of the gene MdHexL1, amplified with the oligonucleotides GFP-L-Bam (5′-CGGGATCCGACTTCTACTTCTCGTATACTTGC-3′) and GFP-L-Eco (5′-GGAATTCCGTACTTATTAATTTGTTGCTG-3′), and a 698 bp fragment carrying the 5′-end of the gene MdHexF1, amplified with the oligonucleotides GFP-F-Bam (5′-CGGGATCCCTTGCGAAATCATACCCAACTACC-3′) and GFP-F-Eco (5′-GGAATTCGGTTGTACTTTTCGAGGGGCAGG-3′), were cloned in the pGreenPelican vector (Barolo et al., 2000). The resulting constructs were named pGreen-HexL and pGreen-HexF, respectively.
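The primer names suggest that each oligonucleotide carries a restriction site for directional cloning. The following minimal Python sketch checks the four primer sequences listed above for the BamHI (GGATCC) and EcoRI (GAATTC) recognition sequences. The mapping of "Bam" to BamHI and "Eco" to EcoRI is an assumption inferred from the names, and the function and variable names are illustrative only.

```python
# Recognition sequences for the enzymes implied by the primer names.
# Assumption: "Bam" = BamHI and "Eco" = EcoRI; the text does not state this.
SITES = {"BamHI": "GGATCC", "EcoRI": "GAATTC"}

# Primer sequences as listed in the text, written 5' to 3'.
PRIMERS = {
    "GFP-L-Bam": "CGGGATCCGACTTCTACTTCTCGTATACTTGC",
    "GFP-L-Eco": "GGAATTCCGTACTTATTAATTTGTTGCTG",
    "GFP-F-Bam": "CGGGATCCCTTGCGAAATCATACCCAACTACC",
    "GFP-F-Eco": "GGAATTCGGTTGTACTTTTCGAGGGGCAGG",
}


def find_sites(primer):
    """Return {enzyme: 0-based offset} for each recognition site found."""
    return {
        enzyme: primer.find(site)
        for enzyme, site in SITES.items()
        if site in primer
    }


for name, seq in PRIMERS.items():
    print(name, len(seq), find_sites(seq))
```

Under these assumptions, each "Bam" primer contains a single BamHI site and each "Eco" primer a single EcoRI site near its 5′ end, which is consistent with directional insertion into the vector.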
Drosophila melanogaster S2 cells culture and transfection
S2 cells from D. melanogaster (Schneider's Drosophila line 2; Schneider, 1972) were maintained at 25°C in Shield and Sang medium (Sigma, St. Louis, USA), with 10% (w/v) fetal bovine serum (Gibco-BRL, Life Technologies, www.lifetech.com). For transformation, 5 × 10⁵ cells/60 mm plate were seeded 1 h before the experiment.
The cells were transformed according to a modified procedure described by Chen and Okayama (1987). Five µg of vector were dissolved in 185 µl of 250 mM CaCl2. The same volume of 2 × HEBS (16 g NaCl, 0.7 g KCl, 0.4 g Na2HPO4, 2 g d-glucose, 10 g HEPES free acid, H2O to 1 l, pH set at 7.1 with NaOH) was added dropwise into the DNA solution. This mixture was added dropwise to the plates containing the cells in a final volume of 5 ml of culture medium, and they were incubated for 30 h at 25°C. The medium was then replaced and the incubation was continued for an additional two or three days. GFP expression was verified by the observation of cells in a Zeiss Axioskop epifluorescence microscope equipped with a 63x oil immersion objective, a Photometrics Sensys cooled CCD camera and Chroma Technology filters. Images were merged and digitally enhanced using PathVysion software (Applied Imaging, www.appliedimagingcorp.com).
Results
The screening of an M. domestica genomic library with Hex-L or Hex-F cDNA probes resulted in the identification of two clones, one of them containing a complete gene that encodes one of the subunits of Hex-L, and another containing a complete gene encoding one of the subunits of Hex-F (Figure 1). The identified Hex-L gene, named MdHexL1 (GenBank AY256680), is composed of two exons of 210 and 2,133 bp, separated by one small 61 bp intron. The conserved GT-AG nucleotide pairs are present at the predicted splice junctions. The transcription start site for this gene was identified by primer extension (Figure 2A). Several oligonucleotides were used to prime the cDNA synthesis; however, every primer located within the coding region of the gene resulted in several products (results not shown). Those results were probably due to the annealing of the primer to several distinct Hex-L mRNA molecules that had conserved sequences within their coding regions. The primer HexL-ext used for the experiment shown in Figure 2 was located in the 5′ UTR of the gene. The identified transcription start site is further supported by the presence of a TATA motif and other basic promoter elements in the surrounding DNA sequence. The translation initiation codon is located at position +63, and is preceded by three purines, typical of eukaryotic translation start sites (Kozak, 1984). A stop codon, TAA, is situated at position +2,467, followed by three polyadenylation signals, AATAAA. Conceptual translation predicts a polypeptide of 781 amino acids, with a putative 18 amino acid secretory signal peptide present at the amino terminal portion of the molecule. The theoretical molecular mass of the secreted Hex-L subunit is 94,834 Da with an isoelectric point of 4.85. The amino acid composition of the encoded polypeptide shows a high content of aromatic residues (11.1% phenylalanine and 13.4% tyrosine).
Putative glycosylation and phosphorylation sites as well as the conserved motifs ADKDFLXKQK (position 27; Gordadze et al., 1999) and TMMRDPMFY (position 480; PROSITE, Bairoch et al., 1997), found in several insect hexamerins, were identified in the Hex-L amino acid sequence. The deduced M. domestica Hex-L protein sequence has 70% identity with C. vicina LSP-1 (GenBank M76480) and 62% identity with the D. melanogaster Lsp-1 subunit (GenBank U63556).
Analysis of the MdHexL1 gene 5′ untranscribed DNA sequence revealed elements characteristic of RNA polymerase II-transcribed promoters and several putative regulatory motifs. A TATA motif is found at position −32, the arthropod transcription initiation motif TCAGC (Cherbas and Cherbas, 1993) is present at the determined capsite, and two putative downstream promoter elements (DPE), AGAAGT (Kutach and Kadonaga, 2000), are present in tandem at position +36. A GATA motif is found at position −77, and two putative ecdysone responsive elements (EcRE) are located at nucleotides −52 and −148. No direct or inverted repeats were identified within the analyzed sequence, except for a palindrome ATTAAAATTTTAAT at position −420. The sequence ATAAATTGGCACCAACAA at position −135, adjacent to one of the putative EcRE, is identical to a motif found in the promoter of the D. melanogaster Lsp-1 gene and the sequence ATCACAACA at position −292 is almost identical, except for one position, to a motif found in the Sarcophaga peregrina hexamerin promoter that was shown to be the binding site of a regulatory DNA-binding protein (Kim et al., 1991).
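The motif observations above can be reproduced with simple string operations. A DNA palindrome, in the sense used here, is a sequence identical to its own reverse complement. The following minimal Python sketch checks this property for the reported ATTAAAATTTTAAT sequence and locates a motif on one strand. The function names and the short test fragment are hypothetical and are not taken from the MdHexL1 sequence.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq):
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


def find_motif(seq, motif):
    """Return 0-based start offsets of motif on the given strand.

    Overlapping occurrences are reported.
    """
    seq, motif = seq.upper(), motif.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


# Palindrome reported at position -420 of MdHexL1.
print(is_palindrome("ATTAAAATTTTAAT"))

# Hypothetical toy fragment containing the TCAGC initiation motif.
print(find_motif("GGTCAGCTT", "TCAGC"))
```

To search both strands, the same function can be applied to `reverse_complement(seq)`, with the offsets converted back to the coordinates of the original strand.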
The cloned Hex-F gene was named MdHexF1 (GenBank AY256681). The MdHexF1 mRNA is encoded by a single exon of 2,094 bp (Figure 1B). Primer extension experiments determined the transcription start site for this gene (Figure 2B). As described for the MdHexL1 gene, multiple DNA fragments were obtained when oligonucleotides located within the coding region of the gene were used as primers for the reactions (results not shown); however, a single start site was defined when an oligonucleotide located at the 5′ untranslated portion of the mRNA was used. The translation initiation codon for this gene is located at position +13, preceded by three purines, and the stop codon, TAA, is at position +2,107, followed by three polyadenylation signals. The gene encodes a polypeptide of 698 amino acids and has a predicted secretory signal peptide of 18 amino acids located at the amino terminal end. The theoretical molecular mass of the secreted product is 79,408 Da and the isoelectric point is 5.09. Analysis of the amino acid composition showed that the protein contains 17% aromatic amino acids (tyrosine plus phenylalanine) and 0.1% methionine. A search of public databases for proteins with amino acid identity to this hypothetical translation product retrieved LSP-2 type hexamerins: C. vicina LSP-2 (65% identity; GenBank U89789) and D. melanogaster LSP-2 (57% identity; GenBank X97770). Putative sites for glycosylation and phosphorylation were identified in the MdHexF1 product, as well as the conserved sequences, ADKFLXKQK and TSLRDPLFY, typical of insect hexamerins, located at positions 27 and 400, respectively.
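The reported coordinates of both genes are internally consistent, which can be verified with a short calculation. The following Python sketch assumes that positions are counted from the transcription start site, that the reported ATG and stop positions refer to the first base of each codon, and that the single MdHexL1 intron lies within the coding region. The function name is illustrative.

```python
def stop_position(atg_pos, n_residues, intron_lengths=()):
    """Position of the first base of the stop codon.

    atg_pos        : position of the A of the initiation codon
    n_residues     : number of amino acids in the conceptual translation
    intron_lengths : lengths of introns lying inside the coding region
    """
    coding_nt = 3 * n_residues  # sense codons only, stop codon excluded
    return atg_pos + coding_nt + sum(intron_lengths)


# MdHexL1: ATG at +63, 781 residues, one 61 bp intron.
print(stop_position(63, 781, [61]))  # 2467

# MdHexF1: ATG at +13, 698 residues, no intron.
print(stop_position(13, 698))  # 2107
```

Both results match the stop codon positions given in the text (+2,467 and +2,107). The exon lengths agree as well: 210 + 2,133 = 2,343 = 3 × 781 for MdHexL1, and 2,094 = 3 × 698 for MdHexF1.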
The DNA sequence located at the 5′-end of the gene (Figure 1B) contains a typical TATA box at position −32, two putative downstream promoter elements (DPE) at positions +17 and +36 and a GATA motif at position −171 (Figure 1B). A single putative EcRE was identified at position −195. A matrix-generating program did not identify any direct or inverted repeated sequences. The comparison of this region with the promoters of other dipteran hexamerins showed that a 15 bp sequence, GTATGATTTCGCAAG, located immediately to the 5′-end of the translation initiation site (ATG), is identical to the sequence found in the C. vicina LSP-2 gene (GenBank U89789). The comparison of the MdHexL and MdHexF clones revealed an identical sequence of 16 nucleotides, AAGCAAAGATTATTTTT, present in their 5′ untranscribed regions, at positions −259 and −156, respectively. One partial and likely inactive mariner-like element was identified between the positions −940 and −739 (GenBank D89934).
The genomic clone containing the complete MdHexF1 gene also contained part of a second HexF gene, named MdHexF2 (GenBank AY258291). The cloned and sequenced portion of the MdHexF2 gene corresponds to its 3′-end and its coding region, 506 bp long, shares 97.7% nucleotide identity with the corresponding region in the MdHexF1 gene.
An in vitro assay was used to test the 5′ untranscribed sequences of the cloned M. domestica hexamerin genes for promoter activity. D. melanogaster S2 cells were transfected with the pGreenPelican reporter plasmid in which a DNA fragment from position +49 to position −830 (878 nt) of the MdHexL1 gene or a DNA fragment extending from position +12 to −687 (698 nt) of the MdHexF1 gene was cloned upstream of the EGFP reporter gene. The pGreenPelican plasmid containing the D. melanogaster actin-5C promoter was used as a positive control, and pGreenPelican without inserted DNA was used as a negative control for the experiments. Following transfection, expression of GFP was intense in the cells containing pGreenPelican with the actin promoter (Figure 3). Expression of GFP was also observed, at a lower level, in the cells containing pGreenPelican with the HexL and HexF promoters. Only background fluorescence was observed in non-transfected cells and in cells transfected with pGreenPelican without insert.
Discussion
The common housefly, M. domestica, is an insect of medical and veterinary importance worldwide. It is the most familiar nuisance pest and can cause human and animal myiasis. Moreover, its biology and ecology make it an ideal mechanical vector for human and animal pathogens, including viruses, bacteria, protozoan cysts and helminth eggs (Sukontason et al., 2000; Graczyk et al., 2001). Because of its public health importance, M. domestica has been the target of many control programs, which involve high financial costs (Lazarus et al., 1989). A better understanding of the mechanisms that regulate gene expression in this organism may lead to the development of alternative strategies to control this insect.
We have studied several of the M. domestica hemolymph proteins and their involvement in insect development and reproduction. Two distinct proteins belonging to the hexamerin family were identified and characterized. Hex-L is expressed exclusively during larval development, while Hex-F is expressed specifically during the adult stage and preferentially during oogenesis in females (Capurro et al., 2000). The mechanisms that control expression of developmentally-regulated genes in holometabolous insects have been the subject of research for many years, mostly because in these systems metamorphosis clearly separates sets of larval and adult specifically-expressed genes. For a better understanding of the mechanisms that regulate the synthesis of the larval specific and adult specific M. domestica hexamerins, genes encoding subunits of Hex-L and Hex-F were characterized.
The conceptual translation products of each cloned gene were compared to other protein sequences deposited in the public databases. The search for identity showed that the gene MdHexL1 is closely related to the C. vicina and D. melanogaster LSP-1 type hexamerins while the genes MdHexF1 and MdHexF2 are similar to C. vicina and D. melanogaster LSP-2 type hexamerins.
Analyses of the predicted secondary structure of the polypeptides encoded by MdHexL1 and MdHexF1 revealed that the positions of several α-helices and β-sheets, and also of the three protein domains described for proteins belonging to the hemocyanin superfamily, are highly conserved in the house fly hexamerins (results not shown). Hemocyanins and hexamerins are members of the same superfamily and it has been estimated that hemocyanins from primitive crustaceans diverged more than 360 million years ago giving rise to the insect hexamerins (Beintema et al., 1994; Burmester and Scheller, 1996; Burmester, 2001). Structural similarities between hemocyanins and hexamerins also were observed for C. vicina LSP-1 (Markl et al., 1992), D. melanogaster LSP-1 and LSP-2 (Massey et al., 1997; Mousseron-Grall et al., 1997) and Aedes aegypti and Anopheles gambiae LSP-1 and LSP-2 (Zakharkin et al., 1997; Gordadze et al., 1999). The motifs ADKDFLXKQK and TMMRDPMFY, conserved among hexamerins and hemocyanins, were identified in the Hex-L and Hex-F deduced amino acid sequences. In arthropod hemocyanins, these motifs were shown to play a structural role (Hazes et al., 1993).
Although hexamerins and hemocyanins are structurally similar, the organization of the genes that encode these proteins is different (reviewed by Markl et al., 1992). For example, genes that encode arachnidan hemocyanins have eight introns that correspond to 96% of the total gene nucleotides, while insect hexamerins have fewer and smaller introns. The genes that encode hexamerins of the Lepidoptera Bombyx mori and Manduca sexta contain four introns each, and these correspond to 60% of the total number of base pairs (Burmester et al., 1998). Dipteran LSP-1 type hexamerin genes, including the M. domestica MdHexL1, have only one intron that corresponds to 2.5% of the gene, while the LSP-2 type genes, including the MdHexF1, have no introns.
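The intron fraction quoted for MdHexL1 follows directly from the exon and intron lengths reported in the Results. The short Python sketch below reproduces the calculation, taking the gene length as the sum of the reported exons and intron. That definition is an assumption, since the text does not state which boundaries were used, and the function name is illustrative.

```python
def intron_percent(exon_lengths, intron_lengths):
    """Percentage of the gene (exons plus introns) made up of introns."""
    total = sum(exon_lengths) + sum(intron_lengths)
    return 100.0 * sum(intron_lengths) / total


# MdHexL1: exons of 210 and 2,133 bp, one 61 bp intron.
print(round(intron_percent([210, 2133], [61]), 1))  # 2.5

# MdHexF1: a single 2,094 bp exon and no intron.
print(intron_percent([2094], []))  # 0.0
```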
The nucleotide sequences at the 5′-end of the translation initiation site of MdHexL1 and MdHexF1 contain typical promoter elements and putative regulatory sequences. The core promoter of the MdHexL1 gene is composed of a TATA motif, the Initiator (Inr) (Cherbas and Cherbas, 1993) and DPE (Kutach and Kadonaga, 2000) sites, while the core promoter of MdHexF1 contains a TATA motif and DPEs, lacking a typical Inr. GATA motifs are found 5′ of the TATA boxes of the M. domestica hexamerin genes. Promoters of several genes expressed specifically in the fat bodies of insects, including those encoding the D. melanogaster and Aedes atropalpus hexamerins, contain binding sites for the GATA factors (Abel et al., 1993; Petersen et al., 1999; Attardo et al., 2003; Delaney et al., 1986; Benes et al., 1996; Zakharkin et al., 2001). Putative EcREs also were found in the promoters of the MdHexL1 and MdHexF1 genes. In insects, 20-hydroxyecdysone is involved in several physiological processes including molting, metamorphosis and reproduction. Ecdysone signaling is mediated by specific nuclear receptors that are able to bind the target DNA sequences. This nuclear receptor is a dimer composed of an ecdysone receptor (EcR) (Koelle et al., 1991) and the USP protein (Thomas et al., 1993; Yao et al., 1992, 1993), which binds to ecdysone-responsive elements in the promoter of responsive genes.
The mechanisms that control the expression of the hexamerin genes are not well understood, but there is evidence that ecdysone and juvenile hormone play some role. In C. vicina, differences in the titer of ecdysone and juvenile hormone were correlated temporally with the activation and repression of LSP synthesis (Scheller et al., 1990; Fischer and Scheller, 1992). In D. melanogaster, LSP synthesis is regulated by ecdysone (Powel et al., 1984). A functional EcRE was identified in the LSP-2 encoding gene and it was shown that it is responsible, together with other promoter elements, for the expression in both larvae and adults (Benes et al., 1996). In Lepidoptera, these hormones also influence hexamerin synthesis but no EcRE has been identified in their promoters (Webb and Riddiford, 1988; Jones et al., 1990, 1993; Memmel et al., 1994). The action of juvenile hormone on the modulation of gene expression is unknown, however there are suggestions of a juvenile hormone nuclear receptor, which would bind to a specific DNA sequence (Zhang et al., 1996; Davey, 2000).
Besides the DNA sequences described above, some other sequences present in the 5′-UTR of the M. domestica hexamerin genes deserve attention. MdHexL1 has an 18 bp sequence identical to one found in the D. melanogaster LSP-1 promoter and another sequence similar to a regulatory site mapped in the S. peregrina hexamerin promoter, while in the gene MdHexF1, a 15 bp sequence identical to one present in the C. vicina LSP-2 promoter was found. There is also a 16 bp sequence found in both M. domestica hexamerin genes. Although no function has been described for these sequences, their conservation suggests that they may have some regulatory activity.
A partial inactive mariner element was identified in the 5′-UTR of MdHexF1, 900 bp to the 5′-end of the ATG. Being located at about 1 kb from the 5′-end of the gene's translation start site, it probably does not affect gene expression. Inactive mariner-like transposable elements are widely distributed in arthropods and thousands of copies can be found in the genome of insects. Most of these copies contain mutations and/or deletions and represent transposition events that were fixed during evolution (Atkinson and James, 2002).
The 5′-end untranscribed sequences of MdHexL1 and MdHexF1 were able to drive the expression of GFP, defining them as functional promoters. While transfection into cultured cells does not assay for tissue- or stage-specific control, it indicated that the cloned hexamerin promoters are capable of inducing constitutive expression of a reporter gene. Similar assays in vitro were successfully used for other genes, including the studies of mosquito salivary gland specifically-expressed genes (Coates et al., 1999). The mariner and piggyBac transposable elements were successfully used for stable genetic transformation of the housefly M. domestica (Yoshiyama et al., 2000; Hediger et al., 2001) and this technique will allow further functional analysis of the cloned promoters.
Acknowledgments
The authors thank Lucy Cherbas for the Actin-GFP plasmid, Jim Posakony for the Green Pelican P-element vector and Lynn Olson for help in typing the manuscript. A.G.de B. and M.de L.C. are research fellows from Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq). This work was financially supported by grants from FAPESP and NIH USA (AI29746).
References
Figure 1A. Nucleotide and amino acid sequences of the MdHexL1 hexamerin gene of Musca domestica. Schematic representation of the genomic clone containing the MdHexL1 gene, with the coding region indicated by a black box (Top). Complete nucleotide sequence of the MdHexL1 gene coding region and putative promoter (Bottom). The underlined sequences in the promoter region indicate the ecdysone responsive elements in green, the TATA motif in blue, and the transcription start site in red. The two conserved hexamerin amino acid motifs are underlined in purple. The intron sequence is shown in lower-case letters. Polyadenylation signals are underlined in brown. GenBank accession number: MdHexL1=AY256680.

Figure 1B. Nucleotide and amino acid sequences of the MdHexF1 hexamerin gene of Musca domestica. Schematic representation of the genomic clone containing the MdHexF genes (Top). Complete nucleotide sequence of the MdHexF1 gene coding region and putative promoter (Bottom). Promoter elements, polyadenylation signals and conserved amino acid sequences are indicated as for (A). The partial mariner element is between positions −940 and −739. GenBank accession numbers: MdHexF1=AY256681 and MdHexF2= AY258291.

Figure 2.
Primer extension identification of transcription start sites of the MdHexL1 (A) and MdHexF1 (B) genes of Musca domestica. Lanes A, C, G and T correspond to the DNA sequence reactions in which the dideoxynucleotides A, C, G and T were used, respectively, and lanes R indicate the primer extension reactions (note that the nucleotide sequences, 5′ at the top and 3′ at the bottom, represent the complementary DNA strand, so that they may be read directly as in Figures 1A and 1B). The nucleotides indicated in bold correspond to the transcription start site.

Figure 3.
In vitro assays for promoter activity in the cloned 5′-untranscribed sequences of two Musca domestica hexamerin genes. S2 cells were transfected with the reporter plasmids and GFP expression was monitored by observation of the cultures in an inverted microscope under UV light. Transfections were conducted with 1) pGreen-HexL: MdHexL1 promoter cloned into pGreenPelican; 2) pGreen-HexF: MdHexF1 promoter cloned into pGreenPelican; 3) pGreenPelican vector without insert; 4) pGreen-Actin-5C: Drosophila melanogaster actin-5C promoter cloned into pGreenPelican. A and B indicate different magnifications of the transfected cells.
