Nonmethylated transposable elements and methylated genes in a chordate genome. (1/590)

The genome of the invertebrate chordate Ciona intestinalis was found to be a stable mosaic of methylated and nonmethylated domains. Multiple copies of an apparently active long terminal repeat retrotransposon and a long interspersed element are nonmethylated and a large fraction of abundant short interspersed elements are also methylation free. Genes, by contrast, are predominantly methylated. These data are incompatible with the genome defense model, which proposes that DNA methylation in animals is primarily targeted to endogenous transposable elements. Cytosine methylation in this urochordate may be preferentially directed to genes.

Exon shuffling by L1 retrotransposition. (2/590)

Long interspersed nuclear elements (LINE-1s or L1s) are the most abundant retrotransposons in the human genome, and they serve as major sources of reverse transcriptase activity. Engineered L1s retrotranspose at high frequency in cultured human cells. Here it is shown that L1s insert into transcribed genes and retrotranspose sequences derived from their 3' flanks to new genomic locations. Thus, retrotransposition-competent L1s provide a vehicle to mobilize non-L1 sequences, such as exons or promoters, into existing genes and may represent a general mechanism for the evolution of new genes.

The age and evolution of non-LTR retrotransposable elements. (3/590)

A comprehensive phylogenetic analysis was conducted of non-long-terminal-repeat (non-LTR) retrotransposons based on an extended sequence alignment of their reverse transcriptase (RT) domain. The 440 amino acid positions used included a region proposed to be similar to the "thumb" of the right-handed RT structure found in retroviruses. All identified non-LTR elements could be grouped into 11 distinct clades. Using the rates of sequence change derived from studies of the vertical inheritance of R1 and R2 elements in arthropods as a comparison, we found no evidence for the horizontal transmission of non-LTR elements. Assuming vertical descent, the phylogeny suggested that non-LTR elements are as old as eukaryotes, with each of the 11 clades dating back to the Precambrian era. The analysis enabled us to propose a simple chronology for the acquisition of different enzymatic domains in the evolution of the non-LTR class of retrotransposons. The first non-LTR elements were sequence specific by virtue of a restriction-enzyme-like endonuclease located downstream of the RT domain. Evolving from this original group were elements (eight clades) that acquired an apurinic-apyrimidinic (AP) endonuclease-like domain upstream of the RT domain. Finally, four of these clades have inherited an RNase H domain downstream of the RT domain. The phylogenies of the AP endonuclease and RNase H domains were also determined for this report and are consistent with the monophyletic acquisition of these domains. These studies represent the most comprehensive effort to date to trace the evolution of a major class of transposable elements.

Retropositional parasitism of SINEs on LINEs: identification of SINEs and LINEs in elasmobranchs. (4/590)

Some previously unidentified short interspersed repetitive elements (SINEs) and long interspersed repetitive elements (LINEs) were isolated from various higher elasmobranchs (sharks, skates, and rays) and characterized. These SINEs, members of the HE1 SINE family, were tRNA-derived and were widespread in higher elasmobranchs. The 3'-tail region of this SINE family was strongly conserved among elasmobranchs. The LINEs, members of the HER1 LINE family, encoded an amino acid sequence similar to that encoded by the chicken CR1 LINE family, and they contained a strongly conserved 3'-tail region in the 3' untranslated region. This tail region of the HER1 LINE family was almost identical to that of the HE1 SINE family. Thus, the HE1 SINE family and the HER1 LINE family provide a clear example of a pair of SINEs and LINEs that share the same tail region. Conservation of the secondary structures of the tail regions, as well as of the nucleotide sequences, between the HE1 SINE family and HER1 LINE family during evolution suggests that SINEs utilize the enzymatic machinery for retroposition of LINEs through the recognition of higher-order structures of the conserved 3'-tail region. A discussion is presented of the parasitism of SINEs on LINEs during the evolution of these retroposons.

Significant differences in the frequency of transcriptional units, types and numbers of repetitive elements, GC content, and the number of CpG islands between a 1010-kb G-band genomic segment on chromosome 9q31.3 and a 1200-kb R-band genomic segment on chromosome 3p21.3. (5/590)

We determined the nucleotide sequence of the entire 1,010,525-bp insert contained in CEPH YAC clone 867e8. This human genomic segment was derived from chromosome 9q31.3 and corresponds to a G-band region. We compared this segment, in terms of structure, with a previously characterized 1,201,033-bp sequence in CEPH YAC936c1 that had come from a portion of human chromosome 3p21.3 corresponding to an R-band region. The two segments were significantly different with respect to the frequency of transcriptional units, the types and numbers of repetitive elements present, their GC content, and the number of CpG islands. Alu elements, GC content, and CpG islands all showed positive correlations with the abundance of exons, but the distribution of LINE1s did not. These observations might reflect an influence of the first three of these features on the functions or expression of genes in the respective regions. In addition to a novel gene (F36) lying at the centromeric end of the 9q segment, we found a cluster of placenta-specific genes within a small section (about 400 kb) on the telomeric side of YAC867e8. This cluster consisted of four apparently unrelated ESTs and two genes, pregnancy-associated plasma protein-A (PAPP-A) and a novel gene (tentatively named EST-YD1). Our characterization of the two chromosomal regions provided evidence that genes are not evenly distributed throughout the human genome, and that gene richness is correlated with the GC content and with the frequency of either Alu elements or CpG islands.
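
As a rough illustration of two of the sequence features compared in this abstract, the sketch below computes GC content and the observed/expected CpG ratio for a DNA string. The abstract does not say how GC content or CpG islands were actually measured, so the function names and the formulas used here (a plain G+C fraction and the commonly used observed/expected CpG ratio) are assumptions chosen for illustration, not the authors' method.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0  # no unambiguous bases to count
    return (seq.count("G") + seq.count("C")) / counted


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG dinucleotides * length) / (#C * #G)."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0  # ratio is undefined without both C and G; report 0
    return seq.count("CG") * len(seq) / (c_count * g_count)


if __name__ == "__main__":
    window = "ATGCGCGATATCGCGTTA"  # hypothetical sequence window
    print(f"GC content: {gc_content(window):.2f}")
    print(f"CpG obs/exp: {cpg_obs_exp(window):.2f}")
```

In practice such values would be computed over sliding windows of a long sequence and compared between regions; the window size and any island thresholds are further choices the abstract does not specify.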

A double-strand break in a chromosomal LINE element can be repaired by gene conversion with various endogenous LINE elements in mouse cells. (6/590)

A double-strand break (DSB) in the mammalian genome has been shown to be a very potent signal for the cell to activate repair processes. Two different types of repair have been identified in mammalian cells. Broken ends can be rejoined with or without loss or addition of DNA or, alternatively, a homologous template can be used to repair the break. For most genomic sequences the latter event would involve allelic sequences present on the sister chromatid or homologous chromosome. However, since more than 30% of our genome consists of repetitive sequences, these would have the option of using nonallelic sequences for homologous repair. This could have an impact on the evolution of these sequences and of the genome itself. We have designed an assay to look at the repair of DSBs in LINE-1 (L1) elements, which number 10^5 copies distributed throughout the genome of all mammals. We introduced into the genome of mouse epithelial cells an L1 element with an I-SceI endonuclease site. We induced DSBs at the I-SceI site and determined their mechanism of repair. We found that in over 95% of cases, the DSBs were repaired by an end-joining process. However, in almost 1% of cases, we found strong evidence for repair involving gene conversion with various endogenous L1 elements, with some being used preferentially. In particular, the T(F) family and the L1Md-A2 subfamily, which are the most active in retrotransposition, appeared to be contributing the most in this process. The degree of homology did not seem to be a determining factor in the selection of the endogenous elements used for repair; selection may instead be based on accessibility. Considering their abundance and dispersion, gene conversion between repetitive elements may be occurring frequently enough to be playing a role in their evolution.

Structural and functional analysis of the promoter of a mouse gene encoding an androgen-regulated protein (MSVSP99). (7/590)

MSVSP99 (mouse seminal vesicle secretory protein of 99 amino acids) is a member of the rat and mouse seminal vesicle secretory protein family. The gene encoding MSVSP99 is under androgenic control and we demonstrate here that this regulation involves a complex interplay of positive and negative regions. First, we show that the promoter region (-387/+16) sufficient to mediate a full androgen induction is a complex enhancer organized in two regulatory regions. These two regions are inactive individually and must act together to confer a 40-fold androgen induction to the MSVSP99 gene. Androgen responsiveness is not only dependent on the presence of functional androgen response element (ARE) sequences but also results from complex cooperation between ARE and non-ARE sequences forming an androgen response unit. Second, we characterized a new regulatory region (-824/-632) that decreases androgen-dependent transcriptional activity of the MSVSP99 promoter. This region, also able to repress the transcriptional activity of the heterologous thymidine kinase promoter, contains a functional promoter on the inverted strand (-826 to -387) and we identified a transcription initiation site located at position -639 with respect to the cap site of the MSVSP99 promoter. Sequence analysis of the flanking DNA also revealed that the MSVSP99 gene is surrounded by long interspersed repeated sequences called LINEs.

MosquI, a novel family of mosquito retrotransposons distantly related to the Drosophila I factors, may consist of elements of more than one origin. (8/590)

A novel family of non-long-terminal-repeat (non-LTR) retrotransposons, named MosquI, was discovered in the yellow fever mosquito, Aedes aegypti. There were approximately 14 copies of MosquI in the A. aegypti genome. Four of the five analyzed MosquI elements were truncated at the 5' ends while one of them, MosquI-Aa2, was full-length. All five MosquI elements ended with 4-10 TAA tandem repeats, as the Drosophila I factors do. Interestingly, MosquI elements were often found near genes and other repetitive elements. The 6,623-bp MosquI-Aa2 contained two open reading frames (ORFs) flanked by a 404-bp 5' untranslated region and a 326-bp 3' untranslated region. The two ORFs code for nucleocapsid, endonuclease, reverse transcriptase, and RNase H domains. Although overall structural and sequence comparisons suggest that MosquI is highly similar to the Drosophila I factors, phylogenetic analysis based on the reverse transcriptase domains of 40 non-LTR retrotransposons indicates that MosquI and I factors are likely paralogous elements which may have been separated before the split between the ancestors of Mollusca and Arthropoda. Pairwise comparisons between the four truncated MosquI elements showed 96.7%-99.5% identity at the nucleotide level, while comparisons between the full-length MosquI-Aa2 and the truncated copies showed only 80.2%-81.8% identity. These comparisons and preliminary phylogenetic analyses suggest that the full-length and truncated MosquI elements may belong to two subfamilies originating from two source genes that diverged a long time ago. In contrast to the defective I factors in Drosophila melanogaster, which are likely very old components of the genome, the truncated MosquI elements seem to have been recently active. Finally, the genomic distribution and evolution of MosquI elements are analyzed in the context of other non-LTR retrotransposons in A. aegypti.
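
The pairwise nucleotide identities quoted in this abstract can be illustrated with a small sketch that scores two already-aligned sequences. The abstract does not describe how the alignments were made or how gaps were treated, so the function name `percent_identity`, the choice to skip gapped columns, and the example sequences are all assumptions for illustration only.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent of identical bases over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are excluded from the comparison
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0  # nothing comparable
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, not real MosquI sequence.
    copy_1 = "ACGTTAGC-TAA"
    copy_2 = "ACGTCAGCGTAA"
    print(f"{percent_identity(copy_1, copy_2):.1f}% identity")
```

Counting gaps as mismatches instead would lower the reported identity, which is one reason figures from different studies are not always directly comparable.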