Exon shuffling by L1 retrotransposition. (9/8470)

Long interspersed nuclear elements (LINE-1s or L1s) are the most abundant retrotransposons in the human genome, and they serve as major sources of reverse transcriptase activity. Engineered L1s retrotranspose at high frequency in cultured human cells. Here it is shown that L1s insert into transcribed genes and retrotranspose sequences derived from their 3' flanks to new genomic locations. Thus, retrotransposition-competent L1s provide a vehicle to mobilize non-L1 sequences, such as exons or promoters, into existing genes and may represent a general mechanism for the evolution of new genes.

Genome-wide screen for systemic lupus erythematosus susceptibility genes in multiplex families. (10/8470)

Systemic lupus erythematosus (SLE) is the prototype of human autoimmune diseases. Its genetic component has been suggested by familial aggregation (λs = 20) and twin studies. We have screened the human genome to localize genetic intervals that may contain lupus susceptibility loci in a sample of 188 lupus patients belonging to 80 lupus families with two or more affected relatives per family, using the ABI Prism linkage mapping set, which includes 350 polymorphic markers with an average spacing of 12 cM. Non-parametric multipoint linkage analysis suggests evidence for predisposing loci on chromosomes 1 and 18. However, no single locus with overwhelming evidence for linkage was found, suggesting that there are no 'major' susceptibility genes segregating in families with SLE, and that the genetic etiology is more likely to result from the action of several genes of moderate effect. Furthermore, the support for a gene in the 1q44 region as well as in the 1p36 region is clearly found only in the Mexican American families with SLE but not in families of Caucasian ethnicity, suggesting that consideration of each ethnic group separately is crucial.

Genetic mapping of a maternal locus responsible for familial hydatidiform moles. (11/8470)

Hydatidiform mole (HM) is the product of an aberrant human pregnancy in which there is abnormal embryonic development and proliferation of placental villi. The incidence of HM varies between ethnic groups; in the USA, it occurs in 1 in every 1500 pregnancies. All HM cases are sporadic, except for extremely rare familial cases. The exact mechanisms leading to molar pregnancies are unknown. We previously postulated that women with recurrent hydatidiform moles are homozygous for an autosomal recessive defective gene. To map this gene genetically, we initiated a genome-wide scan with highly polymorphic short tandem repeats in individuals from two families with recurrent HM. Here, we demonstrate that a defective maternal gene is responsible for recurrent HM. This gene resides on chromosome 19q13.3-13.4 in a 15.2 cM interval flanked by D19S924 and D19S890. The identification of a gene for HM adds new insights into the molecular genetics of early embryogenesis and may be relevant to the large number of patients with sporadic HM.

A genome search identifies major quantitative trait loci on human chromosomes 3 and 4 that influence cholesterol concentrations in small LDL particles. (12/8470)

Small, dense LDL particles are associated with increased risk of cardiovascular disease. To identify the genes that influence LDL size variation, we performed a genome-wide screen for cholesterol concentrations in 4 LDL size fractions. Samples from 470 members of randomly ascertained families were typed for 331 microsatellite markers spaced at approximately 15 cM intervals. Plasma LDLs were resolved by using nondenaturing gradient gel electrophoresis into 4 fraction sizes (LDL-1, 26.4 to 29.0 nm; LDL-2, 25.5 to 26.4 nm; LDL-3, 24.2 to 25.5 nm; and LDL-4, 21.0 to 24.2 nm) and cholesterol concentrations were estimated by staining with Sudan Black B. Linkage analyses used variance component methods that exploited all of the genotypic and phenotypic information in the large extended pedigrees. In multipoint linkage analyses with quantitative trait loci for the 4 fraction sizes, only LDL-3, a fraction containing small LDL particles, gave peak multipoint log10 odds in favor of linkage (LOD) scores that exceeded 3.0, a nominal criterion for evidence of significant linkage. The highest LOD scores for LDL-3 were found on chromosomes 3 (LOD=4.1), 4 (LOD=4.1), and 6 (LOD=2.9). In oligogenic analyses, the 2-locus LOD score (for chromosomes 3 and 4) increased significantly (P=0.0012) to 6.1, but including the third locus on chromosome 6 did not significantly improve the LOD score (P=0.064). Thus, we have localized 2 major quantitative trait loci that influence variation in cholesterol concentrations of small LDL particles. The 2 quantitative trait loci on chromosomes 3 and 4 are located in regions that contain the genes for apoD and the large subunit of the microsomal triglyceride transfer protein, respectively.
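
For readers unfamiliar with the scale, a LOD score is a base-10 logarithm of the odds in favor of linkage, so the nominal threshold of 3.0 corresponds to odds of 1000 to 1. The short Python sketch below only illustrates that conversion for the scores quoted in the abstract. The function names are hypothetical, and the sketch does not reproduce the variance component analysis used in the study.

```python
import math


def lod_to_odds(lod):
    """Convert a LOD score (log10 odds) into odds in favor of linkage."""
    return 10.0 ** lod


def lod_from_log_likelihoods(lnl_linked, lnl_null):
    """LOD from the natural-log likelihoods of the linkage and no-linkage models."""
    return (lnl_linked - lnl_null) / math.log(10)


if __name__ == "__main__":
    # LOD scores quoted in the abstract (chromosome 6, threshold, chromosomes 3/4, 2-locus).
    for lod in (2.9, 3.0, 4.1, 6.1):
        print(f"LOD {lod:.1f} -> odds of about {lod_to_odds(lod):,.0f} to 1")
```

Because the scale is logarithmic, the gap between the chromosome 6 peak (2.9) and the chromosome 3 and 4 peaks (4.1) is more than a tenfold difference in odds.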

Analysis of sequence-tagged-connector strategies for DNA sequencing. (13/8470)

The BAC-end sequencing, or sequence-tagged-connector (STC), approach to genome sequencing involves sequencing the ends of BAC inserts to scatter sequence tags (STCs) randomly across the genome. Once any BAC or other large segment of DNA is sequenced to completion by conventional shotgun approaches, these STC tags can be used to identify a minimum tiling path of BAC clones overlapping the nucleation sequence for sequence extension. Here, we explore the properties of STC-sequencing strategies within a mathematical model of a random target with homologous repeats and imperfect sequencing technology to understand the consequences of varying the model parameters on the incidence of problem clones and the cost of the sequencing project. Problem clones are defined as clones for which either (A) there is no identifiable overlapping STC to extend the sequence in a particular direction or (B) the identified STC with minimum overlap comes from a nonoverlapping clone, either owing to random false matches or repeat-family homology. Based on the minimum overlap, we estimate the number of clones to be entirely sequenced and, then, using cost estimates, identify the decision rule (the degree of sequence similarity required before a match is declared between an STC and a clone) to minimize overall sequencing cost. A method to optimize the overlap decision rule is highly desirable, because both the total cost and the number of problem clones are shown to be highly sensitive to this choice. For a target of 3 Gb containing approximately 800 Mb of repeats with 85%-90% identity, we expect <10 problem clones with 15 times coverage by 150-kb clones. We derive the optimal redundancy and insert sizes of clone libraries for sequencing genomes of various sizes, from microbial to human. We estimate that establishing the resource of STCs as a means of identifying minimally overlapping clones represents only 1%-3% of the total cost of sequencing the human genome, and, up to a point of diminishing returns, a larger STC resource is associated with a smaller total sequencing cost.
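
The scale of the library described above can be checked with simple arithmetic. The Python sketch below assumes that each clone contributes exactly two end sequences and that STCs are spread evenly over the target. The function name is hypothetical, and none of the abstract's repeat or error modelling is included.

```python
def stc_resource(genome_bp, coverage, insert_bp):
    """Return (clones, STCs, mean STC spacing in bp) for a clone library.

    Assumes two end reads per clone and an even spread over the target;
    this is back-of-the-envelope arithmetic, not the paper's model.
    """
    clones = round(coverage * genome_bp / insert_bp)
    stcs = 2 * clones
    spacing_bp = genome_bp / stcs
    return clones, stcs, spacing_bp


if __name__ == "__main__":
    # Figures from the abstract: 3 Gb target, 15 times coverage, 150-kb clones.
    clones, stcs, spacing = stc_resource(3_000_000_000, 15, 150_000)
    print(f"{clones:,} clones, {stcs:,} STCs, one STC every {spacing:,.0f} bp on average")
```

Under these assumptions, the 3 Gb example corresponds to 300,000 clones and 600,000 STCs, or roughly one STC every 5 kb.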

CORE-SINEs: eukaryotic short interspersed retroposing elements with common sequence motifs. (14/8470)

A 65-bp "core" sequence is dispersed in hundreds of thousands of copies in the human genome. This sequence was found to constitute the central segment of a group of short interspersed elements (SINEs), referred to as mammalian-wide interspersed repeats, that proliferated before the radiation of placental mammals. Here, we propose that the core identifies an ancient tRNA-like SINE element, which survived in different lineages such as mammals, reptiles, birds, and fish, as well as mollusks, presumably for >550 million years. This element gave rise to a number of sequence families (CORE-SINEs), including mammalian-wide interspersed repeats, whose distinct 3' ends are shared with different families of long interspersed elements (LINEs). The evolutionary success of the generic CORE-SINE element can be related to the recruitment of the internal promoter from highly transcribed host RNA as well as to its capacity to adapt to changing retropositional opportunities by sequence exchange with actively amplifying LINEs. It reinforces the notion that the very existence of SINEs depends on the cohabitation with both LINEs and the host genome.

Periodical distribution of transcription factor sites in promoter regions and connection with chromatin structure. (15/8470)

Nucleosomes regulate transcriptional initiation when positioned in the promoter area. This may require the transcription factor (TF) sites to be correlated with the nucleosome positions and phased on the nucleosome surface. If this is the case, one would expect a periodical distribution of TF sites in the vicinity of promoters, with the nucleosomal period of 10.1-10.5 bp. We examined the distributions of putative binding sites of 323 different TFs along 1,057 sequences of the Eukaryotic Promoter Database (release 50) [Cavin Perier, R., Junier, T. & Bucher, P. (1998) Nucleic Acids Res. 26, 353-357] and of 218 TFs on 673 sequences of the Lead Exon Database of human promoter sequences. We obtained a statistically significant overrepresentation of TF sites distributed with the main period of 10.1-10.5 bp in the region -50 to +120 around the transcription start site and in a few locations nearby. Correlation of the positioning of the TF sites with the nucleosomes is further reinforced by sequence-directed mapping of the nucleosomes, a method previously developed.
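
One generic way to test a set of site positions for a given period is a Rayleigh-type score: map each position to a phase on a circle of that period and measure how strongly the phases cluster. The Python sketch below illustrates this on synthetic positions invented for the example. The abstract does not state which statistic the authors used, so this is an assumed stand-in rather than their method, and the function name and data are hypothetical.

```python
import cmath
import math


def periodicity_score(positions, period):
    """Rayleigh-type power of positions at a trial period.

    The score approaches len(positions) when all sites share one phase
    and stays near 1 when phases are spread evenly.
    """
    resultant = sum(cmath.exp(2j * math.pi * x / period) for x in positions)
    return abs(resultant) ** 2 / len(positions)


# Synthetic, illustrative data only: 17 sites spaced about 10.3 bp apart,
# rounded to whole base pairs, spanning roughly -50 to +115.
synthetic_sites = [round(-50 + 10.3 * k) for k in range(17)]

if __name__ == "__main__":
    for period in (7.0, 10.3, 13.0):
        score = periodicity_score(synthetic_sites, period)
        print(f"period {period:4.1f} bp: score {score:.2f}")
```

On real data, such a score would need to be compared against a suitable null distribution, for example from shuffled positions, before any claim of significance.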

Comparative genomics and host resistance against infectious diseases. (16/8470)

The large size and complexity of the human genome have limited the identification and functional characterization of components of the innate immune system that play a critical role in front-line defense against invading microorganisms. However, advances in genome analysis (including the development of comprehensive sets of informative genetic markers, improved physical mapping methods, and novel techniques for transcript identification) have reduced the obstacles to discovery of novel host resistance genes. Study of the genomic organization and content of widely divergent vertebrate species has shown a remarkable degree of evolutionary conservation and enables meaningful cross-species comparison and analysis of newly discovered genes. Application of comparative genomics to host resistance will rapidly expand our understanding of human immune defense by facilitating the translation of knowledge acquired through the study of model organisms. We review the rationale and resources for comparative genomic analysis and describe three examples of host resistance genes successfully identified by this approach.