Genome-wide bioinformatic and molecular analysis of introns in Saccharomyces cerevisiae. (1/3175)

Introns have typically been discovered in an ad hoc fashion: introns are found as a gene is characterized for other reasons. As complete eukaryotic genome sequences become available, better methods for predicting RNA processing signals in raw sequence will be necessary in order to discover genes and predict their expression. Here we present a catalog of 228 yeast introns, arrived at through a combination of bioinformatic and molecular analysis. Introns annotated in the Saccharomyces Genome Database (SGD) were evaluated, questionable introns were removed after failing a test for splicing in vivo, and known introns absent from the SGD annotation were added. A novel branchpoint sequence, AAUUAAC, was identified within an annotated intron that lacks a six-of-seven match to the highly conserved branchpoint consensus UACUAAC. Analysis of the database corroborates many conclusions about pre-mRNA substrate requirements for splicing derived from experimental studies, but indicates that splicing in yeast may not be as rigidly determined by splice-site conservation as had previously been thought. Using this database and a molecular technique that directly displays the lariat intron products of spliced transcripts (intron display), we suggest that the current set of 228 introns is still not complete, and that additional intron-containing genes remain to be discovered in yeast. The database can be accessed at http://www.cse.ucsc.edu/research/compbio/yeast_introns.html.
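
The six-of-seven criterion mentioned in this abstract can be illustrated with a short sketch. It is not the authors' pipeline: the function names `consensus_matches` and `find_branchpoint_candidates`, the default threshold, and the toy sequence used below are assumptions introduced here for illustration only.

```python
CONSENSUS = "UACUAAC"  # branchpoint consensus quoted in the abstract


def consensus_matches(heptamer, consensus=CONSENSUS):
    """Count the positions at which a 7-mer agrees with the consensus."""
    if len(heptamer) != len(consensus):
        raise ValueError("heptamer and consensus must have equal length")
    return sum(a == b for a, b in zip(heptamer, consensus))


def find_branchpoint_candidates(rna, min_matches=6, consensus=CONSENSUS):
    """Return (offset, window, matches) for each window meeting the threshold.

    The threshold of 6 mirrors the "six-of-seven" wording; it is an
    illustrative default, not a value taken from the authors' software.
    """
    k = len(consensus)
    hits = []
    for i in range(len(rna) - k + 1):
        window = rna[i:i + k]
        matches = consensus_matches(window, consensus)
        if matches >= min_matches:
            hits.append((i, window, matches))
    return hits


# Hypothetical toy sequence containing one perfect consensus match.
toy_hits = find_branchpoint_candidates("GGUACUAACGG")
```

Under this scoring the novel sequence AAUUAAC agrees with the consensus at only five of seven positions, which is consistent with the statement that the intron lacks a six-of-seven match and would be missed by a scan of this kind.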

Economic consequences of the progression of rheumatoid arthritis in Sweden. (2/3175)

OBJECTIVE: To develop a simulation model for analysis of the cost-effectiveness of treatments that affect the progression of rheumatoid arthritis (RA). METHODS: The Markov model was developed on the basis of a Swedish cohort of 116 patients with early RA who were followed up for 5 years. The majority of patients had American College of Rheumatology (ACR) functional class II disease, and Markov states indicating disease severity were defined based on Health Assessment Questionnaire (HAQ) scores. Costs were calculated from data on resource utilization and patients' work capacity. Utilities (preference weights for health states) were assessed using the EQ-5D (EuroQol) questionnaire. Hypothetical treatment interventions were simulated to illustrate the model. RESULTS: The cohort distribution among the 6 Markov states clearly showed the progression of the disease over 5 years of follow-up. Costs increased with increasing severity of the Markov states, and total costs over 5 years were higher for patients who were in more severe Markov states at diagnosis. Utilities correlated well with the Markov states, and the EQ-5D was able to discriminate between patients with different HAQ scores within ACR functional class II. CONCLUSION: The Markov model was able to assess disease progression and costs in RA. The model can therefore be a useful tool in calculating the cost-effectiveness of different interventions aimed at changing the progression of the disease.
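
A Markov cohort model of the general kind described in this abstract can be sketched as follows. Only the number of states (6) and the 5-year horizon come from the abstract. The transition probabilities, costs, utilities, annual cycle length, and the names `progressive_matrix` and `run_cohort` are all hypothetical, and the published model's actual structure is not given in the abstract. Discounting is omitted for brevity.

```python
def progressive_matrix(n_states, p_progress):
    """Hypothetical transition matrix: stay put or worsen by one state.

    The last state is absorbing. This structure is an assumption made
    for illustration, not the matrix estimated from the Swedish cohort.
    """
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        matrix[i][i] = 1.0 - p_progress
        matrix[i][i + 1] = p_progress
    matrix[-1][-1] = 1.0
    return matrix


def run_cohort(start, transition, cost, utility, cycles):
    """Propagate a cohort; return (final distribution, total cost, total utility)."""
    dist = list(start)
    n = len(dist)
    total_cost = 0.0
    total_utility = 0.0
    for _ in range(cycles):
        # Accrue cost and utility for the cohort's current distribution.
        total_cost += sum(p * c for p, c in zip(dist, cost))
        total_utility += sum(p * u for p, u in zip(dist, utility))
        # Advance one cycle: new[j] = sum over i of dist[i] * P[i][j].
        dist = [sum(dist[i] * transition[i][j] for i in range(n))
                for j in range(n)]
    return dist, total_cost, total_utility


# Hypothetical inputs: 6 severity states, 5 annual cycles.
N_STATES, CYCLES = 6, 5
start = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]           # whole cohort in mildest state
cost = [1000.0 * (i + 1) for i in range(N_STATES)]  # made-up cost per cycle
utility = [0.9 - 0.1 * i for i in range(N_STATES)]  # made-up preference weights

baseline = run_cohort(start, progressive_matrix(N_STATES, 0.2), cost, utility, CYCLES)
treated = run_cohort(start, progressive_matrix(N_STATES, 0.1), cost, utility, CYCLES)
```

Comparing `baseline` with `treated` mimics the abstract's simulation of a hypothetical intervention that slows progression: the difference in total cost divided by the difference in total utility would give an incremental cost-effectiveness ratio under these invented inputs.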

Multipoint oligogenic analysis of age-at-onset data with applications to Alzheimer disease pedigrees. (3/3175)

It is usually difficult to localize genes that cause diseases with late ages at onset. These diseases frequently exhibit complex modes of inheritance, and only recent generations are available to be genotyped and phenotyped. In this situation, multipoint analysis using traditional exact linkage analysis methods, with many markers and full pedigree information, is a computationally intractable problem. Fortunately, Monte Carlo Markov chain sampling provides a tool to address this issue. By treating age at onset as a right-censored quantitative trait, we expand the methods used by Heath (1997) and illustrate them using an Alzheimer disease (AD) data set. This approach estimates the number, sizes, allele frequencies, and positions of quantitative trait loci (QTLs). In this simultaneous multipoint linkage and segregation analysis method, the QTLs are assumed to be diallelic and to interact additively. In the AD data set, we were able to localize correctly, quickly, and accurately two known genes, despite the existence of substantial genetic heterogeneity, thus demonstrating the great promise of these methods for the dissection of late-onset oligogenic diseases.

Machine learning approaches for the prediction of signal peptides and other protein sorting signals. (4/3175)

Prediction of protein sorting signals from the sequence of amino acids has great importance in the field of proteomics today. Recently, the growth of protein databases, combined with machine learning approaches such as neural networks and hidden Markov models, has made it possible to achieve a level of reliability where practical use in, for example, automatic database annotation is feasible. In this review, we concentrate on the present status and future perspectives of SignalP, our neural network-based method for prediction of the most well-known sorting signal: the secretory signal peptide. We discuss the problems associated with the use of SignalP on genomic sequences, showing that signal peptide prediction will improve further if integrated with predictions of start codons and transmembrane helices. As a step towards this goal, a hidden Markov model version of SignalP has been developed, making it possible to discriminate between cleaved signal peptides and uncleaved signal anchors. Furthermore, we show how SignalP can be used to characterize putative signal peptides from an archaeon, Methanococcus jannaschii. Finally, we briefly review a few methods for predicting other protein sorting signals and discuss the future of protein sorting prediction in general.

Genome-wide linkage analyses of systolic blood pressure using highly discordant siblings. (5/3175)

BACKGROUND: Elevated blood pressure is a risk factor for cardiovascular, cerebrovascular, and renal diseases. Complex mechanisms of blood pressure regulation pose a challenge to identifying genetic factors that influence interindividual blood pressure variation in the population at large. METHODS AND RESULTS: We performed a genome-wide linkage analysis of systolic blood pressure in humans using an efficient, highly discordant, full-sibling design. We identified 4 regions of the human genome that show statistically significant linkage to genes that influence interindividual systolic blood pressure variation (2p22.1 to 2p21, 5q33.3 to 5q34, 6q23.1 to 6q24.1, and 15q25.1 to 15q26.1). These regions contain a number of candidate genes that are involved in physiological mechanisms of blood pressure regulation. CONCLUSIONS: These results provide both novel information about genome regions in humans that influence interindividual blood pressure variation and a basis for identifying the contributing genes. Identification of the functional mutations in these genes may uncover novel mechanisms for blood pressure regulation and suggest new therapies and prevention strategies.

FORESST: fold recognition from secondary structure predictions of proteins. (6/3175)

MOTIVATION: A method for recognizing the three-dimensional fold from the protein amino acid sequence based on a combination of hidden Markov models (HMMs) and secondary structure prediction was recently developed for proteins in the Mainly-Alpha structural class. Here, this methodology is extended to Mainly-Beta and Alpha-Beta class proteins. Compared to other fold recognition methods based on HMMs, this approach is novel in that only secondary structure information is used. Each HMM is trained from known secondary structure sequences of proteins having a similar fold. Secondary structure prediction is performed for the amino acid sequence of a query protein. The predicted fold of a query protein is the fold described by the model that best fits the predicted sequence. RESULTS: After model cross-validation, the success rate on 44 test proteins covering the three structural classes was found to be 59%. On seven fold predictions performed prior to the publication of experimental structure, the success rate was 71%. In conclusion, this approach manages to capture important information about the fold of a protein embedded in the length and arrangement of the predicted helices, strands and coils along the polypeptide chain. When a more extensive library of HMMs representing the universe of known structural families is available (work in progress), the program will allow rapid screening of genomic databases and sequence annotation when fold similarity is not detectable from the amino acid sequence. AVAILABILITY: FORESST web server at http://absalpha.dcrt.nih.gov:8008/ for the library of HMMs of structural families used in this paper. FORESST web server at http://www.tigr.org/ for a more extensive library of HMMs (work in progress). CONTACT: [email protected]; [email protected]; [email protected]
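
The selection step described in this abstract, in which the predicted fold is the one whose HMM best fits the predicted secondary structure string, can be sketched with a scaled forward algorithm. This is an illustrative sketch, not the FORESST implementation: the two toy models, their parameters, the H/E/C alphabet encoding, and the names `log_likelihood` and `predict_fold` are assumptions introduced here.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: return log P(obs | HMM).

    start[s] and trans[r][s] are probabilities; emit[s] maps a symbol
    ('H' helix, 'E' strand, 'C' coil) to its emission probability.
    """
    if not obs:
        raise ValueError("observation sequence must be non-empty")
    states = range(len(start))
    alpha = [start[s] * emit[s][obs[0]] for s in states]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[r] * trans[r][s] for r in states) * emit[s][obs[t]]
                     for s in states]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence impossible under this model
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return loglik


def predict_fold(obs, library):
    """Return the name of the library model with the highest log-likelihood."""
    return max(library, key=lambda name: log_likelihood(obs, *library[name]))


# Hypothetical two-state models: state 0 is the fold's dominant element,
# state 1 is a coil-rich linker. All numbers are invented.
START = [0.5, 0.5]
TRANS = [[0.9, 0.1], [0.2, 0.8]]
LINKER = {"H": 0.1, "E": 0.1, "C": 0.8}
LIBRARY = {
    "alpha_like": (START, TRANS, [{"H": 0.9, "E": 0.02, "C": 0.08}, LINKER]),
    "beta_like": (START, TRANS, [{"H": 0.02, "E": 0.9, "C": 0.08}, LINKER]),
}

best = predict_fold("CCHHHHHHCCHHHHHCC", LIBRARY)
```

In a real setting each model would be trained on the known secondary structure strings of one fold family, as the abstract describes, and the query string would come from a secondary structure predictor rather than being typed by hand.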

Age estimates of two common mutations causing factor XI deficiency: recent genetic drift is not necessary for elevated disease incidence among Ashkenazi Jews. (7/3175)

The type II and type III mutations at the FXI locus, which cause coagulation factor XI deficiency, have high frequencies in Jewish populations. The type III mutation is largely restricted to Ashkenazi Jews, but the type II mutation is observed at high frequency in both Ashkenazi and Iraqi Jews, suggesting the possibility that the mutation appeared before the separation of these communities. Here we report estimates of the ages of the type II and type III mutations, based on the observed distribution of allelic variants at a flanking microsatellite marker (D4S171). The results are consistent with a recent origin for the type III mutation but suggest that the type II mutation appeared >120 generations ago. This finding demonstrates that the high frequency of the type II mutation among Jews is independent of the demographic upheavals among Ashkenazi Jews in the 16th and 17th centuries.

Does over-the-counter nicotine replacement therapy improve smokers' life expectancy? (8/3175)

OBJECTIVE: To determine the public health benefits of making nicotine replacement therapy available without prescription, in terms of number of quitters and life expectancy. DESIGN: A decision-analytic model was developed to compare the policy of over-the-counter (OTC) availability of nicotine replacement therapy with that of prescription ([symbol: see text]) availability for the adult smoking population in the United States. MAIN OUTCOME MEASURES: Long-term (six-month) quit rates, life expectancy, and smoking attributable mortality (SAM) rates. RESULTS: OTC availability of nicotine replacement therapy would result in 91,151 additional successful quitters over a six-month period, and a cumulative total of approximately 1.7 million additional quitters over 25 years. All-cause SAM would decrease by 348 deaths per year and 2940 deaths per year at six months and five years, respectively. Relative to [symbol: see text] nicotine replacement therapy availability, OTC availability would result in an average gain in life expectancy across the entire adult smoking population of 0.196 years per smoker. In sensitivity analyses, the benefits of OTC availability were evident across a wide range of changes in baseline parameters. CONCLUSIONS: Compared with [symbol: see text] availability of nicotine replacement therapy, OTC availability would result in more successful quitters, fewer smoking-attributable deaths, and increased life expectancy for current smokers.