Machine learning approaches for the prediction of signal peptides and other protein sorting signals. (1/852)

Prediction of protein sorting signals from the sequence of amino acids is of great importance in the field of proteomics today. Recently, the growth of protein databases, combined with machine learning approaches such as neural networks and hidden Markov models, has made it possible to achieve a level of reliability where practical use in, for example, automatic database annotation is feasible. In this review, we concentrate on the present status and future perspectives of SignalP, our neural network-based method for prediction of the most well-known sorting signal: the secretory signal peptide. We discuss the problems associated with the use of SignalP on genomic sequences, showing that signal peptide prediction will improve further if integrated with predictions of start codons and transmembrane helices. As a step towards this goal, a hidden Markov model version of SignalP has been developed, making it possible to discriminate between cleaved signal peptides and uncleaved signal anchors. Furthermore, we show how SignalP can be used to characterize putative signal peptides from an archaeon, Methanococcus jannaschii. Finally, we briefly review a few methods for predicting other protein sorting signals and discuss the future of protein sorting prediction in general.

Exon shuffling by L1 retrotransposition. (2/852)

Long interspersed nuclear elements (LINE-1s or L1s) are the most abundant retrotransposons in the human genome, and they serve as major sources of reverse transcriptase activity. Engineered L1s retrotranspose at high frequency in cultured human cells. Here it is shown that L1s insert into transcribed genes and retrotranspose sequences derived from their 3' flanks to new genomic locations. Thus, retrotransposition-competent L1s provide a vehicle to mobilize non-L1 sequences, such as exons or promoters, into existing genes and may represent a general mechanism for the evolution of new genes.

The chloroplast infA gene with a functional UUG initiation codon. (3/852)

All chloroplast genes reported so far possess ATG start codons or, as an occasional exception, GTG. Sequence alignments suggested that the chloroplast infA gene encoding initiation factor 1 in the green alga Chlorella vulgaris has TTG as a putative initiation codon. This gene was shown to be transcribed by RT-PCR analysis. The infA mRNA was translated accurately from the UUG codon in a tobacco chloroplast in vitro translation system. Mutation of the UUG codon to AUG increased translation efficiency approximately 300-fold. These results indicate that the UUG is functional for accurate translation initiation of Chlorella infA mRNA but that it is an inefficient initiation codon.
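
As an illustration of the kind of sequence inspection that can flag a putative non-ATG initiation codon, the following is a minimal Python sketch. It is not the authors' method: the function names `find_start_codons` and `dna_to_mrna` are hypothetical, and the candidate set (ATG, GTG, TTG) is simply the three codons mentioned in the abstract. A hit only marks a candidate; as the abstract shows, function has to be confirmed experimentally.

```python
# Hypothetical helpers, for illustration only (not from the cited study).
# Candidate initiation codons mentioned in the abstract, in DNA notation.
START_CODONS = ("ATG", "GTG", "TTG")


def find_start_codons(dna, codons=START_CODONS):
    """Return (position, codon) pairs for every candidate start codon.

    Positions are 0-based offsets into the sequence; all three reading
    frames are scanned because every offset is tested.
    """
    seq = dna.upper()
    hits = []
    for i in range(len(seq) - 2):
        triplet = seq[i:i + 3]
        if triplet in codons:
            hits.append((i, triplet))
    return hits


def dna_to_mrna(dna):
    """Convert a coding-strand DNA string to mRNA notation (T -> U)."""
    return dna.upper().replace("T", "U")


if __name__ == "__main__":
    # Made-up toy sequence, not real infA sequence data.
    print(find_start_codons("ccTTGaaATGc"))
    print(dna_to_mrna("TTG"))
```

The TTG in the gene corresponds to UUG in the mRNA, which is why the abstract uses both spellings for the same codon.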

Involvement of the aphthovirus RNA region located between the two functional AUGs in start codon selection. (4/852)

Initiation of translation in picornavirus RNAs occurs internally, mediated by an element termed the internal ribosome entry site (IRES). In the aphthovirus RNA, the IRES element directs translation initiation at two in-frame AUGs separated by 84 nucleotides. We have found that bicistronic constructs that contained the IRES element followed by the fragment including the aphthovirus start codons in front of the second gene mimicked the translation initiation pattern of viral RNA observed in infected cells. In those constructs, the frequency of initiation at the first AUG was increased by a sequence context that resembled the favorable consensus for cap-dependent translation, although initiation at the second site was always preferred. In addition, we have found that initiation at the second start codon was not diminished under conditions in which the first initiation codon was blocked by antisense oligonucleotide interference. Interestingly, mutations that positioned the second AUG out-of-frame with the first AUG did not interfere with the frequency of initiation at the second one. In contrast, IRES-dependent translation initiation in bicistronic constructs lacking the sequences present between functional AUGs in the viral RNA was sensitive to the presence of out-of-frame initiator codons and hairpins in the spacer region. This remarkable difference in start codon recognition was due to the nucleotide composition of the RNA that separated the IRES from the initiator codon. Thus our results indicate that the region located in the aphthovirus RNA between functional AUGs is involved in start codon recognition, strongly favoring selection of the second start AUG as the main initiator codon.
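
The statement that the two AUGs are in-frame can be checked with simple arithmetic: two codons share a reading frame when the offset between their first nucleotides is a multiple of three. The sketch below is a hedged illustration only; the function name `in_frame` and the coordinates are hypothetical, and it assumes the 84 nucleotides are counted between the first nucleotides of the two AUGs. If 84 is instead the length of the spacer between the codons, the offset is 87, which is also a multiple of three.

```python
# Hypothetical helper, for illustration only (coordinates are made up).
def in_frame(first_pos, second_pos):
    """Return True if two codon start positions share a reading frame."""
    return (second_pos - first_pos) % 3 == 0


if __name__ == "__main__":
    first_aug = 0            # arbitrary coordinate for the first AUG
    second_aug = first_aug + 84
    print(in_frame(first_aug, second_aug))      # offset of 84 nt = 28 codons
    print(in_frame(first_aug, second_aug + 1))  # a 1-nt shift breaks the frame
```

A single-nucleotide insertion or deletion between the two AUGs is therefore enough to place the second one out-of-frame, which is the kind of mutation the abstract describes.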

Analysis of elements involved in pseudoknot-dependent expression and regulation of the repA gene of an IncL/M plasmid. (5/852)

Replication of the IncL/M plasmid pMU604 is controlled by a small antisense RNA molecule (RNAI), which, by inhibiting the formation of an RNA pseudoknot, regulates translation of the replication initiator protein, RepA. Efficient translation of the repA mRNA was shown to require the translation and correct termination of the leader peptide, RepB, and the formation of the pseudoknot. Although the pseudoknot was essential for the expression of repA, its presence was shown to interfere with the translation of repB. The requirement for pseudoknot formation could in large part be obviated by improving the ribosome binding region of repA, either by replacing the GUG start codon by AUG or by increasing the spacing between the start codon and the Shine-Dalgarno sequence (SD). The spacing between the distal pseudoknot sequence and the repA SD was shown to be suboptimal for maximal expression of repA.

Multiple murine double minute gene 2 (MDM2) proteins are induced by ultraviolet light. (6/852)

The mdm2 (murine double minute 2) oncogene encodes several proteins, the largest of which (p90) binds to and inactivates the p53 tumor suppressor protein. Multiple MDM2 proteins have been detected in tumors and in cell lines expressing high levels of mdm2 mRNAs. Here we show that one of these proteins (p76) is expressed, along with p90, in wild-type and p53-null mouse embryo fibroblasts, indicating that it may have an important physiological role in normal cells. Expression of this protein is induced, as is that of p90, by UV light in a p53-dependent manner. The p76 protein is synthesized via translational initiation at AUG codon 50 and thus lacks the N terminus of p90 and does not bind p53. In cells, p90 and p76 can be synthesized from mdm2 mRNAs transcribed from both the P1 (constitutive) and P2 (p53-responsive) promoters. Site-directed mutagenesis reveals that these RNAs give rise to p76 via internal initiation of translation. In addition, mdm2 mRNAs lacking exon 3 give rise to p76 exclusively, and such mRNAs are induced by p53 in response to UV light. These data indicate that p76 may be an important product of the mdm2 gene and a downstream effector of p53.

Postsynaptic alpha-neurotoxin gene of the spitting cobra, Naja naja sputatrix: structure, organization, and phylogenetic analysis. (7/852)

The venom of the spitting cobra, Naja naja sputatrix, contains highly potent alpha-neurotoxins (NTXs) in addition to phospholipase A2 (PLA2) and cardiotoxin (CTX). In this study, we report the complete characterization of three genes that are responsible for the synthesis of three isoforms of alpha-NTX in the venom of a single spitting cobra. DNA amplification by long-distance polymerase chain reaction (LD-PCR) and genome walking have provided information on the gene structure, including the promoters and the 5' and 3' UTRs. Each NTX isoform is approximately 4 kb in size and contains three exons and two introns. The sequence homology among these isoforms was found to be 99%. Two possible transcription sites were identified by primer extension analysis and they corresponded to the adenine (A) nucleotide at positions +1 and -45. The promoter also contains two TATA boxes and a CCAAT box. Putative binding sites for transcriptional factors AP-2 and GATA are also present. The high percentage of similarity observed among the NTX gene isoforms of N. n. sputatrix as well as with the alpha-NTX and kappa-NTX genes from other land snakes suggests that the NTX gene has probably evolved from a common ancestral gene.

The cis acting sequences responsible for the differential decay of the unstable MFA2 and stable PGK1 transcripts in yeast include the context of the translational start codon. (8/852)

A general pathway of mRNA turnover has been described for yeast in which the 3' poly(A) tail is first deadenylated to an oligo(A) length, leading to decapping and subsequent 5'-3' exonucleolytic decay. The unstable MFA2 mRNA and the stable PGK1 mRNA both decay through this pathway, albeit at different rates of deadenylation and decapping. To determine the regions of the mRNAs that are responsible for these differences, we examined the decay of chimeric mRNAs derived from the 5' untranslated, coding, and 3' untranslated regions of these two mRNAs. These experiments have led to the identification of the features of these mRNAs that lead to their different stabilities. The MFA2 mRNA is unstable solely because its 3' UTR promotes the rates of deadenylation and decapping; all other features of this mRNA are neutral with respect to mRNA decay rates. The PGK1 mRNA is stable because the sequence context of the PGK1 translation start codon and the coding region function together to stabilize the transcript, whereas the PGK1 3' UTR is neutral with respect to decay. Importantly, changes in the PGK1 start codon context that destabilized the transcript also reduced its translational efficiency. This observation suggests that the nature of the translation initiation complex modulates the rates of mRNA decapping and decay.