Preferred codons and amino acid couples in hyperthermophiles. (73/588)

BACKGROUND: Most organisms grow at temperatures from 20 to 50 degrees C, but some prokaryotes, including Archaea and Bacteria, are capable of withstanding higher temperatures, from 60 to >100 degrees C. What makes these cells so resistant to heat? Their biomolecules, especially proteins, must be sufficiently stable to work under these extreme conditions, but the basis for thermostability remains elusive. RESULTS: The preferential usage of certain couples of amino acids and codons in thermal adaptation was investigated by comparative proteome analysis, using 28 complete genomes from 18 mesophiles, 4 thermophiles, and 6 hyperthermophiles. In the hyperthermophile proteomes, whenever the percentage of Glu (E) and Lys (K) increased, the percentage of Gln (Q) and His (H) decreased, so that the (E+K)/(Q+H) ratio was > 4.5; in the mesophile proteomes it was < 2.5, and in the thermophiles an intermediate value was observed. The (E+K)/(Q+H) ratios for chaperonins, potentially thermostable proteins, were higher than their proteome ratios, whereas for DNA ligases, not necessarily thermostable, they followed the proteome ratios. Analysis of codon usage revealed that hyperthermophiles preferred AGR codons for Arg over CGN codons, which were preferred by mesophiles. CONCLUSIONS: The results suggested that the (E+K)/(Q+H) ratio may provide a useful marker for distinguishing hyperthermophilic, thermophilic and mesophilic prokaryotes and that the high percentage of the amino acid couple E+K, consistently associated with the low percentage of the pair Q+H, could contribute to protein thermostability. In addition, the preference for AGR codons for Arg was a signature of all hyperthermophiles analyzed so far.
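The two quantities discussed in this abstract can be illustrated with a minimal Python sketch. The function names are hypothetical and this is not the authors' pipeline. The 4.5 and 2.5 cut-offs are the proteome-level values quoted in the abstract, so applying them to a single protein, as the chaperonin result suggests, may be misleading.

```python
def ek_qh_ratio(protein: str) -> float:
    """Return (E+K)/(Q+H) for a one-letter amino acid sequence.

    Pass a concatenation of all proteins to get a proteome-level value.
    """
    seq = protein.upper()
    ek = seq.count("E") + seq.count("K")
    qh = seq.count("Q") + seq.count("H")
    if qh == 0:
        raise ValueError("sequence contains no Q or H; ratio is undefined")
    return ek / qh


def classify(ratio: float) -> str:
    """Apply the cut-offs quoted in the abstract (> 4.5 and < 2.5)."""
    if ratio > 4.5:
        return "hyperthermophile-like"
    if ratio < 2.5:
        return "mesophile-like"
    return "intermediate (thermophile-like)"


AGR_CODONS = {"AGA", "AGG"}
CGN_CODONS = {"CGA", "CGC", "CGG", "CGT"}


def arg_agr_fraction(cds: str) -> float:
    """Fraction of Arg codons that are AGR in an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    agr = sum(c in AGR_CODONS for c in codons)
    cgn = sum(c in CGN_CODONS for c in codons)
    if agr + cgn == 0:
        raise ValueError("sequence contains no Arg codons")
    return agr / (agr + cgn)
```

Calling `classify(ek_qh_ratio(proteome))` on a concatenated proteome gives a rough label, and `arg_agr_fraction` summarises the AGR versus CGN preference for one coding sequence.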

CxxS: fold-independent redox motif revealed by genome-wide searches for thiol/disulfide oxidoreductase function. (74/588)

Redox reactions involving thiol groups in proteins are major participants in cellular redox regulation and antioxidant defense. Although mechanistically similar, thiol-dependent redox processes are catalyzed by structurally distinct families of enzymes, which are difficult to identify with available protein function prediction programs. Herein, we identified a functional motif, CxxS (cysteine separated from serine by two other residues), that was often conserved in redox enzymes, but rarely in other proteins. Analyses of complete Escherichia coli, Campylobacter jejuni, Methanococcus jannaschii, and Saccharomyces cerevisiae genomes revealed a high proportion of proteins known to use the CxxS motif for redox function. This allowed us to make predictions regarding the redox function and the identity of redox groups for several proteins whose function was previously unknown. Many proteins containing the CxxS motif had a thioredoxin fold, but other structural folds were also present, and CxxS was often located in these proteins upstream of an alpha-helix. Thus, a conserved CxxS sequence followed by an alpha-helix is typically indicative of a redox function and corresponds to thiol-dependent redox sites in proteins. The data also indicate a general approach to genome-wide identification of redox proteins by searching for simple conserved motifs within secondary structure patterns.
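The sequence part of such a search can be sketched with a regular expression. This is only an illustration with a hypothetical function name. It reports every cysteine followed three positions later by a serine, and it does not check conservation across homologs or the downstream alpha-helix, both of which the abstract treats as essential to the prediction.

```python
import re

# A lookahead is used so that overlapping occurrences (e.g. "CACSAS")
# are all reported; group 1 holds the matched tetrapeptide.
CXXS = re.compile(r"(?=(C[A-Z]{2}S))")


def find_cxxs(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position, tetrapeptide) for every CxxS occurrence."""
    return [(m.start() + 1, m.group(1)) for m in CXXS.finditer(protein.upper())]
```

A hit from `find_cxxs` is a candidate site only. A real screen would still need the conservation and secondary-structure filters described above.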

Correlations between Shine-Dalgarno sequences and gene features such as predicted expression levels and operon structures. (75/588)

This work assesses, for 30 complete prokaryotic genomes, the relationships between the presence of the Shine-Dalgarno (SD) sequence and other gene features, including expression levels, type of start codon, and distance between successive genes. A significant positive correlation was found between the presence of an SD sequence and the predicted expression level of a gene based on codon usage biases, such that predicted highly expressed genes are more likely to possess a strong SD sequence than average genes. Genes with AUG start codons are more likely than genes with other start codons (GUG or UUG) to possess an SD sequence. In most genomes, genes in close proximity to upstream genes on the same coding strand show significantly higher SD presence. In light of these results, we discuss the role of the SD sequence in translation initiation and its relationship with predicted gene expression levels and with operon structure in both bacterial and archaeal genomes.

Congruent evolution of different classes of non-coding DNA in prokaryotic genomes. (76/588)

Prokaryotic genomes are considered to be 'wall-to-wall' genomes, which consist largely of genes for proteins and structural RNAs, with only a small fraction of the genomic DNA allotted to intergenic regions, which are thought to typically contain regulatory signals. The majority of bacterial and archaeal genomes contain 6-14% non-coding DNA. Significant positive correlations were detected between the fraction of non-coding DNA and inter- and intra-operonic distances, suggesting that different classes of non-coding DNA evolve congruently. In contrast, no correlation was found between any of these characteristics of non-coding sequences and the number of genes or genome size. Thus, the non-coding regions and the gene sets in prokaryotes seem to evolve in different regimes. The evolution of non-coding regions appears to be determined primarily by the selective pressure to minimize the amount of non-functional DNA, while maintaining essential regulatory signals, because of which the content of non-coding DNA in different genomes is relatively uniform and intra- and inter-operonic non-coding regions evolve congruently. In contrast, the gene set is optimized for the particular environmental niche of the given microbe, which results in the lack of correlation between the gene number and the characteristics of non-coding regions.
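The non-coding fraction referred to here can be computed from gene coordinates, as in the following minimal sketch. The function name and input layout are assumptions made for illustration: genes are given as 1-based inclusive (start, end) pairs on a linear coordinate system, overlapping genes are counted once, and genes that wrap around the origin of a circular chromosome are not handled.

```python
def noncoding_fraction(genome_length: int, genes: list[tuple[int, int]]) -> float:
    """Fraction of the genome not covered by any annotated gene."""
    covered = 0
    current_end = 0  # rightmost position already counted as coding
    for start, end in sorted(genes):
        if end <= current_end:
            continue  # gene lies entirely inside a region already counted
        # Count only the part of this gene to the right of current_end.
        covered += end - max(start, current_end + 1) + 1
        current_end = end
    return 1 - covered / genome_length
```

For a genome in the 6-14% range mentioned above, the function would return a value between 0.06 and 0.14.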

Whole-genome analysis of photosynthetic prokaryotes. (77/588)

The process of photosynthesis has had profound global-scale effects on Earth; however, its origin and evolution remain enigmatic. Here we report a whole-genome comparison of representatives from all five groups of photosynthetic prokaryotes and show that horizontal gene transfer has been pivotal in their evolution. Excluding a small number of orthologs that show congruent phylogenies, the genomes of these organisms represent mosaics of genes with very different evolutionary histories. We have also analyzed a subset of "photosynthesis-specific" genes that were elucidated through a differential genome comparison. Our results explain incoherencies in previous data-limited phylogenetic analyses of phototrophic bacteria and indicate that the core components of photosynthesis have been subject to lateral transfer.

Species-specific protein sequence and fold optimizations. (78/588)

BACKGROUND: An organism's ability to adapt to its particular environmental niche is of fundamental importance to its survival and proliferation. In the largest study of its kind, we sought to identify and exploit the amino-acid signatures that make species-specific protein adaptation possible across 100 complete genomes. RESULTS: Environmental niche was determined to be a significant factor in variability from correspondence analysis using the amino acid composition of over 360,000 predicted open reading frames (ORFs) from 17 archaea, 76 bacteria and 7 eukaryote complete genomes. Additionally, we found clusters of phylogenetically unrelated archaea and bacteria that share similar environments by amino acid composition clustering. Composition analyses of conservative, domain-based homology modeling suggested an enrichment of small hydrophobic residues Ala, Gly, Val and charged residues Asp, Glu, His and Arg across all genomes. However, larger aromatic residues Phe, Trp and Tyr are reduced in folds, and these results were not affected by low complexity biases. We derived two simple log-odds scoring functions from ORFs (CG) and folds (CF) for each of the complete genomes. CF achieved an average cross-validation success rate of 85 +/- 8% whereas the CG detected 73 +/- 9% species-specific sequences when competing against all other non-redundant CG. Continuously updated results are available at http://genome.mshri.on.ca. CONCLUSION: Our analysis of amino acid compositions from the complete genomes provides stronger evidence for species-specific and environmental residue preferences in genomic sequences as well as in folds. Scoring functions derived from this work will be useful in future protein engineering experiments and possibly in identifying horizontal transfer events.
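The abstract does not give the form of the CG and CF functions, so the following is only a generic composition-based log-odds sketch. It assumes that each genome's amino acid frequencies are compared with a pooled background and that a query sequence is assigned to the genome whose table scores it highest. All names and the pseudocount are illustrative.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seqs, pseudocount=1.0):
    """Amino acid frequencies over a set of sequences, with a pseudocount."""
    counts = Counter()
    for s in seqs:
        counts.update(c for c in s.upper() if c in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}


def log_odds_table(genome_freq, background_freq):
    """Per-residue log-odds of the genome composition versus the background."""
    return {a: math.log(genome_freq[a] / background_freq[a]) for a in AMINO_ACIDS}


def score(seq, table):
    """Sum of per-residue log-odds; non-standard residues contribute zero."""
    return sum(table.get(c, 0.0) for c in seq.upper())


def best_genome(seq, tables):
    """Name of the genome whose log-odds table gives the highest score."""
    return max(tables, key=lambda name: score(seq, tables[name]))
```

With one table per genome, `best_genome` mimics the "competing against all other" setup in spirit. The success rates quoted in the abstract belong to the authors' own functions and cannot be expected from this sketch.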

Algorithms for computing parsimonious evolutionary scenarios for genome evolution, the last universal common ancestor and dominance of horizontal gene transfer in the evolution of prokaryotes. (79/588)

BACKGROUND: Comparative analysis of sequenced genomes reveals numerous instances of apparent horizontal gene transfer (HGT), at least in prokaryotes, and indicates that lineage-specific gene loss might have been even more common in evolution. This complicates the notion of a species tree, which needs to be re-interpreted as a prevailing evolutionary trend, rather than the full depiction of evolution, and makes reconstruction of ancestral genomes a non-trivial task. RESULTS: We addressed the problem of constructing parsimonious scenarios for individual sets of orthologous genes given a species tree. The orthologous sets were taken from the database of Clusters of Orthologous Groups of proteins (COGs). We show that the phyletic patterns (patterns of presence-absence in completely sequenced genomes) of almost 90% of the COGs are inconsistent with the hypothetical species tree. Algorithms were developed to reconcile the phyletic patterns with the species tree by postulating gene loss, COG emergence and HGT (the latter two classes of events were collectively treated as gene gains). We prove that each of these algorithms produces a parsimonious evolutionary scenario, which can be represented as mapping of loss and gain events on the species tree. The distribution of the evolutionary events among the tree nodes substantially depends on the underlying assumptions of the reconciliation algorithm, e.g. whether or not independent gene gains (gain after loss after gain) are permitted. Biological considerations suggest that, on average, gene loss might be a more likely event than gene gain. Therefore different gain penalties were used and the resulting series of reconstructed gene sets for the last universal common ancestor (LUCA) of the extant life forms were analysed. The number of genes in the reconstructed LUCA gene sets grows as the gain penalty increases. However, qualitative examination of the LUCA versions reconstructed with different gain penalties indicates that, even with a gain penalty of 1 (equal weights assigned to a gain and a loss), the set of 572 genes assigned to LUCA might be nearly sufficient to sustain a functioning organism. Under this gain penalty value, the numbers of horizontal gene transfer and gene loss events are nearly identical. This result holds true for two alternative topologies of the species tree and even under random shuffling of the tree. Therefore, the results seem to be compatible with approximately equal likelihoods of HGT and gene loss in the evolution of prokaryotes. CONCLUSIONS: The notion that gene loss and HGT are major aspects of prokaryotic evolution was supported by quantitative analysis of the mapping of the phyletic patterns of COGs onto a hypothetical species tree. Algorithms were developed for constructing parsimonious evolutionary scenarios, which include gene loss and gain events, for orthologous gene sets, given a species tree. This analysis shows, contrary to expectations, that the number of predicted HGT events that occurred during the evolution of prokaryotes might be approximately the same as the number of gene losses. The approach to the reconstruction of evolutionary scenarios employed here is conservative with regard to the detection of HGT because only patterns of gene presence-absence in sequenced genomes are taken into account. In reality, horizontal transfer might have contributed to the evolution of many other genes also, which makes it a dominant force in prokaryotic evolution.
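The effect of the gain penalty on the reconstructed ancestor can be illustrated with a generic Sankoff-style weighted parsimony on a presence/absence pattern. This is not one of the authors' algorithms. It assumes a rooted tree given as nested tuples of leaf names, a loss cost of 1, a gain cost equal to the penalty, and that a gene present at the root is charged one gain for its emergence.

```python
def sankoff(tree, present, gain_penalty):
    """Return (min cost if node lacks the gene, min cost if node has it).

    tree is a leaf name (str) or a tuple of child subtrees;
    present is the set of leaf names that contain the gene.
    """
    if isinstance(tree, str):
        inf = float("inf")
        return (inf, 0.0) if tree in present else (0.0, inf)
    cost_absent = cost_present = 0.0
    for child in tree:
        c_abs, c_pres = sankoff(child, present, gain_penalty)
        # Parent lacks the gene: a child that has it requires one gain.
        cost_absent += min(c_abs, c_pres + gain_penalty)
        # Parent has the gene: a child that lacks it requires one loss (cost 1).
        cost_present += min(c_pres, c_abs + 1.0)
    return cost_absent, cost_present


def in_root(tree, present, gain_penalty):
    """True if the strictly cheaper scenario places the gene at the root.

    Assumption: presence at the root is charged one gain (emergence).
    """
    c_abs, c_pres = sankoff(tree, present, gain_penalty)
    return c_pres + gain_penalty < c_abs
```

For the toy tree `(("A", "B"), ("C", "D"))` with the gene present only in A and C, a penalty of 1 favours two independent gains, while a penalty of 3 favours presence at the root followed by two losses. This reproduces in miniature the trend that the ancestral gene set grows as the gain penalty increases.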

MBGD: microbial genome database for comparative analysis. (80/588)

MBGD is a workbench system for comparative analysis of completely sequenced microbial genomes. The central function of MBGD is to create an orthologous gene classification table using precomputed all-against-all similarity relationships among genes in multiple genomes. In MBGD, an automated classification algorithm has been implemented so that users can create their own classification table by specifying a set of organisms and parameters. This feature is especially useful when the user's interest is focused on some taxonomically related organisms. The created classification table is stored in the database and can be explored in combination with the data of individual genomes as well as similarity relationships among genomes. Using these data, users can carry out comparative analyses from various points of view, such as phylogenetic pattern analysis, gene order comparison and detailed gene structure comparison. MBGD is accessible at http://mbgd.genome.ad.jp/.