Birth and death of protein domains: a simple model of evolution explains power law behavior. (1/322)

BACKGROUND: Power-law distributions appear in numerous biological, physical and other contexts, which appear to be fundamentally different. In biology, power laws have been claimed to describe the distributions of the connections of enzymes and metabolites in metabolic networks, the number of interaction partners of a given protein, the number of members in paralogous families, and other quantities. In network analysis, power laws imply evolution of the network with preferential attachment, i.e. a greater likelihood of nodes being added to pre-existing hubs. Exploring different types of evolutionary models to determine which of them lead to power law distributions has the potential to reveal non-trivial aspects of genome evolution. RESULTS: A simple model of evolution of the domain composition of proteomes was developed, with the following elementary processes: i) domain birth (duplication with divergence), ii) death (inactivation and/or deletion), and iii) innovation (emergence from non-coding or non-globular sequences or acquisition via horizontal gene transfer). This formalism can be described as a birth, death and innovation model (BDIM). The formulas for equilibrium frequencies of domain families of different size and the total number of families at equilibrium are derived for a general BDIM. All asymptotics of equilibrium frequencies of domain families possible for the given type of models are found, and their dependence on model parameters is investigated. It is proved that the power law asymptotics appear if, and only if, the model is balanced, i.e. domain duplication and deletion rates are asymptotically equal up to the second order. It is further proved that any power asymptotic with the degree not equal to -1 can appear only if the hypothesis of independence of the duplication/deletion rates on the size of a domain family is rejected. Specific cases of BDIMs, namely simple, linear, polynomial and rational models, are considered in detail, and the distributions of the equilibrium frequencies of domain families of different size are determined for each case. We apply the BDIM formalism to the analysis of the domain family size distributions in prokaryotic and eukaryotic proteomes and show an excellent fit between these empirical data and a particular form of the model, the second-order balanced linear BDIM. Calculation of the parameters of these models suggests surprisingly high innovation rates, comparable to the total domain birth (duplication) and elimination rates, particularly for prokaryotic genomes. CONCLUSIONS: We show that a straightforward model of genome evolution, which does not explicitly include selection, is sufficient to explain the observed distributions of domain family sizes, in which power laws appear as asymptotics. However, for the model to be compatible with the data, there has to be a precise balance between domain birth, death and innovation rates, and this is likely to be maintained by selection. The developed approach is oriented at a mathematical description of evolution of domain composition of proteomes, but a simple reformulation could be applied to models of other evolving networks with preferential attachment.
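
The link between a balanced linear BDIM and a power-law tail can be illustrated numerically. The sketch below is not taken from the abstract: it assumes the standard birth-death equilibrium relation f(i+1) * death(i+1) = f(i) * birth(i), and a hypothetical linear parametrization in which the birth rate of a family of size i is proportional to i + a and the death rate to i + b, with the same proportionality constant (the balanced case). Under those assumptions the tail exponent is expected to approach a - b - 1, so a = b gives the degree -1 mentioned above.

```python
import math

def equilibrium_frequencies(birth, death, max_size):
    """Normalized equilibrium frequencies f_1..f_max_size of a birth-death chain.

    Assumes the balance relation f_{i+1} * death(i+1) = f_i * birth(i);
    the chain is truncated at max_size for numerical purposes.
    """
    weights = [1.0]
    for i in range(1, max_size):
        weights.append(weights[-1] * birth(i) / death(i + 1))
    total = sum(weights)
    return [w / total for w in weights]

def tail_exponent(freqs, lo, hi):
    """Slope of log f_i against log i between family sizes lo and hi (1-indexed)."""
    return (math.log(freqs[hi - 1]) - math.log(freqs[lo - 1])) / (
        math.log(hi) - math.log(lo)
    )

# Hypothetical parameters, chosen only for illustration.
a, b = 1.0, 2.5
freqs = equilibrium_frequencies(lambda i: i + a, lambda i: i + b, 20000)
slope = tail_exponent(freqs, 5000, 20000)
print(f"estimated tail exponent: {slope:.3f}, expected a - b - 1 = {a - b - 1}")
```

The values of a and b are placeholders; fitting them to real proteome data would require the family-size counts analysed in the paper, which the abstract does not reproduce.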

Two-dimensional IR correlation spectroscopy of mutants of the beta-glycosidase from the hyperthermophilic archaeon Sulfolobus solfataricus identifies the mechanism of quaternary structure stabilization and unravels the sequence of thermal unfolding events. (2/322)

Beta-glycosidase from the hyperthermophilic archaeon Sulfolobus solfataricus is a homotetramer with more ion pairs than mesophilic glycoside hydrolases. The ion pairs are arranged in large networks located mainly at the tetrameric interface of the molecule. In the present study, the structure and thermal stability of the wild-type beta-glycosidase and of three mutants in residues R488 and H489 involved in the C-terminal ionic network were examined by FTIR (Fourier-transform IR) spectroscopy. The FTIR data revealed small differences in the secondary structure of the proteins and showed a lower thermostability of the mutant proteins with respect to the wild-type. Generalized 2D-IR (two-dimensional IR correlation spectroscopy) at different temperatures showed different sequences of thermal unfolding events in the mutants with respect to the wild-type, indicating that point mutations affect the unfolding and aggregation process of the protein. A detailed 2D-IR analysis of synchronous maps of the proteins allowed us to identify the temperatures at which the ionic network that stabilizes the quaternary structure of the native and mutant enzymes at the C-terminus breaks down. This evidence gives support to the current theories on the mechanism of ion-pair stabilization in proteins from hyperthermophilic organisms.

Dual-genome primer design for construction of DNA microarrays. (3/322)

MOTIVATION: Microarray experiments using probes covering a whole transcriptome are expensive to initiate, and a major part of the costs derives from synthesizing gene-specific PCR primers or hybridization probes. The high costs may force researchers to limit their studies to a single organism, although comparing gene expression in different species would yield valuable information. RESULTS: We have developed a method, implemented in the software DualPrime, that reduces the number of primers required to amplify the genes of two different genomes. The software identifies regions of high sequence similarity, and from these regions selects PCR primers shared between the genomes, such that either one or, preferentially, both primers in a given PCR can be used for amplification from both genomes. To ensure high microarray probe specificity, the software selects primer pairs that generate products of low sequence similarity to other genes within the same genome. We used the software to design PCR primers for 2182 and 1960 genes from the hyperthermophilic archaea Sulfolobus solfataricus and Sulfolobus acidocaldarius, respectively. Primer pairs were shared among 705 pairs of genes, and single primers were shared among 1184 pairs of genes, resulting in a saving of 31% compared to using only unique primers. We also present an alternative primer design method, in which each gene shares primers with two different genes of the other genome, enabling further savings. AVAILABILITY: The software is freely available at http://www.biotech.kth.se/molbio/microarray/.
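
The reported 31% saving can be checked with simple arithmetic. The sketch below assumes, as a reading of the abstract rather than a statement from it, that every gene needs two primers when only unique primers are used, that a gene pair sharing both primers saves two syntheses, and that a gene pair sharing a single primer saves one. The function name `primer_saving` is hypothetical.

```python
def primer_saving(genes_a, genes_b, pairs_sharing_both, pairs_sharing_one):
    """Return (primers saved, baseline primer count, fractional saving).

    Assumes two primers per gene in the baseline, and one synthesis saved
    for every primer that is shared between a pair of genes.
    """
    baseline = 2 * (genes_a + genes_b)
    saved = 2 * pairs_sharing_both + pairs_sharing_one
    return saved, baseline, saved / baseline

# Figures quoted in the abstract.
saved, baseline, fraction = primer_saving(2182, 1960, 705, 1184)
print(f"{saved} of {baseline} primers saved ({fraction:.1%})")
# prints "2594 of 8284 primers saved (31.3%)"
```

Under these assumptions the result agrees with the 31% figure quoted above.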

The role of cis-acting sequences governing catabolite repression control of lacS expression in the archaeon Sulfolobus solfataricus. (4/322)

The archaeon Sulfolobus solfataricus uses a catabolite repression-like system to control production of several glycoside hydrolases. To better understand this regulatory system, studies of the regulation of expression of the beta-glycosidase gene (lacS) were conducted. Expression of lacS varies in response to medium composition and to mutations at an unlinked gene called car. Despite gene overlap, expression of the lacS promoter proximal gene, SSO3017, exhibited coregulation but not cotranscription with lacS. Measurements of mRNA half-life excluded differential stability as a factor in lacS regulation. Chromosomal repositioning by homologous recombination of a lacS deletion series clarified critical cis-acting sequences required for lacS regulation. lacS repositioned at amyA exhibited increased lacS expression and compromised the response to medium composition independently of lacS 5' flanking sequence composition. In contrast, regulation of lacS by the car mutation was dependent on sequences upstream of the archaeal TATA box. Expression of a promoter fusion between lacS and the car-independent malA promoter integrated either at amyA or at the natural lacS locus was insensitive to the allelic state of car. In contrast, the promoter fusion retained a response to medium composition only at the lacS locus. These results indicate that car acts at the lacS promoter and that the response to medium composition involves locus-specific sequences exclusive of those present 5' to lacS or within the lacS transcription unit.

(S)-2,3-Di-O-geranylgeranylglyceryl phosphate synthase from the thermoacidophilic archaeon Sulfolobus solfataricus. Molecular cloning and characterization of a membrane-intrinsic prenyltransferase involved in the biosynthesis of archaeal ether-linked membrane lipids. (5/322)

The core structure of the membrane lipids of archaea has some unique properties that permit archaea to be distinguished from the other domains, i.e. bacteria and eukaryotes. (S)-2,3-Di-O-geranylgeranylglyceryl phosphate synthase, which catalyzes the transfer of a geranylgeranyl group from geranylgeranyl diphosphate to (S)-3-O-geranylgeranylglyceryl phosphate, is involved in the biosynthesis of archaeal membrane lipids. Enzymes of the UbiA prenyltransferase family are known to catalyze the transfer of a prenyl group to various acceptors with hydrophobic ring structures in the biosynthesis of respiratory quinones, hemes, chlorophylls, vitamin E, and shikonin. The thermoacidophilic archaeon Sulfolobus solfataricus was found to encode three homologues of UbiA prenyltransferase in its genome. One of the homologues encoded by SSO0583 was expressed in Escherichia coli, purified, and characterized. Radio-assay and mass spectrometry analysis data indicated that the enzyme specifically catalyzes the biosynthesis of (S)-2,3-di-O-geranylgeranylglyceryl phosphate. The fact that the orthologues of the enzyme are encoded in almost all archaeal genomes clearly indicates the importance of their functions. A phylogenetic tree constructed using the amino acid sequences of some typical members of the UbiA prenyltransferase family and their homologues from S. solfataricus suggests that the two other S. solfataricus homologues, excluding the (S)-2,3-di-O-geranylgeranylglyceryl phosphate synthase, are involved in the production of respiratory quinone and heme, respectively. We propose here that archaeal prenyltransferases involved in membrane lipid biosynthesis might be prototypes of the protein family and that archaea might have played an important role in the molecular evolution of prenyltransferases.

Amino acids of the Sulfolobus solfataricus mini-chromosome maintenance-like DNA helicase involved in DNA binding/remodeling. (6/322)

Herein we report the identification of amino acids of the Sulfolobus solfataricus mini-chromosome maintenance (MCM)-like DNA helicase (SsoMCM), which are critical for DNA binding/remodeling. The crystallographic structure of the N-terminal portion (residues 2-286) of the Methanothermobacter thermoautotrophicum MCM protein revealed a dodecameric assembly with two hexameric rings in a head-to-head configuration and a positively charged central channel proposed to encircle DNA molecules. A structure-guided alignment of the M. thermoautotrophicum and S. solfataricus MCM sequences identified positively charged amino acids in SsoMCM that could point to the center of the channel. These residues (Lys-129, Lys-134, His-146, and Lys-194) were changed to alanine. The purified mutant proteins were all found to form homo-hexamers in solution and to retain full ATPase activity. K129A, H146A, and K194A SsoMCMs are unable to bind DNA either in single- or double-stranded form in band shift assays and do not display helicase activity. In contrast, the substitution of lysine 134 to alanine affects only binding to duplex DNA molecules, whereas it has no effect on binding to single-stranded DNA and on the DNA unwinding activity. These results have important implications for the understanding of the molecular mechanism of the MCM DNA helicase action.

A highly acid-stable and thermostable endo-beta-glucanase from the thermoacidophilic archaeon Sulfolobus solfataricus. (7/322)

The thermoacidophilic archaeon Sulfolobus solfataricus P2 encodes three hypothetical endo-beta-glucanases, SSO1354, SSO1949 and SSO2534. We cloned and expressed the gene sso1949 encoding the 334-amino-acid protein SSO1949, which can be classified as a member of glycoside hydrolase family 12. The purified recombinant enzyme hydrolyses carboxymethylcellulose as well as cello-oligomers, with cellobiose and cellotriose as main reaction products. By following the hydrolysis of a fluorescently labelled cellohexaoside under a wide variety of conditions, we show that SSO1949 is a unique extremophilic enzyme. This archaeal enzyme has a pH optimum of approx. pH 1.8 and a temperature optimum of approx. 80 degrees C. Furthermore, the enzyme is thermostable, with a half-life of approx. 8 h at 80 degrees C and pH 1.8. The thermostability is strongly pH-dependent. At neutral pH, the thermal inactivation rate is nearly two orders of magnitude higher than at pH 1.8. Homology modelling suggests that the catalytic domain of SSO1949 has a fold similar to that of mesophilic, acidophilic and neutral cellulases. The presence of a signal peptide indicates that SSO1949 is a secreted protein, which enables S. solfataricus to use cellulose as an external carbon source. It appears that SSO1949 is perfectly adapted to the extreme environment in solfataric pools. A cellulolytic enzyme with such a combination of stability and activity at high temperatures and low pH has not been described so far and could be a valuable tool for the large-scale hydrolysis of cellulose under acidic conditions.
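
To give a feel for what the stability figures imply, the sketch below converts the reported half-life into a rate constant. It assumes first-order thermal inactivation, which the abstract does not state, and it takes "nearly two orders of magnitude" as a factor of 100, which is an upper bound rather than a measured value.

```python
import math

def rate_constant(half_life_hours):
    """First-order inactivation rate constant (per hour) for a given half-life."""
    return math.log(2) / half_life_hours

def residual_activity(hours, half_life_hours):
    """Fraction of activity remaining after `hours`, assuming first-order decay."""
    return 0.5 ** (hours / half_life_hours)

# Reported: half-life of approx. 8 h at 80 degrees C and pH 1.8.
k_acid = rate_constant(8.0)

# Assumption: inactivation at neutral pH is exactly 100 times faster.
k_neutral = 100 * k_acid
half_life_neutral_min = math.log(2) / k_neutral * 60

print(f"activity left after 24 h at pH 1.8: {residual_activity(24, 8.0):.3f}")
print(f"implied half-life at neutral pH: about {half_life_neutral_min:.1f} min")
```

With these assumptions the half-life at neutral pH comes out on the order of minutes rather than hours; because the abstract says "nearly" two orders of magnitude, the true value would be somewhat longer.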

Identification and characterization of Sulfolobus solfataricus D-gluconate dehydratase: a key enzyme in the non-phosphorylated Entner-Doudoroff pathway. (8/322)

The extremely thermoacidophilic archaeon Sulfolobus solfataricus utilizes D-glucose as a sole carbon and energy source through the non-phosphorylated Entner-Doudoroff pathway. It has been suggested that this micro-organism metabolizes D-gluconate, the oxidized form of D-glucose, to pyruvate and D-glyceraldehyde by using two unique enzymes, D-gluconate dehydratase and 2-keto-3-deoxy-D-gluconate aldolase. In the present study, we report the purification and characterization of D-gluconate dehydratase from S. solfataricus, which catalyses the conversion of D-gluconate into 2-keto-3-deoxy-D-gluconate. D-Gluconate dehydratase was purified 400-fold from extracts of S. solfataricus by ammonium sulphate fractionation and chromatography on DEAE-Sepharose, Q-Sepharose, phenyl-Sepharose and Mono Q. The native protein showed a molecular mass of 350 kDa by gel filtration, whereas SDS/PAGE analysis provided a molecular mass of 44 kDa, indicating that D-gluconate dehydratase is an octameric protein. The enzyme showed maximal activity at temperatures between 80 and 90 degrees C and pH values between 6.5 and 7.5, and a half-life of 40 min at 100 degrees C. Bivalent metal ions such as Co2+, Mg2+, Mn2+ and Ni2+ activated, whereas EDTA inhibited the enzyme. A metal analysis of the purified protein revealed the presence of one Co2+ ion per enzyme monomer. Of the 22 aldonic acids tested, only D-gluconate served as a substrate, with K(m)=0.45 mM and V(max)=0.15 unit/mg of enzyme. From N-terminal sequences of the purified enzyme, it was found that the gene product of SSO3198 in the S. solfataricus genome database corresponded to D-gluconate dehydratase (gnaD). We also found that the D-gluconate dehydratase of S. solfataricus is a phosphoprotein and that its catalytic activity is regulated by a phosphorylation-dephosphorylation mechanism. This is the first report on biochemical and genetic characterization of D-gluconate dehydratase involved in the non-phosphorylated Entner-Doudoroff pathway.
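
Two of the numbers in this abstract lend themselves to a quick check: the subunit count implied by the two mass estimates, and the reaction rate implied by the kinetic constants. The sketch below assumes simple Michaelis-Menten kinetics, which the abstract suggests by quoting K(m) and V(max) but does not state outright. The function names are hypothetical.

```python
def subunit_count(native_kda, subunit_kda):
    """Nearest whole number of subunits implied by native and denatured masses."""
    return round(native_kda / subunit_kda)

def michaelis_menten(substrate_mm, km_mm=0.45, vmax=0.15):
    """Reaction rate (unit/mg) at a given D-gluconate concentration (mM).

    Defaults are the K(m) and V(max) values quoted in the abstract.
    """
    return vmax * substrate_mm / (km_mm + substrate_mm)

# 350 kDa by gel filtration, 44 kDa by SDS/PAGE: 350 / 44 is about 7.95.
n = subunit_count(350, 44)

# At a substrate concentration equal to K(m), the rate is half of V(max).
v_half = michaelis_menten(0.45)

print(f"implied subunits: {n}")
print(f"rate at 0.45 mM D-gluconate: {v_half:.3f} unit/mg")
```

The subunit count of 8 matches the octameric assembly described above; the rate calculation is only as good as the Michaelis-Menten assumption.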