HOMSTRAD: adding sequence information to structure-based alignments of homologous protein families. (17/5719)

SUMMARY: We describe an extension to the Homologous Structure Alignment Database (HOMSTRAD; Mizuguchi et al., Protein Sci., 7, 2469-2471, 1998a) that adds homologous sequences derived from the protein families database Pfam (Bateman et al., Nucleic Acids Res., 28, 263-266, 2000). HOMSTRAD is integrated with the server FUGUE (Shi et al., submitted, 2001) for recognition and alignment of homologues, benefitting from the combination of abundant sequence information and accurate structure-based alignments. AVAILABILITY: The HOMSTRAD database is available at http://www-cryst.bioc.cam.ac.uk/homstrad/. Query sequences can be submitted to the homology recognition/alignment server FUGUE at http://www-cryst.bioc.cam.ac.uk/fugue/.

Structure prediction meta server. (18/5719)

The Structure Prediction Meta Server offers a convenient way for biologists to use various high-quality structure prediction servers available worldwide. The meta server translates the results obtained from remote services into a uniform format; the translated results are then used to request a jury prediction from the remote consensus server Pcons. AVAILABILITY: The structure prediction meta server is freely available at http://BioInfo.PL/meta/. Some remote servers, however, have restrictions for non-academic users, which the meta server respects. SUPPLEMENTARY INFORMATION: Results of several sessions of the CAFASP and LiveBench programs for assessing the performance of fold-recognition servers, carried out via the meta server, are available at http://BioInfo.PL/services.html.

Easier threading through web-based comparisons and cross-validations. (19/5719)

We have developed a WWW server for the integration and comparison of protein structure predictions performed by five different servers. Users submit an amino acid sequence to a selected set of these prediction methods. Results are gathered on a single web page to facilitate comparison and analysis. All the alignments are further evaluated with a common threading tool, which makes them easier to compare. AVAILABILITY: The meta-server is freely available at http://www.infobiosud.cnrs.fr/bioserver SUPPLEMENTARY INFORMATION: http://www.infobiosud.cnrs.fr/bioserver/hah1.html

SIR: a simple indexing and retrieval system for biological flat file databases. (20/5719)

SUMMARY: SIR is a Simple Indexing and Retrieval tool for indexing and searching biological flat file databases. SIR is a cross-platform solution written entirely in Python. Since the package is very small and installation is trivial, it would be an ideal way for database providers to offer a custom retrieval tool for accessing their databases. AVAILABILITY: The modules will be made available at http://www.EMBLHeidelberg.de/~chenna/PySAT/sir.html
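
The general idea of indexing a flat file for retrieval can be sketched as follows. This is not SIR's code or API: the function names `build_index` and `fetch` are hypothetical, and the record layout (an `ID` line carrying the key, a `//` line closing each record, as in EMBL-style flat files) is an assumption made for illustration.

```python
import io


def build_index(handle, key_prefix=b"ID", terminator=b"//"):
    """Map each record key to the byte offset where its record starts.

    Assumes (hypothetically) that the key is the second token on the
    line beginning with `key_prefix` and that `terminator` ends a record.
    The handle must be opened in binary mode so offsets are exact.
    """
    index = {}
    record_start = handle.tell()
    while True:
        line = handle.readline()
        if not line:
            break
        stripped = line.rstrip(b"\r\n")
        if stripped.startswith(key_prefix + b" "):
            fields = stripped.split()
            if len(fields) >= 2:
                index[fields[1].decode("ascii")] = record_start
        elif stripped == terminator:
            # The next record begins right after the terminator line.
            record_start = handle.tell()
    return index


def fetch(handle, index, key, terminator=b"//"):
    """Seek to the stored offset and read one record, terminator included."""
    handle.seek(index[key])
    lines = []
    for line in iter(handle.readline, b""):
        lines.append(line)
        if line.rstrip(b"\r\n") == terminator:
            break
    return b"".join(lines).decode("ascii")
```

With an index of this kind, retrieving one entry costs a single seek plus a read of that entry, however large the flat file is.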

Abundant protein domains occur in proportion to proteome size. (21/5719)

BACKGROUND: Conserved domains in proteins have crucial roles in protein interactions, DNA binding, enzyme activity and other important cellular processes. It will be of interest to determine the proportions of genes containing such domains in the proteomes of different eukaryotes. RESULTS: The average proportion of conserved domains in each of five eukaryote genomes was calculated. In pairwise genome comparisons, the ratio of genes containing a given conserved domain in the two genomes on average reflected the ratio of the predicted total gene numbers of the two genomes. These ratios have been verified using a repository of databases and one of its subdivisions of conserved domains. CONCLUSIONS: Many conserved domains occur as a constant proportion of proteome size across the five sequenced eukaryotic genomes. This raises the possibility that this proportion is maintained because of functional constraints on interacting domains. The universality of the ratio in the five eukaryotic genomes attests to its potential importance.

Naturally occurring circular permutations in proteins. (22/5719)

A pair of proteins is defined to be related by a circular permutation if the N-terminal region of one protein has significant sequence similarity to the C-terminal region of the other and vice versa. To detect pairs of proteins that might be related by circular permutation, we implemented a procedure that combines a fast screening algorithm of our own design with manual verification of candidate pairs. The screening algorithm is a variation of a dynamic programming string matching algorithm, in which one of the sequences is doubled. This algorithm, although not guaranteed to identify all cases of circular permutation, is a good first indicator of protein pairs related by permutation events. The candidate pairs were further validated, first by application of an exhaustive string matching algorithm and then by manual inspection using the dotplot visual tool. Screening the whole Swissprot database identified a total of 25 independent protein pairs. These cases are presented here, divided into three categories depending on the level of functional similarity of the related proteins. To validate our approach and to confirm further the small number of circularly permuted protein pairs, a systematic search for cases of circular permutation was carried out in the Pfam database of protein domains. Even with this more inclusive definition of a circular permutation, only seven additional candidates were found. None of these fitted our original definition of circular permutations. The small number of cases of circular permutation suggests that there is no mechanism of local genetic manipulation that can induce circular permutations; most examples observed seem to result from fusion of functional units.
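
The doubling idea can be illustrated with a minimal sketch: align one sequence locally against the other sequence concatenated with itself, and flag the pair when the best alignment spans the junction between the two copies. This is not the authors' algorithm. The scoring scheme (match 2, mismatch -1, linear gap -2), the `min_gain` threshold, and the function names are all assumptions chosen for illustration.

```python
def local_align(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman-style local alignment of a against b.

    Returns (score, b_start, b_end): the best score and the half-open,
    0-based range of columns of b covered by that alignment.
    """
    prev_score = [0] * (len(b) + 1)
    prev_start = list(range(len(b) + 1))
    best = (0, 0, 0)
    for i in range(1, len(a) + 1):
        cur_score = [0] * (len(b) + 1)
        cur_start = list(range(len(b) + 1))
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            diag = prev_score[j - 1] + sub
            up = prev_score[j] + gap
            left = cur_score[j - 1] + gap
            score = max(0, diag, up, left)
            # Carry along the column where the alignment began.
            if score == 0:
                start = j
            elif score == diag:
                start = prev_start[j - 1]
            elif score == up:
                start = prev_start[j]
            else:
                start = cur_start[j - 1]
            cur_score[j] = score
            cur_start[j] = start
            if score > best[0]:
                best = (score, start, j)
        prev_score, prev_start = cur_score, cur_start
    return best


def is_permutation_candidate(a, b, min_gain=1.5):
    """Flag (a, b) when aligning a to the doubled b crosses the junction
    and scores clearly higher than aligning a to b alone.

    min_gain is a hypothetical threshold, not a published value.
    """
    n = len(b)
    direct_score, _, _ = local_align(a, b)
    doubled_score, start, end = local_align(a, b + b)
    crosses_junction = start < n < end
    return crosses_junction and doubled_score >= min_gain * max(direct_score, 1)
```

A sequence and a rotated copy of itself are flagged by this sketch, while a sequence compared with itself is not, because its best alignment lies entirely within the first copy. As the abstract notes for the screening step, such a test is only a first indicator and candidates still need further validation.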

Stabilization of local structures by pi-CH and aromatic-backbone amide interactions involving prolyl and aromatic residues. (23/5719)

Weakly polar interactions between the side-chain aromatic rings and hydrogens of backbone amides (Ar-HN) and CHn of aliphatic groups (pi-CH) are known to form local structures and to stabilize secondary structure in peptides and proteins. To investigate the existence of these interactions and to explore their possible role in constraining the structures of Pro-Xaa and Xaa-Pro fragments in proteins, a database search was performed in a non-redundant set of proteins from the Brookhaven Protein Data Bank for pi-CH and Ar-HN interactions in Pro-Xaa and Xaa-Pro fragments (where Xaa is either Phe, Tyr or Trp). In Xaa-Pro fragments, the percentages of pi-CH and Ar-HN interactions were 20.6 and 3.2%, respectively; in Pro-Xaa fragments they were 26.8 and 8.6%. Both interactions were present in 4.0% of the Pro-Xaa fragments, while no Xaa-Pro fragments had both. The protein fragments containing Ar-HN and/or pi-CH interactions were clustered on the basis of similarity of selected torsion angles. The clustering resulted in well-defined clusters. Thus, pi-CH and Ar(i)-HN(i) interactions were able to constrain individual conformations of the Pro-Xaa and Xaa-Pro fragments. These local structures were found to be independent of the secondary structure of the polypeptide chains in which the fragments were found.

Knowledge-based potential defined for a rotamer library to design protein sequences. (24/5719)

A knowledge-based potential for a rotamer library was developed to design protein sequences. Protein side-chain conformations are represented by 56 templates. The fitness of each template to a given structural site-environment is evaluated by a combined function of three knowledge-based terms, i.e. two-body side-chain packing, one-body hydration and local conformation. The number of matches between the native sequence and the structural site-environment in the database and that of the virtually settled mismatches, counted in advance, were transformed into the energy scores. In the best-14 test (which assesses how well the native rotamer is reproduced on its structural site, i.e. whether it falls within a quarter of the 56 fitness rank positions), in the structural stability analysis of mutants of human and T4 lysozymes, and in the inverse-folding search by a structure profile against the sequence database, this function performs better than the function deduced with the conventional normalization and our previously developed function. Targeting various structural motifs, de novo sequence design was conducted with the function. The sequences thus obtained exhibit reasonable molecular masses and hydrophobic/hydrophilic patterns similar to the native sequences of the target, and they behave like homologs of the target proteins in BLASTP searches. This significant improvement is discussed in terms of the reference state for normalization and the crucial role of short-range repulsion to prohibit residue bumps.
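
The best-14 criterion described above (native rotamer ranked within the top quarter of 56 templates) can be written down as a short sketch. The assumptions here are that a lower energy score means a better fit, that ties are resolved in favour of the native template, and that the helper names `native_rank` and `best_k_rate` are hypothetical. The potential itself is not reproduced.

```python
def native_rank(energies, native_index):
    """1-based rank of the native template among all templates at a site.

    Assumes lower energy is better; templates tied with the native one
    do not push its rank down (an optimistic tie rule, chosen here).
    """
    native_energy = energies[native_index]
    return 1 + sum(1 for e in energies if e < native_energy)


def best_k_rate(sites, k=14):
    """Fraction of sites whose native template ranks within the top k.

    `sites` is a list of (energies, native_index) pairs, one per
    structural site, where `energies` holds one score per template.
    """
    hits = sum(1 for energies, native in sites if native_rank(energies, native) <= k)
    return hits / len(sites)
```

With 56 templates, `k=14` corresponds to the top quarter of rank positions mentioned in the abstract.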