Improved secondary structure predictions for a nicotinic receptor subunit: incorporation of solvent accessibility and experimental data into a two-dimensional representation. (65/21774)

A refined prediction of the secondary structure of the nicotinic acetylcholine receptor (nAChR) subunits was computed with third-generation algorithms. The four selected programs, PHD, Predator, DSC, and NNSSP, based on different prediction approaches, were applied to each sequence of an alignment of nAChR and 5-HT3 receptor subunits, as well as a larger alignment with related subunit sequences from glycine and GABA receptors. A consensus prediction was computed for the nAChR subunits through a "winner takes all" method. By integrating the probabilities obtained with PHD, DSC, and NNSSP, this prediction was filtered in order to eliminate the singletons and to establish the structure limits more precisely (only 4% of the residues were modified). The final consensus secondary structure includes nine alpha-helices (24.2% of the residues, with an average length of 13.9 residues) and 17 beta-strands (22.5% of the residues, with an average length of 6.6 residues). The large extracellular domain is predicted to be mainly composed of beta-strands, with only two helices at the amino-terminal end. The transmembrane segments are predicted to be in a mixed alpha/beta topology (with a predominance of alpha-helices), with no known equivalent in the current protein database. The cytoplasmic domain is predicted to consist of two well-conserved amphipathic helices joined together by an unfolded stretch of variable length and sequence. In general, the segments predicted to occur in a periodic structure correspond to the more conserved regions, as defined by an analysis of sequence conservation per position performed on 152 superfamily members. The solvent accessibility of each residue was predicted from the multiple alignments with PHDacc. Each segment with more than three exposed residues was assumed to be external to the core protein. Overall, these data constitute an envelope of structural constraints.
In a subsequent step, experimental data relative to the extracellular portion of the complete receptor were incorporated into the model. This led to a proposed two-dimensional representation of the secondary structure in which the peptide chain of the extracellular domain winds alternately between the two interfaces of the subunit. Although this representation is not a tertiary structure and does not lead to predictions of specific beta-beta interactions, it should provide a basic framework for further mutagenesis investigations and for fold recognition (threading) searches.
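The "winner takes all" consensus and the singleton filter described above can be illustrated with a minimal sketch. It is not the authors' implementation. The three-state alphabet (H, E, C), the tie rule that falls back to coil, and the choice to reset singletons to coil are assumptions made for this example; the paper instead filtered singletons using the probabilities reported by PHD, DSC, and NNSSP. The function names are hypothetical.

```python
from collections import Counter


def consensus(predictions, tie_state="C"):
    """Per-position majority vote over equal-length prediction strings (H/E/C)."""
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    result = []
    for column in zip(*predictions):
        ranked = Counter(column).most_common()
        # A tie for first place has no winner; fall back to tie_state (assumption).
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            result.append(tie_state)
        else:
            result.append(ranked[0][0])
    return "".join(result)


def remove_singletons(states, coil="C"):
    """Reset isolated one-residue H or E assignments to coil (assumed rule)."""
    out = list(states)
    for i, state in enumerate(states):
        if state == coil:
            continue
        left = states[i - 1] if i > 0 else None
        right = states[i + 1] if i < len(states) - 1 else None
        if left != state and right != state:
            out[i] = coil
    return "".join(out)


# Four toy predictions for one eight-residue stretch (made-up data).
toy = ["HHHHCCEC", "HHHCCCEC", "HHCCCEEC", "HHHHCCCC"]
voted = consensus(toy)               # "HHHCCCEC"
filtered = remove_singletons(voted)  # "HHHCCCCC"
```

In the toy input, position 4 is a 2-2 tie and falls back to coil, and the lone E at position 7 is removed by the filter.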

COVOL: an interactive program for evaluating second virial coefficients from the triaxial shape or dimensions of rigid macromolecules. (66/21774)

An interactive program is described for calculating the second virial coefficient contribution to the thermodynamic nonideality of solutions of rigid macromolecules based on their triaxial dimensions. The FORTRAN-77 program, available in precompiled form for the PC, is based on theory for the covolume of triaxial ellipsoid particles [Rallison, J. M., and S. E. Harding. (1985). J. Colloid Interface Sci. 103:284-289]. This covolume has the potential to provide a magnitude for the second virial coefficient of macromolecules bearing no net charge. Allowance for a charge-charge contribution is made via an expression based on Debye-Hückel theory and uniform distribution of the net charge over the surface of a sphere with dimensions governed by the Stokes radius of the macromolecule. Ovalbumin, ribonuclease A, and hemoglobin are used as model systems to illustrate application of the COVOL routine.
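To make the link between covolume and second virial coefficient concrete, here is a minimal sketch for the simplest special case: an uncharged sphere, i.e. a triaxial ellipsoid with three equal semi-axes. It does not reproduce the Rallison-Harding ellipsoid theory or the charge-charge term used by COVOL. It uses the standard relation B = N_A U / (2 M^2), with covolume U = 8V for two identical spheres. The function name, the units, and the input values are illustrative assumptions, not data from the paper.

```python
import math

AVOGADRO = 6.02214076e23  # particles per mole


def sphere_excluded_volume_b2(radius_nm, molar_mass_g_per_mol):
    """Hard-sphere second virial coefficient in mL mol g^-2 (uncharged solute)."""
    radius_cm = radius_nm * 1e-7
    particle_volume = 4.0 / 3.0 * math.pi * radius_cm ** 3  # cm^3 per particle
    covolume = 8.0 * particle_volume  # excluded volume of two identical spheres
    return AVOGADRO * covolume / (2.0 * molar_mass_g_per_mol ** 2)


# Illustrative inputs only: a 3 nm sphere with a molar mass of 45,000 g/mol.
b2 = sphere_excluded_volume_b2(3.0, 45_000.0)
```

Because the covolume scales with the cube of the radius, doubling the radius at fixed molar mass raises the coefficient eightfold.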

A vector-based method for drawing RNA secondary structure. (67/21774)

MOTIVATION: To produce a polygonal display of RNA secondary structure with minimal overlap and distortion of structural elements, with minimal search for positioning them, and with minimal user intervention. RESULTS: A new algorithm for automatically drawing RNA secondary structure has been developed. The algorithm represents the direction and space for a structural element using a vector and a vector space. Two heuristics are used. The first heuristic is concerned with ordering structural elements to be positioned and the second with positioning them in space. The algorithm and a graphical user interface have been implemented in a working program called VizQFolder on IBM PC compatibles. Experimental results demonstrate that VizQFolder is capable of automatically generating nearly overlap-free polygonal displays for long RNA molecules. The only distortion performed to avoid overlap is the rotation of helices, leading to efficient generation of a polygonal display without sacrificing its readability. VizQFolder is not coupled to a specific prediction program of RNA secondary structure, and thus can be used for visualizing secondary structure models obtained by any means. AVAILABILITY: The executable code of VizQFolder is available at http://automation.inha.ac.kr/khan. It can also be obtained from the authors upon request.

Calign: aligning sequences with restricted affine gap penalties. (68/21774)

MOTIVATION: Given a genomic DNA sequence, it is still an open problem to determine its coding regions, i.e., the regions consisting of exons and introns. The comparison of cDNA and genomic DNA helps the understanding of coding regions. For such an application, it might be adequate to use restricted affine gap penalties, which penalize long gaps with a constant penalty. RESULTS: Several techniques developed for solving the approximate string-matching problem are employed to yield efficient algorithms for computing the optimal alignment with restricted affine gap penalties. In particular, efficient algorithms can be derived based on the suffix automaton with failure transitions and on the diagonalwise monotonicity of the cost tables. We have implemented the above methods in C on Sun workstations running SunOS Unix. Preliminary experiments show that these approaches are very promising for aligning a cDNA sequence with a genomic DNA sequence. AVAILABILITY: Calign is available free of charge by anonymous ftp at: iubio.bio.indiana.edu, directory: molbio/align, files: calign.driver.c calign.c. Another URL reference for the files is http://iubio.bio.indiana.edu/soft/molbio/align/+ ++calign.c.
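The idea of a restricted affine gap penalty can be sketched as follows. This is not the Calign algorithm: the suffix-automaton and diagonal-monotonicity techniques mentioned above are not reproduced, and a plain cubic-time dynamic program is used instead. The penalty form min(gap_open + gap_extend * length, cap) is an assumed reading of "penalize long gaps with a constant penalty", and all parameter values and function names are hypothetical.

```python
def gap_cost(length, gap_open=4, gap_extend=1, cap=10):
    """Affine cost for short gaps, constant cost once the cap is reached."""
    return min(gap_open + gap_extend * length, cap)


def align_cost(a, b, mismatch=3, **gap_params):
    """Minimum global alignment cost of a and b under gap_cost (cubic-time DP)."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            best = inf
            if i > 0 and j > 0:
                sub = 0 if a[i - 1] == b[j - 1] else mismatch
                best = cost[i - 1][j - 1] + sub
            # Gap of length k that leaves a[i-k:i] unaligned.
            for k in range(1, i + 1):
                best = min(best, cost[i - k][j] + gap_cost(k, **gap_params))
            # Gap of length k that leaves b[j-k:j] unaligned.
            for k in range(1, j + 1):
                best = min(best, cost[i][j - k] + gap_cost(k, **gap_params))
            cost[i][j] = best
    return cost[n][m]


# Toy cDNA against a toy genomic sequence carrying a 20-base insert.
cdna = "ACGTTGCA"
genomic = "ACGT" + "G" * 20 + "TGCA"
total = align_cost(cdna, genomic)  # 10: the long gap is charged only the cap
```

With these toy parameters a 20-base gap costs the same as a 6-base gap, which is why an intron-sized gap does not dominate the score when a cDNA is aligned to genomic DNA.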

ESPript: analysis of multiple sequence alignments in PostScript. (69/21774)

MOTIVATION: The program ESPript (Easy Sequencing in PostScript) allows the rapid visualization, via PostScript output, of sequences aligned with popular programs such as CLUSTAL-W or GCG PILEUP. It can read secondary structure files (such as those created by the program DSSP) to produce a synthesis of both sequence and structural information. RESULTS: ESPript can be run via a command file or a friendly HTML-based user interface. The program calculates a homology score by columns of residues and can sort this calculation by groups of sequences. It offers a palette of markers to highlight important regions in the alignment. ESPript can also paste information on residue conservation into coordinate files, for subsequent visualization with a graphics program. AVAILABILITY: ESPript can be accessed on its Web site at http://www.ipbs.fr/ESPript. Sources and helpfiles can be downloaded via anonymous ftp from ftp.ipbs.fr. A tar file is held in the directory pub/ESPript.

DINAMO: interactive protein alignment and model building. (70/21774)

MOTIVATION: To facilitate the process of structure prediction by both comparative modeling and fold recognition, we describe DINAMO, an interactive protein alignment building and model evaluation tool that dynamically couples a multiple sequence alignment editor to a molecular graphics display. DINAMO allows the user to optimize the alignment and model to satisfy the known heuristics of protein structure by means of a set of analysis tools. The analysis tools return information to both the alignment editor and graphics model in the form of visual cues (color, shape), allowing for rapid evaluation. Several analysis tools may be employed, including residue conservation, residue properties (charge, hydrophobicity, volume), residue environmental preference, and secondary structure propensity. RESULTS: We demonstrate DINAMO by building a model for submission in the 3rd annual Critical Assessment of Techniques for Protein Structure Prediction (CASP3) contest. AVAILABILITY: DINAMO is freely available as a local application or Web-based Java applet at http://tito.ucsc.edu/dinamo

Automated analysis of interatomic contacts in proteins. (71/21774)

MOTIVATION: New software has been designed to assist the molecular biologist in understanding the structural consequences of modifying a ligand and/or protein. RESULTS: Tools are described for the analysis of ligand-protein contacts (LPC software) and contacts of structural units (CSU software) such as helices, sheets, strands and residues. Our approach is based on a detailed analysis of interatomic contacts and interface complementarity. For any ligand or structural unit, the software automatically: (i) calculates the solvent-accessible surface of every atom; (ii) determines the contacting residues and the type of interaction they undergo (hydrophobic-hydrophobic, aromatic-aromatic, etc.); (iii) indicates all putative hydrogen bonds. LPC software further predicts changes in binding strength following chemical modification of the ligand. AVAILABILITY: Both LPC and CSU can be accessed through the PDB and are integrated in the 3DB Atlas page of all PDB files. For any given file, the tools can also be accessed at http://www.pdb.bnl.gov/pdb-bin/lpc?PDB_ID= and http://www.pdb.bnl.gov/pdb-bin/csu?PDB_ID= with the four-letter PDB code added at the end in each case. Finally, LPC and CSU can be accessed at: http://sgedg.weizmann.ac.il/lpc and http://sgedg.weizmann.ac.il/csu.

MIAH: automatic alignment of eukaryotic SSU rRNAs. (72/21774)

SUMMARY: MIAH is a WWW server for the automatic alignment of new eukaryotic SSU rRNA sequences to an existing alignment of 1500 sequences. AVAILABILITY: http://chah.ucc.ie/MIAH