Skewed distribution of protein secondary structure contents over the conformational triangle.

A conformational triangle method is presented to analyze the secondary structure contents of 1028 structurally known proteins in the non-redundant data set of the recent 25% PDB_SELECT. The secondary structure contents of each protein are mapped onto a point in the triangle. It was found that the distribution of the 1028 points is strongly skewed in the triangle and that about 42% of the whole area is empty; this empty region is called the forbidden area. The detailed border between the allowable and forbidden areas was calculated. A possible explanation of the skewed distribution is discussed. The distributions of the mapping points for enzymes and non-enzymes in this non-redundant data set are compared. It was found that a necessary, though not sufficient, condition for an enzyme molecule is that its coil content must be ≥ 0.223. It is hoped that the skewed distribution observed here could be used to test secondary structure and threading predictions.
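The mapping described above can be sketched as a barycentric projection, assuming the three contents are helix, strand, and coil fractions that sum to 1. The vertex layout (`VERTICES`) and the function names are hypothetical; the abstract does not say how the triangle is oriented. Only the coil threshold of 0.223 comes from the text.

```python
import math

# Hypothetical vertex layout for an equilateral triangle of unit side.
# The abstract does not specify which corner holds which structure class.
VERTICES = {
    "helix": (0.0, 0.0),
    "strand": (1.0, 0.0),
    "coil": (0.5, math.sqrt(3) / 2),
}

# Necessary (not sufficient) coil content for an enzyme, as reported in the abstract.
COIL_THRESHOLD = 0.223


def to_triangle_point(helix, strand, coil):
    """Map secondary structure contents to a 2D point via barycentric coordinates."""
    total = helix + strand + coil
    if total <= 0:
        raise ValueError("contents must sum to a positive value")
    # Normalize so the three weights sum to 1 even if the inputs are percentages.
    weights = {"helix": helix / total, "strand": strand / total, "coil": coil / total}
    x = sum(weights[k] * VERTICES[k][0] for k in VERTICES)
    y = sum(weights[k] * VERTICES[k][1] for k in VERTICES)
    return x, y


def may_be_enzyme(coil):
    """Apply the necessary condition only; passing it does not imply an enzyme."""
    return coil >= COIL_THRESHOLD
```

A pure-helix protein lands on the helix vertex, and a protein with equal helix and strand content and no coil lands midway along the bottom edge. Because the condition is necessary rather than sufficient, `may_be_enzyme` can only rule proteins out.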

Large-scale clustering of cDNA-fingerprinting data.

Clustering is one of the main mathematical challenges in large-scale gene expression analysis. We describe a clustering procedure based on a sequential k-means algorithm with additional refinements that is able to handle high-throughput data on the order of hundreds of thousands of data items measured on hundreds of variables. The practical motivation for our algorithm is oligonucleotide fingerprinting, a method for simultaneous determination of the expression level of every active gene of a specific tissue, although the algorithm can also be applied to other large-scale projects such as EST clustering and qualitative clustering of DNA-chip data. As a pairwise similarity measure between two p-dimensional data points, x and y, we introduce mutual information, which can be interpreted as the amount of information about x in y, and vice versa. We show that for our purposes this measure is superior to commonly used metric distances, for example, Euclidean distance. We also introduce a modified version of mutual information as a novel method for validating clustering results when the true clustering is known. The performance of our algorithm with respect to experimental noise is shown by extensive simulation studies. The algorithm is tested on a subset of 2029 cDNA clones coming from 15 different genes from a cDNA library derived from human dendritic cells. Furthermore, the clustering of these 2029 cDNA clones is demonstrated when the entire set of 76,032 cDNA clones is processed.
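As an illustration of the similarity measure, the following is a minimal sketch of empirical mutual information between two discrete vectors. It assumes the fingerprint signals have already been discretized (for example into 0/1 hybridization calls), which the abstract does not state; the function name is hypothetical and this is not the authors' implementation.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Empirical mutual information (in bits) between two equal-length discrete vectors."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    joint = Counter(zip(x, y))  # counts of co-occurring value pairs
    px = Counter(x)             # marginal counts for x
    py = Counter(y)             # marginal counts for y
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written in terms of counts
        mi += (count / n) * math.log2(count * n / (px[a] * py[b]))
    return mi
```

Unlike Euclidean distance, this measure is symmetric in x and y and scores a perfectly inverted fingerprint as highly as an identical one, since either fully determines the other.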

Exploring expression data: identification and analysis of coexpressed genes.

Analysis procedures are needed to extract useful information from the large amount of gene expression data that is becoming available. This work describes a set of analytical tools and their application to yeast cell cycle data. The components of our approach are (1) a similarity measure that reduces the number of false positives, (2) a new clustering algorithm designed specifically for grouping gene expression patterns, and (3) an interactive graphical cluster analysis tool that allows user feedback and validation. We use the clusters generated by our algorithm to summarize genome-wide expression and to initiate supervised clustering of genes into biologically meaningful groups.

Novel selenoproteins identified in silico and in vivo by using a conserved RNA structural motif.

Selenocysteine is incorporated into selenoproteins by an in-frame UGA codon whose readthrough requires the selenocysteine insertion sequence (SECIS), a conserved hairpin in the 3'-untranslated region of eukaryotic selenoprotein mRNAs. To identify new selenoproteins, we developed a strategy that obviates the need for prior amino acid sequence information. A computational screen was used to scan nucleotide sequence databases for sequences presenting a potential SECIS secondary structure. The computer-selected hairpins were then assayed in vivo for their functional capacities, and the cDNAs corresponding to the SECIS winners were identified. Four of them encoded novel selenoproteins as confirmed by in vivo experiments. Among these, SelZf1 and SelZf2 share a common domain with mitochondrial thioredoxin reductase-2. The three proteins, however, possess distinct N-terminal domains. We found that another protein, SelX, displays sequence similarity to a protein involved in bacterial pilus formation. For the first time, four novel selenoproteins were discovered based on a computational screen for the RNA hairpin directing selenocysteine incorporation.

Genetic plasticity of V genes under somatic hypermutation: statistical analyses using a new resampling-based methodology.

Evidence for somatic hypermutation of immunoglobulin genes has been observed in all of the species in which immunoglobulins have been found. Previous studies have suggested that codon usage in immunoglobulin variable (V) region genes is such that the sequence-specificity of somatic hypermutation results in greater mutability in the complementarity-determining regions of the gene than in the framework regions. We have developed a new resampling-based methodology to explore genetic plasticity in individual V genes and in V gene families in a statistically meaningful way. We determine what factors contribute to this mutability difference and characterize the strength of selection for this effect. We find that although the codon usage in immunoglobulin V genes renders them distinct among translationally equivalent sequences with random codon usage, they are nevertheless not optimal in this regard. We find that the mutability patterns in a number of species are similar to those we find for human sequences. Interestingly, sheep sequences show extremely strong mutability differences, consistent with the role of somatic hypermutation in the diversification of the primary antibody repertoire in these animals. Human TCR V(beta) sequences resemble immunoglobulin in mutability pattern, suggesting one of several alternatives: that hypermutation is functionally operating in TCR, that it was once operating in TCR or in the common precursor of TCR and immunoglobulin, or that the hypermutation mechanism has evolved to exploit the codon usage in immunoglobulin (and, fortuitously, TCR) rather than vice versa. Our findings provide support for the hypothesis that somatic hypermutation appeared very early in the phylogeny of immune systems, that it is, to a large extent, shared between species, and that it makes an essential contribution to the generation of the antibody repertoire.
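The comparison against "translationally equivalent sequences with random codon usage" can be illustrated with a small resampling sketch. It assumes the standard genetic code, uniform choice among synonymous codons, and uppercase A/C/G/T input; the abstract does not give the authors' sampling scheme or mutability statistic, so `statistic` is a caller-supplied placeholder and all function names are hypothetical.

```python
import random
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Invert the table: amino acid -> list of synonymous codons.
AA_TO_CODONS = {}
for codon, aa in CODON_TO_AA.items():
    AA_TO_CODONS.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate an in-frame DNA string, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, usable, 3))


def resample_synonymous(dna, rng):
    """Return a sequence encoding the same protein with uniformly random codon choice."""
    return "".join(rng.choice(AA_TO_CODONS[aa]) for aa in translate(dna))


def fraction_at_least(dna, statistic, n_samples=1000, seed=0):
    """Fraction of resampled sequences whose statistic is >= the observed value."""
    rng = random.Random(seed)
    observed = statistic(dna)
    hits = sum(
        statistic(resample_synonymous(dna, rng)) >= observed
        for _ in range(n_samples)
    )
    return hits / n_samples
```

A small fraction indicates that the observed sequence scores unusually high among its synonymous alternatives. To contrast complementarity-determining and framework regions, the statistic would need to be computed per region, which this sketch leaves to the caller.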

The role of terminators and occlusion cues in motion integration and segmentation: a neural network model.

The perceptual interaction of terminators and occlusion cues with the functional processes of motion integration and segmentation is examined using a computational model. Integration is necessary to overcome noise and the inherent ambiguity in locally measured motion direction (the aperture problem). Segmentation is required to detect the presence of motion discontinuities and to prevent spurious integration of motion signals between objects with different trajectories. Terminators are used for motion disambiguation, while occlusion cues are used to suppress motion noise at points where objects intersect. The model illustrates how competitive and cooperative interactions among cells carrying out these functions can account for a number of perceptual effects, including the chopsticks illusion and the occluded diamond illusion. Possible links to the neurophysiology of the middle temporal visual area (MT) are suggested.

Short-range vernier acuity: interactions of temporal frequency, temporal phase, and stimulus polarity.

We examined how vernier thresholds for flickering bars depend on the temporal frequency and relative temporal phase of the bars. The largest effect of relative phase (up to a fivefold increase in displacement thresholds) was seen at 2 Hz, and for most subjects, relative phase had little effect at 16 Hz and above. The effect of relative phase was essentially independent of contrast and trial duration. Thresholds were elevated by the greatest amount when bars were presented in antiphase, but at 1 and 4 Hz, quadrature phase offsets also led to substantial elevations in displacement thresholds. An experiment designed to examine the interaction of the vernier judgment with apparent motion failed to identify a role for mechanisms sensitive to apparent motion in threshold elevation. Another experiment in which the bars were modulated with sawtooth waveforms indicated that temporal correlation between the bars, rather than the ON versus OFF distinction, underlies the phase sensitivity. A simple dynamical model that posits partial rectification prior to a cross-correlation-like interaction accounts for the observed results.

Trichromatic opponent color classification.

Stimuli varying in intensity and chromaticity, presented on numerous backgrounds, were classified into red/green, blue/yellow and white/black opponent color categories. These measurements revealed the shapes of the boundaries that separate opponent colors in three-dimensional color space. Opponent color classification boundaries were generally not planar, but their shapes could be summarized by a piecewise linear model in which increment and decrement color signals are combined with different weights at two stages to produce opponent color sensations. The effect of background light on classification was largely explained by separate gain changes in increment and decrement cone signals.