Feature selection for DNA methylation based cancer classification.
Molecular portraits, such as mRNA expression or DNA methylation patterns, have been shown to be strongly correlated with phenotypical parameters. These molecular patterns can be revealed routinely on a genomic scale. However, class prediction based on these patterns is an under-determined problem, due to the extremely high dimensionality of the data compared to the usually small number of available samples. This makes a reduction of the data dimensionality necessary. Here we demonstrate how phenotypic classes can be predicted by combining feature selection and discriminant analysis. By comparing several feature selection methods we show that the right dimension reduction strategy is of crucial importance for the classification performance. The techniques are demonstrated by methylation-pattern-based discrimination between acute lymphoblastic leukemia and acute myeloid leukemia.
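The combination of feature selection and a discriminant rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the Fisher-style score, the nearest-centroid rule (a very simple linear discriminant stand-in), the function names and the toy data are all hypothetical choices made here, not the methods or data of the paper.

```python
from statistics import mean, pvariance

def fisher_scores(X, y):
    """Score each feature: squared difference of class means over summed class variances."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        denom = pvariance(a) + pvariance(b) + 1e-12  # guard against zero variance
        scores.append((mean(a) - mean(b)) ** 2 / denom)
    return scores

def select_top_k(X, y, k):
    """Return the indices of the k highest-scoring features."""
    scores = fisher_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

def fit_centroids(X, y, features):
    """Compute one centroid per class, restricted to the selected features."""
    centroids = {}
    for label in set(y):
        rows = [row for row, l in zip(X, y) if l == label]
        centroids[label] = [mean(r[j] for r in rows) for j in features]
    return centroids

def predict(row, centroids, features):
    """Assign the class whose centroid is nearest in squared Euclidean distance."""
    def dist(c):
        return sum((row[j] - cj) ** 2 for j, cj in zip(features, c))
    return min(centroids, key=lambda label: dist(centroids[label]))

# Toy data (made up): 6 samples x 4 features; only feature 2 separates the classes.
X = [
    [0.50, 0.1, 0.90, 0.3],
    [0.40, 0.8, 0.80, 0.6],
    [0.60, 0.3, 0.95, 0.2],
    [0.50, 0.7, 0.10, 0.4],
    [0.45, 0.2, 0.20, 0.5],
    [0.55, 0.9, 0.15, 0.3],
]
y = [0, 0, 0, 1, 1, 1]

selected = select_top_k(X, y, k=1)
centroids = fit_centroids(X, y, selected)
label = predict([0.5, 0.5, 0.85, 0.4], centroids, selected)
print(selected, label)
```

In a real evaluation the selection step would have to be repeated inside each cross-validation fold, since selecting features on the full data set before splitting tends to give over-optimistic accuracy estimates.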
Separation of samples into their constituents using gene expression data.
Gene expression measurements are a powerful tool in molecular biology, but when applied to heterogeneous samples containing more than one cellular type the results are difficult to interpret. We present here a new approach to this problem that allows the gene expression profiles of the various cellular types contained in a set of samples to be deduced directly from the measurements taken on the whole samples.
The main biological determinants of tumor line taxonomy elucidated by a principal component analysis of microarray data.
By using principal components analysis (PCA) we demonstrate here that the information relevant to tumor line classification linked to the activity of 1375 genes expressed in 60 tumor cell lines can be reproduced by only five independent components. These components can be interpreted as cell motility and migration, cellular trafficking and endo/exocytosis, and epithelial character. PCA, unlike the cluster analysis methods routinely used in microarray analysis, allows individual genes to participate in multiple biochemical pathways, while assigning to each cell line a quantitative score reflecting fundamental biological functions.
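To make the idea of a component and a per-sample score concrete, here is a minimal sketch of PCA restricted to the leading component, computed by power iteration on the covariance matrix. It assumes tiny made-up data and hypothetical function names; it is not the analysis pipeline of the study, which would normally rely on a numerical library and extract several components.

```python
import math

def center(X):
    """Subtract the column means so that every variable has mean zero."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[v - m for v, m in zip(row, means)] for row in X]

def covariance(Xc):
    """Sample covariance matrix of already-centered data."""
    n, p = len(Xc), len(Xc[0])
    return [[sum(r[i] * r[j] for r in Xc) / (n - 1) for j in range(p)]
            for i in range(p)]

def leading_component(C, iters=200):
    """Power iteration: returns the largest eigenvalue of C and its eigenvector.

    Assumes the start vector is not orthogonal to the leading eigenvector.
    """
    p = len(C)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(C[i][j] * v[j] for j in range(p)) for i in range(p))
    return eigenvalue, v

# Toy data (made up): 4 samples x 2 perfectly correlated variables.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
Xc = center(X)
C = covariance(Xc)
eigenvalue, direction = leading_component(C)

# Fraction of total variance captured by the first component.
fraction = eigenvalue / sum(C[i][i] for i in range(len(C)))

# Quantitative score of each sample on the first component.
scores = [sum(a * b for a, b in zip(row, direction)) for row in Xc]
print(fraction, scores)
```

The entries of `direction` play the role of loadings: a variable (a gene, in the abstract's setting) can carry non-zero weight on several components at once, which is the property contrasted with hard clustering above.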
High-resolution metabolic phenotyping of genetically and environmentally diverse potato tuber systems. Identification of phenocopies.
We conducted a comprehensive metabolic phenotyping of potato (Solanum tuberosum L. cv Desiree) tuber tissue that had been modified either by transgenesis or exposure to different environmental conditions using a recently developed gas chromatography-mass spectrometry profiling protocol. Applying this technique, we were able to identify and quantify the major constituent metabolites of the potato tuber within a single chromatographic run. The plant systems that we selected to profile were tuber discs incubated in varying concentrations of fructose, sucrose, and mannitol and transgenic plants impaired in their starch biosynthesis. The resultant profiles were then compared, first at the level of individual metabolites and then using the statistical tools hierarchical cluster analysis and principal component analysis. These tools allowed us to assign clusters to the individual plant systems and to determine relative distances between these clusters; furthermore, analyzing the loadings of these analyses enabled identification of the most important metabolites in the definition of these clusters. The metabolic profiles of the sugar-fed discs were dramatically different from the wild-type steady-state values. When these profiles were compared with one another and also with those we assessed in previous studies, however, we were able to evaluate potential phenocopies. These comparisons highlight the importance of such an approach in the functional and qualitative assessment of diverse systems to gain insights into important mediators of metabolism.
Percent G+C profiling accurately reveals diet-related differences in the gastrointestinal microbial community of broiler chickens.
Broiler chickens from eight commercial farms in Southern Finland were analyzed for the structure of their gastrointestinal microbial community by a nonselective DNA-based method, percent G+C-based profiling. The bacteriological impact of the feed source and in-farm whole-wheat amendment of the diet was assessed by percent G+C profiling. Also, a phylogenetic 16S rRNA gene (rDNA)-based study was carried out to aid in interpretation of the percent G+C profiles. This survey showed that most of the 16S rDNA sequences found could not be assigned to any previously known bacterial genus or they represented an unknown species of one of the taxonomically heterogeneous genera, such as Ruminococcus or Clostridium. The data from bacterial community profiling were analyzed by t-test, multiple linear regression, and principal-component statistical approaches. The percent G+C profiling method with appropriate statistical analyses detected microbial community differences smaller than 10% within each 5% increment of the percent G+C profiles. Diet turned out to be the strongest determinant of the cecal bacterial community structure. Both the source of feed and local feed amendment changed the bacteriological profile significantly, whereas profiles of individual farms with identical feed regimens hardly differed from each other. This suggests that the management of typical Finnish farms is relatively uniform or that hygiene on the farm, in fact, has little impact on the structure of the cecal bacterial community. Therefore, feed compounders should have a significant role in the modulation of gut microflora and consequently in prevention of gastrointestinal disorders in farm animals.
Independent representations of limb axis length and orientation in spinocerebellar response components.
Dorsal spinocerebellar tract (DSCT) neurons transmit sensory signals to the cerebellum that encode global hindlimb parameters, such as the hindlimb end-point position and its direction of movement. Here we use a population analysis approach to examine further the characteristics of DSCT neuronal responses during continuous movements of the hind foot. We used a robot to move the hind paw of anesthetized cats through the trajectories of a step or a figure-8 footpath in a parasagittal plane. Extracellular recordings from 82 cells converted to cycle histograms provided the basis for a principal-component analysis to determine the common features of the DSCT movement responses. Five principal components (PCs) accounted for about 80% of the total variance in the waveforms across units. The first two PCs accounted for about 60% of the variance and they were highly robust across samples. We examined the relationship between the responses and limb kinematic parameters by correlating the PC waveforms with waveforms of the joint angle and limb axis trajectories using multivariate linear regression models. Each PC waveform could be at least partly explained by a linear relationship to joint-angle trajectories, but except for the first PC, they required multiple angles. However, the limb axis parameters were more closely related to both the first and second PC waveforms. In fact, linear regression models with limb axis length and orientation trajectories as predictors explained 94% of the variance in both PCs, and each was related to a particular linear combination of position and velocity. The first PC correlated with the limb axis orientation and orientation velocity trajectories, whereas the second PC correlated with the length and length velocity trajectories. These combinations were found to correspond to the dynamics of muscle spindle responses. The first two PCs were also most representative of the data set since about half the DSCT responses could be at least 85% accounted for by weighted linear combinations of these two PCs. Higher-order PCs were unrelated to limb axis trajectories and accounted instead for different dynamic components of the responses. The findings imply that an explicit and independent representation of the limb axis length and orientation may be present at the lowest levels of sensory processing in the spinal cord.
Reliability, validity and psychometric properties of the Greek translation of the Zung Depression Rating Scale.
INTRODUCTION: The current study aimed to assess the reliability, validity and psychometric properties of the Greek translation of the Zung Depression Rating Scale (ZDRS). METHODS: The study sample included 40 depressed patients aged 29.65 +/- 9.38 years and 120 normal comparison subjects aged 27.23 +/- 10.62 years. In 20 of them (12 patients and 8 comparison subjects) the instrument was re-applied 1-2 days later. Translation and back translation were performed. Clinical diagnosis was reached by consensus of two examiners with the use of the SCAN v.2.0 and the IPDE. Statistical analysis included ANOVA, the Pearson product-moment correlation coefficient, principal components analysis, discriminant function analysis and the calculation of Cronbach's alpha. RESULTS: Both sensitivity and specificity exceeded 90.00 at 44/45. Cronbach's alpha for the total scale was equal to 0.09, suggesting that the scale covers a broad spectrum of symptoms. Factor analysis revealed five factors (anxiety-depression, thought content, gastroenterological symptoms, irritability and social-interpersonal functioning). The test-retest reliability was satisfactory (Pearson's R between 0.92). CONCLUSION: The Greek translation of the ZDRS is both reliable and valid and is suitable for clinical and research use. Its properties are similar to those reported in the international literature, although that literature is limited. However, one should always keep in mind the limitations inherent in the use of self-report scales.
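For readers unfamiliar with the internal-consistency statistic reported above, Cronbach's alpha for a scale of k items is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below applies that standard formula to made-up responses; the function name and the data are illustrative assumptions and have nothing to do with the ZDRS data set.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance(col) for col in zip(*responses)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Toy data (made up): 4 respondents answering 3 items identically,
# i.e. the items are perfectly consistent with one another.
responses = [
    [1, 1, 1],
    [2, 2, 2],
    [3, 3, 3],
    [4, 4, 4],
]
alpha = cronbach_alpha(responses)
print(alpha)
```

Values near 1 indicate that the items vary together; values near 0 indicate that the items share little common variance.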
Analysis of large-scale gene expression data.
DNA microarray technology has resulted in the generation of large, complex data sets, such that the bottleneck in biological investigation has shifted from data generation to data analysis. This review discusses some of the algorithms and tools for the analysis and organisation of microarray expression data, including clustering methods, partitioning methods, and methods for correlating expression data to other biological data.