MOTIVATION: Most existing bioinformatics methods are limited to making point estimates of one variable, e.g. the optimal alignment, with fixed input values for all other variables, e.g. gap penalties and scoring matrices. While the requirement to specify parameters remains one of the more vexing issues in bioinformatics, it is a reflection of a larger issue: the need to broaden the view on statistical inference in bioinformatics. RESULTS: The assignment of probabilities for all possible values of all unknown variables in a problem in the form of a posterior distribution is the goal of Bayesian inference. Here we show how this goal can be achieved for most bioinformatics methods that use dynamic programming. Specifically, a tutorial-style description of a Bayesian inference procedure for segmentation of a sequence based on the heterogeneity in its composition is given. In addition, full Bayesian inference algorithms for sequence alignment are described. AVAILABILITY: Software and a set of transparencies for a tutorial describing these ideas are available at http://www.wadsworth.org/res&res/bioinfo/
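As a rough illustration of what a posterior over a segmentation looks like, the sketch below handles only the simplest case: a DNA sequence with exactly one boundary. It is not the authors' algorithm. The four-letter alphabet, the symmetric Dirichlet prior (`alpha`), the uniform prior over boundary positions, and all function names are assumptions made for this example.

```python
from collections import Counter
from math import exp, lgamma

ALPHABET = "ACGT"  # assumed alphabet for this illustration


def log_marginal(counts, alpha=1.0):
    """Log marginal likelihood of one segment: multinomial composition
    integrated over a symmetric Dirichlet(alpha) prior."""
    n = sum(counts.values())
    k = len(ALPHABET)
    total = lgamma(k * alpha) - lgamma(k * alpha + n)
    for ch in ALPHABET:
        total += lgamma(alpha + counts.get(ch, 0)) - lgamma(alpha)
    return total


def change_point_posterior(seq, alpha=1.0):
    """Posterior over the boundary b, where segment 1 is seq[:b] and
    segment 2 is seq[b:], for b = 1 .. len(seq) - 1 (uniform prior on b)."""
    left, right = Counter(), Counter(seq)
    log_weights = []
    for b in range(1, len(seq)):
        ch = seq[b - 1]          # move one symbol from the right segment to the left
        left[ch] += 1
        right[ch] -= 1
        log_weights.append(log_marginal(left, alpha) + log_marginal(right, alpha))
    top = max(log_weights)       # subtract the maximum for numerical stability
    weights = [exp(w - top) for w in log_weights]
    z = sum(weights)
    return [w / z for w in weights]


posterior = change_point_posterior("A" * 10 + "G" * 10)
best_b = 1 + max(range(len(posterior)), key=posterior.__getitem__)
print(best_b, round(posterior[best_b - 1], 3))
```

The result is a probability for every candidate boundary rather than a single best split. Extending this to an unknown number of segments is where a dynamic programming recursion over segment end points would come in.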
Genetic determination of individual birth weight and its association with sow productivity traits using Bayesian analyses.
Genetic association between individual birth weight (IBW) and litter birth weight (LBW) was analyzed on records of 14,950 individual pigs born alive between 1988 and 1994 at the pig breeding farm of the University of Kiel. Dams were from three purebred lines (German Landrace, German Edelschwein, and Large White) and their crosses. Phenotypically, preweaning mortality of pigs decreased substantially from 40% for pigs with ≤ 1 kg weight to less than 7% for pigs with > 1.6 kg. For these low to high birth weight categories, preweaning growth (d 21 of age) and early postweaning growth (weaning to 25 kg) increased by more than 28 and 8% per day, respectively. Bayesian analysis was performed based on direct-maternal effects models for IBW and multiple-trait direct effects models for number of pigs born in total (NOBT) and alive (NOBA) and LBW. Bayesian posterior means for direct and maternal heritability and litter proportion of variance in IBW were .09, .26, and .18, respectively. After adjustment for NOBT, these changed to .08, .22, and .09, respectively. Adjustment for NOBT reduced the direct and maternal genetic correlation from -.41 to -.22. For these direct-maternal correlations, the 95% highest posterior density intervals were -.75 to -.07, and -.58 to .17 before and after adjustment for NOBT. Adjustment for NOBT was found to be necessary to obtain unbiased estimates of genetic effects for IBW. The relationship between IBW and NOBT, and thus the adjustment, was linear with a decrease in IBW of 44 g per additionally born pig. For litter traits, direct heritabilities were .10, .08, and .08 for NOBT, NOBA, and LBW, respectively. After adjustment of LBW for NOBA the heritability changed to .43. Expected variance components for LBW derived from estimates of IBW revealed that genetic and environmental covariances between full-sibs and variation in litter size resulted in the large deviation of maternal heritability for IBW and its equivalent estimate for LBW.
These covariances among full-sibs could not be estimated if only LBW were recorded. Therefore, selection for increased IBW is recommended, with the opportunity to improve both direct and maternal genetic effects of birth weight of pigs and, thus, their vitality and pre- and postnatal growth.
Bayesian mapping of multiple quantitative trait loci from incomplete outbred offspring data.
A general fine-scale Bayesian quantitative trait locus (QTL) mapping method for outcrossing species is presented. It is suitable for an analysis of complete and incomplete data from experimental designs of F2 families or backcrosses. The amount of genotyping of parents and grandparents is optional, as well as the assumption that the QTL alleles in the crossed lines are fixed. Grandparental origin indicators are used, but without forgetting the original genotype or allelic origin information. The method treats the number of QTL in the analyzed chromosome as a random variable and allows some QTL effects from other chromosomes to be taken into account in a composite interval mapping manner. A block-update of ordered genotypes (haplotypes) of the whole family is sampled once in each marker locus during every round of the Markov Chain Monte Carlo algorithm used in the numerical estimation. As a byproduct, the method gives the posterior distributions for linkage phases in the family and therefore it can also be used as a haplotyping algorithm. The Bayesian method is tested and compared with two frequentist methods using simulated data sets, considering two different parental crosses and three different levels of available parental information. The method is implemented as a software package and is freely available under the name Multimapper/outbred at URL http://www.rni.helsinki.fi/mjs/.
The validation of interviews for estimating morbidity.
Health interview surveys have been widely used to measure morbidity in developing countries, particularly for infectious diseases. Structured questionnaires using algorithms which derive sign/symptom-based diagnoses seem to be the most reliable but there have been few studies to validate them. The purpose of validation is to evaluate the sensitivity and specificity of brief algorithms (combinations of signs/symptoms) which can then be used for the rapid assessment of community health problems. Validation requires a comparison with an external standard such as physician or serological diagnoses. There are several potential pitfalls in assessing validity, such as selection bias, differences in populations and the pattern of diseases in study populations compared to the community. Validation studies conducted in the community may overcome bias caused by case selection. Health centre derived estimates can be adjusted and applied to the community with caution. Further study is needed to validate algorithms for important diseases in different cultural settings. Community-based studies need to be conducted, and the utility of derived algorithms for tracking disease frequency explored further.
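The validation quantities named above can be made concrete with a short calculation. The sketch below computes sensitivity and specificity from a 2x2 comparison against an external standard, then uses Bayes' theorem to show how the predictive values of the same algorithm shift when it is applied at a different prevalence. All counts and prevalences are invented for illustration and do not come from the abstract.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Sensitivity and specificity of a symptom algorithm, judged against
    an external standard such as a physician diagnosis."""
    sensitivity = tp / (tp + fn)   # diseased cases the algorithm flags
    specificity = tn / (tn + fp)   # non-diseased cases the algorithm clears
    return sensitivity, specificity


def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive value at a given prevalence (Bayes' theorem)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# Hypothetical health-centre validation counts (made up for this example).
sens, spec = sensitivity_specificity(tp=80, fp=30, fn=20, tn=170)

# The same algorithm at an assumed clinic prevalence and an assumed community prevalence.
for label, prevalence in [("health centre", 0.33), ("community", 0.05)]:
    ppv, npv = predictive_values(sens, spec, prevalence)
    print(f"{label}: PPV={ppv:.2f} NPV={npv:.2f}")
```

With sensitivity and specificity held fixed, the positive predictive value drops sharply at the lower community prevalence. This is one reason health-centre estimates need adjustment before they are applied to the community.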
Bayesian analysis of birth weight and litter size in Baluchi sheep using Gibbs sampling.
Variance and covariance components for birth weight (BWT), as a lamb trait, and litter size measured on ewes in the first, second, and third parities (LS1 through LS3) were estimated using a Bayesian application of the Gibbs sampler. Data came from Baluchi sheep born between 1966 and 1989 at the Abbasabad sheep breeding station, located northeast of Mashhad, Iran. There were 10,406 records of BWT recorded for all ewe lambs and for ram lambs that later became sires or maternal grandsires. All lambs that later became dams had records of LS1 through LS3. Separate bivariate analyses were done for each combination of BWT and one of the three variables LS1 through LS3. The Gibbs sampler with data augmentation was used to draw samples from the marginal posterior distribution for sire, maternal grandsire, and residual variances and the covariance between the sire and maternal grandsire for BWT, variances for the sire and residual variances for the litter size traits, and the covariances between sire effects for different trait combinations, sire and maternal grandsire effects for different combinations of BWT and LS1 through LS3, and the residual covariances between traits. Although most of the densities of estimates were slightly skewed, they seemed to fit the normal distribution well, because the mean, mode, and median were similar. Direct and maternal heritabilities for BWT were relatively high with marginal posterior modes of .14 and .13, respectively. The average of the three direct-maternal genetic correlation estimates for BWT was low, .10, but had a high standard deviation. Heritability increased from LS1 to LS3 and was relatively high, .29 to .37. Direct genetic correlations between BWT and LS1 and between BWT and LS3 were negative, -.32 and -.43, respectively. In contrast, the corresponding correlation between BWT and LS2 was positive and low, .06.
Genetic correlations between maternal effects for BWT and direct effects for LS1 through LS3 were all highly negative and consistent for all parities, circa -.75. Environmental correlations between BWT and LS1 through LS3 were relatively low and ranged from .18 to .29 and had high standard errors.
Thermodynamics and kinetics of a folded-folded' transition at valine-9 of a GCN4-like leucine zipper.
Spin inversion transfer (SIT) NMR experiments are reported probing the thermodynamics and kinetics of interconversion of two folded forms of a GCN4-like leucine zipper near room temperature. The peptide is 13Cα-labeled at position V9(a) and results are compared with prior findings for position L13(e). The SIT data are interpreted via a Bayesian analysis, yielding local values of T1a, T1b, kab, kba, and Keq as functions of temperature for the transition FaV9 ⇌ FbV9 between locally folded dimeric forms. Equilibrium constants, determined from relative spin counts at spin equilibrium, agree well with the ratios kab/kba from the dynamic SIT experiments. Thermodynamic and kinetic parameters are similar for V9(a) and L13(e), but not the same, confirming that the molecular conformational population is not two-state. The energetic parameters determined for both sites are examined, yielding conclusions that apply to both and are robust to uncertainties in the preexponential factor (kT/h) of the Eyring equation. These conclusions are 1) the activation free energy is substantial, requiring a sparsely populated transition state; 2) the transition state's enthalpy far exceeds that of either Fa or Fb; 3) the transition state's entropy far exceeds that of Fa, but is comparable to that of Fb; 4) "Arrhenius kinetics" characterize the temperature dependence of both kab and kba, indicating that the temperatures of slow interconversion are not below that of the glass transition. Any postulated free energy surface for these coiled coils must satisfy these constraints.
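For readers unfamiliar with the Eyring equation mentioned above, the sketch below shows how a rate constant maps to an activation free energy, using k = κ (k_B T / h) exp(−ΔG‡ / RT). The rate constants are hypothetical, not values from the study. The transmission coefficient `kappa` is an assumption added here to stand in for uncertainty in the preexponential factor.

```python
from math import exp, log

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J*s
R = 8.314462618       # gas constant, J/(mol*K)


def eyring_rate(delta_g, temperature, kappa=1.0):
    """Rate constant (1/s) for an activation free energy delta_g (J/mol)."""
    return kappa * (K_B * temperature / H) * exp(-delta_g / (R * temperature))


def activation_free_energy(rate, temperature, kappa=1.0):
    """Invert the Eyring equation: activation free energy (J/mol) from a rate (1/s)."""
    return R * temperature * log(kappa * K_B * temperature / (H * rate))


# Hypothetical forward and reverse rate constants near room temperature.
T = 298.15
k_ab, k_ba = 10.0, 5.0

dg_ab = activation_free_energy(k_ab, T)
dg_ba = activation_free_energy(k_ba, T)
k_eq = k_ab / k_ba    # equilibrium constant as the ratio of the two rates

print(f"dG_ab = {dg_ab / 1000:.1f} kJ/mol, dG_ba = {dg_ba / 1000:.1f} kJ/mol, Keq = {k_eq:.1f}")
```

Because the prefactor sits inside a logarithm, changing `kappa` by a factor of ten shifts the computed free energy by only RT ln 10, roughly 5.7 kJ/mol at this temperature. That is consistent with conclusions being robust to uncertainty in the preexponential factor.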
Iterative reconstruction based on median root prior in quantification of myocardial blood flow and oxygen metabolism.
The aim of this study was to compare reproducibility and accuracy of two reconstruction methods in quantification of myocardial blood flow and oxygen metabolism with 15O-labeled tracers and PET. A new iterative Bayesian reconstruction method based on median root prior (MRP) was compared with filtered backprojection (FBP) reconstruction method, which is traditionally used for image reconstruction in PET studies. METHODS: Regional myocardial blood flow (rMBF), oxygen extraction fraction (rOEF) and myocardial metabolic rate of oxygen consumption (rMMRO2) were quantified from images reconstructed in 27 subjects using both MRP and FBP methods. For each subject, regions of interest (ROIs) were drawn on the lateral, anterior and septal regions on four planes. To test reproducibility, the ROI drawing procedure was repeated. By using two sets of ROIs, variability was evaluated from images reconstructed with the MRP and the FBP methods. RESULTS: Correlation coefficients of mean values of rMBF, rOEF and rMMRO2 were significantly higher in the images reconstructed with the MRP reconstruction method compared with the images reconstructed with the FBP method (rMBF: MRP r = 0.896 versus FBP r = 0.737, P < 0.001; rOEF: 0.915 versus 0.855, P < 0.001; rMMRO2: 0.954 versus 0.885, P < 0.001). Coefficient of variation for each parameter was significantly lower in MRP images than in FBP images (rMBF: MRP 23.5% ± 11.3% versus FBP 30.1% ± 14.7%, P < 0.001; rOEF: 21.0% ± 11.1% versus 32.1% ± 19.8%, P < 0.001; rMMRO2: 23.1% ± 13.2% versus 30.3% ± 19.1%, P < 0.001). CONCLUSION: The MRP reconstruction method provides higher reproducibility and lower variability in the quantitative myocardial parameters when compared with the FBP method. This study shows that the new MRP reconstruction method improves accuracy and stability of clinical quantification of myocardial blood flow and oxygen metabolism with 15O and PET.
Taking account of between-patient variability when modeling decline in Alzheimer's disease.
The pattern of deterioration in patients with Alzheimer's disease is highly variable within a given population. With recent speculation that the apolipoprotein E allele may influence rate of decline and claims that certain drugs may slow the course of the disease, there is a compelling need for sound statistical methodology to address these questions. Current statistical methods for describing decline do not adequately take into account between-patient variability and possible floor and/or ceiling effects in the scale measuring decline, and they fail to allow for uncertainty in disease onset. In this paper, the authors analyze longitudinal Mini-Mental State Examination scores from two groups of Alzheimer's disease subjects from Palo Alto, California, and Minneapolis, Minnesota, in 1981-1993 and 1986-1988, respectively. A Bayesian hierarchical model is introduced as an elegant means of simultaneously overcoming all of the difficulties referred to above.