The significance of non-significance.

We discuss the implications of empirical results that are statistically non-significant. Figures illustrate the interrelations among effect size, sample sizes and their dispersion, and the power of the experiment. All calculations (detailed in the Appendix) are based on actual noncentral t-distributions, with no simplifying mathematical or statistical assumptions, and the contribution of each tail is determined separately. We emphasize the importance of reporting, wherever possible, the a priori power of a study, so that the reader can see what the chances were of rejecting a null hypothesis that was false. As a practical alternative, we propose that non-significant inference be qualified by an estimate of the sample size that would be required in a subsequent experiment to attain an acceptable level of power, under the assumption that the observed effect size in the sample equals the true effect size in the population; appropriate plots are provided for a power of 0.8. We also point out that the outcomes of successive independent experiments, none of which need be statistically significant on its own, can easily be combined to give an overall p value that often turns out to be significant. Finally, in the event that the p value is high and the power sufficient, a non-significant result may stand and be published as such.
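As an illustration of the two computations this abstract describes, the sketch below evaluates a priori power from the actual noncentral t-distribution, with the contribution of each rejection tail determined separately, and then combines independent p values with Fisher's method. The effect size, group sizes, and p values are illustrative assumptions, not values from the paper.

```python
from scipy import stats
import numpy as np

def two_sample_power(d, n1, n2, alpha=0.05):
    """A priori power of a two-sided two-sample t-test via the noncentral t."""
    df = n1 + n2 - 2
    ncp = d * np.sqrt(n1 * n2 / (n1 + n2))       # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df)      # two-sided critical value
    upper = 1 - stats.nct.cdf(t_crit, df, ncp)   # upper-tail contribution
    lower = stats.nct.cdf(-t_crit, df, ncp)      # lower-tail contribution
    return upper + lower                         # each tail handled separately

def fisher_combine(pvals):
    """Fisher's method: -2*sum(log p_i) ~ chi-square with 2k df under H0."""
    stat = -2.0 * np.sum(np.log(pvals))
    return stats.chi2.sf(stat, 2 * len(pvals))

print(two_sample_power(0.5, 20, 20))        # ~0.34: an underpowered design
print(fisher_combine([0.08, 0.10, 0.12]))   # ~0.03: jointly significant,
                                            # though no single p < 0.05
```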

Capture-recapture models including covariate effects.

Capture-recapture methods are used to estimate the incidence of a disease using a multiple-source registry. Usually, log-linear methods are used to estimate population size, assuming that not all sources of notification are dependent. Where there are categorical covariates, a stratified analysis can be performed. The multinomial logit model has occasionally been used. In this paper, the authors compare log-linear and logit models with and without covariates, and use simulated data to compare estimates from the different models. The crude estimate of population size is biased when the sources are not independent, and analyses that adjust for covariates produce less biased estimates. In the absence of covariates, or where all covariates are categorical, the log-linear model and the logit model are equivalent; the log-linear model, however, cannot include continuous variables. To minimize potential bias in estimating incidence, covariates should be included in the design and analysis of multiple-source disease registries.
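A small simulation makes the dependence bias concrete. The sketch below (an illustration under assumed capture probabilities, not the authors' simulation) applies the crude two-source Lincoln-Petersen estimator to independent and to positively dependent sources; positive dependence inflates the overlap and biases the estimate downward.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000                           # true population size (assumed)
p1, p2 = 0.4, 0.3                  # capture probabilities per source (assumed)

def estimate(dependent):
    caught1 = rng.random(N) < p1
    # Under positive dependence, being on list 1 raises the chance of list 2.
    p2_eff = np.where(caught1, p2 * 1.5, p2) if dependent else p2
    caught2 = rng.random(N) < p2_eff
    n1, n2 = caught1.sum(), caught2.sum()
    m = (caught1 & caught2).sum()  # cases notified by both sources
    return n1 * n2 / m             # crude Lincoln-Petersen estimate of N

print(np.mean([estimate(False) for _ in range(200)]))  # ~1000: unbiased
print(np.mean([estimate(True) for _ in range(200)]))   # ~800: biased low
```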

Model for bacteriophage T4 development in Escherichia coli.

Mathematical relations were obtained for the number of mature T4 bacteriophages, both inside and after lysis of an Escherichia coli cell, as a function of time after infection by a single phage. The model has five parameters: the delay until the first T4 is completed inside the bacterium (the eclipse period, nu) and its standard deviation (sigma), the rate at which the number of mature T4 increases inside the bacterium during the rise period (alpha), and the time at which the bacterium bursts (mu) and its standard deviation (beta). The burst size [B = alpha(mu - nu)], the number of phages released from an infected bacterium, is thus a dependent parameter. A least-squares program was used to derive the parameter values for a variety of experimental results obtained with wild-type T4 in E. coli B/r under different growth conditions and manipulations (H. Hadas, M. Einav, I. Fishov, and A. Zaritsky, Microbiology 143:179-185, 1997). A "destruction parameter" (zeta) was added to account for the adverse effect of chloroform on phage survival. The overall agreement between the model and the experiments is quite good. The dependence of the derived parameters on growth conditions can be used to predict phage development under other experimental manipulations.
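A minimal numerical sketch of the model follows: each infected cell draws its eclipse time from Normal(nu, sigma) and its burst time from Normal(mu, beta), mature phage accumulate linearly at rate alpha between the two, and the burst size is the dependent parameter B = alpha(mu - nu). The parameter values are illustrative assumptions, not the fitted values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
nu, sigma = 24.0, 3.0   # eclipse period (min) and its standard deviation
mu, beta = 45.0, 5.0    # burst time (min) and its standard deviation
alpha = 7.5             # accumulation rate (phage/min) during the rise period

def mean_phage(t, cells=100_000):
    """Mean mature phage per infected cell at time t after infection."""
    nus = rng.normal(nu, sigma, cells)   # per-cell eclipse times
    mus = rng.normal(mu, beta, cells)    # per-cell burst times
    # Mature phage accumulate at rate alpha between eclipse and burst;
    # after burst the count is frozen at the released burst size.
    return (alpha * np.clip(np.minimum(t, mus) - nus, 0.0, None)).mean()

for t in (20, 30, 45, 60):
    print(t, round(mean_phage(t), 1))
# The dependent burst size here is B = alpha * (mu - nu) = 157.5 phage/cell.
```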

Molecular studies suggest that cartilaginous fishes have a terminal position in the piscine tree.

The Chondrichthyes (cartilaginous fishes) are commonly accepted as being the sister group of the other extant Gnathostomata (jawed vertebrates). To clarify gnathostome relationships and to aid in resolving and dating the major piscine divergences, we have sequenced the complete mtDNA of the starry skate and have included it in phylogenetic analysis along with three squalomorph chondrichthyans (the common dogfish, the spiny dogfish, and the star spotted dogfish) and a number of bony fishes and amniotes. The direction of evolution within the gnathostome tree was established by rooting it with the most closely related non-gnathostome outgroup, the sea lamprey, as well as with some more distantly related taxa. The analyses placed the chondrichthyans in a terminal position in the piscine tree. These findings, which also suggest that the origin of the amniote lineage is older than the age of the oldest extant bony fishes (the lungfishes), challenge the evolutionary direction of several morphological characters that have been used in reconstructing gnathostome relationships. Applying as a calibration point the age of the oldest lungfish fossils, 400 million years, the molecular estimate placed the squalomorph/batomorph divergence at approximately 190 million years before present. This dating is consistent with the occurrence of the earliest batomorph (skates and rays) fossils in the paleontological record. The split between gnathostome fishes and the amniote lineage was dated at approximately 420 million years before present.
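The dating step reduces to a linear molecular-clock calibration: a node's age scales in proportion to its genetic distance, anchored at the 400-million-year lungfish calibration point. The sketch below shows only the arithmetic; the distances are invented placeholders chosen so that the calculation reproduces the dates reported in the abstract.

```python
# Linear (strict-clock) calibration: age = cal_age * distance / cal_distance.
cal_age, cal_dist = 400.0, 0.48   # lungfish calibration; cal_dist is made up
nodes = {                          # hypothetical clock distances per node
    "squalomorph/batomorph divergence": 0.228,
    "gnathostome fish/amniote split": 0.504,
}
for name, dist in nodes.items():
    print(f"{name}: ~{cal_age * dist / cal_dist:.0f} Myr before present")
# -> ~190 Myr and ~420 Myr, matching the estimates quoted above
```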

Toward a leukemia treatment strategy based on the probability of stem cell death: an essay in honor of Dr. Emil J Freireich.

Dr. Emil J Freireich is a pioneer in the rational treatment of cancer in general and of leukemia in particular. This essay in his honor suggests that the cell-kill concept of chemotherapy for acute myeloblastic leukemia be extended to include two additional ideas. The first is that leukemic blasts, like normal hemopoietic cells, are organized in hierarchies headed by stem cells. In both normal and leukemic hemopoiesis, killing the stem cells will destroy the system; furthermore, both normal and leukemic cells respond to regulators. It follows that acute myelogenous leukemia should be considered a dependent neoplasm. The second is that cell/drug interaction should be considered as having two phases. The first, or proximal, phase consists of the events that lead up to injury; the second, or distal, phase comprises the responses of the cell that contribute either to progression to apoptosis or to recovery. Distal responses are described briefly, and regulated drug sensitivity is presented as an example of how distal responses might be used to improve treatment.

A reanalysis of IgM Western blot criteria for the diagnosis of early Lyme disease.

A two-step approach for the diagnosis of Lyme disease, consisting of an initial EIA followed by a confirmatory Western immunoblot, has been advised by the Centers for Disease Control and Prevention (CDC). These criteria, however, do not take into account the influence of a given patient's prior probability of Lyme disease on the predictive value of the tests. Using Bayesian analysis, a mathematical algorithm is proposed that computes the probability that a given patient's Western blot result represents Lyme disease. Assuming prior probabilities of early Lyme disease of 1%-10%, the current CDC minimum criteria for IgM immunoblot interpretation yield posttest probabilities of 4%-32%. The value of the two-step approach for the diagnosis of early Lyme disease may therefore be limited in populations at lower risk of disease or when patients present with atypical signs and symptoms.
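The Bayesian update itself is a one-line application of Bayes' theorem. In the sketch below, the sensitivity and specificity are illustrative assumptions chosen only to roughly reproduce the 4%-32% range quoted above; they are not the paper's fitted operating characteristics.

```python
def posttest(prior, sens=0.50, spec=0.88):
    """Posttest probability of disease given a positive IgM blot result."""
    true_pos = prior * sens              # P(disease and positive)
    false_pos = (1 - prior) * (1 - spec) # P(no disease and positive)
    return true_pos / (true_pos + false_pos)

for prior in (0.01, 0.05, 0.10):
    print(f"prior {prior:.0%} -> posttest {posttest(prior):.0%}")
# prior 1% -> ~4%; prior 10% -> ~32%: even a positive blot leaves the
# diagnosis improbable when the pretest probability is low.
```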

Bayesian inference on biopolymer models.

MOTIVATION: Most existing bioinformatics methods are limited to making point estimates of one variable, e.g. the optimal alignment, with fixed input values for all other variables, e.g. gap penalties and scoring matrices. While the requirement to specify parameters remains one of the more vexing issues in bioinformatics, it reflects a larger issue: the need to broaden the view of statistical inference in bioinformatics. RESULTS: The goal of Bayesian inference is to assign probabilities, in the form of a posterior distribution, to all possible values of all unknown variables in a problem. Here we show how this goal can be achieved for most bioinformatics methods that use dynamic programming. Specifically, we give a tutorial-style description of a Bayesian inference procedure for segmenting a sequence on the basis of heterogeneity in its composition, and we describe full Bayesian inference algorithms for sequence alignment. AVAILABILITY: Software and a set of transparencies for a tutorial describing these ideas are available at http://www.wadsworth.org/res&res/bioinfo/
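A toy version of such a compositional segmentation (not the authors' algorithm) is sketched below: it computes the full posterior distribution over a single change point in a DNA sequence, scoring each candidate segmentation by the Beta-Bernoulli marginal likelihood of its segments' G+C content under a uniform prior, rather than returning one point estimate.

```python
import numpy as np
from scipy.special import betaln

def seg_loglik(gc, n):
    """Log marginal likelihood of a segment: integral over the G+C rate
    theta of theta^gc * (1-theta)^(n-gc) under a uniform Beta(1,1) prior."""
    return betaln(gc + 1, n - gc + 1)

def changepoint_posterior(seq):
    x = np.array([c in "GC" for c in seq.upper()], dtype=int)
    n, cum = len(x), np.concatenate([[0], np.cumsum(x)])
    logp = np.array([seg_loglik(cum[k], k) +
                     seg_loglik(cum[n] - cum[k], n - k)
                     for k in range(1, n)])        # one value per split k
    w = np.exp(logp - logp.max())
    return w / w.sum()                             # posterior over k = 1..n-1

post = changepoint_posterior("ATATATATATGCGCGCGCGC")
print(post.argmax() + 1)   # most probable change point: 10, the AT/GC border
```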

Using imperfect secondary structure predictions to improve molecular structure computations.

MOTIVATION: Until ab initio structure prediction methods are perfected, the estimation of protein structure will depend on combining multiple sources of experimental and theoretical data. Secondary structure predictions are a particularly useful source of structural information but are currently only about 70% correct, on average. Structure computation algorithms that incorporate secondary structure information must therefore have methods for dealing with imperfect predictions. EXPERIMENTS PERFORMED: We have modified our algorithm for probabilistic least-squares structural computations to accept 'disjunctive' constraints, in which a constraint is provided as a set of possible values, each weighted with a probability. Thus, when a helix is predicted, the distances associated with a helix are given most of the weight, but some weight can be allocated to the other possibilities (strand and coil). We have tested a variety of strategies for this weighting scheme in conjunction with a baseline synthetic set of sparse distance data, and compared them with strategies that do not use disjunctive constraints. RESULTS: Naive interpretations in which predictions were taken as 100% correct led to poor-quality structures. Interpretations that allow disjunctive constraints are quite robust, and even relatively poor predictions (58% correct) can significantly increase the quality of computed structures (almost halving the RMS error from the known structure). CONCLUSIONS: Secondary structure predictions can be used to improve the quality of three-dimensional structure computations. In fact, when interpreted appropriately, imperfect predictions can provide almost as much improvement as perfect ones.
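The sketch below illustrates the idea of a disjunctive constraint in the simplest possible form: instead of fixing a distance from the predicted secondary structure state, the constraint is a probability-weighted set of alternatives, summarized here as a mean and spread usable in a least-squares computation. The distances and weights are illustrative assumptions, not the paper's values.

```python
import math

# Rough CA(i)-CA(i+3) distances (angstroms) per secondary structure state;
# the specific numbers are assumed for illustration.
DIST = {"helix": 5.0, "strand": 10.0, "coil": 7.0}

def disjunctive_constraint(predicted, confidence=0.8):
    """Soft distance constraint: most weight on the predicted state,
    the remainder spread evenly over the other possibilities."""
    others = [s for s in DIST if s != predicted]
    weights = {predicted: confidence,
               **{s: (1 - confidence) / len(others) for s in others}}
    mean = sum(w * DIST[s] for s, w in weights.items())
    var = sum(w * (DIST[s] - mean) ** 2 for s, w in weights.items())
    return mean, math.sqrt(var)   # mean and sd for probabilistic least squares

print(disjunctive_constraint("helix"))
# -> a constraint centered near 5 A but wide enough to tolerate a
#    wrong prediction, unlike a hard 100%-correct interpretation
```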