###### WFABC: a Wright-Fisher ABC-based approach for inferring effective population sizes and selection coefficients from time-sampled...

With novel developments in sequencing technologies, time-sampled data are becoming more available and accessible. Naturally, there have been parallel efforts to infer population genetic parameters from these data sets. Here, we compare and analyse four recent approaches based on the Wright-Fisher model for inferring selection coefficients (s) given effective population size (N_e), using simulated temporal data sets. Furthermore, we demonstrate the advantage of a recently proposed approximate Bayesian computation (ABC)-based method that is able to correctly infer genome-wide average N_e from time-serial data, which is then set as a prior for inferring per-site selection coefficients accurately and precisely. We implement this ABC method in new software and apply it to a classical time-serial data set of the medionigra genotype in the moth Panaxia dominula. We show that a recessive lethal model is the best explanation for the observed variation in allele frequency by implementing an estimator ...

###### One-Locus Two-Allele Models With Maternal (Parental) Selection | Genetics

Here I have studied a series of simple one-locus two-allele models for maternal (parental) selection. Srb et al. (1965, Chapter 11) give several examples for maternal effects that can be attributed to a single diallelic locus; see Wade (1996) for more discussion of the relevance of maternal effects controlled by a small number of loci with large effects. My results indicate similarity between dynamic behaviors under maternal selection and fertility selection. The latter is well-known to be much more complicated than the dynamics resulting from viability selection (e.g., Owen 1953; Bodmer 1965; Hadeler and Liberman 1975). I have shown that maternal selection can result in a simultaneous stability of equilibria of different types. Thus, in the presence of maternal (parental) selection, the outcome of population evolution can significantly depend on initial conditions. With maternal selection, genetic variability can be maintained in a population even if none of the offspring of heterozygous ...

###### Multivariate mutation-selection balance with constrained pleiotropic effects. | Genetics

A multivariate quantitative genetic model is analyzed that is based on the assumption that the genetic variation at a locus j primarily influences an underlying physiological variable y_j, while influence on the genotypic values is determined by a kind of developmental function which is not changed by mutations at this locus. Assuming additivity among loci, the developmental function becomes a linear transformation of the underlying variables y onto the genotypic values x, x = By. In this way the pleiotropic effects become constrained by the structure of the B-matrix. The equilibrium variance under mutation-stabilizing selection balance in infinite and finite populations is derived by using the house of cards approximation. The results are compared to the predictions given by M. Turelli in 1985 for pleiotropic two-character models. It is shown that the B-matrix model gives the same results as Turelli's five-allele model, suggesting that the crucial factor determining the equilibrium variance in ...

###### econometrics - Variance-Covariance Matrix of the errors of a linear regression - Cross Validated

Suppose you have 8 observations ($i=1,...,8$) from three different states (A, B, C) and you also know that observations for $i=1,2$ are from state A, for $i=3,4,5$ are from state B and for $i=6,7,8$ are from state C. You are trying to estimate parameters with a linear regression model where $\varepsilon_i$ is the error term. The assumptions on this error term are that $E[\varepsilon_i]=0$, $V[\varepsilon_i]=\sigma^2$ and, for $i \neq j$: $$Cov[\varepsilon_i, \varepsilon_j]=\begin{cases} \sigma^2 \rho & \text{ if observation i comes from the same state as observation j} \\ 0 & \text{otherwise} \end{cases}$$ Now you have that: $$\overline{\varepsilon_h}=\frac{1}{n_h} \sum_{i \in h} \varepsilon_i$$ where $h=A,B,C$. I'm asked to compute the variance-covariance matrix of $\overline{\varepsilon}$ (notated $V[\overline{\varepsilon}]$), so I've started to compute variances and covariances of $\overline{\varepsilon_h}$ for $h=A, \ B, \ C$. ...
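The setup in this question can be checked numerically. The sketch below is only an illustration, not the accepted answer from the thread: it builds the 8x8 error covariance matrix implied by the stated assumptions, applies the group-averaging matrix, and returns $V[\overline{\varepsilon}] = A \Sigma A^T$. The function name `group_mean_error_cov` and the values `sigma2=1.0`, `rho=0.5` are assumptions chosen for the example.

```python
def group_mean_error_cov(groups, sigma2, rho):
    """Covariance matrix of the per-state mean errors (hypothetical helper).

    groups: list of state labels, one per observation.
    Assumes Var(e_i) = sigma2 and Cov(e_i, e_j) = sigma2 * rho when i != j
    share a state, 0 otherwise.
    """
    n = len(groups)
    labels = sorted(set(groups))
    # Sigma: full n x n covariance matrix of the individual errors.
    sigma = [
        [
            sigma2 if i == j
            else (sigma2 * rho if groups[i] == groups[j] else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
    # A: each row averages the observations of one state (entries 1/n_h).
    a = [[1.0 / groups.count(h) if g == h else 0.0 for g in groups]
         for h in labels]
    m = len(labels)
    # V = A * Sigma * A^T, written out with plain loops.
    a_sigma = [[sum(a[h][k] * sigma[k][j] for k in range(n)) for j in range(n)]
               for h in range(m)]
    v = [[sum(a_sigma[h][j] * a[g][j] for j in range(n)) for g in range(m)]
         for h in range(m)]
    return labels, v


groups = ["A", "A", "B", "B", "B", "C", "C", "C"]
labels, V = group_mean_error_cov(groups, sigma2=1.0, rho=0.5)
for label, row in zip(labels, V):
    print(label, [round(x, 4) for x in row])
```

Under these assumptions the diagonal entries agree with the closed form $\frac{\sigma^2}{n_h}\left(1 + (n_h - 1)\rho\right)$, and the off-diagonal entries are zero because errors from different states are uncorrelated.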

###### Inferring bottlenecks from genome-wide samples of short sequence blocks

The advent of the genomic era has necessitated the development of methods capable of analyzing large volumes of genomic data efficiently. Being able to reliably identify bottlenecks (extreme population size changes of short duration) not only is interesting in the context of speciation and extinction but also matters (as a null model) when inferring selection. Bottlenecks can be detected in polymorphism data via their distorting effect on the shape of the underlying genealogy. Here, we use the generating function of genealogies to derive the probability of mutational configurations in short sequence blocks under a simple bottleneck model. Given a large number of nonrecombining blocks, we can compute maximum-likelihood estimates of the time and strength of the bottleneck. Our method relies on a simple summary of the joint distribution of polymorphic sites. We extend the site frequency spectrum by counting mutations in frequency classes in short sequence blocks. Using linkage information over short ...

###### Frontiers | An efficient technique for Bayesian modeling of family data using the BUGS software | Genetics

Linear mixed models have become a popular tool to analyze continuous data from family-based designs by using random effects that model the correlation of subjects from the same family. However, mixed models for family data are challenging to implement with the BUGS (Bayesian inference Using Gibbs Sampling) software because of the high-dimensional covariance matrix of the random effects. This paper describes an efficient parameterization that utilizes the singular value decomposition of the covariance matrix of random effects, includes the BUGS code for such implementation, and extends the parameterization to generalized linear mixed models. The implementation is evaluated using simulated data and an example from a large family-based study is presented with a comparison to other existing methods.

###### Back to Biology: Confounding, Bias, and Dichotomous Traits | heavytailed

(Update 6/2013 - I've edited and extended this old post from 10/2012. I had begun writing a new related post, and decided the material was better placed within this one as an extension.) Two recent observations set me down a dark and lonely road, and they are unsurprisingly related. They both have to do with…

###### Non-genetic Inheritance and Evolution | Department of Biology

Everyone knows that parents provide more than DNA for their offspring. Development does, after all, start with an egg. But such non-genetic inheritance has been conspicuously absent from discussions of how evolution works. Similarly to plasticity, non-genetic inheritance evolves and can be an adaptation. For example, we could ask how parental and offspring plasticity co-evolve and if this enables non-genetic transmission of information between generations. We can also ask if incomplete epigenetic resetting between generations could ever be favoured by natural selection. We address some of these issues as part of an EU funded large collaborative project called IDEAL. But non-genetic inheritance is more than an adaptation to transfer information between generations (or a cause of phenotypic variance that biases responses to selection as in many quantitative genetic models). We have suggested there are a number of important insights gained from viewing heredity as a developmental process; by ...

###### Heteroscedasticity-Corrected Covariance Matrices :: SAS/ETS(R) 12.1 User's Guide

The HCCME= option in the MODEL statement selects the type of heteroscedasticity-consistent covariance matrix. In the presence of heteroscedasticity, the covariance matrix has a complicated structure that can result in inefficiencies in the OLS estimates and biased estimates of the variance-covariance matrix. The variances for cross-sectional and time dummy variables and the covariances with or between the dummy variables are not corrected for heteroscedasticity in the one-way and two-way models. Whether or not HCCME is specified, they are the same. For the two-way models, the variance and the covariances for the intercept are not corrected. Consider the simple linear model: ...

###### Globally, unrelated protein sequences appear random : Bioinformatics - oi

Motivation: To test whether protein folding constraints and secondary structure sequence preferences significantly reduce the space of amino acid words in proteins, we compared the frequencies of four- and five-amino acid word clumps (independent words) in proteins to the frequencies predicted by four random sequence models. Results: While the human proteome has many overrepresented word clumps, these words come from large protein families with biased compositions (e.g. Zn-fingers). In contrast, in a non-redundant sample of Pfam-AB, only 1% of four-amino acid word clumps (4.7% of 5mer words) are 2-fold overrepresented compared with our simplest random model [MC(0)], and 0.1% (4mers) to 0.5% (5mers) are 2-fold overrepresented compared with a window-shuffled random model. Using a false discovery rate q-value analysis, the number of exceptional four- or five-letter words in real proteins is similar to the number found when comparing words from one random model to another. Consensus overrepresented ...

###### Math::Random - Random Number Generators

When called in an array context, returns an array of $n deviates (each deviate being an array reference) generated from the multivariate normal distribution with mean vector @mean and variance-covariance matrix @covar. When called in a scalar context, generates and returns only one such deviate as an array reference, regardless of the value of $n. Argument restrictions: If the dimension of the deviate to be generated is p, @mean should be a length p array of real numbers. @covar should be a length p array of references to length p arrays of real numbers (i.e. a p by p matrix). Further, @covar should be a symmetric positive-definite matrix, although the Perl code does not check positive-definiteness, and the underlying C code assumes the matrix is symmetric. Given that the variance-covariance matrix is symmetric, it doesn't matter if the references refer to rows or columns. If a non-positive definite matrix is passed to the function, it will abort with the following message: ...
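To make the role of positive-definiteness concrete, here is a minimal Python sketch of one standard way to draw multivariate normal deviates: factor the covariance matrix with a Cholesky decomposition and transform independent standard normals. This is not the Math::Random implementation, and the names `cholesky` and `multivariate_normal` are hypothetical; it only illustrates why a matrix that is not positive definite has to be rejected.

```python
import math
import random


def cholesky(covar):
    """Lower-triangular L with L * L^T == covar (covar assumed symmetric)."""
    p = len(covar)
    lower = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                d = covar[i][i] - s
                if d <= 0.0:
                    # The factorization breaks down exactly when the
                    # matrix is not positive definite.
                    raise ValueError("covariance matrix is not positive definite")
                lower[i][j] = math.sqrt(d)
            else:
                lower[i][j] = (covar[i][j] - s) / lower[j][j]
    return lower


def multivariate_normal(n, mean, covar, seed=None):
    """Return n deviates, each a list of length p = len(mean)."""
    rng = random.Random(seed)
    lower = cholesky(covar)
    p = len(mean)
    deviates = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]  # independent N(0, 1)
        deviates.append(
            [mean[i] + sum(lower[i][k] * z[k] for k in range(i + 1))
             for i in range(p)]
        )
    return deviates


samples = multivariate_normal(3, [0.0, 1.0], [[4.0, 2.0], [2.0, 3.0]], seed=42)
for row in samples:
    print(row)
```

Because only the lower triangle of `covar` is read, a symmetric input gives the same result whether the inner lists are interpreted as rows or columns, which mirrors the remark in the excerpt above.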

###### Hardy-Weinberg Equilibrium According to Hoyle

1. Basic Genetics of Platypapyrus foursuitii Platypapyrus foursuitii is a diploid organism. One feature that makes the species particularly amenable for genetic studies is that their chromosomal material takes the form of playing cards and can easily be handled like cards. Each card represents alleles in the gene pool, and two cards together represent the genotype of an individual. A person can hold any number of different individual genotypes, depending on the sample size you want. For a class of 25, you can have a population size of 50 by giving each student four cards. Sample sizes much less than 50 can result in significant fluctuations due to sampling error. For 50 individuals, you will need 100 cards or two decks. 2. A one-locus, two-allele model (50 individual organisms in the population) These instructions assume 50 individuals in the population, each individual containing two cards. If you have 25 students, you can give each one two pairs to work with. If you have some other number, ...

###### Statistical Inference for the Measurement of the Incidence of Taxes and Transfers

We establish the asymptotic sampling distribution of general functions of quantile-based estimators computed from samples that are not necessarily independent. The results provide the statistical framework within which to assess the progressivity of taxes and benefits, their horizontal inequity, and the change in the inequality of income which they cause. By the same token, these findings characterise the sampling distribution of a number of popular indices of progressivity, horizontal inequity, and redistribution. They can also be used to assess welfare and inequality changes using panel data, and to assess poverty when it depends on estimated population quantiles. We illustrate these results using micro data on the incidence of taxes and benefits in Canada.

###### MODEL Statement :: SAS/STAT(R) 13.1 User's Guide

controls the maximum number of additional iterations PROC MIXED performs to update the fixed-effects and covariance parameter estimates following data point removal. If you specify n > 0, then statistics such as DFFITS, MDFFITS, and the likelihood distances measure the impact of observation(s) on all aspects of the analysis. Typically, the influence will grow compared to values at ITER=0. In models without RANDOM or REPEATED effects, the ITER= option has no effect. This documentation refers to analyses when n > 0 simply as iterative influence analysis, even if final covariance parameter estimates can be updated in a single step (for example, when METHOD=MIVQUE0 or METHOD=TYPE3). This nomenclature reflects the fact that only if n > 0 are all model parameters updated, which can require additional iterations. If n > 0 and METHOD=REML (default) or METHOD=ML, the procedure updates fixed effects and variance-covariance parameters after removing the selected observations with additional Newton-Raphson ...

###### pr.probability - Random walk conditioned on sum and last step - MathOverflow

This came up in the same project as "Distribution of maximum of random walk conditioned to stay positive", which is certainly more standard. For this one, I really don't know whether this is standard or difficult. I've looked up some standard material (e.g. on sequential sampling) where you have a boundary condition given by absorbing boundaries at $0$ and $a$, but the weighted sum seems to make things harder. Again, I'd be very happy to learn that this is a standard thing with a good reference, or for advice as well as complete solutions. Any thoughts? ...

###### DataCite Search

The additive genetic variance-covariance matrix (G) summarizes the multivariate genetic relationships among a set of traits. The geometry of G describes the distribution of multivariate genetic variance, and generates genetic constraints that bias the direction of evolution. Determining if and how the multivariate genetic variance evolves has been limited by a number of analytical challenges in comparing G-matrices. Current methods for the comparison of G typically share several drawbacks: metrics that lack a direct relationship ...

###### random $$ model

http://www.decisionsciencenews.com/2...-happens-next/ Code: // file c.c // cc -Wall -g c.c -o c #include <stdio.h> #include <stdlib.h> #inclu

###### AP Statistics Curriculum 2007 Infer 2Means Indep - Socr

Both the confidence intervals and the hypothesis testing methods in the independent-sample design require Normality of both samples. If the sample sizes are large (say, > 50), Normality is not as critical, as the CLT implies the sampling distributions of the means are approximately Normal. If these parametric assumptions are invalid we must use a non-parametric (distribution-free) test, even if the latter is less powerful. The plots below indicate that Normal assumptions are not unreasonable for these data, and hence we may be justified in using the two independent sample T-test in this case. ...

###### Evolution - A-Z - Random sampling

Even when natural selection is not operating, the gene frequencies may change a little from the previous generation just by chance. This can happen because the genes that form a new generation are a random sample from the parental generation. Random sampling occurs whenever a smaller number of successful individuals (or gametes) are sampled from a larger pool of potential survivors and the fitnesses of the genotypes are the same. Random sampling works at every stage as a new generation grows up but it starts at conception. In every species, each individual produces many more gametes than will ever fertilize, or be fertilized, to form new organisms. The successful gametes which do form offspring are a sample from the many gametes that the parents produce. Provided the parent is a heterozygote, such as Aa, it will then produce a large number of gametes, of which approximately one half will be A and the other half a. If that parent produces 10 offspring, it is most likely that five will inherit an ...
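The sampling argument above can be illustrated with a short simulation. The sketch below makes two assumptions that are not in the excerpt: each offspring of an Aa parent independently inherits A with probability 1/2, and each generation is formed by drawing a fixed number of gametes at random from the previous one (the helper names `prob_k_of_n` and `drift_trajectory` are hypothetical).

```python
import math
import random


def prob_k_of_n(k, n, p=0.5):
    """Binomial probability that exactly k of n offspring inherit allele A."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def drift_trajectory(p0, n_gametes, generations, seed=0):
    """Allele frequency over time when each generation is a random sample
    of n_gametes gametes from the previous one (no selection)."""
    rng = random.Random(seed)
    p = p0
    freqs = [p]
    for _ in range(generations):
        # Each sampled gamete carries A with probability equal to the
        # current frequency of A.
        copies = sum(1 for _ in range(n_gametes) if rng.random() < p)
        p = copies / n_gametes
        freqs.append(p)
    return freqs


# Five A-offspring out of ten is the single most likely outcome,
# yet it happens in only about a quarter of families.
p_five = prob_k_of_n(5, 10)
print(f"P(exactly 5 of 10 inherit A) = {p_five:.3f}")

trajectory = drift_trajectory(p0=0.5, n_gametes=20, generations=50, seed=1)
print([round(f, 2) for f in trajectory[:10]])
```

With a small `n_gametes` the frequency wanders noticeably from generation to generation, and once it reaches 0 or 1 it stays there, since the other allele can no longer be sampled.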

###### Fewer children mean longer life?

New research into ageing processes, based on modern genetic techniques, confirms theoretical expectations about the correlation between reproduction and lifespan. Studies of birds reveal that those that have offspring later ...

###### Various Consequences: The Shuffle

The shuffle, closely related to Fisher's Exact Test, is a permutation test that can be used to estimate the sampling distribution of a statistic without relying on parametric assumptions. This is especially important when sample sizes are small. The other neat thing about permutation tests is that you don't have to know what the distribution of your statistic is. So if you have a really odd function of your data that you want to use rather than one of the classical statistics you can ...
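A minimal sketch of such a shuffle test, assuming the statistic of interest is the absolute difference in group means (the post does not specify one) and using a hypothetical function name `permutation_test`:

```python
import random


def permutation_test(x, y, n_shuffles=2000, seed=0):
    """Estimate a two-sided p-value for the difference in means by
    repeatedly shuffling the pooled data and re-splitting it."""
    rng = random.Random(seed)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(x) - mean(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(x)]) - mean(pooled[len(x):]))
        # Small tolerance so ties are not lost to floating-point rounding.
        if diff >= observed - 1e-12:
            hits += 1
    # Count the observed arrangement itself, so the estimate is never zero.
    return (hits + 1) / (n_shuffles + 1)


p_far = permutation_test([10, 11, 12, 13, 14], [0, 1, 2, 3, 4])
p_same = permutation_test([1, 2, 3], [1, 2, 3])
print(p_far, p_same)
```

Any other function of the two groups can be substituted for the difference in means without changing the rest of the procedure, which is the flexibility the post describes.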

###### A response to Mooney & Sokal

i've said this before, and i'll say it again, a lot of it is purely reactive. race realists are not the only ones who say that race matters. when people make positive assertions about race based on social constructs which map onto genetic correlations, that gets the ball rolling. finally, the quotation you are using has to be framed in the context of attempting to generate 100,000 year narratives. the fact that race is a fuzzy concept doesn't, to me, deny that it is more realistic than tracing human lineages 4,000 generations. an analogy would be if people attempted to trace races back 20,000 years, a problem that does crop up, and which falls under the same pitfalls as the issues i was bringing up. on the other hand, genetic correlations in the present generation are broken down (in general) by only small levels of deme-to-deme genetic exchange in most regions (e.g., given a modest number of genetic loci discrete clusters quickly emerge by populations which we a priori accept as ...

###### 3D Genetic Models | TurboSquid

3D genetic models for download, files in 3ds, max, c4d, maya, blend, obj, fbx with low poly, animated, rigged, game, and VR options.

###### Plus it

Z is a measure of the magnitude of bias in the COR. If Z = 1, the case-only estimate of interaction is not biased by genotype-environment association in the underlying population (76). Commonly, this assumption is assessed in control data from a small number of outside studies, using significance testing. Significance testing alone is not sufficient for assessment of potential bias (87). Rarely is Z estimated and/or adjusted for, analogous to other forms of bias such as confounding. Results from this project illustrate some of the pitfalls of this approach. For instance, for XRCC1 399 ever-never smoking, 18 of the 21 included studies have estimates that are not statistically significantly different than the null value of 1.0. Considering any of these in a statistical significance testing framework would lead to the conclusion that the independence assumption was valid; therefore a case-only study estimate of interaction would not be biased, at least from independence assumption violation. ...

###### regression - Assumptions of generalised linear model - Cross Validated

Furthermore, #4 is an important thing to check, but I don't really think of it as an assumption per se. Let's think about how assumptions can be checked. Independence is often checked firstly by thinking about what the data stand for and how they were collected. In addition, it can be checked using things like a runs test, Durbin-Watson test, or examining the pattern of autocorrelations--you can also look at partial autocorrelations. (Note that these can only be assessed relative to your continuous covariate.) With primarily categorical explanatory variables, homogeneity of variance can be checked by calculating the variance at each level of your factors. Having computed these, there are several tests used to check if they are about the same, primarily Levene's test, but also the Brown-Forsythe test. The $F_{max}$ test, also called Hartley's test, is not recommended; if you would like a little more information about that I discuss it here. (Note that these tests can be applied to your ...

###### Statistical Forum

Due to limited precision, accuracy, and variability in ordinal outcomes, it behooves researchers to use either 5-point, 7-point, or higher level Likert scales. With more options, more unique variance can be accounted for in the analysis and statistical power is increased. One-sample tests possess more statistical power than other between-subjects statistics because there is only one group being analyzed, no other independent groups are included ...

###### Milestone gives Shefflin sense of real completion - Independent.ie

When all the characteristics that have made Henry Shefflin the most successful and greatest hurler of this or probably any generation are distilled into one moment, perhaps the first ex

###### Tips to Improve How You Conduct Corporate Events

Times and generations are changing. As they say, change seems to be the only constant thing in this world. A business that doesn

###### May 2021 | BrokerCalls

How to Start Your Pay-Per-Call Business Are you interested in developing your own pay-per-call website? Pay-per-call lead generations are quickly…

###### [0810.1018] A simple constant-probability RP reduction from NP to Parity P

Abstract: The proof of Toda's celebrated theorem that the polynomial hierarchy is contained in $\mathrm{P}^{\#\mathrm{P}}$ relies on the fact that, under mild technical conditions on the complexity class $C$, we have $\exists C \subset BP \cdot \oplus C$. More concretely, there is a randomized reduction which transforms nonempty sets and the empty set, respectively, into sets of odd or even size. The customary method is to invoke Valiant and Vazirani's randomized reduction from NP to UP, followed by amplification of the resulting success probability from $1/\mathrm{poly}(n)$ to a constant by combining the parities of $\mathrm{poly}(n)$ trials. Here we give a direct algebraic reduction which achieves constant success probability without the need for amplification. Our reduction is very simple, and its analysis relies on well-known properties of the Legendre symbol in finite fields ...

###### Chemical & Biological Detection ENPM808B

Approximate x*exp(-x) with Orthogonal Functions (Legendre Polynomials, Chebyshev Polynomials, Bessel Functions) & Compare to Taylor Series ...
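As a rough illustration of the Legendre part of this exercise, the sketch below projects x*exp(-x) onto Legendre polynomials and compares the result with a Taylor polynomial of the same degree. The course excerpt does not give an interval or a degree, so the interval [-1, 1], degree 4, the expansion point 0 for the Taylor series, and all function names are assumptions made for the example.

```python
import math


def legendre_p(k, x):
    """Evaluate the Legendre polynomial P_k(x) via Bonnet's recurrence."""
    p_prev, p = 1.0, x
    if k == 0:
        return p_prev
    for n in range(1, k):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def simpson(g, a, b, m=2000):
    """Composite Simpson rule with an even number m of subintervals."""
    h = (b - a) / m
    total = g(a) + g(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def f(x):
    return x * math.exp(-x)


def legendre_coeffs(func, degree):
    """Projection coefficients c_k = (2k+1)/2 * integral of func * P_k on [-1, 1]."""
    return [
        (2 * k + 1) / 2 * simpson(lambda x: func(x) * legendre_p(k, x), -1.0, 1.0)
        for k in range(degree + 1)
    ]


def legendre_approx(coeffs, x):
    return sum(c * legendre_p(k, x) for k, c in enumerate(coeffs))


def taylor_approx(x, degree):
    """Taylor polynomial about 0: x*exp(-x) = sum_j (-1)^j x^(j+1) / j!."""
    return sum((-1) ** j * x ** (j + 1) / math.factorial(j) for j in range(degree))


degree = 4
coeffs = legendre_coeffs(f, degree)
grid = [-1.0 + 2.0 * i / 200 for i in range(201)]
leg_err = max(abs(f(x) - legendre_approx(coeffs, x)) for x in grid)
tay_err = max(abs(f(x) - taylor_approx(x, degree)) for x in grid)
print(f"max error, Legendre degree {degree}: {leg_err:.5f}")
print(f"max error, Taylor degree {degree}:   {tay_err:.5f}")
```

The Taylor polynomial is most accurate near the expansion point and degrades toward the ends of the interval, whereas the Legendre projection minimizes the mean-square error over the whole interval.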

###### 1-20chapter stats - What is the appropriate alternative hypothesis for

View Notes - 1-20chapter stats from MGMT 2123 at HCCS. is
B. Fail to reject the null hypothesis 78. In an effort to improve productivity in its factory, a firm recently instituted a training
program

###### How environments work in R and what is lazy evaluation | R-bloggers

Knowing how R evaluates expressions is crucial to avoid hours of staring at the screen or hitting unexpected and difficult bugs. We'll start with an example of an issue I came across a few months ago when using the purrr::map function....

###### DAILY STRUGGLES OF AN EARTHMOVER: April 2013

do cities stamp us as individuals? does each city direct us, mold our behaviour? do they brand us in a cohesive power? can a city diminish your sense of self ...

###### A Couple More BioShock Infinite Screens | Rock, Paper, Shotgun

A couple of new BioShock Infinite screenshots have appeared out of GamesCom. There's not much more to say than that, but you can see them below, and continue

###### Infinite Labs sports nutrition, buy Infinite Labs at a low price in Ukraine | Sport-trade

Buy Infinite Labs sports nutrition at a low price in Ukraine. The widest selection, promotions and discounts, delivery throughout Ukraine, certified products.

###### Likelihoods and simulation methods for a class of nonneutral population genetics models. - Nuffield Department of Medicine

Methods for simulating samples and sample statistics, under mutation-selection-drift equilibrium for a class of nonneutral population genetics models, and for evaluating the likelihood surface, in selection and mutation parameters, are developed and applied for observed data. The methods apply to large populations in settings in which selection is weak, in the sense that selection intensities, like mutation rates, are of the order of the inverse of the population size. General diploid selection is allowed, but the approach is currently restricted to models, such as the infinite alleles model and certain K-models, in which the type of a mutant allele does not depend on the type of its progenitor allele. The simulation methods have considerable advantages over available alternatives. No other methods currently seem practicable for approximating likelihood surfaces.

###### Serval - Statistical properties of population differentiation estimators under stepwise mutation in a finite island model.

Microsatellite loci mutate at an extremely high rate and are generally thought to evolve through a stepwise mutation model. Several differentiation statistics taking into account the particular mutation scheme of the microsatellite have been proposed. The most commonly used is R(ST) which is independent of the mutation rate under a generalized stepwise mutation model. F(ST) and R(ST) are commonly reported in the literature, but often differ widely. Here we compare their statistical performances using individual-based simulations of a finite island model. The simulations were run under different levels of gene flow, mutation rates, population number and sizes. In addition to the per locus statistical properties, we compare two ways of combining R(ST) over loci. Our simulations show that even under a strict stepwise mutation model, no statistic is best overall. All estimators suffer to different extents from large bias and variance. While R(ST) better reflects population differentiation

###### PLOS Genetics: On the Analysis of Genome-Wide Association Studies in Family-Based Designs: A Universal, Robust Analysis...

Author Summary In genome-wide association studies, the multiple testing problem and confounding due to population stratification have been intractable issues. Family-based designs have considered only the transmission of genotypes from founder to nonfounder to prevent sensitivity to the population stratification, which leads to the loss of information. Here we propose a novel analysis approach that combines mutually independent FBAT and screening statistics in a robust way. The proposed method is more powerful than any other, while it preserves the complete robustness of family-based association tests, which only achieves much smaller power level. Furthermore, the proposed method is virtually as powerful as population-based approaches/designs, even in the absence of population stratification. By nature of the proposed method, it is always robust as long as FBAT is valid, and the proposed method achieves the optimal efficiency if our linear model for screening test reasonably explains the observed data

###### DROPS - Evaluating Stationary Distribution of the Binary GA Markov Chain in Special Cases

The evolutionary algorithm stochastic process is well-known to be Markovian. These have been under investigation in much of the theoretical evolutionary computing research. When mutation rate is positive, the Markov chain modeling an evolutionary algorithm is irreducible and, therefore, has a unique stationary distribution, yet rather little is known about the stationary distribution. On the other hand, knowing the stationary distribution may provide some information about the expected times to hit optimum, assessment of the biases due to recombination and is of importance in population genetics to assess what's called a "genetic load" (see the introduction for more details). In this talk I will show how the quotient construction method can be exploited to derive rather explicit bounds on the ratios of the stationary distribution values of various subsets of the state space. In fact, some of the bounds obtained in the current work are expressed in terms of the parameters involved in all the ...

###### Random regression models in the analysis of feed intake and body weight of individually fed beef bulls in South Africa

The objective of this study was to estimate genetic parameters for weekly body weight and feed intake of individually fed beef bulls at centralized testing stations in South Africa using random regression models (RRM). The model for cumulative feed intake included the fixed linear regression on third order orthogonal Legendre polynomials of the actual days on test (7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77 and 84 days) for starting age group and contemporary group effects. Random regressions on third order orthogonal Legendre polynomials were included for the additive genetic effect of the animal and the additional random effect of weaning-herd-year (WHY) and on fourth order for the additional random permanent environmental effect of the animal. The model for body weights included the fixed linear regression on fourth order orthogonal Legendre polynomials of the actual days on test for starting age group and contemporary group effects. Random regressions on fourth order orthogonal Legendre ...

###### Slides7 v1 - Sampling Distributions Utku Suleymanoglu UMich Utku Suleymanoglu(UMich Sampling Distributions 1 21 Introduction...

View Notes - Slides7_v1 from ECON 404 at University of Michigan. Sampling Distributions, Utku Suleymanoglu (UMich). Introduction: Population

###### Bayesian Comparisons of Codon Substitution Models - LIRMM - Laboratoire d'Informatique, de Robotique et de Microélectronique de...

In 1994, Muse & Gaut (MG) and Goldman & Yang (GY) proposed evolutionary models that recognize the coding structure of the nucleotide sequences under study, by defining a Markovian substitution process with a state space consisting of the 61 sense codons (assuming the universal genetic code). Several variations and extensions to their models have since been proposed, but no general and flexible framework for contrasting the relative performance of alternative approaches has yet been applied. Here, we compute Bayes factors to evaluate the relative merit of several MG- and GY-style codon substitution models, including recent extensions acknowledging heterogeneous nonsynonymous rates across sites, as well as selective effects inducing uneven amino acid or codon preferences. Our results on three real data sets support a logical model construction following the MG formulation, allowing for a flexible account of global amino acid or codon preferences, while maintaining distinct parameters governing overall

###### Conditional Approximate Bayesian Computation: A New Approach for Across-Site Dependency in High-Dimensional Mutation-Selection...

A key question in molecular evolutionary biology concerns the relative roles of mutation and selection in shaping genomic data. Moreover, features of mutation and selection are heterogeneous along the genome and over time. Mechanistic codon substitution models based on the mutation-selection framework are promising approaches to separating these effects. In practice, however, several complications arise, since accounting for such heterogeneities often implies handling models of high dimensionality (e.g., amino acid preferences), or leads to across-site dependence (e.g., CpG hypermutability), making the likelihood function intractable. Approximate Bayesian Computation (ABC) could address this latter issue. Here, we propose a new approach, named Conditional ABC (CABC), which combines the sampling efficiency of MCMC and the flexibility of ABC. To illustrate the potential of the CABC approach, we apply it to the study of mammalian CpG hypermutability based on a new mutation-level parameter implying

###### Quantitative Genetics of Genomic Imprinting: A Comparison of Simple Variance Derivations, the Effects of Inbreeding, and...

We have demonstrated that a simple one-locus two-allele model of genomic imprinting produces large differences in predictions for additive (Table 2) and dominance terms from a number of standard approaches for partitioning the genotypic value of an individual. These approaches are equivalent in the absence of imprinting under standard Mendelian expression (where heterozygotes have equivalent genotypic values and hence k1 = k2). Although all approaches give identical total genetic variance, there are differences in the partitioning of the genetic variance into additive, dominance and covariance terms (Table 3). The major differences in the approaches arise due to differences in how breeding values and additive effects are defined. Approaches 1 and 2b incorporate both sex- and generation-dependent terms, and breeding values are equivalent for these approaches (Table 2). However, Approaches 2a and the regression methods (Approaches 3a and 3b) are unable to partition separate male and female terms. ...

###### Likelihood methods to infer balancing selection under K-allele models /by Erkan Ozge Buzbas. :: Electronic Theses and...

A balanced pattern in the allele frequencies of polymorphic loci is a potential sign of selection, particularly of overdominance. Although this type of selection is of some interest in population genetics, there exist no likelihood based approaches specifically tailored to make inference on selection intensity. To fill this gap, we present likelihood methods to estimate selection intensity under k-allele models with overdominance. The stationary distribution of allele frequencies under a variety of Wright-Fisher k-allele models with selection and parent independent mutation is well studied. However, the statistical properties of maximum likelihood estimates of parameters under these models are not well understood. We show that under each of these models, there is a point in data space which carries the strongest possible signal for selection, yet, at this point, the likelihood is unbounded. This result remains valid even if all of the mutation parameters are assumed to be known. Therefore, ...

###### 4.2.2 - Sampling Distribution of the Sample Proportion | STAT 500

For this problem, we know \(p=0.43\) and \(n=50\). First, we should check our conditions for the sampling distribution of the sample proportion. \(np=50(0.43)=21.5\) and \(n(1-p)=50(1-0.43)=28.5\); both are greater than 5. Since the conditions are satisfied, \(\hat{p}\) will have a sampling distribution that is approximately normal with mean \(\mu=0.43\) and standard deviation [standard error] \(\sqrt{\dfrac{0.43(1-0.43)}{50}}\approx 0.07\). \begin{align} P(0.45<\hat{p}<0.5) &=P\left(\frac{0.45-0.43}{0.07}<\frac{\hat{p}-p}{\sqrt{\frac{p(1-p)}{n}}}<\frac{0.5-0.43}{0.07}\right)\\ &\approx P\left(0.286<Z<1\right)\\ &=P(Z<1)-P(Z<0.286)\\ &=0.8413-0.6126\\ &=0.2287\end{align} Therefore, if the true proportion of Americans who own an iPhone is 43%, then there would be a 22.87% chance that we would see a sample proportion between 45% and 50% when the sample size is 50. ...
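The calculation above can be reproduced numerically. The sketch below is a minimal check using only the Python standard library; the function names are illustrative and not part of the original lesson. It builds the standard normal CDF from `math.erf` and does not round the standard error to 0.07, so its result differs slightly from the table-based 0.2287.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, P(Z < z), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def proportion_interval_probability(p, n, lower, upper):
    """Normal approximation to P(lower < p_hat < upper) for a sample proportion."""
    # The approximation is only appropriate when np and n(1-p) are both > 5.
    if n * p <= 5 or n * (1 - p) <= 5:
        raise ValueError("normal approximation conditions are not satisfied")
    standard_error = math.sqrt(p * (1.0 - p) / n)
    z_lower = (lower - p) / standard_error
    z_upper = (upper - p) / standard_error
    return normal_cdf(z_upper) - normal_cdf(z_lower)


prob = proportion_interval_probability(p=0.43, n=50, lower=0.45, upper=0.50)
print(f"P(0.45 < p_hat < 0.50) is approximately {prob:.4f}")
```

The small discrepancy from the hand calculation comes only from rounding the standard error and the z-scores before looking them up in a table.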

###### Bayesian Variable Selection in Searching for Additive and Dominant Effects in Genome-Wide Data

Although complex diseases and traits are thought to have multifactorial genetic basis, the common methods in genome-wide association analyses test each variant for association independent of the others. This computational simplification may lead to reduced power to identify variants with small effect sizes and requires correcting for multiple hypothesis tests with complex relationships. However, advances in computational methods and increase in computational resources are enabling the computation of models that adhere more closely to the theory of multifactorial inheritance. Here, a Bayesian variable selection and model averaging approach is formulated for searching for additive and dominant genetic effects. The approach considers simultaneously all available variants for inclusion as predictors in a linear genotype-phenotype mapping and averages over the uncertainty in the variable selection. This leads to naturally interpretable summary quantities on the significances of the variants and their ...

###### Survival and Innovation: The role of mutational robustness in evolution

Biological systems are resistant to perturbations caused by the environment and by the intrinsic noise of the system. Robustness to mutations is a particular aspect of robustness in which the phenotype is resistant to genotypic variation. Mutational robustness has been linked to the ability of the system to generate heritable genetic variation (a property known as evolvability). It is known that greater robustness leads to increased evolvability. Therefore, mechanisms that increase mutational robustness fuel evolvability. Two such mechanisms, molecular chaperones and gene duplication, have been credited with enormous importance in generating functional diversity through the increase of the system's robustness to mutational insults. However, the way in which such mechanisms regulate robustness remains largely uncharacterized. In this review, I provide evidence in support of the role of molecular chaperones and gene duplication in innovation. Specifically, I present evidence that these mechanisms ...

###### Haplotypes: the joint distribution of alleles at linked loci. - Lancaster EPrints

We prove a result concerning the joint distribution of alleles at linked loci on a chromosome drawn from the population at stationarity. For a neutral locus, the allele is a draw from the stationary distribution of the mutation process. Furthermore, this allele is independent of the alleles at different loci on any chromosomes in the population. ...

###### Sampling Distributions in Statistics - Videos & Lessons | Study.com

Use our video lessons and quizzes to learn about sampling distributions. Each lesson breaks down a concept into bite-sized pieces to help you...

###### Information Processing: Epistasis and Complex Traits

Although research effort is being expended into determining the importance of epistasis and epistatic variance for complex traits, there is considerable controversy about their importance. Here we undertake an analysis for quantitative traits utilizing a range of multilocus quantitative genetic models and gene frequency distributions, focusing on the potential magnitude of the epistatic variance. All the epistatic terms involving a particular locus appear in its average effect, with the number of two-locus interaction terms increasing in proportion to the square of the number of loci and that of third order as the cube and so on. Hence multilocus epistasis makes substantial contributions to the additive variance and does not, per se, lead to large increases in the nonadditive part of the genotypic variance. Even though this proportion can be high where epistasis is antagonistic to direct effects, it reduces with multiple loci. As the magnitude of the epistatic variance depends critically on the ...

###### Allometrical growth of the quantitative characters of plants II. The inheritance of plant leaf shape and leaf size of tobacco

The dominant character of leaf size varies with different genetic models and leaf positions. In Model 1, the dominant character of the top and lower leaves is small size, but for the middle leaves it is large size. In Model 2, large size is dominant for all three types of leaves. In Model 3, small size is dominant for the top and middle leaves, but recessive for lower leaves. In Model 4, small size is dominant in the top and lower leaves, but recessive in the middle leaves (Table 6). Therefore, we cannot draw a conclusion about the inheritance of leaf size for tobacco leaves. Leaf size is determined by genetics and environment (Gurevitch, 1992); hence it may be suitable to illustrate the genetic mechanism for leaf size using a single leaf at a fixed position, or to increase the number of planted locations to increase the generational mean. This would allow us to estimate the effect of genetic-environmental interaction and understand the inheritance of leaf size. Genetic Models and Inheritance of Leaf ...

###### Integrated Bayesian analysis of rare exonic variants to identify risk genes for schizophrenia and neurodevelopmental disorders ...

In this work, we built a pipeline, extTADA, for the integrated Bayesian analysis of DN mutations and rare CC variants to infer rare-variant genetic architecture parameters and identify risk genes. We applied extTADA to data available for SCZ and four other NDDs (Additional file 1: Figure S1). The extTADA pipeline: extTADA is based on previous work in autism sequencing studies, TADA [16, 31]. It conducts a full Bayesian analysis of a simple rare-variant genetic architecture model and it borrows information across all annotation categories and DN and CC samples in genetic parameter inference, which is critical for sparse rare-variant sequence data. Using MCMC, extTADA samples from the joint posterior density of risk-gene proportion and mean relative risk parameters, and provides gene-level disease-association BFs, PPs, and FDRs. We hope that extTADA (https://github.com/hoangtn/extTADA) will be generally useful for rare-variant analyses across complex traits. extTADA can be used for rare CC variant ...

###### MLIP: using multiple processors to compute the posterior probability of linkage

Background: Localization of complex traits by genetic linkage analysis may involve exploration of a vast multidimensional parameter space. The posterior probability of linkage (PPL), a class of statistics for complex trait genetic mapping in humans, is designed to model the trait model complexity represented by the multidimensional parameter space in a mathematically rigorous fashion. However, the method requires the evaluation of integrals with no functional form, making it difficult to compute, and thus further test, develop and apply. This paper describes MLIP, a multiprocessor two-point genetic linkage analysis system that supports statistical calculations, such as the PPL, based on the full parameter space implicit in the linkage likelihood. Results: The fundamental question we address here is whether the use of additional processors effectively reduces total computation time for a PPL calculation. We use a variety of data - both simulated and real - to explore the question how close can ...

###### Browse by Authors and Editors - Nottingham ePrints

Legendre, Thomas (2017) Blaise. The Moth, 31 (Winter). pp. 8-11. Legendre, Thomas (2017) Great falls. The Curlew, Populus. pp. 38-45. Legendre, Thomas (2017) John McEnroe's omelet. Copper Nickel, 24. Legendre, Thomas (2016) Ultraviolet. Superstition Review, 18. ISSN 1938-324X Legendre, Thomas (2016) Tenure tracks. Columbia Journal. Legendre, Thomas (2016) Ghostly desires in Edith Wharton's Miss Mary Pask. Journal of the Short Story in English. ISSN 1969-6108 (In Press) Legendre, Thomas (2011) Landscape-mindscape: writing in Scotland's prehistoric future. Scottish Literary Review, 3 (2). pp. 121-132. ISSN 1756-5634 ...

###### Bayes Factor with Lindley Paradox and Two Standard Methods in Model

Model selection is a necessary part of any statistical analysis, and in many selection problems the Bayes factor is one of the basic elements. For the one-sided hypothesis testing problem, we extend the agreement between frequentist and Bayesian evidence to the generalized p-value, and study the agreement between the generalized p-value and the posterior probability of the null hypothesis. For the point-null hypothesis testing problem, we analyze the Bayesian evidence under the traditional Bayesian testing method, that is, the Bayes factor or the posterior probability that the point null hypothesis holds; this evidence is at odds with the classical frequentist evidence of the p-value, a phenomenon known as the Lindley paradox. Many statisticians have worked on this from both frequentist and Bayesian perspectives. In this paper, I am going to focus on the Bayesian approach to model selection, starting from Bayes factors and going within Lindley Paradox,

###### Genetic Association with JMP Genomics, Part 3a: Marker Based Relationship Matrix

In JMP Genomics, the Relationship Matrix analysis is used for computing and displaying relatedness among lines. The Relationship Matrix tool estimates the relationships among the lines using marker data, rather than pedigree information (Kinship Matrix tool), and computes the relationship measures directly while also accounting for selection and genetic drift. The Relationship Matrix computes one of three options: Identity-by-Descent, Identity-by-State, or Allele-Sharing-Similarity. Output from this procedure can serve as the K matrix, representing familial relatedness, in a Q-K mixed model. This post will focus on the Relationship Matrix using a data set containing 343 rice lines with 8,336 markers.

###### Experimental Upper Bound and Theoretical Expectations for Parity-Violating Neutron Spin Rotation in He-4 | NIST

Neutron spin rotation is expected from quark-quark weak interactions in the Standard Model, which induce weak interactions among nucleons that violate parity.

###### GES Breeding values - CRV4ALL

Overview of press publications with toplists of bulls. The file with a sire's breeding values opens when clicking on download. The lists are sorted according to NVI, with the exception of the beef merit index. Sires that are not included in the toplists can be found with the Sire Search function. Information on the publication: for information about the publication, see News. The national toplists contain breeding values based on Dutch/Flemish daughter information. The Interbull toplists contain converted breeding values based on information from abroad. The genomic toplists contain breeding values based on pedigree information combined with genomic information. The combined toplists contain the top 500 bulls on NVI-base from the described list ...

###### Experimental Evolution, Ecology and Behaviour (EXEB): April 2011

3. The last point we discussed, which is maybe the most interesting, is the issue of the infinitesimal model. The infinitesimal model, originated by Fisher, assumes that contributions to the genetic variance are additive, relatively small and coming from many loci. The multiplication of QTL studies and other genomic approaches in recent years has led to numerous discussions questioning this model, assuming that the reason for the lack of evidence for phenotypic traits controlled by few loci was more or less technological. We have ourselves discussed this issue in this very blog, including in connection with studies of human height, where the QTLs found explain just a few percent of the variation. Well, in light of this article it seems that this is again the case in Drosophila, as control of height seems to be largely polygenic, and the estimates presented here are even a low estimate, as the methodology used is quite conservative (polymorphisms with population frequencies under 10% were not even analyzed ...

###### Genome-wide inference of ancestral recombination graphs

The complex correlation structure of a collection of orthologous DNA sequences is uniquely captured by the ancestral recombination graph (ARG), a complete record of coalescence and recombination events in the history of the sample. However, existing methods for ARG inference are computationally in …

###### Lahrouz, A. and Omari, L. (2013) Extinction and stationary distribution of a stochastic SIRS epidemic model with non-linear...

Lahrouz, A. and Omari, L. (2013) Extinction and stationary distribution of a stochastic SIRS epidemic model with non-linear incidence. Statistics & Probability Letters, 83, 960-968.

###### The Winnower | Using Bayes Factors to Get the Most out of Linear Regression: A Practical Guide Using R

I started this guide with a problem that gives conventional statistics extreme difficulty but is strikingly simple for Bayesian analysis: Conventional statistics do not allow researchers to make claims about one of the models being tested (sometimes the only model). This inferential asymmetry is toxic to interpreting research results. Bayes factors solve the problem of inferential asymmetry by treating all models equally, and have many other benefits: 1) No penalty for optional stopping or multiple comparisons. Collect data until you feel like stopping or run out of money and make as many model comparisons as you like; 2) Bayes factors give directly interpretable outputs. A Bayes factor means the same thing whether n is 10 or 10,000, and whether we compared 2 or 20 models. A credible interval ranging from .38 to .94 means that we should believe with 95% certainty that the true value lies in that range. 3) Prior probability distributions allow researchers to intimately connect their theories to ...

###### Phylogenetic confidence intervals for the optimal trait value

We consider a stochastic evolutionary model for a phenotype developing amongst n related species with unknown phylogeny. The unknown tree is modelled by a Yule process conditioned on n contemporary nodes. The trait value is assumed to evolve along lineages as an Ornstein-Uhlenbeck process. As a result, the trait values of the n species form a sample with dependent observations. We establish three limit theorems for the sample mean corresponding to three domains for the adaptation rate. In the case of fast adaptation, we show that for large n the normalized sample mean is approximately normally distributed. Using these limit theorems, we develop novel confidence interval formulae for the optimal trait value. ...

###### devfun2: Deviance Function in Terms of Standard... in lme4: Linear Mixed-Effects Models using Eigen and S4

The deviance is profiled with respect to the fixed-effects
parameters but not with respect to sigma; that is, the
function takes parameters for the variance-covariance parameters
and for the residual standard deviation. The random-effects
variance-covariance parameters are on the standard deviation/correlation
scale, not the theta (Cholesky factor) scale.

###### Modelling genetic data using Bayesian hierarchical models by Feng Guo

Populations diverge from each other as a result of evolutionary forces such as genetic drift, natural selection, mutation, and migration. For certain types of genetic markers, and for single-nucleotide polymorphisms (SNPs), in particular, it is reasonable to presume that genotypes at most loci are selectively neutral. Because demographic parameters (e.g. population size and migration rates) are common across all loci, locus-specific variation, which can be measured by Wright's FST, will depart from a common mean only for loci with unusually high/low rate of mutation or for loci closely associated with genomic regions having a substantial effect on fitness. We propose two alternative Bayesian hierarchical-beta models to estimate locus-specific effects on FST. To detect loci for which locus-specific effects are not well explained by the common FST, we use the Kullback-Leibler divergence measure (KLD) to measure the divergence between the posterior distributions of locus-specific effects and the common FST

###### CiteSeerX - Citation Query On the Asymptotic Distribution of the Moran I Test Statistic with Applications

CiteSeerX - Scientific documents that cite the following paper: On the Asymptotic Distribution of the Moran I Test Statistic with Applications

###### Genetic Algorithms

Genetic algorithms (GA) are a computational paradigm inspired by the mechanics of natural evolution, including survival of the fittest, reproduction, and mutation. Surprisingly, these mechanics can be used to solve (i.e. compute) a wide range of practical problems, including numeric problems. Concrete examples illustrate how to encode a problem for solution as a genetic algorithm, and help explain why genetic algorithms work. Genetic algorithms are a popular line of current research, and there are many references describing both the theory of genetic algorithms and their use in practical problem solving ...

###### Celebrate Mathematics Awareness Month

Math has an impact on just about every aspect of our lives, including some that we don't often think about. Math helped change the outcome of WWII; it also shows up in the way we drive our cars and the way we manage our finances. In celebration of Math Awareness Month, here are four TI-Nspire activities to use in your classes - whether you teach algebra, calculus or statistics. 1: German Tanks: Exploring Sampling Distributions. In this activity, your students will be challenged with the same problem the WWII Allied generals had: How do you determine how many German tanks there are? In WWII, the statisticians working for the Allies used sample statistics and sampling distributions to help determine the number of German tanks. Students explore different sample statistics and use simulation to develop a statistic that is effective in approximating the maximum number in a population. ...
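The German tank activity can be sketched as a small simulation. The code below is an illustration, not the TI-Nspire activity itself: it assumes tanks are numbered 1 to N and sampled without replacement, and it uses the standard estimator m + m/k - 1 (sample maximum m, sample size k). The function names, the true maximum of 300, and the sample size of 5 are all hypothetical choices.

```python
import random


def estimate_max(sample):
    """Estimate the population maximum from serial numbers sampled without
    replacement from 1..N, using m + m/k - 1."""
    k = len(sample)
    m = max(sample)
    return m + m / k - 1


def average_estimate(true_max, sample_size, trials, seed=0):
    """Average the estimator over many simulated samples to examine its
    sampling distribution around the true maximum."""
    rng = random.Random(seed)  # fixed seed so the simulation is repeatable
    total = 0.0
    for _ in range(trials):
        sample = rng.sample(range(1, true_max + 1), sample_size)
        total += estimate_max(sample)
    return total / trials


single = estimate_max([19, 40, 42, 60])  # 60 + 60/4 - 1
avg = average_estimate(true_max=300, sample_size=5, trials=20000)
print(single, avg)
```

Averaged over many samples, the estimate should sit close to the true maximum, whereas the raw sample maximum alone systematically underestimates it; comparing the two is one way to explore the "different sample statistics" the activity describes.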

###### Statistics - (Probability|Sampling) Distribution [Gerardnico]

Mode: for a discrete random variable, the value with highest probability (the location at which the probability mass function has its peak); for a continuous random variable, the location at which the probability density function has its peak ...

###### SumHer better estimates the SNP heritability of complex traits from summary statistics | Meta

We present SumHer, software for estimating confounding bias, SNP heritability, enrichments of heritability and genetic correlations using summary statistics from genome-wide association studies. The key difference between SumHer and the existing software LD Score Regression (LDSC) is th...

###### Mutation-Selection Balance

An organism's genome is continually being altered by mutations, the vast majority of which are harmful to the organism or
its descendants, because they reduce the bearer's viability or fertility

###### Phylogenetic estimation of context-dependent substitution rates by maximum likelihood

Nucleotide substitution in both coding and noncoding regions is context-dependent, in the sense that substitution rates depend on the identity of neighboring bases. Context-dependent substitution has been modeled in the case of two sequences and an unrooted phylogenetic tree, but it has only been ac …

###### STUDIES IN RADAR CROSS-SECTIONS-II. THE ZEROS OF THE ASSOCIATED LEGENDRE FUNCTIONS PMN (MU PRIME) OF NON-INTEGRAL DEGREE.

This paper derives (by a new method) an equation due to Macdonald for determining the zeros of the associated Legendre functions of order m and non-integral degree n when the argument is close to -1. A closed form solution is obtained for the values of \(Q_n^m(\mu)\) and \(Q_n^{-m}(\mu)\) for \(\mu\) close to 1. Certain observations are made concerning errors in a recently published article. (*RADAR CROSS SECTIONS

###### What Psychology Teachers Should Know About Open Science and the New Statistics - Introduction to the New Statistics

My main disagreement with the authors is over their use of confirmatory/exploratory as the distinction between analyses that have been planned, and preferably preregistered, and those that are exploratory. It's a vital distinction, of course, but confirmatory, while a traditional and widely-used term, does not capture well the intended meaning. Confirmatory vs exploratory probably originates with the two approaches to using factor analysis. It could make sense to follow an exploratory FA that identified a promising factor structure with a test of that now-prespecified structure with a new set of data. That second test might reasonably be labelled confirmatory of that structure, although the data could of course cast doubt on rather than confirm the FA model under test. By contrast, a typical preregistered investigation, in which the research questions and the corresponding data analysis are fully planned, asks questions about the sizes of effects. It estimates effect sizes rather than seeks ...