Learn all about what the dog genome project is, how it got started and what it can show us. The successful mapping of the dog genome can help in curing both human and canine genetic disorders.
Genome evolution is the process by which a genome changes in structure (sequence) or size over time. The study of genome evolution involves multiple fields such as structural analysis of the genome, the study of genomic parasites, gene and ancient genome duplications, polyploidy, and comparative genomics. Genome evolution is a constantly changing and evolving field due to the steadily growing number of sequenced genomes, both prokaryotic and eukaryotic, available to the scientific community and the public at large. Since the first sequenced genomes became available in the late 1970s, scientists have been using comparative genomics to study the differences and similarities between various genomes. Genome sequencing has progressed over time to include more and more complex genomes including the eventual sequencing of the entire human genome in 2001. By comparing genomes of both close relatives and distant ancestors the stark differences and similarities between species began to emerge as well as ...
The Genome Assembly and Annotation Team carries out genome projects in the classical sense, from design of the de novo sequencing strategy through assembly and annotation of the genome. The team specializes in large eukaryotic genomes and transcriptomes, especially those of animals and plants. Other types of genomes analyzed include those of organelles, endosymbionts, metagenomes and metatranscriptomes, and cancer genomes. Genome assembly is difficult not only because of the sheer size of the data and the computational requirements, but also because the biology of genomes is confounded by repetitive elements, polyploidy and variation (single-nucleotide, insertions/deletions, and larger structural variants). The team focuses its efforts on meeting and overcoming these challenges, incorporating new technologies and developing new computational protocols as each project demands. Annotation of the gene content of the newly assembled genome is key to understanding the genome, once finished. On this ...
Description: De novo genome assembly and k-mer frequency counting are two of the classical problems of bioinformatics. k-mer counting helps to identify genomic k-mers from sequenced reads, which may then inform read correction or genome assembly. Genome assembly has two major subproblems: contig construction and scaffolding. A contig is a continuous sub-sequence of the genome assembled from sequencing reads. Scaffolding attempts to construct a linear sequence of contigs (with possible gaps in between) using paired reads (two reads whose distance on the genome is approximately known). In this thesis I will present a new computationally efficient tool for identifying frequent k-mers which are more likely to be genomic, and a set of linear inequalities which can improve scaffolding (which is known to be NP-hard) by identifying reliable paired reads. Identifying reliable k-mers from Whole Genome Amplification (WGA) data is more challenging compared to multi-cell data due to the coverage variation ...
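To make the k-mer counting idea concrete, here is a minimal sketch in Python. It is not the tool described in the thesis: it counts k-mers naively in memory, and the fixed `min_count` threshold for calling a k-mer "likely genomic" is an assumption made only for illustration. The function names and the toy reads are hypothetical.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every substring of length k across a list of reads."""
    counts = Counter()
    for read in reads:
        # A read of length n contains n - k + 1 overlapping k-mers.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def frequent_kmers(counts, min_count):
    """Keep k-mers seen at least min_count times (assumed threshold)."""
    return {kmer for kmer, n in counts.items() if n >= min_count}


# Toy reads, invented for this example.
reads = ["ACGTAC", "CGTACG", "ACGTTT"]
counts = count_kmers(reads, 3)
solid = frequent_kmers(counts, 2)
```

k-mers that appear only once (here `GTT` and `TTT`) are the ones a frequency filter would treat as possible sequencing errors. Real data with uneven coverage, such as the WGA data mentioned above, is exactly where a single global threshold tends to break down.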
SAN DIEGO, Oct. 13, 2016 (GLOBE NEWSWIRE) - BioNano Genomics, the leader in physical genome mapping, together with Howard Hughes Medical Institute (HHMI) Investigator and new Rockefeller University Professor, Erich Jarvis, Ph.D., today announced that his team will use BioNano Genomics' Next Generation Mapping (NGM) combined with Pacific Biosciences sequencing technology to construct thousands of vertebrate reference genomes in the Vertebrate Genomes Project. Dr. Jarvis's lab and his collaborator Dr. Olivier Fredrigo, co-director of the Duke University Genome Sequencing Center, performed a systematic evaluation of available DNA sequencing and scaffolding technologies. They concluded that a combination of BioNano's NGM and PacBio sequencing will yield well-structured and informative genome assemblies, making the technologies a very good combination for establishing reference quality genomes. Dr. Jarvis has purchased an Irys® System for next-generation mapping to play an integral role in ...
Below is a selection of genome projects that are using the Australian Apollo Service. Click on a link to find out more about the research group performing the organism genome annotation project and to access Apollo to view publicly available genome browsers. Publicly available genome browsers do not require a login. For non-public genomes, an account login is required, which may be granted by getting in touch with the research group. If you are an Australian-based researcher and have a genome annotation project you would like to showcase here, please email us the details at [email protected] ...
A systematic study of genome context methods: calibration, normalization and combination - Background: Genome context methods have been introduced in the last decade as automatic methods to predict functional relatedness between genes in a target genome using the patterns of existence and relative locations of the homologs of those genes in a set of reference genomes. Much work has been done in the application of these methods to different bioinformatics tasks, but few papers present a systematic study of the methods and their combination necessary for their optimal use. Results: We present a thorough study of the four main families of genome context methods found in the literature: phylogenetic profile, gene fusion, gene cluster, and gene neighbor. We find that for most organisms the gene neighbor method outperforms the phylogenetic profile method by as much as 40% in sensitivity, being competitive with the gene cluster method at low sensitivities. Gene fusion is generally the worst performing of the
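As a rough illustration of the phylogenetic profile family of methods, the sketch below represents each gene as a presence/absence vector over a set of reference genomes and scores two genes by Jaccard similarity. The choice of Jaccard similarity is an assumption for this example, since the abstract does not say which scoring function the study uses, and the gene names and profiles are invented.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity of two presence/absence profiles (lists of 0/1)."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0


# Hypothetical profiles: one entry per reference genome,
# 1 if a homolog of the gene is present in that genome, 0 otherwise.
profiles = {
    "geneA": [1, 1, 0, 1, 0],
    "geneB": [1, 1, 0, 1, 1],
    "geneC": [0, 0, 1, 0, 0],
}

score_ab = jaccard(profiles["geneA"], profiles["geneB"])
score_ac = jaccard(profiles["geneA"], profiles["geneC"])
```

Genes that tend to be present or absent together across genomes (here `geneA` and `geneB`) score high and are predicted to be functionally related. Genes with disjoint profiles score zero.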
Generation of wt genomes by excision of the BAC vector from the MCMV BAC genome. After transfection of the MCMV BAC plasmid into eukaryotic cells we expected homologous recombination via the duplicated sequences leading to excision of the vector sequences and generation of a wt genome (see Fig. 2 and Fig. 3A, maps 4 and 5). During construction of the original MCMV BAC plasmid pSM3 we had observed that overlength genomes are not stable in cells (22), suggesting that overlength genomes are poorly packaged into viral capsids. Similar observations have been made for other DNA viruses. An overlength of more than 5% over the adenovirus wt genome leads to unstable genomes (2), and Epstein-Barr virus preferentially packages genomes within a very narrow size range (3). Thus, we expected that even when rare recombination events occur at the created target site, preferential packaging of unit length genomes should lead to an accumulation of viruses with the wt genome. For reconstitution of virus progeny ...
Prokaryotes dominate the biosphere and regulate biogeochemical processes essential to all life. Yet, our knowledge about their biology is for the most part limited to the minority that has been successfully cultured. Molecular techniques now allow for obtaining genome sequences of uncultivated prokaryotic taxa, facilitating in-depth analyses that may ultimately improve our understanding of these key organisms. We compared results from two culture-independent strategies for recovering bacterial genomes: single-amplified genomes and metagenome-assembled genomes. Single-amplified genomes were obtained from samples collected at an offshore station in the Baltic Sea Proper and compared to previously obtained metagenome-assembled genomes from a time series at the same station. Among 16 single-amplified genomes analyzed, seven were found to match metagenome-assembled genomes, affiliated with a diverse set of taxa. Notably, genome pairs between the two approaches were nearly identical (average 99.51% sequence
The Mouse Genome Database (MGD, http://www.informatics.jax.org/), the international community database for mouse, provides access to extensive integrated data on the genetics, genomics and biology of the laboratory mouse. The mouse is an excellent and unique animal surrogate for studying normal development and disease processes in humans. Thus, MGD's primary goals are to facilitate the use of mouse models for studying human disease and enable the development of translational research hypotheses based on comparative genotype, phenotype and functional analyses. Core MGD data content includes gene characterization and functions, phenotype and disease model descriptions, DNA and protein sequence data, polymorphisms, gene mapping data and genome coordinates, and comparative gene data focused on mammals. Data are integrated from diverse sources, ranging from major resource centers to individual investigator laboratories and the scientific literature, using a combination of automated processes and
In this exercise you will compare the genomes of two Escherichia coli strains, K12 DH10B and B REL606, using whole genome syntenic comparison and high-resolution analyses of specific genomic regions. These analyses will use CoGe's tools [[SynMap]] and [[GEvo]] respectively, and will reveal evolutionary changes between these two genomes that happened after the divergence of their lineages. While the nucleotide sequence of these genomes is identical over large expanses, many other types of large-scale genomic change will be discovered, including phage insertions, transposon transposition, and genomic insertion, deletion, inversion, and duplication events. The computational tools used to do these analyses can be used for comparing genomes of any organisms. First, you are going to identify syntenic regions between these genomes. Syntenic regions are defined as two or more genomic regions that share a common ancestry and thus are derived from a common ancestor. To do this, you are going to ...
Anthony Doran (1), Thomas Keane (1,2), and The Mouse Genomes Project consortium. (1) Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK; (2) EMBL-EBI, Wellcome Genome Campus, Hinxton, UK. The Mouse Genomes Project has completed the first draft assembled genome sequences and strain-specific gene annotation for twelve classical laboratory and four wild-derived inbred mouse strains (WSB/EiJ, CAST/EiJ, PWK/PhJ, and SPRET/EiJ). These strains include all of the founders of the Collaborative Cross and Diversity Outbred Cross. We used a hybrid approach for genome annotation, combining evidence from the mouse reference Gencode annotation and strain-specific RNA-seq and PacBio cDNA, to identify novel strain-specific gene structures and alleles. Approx. 20,000 protein coding genes and 45,000 transcripts are annotated per strain. As these strains are fully inbred, we used heterozygous SNP density as a marker for highly polymorphic loci, and ...
Despite the recent massive progress in production of vertebrate genome sequence data and large-scale efforts to completely annotate the human genome, we still have scant knowledge of the principles that built genomes in evolution, of genome architecture and its functional organization. This work uses bioinformatics and zebrafish transgenesis to explain a mechanism for the maintenance of long-range conserved synteny across vertebrate genomes and to analyze the arrangement of underlying gene regulation systems. Large mammal-teleost conserved chromosomal segments contain highly conserved non-coding elements (HCNEs), their target genes, as well as phylogenetically and functionally unrelated bystander genes. Target genes are developmental and transcriptional regulatory genes with complex, temporally and spatially regulated expression patterns. Bystander genes are not specifically under the control of the regulatory elements that drive the target genes and are usually expressed in different, less ...
The project was announced on June 11 by MetaMorphix Inc. of Savage, Maryland. The company acquired preliminary (1x) coverage of the cow genome, as well as a map of 600,000 cow single nucleotide polymorphisms (SNPs), when it purchased the animal genomics and genotyping business of Celera Genomics of Rockville, Maryland in March. Celera retains a minority business interest in MetaMorphix. Using the preliminary map of likely bovine SNPs, MetaMorphix's genomics division, MMI Genomics in Davis, California, is working with two cattle subsidiaries of the international agribusiness company Cargill to develop a physical map that covers the entire cow genome and also to locate genetic markers associated with cattle traits. "We've taken a different approach than the public projects that are looking into the bovine genome," says Sue Denise, the research and development director of MMI Genomics (formerly the AgGen division of Celera). "With this initial sequencing on a substantial amount of the bovine genome, ...
The ever-increasing number of sequenced and annotated genomes has made management of their annotations a significant undertaking, especially for large eukaryotic genomes containing many thousands of genes. Typically, changes in gene and transcript numbers are used to summarize changes from release to release, but these measures say nothing about changes to individual annotations, nor do they provide any means to identify annotations in need of manual review. In response, we have developed a suite of quantitative measures to better characterize changes to a genome's annotations between releases, and to prioritize problematic annotations for manual review. We have applied these measures to the annotations of five eukaryotic genomes over multiple releases - H. sapiens, M. musculus, D. melanogaster, A. gambiae, and C. elegans. Our results provide the first detailed, historical overview of how these genomes' annotations have changed over the years, and demonstrate the usefulness of these measures for genome
Projects. Research Projects: GEL personnel and collaborators are currently engaged in a variety of research projects. Here we focus on describing our research interests, and many of the projects described span multiple funding sources. The funding section of this site contains more information about the objectives and attribution for individual grants. If you don't see what you are looking for here, let us know (paul genome wisc edu). Genome Projects: We are engaged in multiple projects aimed at increasing the number and diversity of genome sequences from enterobacteria. These include plant-pathogenic enterobacteria, which remain underrepresented among the complete genomes available, as well as neglected genera isolated from a wide variety of sources. Software and Database Development: We have built several tools to assist with annotation and comparative analysis of genome data. These include Mauve, a widely used multiple genome aligner, and the ASAP database. The Software section of this web site includes additional ...
I thought you might be interested in looking at dog genome.. http://www.bordercollie.org/boards/topic/8688-dog-genome/?do=findComment&comment=97997 ...
Over the last decade, and especially after the advent of fluorescent in situ hybridization imaging and Chromosome Conformation Capture methods, the availability of experimental data on genome three-dimensional (3D) organization has dramatically increased. We now have access to unprecedented details on how genomes organize within the interphase nucleus. Development of new computational approaches that leverage such data has already resulted in the first 3D structures of genomic domains and genomes. Such approaches expand our knowledge of the chromatin folding principles, which has been classically studied using polymer physics and molecular simulations. 3D Genomes proposes to continue developing computational approaches for integrating experimental data with polymer physics, thereby bridging the resolution gap for structural determination of genomes and genomic domains. Then, such methods will be applied to address outstanding questions in genome biology, which shall provide insight into the ...
Abstract: The functional annotations associated with the human genome are currently in a rudimentary state. The majority of current genome annotations are heavily protein-coding-gene centric. This focus on protein coding genes intrinsically influences current perceptions of how the genome is structured and regulated. This view of the genome also has an underlying supposition that transcripts with very little coding potential are not biologically important. However, recent unbiased experiments analyzing the sites of transcription across large sections of the human genome have led to the conclusion that the current human genome annotations cannot account for the amounts of empirically detected transcription (Kapranov et al., 2002; Rinn et al., 2003; Kampa et al., 2004; Martone et al., 2003; Cawley et al., 2004). Most of the detected unannotated transcription is composed of RNAs with very little coding capacity (<100 aa). These transcripts of unknown function (TUFs) share many ...
[2017-02-16 15:06:47] Checking for Bowtie
  Bowtie version: 2.2.8.0
[2017-02-16 15:06:47] Checking for Bowtie index files (genome)..
[2017-02-16 15:06:47] Checking for reference FASTA file
[2017-02-16 15:06:47] Generating SAM header for genome
[2017-02-16 15:06:47] Preparing reads
  left reads: min. length=75, max. length=75, 100 kept reads (0 discarded)
  right reads: min. length=75, max. length=75, 100 kept reads (0 discarded)
[2017-02-16 15:06:47] Mapping left_kept_reads to genome genome with Bowtie2
[2017-02-16 15:06:47] Mapping left_kept_reads_seg1 to genome genome with Bowtie2 (1/3)
[2017-02-16 15:06:47] Mapping left_kept_reads_seg2 to genome genome with Bowtie2 (2/3)
[2017-02-16 15:06:47] Mapping left_kept_reads_seg3 to genome genome with Bowtie2 (3/3)
[2017-02-16 15:06:47] Mapping right_kept_reads to genome genome with Bowtie2
[2017-02-16 15:06:47] Mapping right_kept_reads_seg1 to genome genome with Bowtie2 (1/3)
[2017-02-16 15:06:48] Mapping right_kept_reads_seg2 to genome genome with Bowtie2 ...
We first determined whether genome build information is consistently supplied along with submissions to public repositories. As a representative example, we examined the records in the GEO and ENCODE databases with the following search criteria. In the GEO database, we examined all the records (one sample per series) that involved high-throughput sequencing submitted after 31 December 2008 for three species: Homo sapiens; Mus musculus; and Drosophila melanogaster. We then checked whether the data-processing section of metadata explicitly mentioned the genome build information, by case-insensitively searching for the following words: {hg17,hg18,hg19,hg38,grch36,grch37,grch38,build37.2,build37.1,build36.3,ncbi35,ncbi36,ncbi37,mm8,mm9,mm10,grcm38,bdgp6,bdgp5,bdgp5.25,build5.41,build5.3,build5,build4.1,dm6,dm3,ncbi}. In the ENCODE database, we examined the metadata file of all records. Around 23.0% of the queried series records did not contain the genome build information explicitly in the ...
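The keyword check described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' script: the keyword list is copied from the text, but the function name `found_builds`, the use of substring (rather than whole-word) matching, and the sample metadata strings are assumptions.

```python
import re

# Keyword list as given in the text.
BUILD_KEYWORDS = [
    "hg17", "hg18", "hg19", "hg38", "grch36", "grch37", "grch38",
    "build37.2", "build37.1", "build36.3", "ncbi35", "ncbi36", "ncbi37",
    "mm8", "mm9", "mm10", "grcm38", "bdgp6", "bdgp5", "bdgp5.25",
    "build5.41", "build5.3", "build5", "build4.1", "dm6", "dm3", "ncbi",
]

# Longest keywords first, so "bdgp5.25" is tried before "bdgp5".
# re.escape keeps the dots literal.
_PATTERN = re.compile(
    "|".join(re.escape(k) for k in sorted(BUILD_KEYWORDS, key=len, reverse=True)),
    re.IGNORECASE,
)


def found_builds(text):
    """Return the sorted, lower-cased build keywords mentioned in text."""
    return sorted({m.lower() for m in _PATTERN.findall(text)})


# Hypothetical data-processing descriptions.
with_build = found_builds("Reads were mapped to GRCh38 (hg38) with Bowtie2")
without_build = found_builds("Reads were mapped to the reference with Bowtie2")
```

A record whose data-processing text yields an empty list would count toward the fraction that does not state its genome build. Because the bare keyword `ncbi` is in the list, plain substring matching also flags any text that merely mentions NCBI, which is one reason the matching rule here should be read as an assumption.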
Links to domain combinations containing the Oncogene products superfamily in all genomes. Links for both groups of genomes, such as eukaryotes, bacteria and archaea, and individual genomes are provided.
Links to domain combinations containing the Hypothetical protein VC0424 superfamily in all genomes. Links for both groups of genomes, such as eukaryotes, bacteria and archaea, and individual genomes are provided.
I'm looking to have a single FASTA sequence for each chromosome in an organism, but if I check the sequences in panTro5.fa (chimp) that I've downloaded from UCSC I get a ton of ids like: chr10_NW_015973889v1_random, chr10_NW_015973890v1_random, etc. What are these and how do I get rid of them? I don't have them in my hg38.fa (human) file because you can download all the chromosomes individually and then assemble them into one fasta, but I don't think you get that option with other genomes. I need to use the genomes to find hits for viral LTR sequences and the number of hits is important, so I don't want to get the same hit in the same region of the genome twice or more. ...
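One way to drop the extra records is to filter the FASTA by sequence name. The sketch below assumes, following the usual UCSC naming convention, that unlocalized and unplaced scaffolds (such as the `_random` ids quoted above) always contain an underscore in their name while assembled chromosomes (`chr10`, `chrX`, ...) never do. That rule, the function name, and the toy input are assumptions, so check the names in your own file before relying on it.

```python
import io


def keep_primary_chromosomes(fasta_lines):
    """Yield only the FASTA records whose name contains no underscore."""
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            # The record name is the first whitespace-separated token.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            keep = "_" not in name
        if keep:
            yield line


# Toy input standing in for a real file such as panTro5.fa.
example = io.StringIO(
    ">chr10\nACGT\n"
    ">chr10_NW_015973889v1_random\nGGGG\n"
    ">chrX\nTTAA\n"
)
filtered = "".join(keep_primary_chromosomes(example))
```

For a real genome, pass an open file handle instead of the `io.StringIO` object and write the yielded lines to a new file. The generator streams line by line, so the whole genome never has to fit in memory.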
Recovering the structure of ancestral genomes can be formalized in terms of properties of binary matrices such as the Consecutive-Ones Property (C1P). The Linearization Problem asks to extract, from a given binary matrix, a maximum weight subset of rows that satisfies such a property. This problem is in general intractable, and in particular if the ancestral genome is expected to contain only linear chromosomes or a unique circular chromosome. In the present work, we consider a relaxation of this problem, which allows ancestral genomes that can contain several chromosomes, each either linear or circular. We show that, when restricted to binary matrices of degree two, which correspond to adjacencies, the genomic characters used in most ancestral genome reconstruction methods, this relaxed version of the Linearization Problem is polynomially solvable using a reduction to a matching problem. This result holds in the more general case where columns have bounded multiplicity, which models possibly duplicated
1. Clicking on View Rat Genes Report (3 in RGD Search Result above) provides a list of gene records containing the search term (bold).
2. The list is tabbed for each species and is exportable (see below).
3. Descriptive information is presented in columns - the gene symbol links to the gene report page.
4. Search results can be filtered by genome assembly or by chromosome, and the list is sortable by any column heading. Additional searches for the specific data object and species can be performed from this page by entering a different or additional term in the Refine Term box and clicking Update. Search results for other objects, such as QTLs and strains, are configured similarly. ...
Background: Recovering the structure of ancestral genomes can be formalized in terms of properties of binary matrices such as the Consecutive-Ones Property (C1P). The Linearization Problem asks to extract, from a given binary matrix, a maximum weight subset of rows that satisfies such a property. This problem is in general intractable, and in particular if the ancestral genome is expected to contain only linear chromosomes or a unique circular chromosome. In the present work, we consider a relaxation of this problem, which allows ancestral genomes that can contain several chromosomes, each either linear or circular. Result: We show that, when restricted to binary matrices of degree two, which correspond to adjacencies, the genomic characters used in most ancestral genome reconstruction methods, this relaxed version of the Linearization Problem is polynomially solvable using a reduction to a matching problem. This result holds in the more general case where columns have bounded multiplicity, ...
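To illustrate what the Consecutive-Ones Property means for adjacency (degree-two) rows, the sketch below only checks whether a binary matrix has consecutive ones under a given column order. It does not solve the Linearization Problem or implement the matching-based reduction from the abstract. The function names and the toy matrix are hypothetical.

```python
def has_consecutive_ones(matrix):
    """True if, in every row, the 1-entries form one contiguous block."""
    for row in matrix:
        ones = [j for j, value in enumerate(row) if value]
        # A contiguous block spans exactly as many columns as it has ones.
        if ones and ones[-1] - ones[0] + 1 != len(ones):
            return False
    return True


def reorder_columns(matrix, order):
    """Return a copy of matrix with its columns permuted into `order`."""
    return [[row[j] for j in order] for row in matrix]


# Columns are markers a, b, c, d; each row is one adjacency (two 1s).
adjacencies = [
    [1, 0, 1, 0],  # a next to c
    [0, 1, 1, 0],  # b next to c
    [0, 1, 0, 1],  # b next to d
]

ok_as_given = has_consecutive_ones(adjacencies)
ok_reordered = has_consecutive_ones(reorder_columns(adjacencies, [0, 2, 1, 3]))
```

The column order a, c, b, d places every adjacent pair side by side, so this matrix has the property for a single linear chromosome. When no such order exists, the Linearization Problem asks for a maximum-weight subset of rows that does admit one.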
In Genome Biology this week: sequencing breast cancer tumors from genetically engineered mouse models, micro realignments with SRMA, talk recaps from Beyond the Genome, and more.
View Notes - Lecture 2 from PLB 40175 at UC Davis. PLB 113 Lecture 2 II. Genome Organization and Gene Expression A. Plants have big (and small genomes) B. Genomes consist of single (LOW) copy and
The NIH is now accepting applications for the Somatic Cell Genome Editing (SCGE) program. The SCGE program aims to improve genome editing technologies to accelerate the translation of this technology into clinical applications and maximize the potential to treat as many diseases as possible. Pending the availability of funds and sufficient numbers of meritorious applications, the NIH expects to fund projects to provide better animal models for assessing genome editing in vivo, tools and assays to detect adverse consequences of genome editing in human cells, new technologies to deliver genome editing machinery into disease relevant cells and tissues in vivo, novel genome editing and engineering systems, and a Dissemination and Coordinating Center. Applications are due April 3, 2018. For additional information on these RFAs visit our Funding Opportunities page.. ...
Curated databases of completely sequenced genomes have been designed independently at the NCBI (RefSeq) and EBI (Genome Reviews) to cope with non-standard annotation found in the version of the sequenced genome that has been published by databanks GenBank/EMBL/DDBJ. These curation attempts were expected to review the annotations and to improve their pertinence when using them to annotate newly released genome sequences by homology to previously annotated genomes. However, we observed that such an uncoordinated effort has two unwanted consequences. First, it is not trivial to map the protein identifiers of the same sequence in both databases. Secondly, the two reannotated versions of the same genome differ at the level of their structural annotation. Here, we propose CorBank, a program devised to provide cross-referencing protein identifiers no matter what the level of identity is found between their matching sequences. Approximately 98% of the 1,983,258 amino acid sequences are matching, allowing
The Mouse: the premier animal model for studying human disease. > 95% same genes. Same diseases, similar reasons (e.g., cancer, hypertension, diabetes, osteoporosis, …). 1000s lab strains, diff. characteristics. Precise genetic control.
Nobody mentioned junk DNA and the resolution of the C-value paradox. Nobody mentioned the small number of genes in the human genome, in spite of the fact that a great many articles begin with the claim that this was a shocking discovery [but see False History and the Number of Genes]. Jernej Ule mentioned alternative splicing but nobody else did, in spite of the fact that many papers claim that most human genes are capable of making several different proteins. This is also a false claim, IMHO, but you'd never know that from reading the journal. Peter Fraser was the only one who mentioned the vast regulatory network of enhancers as claimed by the ENCODE Consortium. If true, that would clearly count as a major discovery. (It's not true.) Eukaryotic genomes are chock full of defective transposons but none of the editors thought that was a key advance in our understanding of the genome ...
The vast majority of the biology of a newly sequenced genome is inferred from the set of encoded proteins. Predicting this set is therefore invariably the first step after the completion of the genome DNA sequence. Here we review the main computational pipelines used to generate the human reference protein-coding gene sets.
Symbol: This is the official symbol assigned to this strain according to the strain nomenclature guidelines. This is a combination of strain and substrain designations for inbred strains (or symbol and ILAR code for other strain types). Strain: The official strain symbol. Substrain: The official substrain symbol - this can be a collection of ILAR lab codes defining the history of this particular strain. It can also be found in the pulldown section below with links to the strain report pages. Full Name: If the strain has a text name then it is displayed here; this is not visible if no name is associated with the strain, as in this example. Ontology ID: The identification number of the strain ontology term assigned by RGD, linked to the term in the ontology browser. In the strain ontology, rat strains are organized in a hierarchical fashion based on the type of strain and the way they were developed. Also known as: Old symbols and synonyms that were used for the strain. If a strain is renamed to comply ...
The Department of Genetics and Genome Biology at the University of Leicester occupies a recently-refurbished, modern, purpose-built laboratory space, furnished with up-to-date equipment for the latest molecular genetic methods. We have an array of facilities both in-department and within the College of Life Sciences.
Comparative assembly using multiple genomes. The target genome is shown in the center, aligned to two related genomes, A and B. The DNA sequence of the target di
Highly fragmented reference genomes (with thousands or more short contigs or scaffolds) have been a persistent challenge for our small RNA-seq analysis program ShortStack. During a run, ShortStack needs to retrieve genomic sub-sequences for analysis of predicted RNA secondary structure. This is required to identify MIRNA hairpins. Early on I made the decision to use the samtools faidx function as the engine to retrieve genome sub-sequences. This was just pragmatic and lazy: samtools was already required for other portions of ShortStack's analysis, and there wasn't a need to reinvent the wheel. However, when we started to do runs against highly fragmented genome assemblies, we found analysis was very slow. The slowness was traced to the samtools faidx function, which is very sensitive to the number of contigs/references. The first attempt to fix this issue was in version 3.0, when I introduced the use of genome-stitching. When the reference genome had more than 50 sequences, and some were < 1Mb ...
ENCODES a protein that exhibits transcription factor binding (ortholog); INVOLVED IN cell growth involved in cardiac muscle cell development (ortholog); cellular response to growth factor stimulus (ortholog); cellular response to hypoxia (ortholog); PARTICIPATES IN platelet-derived growth factor signaling pathway; ASSOCIATED WITH atrial fibrillation (ortholog); Cardiomegaly (ortholog); congenital megabladder (ortholog); FOUND IN nucleus (ortholog)
Two scientists claim to have pushed the boundaries of what can be learned about the ancestral history of the human race from one person's genome. Dr Richard Durbin and Dr Heng Li from the UK's Wellcome Trust Sanger Institute in Cambridge used information from the genomes of only seven people to show that humans living in Europe and China endured a severe population bottleneck between 10,000 and 60,000 years ago. In the study published in Nature, the scientists used a new statistical technique to analyse differences between alleles within a genome. They found the more similar the alleles, the more recent the genetic separation was between parents - and by calculating the separation date, the researchers were able to estimate past population sizes. "Each human genome contains information from the mother and the father, and the differences between these at any place in the genome carry information about its history," Dr Li said. Scientists have traditionally performed this kind of analysis on ...