Molecular Sequence Annotation: The addition of descriptive information about the function or structure of a molecular sequence to its MOLECULAR SEQUENCE DATA record.
Genomics: The systematic study of the complete DNA sequences (GENOME) of organisms.
Databases, Genetic: Databases devoted to knowledge about specific genes and gene products.
Software: Sequential operating programs and data which instruct the functioning of a digital computer.
Internet: A loose confederation of computer communication networks around the world. The networks that make up the Internet are connected through several backbone networks. The Internet grew out of the US Government ARPAnet project and was designed to facilitate information exchange.
Genome: The genetic complement of an organism, including all of its GENES, as represented in its DNA, or in some cases, its RNA.
User-Computer Interface: The portion of an interactive computer program that issues messages to and receives commands from a user.
Computational Biology: A field of biology concerned with the development of techniques for the collection and manipulation of biological data, and the use of such data to make biological discoveries or predictions. This field encompasses all computational methods and theories for solving biological problems including manipulation of models and datasets.
Computer Graphics: The process of pictorial communication, between humans and computers, in which the computer input and output have the form of charts, drawings, or other appropriate pictorial representation.
Databases, Protein: Databases containing information about PROTEINS such as AMINO ACID SEQUENCE; PROTEIN CONFORMATION; and other properties.
Database Management Systems: Software designed to store, manipulate, manage, and control data for specific uses.
Documentation: Systematic organization, storage, retrieval, and dissemination of specialized information, especially of a scientific or technical nature (From ALA Glossary of Library and Information Science, 1983). It often involves authenticating or validating information.
Vocabulary, Controlled: A specified list of terms with a fixed and unalterable meaning, and from which a selection is made when CATALOGING; ABSTRACTING AND INDEXING; or searching BOOKS; JOURNALS AS TOPIC; and other documents. The control is intended to avoid the scattering of related subjects under different headings (SUBJECT HEADINGS). The list may be altered or extended only by the publisher or issuing agency. (From Harrod's Librarians' Glossary, 7th ed, p163)
Databases, Nucleic Acid: Databases containing information about NUCLEIC ACIDS such as BASE SEQUENCE; SNPS; NUCLEIC ACID CONFORMATION; and other properties. Information about the DNA fragments kept in a GENE LIBRARY or GENOMIC LIBRARY is often maintained in DNA databases.
Algorithms: A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task.
Expressed Sequence Tags: Partial cDNA (DNA, COMPLEMENTARY) sequences that are unique to the cDNAs from which they were derived.
Genome, Bacterial: The genetic complement of a bacterium (BACTERIA) as represented in its DNA.
Sequence Analysis, Protein: A process that includes the determination of AMINO ACID SEQUENCE of a protein (or peptide, oligopeptide or peptide fragment) and the information analysis of the sequence.
Gene Expression Profiling: The determination of the pattern of genes expressed at the level of GENETIC TRANSCRIPTION, under specific circumstances or in a specific cell.
Molecular Sequence Data: Descriptions of specific amino acid, carbohydrate, or nucleotide sequences which have appeared in the published literature and/or are deposited in and maintained by databanks such as GENBANK, European Molecular Biology Laboratory (EMBL), National Biomedical Research Foundation (NBRF), or other sequence repositories.
Sequence Alignment: The arrangement of two or more amino acid or base sequences from an organism or organisms in such a way as to align areas of the sequences sharing common properties. The degree of relatedness or homology between the sequences is predicted computationally or statistically based on weights assigned to the elements aligned between the sequences. This in turn can serve as a potential indicator of the genetic relatedness between the organisms.
Proteins: Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein.
Base Sequence: The sequence of PURINES and PYRIMIDINES in nucleic acids and polynucleotides. It is also called nucleotide sequence.
Genome, Human: The complete genetic complement contained in the DNA of a set of CHROMOSOMES in a HUMAN. The length of the human genome is about 3 billion base pairs.
Chromosome Mapping: Any method used for determining the location of and relative distances between genes on a chromosome.
Oligonucleotide Array Sequence Analysis: Hybridization of a nucleic acid sample to a very large set of OLIGONUCLEOTIDE PROBES, which have been attached individually in columns and rows to a solid support, to determine a BASE SEQUENCE, or to detect variations in a gene sequence, GENE EXPRESSION, or for GENE MAPPING.
Natural Language Processing: Computer processing of a language with rules that reflect and describe current usage rather than prescribed usage.
Databases, Factual: Extensive collections, reputedly complete, of facts and data garnered from material of a specialized subject area and made available for analysis and application. The collection can be automated by various contemporary methods for retrieval. The concept should be differentiated from DATABASES, BIBLIOGRAPHIC which is restricted to collections of bibliographic references.
Phylogeny: The relationships of groups of organisms as reflected by their genetic makeup.
Data Mining: Use of sophisticated analysis tools to sort through, organize, examine, and combine large sets of information.
Genome, Plant: The genetic complement of a plant (PLANTS) as represented in its DNA.
Terminology as Topic: The terms, expressions, designations, or symbols used in a particular science, discipline, or specialized subject area.
Genomic Instability: An increased tendency of the GENOME to acquire MUTATIONS when various processes involved in maintaining and replicating the genome are dysfunctional.
Cluster Analysis: A set of statistical methods used to group variables or observations into strongly inter-related subgroups. In epidemiology, it may be used to analyze a closely grouped series of events or cases of disease or other health-related phenomenon with well-defined distribution patterns in relation to time or place or both.
Comparative Genomic Hybridization: A method for comparing two sets of chromosomal DNA by analyzing differences in the copy number and location of specific sequences. It is used to look for large sequence changes such as deletions, duplications, amplifications, or translocations.
Evolution, Molecular: The process of cumulative change at the level of DNA; RNA; and PROTEINS, over successive generations.
Genomic Library: A form of GENE LIBRARY containing the complete DNA sequences present in the genome of a given organism. It contrasts with a cDNA library which contains only sequences utilized in protein coding (lacking introns).
Sequence Analysis, RNA: A multistage process that includes cloning, physical mapping, subcloning, sequencing, and information analysis of an RNA SEQUENCE.
Proteome: The protein complement of an organism coded for by its genome.
Amino Acid Sequence: The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION.
Genome, Fungal: The complete gene complement contained in a set of chromosomes in a fungus.
Open Reading Frames: A sequence of successive nucleotide triplets that are read as CODONS specifying AMINO ACIDS and begin with an INITIATOR CODON and end with a stop codon (CODON, TERMINATOR).
High-Throughput Nucleotide Sequencing: Techniques of nucleotide sequence analysis that increase the range, complexity, sensitivity, and accuracy of results by greatly increasing the scale of operations and thus the number of nucleotides, and the number of copies of each nucleotide sequenced. The sequencing may be done by analysis of the synthesis or ligation products, hybridization to preexisting sequences, etc.
Genes: A category of nucleic acid sequences that function as units of heredity and which code for the basic instructions for the development, reproduction, and maintenance of organisms.
Gene Ontology: Sets of structured vocabularies used for describing and categorizing genes, and gene products by their molecular function, involvement in biological processes, and cellular location. These vocabularies and their associations to genes and gene products (Gene Ontology annotations) are generated and curated by the Gene Ontology Consortium.
Models, Genetic: Theoretical representations that simulate the behavior or activity of genetic processes or phenomena. They include the use of mathematical equations, computers, and other electronic equipment.
Programming Languages: Specific languages used to prepare computer programs.
Proteomics: The systematic study of the complete complement of proteins (PROTEOME) of organisms.
Artificial Intelligence: Theory and development of COMPUTER SYSTEMS which perform tasks that normally require human intelligence. Such tasks may include speech recognition, LEARNING; VISUAL PERCEPTION; MATHEMATICAL COMPUTING; reasoning, PROBLEM SOLVING, DECISION-MAKING, and translation of language.
Gene Library: A large collection of DNA fragments cloned (CLONING, MOLECULAR) from a given organism, tissue, organ, or cell type. It may contain complete genomic sequences (GENOMIC LIBRARY) or complementary DNA sequences, the latter being formed from messenger RNA and lacking intron sequences.
Transcriptome: The pattern of GENE EXPRESSION at the level of genetic transcription in a specific organism or under specific circumstances in specific cells.
Multigene Family: A set of genes descended by duplication and variation from some ancestral gene. Such genes may be clustered together on the same chromosome or dispersed on different chromosomes. Examples of multigene families include those that encode the hemoglobins, immunoglobulins, histocompatibility antigens, actins, tubulins, keratins, collagens, heat shock proteins, salivary glue proteins, chorion proteins, cuticle proteins, yolk proteins, and phaseolins, as well as histones, ribosomal RNA, and transfer RNA genes. The latter three are examples of reiterated genes, where hundreds of identical genes are present in a tandem array. (King & Stansfield, A Dictionary of Genetics, 4th ed)
Genome, Archaeal: The genetic complement of an archaeal organism (ARCHAEA) as represented in its DNA.
Contig Mapping: Overlapping of cloned or sequenced DNA to construct a continuous region of a gene, chromosome or genome.
Protein Interaction Mapping: Methods for determining interaction between PROTEINS.
Genomic Islands: Distinct units in some bacterial, bacteriophage or plasmid GENOMES that are types of MOBILE GENETIC ELEMENTS. Encoded in them are a variety of fitness conferring genes, such as VIRULENCE FACTORS (in "pathogenicity islands or islets"), ANTIBIOTIC RESISTANCE genes, or genes required for SYMBIOSIS (in "symbiosis islands or islets"). They range in size from 10 to 500 kilobases, and their GC CONTENT and CODON usage differ from the rest of the genome. They typically contain an INTEGRASE gene, although in some cases this gene has been deleted resulting in "anchored genomic islands".
Search Engine: Software used to locate data or information stored in machine-readable form locally or at a distance such as an INTERNET site.
Pattern Recognition, Automated: In INFORMATION RETRIEVAL, machine-sensing or identification of visible patterns (shapes, forms, and configurations). (Harrod's Librarians' Glossary, 7th ed)
Pseudogenes: Genes bearing close resemblance to known genes at different loci, but rendered non-functional by additions or deletions in structure that prevent normal transcription or translation. When lacking introns and containing a poly-A segment near the downstream end (as a result of reverse copying from processed nuclear RNA into double-stranded DNA), they are called processed genes.
Knowledge Bases: Collections of facts, assumptions, beliefs, and heuristics that are used in combination with databases to achieve desired results, such as a diagnosis, an interpretation, or a solution to a problem (From McGraw Hill Dictionary of Scientific and Technical Terms, 6th ed).
Conserved Sequence: A sequence of amino acids in a polypeptide or of nucleotides in DNA or RNA that is similar across multiple species. A known set of conserved sequences is represented by a CONSENSUS SEQUENCE. AMINO ACID MOTIFS are often composed of conserved sequences.
Semantics: The relationships between symbols and their meanings.
Software Design: Specifications and instructions applied to the software.
Sequence Analysis, DNA: A multistage process that includes cloning, physical mapping, subcloning, determination of the DNA SEQUENCE, and information analysis.
Genetic Variation: Genotypic differences observed among individuals in a population.
DNA, Complementary: Single-stranded complementary DNA synthesized from an RNA template by the action of RNA-dependent DNA polymerase. cDNA (i.e., complementary DNA, not circular DNA, not C-DNA) is used in a variety of molecular cloning experiments as well as serving as a specific hybridization probe.
Exons: The parts of a transcript of a split GENE remaining after the INTRONS are removed. They are spliced together to become a MESSENGER RNA or other functional RNA.
PubMed: A bibliographic database that includes MEDLINE as its primary subset. It is produced by the National Center for Biotechnology Information (NCBI), part of the NATIONAL LIBRARY OF MEDICINE. PubMed, which is searchable through NLM's Web site, also includes access to additional citations to selected life sciences journals not in MEDLINE, and links to other resources such as the full-text of articles at participating publishers' Web sites, NCBI's molecular biology databases, and PubMed Central.
Genome, Protozoan: The complete genetic complement contained in a set of CHROMOSOMES in a protozoan.
Genes, Plant: The functional hereditary units of PLANTS.
Abstracting and Indexing as Topic: Activities performed to identify concepts and aspects of published information and research reports.
Synteny: The presence of two or more genetic loci on the same chromosome. Extensions of this original definition refer to the similarity in content and organization between chromosomes, of different species for example.
Metabolic Networks and Pathways: Complex sets of enzymatic reactions connected to each other via their product and substrate metabolites.
Species Specificity: The restriction of a characteristic behavior, anatomical structure or physical system, such as immune response; metabolic response, or gene or gene variant to the members of one species. It refers to that property which differentiates one species from another but it is also used for phylogenetic levels higher or lower than the species.
Reproducibility of Results: The statistical reproducibility of measurements (often in a clinical context), including the testing of instrumentation or techniques to obtain reproducible results. The concept includes reproducibility of physiological measurements, which may be used to develop rules to assess probability or prognosis, or response to a stimulus; reproducibility of occurrence of a condition; and reproducibility of experimental results.
Sequence Homology, Nucleic Acid: The sequential correspondence of nucleotides in one nucleic acid molecule with those of another nucleic acid molecule. Sequence homology is an indication of the genetic relatedness of different organisms and gene function.
Oryza sativa: Annual cereal grass of the family POACEAE and its edible starchy grain, rice, which is the staple food of roughly one-half of the world's population.
Enzymes: Biological molecules that possess catalytic activity. They may occur naturally or be synthetically created. Enzymes are usually proteins; however, CATALYTIC RNA and CATALYTIC DNA molecules have also been identified.
Polymorphism, Single Nucleotide: A single nucleotide variation in a genetic sequence that occurs at appreciable frequency in the population.
Cloning, Molecular: The insertion of recombinant DNA molecules from prokaryotic and/or eukaryotic sources into a replicating vehicle, such as a plasmid or virus vector, and the introduction of the resultant hybrid molecules into recipient cells without altering the viability of those cells.
Sequence Homology, Amino Acid: The degree of similarity between sequences of amino acids. This information is useful for analyzing the genetic relatedness of proteins and species.
Untranslated Regions: The parts of the messenger RNA sequence that do not code for product, i.e. the 5' UNTRANSLATED REGIONS and 3' UNTRANSLATED REGIONS.
Sequence Analysis: A multistage process that includes the determination of a sequence (protein, carbohydrate, etc.), its fragmentation and analysis, and the interpretation of the resulting sequence information.
Genome, Insect: The genetic complement of an insect (INSECTS) as represented in its DNA.
Automation: Controlled operation of an apparatus, process, or system by mechanical or electronic devices that take the place of human organs of observation, effort, and decision. (From Webster's Collegiate Dictionary, 1993)
Introns: Sequences of DNA in the genes that are located between the EXONS. They are transcribed along with the exons but are removed from the primary gene transcript by RNA SPLICING to leave mature RNA. Some introns code for separate genes.
Genome, Helminth: The genetic complement of a helminth (HELMINTHS) as represented in its DNA.
Genome, Viral: The complete genetic complement contained in a DNA or RNA molecule in a virus.
Phenotype: The outward appearance of the individual. It is the product of interactions between genes, and between the GENOTYPE and the environment.
Disease: A definite pathologic process with a characteristic set of signs and symptoms. It may affect the whole body or any of its parts, and its etiology, pathology, and prognosis may be known or unknown.
Chromosomes, Artificial, Bacterial: DNA constructs that are composed of, at least, a REPLICATION ORIGIN, for successful replication, propagation to and maintenance as an extra chromosome in bacteria. In addition, they can carry large amounts (about 200 kilobases) of other sequence for a variety of bioengineering purposes.
Alternative Splicing: A process whereby multiple RNA transcripts are generated from a single gene. Alternative splicing involves the splicing together of other possible sets of EXONS during the processing of some, but not all, transcripts of the gene. Thus a particular exon may be connected to any one of several alternative exons to form a mature RNA. The alternative forms of mature MESSENGER RNA produce PROTEIN ISOFORMS in which one part of the isoforms is common while the other parts are different.
Prokaryotic Cells: Cells lacking a nuclear membrane so that the nuclear material is either scattered in the cytoplasm or collected in a nucleoid region.
DNA: A deoxyribonucleotide polymer that is the primary genetic material of all cells. Eukaryotic and prokaryotic organisms normally contain DNA in a double-stranded state, yet several important biological processes transiently involve single-stranded regions. DNA, which consists of a polysugar-phosphate backbone possessing projections of purines (adenine and guanine) and pyrimidines (thymine and cytosine), forms a double helix that is held together by hydrogen bonds between these purines and pyrimidines (adenine to thymine and guanine to cytosine).
Data Interpretation, Statistical: Application of statistical procedures to analyze specific observed or assumed facts from a particular study.
Biological Processes: Biological activities and function of the whole organism in humans, animals, microorganisms, and plants, and of the biosphere.
Gene Regulatory Networks: Interacting DNA-encoded regulatory subsystems in the GENOME that coordinate input from activator and repressor TRANSCRIPTION FACTORS during development, cell differentiation, or in response to environmental cues. The networks function to ultimately specify expression of particular sets of GENES for specific conditions, times, or locations.
Genomic Imprinting: The variable phenotypic expression of a GENE depending on whether it is of paternal or maternal origin, which is a function of the DNA METHYLATION pattern. Imprinted regions are observed to be more methylated and less transcriptionally active. (Segen, Dictionary of Modern Medicine, 1992)
Transcription, Genetic: The biosynthesis of RNA carried out on a template of DNA. The biosynthesis of DNA from an RNA template is called REVERSE TRANSCRIPTION.
Metagenomics: The genomic analysis of assemblages of organisms.
Genes, Bacterial: The functional hereditary units of BACTERIA.
RNA, Messenger: RNA sequences that serve as templates for protein synthesis. Bacterial mRNAs are generally primary transcripts in that they do not require post-transcriptional processing. Eukaryotic mRNA is synthesized in the nucleus and must be exported to the cytoplasm for translation. Most eukaryotic mRNAs have a sequence of polyadenylic acid at the 3' end, referred to as the poly(A) tail. The function of this tail is not known for certain, but it may play a role in the export of mature mRNA from the nucleus as well as in helping stabilize some mRNA molecules by retarding their degradation in the cytoplasm.
Nucleic Acid Hybridization: Widely used technique which exploits the ability of complementary sequences in single-stranded DNAs or RNAs to pair with each other to form a double helix. Hybridization can take place between two complementary DNA sequences, between a single-stranded DNA and a complementary RNA, or between two RNA sequences. The technique is used to detect and isolate specific sequences, measure homology, or define other characteristics of one or both strands. (Kendrew, Encyclopedia of Molecular Biology, 1994, p503)
MEDLINE: The premier bibliographic database of the NATIONAL LIBRARY OF MEDICINE. MEDLINE® (MEDLARS Online) is the primary subset of PUBMED and can be searched on NLM's Web site in PubMed or the NLM Gateway. MEDLINE references are indexed with MEDICAL SUBJECT HEADINGS (MeSH).
Gene Duplication: Processes occurring in various organisms by which new genes are copied. Gene duplication may result in a MULTIGENE FAMILY; supergenes or PSEUDOGENES.
Sequence Homology: The degree of similarity between sequences. Studies of AMINO ACID SEQUENCE HOMOLOGY and NUCLEIC ACID SEQUENCE HOMOLOGY provide useful information about the genetic relatedness of genes, gene products, and species.
Protein Structure, Tertiary: The level of protein structure in which combinations of secondary protein structures (alpha helices, beta sheets, loop regions, and motifs) pack together to form folded shapes called domains. Disulfide bridges between cysteines in two different parts of the polypeptide chain along with other interactions between the chains play a role in the formation and stabilization of tertiary structure. Small proteins usually consist of only one domain but larger proteins may contain a number of domains connected by segments of polypeptide chain which lack regular secondary structure.
Polymerase Chain Reaction: In vitro method for producing large amounts of specific DNA or RNA fragments of defined length and sequence from small amounts of short oligonucleotide flanking sequences (primers). The essential steps include thermal denaturation of the double-stranded target molecules, annealing of the primers to their complementary sequences, and extension of the annealed primers by enzymatic synthesis with DNA polymerase. The reaction is efficient, specific, and extremely sensitive. Uses for the reaction include disease diagnosis, detection of difficult-to-isolate pathogens, mutation analysis, genetic testing, DNA sequencing, and analyzing evolutionary relationships.
RNA, Untranslated: RNA which does not code for protein but has some enzymatic, structural or regulatory function. Although ribosomal RNA (RNA, RIBOSOMAL) and transfer RNA (RNA, TRANSFER) are also untranslated RNAs they are not included in this scope.
Mutation: Any detectable and heritable change in the genetic material that causes a change in the GENOTYPE and which is transmitted to daughter cells and to succeeding generations.
DNA, Bacterial: Deoxyribonucleic acid that makes up the genetic material of bacteria.
Word Processing: Text editing and storage functions using computer software.
Structural Homology, Protein: The degree of 3-dimensional shape similarity between proteins. It can be an indication of distant AMINO ACID SEQUENCE HOMOLOGY and used for rational DRUG DESIGN.
Hypermedia: Computerized compilations of information units (text, sound, graphics, and/or video) interconnected by logical nonlinear linkages that enable users to follow optimal paths through the material and also the systems used to create and display this information. (From Thesaurus of ERIC Descriptors, 1994)
Gene Order: The sequential location of genes on a chromosome.
Crowdsourcing: Social media model for enabling public involvement and recruitment in participation. Use of social media to collect feedback and recruit volunteer subjects.
Genes, Insect: The functional hereditary units of INSECTS.
Gene Expression Regulation: Any of the processes by which nuclear, cytoplasmic, or intercellular factors influence the differential control (induction or repression) of gene action at the level of transcription or translation.
Gene Dosage: The number of copies of a given gene present in the cell of an organism. An increase in gene dosage (by GENE DUPLICATION for example) can result in higher levels of gene product formation. GENE DOSAGE COMPENSATION mechanisms result in adjustments to the level of GENE EXPRESSION when there are changes or differences in gene dosage.
Arabidopsis: A plant genus of the family BRASSICACEAE that contains ARABIDOPSIS PROTEINS and MADS DOMAIN PROTEINS. The species A. thaliana is used for experiments in classical plant genetics as well as molecular genetic studies in plant physiology, biochemistry, and development.
Drosophila melanogaster: A species of fruit fly much used in genetics because of the large size of its chromosomes.
Models, Statistical: Statistical formulations or analyses which, when applied to data and found to fit the data, are then used to verify the assumptions and parameters used in the analysis. Examples of statistical models are the linear model, binomial model, polynomial model, two-parameter model, etc.
DNA Transposable Elements: Discrete segments of DNA which can excise and reintegrate to another site in the genome. Most are inactive, i.e., have not been found to exist outside the integrated state. DNA transposable elements include bacterial IS (insertion sequence) elements, Tn elements, the maize controlling elements Ac and Ds, Drosophila P, gypsy, and pogo elements, the human Tigger elements and the Tc and mariner elements which are found throughout the animal kingdom.
DNA, Plant: Deoxyribonucleic acid that makes up the genetic material of plants.
Online Systems: Systems where the input data enter the computer directly from the point of origin (usually a terminal or workstation) and/or in which output data are transmitted directly to that terminal point of origin. (Sippl, Computer Dictionary, 4th ed)
Classification: The systematic arrangement of entities in any field into categories or classes based on common characteristics such as properties, morphology, subject matter, etc.
Markov Chains: A stochastic process such that the conditional probability distribution for a state at any future instant, given the present state, is unaffected by any additional knowledge of the past history of the system.
Bacterial Proteins: Proteins found in any species of bacterium.
Workflow: Description of pattern of recurrent functions or procedures frequently found in organizational processes, such as notification, decision, and action.
Protein Interaction Maps: Graphs representing sets of measurable, non-covalent physical contacts with specific PROTEINS in living organisms or in cells.
Gene Expression: The phenotypic manifestation of a gene or genes by the processes of GENETIC TRANSCRIPTION and GENETIC TRANSLATION.
Biological Ontologies: Structured vocabularies describing concepts from the fields of biology and relationships between concepts.
Binding Sites: The parts of a macromolecule that directly participate in its specific combination with another molecule.
Quantitative Trait Loci: Genetic loci associated with a QUANTITATIVE TRAIT.
Genetic Markers: A phenotypically recognizable genetic trait which can be used to identify a genetic locus, a linkage group, or a recombination event.
Human Genome Project: A coordinated effort of researchers to map (CHROMOSOME MAPPING) and sequence (SEQUENCE ANALYSIS, DNA) the human GENOME.
Regulatory Elements, Transcriptional: Nucleotide sequences of a gene that are involved in the regulation of GENETIC TRANSCRIPTION.
Software Validation: The act of testing the software for compliance with a standard.
Transcription Initiation Site: The first nucleotide of a transcribed DNA sequence where RNA polymerase (DNA-DIRECTED RNA POLYMERASE) begins synthesizing the RNA transcript.
Transcription Factors: Endogenous substances, usually proteins, which are effective in the initiation, stimulation, or termination of the genetic transcription process.
Promoter Regions, Genetic: DNA sequences which are recognized (directly or indirectly) and bound by a DNA-dependent RNA polymerase during the initiation of transcription. Highly conserved sequences within the promoter include the Pribnow box in bacteria and the TATA BOX in eukaryotes.
Saccharomyces cerevisiae: A species of the genus SACCHAROMYCES, family Saccharomycetaceae, order Saccharomycetales, known as "baker's" or "brewer's" yeast. The dried form is used as a dietary supplement.
Sequence Tagged Sites: Short tracts of DNA sequence that are used as landmarks in GENOME mapping. In most instances, 200 to 500 base pairs of sequence define a Sequence Tagged Site (STS) that is operationally unique in the human genome (i.e., can be specifically detected by the polymerase chain reaction in the presence of all other genomic sequences). The overwhelming advantage of STSs over mapping landmarks defined in other ways is that the means of testing for the presence of a particular STS can be completely described as information in a database.
Base Composition: The relative amounts of the PURINES and PYRIMIDINES in a nucleic acid.
Computer Simulation: Computer-based representation of physical systems and phenomena such as chemical processes.
Genotype: The genetic constitution of the individual, comprising the ALLELES present at each GENETIC LOCUS.
Models, Biological: Theoretical representations that simulate the behavior or activity of biological processes or diseases. For disease models in living animals, DISEASE MODELS, ANIMAL is available. Biological models include the use of mathematical equations, computers, and other electronic equipment.
Codon, Initiator: A codon that directs initiation of protein translation (TRANSLATION, GENETIC) by stimulating the binding of initiator tRNA (RNA, TRANSFER, MET). In prokaryotes, the codons AUG or GUG can act as initiators while in eukaryotes, AUG is the only initiator codon.
Automatic Data Processing: Data processing largely performed by automatic means.
Microarray Analysis: The simultaneous analysis, on a microchip, of multiple samples or targets arranged in an array format.
Physical Chromosome Mapping: Mapping of the linear order of genes on a chromosome with units indicating their distances by using methods other than genetic recombination. These methods include nucleotide sequencing, overlapping deletions in polytene chromosomes, and electron micrography of heteroduplex DNA. (From King & Stansfield, A Dictionary of Genetics, 5th ed)
Takifugu: A genus of pufferfish commonly used for research.
Repetitive Sequences, Nucleic Acid: Sequences of DNA or RNA that occur in multiple copies. There are several types: INTERSPERSED REPETITIVE SEQUENCES are copies of transposable elements (DNA TRANSPOSABLE ELEMENTS or RETROELEMENTS) dispersed throughout the genome. TERMINAL REPEAT SEQUENCES flank both ends of another sequence, for example, the long terminal repeats (LTRs) on RETROVIRUSES. Variations may be direct repeats, those occurring in the same direction, or inverted repeats, those opposite to each other in direction. TANDEM REPEAT SEQUENCES are copies which lie adjacent to each other, direct or inverted (INVERTED REPEAT SEQUENCES).
Genome-Wide Association Study: An analysis comparing the allele frequencies of all available (or a whole GENOME representative set of) polymorphic markers in unrelated patients with a specific symptom or disease condition, and those of healthy controls to identify markers associated with a specific disease or condition.
RNA Isoforms: The different gene transcripts generated from a single gene by RNA EDITING or ALTERNATIVE SPLICING of RNA PRECURSORS.
Genome, Microbial: The genetic complement of a microorganism as represented in its DNA or in some microorganisms its RNA.
DNA, Intergenic: Any of the DNA in between gene-coding DNA, including untranslated regions, 5' and 3' flanking regions, INTRONS, non-functional pseudogenes, and non-functional repetitive sequences. This DNA may or may not encode regulatory functions.
Genetic Loci: Specific regions that are mapped within a GENOME. Genetic loci are usually identified with a shorthand notation that indicates the chromosome number and the position of a specific band along the P or Q arm of the chromosome where they are found. For example the locus 6p21 is found within band 21 of the P-arm of CHROMOSOME 6. Many well known genetic loci are also known by common names that are associated with a genetic function or HEREDITARY DISEASE.
Microsatellite Repeats: A variety of simple repeat sequences that are distributed throughout the GENOME. They are characterized by a short repeat unit of 2-8 base pairs that is repeated up to 100 times. They are also known as short tandem repeats (STRs).
Periodicals as Topic: A publication issued at stated, more or less regular, intervals.
Genes, Overlapping: Genes whose nucleotide sequences overlap to some degree. The overlapped sequences may involve structural or regulatory genes of eukaryotic or prokaryotic cells.
Genes, Fungal: The functional hereditary units of FUNGI.
Reverse Transcriptase Polymerase Chain Reaction: A variation of the PCR technique in which cDNA is made from RNA via reverse transcription. The resultant cDNA is then amplified using standard PCR protocols.
Plant Proteins: Proteins found in plants (flowers, herbs, shrubs, trees, etc.). The concept does not include proteins found in vegetables for which VEGETABLE PROTEINS is available.
Nucleotide Motifs: Commonly observed BASE SEQUENCE or nucleotide structural components which can be represented by a CONSENSUS SEQUENCE or a SEQUENCE LOGO.
Chromosomes, Plant: Complex nucleoprotein structures which contain the genomic DNA and are part of the CELL NUCLEUS of PLANTS.
Quality Control: A system for verifying and maintaining a desired level of quality in a product or process by careful planning, use of proper equipment, continued inspection, and corrective action as required. (Random House Unabridged Dictionary, 2d ed)
DNA Primers: Short sequences (generally about 10 base pairs) of DNA that are complementary to sequences of messenger RNA and allow reverse transcriptases to start copying the adjacent sequences of mRNA. Primers are used extensively in genetic and molecular biology techniques.
Systems Biology: Comprehensive, methodical analysis of complex biological systems by monitoring responses to perturbations of biological processes.
Large scale, computerized collection and analysis of the data are used to develop and test models of biological systems.Gene Expression Regulation, Plant: Any of the processes by which nuclear, cytoplasmic, or intercellular factors influence the differential control of gene action in plants.RNA: A polynucleotide consisting essentially of chains with a repeating backbone of phosphate and ribose units to which nitrogenous bases are attached. RNA is unique among biological macromolecules in that it can encode genetic information, serve as an abundant structural component of cells, and also possesses catalytic activity. (Rieger et al., Glossary of Genetics: Classical and Molecular, 5th ed)RNA Splice Sites: Nucleotide sequences located at the ends of EXONS and recognized in pre-messenger RNA by SPLICEOSOMES. They are joined during the RNA SPLICING reaction, forming the junctions between exons.Regulatory Sequences, Nucleic Acid: Nucleic acid sequences involved in regulating the expression of genes.Computer Communication Networks: A system containing any combination of computers, computer terminals, printers, audio or visual display devices, or telephones interconnected by telecommunications equipment or cables: used to transmit or receive information. (Random House Unabridged Dictionary, 2d ed)DNA Copy Number Variations: Stretches of genomic DNA that exist in different multiples between individuals. Many copy number variations have been associated with susceptibility or resistance to disease.MicroRNAs: Small double-stranded, non-protein coding RNAs, 21-25 nucleotides in length generated from single-stranded microRNA gene transcripts by the same RIBONUCLEASE III, Dicer, that produces small interfering RNAs (RNA, SMALL INTERFERING). They become part of the RNA-INDUCED SILENCING COMPLEX and repress the translation (TRANSLATION, GENETIC) of target RNA by binding to homologous 3'UTR region as an imperfect match. The small temporal RNAs (stRNAs), let-7 and lin-4, from C. 
elegans, are the first 2 miRNAs discovered, and are from a class of miRNAs involved in developmental timing.Encyclopedias as Topic: Works containing information articles on subjects in every field of knowledge, usually arranged in alphabetical order, or a similar work limited to a special field or subject. (From The ALA Glossary of Library and Information Science, 1983)Amino Acid Motifs: Commonly observed structural components of proteins formed by simple combinations of adjacent secondary structures. A commonly observed structure may be composed of a CONSERVED SEQUENCE which can be represented by a CONSENSUS SEQUENCE.Blotting, Southern: A method (first developed by E.M. Southern) for detection of DNA that has been electrophoretically separated and immobilized by blotting on nitrocellulose or other type of paper or nylon membrane followed by hybridization with labeled NUCLEIC ACID PROBES.Eukaryota: One of the three domains of life (the others being BACTERIA and ARCHAEA), also called Eukarya. These are organisms whose cells are enclosed in membranes and possess a nucleus. They comprise almost all multicellular and many unicellular organisms, and are traditionally divided into groups (sometimes called kingdoms) including ANIMALS; PLANTS; FUNGI; and various algae and other taxa that were previously part of the old kingdom Protista.RNA, Plant: Ribonucleic acid in plants having regulatory and catalytic roles as well as involvement in protein synthesis.Plants: Multicellular, eukaryotic life forms of kingdom Plantae (sensu lato), comprising the VIRIDIPLANTAE; RHODOPHYTA; and GLAUCOPHYTA; all of which acquired chloroplasts by direct endosymbiosis of CYANOBACTERIA. 
They are characterized by a mainly photosynthetic mode of nutrition; essentially unlimited growth at localized regions of cell divisions (MERISTEMS); cellulose within cells providing rigidity; the absence of organs of locomotion; absence of nervous and sensory systems; and an alternation of haploid and diploid generations.Reference Standards: A basis of value established for the measure of quantity, weight, extent or quality, e.g. weight standards, standard solutions, methods, techniques, and procedures used in diagnosis and therapy.In Situ Hybridization, Fluorescence: A type of IN SITU HYBRIDIZATION in which target sequences are stained with fluorescent dye so their location and size can be determined using fluorescence microscopy. This staining is sufficiently distinct that the hybridization signal can be seen both in metaphase spreads and in interphase nuclei.Chromosomes, Human: Very long DNA molecules and associated proteins, HISTONES, and non-histone chromosomal proteins (CHROMOSOMAL PROTEINS, NON-HISTONE). Normally 46 chromosomes, including two sex chromosomes are found in the nucleus of human cells. They carry the hereditary information of the individual.Information Management: Management of the acquisition, organization, storage, retrieval, and dissemination of information. (From Thesaurus of ERIC Descriptors, 1994)Retroelements: Elements that are transcribed into RNA, reverse-transcribed into DNA and then inserted into a new site in the genome. Long terminal repeats (LTRs) similar to those from retroviruses are contained in retrotransposons and retrovirus-like elements. 
Retroposons, such as LONG INTERSPERSED NUCLEOTIDE ELEMENTS and SHORT INTERSPERSED NUCLEOTIDE ELEMENTS do not contain LTRs.Alleles: Variant forms of the same gene, occupying the same locus on homologous CHROMOSOMES, and governing the variants in production of the same gene product.Databases as Topic: Organized collections of computer records, standardized in format and content, that are stored in any of a variety of computer-readable modes. They are the basic sets of data from which computer-readable files are created. (from ALA Glossary of Library and Information Science, 1983)Gene Components: The parts of the gene sequence that carry out the different functions of the GENES.Restriction Mapping: Use of restriction endonucleases to analyze and generate a physical map of genomes, genes, or other segments of DNA.Abbreviations: Works consisting of lists of shortened forms of written words or phrases used for brevity. Acronyms are included here.Sensitivity and Specificity: Binary classification measures to assess test results. Sensitivity or recall rate is the proportion of true positives. Specificity is the probability of correctly determining the absence of a condition. (From Last, Dictionary of Epidemiology, 2d ed)Chromosome Aberrations: Abnormal number or structure of chromosomes. Chromosome aberrations may result in CHROMOSOME DISORDERS.Bayes Theorem: A theorem in probability theory named for Thomas Bayes (1702-1761). In epidemiology, it is used to obtain the probability of disease in a group of people with some characteristic on the basis of the overall rate of that disease and of the likelihood of that characteristic in healthy and diseased individuals. 
The most familiar application is in clinical decision analysis where it is used for estimating the probability of a particular diagnosis given the appearance of some symptoms or test result.Dictionaries as Topic: Lists of words, usually in alphabetical order, giving information about form, pronunciation, etymology, grammar, and meaning.Caenorhabditis elegans: A species of nematode that is widely used in biological, biochemical, and genetic studies.Genes, Archaeal: The functional genetic units of ARCHAEA.Mycoplasmataceae: A family of gram-negative, non-motile bacteria from human and animal sources. One saprophytic species is known.Mass Spectrometry: An analytical method used in determining the identity of a chemical based on its mass using mass analyzers/mass spectrometers.Bacteria: One of the three domains of life (the others being Eukarya and ARCHAEA), also called Eubacteria. They are unicellular prokaryotic microorganisms which generally possess rigid cell walls, multiply by cell division, and exhibit three principal forms: round or coccal, rodlike or bacillary, and spiral or spirochetal. Bacteria can be classified by their response to OXYGEN: aerobic, anaerobic, or facultatively anaerobic; by the mode by which they obtain their energy: chemotrophy (via chemical reaction) or PHOTOTROPHY (via light reaction); for chemotrophs by their source of chemical energy: CHEMOLITHOTROPHY (from inorganic compounds) or chemoorganotrophy (from organic compounds); and by their source for CARBON; NITROGEN; etc.; HETEROTROPHY (from organic sources) or AUTOTROPHY (from CARBON DIOXIDE). They can also be classified by whether or not they stain (based on the structure of their CELL WALLS) with CRYSTAL VIOLET dye: gram-negative or gram-positive.
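The "Sensitivity and Specificity" and "Bayes Theorem" entries above can be tied together with a small worked calculation. The following is a minimal sketch, not part of the glossary itself: the function names and the example counts are hypothetical and chosen only for illustration.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of actual positives that the test detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of actual negatives that the test correctly rejects."""
    return true_neg / (true_neg + false_pos)


def posterior_given_positive(prevalence, sens, spec):
    """Bayes' theorem: probability of the condition given a positive test.

    P(D | +) = P(+ | D) P(D) / [P(+ | D) P(D) + P(+ | not D) P(not D)]
    where P(+ | D) is the sensitivity and P(+ | not D) is 1 - specificity.
    """
    p_positive = sens * prevalence + (1.0 - spec) * (1.0 - prevalence)
    return sens * prevalence / p_positive


# Hypothetical counts, for illustration only.
sens = sensitivity(true_pos=90, false_neg=10)
spec = specificity(true_neg=80, false_pos=20)
post = posterior_given_positive(prevalence=0.01, sens=sens, spec=spec)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} posterior={post:.3f}")
```

With a low assumed prevalence, the posterior probability stays small even for a fairly sensitive test, which is the point of the clinical decision analysis application mentioned in the Bayes Theorem entry.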
Genome Annotation. The Bioconductor project provides software for associating microarray and other genomic data in real time to ... Facilitate the inclusion of biological metadata in the analysis of genomic data, e.g. literature data from PubMed, annotation ... Software tools are available for assembling and processing genomic annotation data, from databases such as GenBank, the Gene ... In addition there are a large number of genome annotation packages available that are mainly, but not solely, oriented towards ...
Mount SM (2000). "Genomic Sequence, Splicing, and Gene Annotation". Am. J. Hum. Genet. 67 (4): 788-92. doi:10.1086/303098. PMC ...
A re-annotation was made in 2003. Venkateswaran, K.; Moser, D. P.; Dollhopf, M. E.; Lies, D. P.; Saffarini, D. A.; MacGregor, B ... In 2002, its genomic sequence was published. It has a 4.9Mb circular chromosome that is predicted to encode 4,758 protein open ...
"Genomic annotation prediction based on integrated information". Springer. pp. 238-252. doi:10.1007/978-3-642-35686-5_20 - via ...
"Manual curation is not sufficient for annotation of genomic databases". Bioinformatics. 23 (13): i41-i48. doi:10.1093/ ...
2008). "WGAViewer: software for genomic annotation of whole genome association studies". Genome Research. 18 (4): 640-643. doi: ... WGAViewer currently offers several classes of annotation of the GWAS results: (1) Overview of WGA results allowing zooming in/ ... out searching for gene/SNP top hits sorting with individual SNP annotation (2) Genic annotation of WGA results with explicit ... such as the genomic context of the SNP, linkage disequilibrium (LD) with ungenotyped SNPs, gene expression database, and the ...
Choy KW, Wang CC, Ogura A, Lau TK, Rogers MS, Ikeo K, Gojobori T, Lam DS, Pang CP (Mar 2006). "Genomic annotation of 15,809 ...
Annotation Bot) work together to provide version-controlled annotation and metadata for genomic variant data in order to ... Solutions for Genomic Variant Annotation. RENCI, University of North Carolina at Chapel Hill. doi:10.7921.G0QN64N3. Available ... renci.org/technical-reports/tr-14-04-canvas-and-annobot-solutions-for-genomic-variant-annotation. Webb, A. E. (2011). Linkage, ... relational PostgreSQL database that stores genomic variant data with associated annotation and metadata. AnnoBot ...
"GREAT Input: Genomic Regions Enrichment of Annotations Tool, Bejerano Lab, Stanford University". bejerano.stanford.edu. " ... In 2010, Gill Bejerano from Stanford University released the Genomic region enrichment of annotations tool (GREAT), a software ... The Database for Annotation, Visualization and Integrated Discovery (DAVID) is a bioinformatics tool that pools together ... Lina Wadi; Mona Meyer; Joel Weiser; Lincoln D Stein; Jüri Reimand (2016). "Impact of outdated gene annotations on pathway ...
"Genomic analyses with biofilter 2.0: knowledge driven filtering, annotation, and model development". BioData Mining. 6 (1): 25 ...
2006). "Genomic annotation of 15,809 ESTs identified from pooled early gestation human eyes". Physiol. Genomics. 25 (1): 9-15. ... 2005). "Generation and annotation of the DNA sequences of human chromosomes 2 and 4". Nature. 434 (7034): 724-31. doi:10.1038/ ...
... is a database of genomic annotations taking alternative splicing events into consideration. Alternative splicing TassDB ... AspicDB Kim, Pora; Kim Namshin; Lee Younghee; Kim Bumjin; Shin Youngah; Lee Sanghyuk (Jan 2005). "ECgene: genome annotation for ...
2006). "Genomic annotation of 15,809 ESTs identified from pooled early gestation human eyes". Physiol. Genomics. 25 (1): 9-15. ...
He was the Genome Annotation Group Leader at the Joint Genome Institute, Lawrence Berkeley National Lab (2003) and was the ... 2006). Automatic annotation of eukaryotic genes, pseudogenes and promoters. Genome Biol Solovyev V, Salamov A. (2011) Automatic ... Salamov A., Solovyev V. (2000) Ab initio gene finding in Drosophila genomic DNA. Genome Res. 10(4), 516-522. Solovyev et al. ( ... He is interested in genome structural and functional annotation and applying it for rational design of biological systems. ...
A number of servers provide genomic annotations computed with the aid of TRANSFAC. Others have used such analyses to infer ... Binding of a TF to a genomic site is documented by specifying the localization of the site, its sequence and the experimental ... Wingender E (July 2008). "The TRANSFAC project as an example of framework technology that supports the analysis of genomic ... TRANSFAC (TRANScription FACtor database) is a manually curated database of eukaryotic transcription factors, their genomic ...
The Pseudomonas Genome Database is a database of genomic annotations for the Pseudomonas genomes. Pseudomonas Winsor, Geoffrey ...
Sequence curation at WormBase refers to the maintenance and annotation of the primary genomic sequence and a consensus gene set ... re-analysis of the original genomic data often leads to modifications of the genomic sequence. The changes in the genomic ... The reason for the change and the evidence for the change are added to the annotation of the CDS - these can be seen in the ... Genome Browser - browse the genes of C. elegans (and other species) in their genomic context Textpresso - a search tool that ...
Gaulton, Kyle (Dec 2015). "Genetic fine mapping and genomic annotation defines causal mechanisms at type 2 diabetes ...
... 2 accepts GenBank and Locus Reference Genomic (LRG) records. The annotation is also used to apply the correct codon ... Mutalyzer website HGVS sequence variant nomenclature website LOVD 2.0 website GenBank website Locus Reference Genomic website ... Mutalyzer requires a DNA sequence record containing the transcript and protein feature annotation as a reference. ...
For genomic re-sequencing and newly sequenced genomes, a de-novo assembly will be provided. 2. The Genomatix Genome Analyzer ( ... It allows for easy integration and visualization in the terabytes of background annotation of the ElDorado genome database. GGA ... Genomatix offers integrated solutions and databases for genome annotation and regulation analysis. Genomatix product portfolio ... Clustering and peak finding, analysis for phylogenetic conservation, large scale correlation analysis with annotated genomic ...
Richardson MK, Crooijmans RP, Groenen MA (2007). "Sequencing and genomic annotation of the chicken (Gallus gallus) Hox clusters ...
... it is most accurate when there are large regions of contiguous genomic DNA available for comparison. Gene annotations provide ... DNA sequence data from genomic and metagenomic projects are essentially the same, but genomic sequence data offers higher ... Therefore, community genomic information is another fundamental tool (with metabolomics and proteomics) in the quest to ... Breitbart, M; Salamon P; Andresen B; Mahaffy JM; Segall AM; Mead D; Azam F; Rohwer F (2002). "Genomic analysis of uncultured ...
... is a curated genomic database containing functional annotations of agriculturally important animals, plants, microbes ... AgBase biocurators provide annotation of Gene Ontology terms and Plant Ontology terms for gene products. By 2011 AgBase ...
Early genome projects, such as human and fly, used Pfam extensively for functional annotation of genomic data. The Pfam website ... One of its major aims at inception was to aid in the annotation of the C. elegans genome. The project was partly driven by the ... A critical step in improving the pace of updating and improving entries was to open up the functional annotation of Pfam ... It is anticipated that while community involvement will greatly improve the level of annotation of these families, some will ...
Developed in 1993, original GeneMark was used in 1995 as a primary gene prediction tool for annotation of the first completely ... how to define parameters for gene prediction in a rather short sequence that has no large genomic context. In 1999 this ... doi:10.1093/nar/29.12.2607 Mills R., Rozanov M., Lomsadze A., Tatusova T. and Borodovsky M. "Improving gene annotation in ... www.ncbi.nlm.nih.gov/genome/annotation_prok/process). Accurate identification of species specific parameters of the GeneMark ...
Vieira ML, Caillat-Zucman S, Gajdos P, Cohen-Kaminsky S, Casteur A, Bach JF (September 1993). "Identification by genomic typing ... "Variation analysis and gene annotation of eight MHC haplotypes: the MHC Haplotype Project". Immunogenetics. 60 (1): 1-18. doi ...
With the accumulation of gene function annotation by the IMPC, as well as technical advancements in gene editing such as CRISPR ... This last technique is particularly valuable for human genomic sequencing since it enhances our ability to replicate human ... For example, by comparing mouse genetic functional data with genomic data for selected species with specific diseases, improved ... compared genetic functional data from mice with genomic data from gorillas, showing how such analyses could aid in the ...
GAIA: framework annotation of genomic sequence.. Bailey LC Jr1, Fischer S, Schug J, Crabtree J, Gibson M, Overton GC. ... Here we describe a process of high-throughput, reliable annotation, called framework annotation, which is designed to provide a ... New framework annotation is produced by CARTA, a set of autonomous sensors that perform automatic analyses and assert results ... The center of GAIA consists of an annotation database and the associated data management subsystem that forms the software bus ...
Command Line Tools for Genomic Data Science. In this module, we'll be taking a look at Sequences and Genomic Features in a ... Sequences and Genomic Features 2: Sequence Representation and Generation 11:29. Sequences and Genomic Features 3: Annotation 14: ... Sequences and Genomic Features 5: Recreating Sequences & Features 12:42. Sequences and Genomic Features 6: Genomic Feature ... Sequences and Genomic Features 3: Annotation.
Computational and functional annotation of genomic elements during development of the model vertebrate zebrafish ... Periodic Reporting for period 1 - ZENCODE-ITN (Computational and functional annotation of genomic elements during development ... We aim to comprehensively annotate functional elements, decipher genomic codes of transcription, as well as coding and non- ...
VESPA: software to facilitate genomic annotation of prokaryotic organisms through integration of proteomic and transcriptomic data - ... data into a genomic context, all of which can be visualized at three levels of genomic resolution. Data is interrogated via ... PCC 7002) to demonstrate the rapid manner in which mis-annotations can be found and explored in VESPA using either proteomics ...
... Tianlei Xu ... Annotating a given genomic locus or a set of genomic loci is an important yet challenging task. This is especially true for the ... Only the Genomic Locations (chromosomes, start and end position) will be used. Strand information and other metadata columns ... By checking the presence or absence of millions of eQTLs in the set of genomic intervals of interest, loci2path builds a bridge ...
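The core operation described in this excerpt, checking which eQTLs fall inside a set of genomic intervals using only chromosome, start and end, can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the loci2path implementation or its API: the function names, the inclusive-coordinate convention, and the sample data are all hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def index_positions(eqtls):
    """Group eQTL positions by chromosome and sort them for binary search.

    `eqtls` is an iterable of (chromosome, position) pairs (hypothetical format).
    """
    by_chrom = defaultdict(list)
    for chrom, pos in eqtls:
        by_chrom[chrom].append(pos)
    for positions in by_chrom.values():
        positions.sort()
    return by_chrom


def count_in_region(index, chrom, start, end):
    """Count indexed positions with start <= position <= end on `chrom`.

    Strand is ignored, mirroring the excerpt's statement that only the
    chromosome, start and end of each region are used.
    """
    positions = index.get(chrom, [])
    return bisect_right(positions, end) - bisect_left(positions, start)


# Made-up example data.
eqtl_index = index_positions([("chr1", 100), ("chr1", 250), ("chr2", 50)])
query_regions = [("chr1", 90, 200), ("chr1", 100, 250), ("chr3", 1, 10)]
for chrom, start, end in query_regions:
    print(chrom, start, end, count_in_region(eqtl_index, chrom, start, end))
```

Sorting once and using binary search keeps each region query logarithmic in the number of eQTLs per chromosome, which matters when millions of eQTLs are checked against many regions.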
Snat: a SNP annotation tool for bovine by integrating various sources of genomic information ... Population structure, differential bias and genomic control in a large-scale, case-control association study. Clayton, David G ... The article presents a study which was conducted to develop a bovine single nucleotide polymorphism (SNP) annotation tool ( ...
The annotation of biochemical pathways resulted in a total of 11,971 unigenes assigned with 145 KEGG maps and 1,759 enzyme ... databases identified 175,882 GO annotations. A total of 11,308 guar unigenes were annotated with various enzyme codes (EC) and ...
As a result, we are able to suggest functional roles for several previously unknown genes or unknown genomic regions in E. coli ... Here we propose a new method for functional annotation using the conservation patterns of gene clusters. If several gene ... The current speed of sequencing already exceeds the capability of annotation, creating a potential bottleneck. A large ... Homology-based annotation tools aim to detect sequence similarity between new genes and known genes by following a one-by-one ...
The Genomic Threading Database: a comprehensive resource for structural annotations of the genomes from key organisms. NUCLEIC ... Currently, the Genomic Threading Database (GTD) contains structural assignments for the proteins encoded within the genomes of ...
... that enables researchers to rapidly infer gene function based on available gene expression data and functional annotations. Our ... Furthermore, Prosecutor utilizes additional biological information such as genomic context and known regulatory mechanisms that ... Despite a plethora of functional genomic efforts, the function of many genes in sequenced genomes remains unknown. The ... Prediction ability of four annotation sources. Histograms of ROC areas (Area Under the Curve) for four annotation sources for E ...
Jiang K, Du F, lv L, Zhuo H, Xu T, Peng L, Chen Y, Li L, Zhang J. Genetic Fine Mapping and Genomic Annotation Defines Causal ... Genetic Fine Mapping and Genomic Annotation Defines Causal Mechanisms at A Novel Colorectal Cancer Susceptibility Locus in Han ...
Highly accurate preliminary annotation of Tetrahymena's coding potential was hindered by the lack of both comparative genomic ... Refined annotation and assembly of the Tetrahymena thermophila genome sequence through EST analysis, comparative genomic ... For the improvement of annotation, we have sequenced and analyzed over 60,000 verified EST reads from a variety of cellular ...
Genomic analyses with biofilter 2.0: Knowledge driven filtering, annotation, and model development. BioData Mining. 2013 Dec 30 ... Pendergrass, Sarah A.; ...
The lack of a facile and rapid annotation algorithm has led to the rnpB gene being the most grossly under annotated essential ... yielding high quality results to aid researchers annotating either genomic or metagenomic data. It is the only algorithm to ... makes it labor intensive and challenging to create and maintain covariance models for the detection of RNase P RNA in genomic ... gene in completed prokaryotic genomes with only a 24% annotation rate. Here we describe the coupling of the largest RNase P RNA ...
... Summarizing the key genome annotation resources in Bioconductor. Executive ... Organism-oriented annotation. For biological annotation, generally sequence or gene based, there are three key types of package ... You can survey all annotation packages at the annotation page.. Packages Homo.sapiens, Mus.musculus and Rattus.norvegicus are ... Systems biology oriented annotation. Packages GO.db, KEGG.db, KEGGREST, and reactome.db are primarily intended as organism- ...
Masking of genomic sequence: How much of the genome was masked. *Transcript and protein alignments: The number and type of ... This annotation should be referred to as NCBI Canis lupus dingo Annotation Release 100. Annotation release ID: 100. Date of ... For more information on the annotation process, please visit the NCBI Eukaryotic Genome Annotation Pipeline page. ... For this annotation run, transcripts and proteins were aligned to the genome masked with WindowMasker only.. Assembly name. ...
Background Several bioinformatics tools have been designed for assembly and annotation of chloroplast (cp) genomes, making it ... it is much easier to compare cp genomes than the whole genomic data for genomic comparative analysis. Early on chloroplast DNA ... The annotation of the cv Tombul cp genome was carried out using three different tools, cpGAVAS, DOGMA and GeSeq [40, 41, 42, 43 ... Comparing the results of the annotation tools, ten genes (atpF, clpP, ndhA, ndhB, ndhK, petA, rpl2, rpoC1, ycf3, ycf15) were ...
... shown are number of annotation segments (A), length distribution of segment annotations (B), and percent genomic coverage (C). ... Genomic distribution, coverage, and overlap of diverse regulatory annotations. To catalog super, typical, stretch enhancers, ... Enrichment for overlap between each pair of regulatory annotations in Figure S1 was calculated using the Genomic Association ... Regulatory annotation sources. Regulatory annotations for the GM12878, H1 hESC, and HepG2 cell types were downloaded from ...
Genomic and transcriptomic materials were combined into one file per species and sent through the KEGG Automatic Annotation ... Genome Annotation. Genome annotation was conducted using MAKER2 (71), incorporating the Semi-HMM based Nucleic Acid Parser ( ... and a framework for comparative genomic studies, which should be valuable for future phylogenetic and genomic investigations of ... Genomic insights into the evolutionary origin of Myxozoa within Cnidaria. E. Sally Chang, Moran Neuhof, Nimrod D. Rubinstein, ...
The preceding results were overlaid and organized by using the Artemis annotation tool (47) for the final annotation of the ... Genome Annotation. The Sulcia genome has no detectable GC skew nor a dnaA gene, two common ways of positioning the origin of ... Parallel genomic evolution and metabolic interdependence in an ancient symbiosis. John P. McCutcheon and Nancy A. Moran ...
Functional annotation of SNVs. Several computational tools and databases were used to predict the functional effect of coding ... filters removing genomic regions of inferior sequence quality (quality filters), and (ii) filters targeting genomic regions ... Optimized filtering reduces the error rate in detecting genomic variants by short-read sequencing. Reumers, J., De Rijk, P., Zhao, H. et al. ...
Annotation of scaffolds representing ∼57% of the genome reveals 20,486 protein-coding genes and expansions of gene families ... Monika Gulia-Nuss, ... Genomic insights into the Ixodes scapularis tick vector of Lyme disease. Nat. Commun. 7:10507 doi: 10.1038/ncomms10507 (2016). ... The annotation of the I. scapularis genome was performed via a joint effort between the JCVI and VectorBase. The genome ...
Functional annotation. Initial functional annotation was performed using InterProScan to search against the InterPro protein ... based on the KEGG annotation resource [35]. KEGG genes and KO term annotations were assigned based on similarity searches with ... Annotation of carbohydrate active enzymes. The CAZymes Analysis Toolkit (CAT) [44] was used to detect B. xylophilus ... An annotation method "based on association rules between CAZy families and Pfam domains" was used with an E-value threshold of ...
Genomic-scale data available via TrichDB may be queried based on BLAST searches, annotation keywords and gene ID searches, GO ... (Aurrecoechea et al., 2009) GiardiaDB and TrichDB: integrated genomic resources for the eukaryotic protist pathogens Giardia ...
  • Generate summary statistics of genomic variants across the cohort that you selected. (amazon.com)
  • We evaluated the analytical sensitivity and specificity of the panel on 1624 known single nucleotide variants (SNVs) and indels on a mixture of genomic DNA from 10 previously characterized lymphoblastoid cell lines, and analyzed 50 Spanish patients with presumed hereditary SNHL not caused by GJB2/GJB6, OTOF, or MT-RNR1 mutations. (springer.com)
  • Two marked genogroups were identified, as confirmed by phylogenetic and phylogenomic relationships to the LF-89 and EM-90 reference strains, as well as by assessments of genomic structures. (frontiersin.org)
  • In this study, we selected representative MRSA strains from systemic patient surveillance in Yunnan province, China, sequenced their genomes, and compared their features with those of some food-derived strains. (biomedcentral.com)
  • Objective To study the detailed nature of genomic microevolution during mixed infection with multiple Helicobacter pylori strains in an individual. (bmj.com)
  • Genomic comparison with other STEC O157 strains revealed that PV15-279 recently emerged from the stx1a/stx2c-positive GP STEC O157:H7 clone circulating in Japan. (cdc.gov)
  • Genomic tools can provide insights into local adaptation, population structure, gene functions, responses to environmental conditions, immune responses, and many other processes important for conservation strategies (Allendorf et al. (google.com)
  • With an updated approach on recent techniques and current human genomic databases, the book is a valuable source for students and researchers in genome and medical informatics. (worldcat.org)
  • The potential applications of whole-genome sequencing in genomic medicine are enormous and range from elucidating disease-causing mutations for monogenic traits to dissecting the molecular genetic basis of complex diseases and discovering somatic alterations in cancer (1, 2). (nature.com)
  • However, increased sequencing capacity was not matched by a corresponding development in annotation and the gene annotation process is now the rate-limiting step in whole-genome sequencing projects. (biomedcentral.com)
  • Altogether, these findings open up the possibility of a whole-genome reinvestigation of the S. cerevisiae annotation. (biomedcentral.com)
  • The eQTL set entity also contains the following information: the tissue name for the eQTL study, IDs and genomic ranges for the eQTL SNPs, and IDs for the associated genes. (bioconductor.org)
  • The article presents a study which was conducted to develop a bovine single nucleotide polymorphisms (SNP) annotation tool (Snat) based on a web interface to systematically prioritize these statistically significant SNPs and facilitate follow-up replication studies. (ebscohost.com)
  • Here we describe the coupling of the largest RNase P RNA database with the local alignment scoring algorithm to create the most sensitive and rapid prokaryote rnpB gene identification and annotation algorithm to date. (biomedcentral.com)
  • In other ncRNAs, such as 16S and tRNA, sequence variability poses less of an identification and annotation hurdle because they have conserved secondary structures. (biomedcentral.com)
  • This conserved structure allows researchers to generate a descriptor model coupled with a covariance model for structure aided identification and annotation. (biomedcentral.com)
  • Due to the limited genetic information available for European hazelnut (Corylus avellana L.) and as part of a genome sequencing project, we analyzed the complete chloroplast genome of the cultivar 'Tombul' with multiple annotation tools. (springer.com)
  • It also allows loading multiple annotation packages at the same time in order to, e.g., compare gene models between Ensembl releases. (bioconductor.org)
  • We describe a wealth of potential virulence loci and attribute biological function to several putative genomic islands, which may then be further characterized using conventional molecular techniques. (nih.gov)
  • Here, we compare five widely used annotations of active regulatory elements that represent high densities of one or more relevant epigenomic marks-"super" and "typical" (nonsuper) enhancers, stretch enhancers, high-occupancy target (HOT) regions, and broad domains-across the four matched human cell types for which they are available. (genetics.org)
  • These scripts also provide original functionalities such as the creation of pie charts representing the genomic distribution of the peaks, as well as histograms of their distribution around transcriptional start sites (TSS) and 3' untranslated regions (3' UTR). (biomedcentral.com)
  • Thus, it became possible to reinvestigate the S. cerevisiae genome in the syntenic regions leading to an improved annotation. (biomedcentral.com)
  • Despite these striking genomic similarities, the average conservation at the DNA level is 55% in coding regions but drops to 33% in noncoding regions. (biomedcentral.com)
  • We carried out an extensive search for homology at the amino-acid level between A. gossypii coding regions and S. cerevisiae 'annotation-free' regions: stretches of sequence bearing no annotated genomic features such as ORFs, RNA genes, or transposable elements. (biomedcentral.com)
  • However, accurate selection of target genomic regions (gene panel/exome/genome), analytical performance and variant interpretation remain relevant difficulties for its clinical implementation. (springer.com)
  • Genome-wide detection of proteoforms in the brain would enable better genome annotation of protein coding regions. (mcponline.org)
  • It is designed to make a user-defined gene or SNP list (or genomic regions) more interpretable by comprehensively utilising ontology annotations and interaction networks to reveal relationships and enhance opportunities for biological discovery. (rdrr.io)
  • Differential gene expression controls variation in numerous plant traits, such as flowering time and plant/pest interactions, but little is known about the genomic distribution of the determinants of transcript levels and their associated variation. (genetics.org)
  • package also provides a filter framework for retrieving annotations for specific entries, such as genes encoded on a chromosome region or transcript models of lincRNA genes. (bioconductor.org)
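
One excerpt in the list above describes 'annotation-free' regions as stretches of sequence bearing no annotated genomic features such as ORFs, RNA genes, or transposable elements. A minimal sketch of how such stretches could be derived from a feature list is given here. It is not the method used in the cited study: the function name `annotation_free_regions`, the 1-based inclusive coordinate convention, and the example coordinates are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed convention: 1-based, inclusive (start, end) coordinates.
Interval = Tuple[int, int]


def annotation_free_regions(seq_length: int, features: List[Interval]) -> List[Interval]:
    """Return the maximal stretches of [1, seq_length] not covered by any feature.

    Features may overlap or be nested; they are clipped to the sequence bounds.
    """
    gaps: List[Interval] = []
    next_free = 1  # first position not yet known to be covered by a feature
    for start, end in sorted(features):
        # Clip the feature to the sequence and skip it if nothing remains.
        start, end = max(start, 1), min(end, seq_length)
        if start > end:
            continue
        if start > next_free:
            # Uncovered stretch between the previous coverage and this feature.
            gaps.append((next_free, start - 1))
        next_free = max(next_free, end + 1)
    if next_free <= seq_length:
        # Trailing uncovered stretch up to the end of the sequence.
        gaps.append((next_free, seq_length))
    return gaps


# Hypothetical example: three features on a 100-bp sequence, two of them overlapping.
example_gaps = annotation_free_regions(100, [(10, 20), (15, 30), (50, 60)])
print(example_gaps)
```

Sorting the features first lets a single pass merge overlapping or nested annotations implicitly, so the output contains only maximal unannotated stretches. In practice the feature list would come from an annotation file, and the coordinate convention would need to match that file's format.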