Data Mining: Use of sophisticated analysis tools to sort through, organize, examine, and combine large sets of information.
Mining
Information Storage and Retrieval: Organized activities related to the storage, location, search, and retrieval of information.
Database Management Systems: Software designed to store, manipulate, manage, and control data for specific uses.
Databases, Genetic: Databases devoted to knowledge about specific genes and gene products.
Computational Biology: A field of biology concerned with the development of techniques for the collection and manipulation of biological data, and the use of such data to make biological discoveries or predictions. This field encompasses all computational methods and theories for solving biological problems including manipulation of models and datasets.
Algorithms: A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task.
Software: Sequential operating programs and data which instruct the functioning of a digital computer.
Coal Mining
Databases, Factual: Extensive collections, reputedly complete, of facts and data garnered from material of a specialized subject area and made available for analysis and application. The collection can be automated by various contemporary methods for retrieval.
The concept should be differentiated from DATABASES, BIBLIOGRAPHIC which is restricted to collections of bibliographic references.
Decision Trees: A graphic device used in decision analysis, in which a series of decision options is represented as branches (hierarchical).
User-Computer Interface: The portion of an interactive computer program that issues messages to and receives commands from a user.
Decision Support Systems, Management: Computer-based systems that enable management to interrogate the computer on an ad hoc basis for various kinds of information in the organization, which predict the effect of potential decisions.
Internet: A loose confederation of computer communication networks around the world. The networks that make up the Internet are connected through several backbone networks. The Internet grew out of the US Government ARPAnet project and was designed to facilitate information exchange.
Artificial Intelligence: Theory and development of COMPUTER SYSTEMS which perform tasks that normally require human intelligence. Such tasks may include speech recognition, LEARNING, VISUAL PERCEPTION, MATHEMATICAL COMPUTING, reasoning, PROBLEM SOLVING, DECISION-MAKING, and translation of language.
Natural Language Processing: Computer processing of a language with rules that reflect and describe current usage rather than prescribed usage.
PubMed: A bibliographic database that includes MEDLINE as its primary subset. It is produced by the National Center for Biotechnology Information (NCBI), part of the NATIONAL LIBRARY OF MEDICINE.
PubMed, which is searchable through NLM's Web site, also includes access to additional citations to selected life sciences journals not in MEDLINE, and links to other resources such as the full-text of articles at participating publishers' Web sites, NCBI's molecular biology databases, and PubMed Central.
Gene Expression Profiling: The determination of the pattern of genes expressed at the level of GENETIC TRANSCRIPTION, under specific circumstances or in a specific cell.
Computer Graphics: The process of pictorial communication, between human and computers, in which the computer input and output have the form of charts, drawings, or other appropriate pictorial representation.
Databases, Protein: Databases containing information about PROTEINS such as AMINO ACID SEQUENCE; PROTEIN CONFORMATION; and other properties.
Genomics: The systematic study of the complete DNA sequences (GENOME) of organisms.
Pattern Recognition, Automated: In INFORMATION RETRIEVAL, machine-sensing or identification of visible patterns (shapes, forms, and configurations). (Harrod's Librarians' Glossary, 7th ed)
Expressed Sequence Tags: Partial cDNA (DNA, COMPLEMENTARY) sequences that are unique to the cDNAs from which they were derived.
Systems Integration: The procedures involved in combining separately developed modules, components, or subsystems so that they work together as a complete system.
(From McGraw-Hill Dictionary of Scientific and Technical Terms, 4th ed)
Hospital Administrators: Managerial personnel responsible for implementing policy and directing the activities of hospitals.
Oligonucleotide Array Sequence Analysis: Hybridization of a nucleic acid sample to a very large set of OLIGONUCLEOTIDE PROBES, which have been attached individually in columns and rows to a solid support, to determine a BASE SEQUENCE, or to detect variations in a gene sequence, GENE EXPRESSION, or for GENE MAPPING.
Adverse Drug Reaction Reporting Systems: Systems developed for collecting reports from government agencies, manufacturers, hospitals, physicians, and other sources on adverse drug reactions.
Data Interpretation, Statistical: Application of statistical procedures to analyze specific observed or assumed facts from a particular study.
Cluster Analysis: A set of statistical methods used to group variables or observations into strongly inter-related subgroups. In epidemiology, it may be used to analyze a closely grouped series of events or cases of disease or other health-related phenomenon with well-defined distribution patterns in relation to time or place or both.
Databases as Topic: Organized collections of computer records, standardized in format and content, that are stored in any of a variety of computer-readable modes. They are the basic sets of data from which computer-readable files are created. (from ALA Glossary of Library and Information Science, 1983)
Abstracting and Indexing as Topic: Activities performed to identify concepts and aspects of published information and research reports.
MEDLINE: The premier bibliographic database of the NATIONAL LIBRARY OF MEDICINE. MEDLINE® (MEDLARS Online) is the primary subset of PUBMED and can be searched on NLM's Web site in PubMed or the NLM Gateway.
MEDLINE references are indexed with MEDICAL SUBJECT HEADINGS (MeSH).
Neural Networks (Computer): A computer architecture, implementable in either hardware or software, modeled after biological neural networks. Like the biological system in which the processing capability is a result of the interconnection strengths between arrays of nonlinear processing nodes, computerized neural networks, often called perceptrons or multilayer connectionist models, consist of neuron-like units. A homogeneous group of units makes up a layer. These networks are good at pattern recognition. They are adaptive, performing tasks by example, and thus are better for decision-making than are linear learning machines or cluster analysis. They do not require explicit programming.
Sequence Analysis, Protein: A process that includes the determination of AMINO ACID SEQUENCE of a protein (or peptide, oligopeptide or peptide fragment) and the information analysis of the sequence.
Programming Languages: Specific languages used to prepare computer programs.
Databases, Bibliographic: Extensive collections, reputedly complete, of references and citations to books, articles, publications, etc., generally on a single subject or specialized subject area. Databases can operate through automated files, libraries, or computer disks. The concept should be differentiated from DATABASES, FACTUAL which is used for collections of data and facts apart from bibliographic references to them.
Terminology as Topic: The terms, expressions, designations, or symbols used in a particular science, discipline, or specialized subject area.
Databases, Nucleic Acid: Databases containing information about NUCLEIC ACIDS such as BASE SEQUENCE; SNPS; NUCLEIC ACID CONFORMATION; and other properties.
Information about the DNA fragments kept in a GENE LIBRARY or GENOMIC LIBRARY is often maintained in DNA databases.
Drug Repositioning: The deliberate and methodical practice of finding new applications for existing drugs.
Hypermedia: Computerized compilations of information units (text, sound, graphics, and/or video) interconnected by logical nonlinear linkages that enable users to follow optimal paths through the material and also the systems used to create and display this information. (From Thesaurus of ERIC Descriptors, 1994)
Vocabulary, Controlled: A specified list of terms with a fixed and unalterable meaning, and from which a selection is made when CATALOGING; ABSTRACTING AND INDEXING; or searching BOOKS; JOURNALS AS TOPIC; and other documents. The control is intended to avoid the scattering of related subjects under different headings (SUBJECT HEADINGS). The list may be altered or extended only by the publisher or issuing agency. (From Harrod's Librarians' Glossary, 7th ed, p163)
Knowledge Bases: Collections of facts, assumptions, beliefs, and heuristics that are used in combination with databases to achieve desired results, such as a diagnosis, an interpretation, or a solution to a problem (From McGraw Hill Dictionary of Scientific and Technical Terms, 6th ed).
Medical Records Systems, Computerized: Computer-based systems for input, storage, display, retrieval, and printing of information contained in a patient's medical record.
Pharmacovigilance: The detection of long and short term side effects of conventional and traditional medicines through research, data mining, monitoring, and evaluation of healthcare information obtained from healthcare providers and patients.
Search Engine: Software used to locate data or information stored in machine-readable form locally or at a distance such as an INTERNET site.
Reproducibility of Results: The statistical reproducibility of measurements (often in a clinical context), including the testing of instrumentation or
techniques to obtain reproducible results. The concept includes reproducibility of physiological measurements, which may be used to develop rules to assess probability or prognosis, or response to a stimulus; reproducibility of occurrence of a condition; and reproducibility of experimental results.
Bayes Theorem: A theorem in probability theory named for Thomas Bayes (1702-1761). In epidemiology, it is used to obtain the probability of disease in a group of people with some characteristic on the basis of the overall rate of that disease and of the likelihood of that characteristic in healthy and diseased individuals. The most familiar application is in clinical decision analysis where it is used for estimating the probability of a particular diagnosis given the appearance of some symptoms or test result.
Proteins: Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein.
Multigene Family: A set of genes descended by duplication and variation from some ancestral gene. Such genes may be clustered together on the same chromosome or dispersed on different chromosomes. Examples of multigene families include those that encode the hemoglobins, immunoglobulins, histocompatibility antigens, actins, tubulins, keratins, collagens, heat shock proteins, salivary glue proteins, chorion proteins, cuticle proteins, yolk proteins, and phaseolins, as well as histones, ribosomal RNA, and transfer RNA genes. The latter three are examples of reiterated genes, where hundreds of identical genes are present in a tandem array.
(King & Stanfield, A Dictionary of Genetics, 4th ed)
United States Food and Drug Administration: An agency of the PUBLIC HEALTH SERVICE concerned with the overall planning, promoting, and administering of programs pertaining to maintaining standards of quality of foods, drugs, therapeutic devices, etc.
Software Design: Specifications and instructions applied to the software.
Medical Informatics: The field of information science concerned with the analysis and dissemination of medical data through the application of computers to various aspects of health care and medicine.
Proteomics: The systematic study of the complete complement of proteins (PROTEOME) of organisms.
Documentation: Systematic organization, storage, retrieval, and dissemination of specialized information, especially of a scientific or technical nature (From ALA Glossary of Library and Information Science, 1983). It often involves authenticating or validating information.
Automatic Data Processing: Data processing largely performed by automatic means.
Molecular Sequence Annotation: The addition of descriptive information about the function or structure of a molecular sequence to its MOLECULAR SEQUENCE DATA record.
Sequence Alignment: The arrangement of two or more amino acid or base sequences from an organism or organisms in such a way as to align areas of the sequences sharing common properties. The degree of relatedness or homology between the sequences is predicted computationally or statistically based on weights assigned to the elements aligned between the sequences. This in turn can serve as a potential indicator of the genetic relatedness between the organisms.
Computer Simulation: Computer-based representation of physical systems and phenomena such as chemical processes.
Information Management: Management of the acquisition, organization, storage, retrieval, and dissemination of information.
(From Thesaurus of ERIC Descriptors, 1994)
Proteome: The protein complement of an organism coded for by its genome.
Sequence Analysis, DNA: A multistage process that includes cloning, physical mapping, subcloning, determination of the DNA SEQUENCE, and information analysis.
Systems Biology: Comprehensive, methodical analysis of complex biological systems by monitoring responses to perturbations of biological processes. Large scale, computerized collection and analysis of the data are used to develop and test models of biological systems.
Protein Interaction Mapping: Methods for determining interaction between PROTEINS.
Periodicals as Topic: A publication issued at stated, more or less regular, intervals.
Models, Statistical: Statistical formulations or analyses which, when applied to data and found to fit the data, are then used to verify the assumptions and parameters used in the analysis. Examples of statistical models are the linear model, binomial model, polynomial model, two-parameter model, etc.
Genome, Plant: The genetic complement of a plant (PLANTS) as represented in its DNA.
Support Vector Machines: A set of related supervised computer learning methods that analyze data and recognize patterns, and that are used for classification and regression analysis.
Decision Support Systems, Clinical: Computer-based information systems used to integrate clinical and patient information and provide support for decision-making in patient care.
Semantics: The relationships between symbols and their meanings.
Phylogeny: The relationships of groups of organisms as reflected by their genetic makeup.
Chromosome Mapping: Any method used for determining the location of and relative distances between genes on a chromosome.
Drug Discovery: The process of finding chemicals for potential therapeutic use.
Principal Component Analysis: Mathematical procedure that transforms a number of possibly correlated variables into a smaller number of uncorrelated variables called
principal components.
Genome, Human: The complete genetic complement contained in the DNA of a set of CHROMOSOMES in a HUMAN. The length of the human genome is about 3 billion base pairs.
Genome: The genetic complement of an organism, including all of its GENES, as represented in its DNA, or in some cases, its RNA.
Gene Regulatory Networks: Interacting DNA-encoded regulatory subsystems in the GENOME that coordinate input from activator and repressor TRANSCRIPTION FACTORS during development, cell differentiation, or in response to environmental cues. The networks function to ultimately specify expression of particular sets of GENES for specific conditions, times, or locations.
Molecular Sequence Data: Descriptions of specific amino acid, carbohydrate, or nucleotide sequences which have appeared in the published literature and/or are deposited in and maintained by databanks such as GENBANK, European Molecular Biology Laboratory (EMBL), National Biomedical Research Foundation (NBRF), or other sequence repositories.
Polymorphism, Single Nucleotide: A single nucleotide variation in a genetic sequence that occurs at appreciable frequency in the population.
Electronic Health Records: Media that facilitate transportability of pertinent information concerning a patient's illness across varied providers and geographic locations. Some versions include direct linkages to online consumer health information that is relevant to the health conditions and treatments related to a specific patient.
Drug-Related Side Effects and Adverse Reactions: Disorders that result from the intended use of PHARMACEUTICAL PREPARATIONS. Included in this heading are a broad variety of chemically-induced adverse conditions due to toxicity, DRUG INTERACTIONS, and metabolic effects of pharmaceuticals.
Phenotype: The outward appearance of the individual.
It is the product of interactions between genes, and between the GENOTYPE and the environment.
Workflow: Description of pattern of recurrent functions or procedures frequently found in organizational processes, such as notification, decision, and action.
Models, Genetic: Theoretical representations that simulate the behavior or activity of genetic processes or phenomena. They include the use of mathematical equations, computers, and other electronic equipment.
Classification: The systematic arrangement of entities in any field into categories or classes based on common characteristics such as properties, morphology, subject matter, etc.
Transcriptome: The pattern of GENE EXPRESSION at the level of genetic transcription in a specific organism or under specific circumstances in specific cells.
Genes: A category of nucleic acid sequences that function as units of heredity and which code for the basic instructions for the development, reproduction, and maintenance of organisms.
Models, Biological: Theoretical representations that simulate the behavior or activity of biological processes or diseases. For disease models in living animals, DISEASE MODELS, ANIMAL is available. Biological models include the use of mathematical equations, computers, and other electronic equipment.
Genes, Plant: The functional hereditary units of PLANTS.
DNA, Plant: Deoxyribonucleic acid that makes up the genetic material of plants.
Metabolic Networks and Pathways: Complex sets of enzymatic reactions connected to each other via their product and substrate metabolites.
Amino Acid Sequence: The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION.
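
Worked example (Bayes Theorem): The Bayes Theorem entry in this glossary describes obtaining the probability of disease in people with some characteristic from the overall rate of the disease and the likelihood of that characteristic in healthy and diseased individuals. The following is a minimal sketch of that calculation, not part of the original glossary. The function name, parameter names, and all numbers are hypothetical and chosen only for illustration.

```python
def probability_of_disease(overall_rate, p_char_if_diseased, p_char_if_healthy):
    """Return P(disease | characteristic) using Bayes' theorem.

    overall_rate        -- P(disease) in the whole group (hypothetical input)
    p_char_if_diseased  -- P(characteristic | disease)
    p_char_if_healthy   -- P(characteristic | no disease)
    """
    # Joint probability of having the disease and showing the characteristic.
    diseased_with_char = p_char_if_diseased * overall_rate
    # Joint probability of being healthy and showing the characteristic.
    healthy_with_char = p_char_if_healthy * (1.0 - overall_rate)
    # Bayes' theorem: divide by the total probability of the characteristic.
    return diseased_with_char / (diseased_with_char + healthy_with_char)


# Illustrative numbers only: 1% overall rate, characteristic seen in 90% of
# diseased individuals and in 5% of healthy individuals.
posterior = probability_of_disease(0.01, 0.90, 0.05)
print(f"P(disease | characteristic) = {posterior:.3f}")
```

With these made-up inputs the result is roughly 15%, which shows why a characteristic that is common among diseased individuals can still be a weak indicator when the overall rate of the disease is low.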