**Algorithms**: A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task.

**Software**: Sequential operating programs and data which instruct the functioning of a digital computer.

**Pattern Recognition, Automated**: In INFORMATION RETRIEVAL, machine-sensing or identification of visible patterns (shapes, forms, and configurations). (Harrod's Librarians' Glossary, 7th ed)

**Computer Simulation**: Computer-based representation of physical systems and phenomena such as chemical processes.

**Computational Biology**: A field of biology concerned with the development of techniques for the collection and manipulation of biological data, and the use of such data to make biological discoveries or predictions. This field encompasses all computational methods and theories for solving biological problems including manipulation of models and datasets.

**Reproducibility of Results**: The statistical reproducibility of measurements (often in a clinical context), including the testing of instrumentation or techniques to obtain reproducible results. The concept includes reproducibility of physiological measurements, which may be used to develop rules to assess probability or prognosis, or response to a stimulus; reproducibility of occurrence of a condition; and reproducibility of experimental results.

**Artificial Intelligence**: Theory and development of COMPUTER SYSTEMS which perform tasks that normally require human intelligence. Such tasks may include speech recognition, LEARNING; VISUAL PERCEPTION; MATHEMATICAL COMPUTING; reasoning, PROBLEM SOLVING, DECISION-MAKING, and translation of language.

**Models, Statistical**: Statistical formulations or analyses which, when applied to data and found to fit the data, are then used to verify the assumptions and parameters used in the analysis. Examples of statistical models are the linear model, binomial model, polynomial model, two-parameter model, etc.

**Sensitivity and Specificity**: Binary classification measures to assess test results. Sensitivity or recall rate is the proportion of true positives. Specificity is the probability of correctly determining the absence of a condition. (From Last, Dictionary of Epidemiology, 2d ed)

**Cluster Analysis**: A set of statistical methods used to group variables or observations into strongly inter-related subgroups. In epidemiology, it may be used to analyze a closely grouped series of events or cases of disease or other health-related phenomenon with well-defined distribution patterns in relation to time or place or both.

**Image Processing, Computer-Assisted**: A technique of inputting two-dimensional images into a computer and then enhancing or analyzing the imagery into a form that is more useful to the human observer.

**Sequence Analysis, Protein**: A process that includes the determination of AMINO ACID SEQUENCE of a protein (or peptide, oligopeptide or peptide fragment) and the information analysis of the sequence.

**Sequence Alignment**: The arrangement of two or more amino acid or base sequences from an organism or organisms in such a way as to align areas of the sequences sharing common properties. The degree of relatedness or homology between the sequences is predicted computationally or statistically based on weights assigned to the elements aligned between the sequences. This in turn can serve as a potential indicator of the genetic relatedness between the organisms.

**Image Interpretation, Computer-Assisted**: Methods developed to aid in the interpretation of ultrasound, radiographic images, etc., for diagnosis of disease.

**Phantoms, Imaging**: Devices or objects in various imaging techniques used to visualize or enhance visualization by simulating conditions encountered in the procedure. Phantoms are used very often in procedures employing or measuring x-irradiation or radioactive material to evaluate performance. Phantoms often have properties similar to human tissue. Water demonstrates absorbing properties similar to normal tissue, hence water-filled phantoms are used to map radiation levels. Phantoms are used also as teaching aids to simulate real conditions with x-ray or ultrasonic machines. (From Iturralde, Dictionary and Handbook of Nuclear Medicine and Clinical Imaging, 1990)

**Models, Genetic**: Theoretical representations that simulate the behavior or activity of genetic processes or phenomena. They include the use of mathematical equations, computers, and other electronic equipment.

**Signal Processing, Computer-Assisted**: Computer-assisted processing of electric, ultrasonic, or electronic signals to interpret function and activity.

**Software Validation**: The act of testing the software for compliance with a standard.

**Imaging, Three-Dimensional**: The process of generating three-dimensional images by electronic, photographic, or other methods. For example, three-dimensional images can be generated by assembling multiple tomographic images with the aid of a computer, while photographic 3-D images (HOLOGRAPHY) can be made by exposing film to the interference pattern created when two laser light sources shine on an object.

**Sequence Analysis, DNA**: A multistage process that includes cloning, physical mapping, subcloning, determination of the DNA SEQUENCE, and information analysis.

**Image Enhancement**: Improvement of the quality of a picture by various techniques, including computer processing, digital filtering, echocardiographic techniques, light and ultrastructural MICROSCOPY, fluorescence spectrometry and microscopy, scintigraphy, and in vitro image processing at the molecular level.

**Markov Chains**: A stochastic process such that the conditional probability distribution for a state at any future instant, given the present state, is unaffected by any additional knowledge of the past history of the system.

**Proteins**: Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein.

**Databases, Protein**: Databases containing information about PROTEINS such as AMINO ACID SEQUENCE; PROTEIN CONFORMATION; and other properties.

**Bayes Theorem**: A theorem in probability theory named for Thomas Bayes (1702-1761). In epidemiology, it is used to obtain the probability of disease in a group of people with some characteristic on the basis of the overall rate of that disease and of the likelihood of that characteristic in healthy and diseased individuals. The most familiar application is in clinical decision analysis where it is used for estimating the probability of a particular diagnosis given the appearance of some symptoms or test result.

**Gene Expression Profiling**: The determination of the pattern of genes expressed at the level of GENETIC TRANSCRIPTION, under specific circumstances or in a specific cell.

**Monte Carlo Method**: In statistics, a technique for numerically approximating the solution of a mathematical problem by studying the distribution of some random variable, often generated by a computer. The name alludes to the randomness characteristic of the games of chance played at the gambling casinos in Monte Carlo. (From Random House Unabridged Dictionary, 2d ed, 1993)

**Computer Graphics**: The process of pictorial communication, between human and computers, in which the computer input and output have the form of charts, drawings, or other appropriate pictorial representation.

**Automation**: Controlled operation of an apparatus, process, or system by mechanical or electronic devices that take the place of human organs of observation, effort, and decision. (From Webster's Collegiate Dictionary, 1993)

**Databases, Factual**: Extensive collections, reputedly complete, of facts and data garnered from material of a specialized subject area and made available for analysis and application. The collection can be automated by various contemporary methods for retrieval. The concept should be differentiated from DATABASES, BIBLIOGRAPHIC which is restricted to collections of bibliographic references.

**Oligonucleotide Array Sequence Analysis**: Hybridization of a nucleic acid sample to a very large set of OLIGONUCLEOTIDE PROBES, which have been attached individually in columns and rows to a solid support, to determine a BASE SEQUENCE, or to detect variations in a gene sequence, GENE EXPRESSION, or for GENE MAPPING.

**Neural Networks (Computer)**: A computer architecture, implementable in either hardware or software, modeled after biological neural networks. Like the biological system in which the processing capability is a result of the interconnection strengths between arrays of nonlinear processing nodes, computerized neural networks, often called perceptrons or multilayer connectionist models, consist of neuron-like units. A homogeneous group of units makes up a layer. These networks are good at pattern recognition. They are adaptive, performing tasks by example, and thus are better for decision-making than are linear learning machines or cluster analysis. They do not require explicit programming.

**Numerical Analysis, Computer-Assisted**: Computer-assisted study of methods for obtaining useful quantitative solutions to problems that have been expressed mathematically.

**Models, Theoretical**: Theoretical representations that simulate the behavior or activity of systems, processes, or phenomena. They include the use of mathematical equations, computers, and other electronic equipment.

**User-Computer Interface**: The portion of an interactive computer program that issues messages to and receives commands from a user.

**Data Compression**: Information application based on a variety of coding methods to minimize the amount of data to be stored, retrieved, or transmitted. Data compression can be applied to various forms of data, such as images and signals. It is used to reduce costs and increase efficiency in the maintenance of large volumes of data.

**Fuzzy Logic**: Approximate, quantitative reasoning that is concerned with the linguistic ambiguity which exists in natural or synthetic language. At its core are variables such as good, bad, and young as well as modifiers such as more, less, and very. These ordinary terms represent fuzzy sets in a particular problem. Fuzzy logic plays a key role in many medical expert systems.

**Artifacts**: Any visible result of a procedure which is caused by the procedure itself and not by the entity being analyzed. Common examples include histological structures introduced by tissue processing, radiographic images of structures that are not naturally present in living tissue, and products of chemical reactions that occur during analysis.

**Diagnosis, Computer-Assisted**: Application of computer programs designed to assist the physician in solving a diagnostic problem.

**Databases, Genetic**: Databases devoted to knowledge about specific genes and gene products.

**Data Interpretation, Statistical**: Application of statistical procedures to analyze specific observed or assumed facts from a particular study.

**Models, Biological**: Theoretical representations that simulate the behavior or activity of biological processes or diseases. For disease models in living animals, DISEASE MODELS, ANIMAL is available. Biological models include the use of mathematical equations, computers, and other electronic equipment.

**Normal Distribution**: Continuous frequency distribution of infinite range. Its properties are as follows: 1, continuous, symmetrical distribution with both tails extending to infinity; 2, arithmetic mean, mode, and median identical; and 3, shape completely determined by the mean and standard deviation.

**Information Storage and Retrieval**: Organized activities related to the storage, location, search, and retrieval of information.

**Likelihood Functions**: Functions constructed from a statistical model and a set of observed data which give the probability of that data for various values of the unknown model parameters. Those parameter values that maximize the probability are the maximum likelihood estimates of the parameters.

**Radiographic Image Interpretation, Computer-Assisted**: Computer systems or networks designed to provide radiographic interpretive information.

**Genomics**: The systematic study of the complete DNA sequences (GENOME) of organisms.

**Internet**: A loose confederation of computer communication networks around the world. The networks that make up the Internet are connected through several backbone networks. The Internet grew out of the US Government ARPAnet project and was designed to facilitate information exchange.

**Decision Trees**: A graphic device used in decision analysis, series of decision options are represented as branches (hierarchical).

**Radiographic Image Enhancement**: Improvement in the quality of an x-ray image by use of an intensifying screen, tube, or filter and by optimum exposure techniques. Digital processing methods are often employed.

**Subtraction Technique**: Combination or superimposition of two images for demonstrating differences between them (e.g., radiograph with contrast vs. one without, radionuclide images using different radionuclides, radiograph vs. radionuclide image) and in the preparation of audiovisual materials (e.g., offsetting identical images, coloring of vessels in angiograms).

**Programming Languages**: Specific languages used to prepare computer programs.

**Wavelet Analysis**: Signal and data processing method that uses decomposition of wavelets to approximate, estimate, or compress signals with finite time and frequency domains. It represents a signal or data in terms of a fast decaying wavelet series from the original prototype wavelet, called the mother wavelet. This mathematical algorithm has been adopted widely in biomedical disciplines for data and signal processing in noise removal and audio/image compression (e.g., EEG and MRI).

**Computing Methodologies**: Computer-assisted analysis and processing of problems in a particular area.

**Signal-To-Noise Ratio**: The comparison of the quantity of meaningful data to the irrelevant or incorrect data.

**Data Mining**: Use of sophisticated analysis tools to sort through, organize, examine, and combine large sets of information.

**Protein Interaction Mapping**: Methods for determining interaction between PROTEINS.

**Models, Molecular**: Models used experimentally or theoretically to study molecular shape, electronic properties, or interactions; includes analogous molecules, computer-generated graphics, and mechanical structures.

**Wireless Technology**: Techniques using energy such as radio frequency, infrared light, laser light, visible light, or acoustic energy to transfer information without the use of wires, over both short and long distances.

**Support Vector Machines**: Learning algorithms which are a set of related supervised computer learning methods that analyze data and recognize patterns, and used for classification and regression analysis.

**Automatic Data Processing**: Data processing largely performed by automatic means.

**Software Design**: Specifications and instructions applied to the software.

**Sequence Analysis, RNA**: A multistage process that includes cloning, physical mapping, subcloning, sequencing, and information analysis of an RNA SEQUENCE.

**Computers**

**Molecular Sequence Data**: Descriptions of specific amino acid, carbohydrate, or nucleotide sequences which have appeared in the published literature and/or are deposited in and maintained by databanks such as GENBANK, European Molecular Biology Laboratory (EMBL), National Biomedical Research Foundation (NBRF), or other sequence repositories.

**Stochastic Processes**: Processes that incorporate some element of randomness, used particularly to refer to a time series of random variables.

**Genome**: The genetic complement of an organism, including all of its GENES, as represented in its DNA, or in some cases, its RNA.

**Gene Regulatory Networks**: Interacting DNA-encoded regulatory subsystems in the GENOME that coordinate input from activator and repressor TRANSCRIPTION FACTORS during development, cell differentiation, or in response to environmental cues. The networks function to ultimately specify expression of particular sets of GENES for specific conditions, times, or locations.

**ROC Curve**: A graphic means for assessing the ability of a screening test to discriminate between healthy and diseased persons; may also be used in other studies, e.g., distinguishing stimuli responses as to a faint stimuli or nonstimuli.

**Equipment Design**: Methods of creating machines and devices.

**Models, Chemical**: Theoretical representations that simulate the behavior or activity of chemical processes or phenomena; includes the use of mathematical equations, computers, and other electronic equipment.

**Probability**: The study of chance processes or the relative frequency characterizing a chance process.

**Predictive Value of Tests**: In screening and diagnostic tests, the probability that a person with a positive test is a true positive (i.e., has the disease), is referred to as the predictive value of a positive test; whereas, the predictive value of a negative test is the probability that the person with a negative test does not have the disease. Predictive value is related to the sensitivity and specificity of the test.

**Chromosome Mapping**: Any method used for determining the location of and relative distances between genes on a chromosome.

**Phylogeny**: The relationships of groups of organisms as reflected by their genetic makeup.

**Magnetic Resonance Imaging**: Non-invasive method of demonstrating internal anatomy based on the principle that atomic nuclei in a strong magnetic field absorb pulses of radiofrequency energy and emit them as radiowaves which can be reconstructed into computerized images. The concept includes proton spin tomographic techniques.

**Time Factors**: Elements of limited time intervals, contributing to particular results or situations.

**Base Sequence**: The sequence of PURINES and PYRIMIDINES in nucleic acids and polynucleotides. It is also called nucleotide sequence.

**Discriminant Analysis**: A statistical analytic technique used with discrete dependent variables, concerned with separating sets of observed values and allocating new values. It is sometimes used instead of regression analysis.

**Cone-Beam Computed Tomography**: Computed tomography modalities which use a cone or pyramid-shaped beam of radiation.

**Tomography, X-Ray Computed**: Tomography using x-ray transmission and a computer algorithm to reconstruct the image.

**Least-Squares Analysis**: A principle of estimation in which the estimates of a set of parameters in a statistical model are those quantities minimizing the sum of squared differences between the observed values of a dependent variable and the values predicted by the model.

**Nonlinear Dynamics**: The study of systems which respond disproportionately (nonlinearly) to initial conditions or perturbing stimuli. Nonlinear systems may exhibit "chaos" which is classically characterized as sensitive dependence on initial conditions. Chaotic systems, while distinguished from more ordered periodic systems, are not random. When their behavior over time is appropriately displayed (in "phase space"), constraints are evident which are described by "strange attractors". Phase space representations of chaotic systems, or strange attractors, usually reveal fractal (FRACTALS) self-similarity across time scales. Natural, including biological, systems often display nonlinear dynamics and chaos.

**Programming, Linear**: A technique of operations research for solving certain kinds of problems involving many variables where a best value or set of best values is to be found. It is most likely to be feasible when the quantity to be optimized, sometimes called the objective function, can be stated as a mathematical expression in terms of the various activities within the system, and when this expression is simply proportional to the measure of the activities, i.e., is linear, and when all the restrictions are also linear. It is different from computer programming, although problems using linear programming techniques may be programmed on a computer.

**Equipment Failure Analysis**: The evaluation of incidents involving the loss of function of a device. These evaluations are used for a variety of purposes such as to determine the failure rates, the causes of failures, costs of failures, and the reliability and maintainability of devices.

**Genome, Human**: The complete genetic complement contained in the DNA of a set of CHROMOSOMES in a HUMAN. The length of the human genome is about 3 billion base pairs.

**Proteomics**: The systematic study of the complete complement of proteins (PROTEOME) of organisms.

**Databases, Nucleic Acid**: Databases containing information about NUCLEIC ACIDS such as BASE SEQUENCE; SNPS; NUCLEIC ACID CONFORMATION; and other properties. Information about the DNA fragments kept in a GENE LIBRARY or GENOMIC LIBRARY is often maintained in DNA databases.

**Principal Component Analysis**: Mathematical procedure that transforms a number of possibly correlated variables into a smaller number of uncorrelated variables called principal components.

**Polymorphism, Single Nucleotide**: A single nucleotide variation in a genetic sequence that occurs at appreciable frequency in the population.

**Amino Acid Sequence**: The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION.

**Computer Communication Networks**: A system containing any combination of computers, computer terminals, printers, audio or visual display devices, or telephones interconnected by telecommunications equipment or cables: used to transmit or receive information. (Random House Unabridged Dictionary, 2d ed)

**Natural Language Processing**: Computer processing of a language with rules that reflect and describe current usage rather than prescribed usage.

**Tomography**: Imaging methods that result in sharp images of objects located on a chosen plane and blurred images located above or below the plane.

**Proteome**: The protein complement of an organism coded for by its genome.