Cluster Analysis: A set of statistical methods used to group variables or observations into strongly inter-related subgroups. In epidemiology, it may be used to analyze a closely grouped series of events or cases of disease or other health-related phenomenon with well-defined distribution patterns in relation to time or place or both.Multigene Family: A set of genes descended by duplication and variation from some ancestral gene. Such genes may be clustered together on the same chromosome or dispersed on different chromosomes. Examples of multigene families include those that encode the hemoglobins, immunoglobulins, histocompatibility antigens, actins, tubulins, keratins, collagens, heat shock proteins, salivary glue proteins, chorion proteins, cuticle proteins, yolk proteins, and phaseolins, as well as histones, ribosomal RNA, and transfer RNA genes. The latter three are examples of reiterated genes, where hundreds of identical genes are present in a tandem array. (King & Stanfield, A Dictionary of Genetics, 4th ed)Gene Expression Profiling: The determination of the pattern of genes expressed at the level of GENETIC TRANSCRIPTION, under specific circumstances or in a specific cell.Oligonucleotide Array Sequence Analysis: Hybridization of a nucleic acid sample to a very large set of OLIGONUCLEOTIDE PROBES, which have been attached individually in columns and rows to a solid support, to determine a BASE SEQUENCE, or to detect variations in a gene sequence, GENE EXPRESSION, or for GENE MAPPING.Phylogeny: The relationships of groups of organisms as reflected by their genetic makeup.Principal Component Analysis: Mathematical procedure that transforms a number of possibly correlated variables into a smaller number of uncorrelated variables called principal components.Molecular Sequence Data: Descriptions of specific amino acid, carbohydrate, or nucleotide sequences which have appeared in the published literature and/or are deposited in and maintained by databanks such 
as GENBANK, European Molecular Biology Laboratory (EMBL), National Biomedical Research Foundation (NBRF), or other sequence repositories.Random Amplified Polymorphic DNA Technique: Technique that utilizes low-stringency polymerase chain reaction (PCR) amplification with single primers of arbitrary sequence to generate strain-specific arrays of anonymous DNA fragments. RAPD technique may be used to determine taxonomic identity, assess kinship relationships, analyze mixed genome samples, and create specific probes.Discriminant Analysis: A statistical analytic technique used with discrete dependent variables, concerned with separating sets of observed values and allocating new values. It is sometimes used instead of regression analysis.Genetic Variation: Genotypic differences observed among individuals in a population.Space-Time Clustering: A statistically significant excess of cases of a disease, occurring within a limited space-time continuum.DNA Fingerprinting: A technique for identifying individuals of a species that is based on the uniqueness of their DNA sequence. Uniqueness is determined by identifying which combination of allelic variations occur in the individual at a statistically relevant number of different loci. In forensic studies, RESTRICTION FRAGMENT LENGTH POLYMORPHISM of multiple, highly polymorphic VNTR LOCI or MICROSATELLITE REPEAT loci are analyzed. The number of loci used for the profile depends on the ALLELE FREQUENCY in the population.Sequence Analysis, DNA: A multistage process that includes cloning, physical mapping, subcloning, determination of the DNA SEQUENCE, and information analysis.Algorithms: A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task.DNA, Bacterial: Deoxyribonucleic acid that makes up the genetic material of bacteria.Bacterial Typing Techniques: Procedures for identifying types and strains of bacteria. 
The most frequently employed typing systems are BACTERIOPHAGE TYPING and SEROTYPING as well as bacteriocin typing and biotyping.Cluster Headache: A primary headache disorder that is characterized by severe, strictly unilateral PAIN which is orbital, supraorbital, temporal or in any combination of these sites, lasting 15-180 min. occurring 1 to 8 times a day. The attacks are associated with one or more of the following, all of which are ipsilateral: conjunctival injection, lacrimation, nasal congestion, rhinorrhea, facial SWEATING, eyelid EDEMA, and miosis. (International Classification of Headache Disorders, 2nd ed. Cephalalgia 2004: suppl 1)Genotype: The genetic constitution of the individual, comprising the ALLELES present at each GENETIC LOCUS.Phenotype: The outward appearance of the individual. It is the product of interactions between genes, and between the GENOTYPE and the environment.Base Sequence: The sequence of PURINES and PYRIMIDINES in nucleic acids and polynucleotides. It is also called nucleotide sequence.Iron-Sulfur Proteins: A group of proteins possessing only the iron-sulfur complex as the prosthetic group. These proteins participate in all major pathways of electron transport: photosynthesis, respiration, hydroxylation and bacterial hydrogen and nitrogen fixation.Reproducibility of Results: The statistical reproducibility of measurements (often in a clinical context), including the testing of instrumentation or techniques to obtain reproducible results. The concept includes reproducibility of physiological measurements, which may be used to develop rules to assess probability or prognosis, or response to a stimulus; reproducibility of occurrence of a condition; and reproducibility of experimental results.Amino Acid Sequence: The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. 
It is of fundamental importance in determining PROTEIN CONFORMATION.Geography: The science dealing with the earth and its life, especially the description of land, sea, and air and the distribution of plant and animal life, including humanity and human industries with reference to the mutual relations of these elements. (From Webster, 3d ed)Amplified Fragment Length Polymorphism Analysis: The detection of RESTRICTION FRAGMENT LENGTH POLYMORPHISMS by selective PCR amplification of restriction fragments derived from genomic DNA followed by electrophoretic analysis of the amplified restriction fragments.Polymerase Chain Reaction: In vitro method for producing large amounts of specific DNA or RNA fragments of defined length and sequence from small amounts of short oligonucleotide flanking sequences (primers). The essential steps include thermal denaturation of the double-stranded target molecules, annealing of the primers to their complementary sequences, and extension of the annealed primers by enzymatic synthesis with DNA polymerase. The reaction is efficient, specific, and extremely sensitive. Uses for the reaction include disease diagnosis, detection of difficult-to-isolate pathogens, mutation analysis, genetic testing, DNA sequencing, and analyzing evolutionary relationships.Genes, Bacterial: The functional hereditary units of BACTERIA.Species Specificity: The restriction of a characteristic behavior, anatomical structure or physical system, such as immune response; metabolic response, or gene or gene variant to the members of one species. It refers to that property which differentiates one species from another but it is also used for phylogenetic levels higher or lower than the species.RNA, Ribosomal, 16S: Constituent of 30S subunit prokaryotic ribosomes containing 1600 nucleotides and 21 proteins. 
16S rRNA is involved in initiation of polypeptide synthesis.Software: Sequential operating programs and data which instruct the functioning of a digital computer.Data Interpretation, Statistical: Application of statistical procedures to analyze specific observed or assumed facts from a particular study.Bacterial Proteins: Proteins found in any species of bacterium.Polymorphism, Restriction Fragment Length: Variation occurring within a species in the presence or length of DNA fragment generated by a specific endonuclease at a specific site in the genome. Such variations are generated by mutations that create or abolish recognition sites for these enzymes or change the length of the fragment.Computational Biology: A field of biology concerned with the development of techniques for the collection and manipulation of biological data, and the use of such data to make biological discoveries or predictions. This field encompasses all computational methods and theories for solving biological problems including manipulation of models and datasets.Sequence Alignment: The arrangement of two or more amino acid or base sequences from an organism or organisms in such a way as to align areas of the sequences sharing common properties. The degree of relatedness or homology between the sequences is predicted computationally or statistically based on weights assigned to the elements aligned between the sequences. This in turn can serve as a potential indicator of the genetic relatedness between the organisms.Models, Statistical: Statistical formulations or analyses which, when applied to data and found to fit the data, are then used to verify the assumptions and parameters used in the analysis. 
Examples of statistical models are the linear model, binomial model, polynomial model, two-parameter model, etc.Genetic Markers: A phenotypically recognizable genetic trait which can be used to identify a genetic locus, a linkage group, or a recombination event.Classification: The systematic arrangement of entities in any field into categories or classes based on common characteristics such as properties, morphology, subject matter, etc.Pattern Recognition, Automated: In INFORMATION RETRIEVAL, machine-sensing or identification of visible patterns (shapes, forms, and configurations). (Harrod's Librarians' Glossary, 7th ed)Dental Fissures: Deep grooves or clefts in the surface of teeth equivalent to class 1 cavities in Black's classification of dental caries.Microsatellite Repeats: A variety of simple repeat sequences that are distributed throughout the GENOME. They are characterized by a short repeat unit of 2-8 basepairs that is repeated up to 100 times. They are also known as short tandem repeats (STRs).Factor Analysis, Statistical: A set of statistical methods for analyzing the correlations among several variables in order to estimate the number of fundamental dimensions that underlie the observed data and to describe and measure those dimensions. It is used frequently in the development of scoring systems for rating scales and questionnaires.Time Factors: Elements of limited time intervals, contributing to particular results or situations.Analysis of Variance: A statistical technique that isolates and assesses the contributions of categorical independent variables to variation in the mean of a continuous dependent variable.Molecular Epidemiology: The application of molecular biology to the answering of epidemiological questions. 
The examination of patterns of changes in DNA to implicate particular carcinogens and the use of molecular markers to predict which individuals are at highest risk for a disease are common examples.Spatio-Temporal Analysis: Techniques which study entities using their topological, geometric, or geographic properties and include the dimension of time in the analysis.Sequence Homology, Amino Acid: The degree of similarity between sequences of amino acids. This information is useful for analyzing the genetic relatedness of proteins and species.DNA, Plant: Deoxyribonucleic acid that makes up the genetic material of plants.Electrophoresis, Gel, Pulsed-Field: Gel electrophoresis in which the direction of the electric field is changed periodically. This technique is similar to other electrophoretic methods normally used to separate double-stranded DNA molecules ranging in size up to tens of thousands of base-pairs. However, by alternating the electric field direction one is able to separate DNA molecules up to several million base-pairs in length.Questionnaires: Predetermined sets of questions used to collect data - clinical data, social status, occupational group, etc. The term is often applied to a self-completed survey instrument.Evolution, Molecular: The process of cumulative change at the level of DNA; RNA; and PROTEINS, over successive generations.Expressed Sequence Tags: Partial cDNA (DNA, COMPLEMENTARY) sequences that are unique to the cDNAs from which they were derived.DNA Primers: Short sequences (generally about 10 base pairs) of DNA that are complementary to sequences of messenger RNA and allow reverse transcriptases to start copying the adjacent sequences of mRNA. 
Primers are used extensively in genetic and molecular biology techniques.DNA, Ribosomal: DNA sequences encoding RIBOSOMAL RNA and the segments of DNA separating the individual ribosomal RNA genes, referred to as RIBOSOMAL SPACER DNA.Models, Genetic: Theoretical representations that simulate the behavior or activity of genetic processes or phenomena. They include the use of mathematical equations, computers, and other electronic equipment.Electrophoresis, Starch Gel: Electrophoresis in which a starch gel (a mixture of amylose and amylopectin) is used as the diffusion medium.Ecotype: Geographic variety, population, or race, within a species, that is genetically adapted to a particular habitat. An ecotype typically exhibits phenotypic differences but is capable of interbreeding with other ecotypes.Genetic Structures: The biological objects that contain genetic information and that are involved in transmitting genetically encoded traits from one organism to another.Computer Simulation: Computer-based representation of physical systems and phenomena such as chemical processes.China: A country spanning from central Asia to the Pacific Ocean.Brazil. Polymorphism, Genetic: The regular and simultaneous occurrence in a single interbreeding population of two or more discontinuous genotypes. The concept includes differences in genotypes ranging in size from a single nucleotide site (POLYMORPHISM, SINGLE NUCLEOTIDE) to large nucleotide sequences visible at a chromosomal level.Statistics as Topic: The science and art of collecting, summarizing, and analyzing data that are subject to random variation. 
The term is also applied to the data themselves and to the summarization of the data.Cloning, Molecular: The insertion of recombinant DNA molecules from prokaryotic and/or eukaryotic sources into a replicating vehicle, such as a plasmid or virus vector, and the introduction of the resultant hybrid molecules into recipient cells without altering the viability of those cells.Topography, Medical: The systematic surveying, mapping, charting, and description of specific geographical sites, with reference to the physical features that were presumed to influence health and disease. Medical topography should be differentiated from EPIDEMIOLOGY in that the former emphasizes geography whereas the latter emphasizes disease outbreaks.Minisatellite Repeats: Tandem arrays of moderately repetitive, short (10-60 bases) DNA sequences which are found dispersed throughout the GENOME, at the ends of chromosomes (TELOMERES), and clustered near telomeres. Their degree of repetition is two to several hundred at each locus. Loci number in the thousands but each locus shows a distinctive repeat unit.Bacteria: One of the three domains of life (the others being Eukarya and ARCHAEA), also called Eubacteria. They are unicellular prokaryotic microorganisms which generally possess rigid cell walls, multiply by cell division, and exhibit three principal forms: round or coccal, rodlike or bacillary, and spiral or spirochetal. Bacteria can be classified by their response to OXYGEN: aerobic, anaerobic, or facultatively anaerobic; by the mode by which they obtain their energy: chemotrophy (via chemical reaction) or PHOTOTROPHY (via light reaction); for chemotrophs by their source of chemical energy: CHEMOLITHOTROPHY (from inorganic compounds) or chemoorganotrophy (from organic compounds); and by their source for CARBON; NITROGEN; etc.; HETEROTROPHY (from organic sources) or AUTOTROPHY (from CARBON DIOXIDE). 
They can also be classified by whether or not they stain (based on the structure of their CELL WALLS) with CRYSTAL VIOLET dye: gram-negative or gram-positive.Bayes Theorem: A theorem in probability theory named for Thomas Bayes (1702-1761). In epidemiology, it is used to obtain the probability of disease in a group of people with some characteristic on the basis of the overall rate of that disease and of the likelihood of that characteristic in healthy and diseased individuals. The most familiar application is in clinical decision analysis where it is used for estimating the probability of a particular diagnosis given the appearance of some symptoms or test result.Escherichia coli: A species of gram-negative, facultatively anaerobic, rod-shaped bacteria (GRAM-NEGATIVE FACULTATIVELY ANAEROBIC RODS) commonly found in the lower part of the intestine of warm-blooded animals. It is usually nonpathogenic, but some strains are known to produce DIARRHEA and pyogenic infections. Pathogenic strains (virotypes) are classified by their specific pathogenic mechanisms such as toxins (ENTEROTOXIGENIC ESCHERICHIA COLI), etc.Databases, Genetic: Databases devoted to knowledge about specific genes and gene products.Gene Expression Regulation: Any of the processes by which nuclear, cytoplasmic, or intercellular factors influence the differential control (induction or repression) of gene action at the level of transcription or translation.Geographic Information Systems: Computer systems capable of assembling, storing, manipulating, and displaying geographically referenced information, i.e. data identified according to their locations.Models, Molecular: Models used experimentally or theoretically to study molecular shape, electronic properties, or interactions; includes analogous molecules, computer-generated graphics, and mechanical structures.Transcription, Genetic: The biosynthesis of RNA carried out on a template of DNA. 
The biosynthesis of DNA from an RNA template is called REVERSE TRANSCRIPTION.Gene Expression Regulation, Neoplastic: Any of the processes by which nuclear, cytoplasmic, or intercellular factors influence the differential control of gene action in neoplastic tissue.Neoplasms, Plasma Cell: Neoplasms associated with a proliferation of a single clone of PLASMA CELLS and characterized by the secretion of PARAPROTEINS.Chromosome Mapping: Any method used for determining the location of and relative distances between genes on a chromosome.Serotyping: Process of determining and distinguishing species of bacteria or viruses based on antigens they share.Reverse Transcriptase Polymerase Chain Reaction: A variation of the PCR technique in which cDNA is made from RNA via reverse transcription. The resultant cDNA is then amplified using standard PCR protocols.Microarray Analysis: The simultaneous analysis, on a microchip, of multiple samples or targets arranged in an array format.Catastrophization: Cognitive and emotional processes encompassing magnification of pain-related stimuli, feelings of helplessness, and a generally pessimistic orientation.Mutation: Any detectable and heritable change in the genetic material that causes a change in the GENOTYPE and which is transmitted to daughter cells and to succeeding generations.Protein Array Analysis: Ligand-binding assays that measure protein-protein, protein-small molecule, or protein-nucleic acid interactions using a very large set of capturing molecules, i.e., those attached separately on a solid support, to measure the presence or interaction of target molecules in the sample.Multivariate Analysis: A set of techniques used when variation in several variables has to be studied simultaneously. 
In statistics, multivariate analysis is interpreted as any analytic method that allows simultaneous study of two or more dependent variables.Proteomics: The systematic study of the complete complement of proteins (PROTEOME) of organisms.Food Habits: Acquired or learned food preferences.Fruit: The fleshy or dry ripened ovary of a plant, enclosing the seed or seeds.RNA, Bacterial: Ribonucleic acid in bacteria having regulatory and catalytic roles as well as involvement in protein synthesis.Nucleic Acid Hybridization: Widely used technique which exploits the ability of complementary sequences in single-stranded DNAs or RNAs to pair with each other to form a double helix. Hybridization can take place between two complementary DNA sequences, between a single-stranded DNA and a complementary RNA, or between two RNA sequences. The technique is used to detect and isolate specific sequences, measure homology, or define other characteristics of one or both strands. (Kendrew, Encyclopedia of Molecular Biology, 1994, p503)Soil Microbiology: The presence of bacteria, viruses, and fungi in the soil. This term is not restricted to pathogenic organisms.Gene Expression: The phenotypic manifestation of a gene or genes by the processes of GENETIC TRANSCRIPTION and GENETIC TRANSLATION.Molecular Typing: Using MOLECULAR BIOLOGY techniques, such as DNA SEQUENCE ANALYSIS; PULSED-FIELD GEL ELECTROPHORESIS; and DNA FINGERPRINTING, to identify, classify, and compare organisms and their subtypes.Gene Library: A large collection of DNA fragments cloned (CLONING, MOLECULAR) from a given organism, tissue, organ, or cell type. 
It may contain complete genomic sequences (GENOMIC LIBRARY) or complementary DNA sequences, the latter being formed from messenger RNA and lacking intron sequences.Cross-Sectional Studies: Studies in which the presence or absence of disease or other health-related variables are determined in each member of the study population or in a representative sample at one particular time. This contrasts with LONGITUDINAL STUDIES which are followed over a period of time.Demography: Statistical interpretation and description of a population with reference to distribution, composition, or structure.Chorda Tympani Nerve: A branch of the facial (7th cranial) nerve which passes through the middle ear and continues through the petrotympanic fissure. The chorda tympani nerve carries taste sensation from the anterior two-thirds of the tongue and conveys parasympathetic efferents to the salivary glands.Databases, Factual: Extensive collections, reputedly complete, of facts and data garnered from material of a specialized subject area and made available for analysis and application. The collection can be automated by various contemporary methods for retrieval. The concept should be differentiated from DATABASES, BIBLIOGRAPHIC which is restricted to collections of bibliographic references.Transcriptome: The pattern of GENE EXPRESSION at the level of genetic transcription in a specific organism or under specific circumstances in specific cells.RNA, Messenger: RNA sequences that serve as templates for protein synthesis. Bacterial mRNAs are generally primary transcripts in that they do not require post-transcriptional processing. Eukaryotic mRNA is synthesized in the nucleus and must be exported to the cytoplasm for translation. Most eukaryotic mRNAs have a sequence of polyadenylic acid at the 3' end, referred to as the poly(A) tail. 
The function of this tail is not known for certain, but it may play a role in the export of mature mRNA from the nucleus as well as in helping stabilize some mRNA molecules by retarding their degradation in the cytoplasm.Models, Biological: Theoretical representations that simulate the behavior or activity of biological processes or diseases. For disease models in living animals, DISEASE MODELS, ANIMAL is available. Biological models include the use of mathematical equations, computers, and other electronic equipment.Artificial Intelligence: Theory and development of COMPUTER SYSTEMS which perform tasks that normally require human intelligence. Such tasks may include speech recognition, LEARNING; VISUAL PERCEPTION; MATHEMATICAL COMPUTING; reasoning, PROBLEM SOLVING, DECISION-MAKING, and translation of language.Ecosystem: A functional system which includes the organisms of a natural community together with their environment. (McGraw Hill Dictionary of Scientific and Technical Terms, 4th ed)Diet: Regular course of eating and drinking adopted by a person or animal.Proteome: The protein complement of an organism coded for by its genome.Risk Factors: An aspect of personal behavior or lifestyle, environmental exposure, or inborn or inherited characteristic, which, on the basis of epidemiologic evidence, is known to be associated with a health-related condition considered important to prevent.Phylogeography: A field of study concerned with the principles and processes governing the geographic distributions of genealogical lineages, especially those within and among closely related species. (Avise, J.C., Phylogeography: The History and Formation of Species. Harvard University Press, 2000)Immunohistochemistry: Histochemical localization of immunoreactive substances using labeled antibodies as reagents.Cohort Studies: Studies in which subsets of a defined population are identified. 
These groups may or may not be exposed to factors hypothesized to influence the probability of the occurrence of a particular disease or other outcome. Cohorts are defined populations which, as a whole, are followed in an attempt to determine distinguishing subgroup characteristics.Binding Sites: The parts of a macromolecule that directly participate in its specific combination with another molecule.DNA, Ribosomal Spacer: The intergenic DNA segments that are between the ribosomal RNA genes (internal transcribed spacers) and between the tandemly repeated units of rDNA (external transcribed spacers and nontranscribed spacers).Genetics, Population: The discipline studying genetic composition of populations and effects of factors such as GENETIC SELECTION, population size, MUTATION, migration, and GENETIC DRIFT on the frequencies of various GENOTYPES and PHENOTYPES using a variety of GENETIC TECHNIQUES.Microdissection: The performance of dissections with the aid of a microscope.Severity of Illness Index: Levels within a diagnostic group which are established by various measurement criteria applied to the seriousness of a patient's disorder.United States. Disease Outbreaks: Sudden increase in the incidence of a disease. The concept includes EPIDEMICS and PANDEMICS.Alleles: Variant forms of the same gene, occupying the same locus on homologous CHROMOSOMES, and governing the variants in production of the same gene product.Spectroscopy, Fourier Transform Infrared: A spectroscopic technique in which a range of wavelengths is presented simultaneously with an interferometer and the spectrum is mathematically derived from the pattern thus obtained.Genomics: The systematic study of the complete DNA sequences (GENOME) of organisms.Biodiversity: The variety of all native living organisms and their various forms and interrelationships.Tumor Markers, Biological: Molecular products metabolized and secreted by neoplastic tissue and characterized biochemically in cells or body fluids. 
They are indicators of tumor stage and grade as well as useful for monitoring responses to treatment and predicting recurrence. Many chemical groups are represented including hormones, antigens, amino and nucleic acids, enzymes, polyamines, and specific cell membrane proteins and lipids.Genes, Plant: The functional hereditary units of PLANTS.Sequence Homology, Nucleic Acid: The sequential correspondence of nucleotides in one nucleic acid molecule with those of another nucleic acid molecule. Sequence homology is an indication of the genetic relatedness of different organisms and gene function.Rivers: Large natural streams of FRESH WATER formed by converging tributaries and which empty into a body of water (lake or ocean).Gene Expression Regulation, Bacterial: Any of the processes by which cytoplasmic or intercellular factors influence the differential control of gene action in bacteria.Sensitivity and Specificity: Binary classification measures to assess test results. Sensitivity or recall rate is the proportion of true positives. Specificity is the probability of correctly determining the absence of a condition. (From Last, Dictionary of Epidemiology, 2d ed)Least-Squares Analysis: A principle of estimation in which the estimates of a set of parameters in a statistical model are those quantities minimizing the sum of squared differences between the observed values of a dependent variable and the values predicted by the model.Protein Structure, Secondary: The level of protein structure in which regular hydrogen-bond interactions within contiguous stretches of polypeptide chain give rise to alpha helices, beta strands (which align to form beta sheets) or other types of coils. 
This is the first folding level of protein conformation.Pain Measurement: Scales, questionnaires, tests, and other methods used to assess pain severity and duration in patients or experimental animals to aid in diagnosis, therapy, and physiological studies.Conserved Sequence: A sequence of amino acids in a polypeptide or of nucleotides in DNA or RNA that is similar across multiple species. A known set of conserved sequences is represented by a CONSENSUS SEQUENCE. AMINO ACID MOTIFS are often composed of conserved sequences.DNA, Fungal: Deoxyribonucleic acid that makes up the genetic material of fungi.Pseudomonas: A genus of gram-negative, aerobic, rod-shaped bacteria widely distributed in nature. Some species are pathogenic for humans, animals, and plants.Sequence Analysis, Protein: A process that includes the determination of AMINO ACID SEQUENCE of a protein (or peptide, oligopeptide or peptide fragment) and the information analysis of the sequence.Plant Diseases: Diseases of plants.Italy. Protein Conformation: The characteristic 3-dimensional shape of a protein, including the secondary, supersecondary (motifs), tertiary (domains) and quaternary structure of the peptide chain. PROTEIN STRUCTURE, QUATERNARY describes the conformation assumed by multimeric proteins (aggregates of more than one polypeptide chain).Enzymes: Biological molecules that possess catalytic activity. They may occur naturally or be synthetically created. Enzymes are usually proteins, however CATALYTIC RNA and CATALYTIC DNA molecules have also been identified.DNA, Complementary: Single-stranded complementary DNA synthesized from an RNA template by the action of RNA-dependent DNA polymerase. 
cDNA (i.e., complementary DNA, not circular DNA, not C-DNA) is used in a variety of molecular cloning experiments as well as serving as a specific hybridization probe.Cattle: Domesticated bovine animals of the genus Bos, usually kept on a farm or ranch and used for the production of meat or dairy products or for heavy labor.Electron Spin Resonance Spectroscopy: A technique applicable to the wide variety of substances which exhibit paramagnetism because of the magnetic moments of unpaired electrons. The spectra are useful for detection and identification, for determination of electron structure, for study of interactions between molecules, and for measurement of nuclear spins and moments. (From McGraw-Hill Encyclopedia of Science and Technology, 7th edition) Electron nuclear double resonance (ENDOR) spectroscopy is a variant of the technique which can give enhanced resolution. Electron spin resonance analysis can now be used in vivo, including imaging applications such as MAGNETIC RESONANCE IMAGING.Restriction Mapping: Use of restriction endonucleases to analyze and generate a physical map of genomes, genes, or other segments of DNA.Image Processing, Computer-Assisted: A technique of inputting two-dimensional images into a computer and then enhancing or analyzing the imagery into a form that is more useful to the human observer.Proteins: Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein.Ferredoxins: Iron-containing proteins that transfer electrons, usually at a low potential, to flavoproteins; the iron is not present as in heme. 
(McGraw-Hill Dictionary of Scientific and Technical Terms, 5th ed)Socioeconomic Factors: Social and economic factors that characterize the individual or group within the social structure.Fibromyalgia: A common nonarticular rheumatic syndrome characterized by myalgia and multiple points of focal muscle tenderness to palpation (trigger points). Muscle pain is typically aggravated by inactivity or exposure to cold. This condition is often associated with general symptoms, such as sleep disturbances, fatigue, stiffness, HEADACHES, and occasionally DEPRESSION. There is significant overlap between fibromyalgia and the chronic fatigue syndrome (FATIGUE SYNDROME, CHRONIC). Fibromyalgia may arise as a primary or secondary disease process. It is most frequent in females aged 20 to 50 years. (From Adams et al., Principles of Neurology, 6th ed, p1494-95)Taste: The ability to detect chemicals through gustatory receptors in the mouth, including those on the TONGUE; the PALATE; the PHARYNX; and the EPIGLOTTIS.Agriculture: The science, art or practice of cultivating soil, producing crops, and raising livestock.

Ringo, Doty, Demeter and Simard, Cerebral Cortex 1994;4:331-343: a proof of the need for the spatial clustering of interneuronal connections to enhance cortical computation. (1/16506)

It has been argued that an important principle driving the organization of the cerebral cortex towards local processing has been the need to decrease time lost to interneuronal conduction delay. In this paper, I show for a simplified model of the cerebral cortex, using analytical means, that if interneuronal conduction time increases proportional to interneuronal distance, then the only way to increase the numbers of synaptic events occurring in a fixed finite time period is to spatially cluster interneuronal connections.

Cluster survey evaluation of coverage and risk factors for failure to be immunized during the 1995 National Immunization Days in Egypt. (2/16506)

BACKGROUND: In 1995, Egypt continued to experience endemic wild poliovirus transmission despite achieving high routine immunization coverage with at least three doses of oral poliovirus vaccine (OPV3) and implementing National Immunization Days (NIDs) annually for several years. METHODS: Parents of 4188 children in 3216 households throughout Egypt were surveyed after the second round of the 1995 NIDs. RESULTS: Nationwide, 74% of children are estimated to have received both NID doses, 17% one NID dose, and 9% neither NID dose. Previously unimmunized (47%) or partially immunized (64%) children were less likely to receive two NID doses of OPV than were fully immunized children (76%) (P < 0.001). Other risk factors nationwide for failure to receive NID OPV included distance from residence to nearest NID site >10 minute walk (P < 0.001), not being informed about the NID at least one day in advance (P < 0.001), and residing in a household which does not watch television (P < 0.001). Based on these findings, subsequent NIDs in Egypt were modified to improve coverage, which has resulted in a marked decrease in the incidence of paralytic poliomyelitis in Egypt. CONCLUSIONS: In selected situations, surveys can provide important information that is useful for planning future NIDs.

Clusters of Pneumocystis carinii pneumonia: analysis of person-to-person transmission by genotyping. (3/16506)

Genotyping at the internal transcribed spacer (ITS) regions of the nuclear rRNA operon was performed on isolates of P. carinii sp. f. hominis from three clusters of P. carinii pneumonia among eight patients with haematological malignancies and six with HIV infection. Nine different ITS sequence types of P. carinii sp. f. hominis were identified in the samples from the patients with haematological malignancies, suggesting that this cluster of cases of P. carinii pneumonia was unlikely to have resulted from nosocomial transmission. A common ITS sequence type was observed in two of the patients with haematological malignancies who shared a hospital room, and also in two of the patients with HIV infection who had prolonged close contact on the ward. In contrast, different ITS sequence types were detected in samples from an HIV-infected homosexual couple who shared the same household. These data suggest that person-to-person transmission of P. carinii sp. f. hominis may occur from infected to susceptible immunosuppressed patients with close contact within hospital environments. However, direct transmission between patients did not account for the majority of cases within the clusters, suggesting that person-to-person transmission of P. carinii sp. f. hominis infection may be a relatively infrequent event and does not constitute the major route of transmission in man.

Influence of sampling on estimates of clustering and recent transmission of Mycobacterium tuberculosis derived from DNA fingerprinting techniques. (4/16506)

The availability of DNA fingerprinting techniques for Mycobacterium tuberculosis has led to attempts to estimate the extent of recent transmission in populations, using the assumption that groups of tuberculosis patients with identical isolates ("clusters") are likely to reflect recently acquired infections. It is never possible to include all cases of tuberculosis in a given population in a study, and the proportion of isolates found to be clustered will depend on the completeness of the sampling. Using stochastic simulation models based on real and hypothetical populations, the authors demonstrate the influence of incomplete sampling on the estimates of clustering obtained. The results show that as the sampling fraction increases, the proportion of isolates identified as clustered also increases and the variance of the estimated proportion clustered decreases. Cluster size is also important: the underestimation of clustering for any given sampling fraction is greater, and the variability in the results obtained is larger, for populations with small clusters than for those with the same number of individuals arranged in large clusters. A considerable amount of caution should be used in interpreting the results of studies on clustering of M. tuberculosis isolates, particularly when sampling fractions are small.
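The sampling effect described in this abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' model: the population composition (100 unique isolates, 50 clusters of two, 10 clusters of ten), the function names, and the number of replicates are all assumptions chosen for illustration.

```python
import random
from collections import Counter

def proportion_clustered(genotypes):
    """Share of sampled isolates whose genotype appears at least twice in the sample."""
    counts = Counter(genotypes)
    clustered = sum(c for c in counts.values() if c >= 2)
    return clustered / len(genotypes)

def simulate(population, fraction, replicates=200, seed=0):
    """Mean proportion clustered over repeated random samples of the given fraction."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = max(1, round(fraction * len(population)))
    results = [proportion_clustered(rng.sample(population, n))
               for _ in range(replicates)]
    return sum(results) / replicates

# Hypothetical population: each isolate is represented by its genotype label.
population = []
label = 0
for size, how_many in [(1, 100), (2, 50), (10, 10)]:
    for _ in range(how_many):
        population.extend([label] * size)
        label += 1

for fraction in (0.2, 0.5, 1.0):
    print(f"sampling fraction {fraction}: "
          f"mean proportion clustered {simulate(population, fraction):.2f}")
```

Under these assumptions the true proportion clustered is 200/300, and it is recovered only when every case is sampled; smaller sampling fractions give lower estimates on average.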

Newly recognized focus of La Crosse encephalitis in Tennessee. (5/16506)

La Crosse virus is a mosquito-borne arbovirus that causes encephalitis in children. Only nine cases were reported in Tennessee during the 33-year period from 1964 to 1996. We investigated a cluster of La Crosse encephalitis cases in eastern Tennessee in 1997. Medical records of all suspected cases of La Crosse virus infection at a pediatric referral hospital were reviewed, and surveillance was enhanced in the region. Previously unreported cases were identified by surveying 20 hospitals in the surrounding 16 counties. Mosquito eggs were collected from five sites. Ten cases of La Crosse encephalitis were serologically confirmed. None of the patients had been discharged from hospitals in the region with diagnosed La Crosse encephalitis in the preceding 5 years. Aedes triseriatus and Aedes albopictus were collected at the case sites; none of the mosquitos had detectable La Crosse virus. This cluster may represent an extension of a recently identified endemic focus of La Crosse virus infection in West Virginia.

Hierarchical cluster analysis applied to workers' exposures in fiberglass insulation manufacturing. (6/16506)

The objectives of this study were to explore the application of cluster analysis to the characterization of multiple exposures in industrial hygiene practice and to compare exposure groupings based on the results of cluster analysis with those based on non-measurement-based approaches commonly used in epidemiology. Cluster analysis was performed for 37 workers simultaneously exposed to three agents (endotoxin, phenolic compounds and formaldehyde) in fiberglass insulation manufacturing. Different clustering algorithms, including complete-linkage (or farthest-neighbor), single-linkage (or nearest-neighbor), group-average and model-based clustering approaches, were used to construct the tree structures from which clusters can be formed. Differences were observed between the exposure clusters constructed by these different clustering algorithms. When contrasting the exposure classification based on tree structures with that based on non-measurement-based information, the results indicate that the exposure clusters identified from the tree structures had little in common with the classification results from either the traditional exposure zone or the work group classification approach. In terms of defining homogeneous exposure groups or from the standpoint of health risk, some toxicological normalization in the components of the exposure vector appears to be required in order to form meaningful exposure groupings from cluster analysis. Finally, it remains important to see if the lack of correspondence between exposure groups based on epidemiological classification and measurement data is a peculiarity of the data or a more general problem in multivariate exposure analysis.
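The observation that different linkage rules can produce different clusters from the same data can be reproduced on a toy example. This is a minimal sketch of naive agglomerative clustering in plain Python; the six one-dimensional "exposure" values are invented for illustration and are not the study's measurements, and the function name `agglomerate` is hypothetical.

```python
import math
from itertools import combinations

def agglomerate(points, k, linkage):
    """Merge clusters bottom-up until k remain.

    linkage=min gives single-linkage (nearest neighbor);
    linkage=max gives complete-linkage (farthest neighbor).
    """
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        # Find the pair of clusters with the smallest linkage distance.
        for i, j in combinations(range(len(clusters)), 2):
            d = linkage(math.dist(a, b)
                        for a in clusters[i] for b in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return sorted(sorted(c) for c in clusters)

# Invented values with steadily widening gaps, which encourages chaining.
points = [(0.0,), (1.0,), (2.1,), (3.3,), (4.6,), (6.0,)]

single = agglomerate(points, 2, min)
complete = agglomerate(points, 2, max)
print("single-linkage:  ", single)
print("complete-linkage:", complete)
```

On this data single-linkage chains five points together and isolates the last one, while complete-linkage splits the points four and two, so the choice of linkage changes the grouping even though the distances are identical.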

A taxonomy of health networks and systems: bringing order out of chaos. (7/16506)

OBJECTIVE: To use existing theory and data for empirical development of a taxonomy that identifies clusters of organizations sharing common strategic/structural features. DATA SOURCES: Data from the 1994 and 1995 American Hospital Association Annual Surveys, which provide extensive data on hospital involvement in hospital-led health networks and systems. STUDY DESIGN: Theories of organization behavior and industrial organization economics were used to identify three strategic/structural dimensions: differentiation, which refers to the number of different products/services along a healthcare continuum; integration, which refers to mechanisms used to achieve unity of effort across organizational components; and centralization, which relates to the extent to which activities take place at centralized versus dispersed locations. These dimensions were applied to three components of the health service/product continuum: hospital services, physician arrangements, and provider-based insurance activities. DATA EXTRACTION METHODS: We identified 295 health systems and 274 health networks across the United States in 1994, and 297 health systems and 306 health networks in 1995 using AHA data. Empirical measures aggregated individual hospital data to the health network and system level. PRINCIPAL FINDINGS: We identified a reliable, internally valid, and stable four-cluster solution for health networks and a five-cluster solution for health systems. We found that differentiation and centralization were particularly important in distinguishing unique clusters of organizations. High differentiation typically occurred with low centralization, which suggests that a broader scope of activity is more difficult to centrally coordinate. Integration was also important, but we found that health networks and systems typically engaged in both ownership-based and contractual-based integration or they were not integrated at all. 
CONCLUSIONS: Overall, we were able to classify approximately 70 percent of hospital-led health networks and 90 percent of hospital-led health systems into well-defined organizational clusters. Given the widespread perception that organizational change in healthcare has been chaotic, our research suggests that important and meaningful similarities exist across many evolving organizations. The resulting taxonomy provides a new lexicon for researchers, policymakers, and healthcare executives for characterizing key strategic and structural features of evolving organizations. The taxonomy also provides a framework for future inquiry about the relationships between organizational strategy, structure, and performance, and for assessing policy issues, such as Medicare Provider Sponsored Organizations, antitrust, and insurance regulation.

Double blind, cluster randomised trial of low dose supplementation with vitamin A or beta carotene on mortality related to pregnancy in Nepal. The NNIPS-2 Study Group. (8/16506)

OBJECTIVE: To assess the impact on mortality related to pregnancy of supplementing women of reproductive age each week with a recommended dietary allowance of vitamin A, either preformed or as beta carotene. DESIGN: Double blind, cluster randomised, placebo controlled field trial. SETTING: Rural southeast central plains of Nepal (Sarlahi district). SUBJECTS: 44 646 married women, of whom 20 119 became pregnant 22 189 times. INTERVENTION: 270 wards randomised to 3 groups of 90 each for women to receive weekly a single oral supplement of placebo, vitamin A (7000 micrograms retinol equivalents) or beta carotene (42 mg, or 7000 micrograms retinol equivalents) for over 3 1/2 years. MAIN OUTCOME MEASURES: All cause mortality in women during pregnancy up to 12 weeks post partum (pregnancy related mortality) and mortality during pregnancy to 6 weeks postpartum, excluding deaths apparently related to injury (maternal mortality). RESULTS: Mortality related to pregnancy in the placebo, vitamin A, and beta carotene groups was 704, 426, and 361 deaths per 100 000 pregnancies, yielding relative risks (95% confidence intervals) of 0.60 (0.37 to 0.97) and 0.51 (0.30 to 0.86). This represented reductions of 40% (P<0.04) and 49% (P<0.01) among those who received vitamin A and beta carotene. Combined, vitamin A or beta carotene lowered mortality by 44% (0.56 (0.37 to 0.84), P<0.005) and reduced the maternal mortality ratio from 645 to 385 deaths per 100 000 live births, or by 40% (P<0.02). Differences in cause of death could not be reliably distinguished between supplemented and placebo groups. CONCLUSION: Supplementation of women with either vitamin A or beta carotene at recommended dietary amounts during childbearing years can lower mortality related to pregnancy in rural, undernourished populations of south Asia.
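The relative risks and percentage reductions quoted in this abstract follow from simple ratios of the mortality rates. The sketch below recomputes them from the rounded rates given in the text; because the published figures were presumably derived from unrounded data, the recomputed values agree with them only approximately. The function names are illustrative, and the confidence intervals are not reproduced because they depend on the clustered design and on data not given here.

```python
# Deaths per 100 000 pregnancies, as reported in the abstract.
rates = {"placebo": 704, "vitamin A": 426, "beta carotene": 361}

def relative_risk(rate, reference_rate):
    """Ratio of the rate in a supplemented group to the rate in the reference group."""
    return rate / reference_rate

def percent_reduction(rr):
    """Relative reduction implied by a relative risk, in percent."""
    return (1 - rr) * 100

for group in ("vitamin A", "beta carotene"):
    rr = relative_risk(rates[group], rates["placebo"])
    print(f"{group}: relative risk {rr:.3f}, "
          f"reduction {percent_reduction(rr):.1f}%")

# Maternal mortality ratio per 100 000 live births: 645 (placebo) vs 385 (supplemented).
mmr_reduction = percent_reduction(relative_risk(385, 645))
print(f"maternal mortality ratio reduction {mmr_reduction:.1f}%")
```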

  • It is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including pattern recognition, image analysis, information retrieval, bioinformatics, data compression, computer graphics and machine learning. (wikipedia.org)
  • Popular notions of clusters include groups with small distances between cluster members, dense areas of the data space, intervals or particular statistical distributions. (wikipedia.org)
  • The municipality of Eindhoven and CBS have jointly conducted a statistical cluster analysis on all Eindhoven residents aged 16 and over. (cbs.nl)
  • The report provides a comprehensive statistical description of the individual clusters and shows their mutual connection as well as an overall picture. (cbs.nl)
  • You'll also explore statistical analysis examining the relative effectiveness of soft, hard, and smart power strategies. (coursera.org)
  • The LOVE clustering approach is a rigorous, adaptable, and scalable latent model-based statistical method that can be used in basic science or medical research to identify potentially significant biological or functional pathways. (pcrm.org)
  • Cluster analysis is a statistical method used to group similar objects into respective categories. (surveygizmo.com)
  • We outline a general methodology for model-based clustering that provides a principled statistical approach to these issues. (psu.edu)
  • When the clusters are relatively homogeneous (that is, the intra-cluster correlation is small), parallel studies tend to deliver better statistical performance than a stepped wedge trial. (bmj.com)
  • To identify differences between states, we implemented hierarchical cluster analysis (8-10) using the hclust function in R version 3.2.5 (free statistical computing software) with Euclidean distance as the distance measure and included the adjusted unhealthy behaviors and the prevention and outcome prevalence measures. (cdc.gov)
  • I sometimes find K-means clustering tough to explain as a statistical technique, but this makes for a great example: if you're a fielder facing Ichiro, it might be a good idea to keep an eye on those six spots when he hits. (smartdatacollective.com)
  • This paper presents a procedure for clustering analysis that combines Kohonen's Self-Organizing Feature Map (SOFM) and statistical schemes. (repec.org)
  • The procedure outperformed other clustering techniques in the job of identifying consistent groups of countries from the economic and statistical viewpoints. (repec.org)
  • Although clustering--the classifying of objects into meaningful sets--is an important procedure, cluster analysis as a multivariate statistical procedure is poorly understood. (booktopia.com.au)
  • Dyson et al 1 use a pragmatic design to address an interesting question, but I am concerned that the statistical analysis may be inappropriate and could have led to erroneous conclusions being drawn. (bmj.com)
  • For this reason, cluster trials should be published with an estimate of the degree of clustering within groups (the intraclass correlation coefficient) and the effect that this has upon statistical power (the design effect). (bmj.com)
  • Cluster trials are a valuable tool in emergency medicine research, and this study is a good example, yet care needs to be taken in statistical analysis and reporting. (bmj.com)
  • We must accept that our original analysis, which assumed statistical independence between observations obtained from staff within the same hospital, might not be justified. (bmj.com)
  • Clustering allows researchers to identify and define patterns between data elements. (surveygizmo.com)
  • Liu SH, Li Y, Liu B. Exploratory Cluster Analysis to Identify Patterns of Chronic Kidney Disease in the 500 Cities Project. (cdc.gov)
  • We used cluster analysis to explore patterns of chronic kidney disease in 500 of the largest US cities. (cdc.gov)
  • To circumvent the difficult problem of model selection, we used a data-driven analytic tool, cluster analysis, which extracts representative temporal and spatial patterns from the voxel-time series. (dtu.dk)
  • Empirically derived eating patterns using factor or cluster analysis: a review. (nih.gov)
  • This paper reviews studies performed to date that have employed cluster or factor analysis to empirically derive eating patterns. (nih.gov)
  • Since 1980, at least 93 studies were published that used cluster or factor analysis to define dietary exposures, of which 65 were used to test hypotheses or examine associations between patterns and disease outcomes or biomarkers. (nih.gov)
  • Studies were conducted in diverse populations across many countries and continents and suggest that patterns are associated with many different biomarkers and disease outcomes, whether measured by cluster or factor analysis. (nih.gov)
  • Hierarchical cluster analysis indicated distinct patterns of vestibular end-organ impairment, showing that the results for the same end-organs on both sides are more similar than to other end-organs. (frontiersin.org)
  • Hierarchical cluster analysis may help differentiate characteristic patterns of BVL. (frontiersin.org)
  • Constant upsurge of experimental data has produced new challenges in terms of maintenance, storage and analysis to derive meaningful patterns. (ssrn.com)
  • Growth in both the theory and applications of this clustering methodology has been steady since its inception. (scholarpedia.org)
  • Based on this novel methodology, we argue that verb cluster ordering in Dutch dialects can be reduced to three grammatical parameters (largely similar to the ones described in Barbiers et al. (jhu.edu)
  • It develops a methodology for clustering a large number of developing countries, identifying and ranking their welfare regimes, assessing their stability over the decade 1990-2000, and relating these to important structural variables. (bath.ac.uk)
  • In this paper, Italian school buildings' stock was analyzed by cluster analysis with the aim of providing a methodology able to identify the best energy retrofit interventions from the perspective of cost-benefit, and to correlate them with the specific characteristics of the educational buildings. (buildup.eu)
  • The cluster analysis is conducted with the aim of assigning data points (sequences) into reasonably homogenous groups (clusters). (thefreelibrary.com)
  • This study aimed to develop a novel, practical sequencing protocol that covered both conserved and variable regions of the viral genome and assess the influence of each subregion, sequence concatenation and unrelated reference sequences on phylogenetic clustering analysis. (plos.org)
  • NS5B concatenation, the inclusion of reference sequences and removal of HVR1 all influenced clustering outcome. (plos.org)
  • Seven HCV genotypes (1 to 7) with approximately 100 sub-types (1a, 1b, etc.) have been identified on the basis of molecular phylogenetic analyses of HCV sequences [4]. (plos.org)
  • 'D2_cluster: A Validated Method for Clustering EST and Full-length cDNA Sequences' John Burke, Dan Davison, and Winston Hide. (bio.net)
  • BACKGROUND: A computational system for analysis of the repetitive structure of genomic sequences is described. (jcvi.org)
  • The associated software (RepeatFinder), should prove helpful in the analysis of repeat structure for both complete and partial genome sequences. (jcvi.org)
  • Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). (wikipedia.org)
  • A criterion such as between-groups sum of squares or likelihood can be plotted against the number of clusters in a scree plot. (encyclopedia.com)
  • We have found in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function, and we find a similar tendency in human data. (pnas.org)
  • To be precise, in the first stage I need to create clusters on the basis of a set of variables, s1, and in the second stage I need to create clusters, within the groups formed in the first stage, using a different set of variables, s2. (stata.com)
  • The goal of performing a cluster analysis is to sort different objects or data points into groups in a manner that the degree of association between two objects is high if they belong to the same group, and low if they belong to different groups. (surveygizmo.com)
  • For example, when cluster analysis is performed as part of market research, specific groups can be identified within a population. (surveygizmo.com)
  • The analysis of these groups can then determine how likely a population cluster is to purchase products or services. (surveygizmo.com)
  • If these groups are defined clearly, a marketing team can then target varying clusters with tailored, targeted communication. (surveygizmo.com)
  • Cluster analysis is the automated search for groups of related observations in a data set. (psu.edu)
  • The analyst groups objects so that objects in the same group (called a cluster) are more similar to each other than to objects in other groups (clusters) in some way. (wikipedia.org)
  • Cluster Randomised Controlled Trials (cRCTs) are trials which randomise groups of patients rather than individual patients. (sheffield.ac.uk)
  • Finding Groups in Data: An Introduction to Cluster Analysis. (program-transformation.org)
  • Clustering is a form of unsupervised machine learning, where instances are organized into groups whose members share similarities. (frontiersin.org)
  • A more significant clustering is one which groups distinct combinations into separate clusters. (bookdepository.com)
  • Kaleidoscope Pro detects similar vocalizations and quickly sorts them into groups to streamline your analysis. (wildlifeacoustics.com)
  • Kaleidoscope Pro automatically scans your recordings and pulls out distinct sounds and phrases, such as frog calls or bird songs, and groups them into clusters. (wildlifeacoustics.com)
  • Cluster analysis is a systematic quantitative technique used to discover groups in data (Kaufman & Rousseeuw 1990). (yogamag.net)
  • Firstly, segmentations lie at the core of many submissions to IJMR, but simply because a cluster analysis has produced a number of discrete groups of consumers, that does not mean it provides a valid interpretation of the market. (mrs.org.uk)
  • Secondly, segmentations need to facilitate action, but often the clusters are somewhat meaningless if it is not possible to use the output in a practical way, for example, to develop a marketing strategy that can communicate differentiated messages to the target groups. (mrs.org.uk)
  • A simple method for comparing independent groups of clustered binary data with group-specific covariates is proposed. (nih.gov)
  • The cluster analysis can then identify groups of patients that have similar symptoms. (statisticssolutions.com)
  • The researcher then may use cluster analysis to identify homogenous groups of customers that have similar needs and attitudes. (statisticssolutions.com)
  • A cluster analysis can group those observations into a series of clusters and help build a taxonomy of groups and subgroups of similar plants. (statisticssolutions.com)
  • Other techniques you might want to try in order to identify similar groups of observations are Q-analysis, multi-dimensional scaling (MDS), and latent class analysis. (statisticssolutions.com)
  • The fewer groups randomised and the more individuals there are per group, the greater the potential impact of any clustering. (bmj.com)
  • The aim of the course is to provide an introduction to and understanding of the key issues in the design, analysis and reporting of cluster randomised controlled trials (cRCTs). (sheffield.ac.uk)
  • Typically, it's good to conduct some data filtering prior to the analysis: This could include removing spots that are outside of the tissue or removing spots or genes that have a low number of reads. (bioconductor.org)
  • Find clusters in biomedical data involving genes.2. (coursera.org)
  • The resulting clusters suggest functional relationships between genes, and known genes that belong to a unique functional class should provide an indication of function for unknown genes in the same clusters. (spie.org)
  • We describe a core gene cluster, comprised of eight genes (designated CTB1-8), and associated with cercosporin toxin production in Cercospora nicotianae. (wiley.com)
  • Sequence analysis identified 10 putative open reading frames (ORFs) flanking the previously characterized CTB1 and CTB3 genes that encode, respectively, the polyketide synthase and a dual methyltransferase/monooxygenase required for cercosporin production. (wiley.com)
  • Disruption of the CTB2 gene encoding a methyltransferase or the CTB8 gene yielded mutants that were completely defective in cercosporin production and inhibitory expression of the other CTB cluster genes. (wiley.com)
  • this clusters the genes into 10 clusters. (utsa.edu)
  • Get genes in different clusters and perform Gene Ontology analysis. (utsa.edu)
  • in Workshop on Clustering High Dimensional Data and its Applications, SIAM Data Mining. (springer.com)
  • Clustering techniques based on cores (representative points) are appropriate tools for data mining of large data sets. (wias-berlin.de)
  • Data mining in telecommunications: case study of cluster analysis. (thefreelibrary.com)
  • Clustering analysis has been widely applied in diverse fields such as data mining, access structures, knowledge discovery, software engineering, organization of information systems, and machine learning. (igi-global.com)
  • CART and other Salford data mining modules now include an approach to cluster analysis, density estimation and unsupervised learning using ideas that we trace to Leo Breiman, but which may have been known informally among statisticians at Stanford and elsewhere for some time. (salford-systems.com)
  • Data mining can be classified into various models such as Clustering, Decision trees, Association rules, and Sequential pattern and time series. (ssrn.com)
  • We can visualize the result of running it by turning the object to a dendrogram and making several adjustments to the object, such as: changing the labels, coloring the labels based on the real species category, and coloring the branches based on cutting the tree into three clusters. (r-project.org)
  • create a figure to visualize and compare the 4 selected clusters in more detail. (utsa.edu)
  • They give a nearly global-optimal discrete clustering solution by using singular value decomposition and nonmaximum suppression in an i. (psu.edu)
  • See http://www.jstatsoft.org/v18/i06/paper # http://www.stat.washington.edu/research/reports/2006/tr504.pdf # library(mclust) # Run the function to see how many clusters # it finds to be optimal, set it to search for # at least 1 model and up to 20. (stackoverflow.com)
  • This choice may not be optimal, as it should be made in the very beginning, when there may not exist an informal expectation of what the number of natural clusters would be. (igi-global.com)
  • consumers are clustered according to psychographic, demographic, and purchasing behavior variables. (encyclopedia.com)
  • Through close collaboration and an iterative process, a division was made into nine clusters of residents based on 25 demographic and socioeconomic characteristics. (cbs.nl)
  • Cluster analysis of 500 US cities, summarized at the state level, plus Washington, DC, based on kidney disease-related factors (unhealthy behaviors, prevention measures, and outcomes related to CKD) and adjusted for socio-demographic characteristics. (cdc.gov)
  • We used the regression residuals for each measure (hereafter "adjusted prevalence measures") in the subsequent cluster analysis, as they indicate the portion of variability that cannot be explained by the socio-demographic characteristics. (cdc.gov)
  • Demographic clusters. (sas.com)
  • This analysis technique is typically performed during the exploratory phase of research, since unlike techniques such as factor analysis, it doesn't make any distinction between dependent and independent variables. (surveygizmo.com)
  • Gasch, A.P. and M.B. Eisen, Exploring the Conditional Coregulation of Yeast Gene Expression through Fuzzy K-Means Clustering. (springer.com)
  • Although the responsible gene cluster has been identified, the biosynthetic pathway remains to be elucidated. (mdpi.com)
  • In the present study, members of the gene cluster were deleted individually in a Fusarium graminearum strain overexpressing the local transcription factor. (mdpi.com)
  • STACK_PACK 1.0 has been developed by SANBI in collaboration with Electric Genetics, Cape Town (PTY) LTD, to support analysis of the increasing EST load for Gene Discovery. (bio.net)
  • and 'A comprehensive approach to clustering of expressed human gene sequence: The Sequence Tag Alignment and Consensus Knowledgebase. (bio.net)
  • idx - cluster ID for each gene. (utsa.edu)
  • center: average expression for each gene cluster. (utsa.edu)
  • Expression of four ORFs located on the two distal ends of the cluster did not correlate with cercosporin biosynthesis and did not show regulation by CTB8, suggesting that the biosynthetic cluster was limited to CTB1-8. (wiley.com)
  • display the average expression level of each cluster in figure 5. (utsa.edu)
  • However, such analyses do not address the full potential of genome-scale experiments to alter our understanding of cellular biology by providing, through an inclusive analysis of the entire repertoire of transcripts, a continuing comprehensive window into the state of a cell as it goes through a biological process. (pnas.org)
  • Bing X, Bunea F, Royer M, Das J. Latent model-based clustering for biological discovery. (pcrm.org)
  • This book details the complete pathway of cluster analysis, from the basics of molecular biology to the generation of biological knowledge. (researchandmarkets.com)
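
Several of the snippets above mention k-means clustering and plotting a sum-of-squares criterion against the number of clusters. The following is a minimal sketch of both ideas in plain Python, under stated assumptions: the nine two-dimensional points are invented, the starting centers are taken at evenly spaced positions in the data (real analyses typically use random or more careful initialisation), and the function names are illustrative rather than taken from any of the cited sources.

```python
import math

def kmeans(points, centers, iterations=100):
    """Lloyd's algorithm: alternate assignment and center update until stable."""
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # New center is the coordinate-wise mean; an empty group keeps its center.
        new_centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, groups

def within_ss(centers, groups):
    """Total within-cluster sum of squared distances to the cluster centers."""
    return sum(math.dist(p, c) ** 2
               for c, g in zip(centers, groups) for p in g)

# Invented data: three tight groups of three points each.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]

wss = {}
for k in range(1, 5):
    start = [points[i * len(points) // k] for i in range(k)]
    centers, groups = kmeans(points, start)
    wss[k] = within_ss(centers, groups)
    print(f"k={k}: within-cluster sum of squares {wss[k]:.2f}")
```

Plotted against k, these values would show the sharp bend that a scree plot is meant to reveal: the criterion falls steeply up to three clusters and only slightly afterwards, which matches how the toy data were constructed.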