Databases, Factual: Extensive collections, reputedly complete, of facts and data garnered from material of a specialized subject area and made available for analysis and application. The collection can be automated by various contemporary methods for retrieval. The concept should be differentiated from DATABASES, BIBLIOGRAPHIC, which is restricted to collections of bibliographic references.
Databases, Genetic: Databases devoted to knowledge about specific genes and gene products.
Databases as Topic: Organized collections of computer records, standardized in format and content, that are stored in any of a variety of computer-readable modes. They are the basic sets of data from which computer-readable files are created. (from ALA Glossary of Library and Information Science, 1983)
Databases, Protein: Databases containing information about PROTEINS such as AMINO ACID SEQUENCE; PROTEIN CONFORMATION; and other properties.
Databases, Bibliographic: Extensive collections, reputedly complete, of references and citations to books, articles, publications, etc., generally on a single subject or specialized subject area. Databases can operate through automated files, libraries, or computer disks. The concept should be differentiated from DATABASES, FACTUAL, which is used for collections of data and facts apart from bibliographic references to them.
Databases, Nucleic Acid: Databases containing information about NUCLEIC ACIDS such as BASE SEQUENCE; SNPS; NUCLEIC ACID CONFORMATION; and other properties. Information about the DNA fragments kept in a GENE LIBRARY or GENOMIC LIBRARY is often maintained in DNA databases.
Internet: A loose confederation of computer communication networks around the world. The networks that make up the Internet are connected through several backbone networks.
The Internet grew out of the US Government ARPAnet project and was designed to facilitate information exchange.
Information Storage and Retrieval: Organized activities related to the storage, location, search, and retrieval of information.
Database Management Systems: Software designed to store, manipulate, manage, and control data for specific uses.
Software: Sequential operating programs and data which instruct the functioning of a digital computer.
User-Computer Interface: The portion of an interactive computer program that issues messages to and receives commands from a user.
Computational Biology: A field of biology concerned with the development of techniques for the collection and manipulation of biological data, and the use of such data to make biological discoveries or predictions. This field encompasses all computational methods and theories for solving biological problems including manipulation of models and datasets.
Systems Integration: The procedures involved in combining separately developed modules, components, or subsystems so that they work together as a complete system. (From McGraw-Hill Dictionary of Scientific and Technical Terms, 4th ed)
Algorithms: A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task.
Expressed Sequence Tags: Partial cDNA (DNA, COMPLEMENTARY) sequences that are unique to the cDNAs from which they were derived.
Randomized Controlled Trials as Topic: Works about clinical trials that involve at least one test treatment and one control treatment, concurrent enrollment and follow-up of the test- and control-treated groups, and in which the treatments to be administered are selected by a random process, such as the use of a random-numbers table.
Databases, Chemical: Databases devoted to knowledge about specific chemicals.
MEDLINE: The premier bibliographic database of the NATIONAL LIBRARY OF MEDICINE.
MEDLINE® (MEDLARS Online) is the primary subset of PUBMED and can be searched on NLM's Web site in PubMed or the NLM Gateway. MEDLINE references are indexed with MEDICAL SUBJECT HEADINGS (MeSH).
Genomics: The systematic study of the complete DNA sequences (GENOME) of organisms.
Sequence Alignment: The arrangement of two or more amino acid or base sequences from an organism or organisms in such a way as to align areas of the sequences sharing common properties. The degree of relatedness or homology between the sequences is predicted computationally or statistically based on weights assigned to the elements aligned between the sequences. This in turn can serve as a potential indicator of the genetic relatedness between the organisms.
Data Mining: Use of sophisticated analysis tools to sort through, organize, examine, and combine large sets of information.
Abstracting and Indexing as Topic: Activities performed to identify concepts and aspects of published information and research reports.
Sequence Analysis, Protein: A process that includes the determination of AMINO ACID SEQUENCE of a protein (or peptide, oligopeptide or peptide fragment) and the information analysis of the sequence.
Sequence Analysis, DNA: A multistage process that includes cloning, physical mapping, subcloning, determination of the DNA SEQUENCE, and information analysis.
Terminology as Topic: The terms, expressions, designations, or symbols used in a particular science, discipline, or specialized subject area.
Molecular Sequence Data: Descriptions of specific amino acid, carbohydrate, or nucleotide sequences which have appeared in the published literature and/or are deposited in and maintained by databanks such as GENBANK, European Molecular Biology Laboratory (EMBL), National Biomedical Research Foundation (NBRF), or other sequence repositories.
Proteins: Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several
subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein.
Information Systems: Integrated set of files, procedures, and equipment for the storage, manipulation, and retrieval of information.
PubMed: A bibliographic database that includes MEDLINE as its primary subset. It is produced by the National Center for Biotechnology Information (NCBI), part of the NATIONAL LIBRARY OF MEDICINE. PubMed, which is searchable through NLM's Web site, also includes access to additional citations to selected life sciences journals not in MEDLINE, and links to other resources such as the full-text of articles at participating publishers' Web sites, NCBI's molecular biology databases, and PubMed Central.
Computer Graphics: The process of pictorial communication, between human and computers, in which the computer input and output have the form of charts, drawings, or other appropriate pictorial representation.
Databases, Pharmaceutical: Databases devoted to knowledge about PHARMACEUTICAL PRODUCTS.
Computer Communication Networks: A system containing any combination of computers, computer terminals, printers, audio or visual display devices, or telephones interconnected by telecommunications equipment or cables: used to transmit or receive information. (Random House Unabridged Dictionary, 2d ed)
Online Systems: Systems where the input data enter the computer directly from the point of origin (usually a terminal or workstation) and/or in which output data are transmitted directly to that terminal point of origin. (Sippl, Computer Dictionary, 4th ed)
Vocabulary, Controlled: A specified list of terms with a fixed and unalterable meaning, and from which a selection is made when CATALOGING; ABSTRACTING AND INDEXING; or searching BOOKS; JOURNALS AS TOPIC; and other documents. The control is intended to avoid the scattering of related subjects under different headings (SUBJECT HEADINGS).
The list may be altered or extended only by the publisher or issuing agency. (From Harrod's Librarians' Glossary, 7th ed, p163)
Molecular Sequence Annotation: The addition of descriptive information about the function or structure of a molecular sequence to its MOLECULAR SEQUENCE DATA record.
Programming Languages: Specific languages used to prepare computer programs.
Phylogeny: The relationships of groups of organisms as reflected by their genetic makeup.
Amino Acid Sequence: The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION.
CD-ROM: An optical disk storage system for computers on which data can be read or from which data can be retrieved but not entered or modified. A CD-ROM unit is almost identical to the compact disk playback device for home use.
Periodicals as Topic: A publication issued at stated, more or less regular, intervals.
Search Engine: Software used to locate data or information stored in machine-readable form locally or at a distance such as an INTERNET site.
Software Design: Specifications and instructions applied to the software.
Gene Expression Profiling: The determination of the pattern of genes expressed at the level of GENETIC TRANSCRIPTION, under specific circumstances or in a specific cell.
Evidence-Based Medicine: An approach of practicing medicine with the goal to improve and evaluate patient care. It requires the judicious integration of best research evidence with the patient's values to make decisions about medical care. This method helps physicians make a proper diagnosis, devise the best testing plan, choose the best treatment and methods of disease prevention, and develop guidelines for large groups of patients with the same disease.
(from JAMA 296 (9), 2006)
Treatment Outcome: Evaluation undertaken to assess the results or consequences of management and procedures used in combating disease in order to determine the efficacy, effectiveness, safety, and practicability of these interventions in individual cases or series.
Review Literature as Topic: Published materials which provide an examination of recent or current literature. Review articles can cover a wide range of subject matter at various levels of completeness and comprehensiveness based on analyses of literature that may include research findings. The review may reflect the state of the art. It also includes reviews as a literary form.
United States
Genome: The genetic complement of an organism, including all of its GENES, as represented in its DNA, or in some cases, its RNA.
Protein Interaction Mapping: Methods for determining interaction between PROTEINS.
Base Sequence: The sequence of PURINES and PYRIMIDINES in nucleic acids and polynucleotides. It is also called nucleotide sequence.
Knowledge Bases: Collections of facts, assumptions, beliefs, and heuristics that are used in combination with databases to achieve desired results, such as a diagnosis, an interpretation, or a solution to a problem (From McGraw Hill Dictionary of Scientific and Technical Terms, 6th ed).
Publications: Copies of a work or document distributed to the public by sale, rental, lease, or lending. (From ALA Glossary of Library and Information Science, 1983, p181)
Reproducibility of Results: The statistical reproducibility of measurements (often in a clinical context), including the testing of instrumentation or techniques to obtain reproducible results.
The concept includes reproducibility of physiological measurements, which may be used to develop rules to assess probability or prognosis, or response to a stimulus; reproducibility of occurrence of a condition; and reproducibility of experimental results.
Publication Bias: The influence of study results on the chances of publication and the tendency of investigators, reviewers, and editors to submit or accept manuscripts for publication based on the direction or strength of the study findings. Publication bias has an impact on the interpretation of clinical trials and meta-analyses. Bias can be minimized by insistence by editors on high-quality research, thorough literature reviews, acknowledgement of conflicts of interest, modification of peer review practices, etc.
Cost-Benefit Analysis: A method of comparing the cost of a program with its expected benefits in dollars (or other currency). The benefit-to-cost ratio is a measure of total return expected per unit of money spent. This analysis generally excludes consideration of factors that are not measured ultimately in economic terms. Cost effectiveness compares alternative ways to achieve a specific set of results.
Directories as Topic: Lists of persons or organizations, systematically arranged, usually in alphabetic or classed order, giving address, affiliations, etc., for individuals, and giving address, officers, functions, and similar data for organizations. (ALA Glossary of Library and Information Science, 1983)
Documentation: Systematic organization, storage, retrieval, and dissemination of specialized information, especially of a scientific or technical nature (From ALA Glossary of Library and Information Science, 1983).
It often involves authenticating or validating information.
Medical Records Systems, Computerized: Computer-based systems for input, storage, display, retrieval, and printing of information contained in a patient's medical record.
National Library of Medicine (U.S.): An agency of the NATIONAL INSTITUTES OF HEALTH concerned with overall planning, promoting, and administering programs pertaining to advancement of medical and related sciences. Major activities of this institute include the collection, dissemination, and exchange of information important to the progress of medicine and health, research in medical informatics and support for medical library development.
Risk Factors: An aspect of personal behavior or lifestyle, environmental exposure, or inborn or inherited characteristic, which, on the basis of epidemiologic evidence, is known to be associated with a health-related condition considered important to prevent.
Natural Language Processing: Computer processing of a language with rules that reflect and describe current usage rather than prescribed usage.
Medical Record Linkage: The creation and maintenance of medical and vital records in multiple institutions in a manner that will facilitate the combined use of the records of identified individuals.
Gene Library: A large collection of DNA fragments cloned (CLONING, MOLECULAR) from a given organism, tissue, organ, or cell type.
It may contain complete genomic sequences (GENOMIC LIBRARY) or complementary DNA sequences, the latter being formed from messenger RNA and lacking intron sequences.
Research Design: A plan for collecting and utilizing data so that desired information can be obtained with sufficient precision or so that a hypothesis can be tested properly.
Metabolic Networks and Pathways: Complex sets of enzymatic reactions connected to each other via their product and substrate metabolites.
Meta-Analysis as Topic: A quantitative method of combining the results of independent studies (usually drawn from the published literature) and synthesizing summaries and conclusions which may be used to evaluate therapeutic effectiveness, plan new studies, etc., with application chiefly in the areas of research and medicine.
Bibliometrics: The use of statistical methods in the analysis of a body of literature to reveal the historical development of subject fields and patterns of authorship, publication, and use. Formerly called statistical bibliography. (from The ALA Glossary of Library and Information Science, 1983)
Pharmacoepidemiology: The science concerned with the benefit and risk of drugs used in populations and the analysis of the outcomes of drug therapies. Pharmacoepidemiologic data come from both clinical trials and epidemiological studies with emphasis on methods for the detection and evaluation of drug-related adverse effects, assessment of risk vs benefit ratios in drug therapy, patterns of drug utilization, the cost-effectiveness of specific drugs, methodology of postmarketing surveillance, and the relation between pharmacoepidemiology and the formulation and interpretation of regulatory guidelines. (Pharmacoepidemiol Drug Saf 1992;1(1); J Pharmacoepidemiol 1990;1(1))
Sequence Homology, Amino Acid: The degree of similarity between sequences of amino acids.
This information is useful for analyzing the genetic relatedness of proteins and species.
Genome, Human: The complete genetic complement contained in the DNA of a set of CHROMOSOMES in a HUMAN. The length of the human genome is about 3 billion base pairs.
Cluster Analysis: A set of statistical methods used to group variables or observations into strongly inter-related subgroups. In epidemiology, it may be used to analyze a closely grouped series of events or cases of disease or other health-related phenomenon with well-defined distribution patterns in relation to time or place or both.
Sequence Analysis: A multistage process that includes the determination of a sequence (protein, carbohydrate, etc.), its fragmentation and analysis, and the interpretation of the resulting sequence information.
Chromosome Mapping: Any method used for determining the location of and relative distances between genes on a chromosome.
Proteome: The protein complement of an organism coded for by its genome.
Great Britain
Artificial Intelligence: Theory and development of COMPUTER SYSTEMS which perform tasks that normally require human intelligence. Such tasks may include speech recognition, LEARNING; VISUAL PERCEPTION; MATHEMATICAL COMPUTING; reasoning, PROBLEM SOLVING, DECISION-MAKING, and translation of language.
Genome, Plant: The genetic complement of a plant (PLANTS) as represented in its DNA.
Subject Headings: Terms or expressions which provide the major means of access by subject to the bibliographic unit.
Proteomics: The systematic study of the complete complement of proteins (PROTEOME) of organisms.
Drug Information Services: Services providing pharmaceutic and therapeutic drug information and consultation.
Enzymes: Biological molecules that possess catalytic activity. They may occur naturally or be synthetically created.
Enzymes are usually proteins; however, CATALYTIC RNA and CATALYTIC DNA molecules have also been identified.
Medical Informatics: The field of information science concerned with the analysis and dissemination of medical data through the application of computers to various aspects of health care and medicine.
Information Dissemination: The circulation or wide dispersal of information.
Classification: The systematic arrangement of entities in any field into categories or classes based on common characteristics such as properties, morphology, subject matter, etc.
Unified Medical Language System: A research and development program initiated by the NATIONAL LIBRARY OF MEDICINE to build knowledge sources for the purpose of aiding the development of systems that help health professionals retrieve and integrate biomedical information. The knowledge sources can be used to link disparate information systems to overcome retrieval problems caused by differences in terminology and the scattering of relevant information across many databases. The three knowledge sources are the Metathesaurus, the Semantic Network, and the Specialist Lexicon.
Informatics: The field of information science concerned with the analysis and dissemination of data through the application of computers.
MEDLARS: A computerized biomedical bibliographic storage and retrieval system operated by the NATIONAL LIBRARY OF MEDICINE. MEDLARS stands for Medical Literature Analysis and Retrieval System, which was first introduced in 1964 and evolved into an online system in 1971 called MEDLINE (MEDLARS Online). As other online databases were developed, MEDLARS became the name of the entire NLM information system while MEDLINE became the name of the premier database.
MEDLARS was used to produce the former printed Cumulated Index Medicus, and the printed monthly Index Medicus, until that publication ceased in December 2004.
Data Collection: Systematic gathering of data for a particular purpose from various sources, including questionnaires, interviews, observation, existing records, and electronic devices. The process is usually preliminary to statistical analysis of the data.
Sensitivity and Specificity: Binary classification measures to assess test results. Sensitivity, or recall rate, is the proportion of actual positives that are correctly identified (true positives). Specificity is the probability of correctly determining the absence of a condition. (From Last, Dictionary of Epidemiology, 2d ed)
Molecular Biology: A discipline concerned with studying biological phenomena in terms of the chemical and physical interactions of molecules.
Evolution, Molecular: The process of cumulative change at the level of DNA; RNA; and PROTEINS, over successive generations.
International Classification of Diseases: A system of categories to which morbid entities are assigned according to established criteria. Included is the entire range of conditions in a manageable number of categories, grouped to facilitate mortality reporting. It is produced by the World Health Organization (From ICD-10, p1). The Clinical Modifications, produced by the UNITED STATES DEPT. OF HEALTH AND HUMAN SERVICES, are larger extensions used for morbidity and general epidemiological purposes, primarily in the U.S.
Sequence Analysis, RNA: A multistage process that includes cloning, physical mapping, subcloning, sequencing, and information analysis of an RNA SEQUENCE.
Dictionaries as Topic: Lists of words, usually in alphabetical order, giving information about form, pronunciation, etymology, grammar, and meaning.
Pattern Recognition, Automated: In INFORMATION RETRIEVAL, machine-sensing or identification of visible patterns (shapes, forms, and configurations).
(Harrod's Librarians' Glossary, 7th ed)
Oligonucleotide Array Sequence Analysis: Hybridization of a nucleic acid sample to a very large set of OLIGONUCLEOTIDE PROBES, which have been attached individually in columns and rows to a solid support, to determine a BASE SEQUENCE, or to detect variations in a gene sequence, GENE EXPRESSION, or for GENE MAPPING.
Biomedical Research: Research that involves the application of the natural sciences, especially biology and physiology, to medicine.
Hypermedia: Computerized compilations of information units (text, sound, graphics, and/or video) interconnected by logical nonlinear linkages that enable users to follow optimal paths through the material and also the systems used to create and display this information. (From Thesaurus of ERIC Descriptors, 1994)
Biomedical Technology: The application of technology to the solution of medical problems.
DNA, Complementary: Single-stranded complementary DNA synthesized from an RNA template by the action of RNA-dependent DNA polymerase. cDNA (i.e., complementary DNA, not circular DNA, not C-DNA) is used in a variety of molecular cloning experiments as well as serving as a specific hybridization probe.
Risk Assessment: The qualitative or quantitative estimation of the likelihood of adverse effects that may result from exposure to specified health hazards or from the absence of beneficial influences. (Last, Dictionary of Epidemiology, 1988)
Incidence: The number of new cases of a given disease during a given period in a specified population. It also is used for the rate at which new events occur in a defined population.
It is differentiated from PREVALENCE, which refers to all cases, new or old, in the population at a given time.
Human Genome Project: A coordinated effort of researchers to map (CHROMOSOME MAPPING) and sequence (SEQUENCE ANALYSIS, DNA) the human GENOME.
Outcome Assessment (Health Care): Research aimed at assessing the quality and effectiveness of health care as measured by the attainment of a specified end result or outcome. Measures include parameters such as improved health, lowered morbidity or mortality, and improvement of abnormal states (such as elevated blood pressure).
Genome, Bacterial: The genetic complement of a bacterium (BACTERIA) as represented in its DNA.
Open Reading Frames: A sequence of successive nucleotide triplets that are read as CODONS specifying AMINO ACIDS; it begins with an INITIATOR CODON and ends with a stop codon (CODON, TERMINATOR).
Publishing: "The business or profession of the commercial production and issuance of literature" (Webster's 3d). It includes the publisher, publication processes, editing and editors. Production may be by conventional printing methods or by electronic publishing.
Retrospective Studies: Studies used to test etiologic hypotheses in which inferences about an exposure to putative causal factors are derived from data relating to characteristics of persons under study or to events or experiences in their past. The essential feature is that some of the persons under study have the disease or outcome of interest and their characteristics are compared with those of unaffected persons.
Adverse Drug Reaction Reporting Systems: Systems developed for collecting reports from government agencies, manufacturers, hospitals, physicians, and other sources on adverse drug reactions.
Data Interpretation, Statistical: Application of statistical procedures to analyze specific observed or assumed facts from a particular study.
Multigene Family: A set of genes descended by duplication and variation from some ancestral gene.
Such genes may be clustered together on the same chromosome or dispersed on different chromosomes. Examples of multigene families include those that encode the hemoglobins, immunoglobulins, histocompatibility antigens, actins, tubulins, keratins, collagens, heat shock proteins, salivary glue proteins, chorion proteins, cuticle proteins, yolk proteins, and phaseolins, as well as histones, ribosomal RNA, and transfer RNA genes. The latter three are examples of reiterated genes, where hundreds of identical genes are present in a tandem array. (King & Stanfield, A Dictionary of Genetics, 4th ed)
Cohort Studies: Studies in which subsets of a defined population are identified. These groups may or may not be exposed to factors hypothesized to influence the probability of the occurrence of a particular disease or other outcome. Cohorts are defined populations which, as a whole, are followed in an attempt to determine distinguishing subgroup characteristics.
Sequence Homology: The degree of similarity between sequences. Studies of AMINO ACID SEQUENCE HOMOLOGY and NUCLEIC ACID SEQUENCE HOMOLOGY provide useful information about the genetic relatedness of genes, gene products, and species.
Technology Assessment, Biomedical: Evaluation of biomedical technology in relation to cost, efficacy, utilization, etc., and its future impact on social, ethical, and legal systems.
Semantics: The relationships between symbols and their meanings.
Controlled Clinical Trials as Topic: Works about clinical trials involving one or more test treatments, at least one control treatment, specified outcome measures for evaluating the studied intervention, and a bias-free method for assigning patients to the test treatment. The treatment may be drugs, devices, or procedures studied for diagnostic, therapeutic, or prophylactic effectiveness. Control measures include placebos, active medicines, no-treatment, dosage forms and regimens, historical comparisons, etc.
When randomization using mathematical techniques, such as the use of a random numbers table, is employed to assign patients to test or control treatments, the trials are characterized as RANDOMIZED CONTROLLED TRIALS AS TOPIC.
Records as Topic: The commitment in writing, as authentic evidence, of something having legal importance. The concept includes certificates of birth, death, etc., as well as hospital, medical, and other institutional records.
Disease: A definite pathologic process with a characteristic set of signs and symptoms. It may affect the whole body or any of its parts, and its etiology, pathology, and prognosis may be known or unknown.
Biology: One of the BIOLOGICAL SCIENCE DISCIPLINES concerned with the origin, structure, development, growth, function, genetics, and reproduction of animals, plants, and microorganisms.
Conserved Sequence: A sequence of amino acids in a polypeptide or of nucleotides in DNA or RNA that is similar across multiple species. A known set of conserved sequences is represented by a CONSENSUS SEQUENCE. AMINO ACID MOTIFS are often composed of conserved sequences.
Automatic Data Processing: Data processing largely performed by automatic means.
Reference Books: Books designed by the arrangement and treatment of their subject matter to be consulted for definite items of information rather than to be read consecutively. Reference books include DICTIONARIES; ENCYCLOPEDIAS; ATLASES; etc. (From the ALA Glossary of Library and Information Science, 1983)
Polymorphism, Single Nucleotide: A single nucleotide variation in a genetic sequence that occurs at appreciable frequency in the population.
Odds Ratio: The ratio of two odds. The exposure-odds ratio for case control data is the ratio of the odds in favor of exposure among cases to the odds in favor of exposure among noncases. The disease-odds ratio for a cohort or cross section is the ratio of the odds in favor of disease among the exposed to the odds in favor of disease among the unexposed.
The prevalence-odds ratio refers to an odds ratio derived cross-sectionally from studies of prevalent cases.
Europe
Hospital Information Systems: Integrated, computer-assisted systems designed to store, manipulate, and retrieve information concerned with the administrative and clinical aspects of providing medical services within the hospital.
Toxicology: The science concerned with the detection, chemical composition, and biological action of toxic substances or poisons and the treatment and prevention of toxic manifestations.
Quality Control: A system for verifying and maintaining a desired level of quality in a product or process by careful planning, use of proper equipment, continued inspection, and corrective action as required. (Random House Unabridged Dictionary, 2d ed)
Computer Simulation: Computer-based representation of physical systems and phenomena such as chemical processes.
Time Factors: Elements of limited time intervals, contributing to particular results or situations.
Information Services: Organized services to provide information on any questions an individual might have using databases and other sources. (From Random House Unabridged Dictionary, 2d ed)
Computer Systems: Systems composed of a computer or computers, peripheral equipment, such as disks, printers, and terminals, and telecommunications capabilities.
Models, Statistical: Statistical formulations or analyses which, when applied to data and found to fit the data, are then used to verify the assumptions and parameters used in the analysis. Examples of statistical models are the linear model, binomial model, polynomial model, two-parameter model, etc.
Medical Informatics Computing: Precise procedural mathematical and logical operations utilized in the study of medical information pertaining to health care.
Quality-Adjusted Life Years: A measurement index derived from a modification of standard life-table procedures and designed to take account of the quality as well as the duration of survival.
This index can be used in assessing the outcome of health care procedures or services. (BIOETHICS Thesaurus, 1994)
Genetic Variation: Genotypic differences observed among individuals in a population.
Workflow: Description of the pattern of recurrent functions or procedures frequently found in organizational processes, such as notification, decision, and action.
Contig Mapping: Overlapping of cloned or sequenced DNA to construct a continuous region of a gene, chromosome or genome.
Data Compression: Information application based on a variety of coding methods to minimize the amount of data to be stored, retrieved, or transmitted. Data compression can be applied to various forms of data, such as images and signals. It is used to reduce costs and increase efficiency in the maintenance of large volumes of data.
Clinical Coding: Process of substituting a symbol or code for a term such as a diagnosis or procedure. (from Slee's Health Care Terms, 3d ed.)
Registries: The systems and processes involved in the establishment, support, management, and operation of registers, e.g., disease registers.
Software Validation: The act of testing the software for compliance with a standard.
Transcriptome: The pattern of GENE EXPRESSION at the level of genetic transcription in a specific organism or under specific circumstances in specific cells.
Computer Security: Protective measures against unauthorized access to or interference with computer operating systems, telecommunications, or data structures, especially the modification, deletion, destruction, or release of data in computers. It includes methods of forestalling interference by computer viruses or so-called computer hackers aiming to compromise stored data.
Pharmaceutical Preparations: Drugs intended for human or veterinary use, presented in their finished dosage form.
Included here are materials used in the preparation and/or formulation of the finished dosage form.Canada: The largest country in North America, comprising 10 provinces and three territories. Its capital is Ottawa.Hospitalization: The confinement of a patient in a hospital.Forensic Genetics: The application of genetic analyses and MOLECULAR DIAGNOSTIC TECHNIQUES to legal matters and crime analysis.Health Services Research: The integration of epidemiologic, sociological, economic, and other analytic sciences in the study of health services. Health services research is usually concerned with relationships between need, demand, supply, use, and outcome of health services. The aim of the research is evaluation, particularly in terms of structure, process, output, and outcome. (From Last, Dictionary of Epidemiology, 2d ed)Catalogs as Topic: Ordered compilations of item descriptions and sufficient information to afford access to them.Neoplasms: New abnormal growth of tissue. Malignant neoplasms show a greater degree of anaplasia and have the properties of invasion and metastasis, compared to benign neoplasms.Drug-Related Side Effects and Adverse Reactions: Disorders that result from the intended use of PHARMACEUTICAL PREPARATIONS. Included in this heading are a broad variety of chemically-induced adverse conditions due to toxicity, DRUG INTERACTIONS, and metabolic effects of pharmaceuticals.Data Display: The visual display of data in a man-machine system. An example is when data is called from the computer and transmitted to a CATHODE RAY TUBE DISPLAY or LIQUID CRYSTAL display.Clinical Trials as Topic: Works about pre-planned studies of the safety, efficacy, or optimum dosage schedule (if appropriate) of one or more diagnostic, therapeutic, or prophylactic drugs, devices, or techniques selected according to predetermined criteria of eligibility and observed for predefined evidence of favorable and unfavorable effects. 
This concept includes clinical trials conducted both in the U.S. and in other countries.Confidentiality: The privacy of information and its protection against unauthorized disclosure.Models, Economic: Statistical models of the production, distribution, and consumption of goods and services, as well as of financial considerations. For the application of statistics to the testing and quantifying of economic theories MODELS, ECONOMETRIC is available.Prevalence: The total number of cases of a given disease in a specified population at a designated time. It is differentiated from INCIDENCE, which refers to the number of new cases in the population at a given time.Metabolomics: The systematic identification and quantitation of all the metabolic products of a cell, tissue, organ, or organism under varying conditions. The METABOLOME of a cell or organism is a dynamic collection of metabolites which represent its net response to current conditions.Sequence Homology, Nucleic Acid: The sequential correspondence of nucleotides in one nucleic acid molecule with those of another nucleic acid molecule. Sequence homology is an indication of the genetic relatedness of different organisms and gene function.Metabolism: The chemical reactions that occur within the cells, tissues, or an organism. These processes include both the biosynthesis (ANABOLISM) and the breakdown (CATABOLISM) of organic materials utilized by the living organism.United States Department of Veterans Affairs: A cabinet department in the Executive Branch of the United States Government concerned with overall planning, promoting, and administering programs pertaining to VETERANS. It was established March 15, 1989 as a Cabinet-level position.Genetic Diseases, Inborn: Diseases that are caused by genetic mutations present during embryo or fetal development, although they may be observed later in life. 
The mutations may be inherited from a parent's genome or they may be acquired in utero.Species Specificity: The restriction of a characteristic behavior, anatomical structure or physical system, such as immune response; metabolic response, or gene or gene variant to the members of one species. It refers to that property which differentiates one species from another but it is also used for phylogenetic levels higher or lower than the species.Research: Critical and exhaustive investigation or experimentation, having for its aim the discovery of new facts and their correct interpretation, the revision of accepted conclusions, theories, or laws in the light of newly discovered facts, or the practical application of such new or revised conclusions, theories, or laws. (Webster, 3d ed)High-Throughput Nucleotide Sequencing: Techniques of nucleotide sequence analysis that increase the range, complexity, sensitivity, and accuracy of results by greatly increasing the scale of operations and thus the number of nucleotides, and the number of copies of each nucleotide sequenced. The sequencing may be done by analysis of the synthesis or ligation products, hybridization to preexisting sequences, etc.Models, Theoretical: Theoretical representations that simulate the behavior or activity of systems, processes, or phenomena. 
They include the use of mathematical equations, computers, and other electronic equipment.Markov Chains: A stochastic process such that the conditional probability distribution for a state at any future instant, given the present state, is unaffected by any additional knowledge of the past history of the system.Infant, Newborn: An infant during the first month after birth.Cloning, Molecular: The insertion of recombinant DNA molecules from prokaryotic and/or eukaryotic sources into a replicating vehicle, such as a plasmid or virus vector, and the introduction of the resultant hybrid molecules into recipient cells without altering the viability of those cells.Genes, Plant: The functional hereditary units of PLANTS.Protein Structure, Tertiary: The level of protein structure in which combinations of secondary protein structures (alpha helices, beta sheets, loop regions, and motifs) pack together to form folded shapes called domains. Disulfide bridges between cysteines in two different parts of the polypeptide chain along with other interactions between the chains play a role in the formation and stabilization of tertiary structure. Small proteins usually consist of only one domain but larger proteins may contain a number of domains connected by segments of polypeptide chain which lack regular secondary structure.Models, Molecular: Models used experimentally or theoretically to study molecular shape, electronic properties, or interactions; includes analogous molecules, computer-generated graphics, and mechanical structures.Models, Genetic: Theoretical representations that simulate the behavior or activity of genetic processes or phenomena. 
They include the use of mathematical equations, computers, and other electronic equipment.Protein Interaction Maps: Graphs representing sets of measurable, non-covalent physical contacts with specific PROTEINS in living organisms or in cells.Bias (Epidemiology): Any deviation of results or inferences from the truth, or processes leading to such deviation. Bias can result from several sources: one-sided or systematic variations in measurement from the true value (systematic error); flaws in study design; deviation of inferences, interpretations, or analyses based on flawed data or data collection; etc. There is no sense of prejudice or subjectivity implied in the assessment of bias under these conditions.Internationality: The quality or state of relating to or affecting two or more nations. (After Merriam-Webster Collegiate Dictionary, 10th ed)Names: Personal names, given or surname, as cultural characteristics, as ethnological or religious patterns, as indications of the geographic distribution of families and inbreeding, etc. Analysis of isonymy, the quality of having the same or similar names, is useful in the study of population genetics. NAMES is used also for the history of names or name changes of corporate bodies, such as medical societies, universities, hospitals, government agencies, etc.Insurance Claim Reporting: The design, completion, and filing of forms with the insurer.Pregnancy: The status during which female mammals carry their developing young (EMBRYOS or FETUSES) in utero before birth, beginning from FERTILIZATION to BIRTH.Epidemiologic Studies: Studies designed to examine associations, commonly, hypothesized causal relations. They are usually concerned with identifying or measuring the effects of risk factors or exposures. 
The common types of analytic study are CASE-CONTROL STUDIES; COHORT STUDIES; and CROSS-SECTIONAL STUDIES.Medical Informatics Applications: Automated systems applied to the patient care process including diagnosis, therapy, and systems of communicating medical data within the health care setting.Structural Homology, Protein: The degree of 3-dimensional shape similarity between proteins. It can be an indication of distant AMINO ACID SEQUENCE HOMOLOGY and used for rational DRUG DESIGN.Genetic Predisposition to Disease: A latent susceptibility to disease at the genetic level, which may be activated under certain conditions.Gene Regulatory Networks: Interacting DNA-encoded regulatory subsystems in the GENOME that coordinate input from activator and repressor TRANSCRIPTION FACTORS during development, cell differentiation, or in response to environmental cues. The networks function to ultimately specify expression of particular sets of GENES for specific conditions, times, or locations.Case-Control Studies: Studies which start with the identification of persons with a disease of interest and a control (comparison, referent) group without the disease. The relationship of an attribute to the disease is examined by comparing diseased and non-diseased persons with regard to the frequency or levels of the attribute in each group.Quebec: A province of eastern Canada. Its capital is Quebec. The region belonged to France from 1627 to 1763 when it was lost to the British. The name is from the Algonquian quilibek meaning the place where waters narrow, referring to the gradually narrowing channel of the St. Lawrence or to the narrows of the river at Cape Diamond. 
(From Webster's New Geographical Dictionary, 1988, p993 & Room, Brewer's Dictionary of Names, 1992, p440)Government Publications as Topic: Discussion of documents issued by local, regional, or national governments or by their agencies or subdivisions.Drug Prescriptions: Directions written for the obtaining and use of DRUGS.Bayes Theorem: A theorem in probability theory named for Thomas Bayes (1702-1761). In epidemiology, it is used to obtain the probability of disease in a group of people with some characteristic on the basis of the overall rate of that disease and of the likelihood of that characteristic in healthy and diseased individuals. The most familiar application is in clinical decision analysis where it is used for estimating the probability of a particular diagnosis given the appearance of some symptoms or test result.Automation: Controlled operation of an apparatus, process, or system by mechanical or electronic devices that take the place of human organs of observation, effort, and decision. (From Webster's Collegiate Dictionary, 1993)Ontario: A province of Canada lying between the provinces of Manitoba and Quebec. Its capital is Toronto. It takes its name from Lake Ontario which is said to represent the Iroquois oniatariio, beautiful lake. 
(From Webster's New Geographical Dictionary, 1988, p892 & Room, Brewer's Dictionary of Names, 1992, p391)Access to Information: Individual's rights to obtain and use information collected or generated by others.Patents as Topic: Exclusive legal rights or privileges applied to inventions, plants, etc.Mutation: Any detectable and heritable change in the genetic material that causes a change in the GENOTYPE and which is transmitted to daughter cells and to succeeding generations.Forecasting: The prediction or projection of the nature of future problems or existing conditions based upon the extrapolation or interpretation of existing scientific data or by the application of scientific methodology.Electronic Health Records: Media that facilitate transportability of pertinent information concerning patient's illness across varied providers and geographic locations. Some versions include direct linkages to online consumer health information that is relevant to the health conditions and treatments related to a specific patient.Mass Spectrometry: An analytical method used in determining the identity of a chemical based on its mass using mass analyzers/mass spectrometers.Age Factors: Age as a constituent element or influence contributing to the production of a result. It may be applicable to the cause or the effect of a circumstance. 
It is used with human or animal concepts but should be differentiated from AGING, a physiological process, and TIME FACTORS which refers only to the passage of time.Observational Study as Topic: A clinical study in which participants may receive diagnostic, therapeutic, or other types of interventions, but the investigator does not assign participants to specific interventions (as in an interventional study).Forms and Records Control: A management function in which standards and guidelines are developed for the development, maintenance, and handling of forms and records.Genes: A category of nucleic acid sequences that function as units of heredity and which code for the basic instructions for the development, reproduction, and maintenance of organisms.

mRNA:guanine-N7 cap methyltransferases: identification of novel members of the family, evolutionary analysis, homology modeling, and analysis of sequence-structure-function relationships. (1/5719)

BACKGROUND: The 5'-terminal cap structure plays an important role in many aspects of mRNA metabolism. Capping enzymes encoded by viruses and pathogenic fungi are attractive targets for specific inhibitors. There is a large body of experimental data on viral and cellular methyltransferases (MTases) that carry out guanine-N7 (cap 0) methylation, including results of extensive mutagenesis. However, a crystal structure is not available and cap 0 MTases are too divergent from other MTases of known structure to allow straightforward homology-based interpretation of these data. RESULTS: We report a 3D model of cap 0 MTase, developed using sequence-to-structure threading and comparative modeling based on coordinates of the glycine N-methyltransferase. Analysis of the predicted structural features in the phylogenetic context of the cap 0 MTase family allows us to rationalize most of the experimental data available and to propose potential binding sites. We identified a case of correlated mutations in the cofactor-binding site of viral MTases that may be important for rational drug design. Furthermore, database searches and phylogenetic analysis revealed a novel subfamily of hypothetical MTases from plants, distinct from "orthodox" cap 0 MTases. CONCLUSIONS: Computational methods were used to infer the evolutionary relationships and predict the structure of eukaryotic cap MTase. Identification of novel cap MTase homologs suggests candidates for cloning and biochemical characterization, while the structural model will be useful in designing new experiments to better understand the molecular function of cap MTases.  (+info)

SCOPE: a probabilistic model for scoring tandem mass spectra against a peptide database. (2/5719)

Proteomics, or the direct analysis of the expressed protein components of a cell, is critical to our understanding of cellular biological processes in normal and diseased tissue. A key requirement for its success is the ability to identify proteins in complex mixtures. Recent technological advances in tandem mass spectrometry have made it the method of choice for high-throughput identification of proteins. Unfortunately, the software for unambiguously identifying peptide sequences has not kept pace with the recent hardware improvements in mass spectrometry instruments. Critical for reliable high-throughput protein identification, scoring functions evaluate the quality of a match between experimental spectra and a database peptide. Current scoring function technology relies heavily on ad-hoc parameterization and manual curation by experienced mass spectrometrists. In this work, we propose a two-stage stochastic model for the observed MS/MS spectrum, given a peptide. Our model explicitly incorporates fragment ion probabilities, noisy spectra, and instrument measurement error. We describe how to compute this probability-based score efficiently, using a dynamic programming technique. A prototype implementation demonstrates the effectiveness of the model.  (+info)

An insight into domain combinations. (3/5719)

Domains are the building blocks of all globular proteins, and are units of compact three-dimensional structure as well as evolutionary units. There is a limited repertoire of domain families, so that these domain families are duplicated and combined in different ways to form the set of proteins in a genome. Proteins are gene products. The processes that produce new genes are duplication and recombination as well as gene fusion and fission. We attempt to gain an overview of these processes by studying the structural domains in the proteins of seven genomes from the three kingdoms of life: Eubacteria, Archaea and Eukaryota. We use here the domain and superfamily definitions in Structural Classification of Proteins Database (SCOP) in order to map pairs of adjacent domains in genome sequences in terms of their superfamily combinations. We find 624 out of the 764 superfamilies in SCOP in these genomes, and the 624 families occur in 585 pairwise combinations. Most families are observed in combination with one or two other families, while a few families are very versatile in their combinatorial behaviour. This type of pattern can be described by a scale-free network. Finally, we study domain repeats and we compare the set of the domain combinations in the genomes to those in PDB, and discuss the implications for structural genomics.  (+info)
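
The pair-counting step described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the protein names, the superfamily labels (`sf1`, `sf2`, ...) and the choice to treat pairs as ordered are hypothetical and are not taken from the paper or from SCOP.

```python
from collections import Counter, defaultdict

def adjacent_pairs(domains):
    """Return pairs of neighbouring superfamilies along one protein (N- to C-terminus)."""
    return list(zip(domains, domains[1:]))

def combination_stats(proteins):
    """Count adjacent superfamily pairs and record each superfamily's partners."""
    pair_counts = Counter()
    partners = defaultdict(set)
    for domains in proteins.values():
        for a, b in adjacent_pairs(domains):
            pair_counts[(a, b)] += 1   # assumption: order along the chain matters
            partners[a].add(b)
            partners[b].add(a)
    return pair_counts, partners

# Hypothetical toy genome: protein name -> ordered superfamily assignments.
proteins = {
    "protA": ["sf1", "sf2"],
    "protB": ["sf1", "sf2", "sf3"],
    "protC": ["sf4"],          # single-domain protein, contributes no pair
    "protD": ["sf1", "sf3"],
}
pair_counts, partners = combination_stats(proteins)
```

In this toy input, a superfamily with many entries in `partners` would correspond to one of the "versatile" families the abstract mentions; the degree distribution over `partners` is what one would inspect for scale-free behaviour.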

Generating protein interaction maps from incomplete data: application to fold assignment. (4/5719)

MOTIVATION: We present a framework to generate comprehensive overviews of protein-protein interactions. In the post-genomic view of cellular function, each biological entity is seen in the context of a complex network of interactions. Accordingly, we model functional space by representing protein-protein-interaction data as undirected graphs. We suggest a general approach to generate interaction maps of cellular networks in the presence of huge amounts of fragmented and incomplete data, and to derive representations of large networks which hide clutter while keeping the essential architecture of the interaction space. This is achieved by contracting the graphs according to domain-specific hierarchical classifications. The key concept here is the notion of induced interaction, which allows the integration, comparison and analysis of interaction data from different sources and different organisms at a given level of abstraction. RESULTS: We apply this approach to compute the overlap between the DIP compendium of interaction data and a dataset of yeast two-hybrid experiments. The architecture of this network is scale-free, as frequently seen in biological networks, and this property persists through many levels of abstraction. Connections in the network can be projected downwards from higher levels of abstraction down to the level of individual proteins. As an example, we describe an algorithm for fold assignment by network context. This method currently predicts protein folds at 30% accuracy without any requirement of detectable sequence similarity of the query protein to a protein of known structure. We used this algorithm to compile a list of structural assignments for previously unassigned genes from yeast. Finally we discuss ways forward to use interaction networks for the prediction of novel protein-protein interactions. AVAILABILITY: http://www.ebi.ac.uk/~lappe/FoldPred/.  (+info)
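
A minimal sketch of the graph contraction and "induced interaction" idea, assuming a flat mapping from each protein to one class at the chosen level of abstraction. The function name `contract`, the protein identifiers and the fold labels are hypothetical; the paper's actual data structures are not described in the abstract.

```python
def contract(edges, classification):
    """Map each protein-level edge to the class-level edge it induces."""
    induced = set()
    for u, v in edges:
        cu, cv = classification.get(u), classification.get(v)
        if cu is None or cv is None:
            continue  # assumption: unclassified proteins are skipped
        # frozenset makes the edge undirected; a one-element set is a self-interaction
        induced.add(frozenset((cu, cv)))
    return induced

# Hypothetical protein -> fold classification.
fold_of = {"p1": "foldA", "p2": "foldB", "p3": "foldA", "p4": "foldB", "p5": "foldA"}

dataset_1 = [("p1", "p2"), ("p3", "p4"), ("p1", "p5")]
dataset_2 = [("p2", "p3")]

induced_1 = contract(dataset_1, fold_of)
induced_2 = contract(dataset_2, fold_of)

# Two datasets with no protein-level edge in common can still overlap
# once both are viewed at the fold level.
overlap = induced_1 & induced_2
```

Comparing datasets after contraction is one way to read the abstract's statement that induced interactions allow comparison "at a given level of abstraction"; repeating the contraction with a coarser classification would give the next level up.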

Prediction of the coupling specificity of G protein coupled receptors to their G proteins. (5/5719)

G protein coupled receptors (GPCRs) are found in great numbers in most eukaryotic genomes. They are responsible for sensing a staggering variety of structurally diverse ligands, with their activation resulting in the initiation of a variety of cellular signalling cascades. The physiological response that is observed following receptor activation is governed by the guanine nucleotide-binding proteins (G proteins) to which a particular receptor chooses to couple. Previous investigations have demonstrated that the specificity of the receptor-G protein interaction is governed by the intracellular domains of the receptor. Despite many studies it has proven very difficult to predict de novo, from the receptor sequence alone, the G proteins to which a GPCR is most likely to couple. We have used a data-mining approach, combining pattern discovery with membrane topology prediction, to find patterns of amino acid residues in the intracellular domains of GPCR sequences that are specific for coupling to a particular functional class of G proteins. A prediction system was then built upon these discovered patterns. This approach was successful in predicting the G protein coupling specificity of unknown sequences. Such predictions should be of great use in providing in silico characterisation of newly cloned receptor sequences and for improving the annotation of GPCRs stored in protein sequence databases. AVAILABILITY: http://www.ebi.ac.uk/~croning/coupling.html.  (+info)

Non-symmetric score matrices and the detection of homologous transmembrane proteins. (6/5719)

Given a transmembrane protein, we wish to find related ones by a database search. Due to the strongly hydrophobic amino acid composition of transmembrane domains, suboptimal results are obtained when general-purpose scoring matrices such as BLOSUM are used. Recently, a transmembrane-specific score matrix called PHAT was shown to perform much better than BLOSUM. In this article, we derive a transmembrane score matrix family, called SLIM, which has several distinguishing features. In contrast to currently used matrices, SLIM is non-symmetric. The asymmetry arises because different background compositions are assumed for the transmembrane query and the unknown database sequences. We describe the mathematical model behind SLIM in detail and show that SLIM outperforms PHAT both on simulated data and in a realistic setting. Since non-symmetric score matrices are a new concept in database search methods, we discuss some important theoretical and practical issues.  (+info)
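
To see why different background compositions make a score matrix non-symmetric, consider the generic log-odds form s(a, b) = log2( q(a, b) / (p_query(a) * p_db(b)) ). The sketch below uses that generic form only; it is not SLIM's actual derivation, and the two-letter alphabet and all probabilities are made-up toy numbers chosen for easy arithmetic.

```python
import math

def log_odds(joint, bg_query, bg_db):
    """Log-odds scores with separate backgrounds for query and database residues."""
    return {
        (a, b): math.log2(p / (bg_query[a] * bg_db[b]))
        for (a, b), p in joint.items()
    }

# Toy, symmetric target frequencies q(a, b) over a two-letter alphabet.
joint = {("L", "L"): 0.4, ("L", "K"): 0.1, ("K", "L"): 0.1, ("K", "K"): 0.4}

# Hypothetical backgrounds: hydrophobic-rich query, uniform database.
bg_query = {"L": 0.8, "K": 0.2}
bg_db = {"L": 0.5, "K": 0.5}

scores = log_odds(joint, bg_query, bg_db)
# scores[("L", "K")] and scores[("K", "L")] differ even though joint is symmetric.
```

With identical backgrounds the two off-diagonal scores would coincide, which is the usual symmetric case of matrices such as BLOSUM.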

Improved prediction of the number of residue contacts in proteins by recurrent neural networks. (7/5719)

Knowing the number of residue contacts in a protein is crucial for deriving constraints useful in modeling protein folding, protein structure, and/or scoring remote homology searches. Here we use an ensemble of bi-directional recurrent neural network architectures and evolutionary information to improve the state-of-the-art in contact prediction using a large corpus of curated data. The ensemble is used to discriminate between two different states of residue contacts, characterized by a contact number higher or lower than the average value of the residue distribution. The ensemble achieves performances ranging from 70.1% to 73.1% depending on the radius adopted to discriminate contacts (6 Å to 12 Å). These performances represent gains of 15% to 20% over baseline statistical predictors that always assign an amino acid to the most numerous state, and are 3% to 7% better than any previous method. Combination of different radius predictors further improves the performance. SERVER: http://promoter.ics.uci.edu/BRNN-PRED/.  (+info)
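
The two-state target that the ensemble is trained to predict can be illustrated as follows. This sketch only builds the labels, not the neural network. It assumes one representative 3D point per residue and a single global average as the threshold; the abstract's "average value of the residue distribution" may instead be computed per residue type, and the coordinates below are invented.

```python
import math

def contact_numbers(coords, radius):
    """For each residue, count the other residues lying within `radius` of it."""
    counts = []
    for i, p in enumerate(coords):
        n = sum(1 for j, q in enumerate(coords)
                if j != i and math.dist(p, q) <= radius)
        counts.append(n)
    return counts

def two_state_labels(counts):
    """Label 1 if the contact number is above the mean, else 0 (assumed global mean)."""
    mean = sum(counts) / len(counts)
    return [1 if c > mean else 0 for c in counts]

# Invented coordinates in angstroms: three residues close together, one far away.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
counts = contact_numbers(coords, radius=6.5)
labels = two_state_labels(counts)
```

Changing `radius` between 6 and 12 changes both the counts and the threshold, which is why the abstract reports a separate accuracy for each radius.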

Protein-protein interaction map inference using interacting domain profile pairs. (8/5719)

A number of predictive methods have been designed to predict protein interaction from sequence or expression data. On the experimental front, however, high-throughput proteomics technologies are starting to yield large volumes of protein-protein interaction data. High-quality experimental protein interaction maps constitute the natural dataset upon which to build interaction predictions. This motivates the development of the first interaction-based protein interaction map prediction algorithm. A technique to predict protein-protein interaction maps across organisms is introduced, the 'interaction-domain pair profile' method. The method uses a high-quality protein interaction map with interaction domain information as input to predict an interaction map in another organism. It combines sequence similarity searches with clustering based on interaction patterns and interaction domain information. We apply this approach to the prediction of an interaction map of Escherichia coli from the recently published interaction map of the human gastric pathogen Helicobacter pylori. Results are compared with predictions of a second inference method based only on full-length protein sequence similarity - the "naive" method. The domain-based method is shown to i) eliminate a significant number of the naive method's false positives that are the consequence of multi-domain proteins; ii) increase the sensitivity compared to the naive method by identifying new potential interactions. AVAILABILITY: Contact the authors.  (+info)

Chris Bizon, Andreas Prlic. Calculating All Pairwise Similarities from the RCSB Protein Data Bank: Client/Server Work Distribution on the Open Science Grid,

The average sequence length in UniProtKB/Swiss-Prot is 359 amino acids. The shortest sequence is GWA_SEPOF (P83570): 2 amino acids. The longest sequence is TITIN_MOUSE (A2ASS6): 35213 amino acids.

4. JOURNAL CITATIONS

Note: the following citation statistics reflect the number of distinct journal citations.

Total number of journals cited in this release of UniProtKB/Swiss-Prot: 2753

4.1 Table of the frequency of journal citations

Journals cited
      1x:  866
      2x:  391
      3x:  163
      4x:  136
      5x:  110
      6x:  100
      7x:   66
      8x:   56
      9x:   37
     10x:   33
 11- 20x:  219
 21- 50x:  227
 51-100x:  118
   >100x:  231

4.2 List of the most cited journals in UniProtKB/Swiss-Prot

Nb  Citations  Journal name
--  ---------  -------------------------------------------------------------
 1      24439  Journal of Biological Chemistry
 2      11326  Proceedings of the National Academy of Sciences of the U.S.A.
 3       6606  Journal of Bacteriology
 4       5619  Biochemical and Biophysical Research Communications
 5       5312  Biochemistry
 6       4984  Nucleic Acids Research
 7       4816  FEBS Letters
 8  ...

ABSTRACT. I first posed this question in an Editorial in 2005. Well the future is now, so what is the answer to the question? I will give you at least my opinion of an answer and back it up with work that we and others have been doing at this interface. My own experience will be drawn from our database work with the RCSB Protein Data Bank (PDB) and the Immune Epitope Database (IEDB) and as Co-founder and Founding Editor in Chief of the journal PLOS Computational Biology. SPEAKER BIOGRAPHY. Philip E. Bourne PhD is Associate Vice Chancellor for Innovation and Industry Alliances, a Professor in the Department of Pharmacology and Skaggs School of Pharmacy and Pharmaceutical Sciences at the University of California San Diego, Associate Director of the RCSB Protein Data Bank and an Adjunct Professor at the Sanford Burnham Institute. Bourne's professional interests focus on relevant biological and educational outcomes derived from computation and scholarly ...
The RCSB Protein Data Bank (http://www.pdb.org) is a publicly accessible information portal for researchers and students interested in structural biology. At its center is the PDB archive, the sole international repository for the 3-dimensional structure data of biological macromolecules. These structures hold significant promise for the pharmaceutical and biotechnology industries in the search for new drugs and in efforts to understand the mysteries of human disease. The primary mission of the RCSB PDB is to provide accurate, well-annotated data in the most timely and efficient way possible to facilitate new discoveries and scientific advances. The RCSB processes, stores, and disseminates these important data, and develops the software tools needed to assist users in depositing and accessing structural information. The RCSB Protein Data Bank at Rutgers University in Piscataway, NJ has an opening for a Biochemical Information & Annotation Specialist to curate and standardize macromolecular ...
The PDB originally grew out of protein structures from X-ray crystallography and the Brookhaven RAster Display (BRAD), founded in 1968. In 1969, with the sponsorship of Walter Hamilton at Brookhaven National Laboratory, Edgar Meyer (Texas A&M University) wrote software for storing atomic coordinates in a common format. In 1971 the search function SEARCH was introduced, which allowed the data to be downloaded and stored offline.[3] After Hamilton's death in 1973, Tom Koeztle took over the leadership for the following 20 years. In 1994 the leadership passed to Joel Sussman. From October 1998 to June 1999, the PDB was transferred to the Research Collaboratory for Structural Bioinformatics (RCSB).[4][5] There, Helen M. Berman of Rutgers University became the new director.[6] In 2003 the PDB became international with the founding of the Worldwide Protein Data Bank (wwPDB). Founding members are PDBe ...
The Data Catalogue is a service that allows University of Liverpool Researchers to create records of information about their finalised research data, and save those data in a secure online environment. The Data Catalogue provides a good means of making that data available in a structured way, in a form that can be discovered by both general search engines and academic search tools. There are two types of record that can be created in the Data Catalogue: A discovery-only record - in these cases, the research data may be held somewhere else but a record is provided to help people find it. A record is created that alerts users to the existence of the data, and provides a link to where those data are held. A discovery and data record - in these cases, a record is created to help people discover the data exist, and the data themselves are deposited into the Data Catalogue. This process creates a unique Digital Object identifier (DOI) which can be used in citations to the data ...

2. TAXONOMIC ORIGIN

Total number of species represented in this release of UniProtKB/Swiss-Prot: 12922

The first twenty species represent 112553 sequences: 20.9 % of the total number of entries.

2.1 Table of the frequency of occurrence of species

Species represented
      1x: 5426
      2x: 1882
      3x:  981
      4x:  639
      5x:  466
      6x:  381
      7x:  284
      8x:  217
      9x:  199
     10x:  123
 11- 20x:  668
 21- 50x:  403
 51-100x:  212
   >100x: 1041

2.2 Table of the most represented species

------  ---------  --------------------------------------------
Number  Frequency  Species
------  ---------  --------------------------------------------
     1      20233  Homo sapiens (Human)
     2      16566  Mus musculus (Mouse)
     3      11571  Arabidopsis thaliana (Mouse-ear cress)
     4       7815  Rattus norvegicus (Rat)
     5       6621  Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast)
     6       5965  Bos taurus (Bovine)
     7       5089  Schizosaccharomyces pombe (strain 972 / ATCC 24843) (Fission yeast)
     8       4431  Escherichia coli (strain K12)
     9       4188  Bacillus subtilis (strain 168)
    10       4126  Dictyostelium ...

2. TAXONOMIC ORIGIN
Total number of species represented in this release of UniProtKB/Swiss-Prot: 12726
The first twenty species represent 111314 sequences: 20.8 % of the total number of entries.
2.1 Table of the frequency of occurrence of species
Species represented
     1x: 5365
     2x: 1849
     3x:  955
     4x:  628
     5x:  463
     6x:  374
     7x:  272
     8x:  218
     9x:  198
    10x:  110
11- 20x:  655
21- 50x:  392
51-100x:  209
  >100x: 1038
2.2 Table of the most represented species
------ --------- --------------------------------------------
Number Frequency Species
------ --------- --------------------------------------------
     1     20246 Homo sapiens (Human)
     2     16473 Mus musculus (Mouse)
     3     11018 Arabidopsis thaliana (Mouse-ear cress)
     4      7690 Rattus norvegicus (Rat)
     5      6619 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast)
     6      5885 Bos taurus (Bovine)
     7      4976 Schizosaccharomyces pombe (strain 972 / ATCC 24843) (Fission yeast)
     8      4431 Escherichia coli (strain K12)
     9      4244 Bacillus subtilis
    10      4122 Dictyostelium discoideum (Slime ...
Thus, for the same protein, different sets of binding-site residues might be obtained depending on the PDB structure that is considered, and a residue of a protein may be defined as a binding-site residue in one PDB structure but as a non-binding-site residue in another. This inconsistency can cause serious problems in research. For a given protein, researchers therefore need to identify all PDB structures that contain the protein and calculate binding-site residues on the protein using all of them. After users have found all the PDB structures that contain a given protein, the protein sequences shown in different PDB structures must be aligned properly to combine the binding-site information obtained from different structures. This step is not as simple as it may first appear. It cannot be done by matching the sequence indexes of residues in the PDB structures, because the same protein chain may have different sequence indexing in different PDB structures. For example, 1qqi_A and 1gxp_A are the same ...
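The index-mapping problem described above can be illustrated with a small Python sketch. It is not the method used by the source: it relies on the standard-library `difflib.SequenceMatcher` as a stand-in for a proper sequence alignment, and the sequences, starting residue numbers, and binding-site sets in the usage note are invented for illustration.

```python
from difflib import SequenceMatcher


def map_residue_numbers(seq_a, start_a, seq_b, start_b):
    """Map residue numbers of chain A onto chain B.

    seq_a / seq_b are one-letter sequences of the same protein as they
    appear in two structures; start_a / start_b are the residue numbers
    of the first letter in each structure (assumed contiguous numbering).
    """
    matcher = SequenceMatcher(None, seq_a, seq_b, autojunk=False)
    mapping = {}
    for block in matcher.get_matching_blocks():
        # Each block is an identical stretch shared by both sequences.
        for offset in range(block.size):
            mapping[start_a + block.a + offset] = start_b + block.b + offset
    return mapping


def merge_binding_sites(sites_a, sites_b, mapping):
    """Return the union of binding-site residues in chain B numbering."""
    translated = {mapping[r] for r in sites_a if r in mapping}
    return translated | set(sites_b)
```

For example, with the hypothetical chains `"MKTAYIAKQR"` (numbered from 1) and `"AYIAKQRQIS"` (numbered from 101), residue 4 of the first chain maps to residue 101 of the second, so binding-site annotations from both structures can be pooled in one numbering scheme. Real structures with gaps or insertion codes would need an explicit per-residue numbering list rather than a single start offset.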
Summary of the gene family classification of four related species, Cyclina sinensis, Crassostrea gigas, Lottia gigantea and Capitella teleta. Only putative pepti
An improved multistage intelligent database search method includes (1) a prefilter that uses a precomputed index to compute a list of most
The table below provides information about proteins whose structures have been determined by solid-state NMR, to a resolution sufficient to have resulted in a file deposited with the worldwide Protein Data Bank (wwPDB). Here is the NMR page of the wwPDB ...
Accession numbers must be cited immediately following the Materials and Methods section. Accession numbers are unique identifiers in bioinformatics allocated to nucleotide and protein sequences to allow tracking of different versions of that sequence record and the associated sequence in a data repository [e.g., databases at the National Center for Biotechnical Information (NCBI) at the National Library of Medicine (GenBank) and the Worldwide Protein Data Bank]. There are different types of accession numbers in use based on the type of sequence cited, each of which uses a different coding. Authors should explicitly mention the type of accession number together with the actual number, bearing in mind that an error in a letter or number can result in a dead link in the online version of the article. Please use the following format: accession number type ID: xxxx (e.g., MMDB ID: 12345; PDB ID: 1TUP). Note that in the final version of the electronic copy, accession numbers will be linked to the ...
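A citation format such as "accession number type ID: xxxx" lends itself to an automated check before submission. The Python sketch below extracts such citations from text and tests them against a shape per type. The shapes are assumptions made for illustration (a classic four-character PDB ID starting with a digit from 1 to 9, and an all-digit MMDB ID); they are not taken from the journal's instructions.

```python
import re

# Matches "<TYPE> ID: <value>", e.g. "PDB ID: 1TUP".
CITATION = re.compile(r"\b([A-Za-z]+) ID: ([A-Za-z0-9_]+)")

# Assumed identifier shapes, for illustration only.
SHAPES = {
    "PDB": re.compile(r"[1-9][A-Za-z0-9]{3}"),
    "MMDB": re.compile(r"\d+"),
}


def find_accessions(text):
    """Return (type, value) pairs for every 'TYPE ID: value' citation."""
    return [(m.group(1), m.group(2)) for m in CITATION.finditer(text)]


def is_well_formed(kind, value):
    """Check a value against the assumed shape for its type.

    Types without a known shape are accepted unchanged.
    """
    shape = SHAPES.get(kind)
    return shape is None or shape.fullmatch(value) is not None
```

Running `find_accessions` on the example string from the guidelines yields the pairs `("MMDB", "12345")` and `("PDB", "1TUP")`. A shape check only catches malformed identifiers; it cannot tell whether a well-formed accession actually exists in the repository.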
Are you a structural biologist looking for an exciting career change in 2016? We are looking to recruit an expert structural biologist (with experience in structure determination) to join the Protein Data Bank in Europe curation team (PDBe: pdbe.org) at the European Bioinformatics Institute (EMBL-EBI, Cambridge, UK: ebi.ac.uk) as a Scientific Data Curator. The work involves annotating preliminary PDB and Electron Microscopy Data Bank (EMDB) submissions and extracting relevant biological information. In addition, curators contribute to training, outreach and user-support activities of PDBe and the EMBL-EBI. For more information, please go to: https://ig14.i-grasp.com/fe/tpl_embl01.asp?newms=jj&id=54423&aid=15470 ...
The Protein Identifier Mapping Service provides a free interface to resolve protein identifiers across multiple databases that correspond to the same logical protein.
Protein structure mining using a structural alphabet.
Biomedical applications drive all aspects of our methods development efforts. We do this through collaborations with biomedical scientists, and through primary biomedical research within the CCSB. We also have a strong commitment to biomedical education and outreach, using CCSB tools to disseminate the results of biomedical research to diverse audiences. As part of the HIVE Center, we are studying HIV and its interaction with host cells throughout the viral life cycle. In collaboration with Barry Sharpless, we are designing specific covalent inhibitors and applying them to multiple biomedical targets. Working with PDB-101, the outreach/education portal of the RCSB Protein Data Bank, we produce many materials and resources for use in education and outreach. ...
Figure 1. Above is a Jmol image of the consensus V3 loop of gp120. The partially-hidden nature of the conserved region of gp120 makes it difficult for our bodies to develop effective neutralizing antibodies. The image is from the RCSB Protein Data Bank. PDB 1CE4. Antibodies specific to gp120 and the gp41 envelope proteins (Janeway et al, 2005) can be found in plasma of infected patients within weeks of initial infection (Paul, 2003), and may play a role in minimizing viral impact during the asymptomatic period, but are unable to clear an infection. Despite the early presence of HIV-specific antibodies, the high levels of antibodies with the ability to neutralize viruses are generally only found in long-term nonprogressors (Paul, 2003). Two trimers of gp120 and gp41 create the envelope protein gp160, which is heavily glycosylated. CD4 T cells bind gp120 on a depression in the protein (Paul, 2003). The virus also binds chemokine receptors on another depressed site on gp120 as co-receptors. Both ...
Mitochondrial tRNAs have been studied by structural biologists interested in their secondary structure characteristics, by evolutionary biologists researching patterns of compensatory and structural evolution, and in medical studies directed towards understanding the basis of human disease. However, an up-to-date, manually curated database of mitochondrially encoded tRNAs from higher animals is currently not available. We obtained the complete mitochondrial sequence for 277 tetrapod species from GenBank and re-annotated all of the tRNAs based on a multiple alignment of each tRNA gene and a secondary structure prediction made independently for each tRNA. The mitochondrial (mt) tRNA sequences and the secondary-structure-based multiple alignments are freely available as Supplemental Information online. We compiled a manually curated database of mitochondrially encoded tRNAs from tetrapods with completely sequenced genomes. In the course of our work, we reannotated more than 10% of all
Although domain-centric annotations hold great promise in describing the phenotypic nature of independent domains, most domains may not work alone. In multi-domain proteins, they may be combined to form distinct domain architectures. The recombination of existing domains is considered one of the major driving forces for phenotypic diversification. As an extension, we have also generated a supra-domain phenotype ontology and its annotations. Compared to the domain-centric phenotype ontology and annotations (SCOP domains at the Superfamily level and Family level), this version focuses on supra-domains and individual SCOP domains ONLY at the Superfamily level. Besides, in terms of individual superfamilies, their annotations from the domain-centric version may be different from those from the supra-domains version. Depending on your focus, the former should be used for the consideration of both the Superfamily level and Family level, otherwise the latter should be used if you are ...
InterPro is an integrated resource for protein families, domains, and active sites. The resource provides an invaluable means for automatic classification of protein sequences into families or domains with a view to providing functional annotation for the proteins. It constitutes an amalgamation of the major protein signature databases: PROSITE, PRINTS, Pfam, ProDom, SMART, TIGRFAMs, PIR SuperFamily, and SUPERFAMILY into a unified database where similarities and differences between the signatures from each of these databases are rationalized for ease of use. All signatures representing the same family or domain are collated into unique InterPro entries, with annotation and a list of the proteins in UniProt that these signatures match. New sequences not available in UniProt can be run through all signatures in InterPro using the InterProScan software. InterPro is useful for large-scale classification of whole genomes, as well as for functional annotation of individual protein sequences. ...
... builds a database of protein sequences that are linked to scientific articles. These links come from automated text searches against the articles in EuropePMC and from manually-curated information from GeneRIF, UniProtKB/Swiss-Prot, BRENDA, CAZy (as made available by dbCAN), CharProtDB, MetaCyc, EcoCyc, REBASE, and the Fitness Browser. Given this database and a protein sequence query, PaperBLAST uses protein-protein BLAST to find similar sequences with E < 0.001. To build the database, we query EuropePMC with locus tags, with RefSeq protein identifiers, and with UniProt accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use queries of the form "locus_tag AND genus_name" to try to ensure that the paper is actually discussing that gene. Because EuropePMC indexes most recent biomedical papers, even if they are not open access, some of the links may be to papers that you cannot read or that our computers cannot read. We query each of these identifiers that appears ...
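The two mechanical steps named above, composing a "locus_tag AND genus_name" query and keeping only hits below the E-value cutoff, can be sketched in a few lines of Python. This is an illustration, not PaperBLAST's code: the function names are hypothetical, and hits are assumed to arrive as simple `(subject_id, evalue)` pairs rather than real BLAST output.

```python
def build_query(locus_tag, genus_name):
    """Compose a literature query of the form 'locus_tag AND genus_name'."""
    return f"{locus_tag} AND {genus_name}"


def keep_significant(hits, max_evalue=1e-3):
    """Keep (subject_id, evalue) pairs with E strictly below the cutoff."""
    return [(subject, evalue) for subject, evalue in hits if evalue < max_evalue]
```

Pairing the locus tag with the genus name narrows the search to papers that mention both, which reduces false matches for locus tags that collide with unrelated strings. The strict inequality mirrors the stated threshold of E < 0.001, so a hit with E exactly 0.001 is dropped.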
Tutor: Micaela Lewinson. Bioinformatics is an interdisciplinary area of study linking computational tools and databases to biology. Topics such as DNA sequence analysis, genome sequencing, expression of genes, 3D-structures of proteins are all considered parts of bioinformatics. In my Bioinformatics course (BIO 260) students are introduced to this exciting new interdisciplinary area spanning computational science and biology. This is a non-traditional biology course - it does not have a "wet lab", but students spend a significant time in the computer lab using various databases (such as DNA sequence databases, protein structure databases) and software packages to solve problems in biology. One of the class projects is analysis and annotation of a previously unpublished, newly sequenced genome. Through this project students have an opportunity to use digital research to make a novel contribution to science! This exciting project is made possible through my participation in a multi-institution ...
ARLINGTON, Va. - The assets of the Protein Data Bank (PDB) just keep ... The PDB holds the three-dimensional structures of nearly 24000 p... This month, with a doubling in the number of the federal agencies ... Mary Clutter, assistant director for NSF's Directorate for Biolog... Biological processes involve small molecular machines, she said ...
08h30 Protein sequence databases: theory. 10h30 COFFEE BREAK. 11h00 Controlled vocabularies and standardization resources: theory. 12h15 LUNCH. 13h30 Protein sequence databases and Gene Ontology: practicals. 15h00 COFFEE BREAK. 15h30 Analysis tools using ontologies : theory. 16h00 Protein sequence databases and Gene Ontology: practicals. 17h00 Evaluation / Exam. 18h00 END ...
RGD:2851, RGD:2850, FB:FBgn0087012, MGI:MGI:109323, UniProtKB:P28335, UniProtKB:P30939, UniProtKB:O46635, UniProtKB:P08908, FB:FBgn0263116, FB:FBgn0004168, MGI:MGI:96274, MGI:MGI:96276, MGI:MGI:96273, UniProtKB:P41595, PANTHER:PTN000664111, UniProtKB:P47898, RGD:61800, UniProtKB:A0A0B4KFU6, UniProtKB:P28566, UniProtKB:Q13639, UniProtKB:P28222, UniProtKB:P28223, UniProtKB:Q50DZ8, UniProtKB:P28221, RGD:71034, FB:FBgn0004573, MGI:MGI:96281, MGI:MGI:96284, WB:WBGene00004776, RGD:2848, RGD:62044, RGD:62388, WB:WBGene00004779, RGD:2846, RGD: ...
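Cross-reference lists like the one above follow a `database:local_id` pattern, and a short Python sketch shows how they might be grouped by source database. The function name is hypothetical. Splitting on the first colon only is a deliberate choice, because some local identifiers (for example `MGI:MGI:109323`) contain a colon themselves.

```python
from collections import defaultdict


def group_by_database(identifiers):
    """Group a comma-separated 'DB:local_id' list by database prefix."""
    grouped = defaultdict(list)
    for item in identifiers.split(","):
        # partition() splits at the first colon, keeping later colons intact.
        prefix, separator, local_id = item.strip().partition(":")
        if separator and local_id:
            grouped[prefix].append(local_id)
    return dict(grouped)
```

Applied to the first few entries of the list, this returns `RGD` mapped to `["2851", "2850"]` and `MGI` mapped to `["MGI:109323"]`, which makes it easy to send each group to the matching database.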
Current Protein & Peptide Science covers a field by discussing research from the leading laboratories in a field and should pose questions for futu...
Genomic locations of UniProt/SwissProt variants are labeled with the amino acid change at a given position and, if known, the abbreviated disease name. A ? is used if there is no disease annotated at this location, but the protein is described as being linked to only a single disease in UniProt. Mouse over a mutation to see the UniProt comments. Artificially-introduced mutations are colored green and naturally-occurring variants are colored red. For full information about a particular variant, click the UniProt variant linkout. The UniProt record linkout lists all variants of a particular protein sequence. The Source articles linkout lists the articles in PubMed that originally described the variant(s) and were used as evidence by the UniProt curators. ...
Graduated in 2008 with a bachelor's degree from the Department of Statistics at Ege University. Received a master's degree in 2011 from the Department of Biostatistics, Ege University Faculty of Medicine; the master's thesis dealt with bioequivalence studies. Received a doctoral degree in 2016 from the Department of Biostatistics, Hacettepe University Faculty of Medicine. As part of the doctoral work, carried out the thesis research in 2015-2016 at the RCSB Protein Data Bank, affiliated with the San Diego Supercomputer Center at the University of California San Diego. Within the doctoral thesis, developed new methods for the prediction of protein structures held in the Protein Data Bank. Since 2017, has served as a faculty member in the Department of Biostatistics and Medical Informatics, Trakya University Faculty of Medicine. Areas of interest include machine learning, proteomics, genetics and ...
AE006468.YBAD
Location/Qualifiers
FT   CDS             469392..469841
FT                   /codon_start=1
FT                   /transl_table=11
FT                   /gene="ybaD"
FT                   /locus_tag="STM0415"
FT                   /product="putative transcriptional regulator"
FT                   /note="similar to E. coli orf, hypothetical protein
FT                   (AAC73516.1); Blastp hit to AAC73516.1 (149 aa), 95%
FT                   identity in aa 1 - 149"
FT                   /db_xref="EnsemblGenomes-Gn:STM0415"
FT                   /db_xref="EnsemblGenomes-Tr:AAL19369"
FT                   /db_xref="GOA:P0A2M1"
FT                   /db_xref="InterPro:IPR003796"
FT                   /db_xref="InterPro:IPR005144"
FT                   /db_xref="UniProtKB/Swiss-Prot:P0A2M1"
FT                   /protein_id="AAL19369.1"
FT                   /translation="MHCPFCFAVDTKVIDSRLVGEGSSVRRRRQCLVCNERFTTFEVAE
FT                   LVMPRVIKSNDVREPFNEDKLRSGMLRALEKRPVSADDVEMALNHIKSQLRATGEREVP
FT                   SKMIGNLVMEQLKKLDKVAYIRFASVYRSFEDIKDFGEEIARLQD"
     MHCPFCFAVD TKVIDSRLVG EGSSVRRRRQ CLVCNERFTT FEVAELVMPR VIKSNDVREP        60
     FNEDKLRSGM LRALEKRPVS ADDVEMALNH IKSQLRATGE REVPSKMIGN LVMEQLKKLD       120
     KVAYIRFASV YRSFEDIKDF GEEIARLQD                                         149
...
75% of the Pfam model length. long_domain=1 sequences are found in library_long_domains.fa.gz and in library_all_domains.fa.gz non_redundant Useful to calculate family size "0" flags a redundant domain that overlaps with another with longer sequence homology annotation "1" flags the non-redundant domain with the longer sequence homology annotation ======================= 3. Supplementary Annotation files ================================== pfam_to_clan.txt - Lists the pfam family to clan superfamily correspondence. Note: The annotations on this database are at the superfamily level, which we recommend for homology evaluation. See the FAQ.txt and (Gonzalez and Pearson, NAR, 2010) for more details of why coalescing superfamilies is the preferred choice when evaluating homology. refprotdom_domain_bound_ext.txt - Lists the domains that in pfam v.21 were annotated as partial homologies whose coordinates we extended. Current uniprot accessions and sequence ids are provided, as well as the corresponding ...
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. The complete paper is available on-line by following links from the PubMed website ...
1zlm: Function and biology annotation of Crystal structure of the SH3 domain of human osteoclast stimulating factor. Includes SCOP, CATH, InterPro, GO and Intenz annotation.
CP000247.PE653
Location/Qualifiers
FT   CDS             complement(696239..696502)
FT                   /codon_start=1
FT                   /transl_table=11
FT                   /locus_tag="ECP_0661"
FT                   /product="hypothetical protein"
FT                   /db_xref="EnsemblGenomes-Gn:ECP_0661"
FT                   /db_xref="EnsemblGenomes-Tr:ABG68689"
FT                   /db_xref="InterPro:IPR007454"
FT                   /db_xref="InterPro:IPR027471"
FT                   /db_xref="UniProtKB/Swiss-Prot:Q0TK42"
FT                   /protein_id="ABG68689.1"
FT                   /translation="MKTKLNELLEFPTPFTYKVMGQALPELVDQVVEVVQRHAPGDYTP
FT                   TVKPSSKGNYHSVSITINATHIEQVETLYEELGKIDIVRMVL"
     MKTKLNELLE FPTPFTYKVM GQALPELVDQ VVEVVQRHAP GDYTPTVKPS SKGNYHSVSI        60
     TINATHIEQV ETLYEELGKI DIVRMVL                                            87
...
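Feature-table records of this kind carry their annotations as `/key="value"` qualifiers, with long values continued on further `FT` lines. The Python sketch below pulls the quoted qualifiers out of such a record with regular expressions. It is a minimal illustration rather than a full flat-file parser: the function name is hypothetical, unquoted qualifiers such as `/codon_start=1` are skipped, and the sample is an abridged copy of the record above.

```python
import re

QUALIFIER = re.compile(r'/(\w+)="([^"]*)"')
# An "FT" line prefix surrounded by whitespace marks a continued value.
CONTINUATION = re.compile(r"\s+FT\s+")


def parse_qualifiers(record):
    """Collect quoted /key="value" qualifiers into a dict of lists."""
    qualifiers = {}
    for key, raw in QUALIFIER.findall(record):
        # Sequence lines are rejoined without spaces, free text with one.
        joiner = "" if key == "translation" else " "
        value = CONTINUATION.sub(joiner, raw)
        qualifiers.setdefault(key, []).append(value)
    return qualifiers


SAMPLE = (
    'FT CDS complement(696239..696502) FT /codon_start=1 '
    'FT /locus_tag="ECP_0661" FT /product="hypothetical protein" '
    'FT /db_xref="InterPro:IPR007454" FT /db_xref="InterPro:IPR027471" '
    'FT /db_xref="UniProtKB/Swiss-Prot:Q0TK42" '
    'FT /translation="MKTKLNELLEFPTPFTYKVMGQALPELVDQVVEVVQRHAPGDYTP '
    'FT TVKPSSKGNYHSVSITINATHIEQVETLYEELGKIDIVRMVL"'
)

parsed = parse_qualifiers(SAMPLE)
print(parsed["locus_tag"], len(parsed["translation"][0]))
```

Requiring whitespace on both sides of `FT` matters here, because the letters F and T also occur next to each other inside the protein sequence itself. The length of the rejoined translation can be compared with the residue count printed at the end of the sequence block as a quick consistency check.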
Biochemistry is a rich source of important computational problems that should be of interest to mathematicians, computer scientists and engineers. The dramatic drop in the cost of sequencing DNA as well as progress in several structural genomics initiatives have created many new and exciting opportunities. ...
Using X-ray crystallography, researchers at the University of Pittsburgh School of Medicine led by structural biologist Joanne I. Yeh, Ph.D., have become the first to decipher the three-dimensional st...
MedeA 2.19 provides a range of enhancements and new capabilities in the MedeA® environment. MedeA® database search results can now be directly linked to MedeA-Flowchart calculations for efficient computational screening studies. You can search InfoMaticA databases (InfoMaticA provides access to hundreds of thousands of crystallographic structures), generate targeted selections of structures, optionally modify stoichiometry, and submit these structures for efficient first-principles property evaluation using MedeA-Flowcharts. A range of capabilities are combined in MedeA-HighThroughput to facilitate such calculations; these enhancements include the efficient generation of structure lists and analysis tools. MedeA-Morphology allows you to analyze the macroscopic consequences of interatomic forces and crystal symmetry in terms of crystal shape. Taking as input a defined unit cell and symmetry, MedeA-Morphology computes the equilibrium crystal shape, based on BFDH rules, and allows users to ...
Protease cleavage sites predicted with PeptideCutter from the Expert Protein Analysis System (ExPASy) proteomics server of the Swiss Institute of Bioinformatics (SIB ...
Protein structures are stabilized using noncovalent interactions. In addition to the traditional noncovalent interactions, newer types of interactions are thought to be present in proteins. One such interaction, an anion-π pair, in which the positively charged edge of an aromatic ring interacts with an anion, forming a favorable anion-quadrupole interaction, has been previously proposed [Jackson, M. R., et al. (2007) J. Phys. Chem. B 111, 8242-8249]. To study the role of anion-π interactions in stabilizing
Other names in common use include beta-ketoacyl-[acyl-carrier protein](ACP) reductase, beta-ketoacyl acyl carrier protein (ACP ... 3R)-3-hydroxyacyl-[acyl-carrier-protein] + NADP+ ⇌ 3-oxoacyl-[acyl-carrier-protein] + ... beta-ketoacyl-acyl carrier protein reductase, 3-ketoacyl acyl carrier protein reductase, 3-ketoacyl ACP reductase, NADPH- ... acylcarrier-protein] dehydrase, and enoyl-[acyl-carrier-protein] reductase from Spinacia oleracea leaves". Arch. Biochem. ...
pathway databases. *2D and 3D protein descriptors. General. *Python wrapper; see Cinfony ... "ProtDCal: A program to compute general-purpose-numerical descriptors for sequences and 3D-structures of proteins". BMC ...
CDD: conserved protein domain database. *PopSet: population study data sets (epidemiology). *GEO Profiles: expression and ... Databases. Entrez searches the following databases: *PubMed: biomedical literature citations and abstracts, including ... The Entrez (pronounced ɒnˈtreɪ) Global Query Cross-Database Search System is a federated search engine, or web portal that ... 41 (Database issue): D8-D20. doi:10.1093/nar/gks1189. PMC 3531099. PMID 23193264. ...
"Deep Question Answering for protein annotation". Database (Oxford). 2015. doi:10.1093/database/bav081. PMC 4572360 . PMID ... The common feature of all these systems is that they had a core database or knowledge system that was hand-written by experts ... A QA implementation, usually a computer program, may construct its answers by querying a structured database of knowledge or ... More sophisticated questioners expect answers that are outside the scope of written texts or structured databases. To upgrade a ...
EGF at the Human Protein Reference Database.. *Epidermal+growth+factor at the US National Library of Medicine Medical Subject ... Wnt-protein binding. • protein binding. • growth factor activity. • Wnt-activated receptor activity. • protein tyrosine kinase ... positive regulation of protein ubiquitination involved in ubiquitin-dependent protein catabolic process. • angiogenesis. • Wnt ... positive regulation of protein tyrosine kinase activity. • activation of transmembrane receptor protein tyrosine kinase ...
"Estrogen (G protein coupled) Receptor". IUPHAR Database of Receptors and Ion Channels. International Union of Basic and ... G protein-coupled estrogen receptor 1 (GPER), also known as G protein-coupled receptor 30 (GPR30), is a protein that in humans ... This protein is a member of the rhodopsin-like family of G protein-coupled receptors and is a multi-pass membrane protein that ... protein binding. • signal transducer activity. • mineralocorticoid receptor activity. • steroid binding. • G-protein coupled ...
A Protein-Protein Interaction Database for Maize". Plant Physiology. 170 (2): 618-626. doi:10.1104/pp.15.01821. ISSN 1532-2548 ... Protein function prediction. Protein interaction networks have been used to predict the function of proteins of unknown ... The yeast interactome, i.e. all protein-protein interactions among proteins of Saccharomyces cerevisiae, has been estimated to ... The basic unit of a protein network is the protein-protein interaction (PPI). While there are numerous methods to study PPIs, ...
Other databases: Protein Data Bank, Ensembl and InterPro. *Specialised genomic databases: BOLD, Saccharomyces Genome Database, ... Secondary databases: UniProt, database of protein sequences grouping together Swiss-Prot, TrEMBL and Protein Information ... is a database annotating intrinsic disorder in proteins.. PANTHER is a large collection of protein families that have been ... InterPro is a database of protein families, domains and functional sites in which identifiable features found in known proteins ...
Orientations of Proteins in Membranes database. *Opportunistic Mesh, a wireless networking technology ...
... the Ribosomal Protein Gene database". Nucleic Acids Res. 32 (Database issue): D168-70. doi:10.1093/nar/gkh004. PMC 308739. PMID ... 40S ribosomal proteins. The table "40S ribosomal proteins" shows the individual protein folds of the 40S subunit colored ... Nomenclature according to the ribosomal protein gene database, applies to H. sapiens and T. thermophila ... Proteins shared only between eukaryotes and archaea are shown as orange ribbons and proteins specific to eukaryotes are shown ...
Databases: *. Roberts RJ, Vincze T, Posfai, J, Macelis D. "REBASE". Archived from the original on 2016-12-30. Retrieved 2008-06 ... They are used to assist insertion of genes into plasmid vectors during gene cloning and protein production experiments. For ... 35 (Database issue): D269-70. doi:10.1093/nar/gkl891. PMC 1899104. PMID 17202163.. ... RCSB Protein Data Bank. Archived from the original on 2008-05-31. Retrieved 2008-06-06.. ...
... -protein binding database. BioLiP is a comprehensive ligand-protein interaction database, with the 3D structure of ... It provides the linkage to protein targets such as its location in the biochemical pathways, SNPs and protein/RNA baseline ... Often bulky ligands are employed to simulate the steric protection afforded by proteins to metal-containing active sites. Of ... MANORAA is a webserver for analyzing conserved and differential molecular interaction of the ligand in complex with protein ...
PRIDB: a protein-RNA interface database. Nucleic Acids Research. 2016-11-07, 39 (Database issue): D277-D282. ISSN 0305-1048. ... RadA protein is an archaeal RecA protein homolog that catalyzes DNA strand exchange. Genes Dev. 1998, 12 (9): 1248-53. PMC ... Bank, RCSB Protein Data. RCSB Protein Data Bank - RCSB PDB. (Archived from the original on 2012-12-27.) ... RNA-binding proteins in human genetic disease. Trends in genetics: TIG. 2008-08-01, 24 (8): 416-425. ISSN 0168-9525. PMID ...
Protein Data Base (PDB), Sterol Regulatory Element Binding 1A structure. ... SREB proteins are indirectly required for cholesterol biosynthesis and for uptake and fatty acid biosynthesis. These proteins ... proteins. However, in contrast to E-box-binding HLH proteins, an arginine residue is replaced with tyrosine making them capable ... SREBP precursors are retained in the ER membranes through a tight association with SCAP and a protein of the INSIG family. ...
... MAPS - Comprehensive lipid and lipid-associated gene/protein databases.. *LipidBank - Japanese database of lipids and ... nuclear located protein kinase C and cyclic AMP-dependent protein kinase". Frontiers in Bioscience. 13 (13): 1206-26. doi: ... Protein-lipid interaction. *Phenolic lipid, a class of natural products composed of long aliphatic chains and phenolic rings ... Parodi AJ, Leloir LF (April 1979). "The role of lipid intermediates in the glycosylation of proteins in the eucaryotic cell". ...
January 2004). "Human protein reference database as a discovery resource for proteomics". Nucleic Acids Res. 32 (Database issue ... the Human Protein Reference Database (Hprd), contains manually annotated and curated entries for human proteins. The ... Such tools generally take a query such as a DNA, RNA, or protein sequence or 'keyword' and search one or more databases for ... Many public databases are already extensively linked so that complementary information in another database is easily accessible ...
His research group hosts the Database of Interacting Proteins. Career. *Postdoctoral research, Princeton University ( ... "The Database of Interacting Proteins: 2004 update". Nucleic Acids Research. 32 (90001): D449-D451. doi:10.1093/nar/gkh086. PMC ... Proteins. Amyloid. Structural biology. Institutions. Howard Hughes Medical Institute. University of Oxford. ... David Eisenberg's publications indexed by the Scopus bibliographic database. (subscription required) *^ Eisenberg, David J. ( ...
"Human recombinant activated protein C for severe sepsis". Cochrane Database of Systematic Reviews (4): CD004388. doi:10.1002/ ... The exact mechanism for this protein is currently not known, but efforts continue to isolate activated protein C mutants that ... Drotrecogin alfa (activated) (Xigris, marketed by Eli Lilly and Company) is a recombinant form of human activated protein C ... In vitro data suggest that activated protein C exerts an antithrombotic effect by inhibiting factors Va and VIIIa, and that it ...
"Database of Protein, Chemical, and Genetic Interactions , BioGRID". thebiogrid.org. Retrieved 2016-04-25.. ... Protein[edit]. Figure 1: A basic schematic of Polδ function at the DNA replication fork. The Polδ complex (p125, p66, p50 and ... Protein name in human Homo sapiens Mus musculus Saccharomyces cerevisiae Schizosaccharomyces pombe ... "NCBI CDD Conserved Protein Domain DNA_polB_delta_exo". www.ncbi.nlm.nih.gov. Retrieved 2016-04-25.. ...
... a structural classification of proteins database". Nucleic Acids Research. 25 (1): 236-9. doi:10.1093/nar/25.1.236. PMC 146380 ... Tertiary Protein Structure and Folds: section 4.3.2.1. From Principles of Protein Structure, Comparative Protein Modelling, and ... Richardson JS (1981). Anatomy and Taxonomy of Protein Structures. Advances in Protein Chemistry. 34. pp. 167-339. doi:10.1016/ ... Hutchinson EG, Thornton JM (1990). "HERA--a program to draw schematic diagrams of protein secondary structures". Proteins. 8 (3 ...
It affects the production of multiple proteins, including lipoproteins, binding proteins, and proteins responsible for blood ... Chemicals Identified in Human Biological Media: A Data Base. Design and Development Branch, Survey and Analysis Division, ... not protein bound). 0.5[72][original research?]. 9[72][original research?]. pg/mL ... Wu CH, Motohashi T, Abdel-Rahman HA, Flickinger GL, Mikhail G (August 1976). "Free and protein-bound plasma estradiol-17 beta ...
"Antenatal dietary education and supplementation to increase energy and protein intake". The Cochrane Database of Systematic ... "Aerobic exercise for women during pregnancy". Cochrane Database of Systematic Reviews. 3 (3): CD000180. doi:10.1002/14651858. ... Whitworth, M; Bricker, L; Mullan, C (14 July 2015). "Ultrasound for fetal assessment in early pregnancy". The Cochrane database ... The Cochrane Database of Systematic Reviews. 4: CD004905. doi:10.1002/14651858.CD004905.pub5. ISSN 1469-493X. PMID 28407219.. ...
ORegAnno - Open Regulatory Annotation Database. *Identifying a Protein Binding Sites on DNA molecule YouTube tutorial video ... Bidirectionally paired genes in the Gene Ontology database shared at least one database-assigned functional category with their ... Chaperone proteins are three times more likely, and mitochondrial genes are more than twice as likely. Many basic housekeeping ... In the case of a transcription factor binding site, there may be a single sequence that binds the protein most strongly under ...
"List of autophagy-related proteins and 3D structures". Autophagy Database. 290. Archived from the original on 2012-08-01 ... WIPI2, a PtdIns(3)P binding protein of the WIPI (WD-repeat protein interacting with phosphoinositides) protein family, was ... Without efficient autophagy, neurons gather ubiquitinated protein aggregates and degrade. Ubiquitinated proteins are proteins ... This allows unneeded proteins to be degraded and the amino acids recycled for the synthesis of proteins that are essential for ...
"Automated assembly of protein blocks for database searching". Nucleic Acids Res. 19 (23): 6565-72. doi:10.1093/nar/19.23.6565. ... First 90 positions of a protein multiple sequence alignment of instances of the acidic ribosomal protein P0 (L10E) from several ... A multiple sequence alignment (MSA) is a sequence alignment of three or more biological sequences, generally protein, DNA, or ... For proteins, this method usually involves two sets of parameters: a gap penalty and a substitution matrix assigning scores or ...
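The snippet above mentions that protein alignment methods usually rely on a gap penalty and a substitution matrix. As a minimal sketch of how those two parameter sets combine into a score, the following computes a sum-of-pairs score for an already-built alignment. The function names and the match/mismatch/gap values are hypothetical placeholders chosen for illustration; a real substitution matrix would assign a specific score to each residue pair.

```python
from itertools import combinations

GAP = "-"

def pair_score(x, y, match=1, mismatch=-1, gap_penalty=-2):
    """Score one pair of aligned symbols with toy (assumed) parameters."""
    if x == GAP and y == GAP:
        return 0                      # gap aligned with gap carries no cost here
    if x == GAP or y == GAP:
        return gap_penalty            # residue aligned with a gap
    return match if x == y else mismatch

def sum_of_pairs(alignment, **params):
    """Sum pair_score over every pair of rows in every alignment column."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned rows must have the same length")
    total = 0
    for column in zip(*alignment):
        for x, y in combinations(column, 2):
            total += pair_score(x, y, **params)
    return total

if __name__ == "__main__":
    toy_alignment = ["AC-", "ACG", "A-G"]   # made-up rows, not real sequences
    print(sum_of_pairs(toy_alignment))
```

Swapping the toy `match`/`mismatch` values for a lookup into a real substitution matrix would change only `pair_score`; the column-by-column summation stays the same.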
Integrative neuroscience attempts to consolidate these observations through unified descriptive models and databases of ... proteins, and chemical coupling to network oscillations, columnar and topographic architecture, and learning and memory. ...
... literature and increasingly reliable computational predictions have resulted in creation of vast databases of protein ... Protein-protein interactions Functional associations Protein-protein interaction databases Pathways Protein-protein interaction ... Szklarczyk D., Jensen L.J. (2015) Protein-Protein Interaction Databases. In: Meyerkord C., Fu H. (eds) Protein-Protein ... Here we present an overview of the most widely used protein-protein interaction databases and the methods they employ to gather ...
Pfam is now based not only on the UniProtKB sequence database, but also on NCBI GenPept and on sequences from selected metage- ... The current release of Pfam (22.0) contains 9318 protein families. ... Pfam is a comprehensive collection of protein domains and families, represented as multiple sequence alignments and as profile ... title = {Pfam protein families database},. booktitle = {Nucleic Acids Research, 2008, 36(Database issue): D281-D288},. year ...
Orientations of Proteins in Membranes (OPM) database provides spatial positions of membrane protein structures with respect to ... The database provides spatial arrangement of proteins in the lipid bilayer. Data types. captured. Protein structures from the ... The database was widely used in experimental and theoretical studies of membrane-associated proteins.[13][14][15][16][17] ... ST NetWatch: Protein Databases review of OPM in Signal Transduction NetWatch list from Science ...
I'm trying to find a way of searching databases for proteins (ideally from E. coli) with a mass in a given range (say between ... databases search - protein size. bionet at cgmvax.cgm.cnrs-gif.fr bionet at cgmvax.cgm.cnrs-gif.fr Fri May 6 10:43:33 EST 1994 ...
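One way to approach the question above, assuming the sequences have already been downloaded locally, is to estimate each protein's mass from its amino acid sequence and keep those inside the requested range. This is only a sketch: the residue masses are approximate average values in daltons, the function names are hypothetical, and post-translational modifications and non-standard residues are ignored.

```python
# Approximate average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water is added back for the free chain termini

def protein_mass(sequence):
    """Approximate average mass of an unmodified linear chain."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER

def filter_by_mass(records, low, high):
    """Return (name, mass) pairs whose estimated mass lies in [low, high]."""
    hits = []
    for name, seq in records.items():
        mass = protein_mass(seq)
        if low <= mass <= high:
            hits.append((name, mass))
    return hits

if __name__ == "__main__":
    # Made-up toy sequences, not real database entries.
    toy = {"pep1": "GASP", "pep2": "MKWVTFISLLFLFSSAYS"}
    for name, mass in filter_by_mass(toy, 300.0, 3000.0):
        print(name, round(mass, 1))
```

A sequence containing a letter outside the twenty standard residues raises a `KeyError`, which makes ambiguous or non-standard entries visible instead of silently mis-weighing them.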
The Princeton Protein Orthology Database (P-POD), developed by the Genome Databases Group at Princeton, computes and displays ... The Princeton Protein Orthology Database. Status: Beta.
... of protein family databases for automatic protein functional classification increases ... As new protein sequences continue to flood into public databases with the advancement of sequencing technologies, the ... Database 10.1093/database/base019. Bru C, Courcelle E, Carrere S et al. (2005) The ProDom database of protein domain families: ... Database 10.1093/database/bar033. Redfern O, Grant A, Maibaum M and Orengo C (2005) Survey of current protein family databases ...
2010 Jan;38(Database issue):D211-22. doi: 10.1093/nar/gkp985. Epub 2009 Nov 17. Research Support, Non-U.S. Govt ... The Pfam protein families database.. Finn RD1, Mistry J, Tate J, Coggill P, Heger A, Pollington JE, Gavin OL, Gunasekaran P, ... Pfam is a widely used database of protein families and domains. This article describes a set of major updates that we have ... New Pfam display of a protein domain architecture. Pfam-A families classified as type family and domain with a lozenge ...
2014 Jan;42(Database issue):D222-30. doi: 10.1093/nar/gkt1223. Epub 2013 Nov 27. Research Support, Non-U.S. Govt ... Pfam: the protein families database.. Finn RD1, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, Heger A, Hetherington ... is a widely used database of protein families, containing 14 831 manually curated entries in the current release, version 27.0 ... The results are shown with the protein coordinates of the open reading frame, but it is also possible to toggle this to DNA ...
Dear Yeastnetters, A new database for the proteins of S. cerevisiae has been released from the QUEST Protein Database Center at ... This database, called YPD (Yeast Protein Database), can be downloaded from ftp isis.cshl.org in directory pub/yeast/YPD. The ... The database will be updated periodically to include new yeast proteins that appear in the sequence databases and to add new ... Yeast_Protein_Database. Jim Garrels quest7!jg at ISIS.CSHL.ORG Wed Nov 23 22:54:43 EST 1994 *Previous message: NO SUBJECT ...
GLYCINE SOYA PROTEIN; GLYCINE SOYA PROTEINS; PROTEINS, GLYCINE SOYA; PROTEINS, SOY; PROTEINS, SOYBEAN; SOY PROTEIN; SOY PROTEIN ... About GLYCINE SOJA PROTEIN: Glycine Soja (Soybean) Protein is a protein obtained from the soybean, Glycine soja.. Function(s): ... Synonym(s): GLYCINE SOJA (SOYBEAN) PROTEIN, GLYCINE HISPIDA PROTEIN; ...
15-million grant to combine three of the world's current protein sequence databases into a single global resource. ... Throwing its financial support behind the concept of a centralized repository for protein data, the National Human Genome ... Dubbed the United Protein Database, or UniProt, the new, public database will combine the resources of three existing protein ... Funding for global protein database. NIH/National Human Genome Research Institute. Funder. National Institutes of Health. ...
Subtotal for Parent Omega Protein Corp: $120,000 Omega Protein Corp Lobbying by Industry. Industry. Total. ... Itemized Lobbying Expenses for Omega Protein Corp. Firms Hired. Total Reported by Filer. Reported Contract Expenses (included ...
The Nuclear Protein Database (NPD) contains information on proteins that are localized to the nuclei of vertebrate cells. Over ... When known, the sub-nuclear compartment where the protein was found is reported. The NPD also provides information on a ... protein's amino acid sequence, predicted size, and isoelectric point, as well as any repeats, motifs, or domains within the ... 1000 vertebrate proteins, mainly from mice and humans, are included. ...
Would ,, anybody be so kind to tell me about: ,, ,, (1). the protein structure databases (PIR, PDB, GenBank...) and their ,, ... Help about protein database.. Cornelius Krasel zxmkr08 at studserv.zdv.uni-tuebingen.de Wed Jun 9 05:05:00 EST 1993 *Previous ... Help about protein database. ,, ,, Hello friends: ,, ,, I'm currently a computer science graduate student and going to have a ... Please send any suggestion or answer to: The only 3D database I know of is the protein data bank (PDB). All the others archive ...
Subtotal for Parent Plasma Protein Therapeutics Assn: $748,750 Plasma Protein Therapeutics Assn Lobbying by Industry. Industry ... Itemized Lobbying Expenses for Plasma Protein Therapeutics Assn. Firms Hired. Total Reported by Filer. Reported Contract ...
Beyond providing Skin Deep® as an educational tool for consumers, EWG offers its EWG VERIFIED™ mark as a quick and easily identifiable way of conveying personal care products that meet EWG's strict health criteria. Before a company can use EWG VERIFIED™ on such products, the company must show that it fully discloses the products' ingredients on their labels or packaging, that they do not contain EWG ingredients of concern, and that they are made with good manufacturing practices, among other criteria. Note that EWG receives licensing fees from all EWG VERIFIED member companies that help to support the important work we do.
Pick a protein such as insulin, and the database specifies what proteins, nucleic acids, and other molecules it interacts with ... DNA often gets the glory, but hardworking proteins actually build our bodies and keep them running. Scientists can find out how ... Users can submit their own information to the database, which so far has information on more than 6200 interactions. The site ... You can also learn what biochemical pathways a particular protein participates in and whether it belongs to any larger ...
Features include a general database search, a graphical tool for visualizing the mitochondrial DNA sequences, and 3D structures ... and the Human Mitochondrial Genome Database. HMPDb is intended as a tool not only to aid in studying the mitochondrion but in ... conveniently consolidates information from a number of other databases, including GenBank, Online Mendelian Inheritance in Man ... for mitochondrial proteins. Users are welcome to contact the National Institute of Standards and Technology with corrections or ...
Pfam: the protein families database. Nucleic Acids Res. 2014 Jan;42(Database issue):D222-30. doi: 10.1093/nar/gkt1223. Epub 2013 ... is a widely used database of protein families, containing 14 831 manually curated entries in the current release, version 27.0 ... increase in the size of the underlying sequence database. Since our 2012 article describing Pfam, we have also undertaken a ...
Information on protein function, as an essential component of biological systems, is essential for the development of biology ... Keywords: protein; database; annotation; text mining; protein function; protein structure; protein domains; posttranslational ... APPRIS, as example of a secondary database based on the information provided by the core protein databases. The APPRIS database ... of the many protein databases, whose status is periodically reviewed and maintained in The Molecular Biology Database ...
... able to identify proteins, characterize post-translational modifications, and... ... Protein identification from tandem mass spectra is one of the most versatile and widely used proteomics workflows, ... Protein identification MS/MS spectra Protein sequence databases Peptide identification Search engine ... Protein Identification from Tandem Mass Spectra by Database Searching. In: Wu C., Arighi C., Ross K. (eds) Protein ...
Protein Databases on the Internet. Dong Xu and Ying Xu. Version of Record online: 1 MAY 2001 , DOI: 10.1002/0471142727. ... Protein Databases on the Internet (pages 19.4.1-19.4.15). Dong Xu and Ying Xu ... Protein Databases on the Internet. Current Protocols in Molecular Biology. 68:19.4:19.4.1-19.4.15. ... Protein Databases on the Internet (pages 19.4.1-19.4.17). Dong Xu ...
The study's authors said they have been overwhelmed by reactions to their finding that curation of protein-protein interaction ... Study Finding Erroneous Protein-Protein Interactions in Curated Databases Stirs Debate. Jan 16, 2009 ... A new study published in Nature Methods that determined that literature-curated protein-protein interaction databases "can be ... Home » Tools & Technology » Informatics » Study Finding Erroneous Protein-Protein Interactions in Curated Databases Stirs ...
N. J. Edwards, "Protein identification from tandem mass spectra by database searching," Methods in Molecular Biology, vol. 694 ... Method for Rapid Protein Identification in a Large Database. Wenli Zhang1,2,3 and Xiaofang Zhao1 ... D. Li, Y. Fu, R. Sun et al., "pFind: a novel database-searching software system for automated peptide and protein ... "An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database," Journal of the ...
... Wenli Zhang1,2,3 and Xiaofang Zhao1 ... Wenli Zhang and Xiaofang Zhao, "Method for Rapid Protein Identification in a Large Database," BioMed Research International, ...
  • The creation of this database will allow performing further analyses by nanoLC-FT-MS, avoiding the systematic use of MS / MS, thus significantly increasing the speed of analysis and coverage of the studied proteome. (cea.fr)
  • For each protein, we produced quantitative profiles of localization scores for 16 subcellular compartments at single-cell resolution to trace proteome-wide relocalization in conditions over time. (g3journal.org)
  • Here, we describe CYCLoPs (Collection of Yeast Cells Localization Patterns), a web database resource that provides a central platform for housing and analyzing our yeast proteome dynamics datasets at the single cell level. (g3journal.org)
  • Evaluation of multidimensional chromatography coupled with tandem mass spectrometry (LC/LC-MS/MS) for large-scale protein analysis: the yeast proteome. (semanticscholar.org)
  • To aid future protein biomarker studies of disease and health from human plasma, we developed an online database, HIP2 (Healthy Human Individual's Integrated Plasma Proteome). (biomedcentral.com)
  • The fluctuating nature of blood from different individuals, huge dynamic protein concentration ranges (up to 10^12), and the protein detection limits of most MS platforms, have made the plasma proteome elusive to define. (biomedcentral.com)
  • To overcome the poor coverage, potential bias, and complementary nature of each experimental measurement of the human plasma proteome, it is necessary for biomedical researchers to collect and assess all reliable publicly-available plasma protein data sets generated from different MS analytical and computational platforms for healthy individuals. (biomedcentral.com)
  • An important issue for the elucidation of the functional organization of the proteome is the extraction of information about protein complex formation and function from the PPI network. (biomedcentral.com)
  • Pfam is a comprehensive collection of protein domains and families, represented as multiple sequence alignments and as profile hidden Markov models. (psu.edu)
  • The current release of Pfam (22.0) contains 9318 protein families. (psu.edu)
  • Pfam is a widely used database of protein families and domains. (nih.gov)
  • New Pfam display of a protein domain architecture. (nih.gov)
  • Pfam, available via servers in the UK (http://pfam.sanger.ac.uk/) and the USA (http://pfam.janelia.org/), is a widely used database of protein families, containing 14 831 manually curated entries in the current release, version 27.0. (nih.gov)
  • Pfam-B entries are automatically generated from the ProDom database (3), and are represented by a single alignment. (psu.edu)
  • InterPro (1) is an integrative database which was founded 10 years ago when the PROSITE (2), PRINTS (3), Pfam (4) and ProDom (5) databases formed a consortium to amalgamate the predictive signatures they individually produced into a single resource. (psu.edu)
  • In ProtCID, protein chains in the protein data bank (PDB) are grouped based on their PFAM domain architectures. (semanticscholar.org)
  • Since the last update article 2 years ago, we have generated 1182 new families and maintained sequence coverage of the UniProt Knowledgebase (UniProtKB) at nearly 80%, despite a 50% increase in the size of the underlying sequence database. (nih.gov)
  • The UniProt database will become a resource for all scientists to use, both to develop a better understanding of biology and to translate that basic science into clinical applications. (eurekalert.org)
  • When browsing through different UniProt proteins, you can use the 'basket' to save them, so that you can come back to find or analyse them later. (uniprot.org)
  • Each dataset is completed with manually added information including protein classifiers as well as automatically retrieved and updated information from public databases (UniProt and PubMed). (frontiersin.org)
  • The percentage of RBPs, the abundance of the various RBDs harboured by each strain have been graphically represented in this database and available alongside other files for user download. (ncbs.res.in)
  • The first is the identification of gel-separated, low abundance proteins based on amino acid sequence composition following coimmunoprecipitation with the human apoptosis inhibitor protein BclX(L). The second is the determination of the precise sites of phosphorylation of the human regulatory protein 4E-BP1, which controls mRNA translation. (nih.gov)
  • The easy clinical access and processing of plasma samples, and the abundance of proteins as well as metabolites that may collectively define a person's health status, have made human plasma the top choice among bio-fluids for future clinical molecular diagnostic applications. (biomedcentral.com)
  • It contains RBPs recorded from 614 complete E. coli proteomes available in the RefSeq database (as of October 2018). (ncbs.res.in)
  • The file YPD.doc provides a more complete description of the database and its data fields. (bio.net)
  • YPD currently contains data on 3050 proteins of known sequence. (bio.net)
  • BETHESDA, Md., Oct. 23, 2002 - Throwing its financial support behind the concept of a centralized repository for protein data, the National Human Genome Research Institute (NHGRI), in cooperation with five other institutes and centers at the National Institutes of Health (NIH), has awarded a three-year, $15-million grant to combine three of the world's current protein sequence databases into a single global resource. (eurekalert.org)
  • The only 3D database I know of is the protein data bank (PDB). (bio.net)
  • Perkins DN, Pappin DJ, Creasy DM et al (1999) Probability-based protein identification by searching sequence databases using mass spectrometry data. (springer.com)
  • Mass Spectrometry databases are a unique challenge for maintaining the vast quantity of data generated from an MS experiment due to both size and complexity issues. (wikibooks.org)
  • Although significant progress has been made in the standardization of these data types, there is still significant incongruence from one spectral database to another. (wikibooks.org)
  • Begun in 1970, the NIST standard reference database is a verbose collection of spectral data in a common data type, requiring both a minimal amount of data regarding the experiment as well as a standard format for the presentation of spectral data from a wide variety of MS applications. (wikibooks.org)
  • One such example of this database type is the Mass Spectrometry Database Committee's comprehensive drug library , which contains spectral data for pharmaceutical substances, metabolites, and intermediate compounds. (wikibooks.org)
  • The data found in the database comes from a vast range of experiments and is stored in a format that allows for simple and complex querying in a common format. (wikibooks.org)
  • The data contained within is primarily protein and peptide IDs, MS mass spectra, and any related metadata. (wikibooks.org)
  • The Protein-RNA Interface Database (PRIDB) is a database of protein-RNA interfaces extracted from the Protein Data Bank. (wikipedia.org)
  • Data base analysis of protein expression patterns during T-cell ontogeny and activation. (pnas.org)
  • We have developed a data base of lymphoid proteins detectable by two-dimensional polyacrylamide gel electrophoresis. (pnas.org)
  • Using this data base, we have compared the protein constituents of mature T cells and immature thymocytes before and after mitotic stimulation. (pnas.org)
  • Since then, six other member databases have also joined and their data has bee. (psu.edu)
  • IntAct captures protein interaction data from peer-reviewed literature and direct user submissions. (openhelix.com)
  • These data were then structured in AT_Chloro, a database specific of the chloroplast from Arabidopsis thaliana . (cea.fr)
  • If you have interaction data for SARS-CoV-2 (or other Coronavirus-related data) that you'd like to deposit directly into the BioGRID Database, please contact us at [email protected] . (thebiogrid.org)
  • It is an excellent resource for students and professionals involved with gene or protein expression data in a variety of settings. (wiley.com)
  • The UniPROBE (Universal PBM Resource for Oligonucleotide Binding Evaluation) database hosts data generated by universal protein binding microarray (PBM) technology on the in vitro DNA-binding specificities of proteins. (harvard.edu)
  • This initial release of the UniPROBE database provides a centralized resource for accessing comprehensive PBM data on the preferences of proteins for all possible sequence variants ('words') of length k ('k-mers'), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. (harvard.edu)
  • In total, the database hosts DNA-binding data for over 175 nonredundant proteins from a diverse collection of organisms, including the prokaryote Vibrio harveyi, the eukaryotic malarial parasite Plasmodium falciparum, the parasitic Apicomplexan Cryptosporidium parvum, the yeast Saccharomyces cerevisiae, the worm Caenorhabditis elegans, mouse and human. (harvard.edu)
  • We analyzed the degree of ambiguity of gene and protein names within and between dictionaries, to a lexicon of common English words and domain-related non-gene terms, and we compared different data sources in terms of size of extracted dictionaries and overlap of synonyms between those. (biomedcentral.com)
  • In conclusion, these results indicate that the combination of data contained in different databases allows the generation of gene and protein name dictionaries that contain significantly more used names than dictionaries obtained from individual data sources. (biomedcentral.com)
  • Modeling loops is an often necessary step in protein structure and function determination, even with experimental X-ray and NMR data. (uwaterloo.ca)
  • Compiling such sets using current web resources is tedious because the necessary data are spread over many different databases. (biomedcentral.com)
  • The COLUMBA database facilitates the creation of protein structure data sets for many structure-based studies. (biomedcentral.com)
  • Trained over a non-redundant data set consisting of 2,383 proteins and fed with sequence, evolutionary and structural properties, NB PPIPS achieves 60.7% recall and 34.6% precision in 10-fold cross-validation, which greatly improves over the baseline classifier that only utilizes protein sequence information. (iastate.edu)
  • The final data set contains 505 unique protein-peptide interface clusters from 1431 complexes. (kuleuven.be)
  • Binding site data for RBPs such as Argonaute 1-4, Insulin-like growth factor II mRNA-binding protein 1-3, TNRC6 proteins A-C, Pumilio 2, Quaking and Polypyrimidine tract binding protein can be visualized at the level of the genome and of individual transcripts. (pubmedcentralcanada.ca)
  • Despite the progress made during the past few decades, our knowledge about regulation of protein function by phosphorylation and the basis of kinase specificity remains incomplete, mainly because of lack of data. (biomedcentral.com)
  • ProXL is a Web application and accompanying database designed for sharing, visualizing, and analyzing bottom-up protein cross-linking mass spectrometry data with an emphasis on structural analysis and quality control. (eurekamag.com)
  • The import process is simplified by the use of the ProXL XML data format, which shields developers of data importers from the relative complexity of the relational database schema. (eurekamag.com)
  • The database and Web interfaces function equally well for any software pipeline and allow data from disparate pipelines to be merged and contrasted. (eurekamag.com)
  • This database has been developed to be the comprehensive collection of healthy human plasma proteins, and has protein data captured in a relational database schema built to contain mappings of supporting peptide evidence from several high-quality and high-throughput mass-spectrometry (MS) experimental data sets. (biomedcentral.com)
  • Protein identification is commonly carried out by comparing MS data with public databases. (biomedcentral.com)
  • Data integration of PPIs focused specifically on protein complexes, subunits, and their functions. (biomedcentral.com)
  • Based on integrated PPI data and literature, we have developed a human protein complex database with a complex quality index (PCDq), which includes both known and predicted complexes and subunits. (biomedcentral.com)
  • We integrated six PPI data (BIND, DIP, MINT, HPRD, IntAct, and GNP_Y2H), and predicted human protein complexes by finding densely connected regions in the PPI networks. (biomedcentral.com)
  • The overlap of PPI data entities across databases is relatively low. (biomedcentral.com)
  • The results are shown with the 'protein' coordinates of the open reading frame, but it is also possible to toggle this to DNA sequence coordinates. (nih.gov)
  • TypeError: Cannot read property 'Run' of undefined at Object.proteinURL [as protein] (/usr/local/lib/node_modules/bionode-ncbi/lib/bionode-ncbi.js:431:26) at DestroyableTransform.transform [as _transform] (/usr/local/lib/node_modules/bionode-ncbi/lib/bionode-ncbi.js:410:17) at DestroyableTransform.Transform. (github.com)
  • recent references: [http://www.ncbi.nlm.nih.gov/pubmed/21071413 The BioGRID Interaction Database: 2011 update], [http://www.ncbi.nlm.nih.gov/pubmed/20489023 A global protein kinase and phosphatase interaction network in yeast. (openwetware.org)
  • The database contains information on about 300 abundant proteins of human myocardial tissue, including approximately 40 proteins that were identified by different methods. (uniprot.org)
  • 2011) Extending CATH: increasing coverage of the protein structure universe and linking structure with function. (els.net)
  • Therefore, if a region of protein sequence provides a highly significant match to a particular CATH-Gene3D FunFam, then there is a good chance they share a similar function. (cathdb.info)
  • In 1974, Dr. Dayhoff devised the concept of the protein family and super-family, defined by sequence similarity, as a means of organizing and classifying proteins. (eurekalert.org)
  • As a result, when two proteins share a significant sequence similarity, it is extremely likely they will also share similar 3D structure. (cathdb.info)
  • Hanks and Hunter were the first to report that sequence similarity of kinase catalytic domains reflects protein kinase function and/or mode of regulation ( 3 , 4 ). (pubmedcentralcanada.ca)
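Several of the excerpts above rest on the idea of sequence similarity between two proteins. As a rough illustration only, the sketch below computes percent identity over the aligned (non-gap) columns of two pre-aligned sequences. The function name is hypothetical, and percent identity is just one simple similarity measure; denominator conventions vary between tools, and the statistically based scores used by the databases quoted here are more involved.

```python
def percent_identity(a, b, gap="-"):
    """Percent of identical residues over columns where neither row has a gap.

    Assumes `a` and `b` are rows of an existing pairwise alignment and
    therefore have equal length.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    aligned = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not aligned:
        return 0.0                      # no comparable columns at all
    matches = sum(x == y for x, y in aligned)
    return 100.0 * matches / len(aligned)

if __name__ == "__main__":
    # Made-up toy rows, not real protein sequences.
    print(percent_identity("MKT-AYI", "MKTLAHI"))
```

Excluding gap columns from the denominator is a design choice made here for simplicity; counting them instead would lower the reported identity for gappy alignments.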