PubMed: A bibliographic database that includes MEDLINE as its primary subset. It is produced by the National Center for Biotechnology Information (NCBI), part of the NATIONAL LIBRARY OF MEDICINE. PubMed, which is searchable through NLM's Web site, also includes access to additional citations to selected life sciences journals not in MEDLINE, and links to other resources such as the full-text of articles at participating publishers' Web sites, NCBI's molecular biology databases, and PubMed Central.

Medical Subject Headings: Controlled vocabulary thesaurus produced by the NATIONAL LIBRARY OF MEDICINE. It consists of sets of terms naming descriptors in a hierarchical structure that permits searching at various levels of specificity.

MEDLINE: The premier bibliographic database of the NATIONAL LIBRARY OF MEDICINE. MEDLINE® (MEDLARS Online) is the primary subset of PUBMED and can be searched on NLM's Web site in PubMed or the NLM Gateway. MEDLINE references are indexed with MEDICAL SUBJECT HEADINGS (MeSH).

Information Storage and Retrieval: Organized activities related to the storage, location, search, and retrieval of information.

Abstracting and Indexing as Topic: Activities performed to identify concepts and aspects of published information and research reports.

Databases, Bibliographic: Extensive collections, reputedly complete, of references and citations to books, articles, publications, etc., generally on a single subject or specialized subject area. Databases can operate through automated files, libraries, or computer disks. The concept should be differentiated from DATABASES, FACTUAL which is used for collections of data and facts apart from bibliographic references to them.

Publications: Copies of a work or document distributed to the public by sale, rental, lease, or lending. (From ALA Glossary of Library and Information Science, 1983, p181)

Randomized Controlled Trials as Topic: Works about clinical trials that involve at least one test treatment and one control treatment, concurrent enrollment and follow-up of the test- and control-treated groups, and in which the treatments to be administered are selected by a random process, such as the use of a random-numbers table.

Periodicals as Topic: A publication issued at stated, more or less regular, intervals.

Bibliometrics: The use of statistical methods in the analysis of a body of literature to reveal the historical development of subject fields and patterns of authorship, publication, and use. Formerly called statistical bibliography. (from The ALA Glossary of Library and Information Science, 1983)

Data Mining: Use of sophisticated analysis tools to sort through, organize, examine, and combine large sets of information.

Publication Bias: The influence of study results on the chances of publication and the tendency of investigators, reviewers, and editors to submit or accept manuscripts for publication based on the direction or strength of the study findings. Publication bias has an impact on the interpretation of clinical trials and meta-analyses. Bias can be minimized by insistence by editors on high-quality research, thorough literature reviews, acknowledgement of conflicts of interest, modification of peer review practices, etc.

Natural Language Processing: Computer processing of a language with rules that reflect and describe current usage rather than prescribed usage.

Internet: A loose confederation of computer communication networks around the world. The networks that make up the Internet are connected through several backbone networks. The Internet grew out of the US Government ARPAnet project and was designed to facilitate information exchange.

Search Engine: Software used to locate data or information stored in machine-readable form locally or at a distance such as an INTERNET site.

Reference Books: Books designed by the arrangement and treatment of their subject matter to be consulted for definite terms of information rather than to be read consecutively. Reference books include DICTIONARIES; ENCYCLOPEDIAS; ATLASES; etc. (From the ALA Glossary of Library and Information Science, 1983)

National Library of Medicine (U.S.): An agency of the NATIONAL INSTITUTES OF HEALTH concerned with overall planning, promoting, and administering programs pertaining to advancement of medical and related sciences. Major activities of this institute include the collection, dissemination, and exchange of information important to the progress of medicine and health, research in medical informatics and support for medical library development.

User-Computer Interface: The portion of an interactive computer program that issues messages to and receives commands from a user.

Government Publications as Topic: Discussion of documents issued by local, regional, or national governments or by their agencies or subdivisions.

Evidence-Based Medicine: An approach of practicing medicine with the goal to improve and evaluate patient care. It requires the judicious integration of best research evidence with the patient's values to make decisions about medical care. This method is to help physicians make proper diagnosis, devise best testing plan, choose best treatment and methods of disease prevention, as well as develop guidelines for large groups of patients with the same disease. (from JAMA 296 (9), 2006)

Publishing: "The business or profession of the commercial production and issuance of literature" (Webster's 3d). It includes the publisher, publication processes, editing and editors. Production may be by conventional printing methods or by electronic publishing.

Databases, Factual: Extensive collections, reputedly complete, of facts and data garnered from material of a specialized subject area and made available for analysis and application. The collection can be automated by various contemporary methods for retrieval. The concept should be differentiated from DATABASES, BIBLIOGRAPHIC which is restricted to collections of bibliographic references.

Treatment Outcome: Evaluation undertaken to assess the results or consequences of management and procedures used in combating disease in order to determine the efficacy, effectiveness, safety, and practicability of these interventions in individual cases or series.

Vocabulary, Controlled: A specified list of terms with a fixed and unalterable meaning, and from which a selection is made when CATALOGING; ABSTRACTING AND INDEXING; or searching BOOKS; JOURNALS AS TOPIC; and other documents. The control is intended to avoid the scattering of related subjects under different headings (SUBJECT HEADINGS). The list may be altered or extended only by the publisher or issuing agency. (From Harrod's Librarians' Glossary, 7th ed, p163)

Risk Factors: An aspect of personal behavior or lifestyle, environmental exposure, or inborn or inherited characteristic, which, on the basis of epidemiologic evidence, is known to be associated with a health-related condition considered important to prevent.

Journal Impact Factor: A quantitative measure of the frequency on average with which articles in a journal have been cited in a given period of time.

Database Management Systems: Software designed to store, manipulate, manage, and control data for specific uses.

Terminology as Topic: The terms, expressions, designations, or symbols used in a particular science, discipline, or specialized subject area.

Software: Sequential operating programs and data which instruct the functioning of a digital computer.

Biomedical Research: Research that involves the application of the natural sciences, especially biology and physiology, to medicine.

Review Literature as Topic: Published materials which provide an examination of recent or current literature. Review articles can cover a wide range of subject matter at various levels of completeness and comprehensiveness based on analyses of literature that may include research findings. The review may reflect the state of the art. It also includes reviews as a literary form.

MedlinePlus: NATIONAL LIBRARY OF MEDICINE service for health professionals and consumers. It links extensive information from the National Institutes of Health and other reviewed sources of information on specific diseases and conditions.

Genetic Predisposition to Disease: A latent susceptibility to disease at the genetic level, which may be activated under certain conditions.

Duplicate Publication as Topic: Simultaneous or successive publishing of identical or near-identical material in two or more different sources without acknowledgment. It differs from reprinted publication in that a reprint cites sources. It differs from PLAGIARISM in that duplicate publication is the product of the same authorship while plagiarism publishes a work or parts of a work of another as one's own.

Clinical Trials as Topic: Works about pre-planned studies of the safety, efficacy, or optimum dosage schedule (if appropriate) of one or more diagnostic, therapeutic, or prophylactic drugs, devices, or techniques selected according to predetermined criteria of eligibility and observed for predefined evidence of favorable and unfavorable effects. This concept includes clinical trials conducted both in the U.S. and in other countries.

Meta-Analysis as Topic: A quantitative method of combining the results of independent studies (usually drawn from the published literature) and synthesizing summaries and conclusions which may be used to evaluate therapeutic effectiveness, plan new studies, etc., with application chiefly in the areas of research and medicine.

Medical Informatics: The field of information science concerned with the analysis and dissemination of medical data through the application of computers to various aspects of health care and medicine.

Subject Headings: Terms or expressions which provide the major means of access by subject to the bibliographic unit.

Algorithms: A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task.

Odds Ratio: The ratio of two odds. The exposure-odds ratio for case control data is the ratio of the odds in favor of exposure among cases to the odds in favor of exposure among noncases. The disease-odds ratio for a cohort or cross section is the ratio of the odds in favor of disease among the exposed to the odds in favor of disease among the unexposed. The prevalence-odds ratio refers to an odds ratio derived cross-sectionally from studies of prevalent cases.

Artificial Intelligence: Theory and development of COMPUTER SYSTEMS which perform tasks that normally require human intelligence. Such tasks may include speech recognition, LEARNING; VISUAL PERCEPTION; MATHEMATICAL COMPUTING; reasoning, PROBLEM SOLVING, DECISION-MAKING, and translation of language.
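
The Odds Ratio entry above can be made concrete with a small worked example. What follows is a minimal sketch, assuming a made-up 2x2 table; the counts, the function name `odds_ratio`, and the variable names are illustrative and do not come from any study cited here. It shows that the exposure-odds ratio and the disease-odds ratio computed from the same table reduce to the same cross-product.

```python
from fractions import Fraction


def odds_ratio(a: int, b: int, c: int, d: int) -> Fraction:
    """Ratio of two odds, (a/b) / (c/d), kept exact with Fraction."""
    return Fraction(a, b) / Fraction(c, d)


# Hypothetical 2x2 table (made-up counts, for illustration only):
#               cases   noncases
# exposed         30       10
# unexposed       70       90
exposed_cases, exposed_noncases = 30, 10
unexposed_cases, unexposed_noncases = 70, 90

# Exposure-odds ratio: odds of exposure among cases vs. among noncases.
exposure_or = odds_ratio(exposed_cases, unexposed_cases,
                         exposed_noncases, unexposed_noncases)

# Disease-odds ratio: odds of disease among the exposed vs. among the unexposed.
disease_or = odds_ratio(exposed_cases, exposed_noncases,
                        unexposed_cases, unexposed_noncases)

# Both reduce to the cross-product (30 * 90) / (70 * 10).
print(exposure_or, disease_or)
```

Using `Fraction` keeps the arithmetic exact, which makes the equality of the two ratios easy to check; a floating-point version would work equally well for reporting.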

Annual MEDLINE/PubMed Year-End Processing (YEP): Background Information

... is loaded into PubMed, and the regular MEDLINE/PubMed update schedule resumes. At this point PubMed uses the new year's MeSH ... The article, Pharmacologic Action Headings: PubMed®, describes changes to PubMed when a Pharmacologic Action term is available. ...

Clinical Query builds upon the decade of work that BIDMC has done to create its Clinical Data Repository (CDR). About 90% of the CDR already uses controlled vocabularies; however, the information contained within the CDR is spread across dozens of databases. Clinical Query consolidates the coded data into a single set of database tables. Bringing the final 10% of data into Clinical Query is an ongoing challenge: some data cannot be easily mapped to the ontologies we are using, and a small but continual stream of uncoded data keeps entering the CDR.

I used the Wikipedia API to fetch my contributions. Creating an article is actually fast thanks to my Firefox/WP extension and an XSLT stylesheet, pubmed2wiki. Biographies are found using this PubMed query

    """Example script showing how to interact with PubMed."""
    # NOTE: legacy script. It targets Python 2 and the old Bio.PubMed module,
    # which has since been removed from Biopython.

    # standard library
    import string

    # biopython
    from Bio import PubMed
    from Bio import Medline

    # do the search and get the ids
    search_term = 'orchid'
    orchid_ids = PubMed.search_for(search_term)
    print orchid_ids

    # access Medline through a dictionary interface that returns PubMed Records
    rec_parser = Medline.RecordParser()
    medline_dict = PubMed.Dictionary(parser = rec_parser)

    # show title, authors and source for the first five hits
    for id in orchid_ids[0:5]:
        cur_record = medline_dict[id]
        print 'title:', string.rstrip(cur_record.title)
        print 'authors:', cur_record.authors
        print 'source:', string.strip(cur_record.source)
        print

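The example script above depends on Python 2 and on the Bio.PubMed interface, which current Biopython releases no longer ship. As a rough equivalent, here is a minimal sketch of the same search-then-fetch flow against the NCBI E-utilities, using only the Python standard library. The helper names (`esearch_url`, `efetch_url`, `parse_medline`, `search_pubmed`) are hypothetical, and the small MEDLINE-format parser is a simplification written for this illustration, not the Biopython parser.

```python
"""Sketch: search PubMed through NCBI E-utilities with the standard library only."""
import json
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, retmax=5):
    """Build an ESearch URL that returns matching PubMed IDs as JSON."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def efetch_url(pmids):
    """Build an EFetch URL that returns the records as MEDLINE-format text."""
    params = {"db": "pubmed", "id": ",".join(pmids),
              "rettype": "medline", "retmode": "text"}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


def parse_medline(text):
    """Parse MEDLINE-format text into one {tag: [values]} dict per record."""
    records, current, tag = [], {}, None
    for line in text.splitlines():
        if not line.strip():                           # blank line ends a record
            if current:
                records.append(current)
            current, tag = {}, None
        elif line[:4].strip() and line[4:6] == "- ":   # new field, e.g. "TI  - ..."
            tag = line[:4].strip()
            current.setdefault(tag, []).append(line[6:].strip())
        elif tag:                                      # continuation of previous field
            current[tag][-1] += " " + line.strip()
    if current:
        records.append(current)
    return records


def search_pubmed(term, retmax=5):
    """Network call: search for `term`, then fetch and parse the first records."""
    with urlopen(esearch_url(term, retmax)) as resp:
        pmids = json.load(resp)["esearchresult"]["idlist"]
    with urlopen(efetch_url(pmids)) as resp:
        return parse_medline(resp.read().decode("utf-8"))


# Offline demonstration of the parser on a tiny hand-written sample.
SAMPLE = "PMID- 1\nTI  - A long\n      title\nAU  - Doe J\nAU  - Roe R\n"
for rec in parse_medline(SAMPLE):
    print("title:", " ".join(rec.get("TI", [])))
    print("authors:", rec.get("AU", []))
```

Calling `search_pubmed("orchid")` would perform the two network requests and return records whose `TI`, `AU`, and `SO` fields correspond to the title, authors, and source printed by the original script. NCBI asks E-utilities clients to limit their request rate and to identify themselves, so a real script would likely add those parameters.
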

Edifix uses data from PubMed, a free resource that is developed and maintained by the National Center for Biotechnology Information.

(1/562) Information extraction in molecular biology.

Information extraction has become a very active field in bioinformatics recently and a number of interesting papers have been published. Most of the efforts have been concentrated on a few specific problems, such as the detection of protein-protein interactions and the analysis of DNA expression arrays, although it is obvious that there are many other interesting areas of potential application (document retrieval, protein functional description, and detection of disease-related genes to name a few). Paradoxically, these exciting developments have not yet crystallised into general agreement on a set of standard evaluation criteria, such as the ones developed in fields such as protein structure prediction, which makes it very difficult to compare performance across these different systems. In this review we introduce the general field of information extraction, we outline the status of the applications in molecular biology, and we then discuss some ideas about possible standards for evaluation that are needed for the future development of the field.

(2/562) Predicting transcription factor synergism.

Transcriptional regulation is mediated by a battery of transcription factor (TF) proteins that form complexes involving protein-protein and protein-DNA interactions. Individual TFs bind to their cognate cis-elements or transcription factor-binding sites (TFBS). TFBS are organized on the DNA proximal to the gene in groups confined to a few hundred base pair regions. These groups are referred to as modules. Various modules work together to provide the combinatorial regulation of gene transcription in response to various developmental and environmental conditions. The sets of modules constitute a promoter model. Determining the TFs that preferentially work in concert as part of a module is an essential component of understanding transcriptional regulation. The TFs that act synergistically in such a fashion are likely to have their cis-elements co-localized on the genome at specific distances apart. We exploit this notion to predict TF pairs that are likely to be part of a transcriptional module on the human genome sequence. The computational method is validated statistically, using known interacting pairs extracted from the literature. There are 251 TFBS pairs up to 50 bp apart and 70 TFBS pairs up to 200 bp apart that score higher than any of the known synergistic pairs. Further investigation of 50 pairs randomly selected from each of these two sets using PubMed queries provided additional supporting evidence from the existing biological literature suggesting TF synergism for these novel pairs.

(3/562) An intelligent biological information management system.

MOTIVATION: As biomedical researchers are amassing a plethora of information in a variety of forms resulting from the advancements in biomedical research, there is a critical need for innovative information management and knowledge discovery tools to sift through these vast volumes of heterogeneous data and analysis tools. In this paper we present a general model for an information management system that is adaptable and scalable, followed by a detailed design and implementation of one component of the model. The prototype, called BioSifter, was applied to problems in the bioinformatics area. RESULTS: BioSifter was tested using 500 documents obtained from the PubMed database on two biological problems related to genetic polymorphism and extracorporeal shockwave lithotripsy. The results indicate that BioSifter is a powerful tool for biological researchers to automatically retrieve relevant text documents from biological literature based on their interest profile. The results also indicate that the first stage of the information management process, i.e. data to information transformation, significantly reduces the size of the information space. The filtered data obtained through BioSifter is relevant as well as much smaller in dimension compared to all the retrieved data. This would in turn significantly reduce the complexity associated with the next level transformation, i.e. information to knowledge.

(4/562) Identifying diagnostic studies in MEDLINE: reducing the number needed to read.

OBJECTIVES: The search filters in PubMed have become a cornerstone in information retrieval in evidence-based practice. However, the filter for diagnostic studies is not fully satisfactory, because sensitive searches have low precision. The objective of this study was to construct and validate better search strategies to identify diagnostic articles recorded on MEDLINE with special emphasis on precision. DESIGN: A comparative, retrospective analysis was conducted. Four medical journals were hand-searched for diagnostic studies published in 1989 and 1994. Four other journals were hand-searched for 1999. The three sets of studies identified were used as gold standards. A new search strategy was constructed and tested using the 1989 subset of studies and validated in both the 1994 and 1999 subsets. We identified candidate text words for search strategies using a word frequency analysis of the abstracts. According to the frequency of identified terms, searches were run for each term independently. The sensitivity, precision, and number needed to read (1/precision) of every candidate term were calculated. Terms with the highest sensitivity × precision product were used as free text terms in combination with the MeSH term "SENSITIVITY AND SPECIFICITY" using the Boolean operator OR. In the 1994 and 1999 subsets, we performed head-to-head comparisons of the currently available PubMed filter with the one we developed. MEASUREMENTS: The sensitivity, precision and the number needed to read (1/precision) were measured for different search filters. RESULTS: The most frequently occurring three truncated terms (diagnos*; predict* and accura*) in combination with the MeSH term "SENSITIVITY AND SPECIFICITY" produced a sensitivity of 98.1% (95% confidence interval: 89.9-99.9%) and a number needed to read of 8.3 (95% confidence interval: 6.7-11.3). In direct comparisons of the new filter with the currently available one in PubMed using the 1994 and 1999 subsets, the new filter achieved better precision (12.0% versus 8.2% in 1994 and 5.0% versus 4.3% in 1999). The 95% confidence intervals for the differences range from 0.05% to 7.5% (p = 0.041) and -1.0% to 2.3% (p = 0.45), respectively. The new filter achieved slightly better sensitivities than the currently available one in both subsets, namely 98.1 and 96.1% (p = 0.32) versus 95.1 and 88.8% (p = 0.125). CONCLUSIONS: The quoted performance of the currently available filter for diagnostic studies in PubMed may be overstated. It appears that even single external validation may lead to overoptimistic views of a filter's performance. Precision appears to be more unstable than sensitivity. In terms of sensitivity, our filter for diagnostic studies performed slightly better than the currently available one and it performed better with regard to precision in the 1994 subset. Additional research is required to determine whether these improvements are beneficial to searches in practice.
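
The three measures used in this abstract follow directly from counts of retrieved and relevant articles, with number needed to read defined as 1/precision. Below is a minimal sketch under stated assumptions: the counts are hypothetical (not the study's data), the function name `search_metrics` is illustrative, and the `[MeSH Terms]` field tag is an assumption about how the combined query would be typed into PubMed.

```python
def search_metrics(relevant_retrieved, total_retrieved, total_relevant):
    """Return (sensitivity, precision, number needed to read) for one search filter."""
    sensitivity = relevant_retrieved / total_relevant   # share of gold-standard articles found
    precision = relevant_retrieved / total_retrieved    # share of retrieved articles that are relevant
    number_needed_to_read = 1 / precision               # articles read per relevant article found
    return sensitivity, precision, number_needed_to_read


# Free-text terms from the abstract, ORed with the MeSH term.
# The [MeSH Terms] field tag is an assumption, not quoted from the study.
terms = ["diagnos*", "predict*", "accura*"]
query = " OR ".join(terms + ['"sensitivity and specificity"[MeSH Terms]'])

# Hypothetical counts: 50 gold-standard articles, 400 retrieved, 48 of them relevant.
sens, prec, nnr = search_metrics(relevant_retrieved=48,
                                 total_retrieved=400,
                                 total_relevant=50)
print(query)
print(f"sensitivity={sens:.1%} precision={prec:.1%} NNR={nnr:.1f}")
```

With these made-up counts, a precision of 12% corresponds to roughly 8.3 articles read per relevant article, which illustrates why small changes in precision move the number needed to read noticeably.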

(5/562) Use of the Internet and information technology for surgeons and surgical research.

The recent, and extensive, expansion in the use of computers and the Internet offers great potential for benefit in surgical research and, increasingly, surgical practice. However, in addition to the usefulness of information technology, much time can be spent achieving little and the potential missed because of the complexity and excess of information available. In this article, we examine some useful areas relevant to surgeons and surgical research, such as Internet service provision and E-mail, databases, medical Websites, and potential future directions.

(6/562) Using LOINC to link an EMR to the pertinent paragraph in a structured reference knowledge base.

Intermountain Health Care has integrated the electronic medical record (EMR) with online information resources in order to create easy access to a knowledge base which practicing physicians can use at the point of care. When a user is reviewing problems/diagnoses, medications, or clinical laboratory test results, they can conveniently access a "pertinent paragraph" of reference literature that pertains to the clinical data in the EMR. Using terminology first coined by Cimino [1], we call this application the "infobutton." We describe the architectural issues involved in linking our electronic medical record with a structured laboratory knowledge base. The application has been well received as noted by anecdotal comments made by physicians and usage of the application.

(7/562) Finding UMLS Metathesaurus concepts in MEDLINE.

The entire collection of 11.5 million MEDLINE abstracts was processed to extract 549 million noun phrases using a shallow syntactic parser. English language strings in the 2002 and 2001 releases of the UMLS Metathesaurus were then matched against these phrases using flexible matching techniques. 34% of the Metathesaurus names (occurring in 30% of the concepts) were found in the titles and abstracts of articles in the literature. The matching concepts are fairly evenly chemical and non-chemical in nature and span a wide spectrum of semantic types. This paper details the approach taken and the results of the analysis.
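
The abstract does not say which flexible matching techniques were used, so the following is only a sketch of one simple possibility: normalising both vocabulary names and extracted noun phrases (lowercasing, dropping punctuation, ignoring word order) before comparing them. The function names and the tiny data sets are hypothetical and chosen purely for illustration.

```python
import re


def normalize(text):
    """Lowercase, drop punctuation, and sort the words so order does not matter."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(sorted(words))


def fraction_found(names, phrases):
    """Return the share of vocabulary names that match some phrase, plus the matches."""
    phrase_keys = {normalize(p) for p in phrases}
    found = [n for n in names if normalize(n) in phrase_keys]
    return len(found) / len(names), found


# Hypothetical vocabulary names and noun phrases (not taken from the paper).
names = ["Myocardial Infarction", "Infarction, Myocardial", "Heart Attack", "Aspirin"]
phrases = ["myocardial infarction", "aspirin", "the patient"]

share, matched = fraction_found(names, phrases)
print(f"{share:.0%} of names found:", matched)
```

Normalising into a set of keys keeps each lookup cheap, which matters when the phrase collection is as large as the one described; a real system would likely add stemming or other variant generation on top of this.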

(8/562) A literature-based method for assessing the functional coherence of a gene group.

MOTIVATION: Many experimental and algorithmic approaches in biology generate groups of genes that need to be examined for related functional properties. For example, gene expression profiles are frequently organized into clusters of genes that may share functional properties. We evaluate a method, neighbor divergence per gene (NDPG), that uses scientific literature to assess whether a group of genes is functionally related. The method requires only a corpus of documents and an index connecting the documents to genes. RESULTS: We evaluate NDPG on 2796 functional groups generated by the Gene Ontology consortium in four organisms: mouse, fly, worm and yeast. NDPG finds functional coherence in 96, 92, 82 and 45% of the groups (at 99.9% specificity) in yeast, mouse, fly and worm respectively.


  • These maintained citations, along with new citations, are then available to MEDLINE data distribution and PubMed users daily.
  • During this year-end processing time, the schedule for adding citations newly indexed with MeSH to MEDLINE/PubMed is temporarily interrupted, and no newly indexed MEDLINE citations are added to PubMed.
  • When the year-end processing activities are complete, the new version of MEDLINE, with the new year's MeSH Headings, is loaded into PubMed, and the regular MEDLINE/PubMed update schedule resumes.
  • At this point PubMed uses the new year's MeSH vocabulary in the MeSH translation tables and the MeSH database.