Abbreviations as Topic: Shortened forms of written words or phrases used for brevity.
Dictionaries as Topic: Lists of words, usually in alphabetical order, giving information about form, pronunciation, etymology, grammar, and meaning.
Abbreviations: Works consisting of lists of shortened forms of written words or phrases used for brevity. Acronyms are included here.
Natural Language Processing: Computer processing of a language with rules that reflect and describe current usage rather than prescribed usage.
Unified Medical Language System: A research and development program initiated by the NATIONAL LIBRARY OF MEDICINE to build knowledge sources for the purpose of aiding the development of systems that help health professionals retrieve and integrate biomedical information. The knowledge sources can be used to link disparate information systems to overcome retrieval problems caused by differences in terminology and the scattering of relevant information across many databases. The three knowledge sources are the Metathesaurus, the Semantic Network, and the Specialist Lexicon.
MEDLINE: The premier bibliographic database of the NATIONAL LIBRARY OF MEDICINE. MEDLINE® (MEDLARS Online) is the primary subset of PUBMED and can be searched on NLM's Web site in PubMed or the NLM Gateway. MEDLINE references are indexed with MEDICAL SUBJECT HEADINGS (MeSH).
Terminology as Topic: The terms, expressions, designations, or symbols used in a particular science, discipline, or specialized subject area.
Abstracting and Indexing as Topic: Activities performed to identify concepts and aspects of published information and research reports.
Databases, Bibliographic: Extensive collections, reputedly complete, of references and citations to books, articles, publications, etc., generally on a single subject or specialized subject area. Databases can operate through automated files, libraries, or computer disks. The concept should be differentiated from DATABASES, FACTUAL which is used for collections of data and facts apart from bibliographic references to them.
Subject Headings: Terms or expressions which provide the major means of access by subject to the bibliographic unit.
Pattern Recognition, Automated: In INFORMATION RETRIEVAL, machine-sensing or identification of visible patterns (shapes, forms, and configurations). (Harrod's Librarians' Glossary, 7th ed)
Information Storage and Retrieval: Organized activities related to the storage, location, search, and retrieval of information.
Databases as Topic: Organized collections of computer records, standardized in format and content, that are stored in any of a variety of computer-readable modes. They are the basic sets of data from which computer-readable files are created. (from ALA Glossary of Library and Information Science, 1983)
Biological Ontologies: Structured vocabularies describing concepts from the fields of biology and relationships between concepts.
Acronym: An acronym is an abbreviation used as a word which is formed from the initial components in a phrase or a word. Usually these components are individual letters (as in NATO or laser) or parts of words or names (as in Benelux).
Dragomir R. Radev: Dragomir R. Radev is a University of Michigan computer science professor and Columbia University computer science adjunct professor working on natural language processing and information retrieval.
Statutory auditor: Statutory auditor is a title used in various countries to refer to a person or entity with an auditing role, whose appointment is mandated by the terms of a statute.
International Committee on Aeronautical Fatigue and Structural Integrity
Conference and Labs of the Evaluation Forum: The Conference and Labs of the Evaluation Forum (formerly Cross-Language Evaluation Forum), or CLEF, is an organization promoting research in multilingual information access (currently focusing on European languages). Its specific functions are to maintain an underlying framework for testing information retrieval systems and to create repositories of data for researchers to use in developing comparable standards.
Mouse Phenome Database: The Mouse Phenome Database (MPD) is a web-accessible database of strain characterization data for the laboratory mouse, to facilitate translational research for human health and disease. MPD characterizes phenotype as well as genotype, and provides tools for online analysis.
(1/68) Acronymophilia: an update.
The history, epidemiology, clinical features, and treatment of the epidemic infection, acronymophilia, a sinister scourge of modern medicine, are described.
(2/68) Spirituality in history taking.
Andrew Taylor Still, MD, DO, included in his founding postulates of osteopathy the concept that a patient's health includes the health of a patient's spirit. In the recent past, medicine as a whole, and osteopathic medicine specifically, has neglected this postulate. Recent research has confirmed the validity of Still's postulate, and many medical training institutions have received grants and established programs to incorporate spirituality into their curriculum. As with any patient evaluation, the history and physical examination is the starting platform. This article describes several tools that can be easily incorporated into the history and physical examination, along with some of the obstacles in evaluating the health of the patient's spirit.
(3/68) A study of abbreviations in the UMLS.
Abbreviations are widely used in medicine. The understanding of abbreviations is important for medical language processing and information retrieval systems. The Unified Medical Language System (UMLS) contains a large number of abbreviations. We hypothesized that extracting and studying the UMLS abbreviations can be helpful for understanding the characteristics of abbreviations in medicine. In this paper, we describe a method for extracting abbreviations from the UMLS. We evaluated the method and studied the ambiguous nature of the abbreviations. In addition, the coverage of the UMLS abbreviations in medical reports was studied. Using our method, we extracted 163,666 unique (abbreviation, full form) pairs from the UMLS with a precision of 97.5%, and a recall of 96%. The UMLS abbreviations were highly ambiguous: 33.1% of abbreviations with six characters or less had multiple meanings; the average number of different full forms for all abbreviations with six characters or less was 2.28. The coverage of the UMLS abbreviations in medical reports was over 66%.
(4/68) Mapping abbreviations to full forms in biomedical articles.
OBJECTIVE: To develop methods that automatically map abbreviations to their full forms in biomedical articles. METHODS: The authors developed two methods of mapping defined and undefined abbreviations (defined abbreviations are paired with their full forms in the articles, whereas undefined ones are not). For defined abbreviations, they developed a set of pattern-matching rules to map an abbreviation to its full form and implemented the rules into a software program, AbbRE (for "abbreviation recognition and extraction"). Using the opinions of domain experts as a reference standard, they evaluated the recall and precision of AbbRE for defined abbreviations in ten biomedical articles randomly selected from the ten most frequently cited medical and biological journals. They also measured the percentage of undefined abbreviations in the same set of articles, and they investigated whether they could map undefined abbreviations to any of four public abbreviation databases (GenBank LocusLink, SWISSPROT, LRABR of the UMLS Specialist Lexicon, and BioABACUS). RESULTS: AbbRE had an average 0.70 recall and 0.95 precision for the defined abbreviations. The authors found that an average of 25 percent of abbreviations were defined in biomedical articles and that of a randomly selected subset of undefined abbreviations, 68 percent could be mapped to any of four abbreviation databases. They also found that many abbreviations are ambiguous (i.e., they map to more than one full form in abbreviation databases). CONCLUSION: AbbRE is efficient for mapping defined abbreviations. To couple AbbRE with abbreviation databases for the mapping of undefined abbreviations, not only exhaustive abbreviation databases but also a method to resolve the ambiguity of abbreviations in the databases are needed.
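The abstract above does not list AbbRE's actual pattern-matching rules, so the following is only a minimal sketch of one plausible rule of that kind: a parenthesized abbreviation whose letters match the initials of the words immediately before it. The function names `find_defined_abbreviations` and `precision_recall` and the sample sentence are illustrative assumptions, not part of AbbRE.

```python
import re


def find_defined_abbreviations(text):
    """Return (abbreviation, full form) pairs for the pattern 'full form (ABBR)'.

    Hypothetical single rule: the i-th character of the abbreviation must equal
    the first letter of the i-th of the words directly preceding the parenthesis.
    """
    pairs = []
    # Candidate abbreviation: 2 to 10 letters/digits, no spaces, in parentheses.
    for match in re.finditer(r"\(([A-Za-z][A-Za-z0-9]{1,9})\)", text):
        abbr = match.group(1)
        preceding = re.findall(r"[A-Za-z0-9]+", text[:match.start()])
        if len(preceding) < len(abbr):
            continue
        window = preceding[-len(abbr):]
        if all(word[0].lower() == ch.lower() for word, ch in zip(window, abbr)):
            pairs.append((abbr, " ".join(window)))
    return pairs


def precision_recall(predicted, reference):
    """Precision and recall of predicted pairs against a reference standard."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Invented sample text for illustration only.
sample = (
    "Patients with chronic obstructive pulmonary disease (COPD) were enrolled. "
    "The Unified Medical Language System (UMLS) was used (see below)."
)
print(find_defined_abbreviations(sample))
```

A single initial-letter rule like this misses full forms that skip function words or use inner letters, which is one reason a rule set of this style can reach high precision while leaving recall lower.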
(5/68) Tagging gene and protein names in biomedical text.
MOTIVATION: The MEDLINE database of biomedical abstracts contains scientific knowledge about thousands of interacting genes and proteins. Automated text processing can aid in the comprehension and synthesis of this valuable information. The fundamental task of identifying gene and protein names is a necessary first step towards making full use of the information encoded in biomedical text. This remains a challenging task due to the irregularities and ambiguities in gene and protein nomenclature. We propose to approach the detection of gene and protein names in scientific abstracts as part-of-speech tagging, the most basic form of linguistic corpus annotation. RESULTS: We present a method for tagging gene and protein names in biomedical text using a combination of statistical and knowledge-based strategies. This method incorporates automatically generated rules from a transformation-based part-of-speech tagger, and manually generated rules from morphological clues, low frequency trigrams, indicator terms, suffixes and part-of-speech information. Results of an experiment on a test corpus of 56K MEDLINE documents demonstrate that our method to extract gene and protein names can be applied to large sets of MEDLINE abstracts, without the need for special conditions or human experts to predetermine relevant subsets. AVAILABILITY: The programs are available on request from the authors.
(6/68) Creating an online dictionary of abbreviations from MEDLINE.
OBJECTIVE: The growth of the biomedical literature presents special challenges for both human readers and automatic algorithms. One such challenge derives from the common and uncontrolled use of abbreviations in the literature. Each additional abbreviation increases the effective size of the vocabulary for a field. Therefore, to create an automatically generated and maintained lexicon of abbreviations, we have developed an algorithm to match abbreviations in text with their expansions. DESIGN: Our method uses a statistical learning algorithm, logistic regression, to score abbreviation expansions based on their resemblance to a training set of human-annotated abbreviations. We applied it to Medstract, a corpus of MEDLINE abstracts in which abbreviations and their expansions have been manually annotated. We then ran the algorithm on all abstracts in MEDLINE, creating a dictionary of biomedical abbreviations. To test the coverage of the database, we used an independently created list of abbreviations from the China Medical Tribune. MEASUREMENTS: We measured the recall and precision of the algorithm in identifying abbreviations from the Medstract corpus. We also measured the recall when searching for abbreviations from the China Medical Tribune against the database. RESULTS: On the Medstract corpus, our algorithm achieves up to 83% recall at 80% precision. Applying the algorithm to all of MEDLINE yielded a database of 781,632 high-scoring abbreviations. Of all the abbreviations in the list from the China Medical Tribune, 88% were in the database. CONCLUSION: We have developed an algorithm to identify abbreviations from text. We are making this available as a public abbreviation server at http://abbreviation.stanford.edu/.
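The abstract above says candidate expansions are scored with logistic regression but does not give the features or the learned weights. The sketch below shows only the general shape of such a scorer: the two features, the `WEIGHTS` and `BIAS` values, and the function names are hypothetical and hand-set for illustration, whereas the published method learns its parameters from human-annotated abbreviations.

```python
import math


def features(abbr, expansion):
    """Two illustrative features (assumptions, not the published feature set)."""
    letters = abbr.lower()
    words = expansion.lower().split()
    if not letters or not words:
        return [0.0, 0.0]
    joined = " ".join(words)
    # Feature 1: fraction of abbreviation letters found, in order, in the expansion.
    position, matched = 0, 0
    for ch in letters:
        index = joined.find(ch, position)
        if index != -1:
            matched += 1
            position = index + 1
    in_order = matched / len(letters)
    # Feature 2: fraction of expansion words whose initial occurs in the abbreviation.
    initials = sum(1 for word in words if word[0] in letters) / len(words)
    return [in_order, initials]


# Hypothetical parameters; a real system would fit these on annotated training data.
WEIGHTS = [4.0, 3.0]
BIAS = -5.0


def score(abbr, expansion):
    """Logistic (sigmoid) score in (0, 1) for an abbreviation/expansion candidate."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features(abbr, expansion)))
    return 1.0 / (1.0 + math.exp(-z))


print(score("MRI", "magnetic resonance imaging"))
print(score("MRI", "patients underwent"))
```

Candidates above a chosen score threshold would be kept; moving that threshold is what trades recall against precision.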
(7/68) Automatic resolution of ambiguous terms based on machine learning and conceptual relations in the UMLS.
MOTIVATION: The UMLS has been used in natural language processing applications such as information retrieval and information extraction systems. The mapping of free-text to UMLS concepts is important for these applications. To improve the mapping, we need a method to disambiguate terms that possess multiple UMLS concepts. In the general English domain, machine-learning techniques have been applied to sense-tagged corpora, in which senses (or concepts) of ambiguous terms have been annotated (mostly manually). Sense disambiguation classifiers are then derived to determine senses (or concepts) of those ambiguous terms automatically. However, manual annotation of a corpus is an expensive task. We propose an automatic method that constructs sense-tagged corpora for ambiguous terms in the UMLS using MEDLINE abstracts. METHODS: For a term W that represents multiple UMLS concepts, a collection of MEDLINE abstracts that contain W is extracted. For each abstract in the collection, occurrences of concepts that have relations with W as defined in the UMLS are automatically identified. A sense-tagged corpus, in which senses of W are annotated, is then derived based on those identified concepts. The method was evaluated on a set of 35 frequently occurring ambiguous biomedical abbreviations using a gold standard set that was automatically derived. The quality of the derived sense-tagged corpus was measured using precision and recall. RESULTS: The derived sense-tagged corpus had an overall precision of 92.9% and an overall recall of 47.4%. After removing rare senses and ignoring abbreviations with closely related senses, the overall precision was 96.8% and the overall recall was 50.6%. CONCLUSIONS: UMLS conceptual relations and MEDLINE abstracts can be used to automatically acquire knowledge needed for resolving ambiguity when mapping free-text to UMLS concepts.
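The METHODS steps above can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the related terms are a small hand-written dictionary standing in for UMLS relations, matching is plain whole-word search, and the rule that an abstract is tagged only when exactly one sense is supported is an assumption made here. The name `tag_abstracts` and the sample abstracts are invented.

```python
import re


def contains(word, text):
    """Whole-word, case-insensitive search."""
    return re.search(r"\b" + re.escape(word) + r"\b", text, re.IGNORECASE) is not None


def tag_abstracts(term, related_terms_by_sense, abstracts):
    """Build a sense-tagged corpus for an ambiguous term.

    An abstract is tagged with a sense when it contains the term and terms
    related to exactly one sense (assumed tie-breaking rule).
    """
    tagged = []
    for abstract in abstracts:
        if not contains(term, abstract):
            continue
        supported = [
            sense
            for sense, related in related_terms_by_sense.items()
            if any(contains(r, abstract) for r in related)
        ]
        if len(supported) == 1:
            tagged.append((abstract, supported[0]))
    return tagged


# Toy stand-in for UMLS relations (hypothetical data).
related = {
    "principal component analysis": ["eigenvalue", "variance"],
    "patient-controlled analgesia": ["morphine", "pain"],
}
abstracts = [
    "PCA reduced the variance structure of the expression data.",
    "Postoperative pain was managed with PCA using morphine.",
    "PCA was applied.",
    "Pain scores were summarized with PCA to explain variance.",
]
for text, sense in tag_abstracts("PCA", related, abstracts):
    print(sense, "|", text)
```

In this sketch, abstracts with no supporting related term, or with support for more than one sense, stay untagged, so the derived corpus covers only part of the occurrences.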
(8/68) A study of abbreviations in MEDLINE abstracts.
Abbreviations are widely used in writing, and the understanding of abbreviations is important for natural language processing applications. Abbreviations are not always defined in a document and they are highly ambiguous. A knowledge base that consists of abbreviations with their associated senses and a method to resolve the ambiguities are needed. In this paper, we studied the UMLS coverage, textual variants of senses, and the ambiguity of abbreviations in MEDLINE abstracts. We restricted our study to three-letter abbreviations which were defined using parenthetical expressions. When grouping similar expansions together and representing senses using groups, we found that after ignoring senses where the total number of occurrences within the corresponding group was less than 100, 82.8% of the senses matched the UMLS, covered over 93% of occurrences that were considered, and had an average of 7.74 expansions for each sense. Abbreviations are highly ambiguous: 81.2% of the abbreviations were ambiguous, and had an average of 16.6 senses. However, after ignoring senses with occurrences of less than 5, 64.6% of the abbreviations were ambiguous, and had an average of 4.91 senses.
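To make the effect of the occurrence threshold concrete, here is a small sketch of how ambiguity figures of this kind could be computed from per-sense occurrence counts. The counts are invented toy data, the name `ambiguity_stats` is hypothetical, and averaging the number of senses over all retained abbreviations is an assumption, since the abstract does not say which abbreviations the average covers.

```python
def ambiguity_stats(sense_counts, min_occurrences=1):
    """Return (fraction of ambiguous abbreviations, mean senses per abbreviation).

    Senses with fewer than min_occurrences occurrences are ignored, and
    abbreviations left with no sense are dropped before computing the figures.
    """
    kept = {}
    for abbr, senses in sense_counts.items():
        frequent = {s: n for s, n in senses.items() if n >= min_occurrences}
        if frequent:
            kept[abbr] = frequent
    if not kept:
        return 0.0, 0.0
    ambiguous = sum(1 for senses in kept.values() if len(senses) > 1)
    mean_senses = sum(len(senses) for senses in kept.values()) / len(kept)
    return ambiguous / len(kept), mean_senses


# Invented counts for illustration; not taken from the study.
toy_counts = {
    "PCA": {
        "principal component analysis": 120,
        "patient-controlled analgesia": 40,
        "posterior cerebral artery": 3,
    },
    "MRI": {"magnetic resonance imaging": 500},
    "CAT": {
        "chloramphenicol acetyltransferase": 60,
        "computerized axial tomography": 2,
    },
}
print(ambiguity_stats(toy_counts))
print(ambiguity_stats(toy_counts, min_occurrences=5))
```

Raising `min_occurrences` removes rare senses, which lowers both the share of ambiguous abbreviations and the average number of senses, the same direction of change the abstract reports.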