A UMLS-based knowledge acquisition tool for rule-based clinical decision support system development. (41/525)

Decision support systems in the medical field have to be easily modified by medical experts themselves. The authors have designed a knowledge acquisition tool to facilitate the creation and maintenance of a knowledge base by the domain expert and its sharing and reuse by other institutions. The Unified Medical Language System (UMLS) contains the domain entities and constitutes the relations repository from which the expert builds, through a specific browser, the explicit domain ontology. The expert is then guided in creating the knowledge base according to the pre-established domain ontology and condition-action rule templates that are well adapted to several clinical decision-making processes. Corresponding medical logic modules are eventually generated. The application of this knowledge acquisition tool to the construction of a decision support system in blood transfusion demonstrates the value of such a pragmatic methodology for the design of rule-based clinical systems that rely on the highly progressive knowledge embedded in hospital information systems.

Use of general-purpose negation detection to augment concept indexing of medical documents: a quantitative study using the UMLS. (42/525)

OBJECTIVES: To test the hypothesis that most instances of negated concepts in dictated medical documents can be detected by a strategy that relies on tools developed for the parsing of formal (computer) languages: specifically, a lexical scanner ("lexer") that uses regular expressions to generate a finite state machine, and a parser that relies on a restricted subset of context-free grammars, known as LALR(1) grammars. METHODS: A diverse training set of 40 medical documents from a variety of specialties was manually inspected and used to develop a program (Negfinder) that contained rules to recognize a large set of negated patterns occurring in the text. Negfinder's lexer and parser were developed using tools normally used to generate programming language compilers. The input to Negfinder consisted of medical narrative that was preprocessed to recognize UMLS concepts: the text of a recognized concept had been replaced with a coded representation that included its UMLS concept ID. The program generated an index with one entry per instance of a concept in the document, where the presence or absence of negation of that concept was recorded. This information was used to mark up the text of each document by color-coding it to make it easier to inspect. The parser was then evaluated in two ways: 1) a test set of 60 documents (30 discharge summaries, 30 surgical notes) marked up by Negfinder was inspected visually to quantify false-positive and false-negative results; and 2) a different test set of 10 documents was independently examined for negatives by a human observer and by Negfinder, and the results were compared. RESULTS: In the first evaluation using marked-up documents, 8,358 instances of UMLS concepts were detected in the 60 documents, of which 544 were negations detected by the program and verified by human observation (true-positive results, or TPs). Thirteen instances were wrongly flagged as negated (false-positive results, or FPs), and the program missed 27 instances of negation (false-negative results, or FNs), yielding a sensitivity of 95.3 percent and a specificity of 97.7 percent. In the second evaluation using independent negation detection, 1,869 concepts were detected in 10 documents, with 135 TPs, 12 FPs, and 6 FNs, yielding a sensitivity of 95.7 percent and a specificity of 91.8 percent. One of the words "no," "denies/denied," "not," or "without" was present in 92.5 percent of all negations. CONCLUSIONS: Negation of most concepts in medical narrative can be reliably detected by a simple strategy. The reliability of detection depends on several factors, the most important being the accuracy of concept matching.
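
The percentages in the Negfinder abstract can be rechecked from the reported counts. The Python sketch below is an assumption about how the figures were derived, not the authors' code. Sensitivity is taken as TP / (TP + FN). The values reported as specificity coincide numerically with TP / (TP + FP), a ratio more often called positive predictive value, so the second function computes that ratio; the function and variable names are illustrative.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true negations that the program caught: TP / (TP + FN)."""
    return tp / (tp + fn)


def positive_predictive_value(tp: int, fp: int) -> float:
    """Fraction of flagged negations that were correct: TP / (TP + FP)."""
    return tp / (tp + fp)


# Counts as reported in the abstract for the two evaluations.
evaluations = {
    "marked-up (60 documents)": {"tp": 544, "fp": 13, "fn": 27},
    "independent (10 documents)": {"tp": 135, "fp": 12, "fn": 6},
}

for name, c in evaluations.items():
    sens = 100 * sensitivity(c["tp"], c["fn"])
    ppv = 100 * positive_predictive_value(c["tp"], c["fp"])
    print(f"{name}: sensitivity {sens:.1f}%, TP/(TP+FP) {ppv:.1f}%")
```

Run as written, the loop reproduces 95.3 and 97.7 percent for the first evaluation and 95.7 and 91.8 percent for the second.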

Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program. (43/525)

The UMLS Metathesaurus, the largest thesaurus in the biomedical domain, provides a representation of biomedical knowledge consisting of concepts classified by semantic type and both hierarchical and non-hierarchical relationships among the concepts. This knowledge has proved useful for many applications including decision support systems, management of patient records, information retrieval (IR) and data mining. Gaining effective access to the knowledge is critical to the success of these applications. This paper describes MetaMap, a program developed at the National Library of Medicine (NLM) to map biomedical text to the Metathesaurus or, equivalently, to discover Metathesaurus concepts referred to in text. MetaMap uses a knowledge intensive approach based on symbolic, natural language processing (NLP) and computational linguistic techniques. Besides being applied for both IR and data mining applications, MetaMap is one of the foundations of NLM's Indexing Initiative System which is being applied to both semi-automatic and fully automatic indexing of the biomedical literature at the library.

Circular hierarchical relationships in the UMLS: etiology, diagnosis, treatment, complications and prevention. (44/525)

The Unified Medical Language System (UMLS) is a large repository of some 800,000 concepts for the biomedical domain, organized by several millions of inter-concept relationships, either inherited from the source vocabularies, or specifically generated. This paper focuses on hierarchical relationships in the UMLS Metathesaurus, and especially, on circular hierarchical relationships. Using the metaphor of a disease, we first analyze the causal mechanisms for circular hierarchical relationships. Then, we discuss methods to identify and remove these relationships. Finally, we briefly discuss the consequences of these relationships for applications based on the UMLS, and we propose some prevention measures.
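
The abstract does not say how circular hierarchical relationships are identified. One generic way to find them, shown here only as a sketch, is a depth-first search over child-to-parent edges that reports a cycle when it reaches a concept already on the current path. The function name, the dictionary layout, and the identifiers C1 to C4 are hypothetical and are not real UMLS concept IDs or the paper's method.

```python
from typing import Dict, List, Optional


def find_cycle(parents: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of concept IDs, or None if there is none.

    `parents` maps a concept to its direct parents (child -> parent edges).
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        path.append(node)
        for parent in parents.get(node, []):
            state = color.get(parent, WHITE)
            if state == GREY:
                # Back edge: the parent is already on the current path.
                return path[path.index(parent):] + [parent]
            if state == WHITE:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for start in list(parents):
        if color.get(start, WHITE) == WHITE:
            found = visit(start)
            if found:
                return found
    return None


# Hypothetical identifiers, not real UMLS concept IDs.
toy = {"C1": ["C2"], "C2": ["C3"], "C3": ["C1"], "C4": ["C1"]}
print(find_cycle(toy))  # ['C1', 'C2', 'C3', 'C1']
```

The recursive form keeps the sketch short; a hierarchy with hundreds of thousands of concepts would likely need an iterative traversal to stay within Python's recursion limit.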

Evaluation of negation phrases in narrative clinical reports. (45/525)

OBJECTIVE: Automatically identifying findings or diseases described in clinical textual reports requires determining whether clinical observations are present or absent. We evaluate the use of negation phrases and the frequency of negation in free-text clinical reports. METHODS: A simple negation algorithm was applied to ten types of clinical reports (n=42,160) dictated during July 2000. We counted how often each of 66 negation phrases was used to mark a clinical observation as absent. Physicians read a random sample of 400 sentences, and precision was calculated for the negation phrases. We measured what proportion of clinical observations were marked as absent. RESULTS: The negation algorithm was triggered by sixty negation phrases with just seven of the phrases accounting for 90% of the negations. The negation phrases received an overall precision of 97%, with "not" earning the lowest precision of 63%. Between 39% and 83% of all clinical observations were identified as absent by the negation algorithm, depending on the type of report analyzed. The most frequently used clinical observations were negated the majority of the time. CONCLUSION: Because clinical observations in textual patient records are frequently negated, identifying accurate negation phrases is important to any system processing these reports.
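
The abstract does not describe the negation algorithm or list its 66 phrases. As a rough illustration of what a phrase-triggered rule can look like, the Python sketch below marks an observation as absent when a trigger phrase occurs within a few words before it. The trigger list, the five-word window, and the function name are all assumptions made for this example and do not reproduce the study's algorithm.

```python
import re

# Illustrative triggers only; the study's 66 phrases are not listed in the abstract.
NEGATION_PHRASES = ["no", "not", "denies", "without", "no evidence of"]

# Longest phrases first so multi-word triggers win over their prefixes.
_TRIGGER = re.compile(
    r"\b(?:"
    + "|".join(re.escape(p) for p in sorted(NEGATION_PHRASES, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def is_negated(sentence: str, observation: str, window: int = 5) -> bool:
    """True if a trigger occurs at most `window` words before the observation."""
    match = re.search(re.escape(observation), sentence, re.IGNORECASE)
    if match is None:
        return False  # observation not mentioned in this sentence
    preceding_words = sentence[: match.start()].split()[-window:]
    return _TRIGGER.search(" ".join(preceding_words)) is not None


print(is_negated("The patient denies chest pain.", "chest pain"))   # True
print(is_negated("The patient reports chest pain.", "chest pain"))  # False
```

A rule this simple fires on any nearby trigger regardless of what it modifies, which is one plausible reason a general word such as "not" could score lower precision than more specific phrases.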

Battling Scylla and Charybdis: the search for redundancy and ambiguity in the 2001 UMLS metathesaurus. (46/525)

I previously developed methods for identifying cases of multiple synonymous concepts (redundancy) and concepts with multiple meanings (ambiguity) and applied them to the 1995 UMLS Metathesaurus. These methods use semantic approaches (including knowledge about word synonymy and the semantic types assigned to concepts) to complement the standard lexical approaches. In this paper, I describe the results of their application to the 2001 Metathesaurus and examine their implications for the evolution of the UMLS.

Evaluating the UMLS as a source of lexical knowledge for medical language processing. (47/525)

Medical language processing (MLP) systems rely on specialized lexicons in order to recognize, classify, and normalize medical terminology, and the performance of an MLP system is dependent on the coverage and quality of such lexicons. However, the acquisition of lexical knowledge is expensive and time-consuming. The UMLS is a comprehensive resource that can be used to acquire lexical knowledge needed for medical language processing. This paper describes methods that use these resources to automatically create lexical entries and generate two lexicons. The first lexicon was created primarily using the UMLS, whereas the second was created by supplementing the lexicon of an existing MLP system called MedLEE with entries based on the UMLS. We subsequently carried out a study, which is the primary focus of this paper, using MedLEE with each of the two lexicons and also the current MedLEE lexicon to measure performance. Overall accuracy, sensitivity, and specificity using the lexicon primarily based on the UMLS were .86, .60, and .96 respectively. Those measures using the MedLEE lexicon alone were .93, .81, and .93, which was significantly better except for specificity; performance using the supplemental lexicon was exactly the same as performance using solely the MedLEE lexicon.

A metaschema of the UMLS based on a partition of its semantic network. (48/525)

The Unified Medical Language System's (UMLS's) Semantic Network (SN) provides an important conceptual abstraction that helps orient users to the vast knowledge content of its Metathesaurus. However, the SN is itself large and complex, and can also benefit from an additional abstract view of its own. In this paper, we present a metaschema that serves such a purpose. This metaschema is derived from a previously developed partitioning methodology for the SN. The metaschema is formally defined, and used to provide partial compact views of the SN.