A model for enhancing Internet medical document retrieval with "medical core metadata".
OBJECTIVE: Finding documents on the World Wide Web relevant to a specific medical information need can be difficult. The goal of this work is to define a set of document content description tags, or metadata encodings, that can be used to promote disciplined search access to Internet medical documents. DESIGN: The authors based their approach on a proposed metadata standard, the Dublin Core Metadata Element Set, which has recently been submitted to the Internet Engineering Task Force. Their model also incorporates the National Library of Medicine's Medical Subject Headings (MeSH) vocabulary and MEDLINE-type content descriptions. RESULTS: The model defines a medical core metadata set that can be used to describe the metadata for a wide variety of Internet documents. CONCLUSIONS: The authors propose that their medical core metadata set be used to assign metadata to medical documents to facilitate document retrieval by Internet search engines.
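As an illustration of what such metadata encodings can look like, the following minimal Python sketch renders Dublin Core style <meta> tags for an HTML document head. The "DC." name prefix follows the common convention for embedding Dublin Core in HTML. The chosen fields, the scheme="MeSH" attribute, the sample document, and the helper name render_meta_tags are illustrative assumptions, not the element set defined by the authors.

```python
from html import escape


def render_meta_tags(record, subject_scheme="MeSH"):
    """Render a dict of Dublin Core-style fields as HTML <meta> tags.

    `record` maps an element name (e.g. "title") to a string or a list
    of strings. Field names and the scheme label are illustrative only.
    """
    lines = []
    for element, values in record.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            # Only subject terms carry a controlled-vocabulary scheme here.
            scheme = f' scheme="{subject_scheme}"' if element == "subject" else ""
            content = escape(value, quote=True)
            lines.append(f'<meta name="DC.{element}"{scheme} content="{content}">')
    return "\n".join(lines)


# Hypothetical document description, for illustration only.
example = {
    "title": "Community-acquired pneumonia: a patient guide",
    "subject": ["Pneumonia", "Anti-Bacterial Agents"],
    "language": "en",
}
print(render_meta_tags(example))
```

A search engine that understands these tags could then match a query against the controlled subject terms rather than against free text alone.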
Searching for information on outcomes: do you need to be comprehensive?
The concepts of evidence-based practice and clinical effectiveness are reliant on up-to-date, accurate, high-quality, and relevant information. Although this information can be obtained from a range of sources, computerised databases such as MEDLINE offer a fast, effective means of bringing up-to-date information to clinicians, as well as health service and information professionals. Common problems when searching for information from databases include missing important relevant papers or retrieving too much information. Effective search strategies are therefore necessary to retrieve a manageable amount of relevant information. This paper presents a range of strategies which can be used to locate information on MEDLINE efficiently and effectively.
Automatic identification of pneumonia related concepts on chest x-ray reports.
A medical language processing system called SymText, two other automated methods, and a lay person were compared against an internal medicine resident for their ability to identify pneumonia related concepts on chest x-ray reports. Sensitivity (recall), specificity, and positive predictive value (precision) are reported with respect to an independent panel of physicians. Overall the performance of SymText was similar to the physician and superior to the other methods. The automatic encoding of pneumonia concepts will support clinical research, decision making, computerized clinical protocols, and quality assurance in a radiology department.
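The three measures named above follow directly from the counts of true and false positives and negatives relative to the reference standard. The sketch below uses the standard definitions; the function name diagnostic_metrics and the counts are hypothetical and are not results from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, and PPV from confusion-matrix counts."""

    def ratio(numerator, denominator):
        # Avoid division by zero when a class is absent from the sample.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # recall: found / all true concepts
        "specificity": ratio(tn, tn + fp),  # correctly rejected / all negatives
        "ppv": ratio(tp, tp + fp),          # precision: correct / all reported
    }


# Hypothetical counts for one method scored against the reference panel.
metrics = diagnostic_metrics(tp=90, fp=10, tn=80, fn=20)
print(metrics)
```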
Language-independent automatic acquisition of morphological knowledge from synonym pairs.
Medical words exhibit a rich and productive morphology. Beyond simple inflection, derivation and composition are common ways to form new words. Morphological knowledge is therefore very important for any medical language processing application. Whereas rich morphological resources are available for the English medical language with the UMLS Specialist Lexicon, no such resources are publicly available for French or most other languages. We propose a simple and powerful method to help acquire such knowledge automatically. This method takes advantage of the synonym terms present in medical terminologies. In a bootstrapping step, it detects morphologically related words from which it learns "derivation rules". In an expansion step, it then applies these rules to the whole vocabulary available. Our goal is to acquire data for French and other languages for which they are not available. However, to evaluate the efficiency of the method, we tested it on English in a setting which is close to that prevailing for French, and we compared its results with those obtained with the Specialist lexical variant generation tool.
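The bootstrapping and expansion steps can be sketched in Python as follows. The abstract does not specify the form of the derivation rules, so this sketch assumes they are suffix substitutions between two words that share a stem of at least min_stem characters. The function names, the threshold, and the sample terms are all hypothetical.

```python
from collections import Counter


def common_prefix_len(a, b):
    """Length of the longest shared prefix of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def learn_rules(synonym_pairs, min_stem=4):
    """Bootstrapping: count suffix-substitution rules seen in synonym pairs."""
    rules = Counter()
    for term_a, term_b in synonym_pairs:
        for wa in term_a.lower().split():
            for wb in term_b.lower().split():
                k = common_prefix_len(wa, wb)
                if wa != wb and k >= min_stem:
                    rules[(wa[k:], wb[k:])] += 1
    return rules


def expand(vocabulary, rules, min_stem=4):
    """Expansion: apply learned rules to pair up words in a vocabulary."""
    vocab = set(vocabulary)
    related = set()
    for word in vocab:
        for s1, s2 in rules:
            # Try each rule in both directions.
            for src, dst in ((s1, s2), (s2, s1)):
                stem_len = len(word) - len(src)
                if word.endswith(src) and stem_len >= min_stem:
                    candidate = word[:stem_len] + dst
                    if candidate in vocab and candidate != word:
                        related.add(tuple(sorted((word, candidate))))
    return related


# Toy synonym pairs and vocabulary, for illustration only.
pairs = [("duodenal ulcer", "ulcer of duodenum"),
         ("aortic stenosis", "stenosis of aorta")]
rules = learn_rules(pairs)
vocabulary = ["cranial", "cranium", "jejunal", "jejunum", "renal", "kidney"]
print(sorted(expand(vocabulary, rules)))
```

Here the rule learned from "duodenal" and "duodenum" links "cranial" with "cranium" and "jejunal" with "jejunum", while "renal" is skipped because its stem is shorter than the threshold.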
Structures of clinical information in patient records.
In order to support the preparation of the European Prestandard on "Communication of Electronic Health Care Record--Part 2: Domain Termlist", we carried out an analytical study of the names of clinical documents, the titles of generic sections, and the names of data elements, following our terminological methods. We defined three layers of structures for clinical information: i) documents and sections, ii) clinical statements, iii) systematic details within statements. We prepared many corresponding lists suitable for developing a principled coarse-grained markup for the transmission and homogeneous browsing of disparate patient records across many institutions, without any prior agreement on existing coding systems, data elements, or record organization. This achievement is the basis for federated records, in particular for the virtual life-long patient record.
MEDTAG: tag-like semantics for medical document indexing.
Medical documentation is central in health care, as it constitutes the main means of communication between care providers. However, there is a gap to bridge between storing information and extracting the relevant underlying knowledge. We believe natural language processing (NLP) is the best solution to handle such a large amount of textual information. In this paper we describe the construction of a semantic tagset for medical document indexing purposes. Rather than attempting to produce a home-made tagset, we decided to use, as far as possible, standard medical resources. This step has led us to choose UMLS hierarchical classes as a basis for our tagset. We also show that semantic tagging not only provides a basis for disambiguation between senses, but is also useful in the query expansion process of the retrieval system. We finally focus on assessing the results of the semantic tagger.
Analysis of biomedical text for chemical names: a comparison of three methods.
At the National Library of Medicine (NLM), a variety of biomedical vocabularies are found in data pertinent to its mission. In addition to standard medical terminology, there are specialized vocabularies including that of chemical nomenclature. Normal language tools including the lexically based ones used by the Unified Medical Language System (UMLS) to manipulate and normalize text do not work well on chemical nomenclature. In order to improve NLM's capabilities in chemical text processing, two approaches to the problem of recognizing chemical nomenclature were explored. The first approach was a lexical one and consisted of analyzing text for the presence of a fixed set of chemical segments. The approach was extended with general chemical patterns and also with terms from NLM's indexing vocabulary, MeSH, and the NLM SPECIALIST lexicon. The second approach applied Bayesian classification to n-grams of text via two different methods. The single lexical method and two statistical methods were tested against data from the 1999 UMLS Metathesaurus. One of the statistical methods had an overall classification accuracy of 97%.
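To make the statistical approach concrete, here is a minimal Python sketch of Bayesian classification over character n-grams. The abstract does not describe its two statistical methods, so this is a generic multinomial naive Bayes classifier with add-one smoothing over character trigrams, not a reconstruction of either one. The class name, the labels, and the tiny training set are hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of a space-padded, lowercased string."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramNaiveBayes:
    """Multinomial naive Bayes over character n-grams with add-one smoothing."""

    def __init__(self, n=3):
        self.n = n
        self.counts = {}       # label -> Counter of n-gram frequencies
        self.docs = Counter()  # label -> number of training strings
        self.vocab = set()     # all n-grams seen in training

    def fit(self, samples):
        for text, label in samples:
            grams = char_ngrams(text, self.n)
            self.counts.setdefault(label, Counter()).update(grams)
            self.docs[label] += 1
            self.vocab.update(grams)
        return self

    def log_scores(self, text):
        total_docs = sum(self.docs.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, grams in self.counts.items():
            total = sum(grams.values())
            score = math.log(self.docs[label] / total_docs)  # class prior
            for gram in char_ngrams(text, self.n):
                score += math.log((grams[gram] + 1) / (total + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training data, for illustration only.
training = [(w, "chemical") for w in
            ["methanol", "ethanol", "propanol", "benzene", "toluene"]]
training += [(w, "other") for w in
             ["patient", "fever", "report", "chest", "history"]]
model = NgramNaiveBayes(n=3).fit(training)
print(model.predict("butanol"), model.predict("fever"))
```

The unseen word "butanol" is labelled as chemical because it shares the trigrams "ano", "nol", and "ol " with the chemical training names.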
A technique for semantic classification of unknown words using UMLS resources.
Natural Language Processing (NLP) is a tool for transforming natural text into codable form. Success of NLP systems is contingent on a well constructed semantic lexicon. However, creation and maintenance of these lexicons is difficult, costly and time consuming. The UMLS contains semantic and syntactic information of medical terms, which may be used to automate some of this task. Using UMLS resources we have observed that it is possible to define one semantic type by its syntactic combinations with other types in a corpus of discharge summaries. These patterns of combination can then be used to classify words which are not in the lexicon. The technique was applied to a corpus for a single semantic type and generated a list of 875 words which matched the classification criteria for that type. The words were ranked by number of patterns matched and the top 95 words were correctly typed with 80% accuracy.
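The idea of ranking unknown words by the number of patterns they match can be sketched in Python as below. The abstract does not define what a pattern of combination is, so this sketch assumes a pattern is the pair of semantic types of the immediately neighbouring words. The lexicon, the type names, the sentences, and the function names are hypothetical and do not come from the UMLS.

```python
from collections import defaultdict


def contexts(sentences, lexicon):
    """Yield (word, (left_type, right_type)) for tokens with a typed neighbour."""
    for tokens in sentences:
        for i, word in enumerate(tokens):
            left = lexicon.get(tokens[i - 1]) if i > 0 else None
            right = lexicon.get(tokens[i + 1]) if i + 1 < len(tokens) else None
            if left or right:
                yield word, (left, right)


def rank_unknown_words(sentences, lexicon, target_type):
    """Rank words missing from the lexicon by distinct target patterns matched."""
    # Patterns observed around words already known to have the target type.
    target_patterns = {
        ctx for word, ctx in contexts(sentences, lexicon)
        if lexicon.get(word) == target_type
    }
    matched = defaultdict(set)
    for word, ctx in contexts(sentences, lexicon):
        if word not in lexicon and ctx in target_patterns:
            matched[word].add(ctx)
    ranked = [(word, len(patterns)) for word, patterns in matched.items()]
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


# Toy lexicon and corpus, for illustration only.
lexicon = {
    "patient": "person", "denies": "verb_neg", "reports": "verb_pos",
    "fever": "symptom", "cough": "symptom", "and": "conj",
    "severe": "modifier",
}
sentences = [
    ["patient", "denies", "fever", "and", "cough"],
    ["patient", "reports", "severe", "cough"],
    ["patient", "denies", "dyspnea", "and", "fever"],
    ["patient", "reports", "severe", "dyspnea"],
    ["patient", "reports", "severe", "weakness"],
    ["patient", "smokes", "daily"],
]
print(rank_unknown_words(sentences, lexicon, "symptom"))
```

In this toy corpus "dyspnea" matches two symptom patterns and "weakness" one, while "smokes" and "daily" match none and are left unclassified.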