Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program. (57/831)

The UMLS Metathesaurus, the largest thesaurus in the biomedical domain, provides a representation of biomedical knowledge consisting of concepts classified by semantic type and both hierarchical and non-hierarchical relationships among the concepts. This knowledge has proved useful for many applications, including decision support systems, management of patient records, information retrieval (IR), and data mining. Gaining effective access to the knowledge is critical to the success of these applications. This paper describes MetaMap, a program developed at the National Library of Medicine (NLM) to map biomedical text to the Metathesaurus or, equivalently, to discover Metathesaurus concepts referred to in text. MetaMap uses a knowledge-intensive approach based on symbolic, natural language processing (NLP) and computational linguistic techniques. Besides being applied in both IR and data mining applications, MetaMap is one of the foundations of NLM's Indexing Initiative System, which is being applied to both semi-automatic and fully automatic indexing of the biomedical literature at the library.

Informatics application provides instant research to practice benefits. (58/831)

A web-based research information system was designed to enable our research team to efficiently measure health-related quality of life among frail older adults in a variety of health care settings (home care, nursing homes, assisted living, PACE). The structure, process, and outcome data are collected using laptop computers and downloaded to a SQL database. A unique feature of this project is the ability to transfer research to practice by instantly sharing individual and aggregate results with the clinicians caring for these elders, directly impacting the quality of their care. Clinicians can also dial in to the database to access standard queries or receive customized reports about the patients in their facilities. This paper will describe the development and implementation of the information system. The conference presentation will include a demonstration and examples of research-to-practice benefits.

Automatic MeSH term assignment and quality assessment. (59/831)

For computational purposes, documents or other objects are most often represented by a collection of individual attributes that may be strings or numbers. Such attributes are often called features, and success in solving a given problem can depend critically on the nature of the features selected to represent documents. Feature selection has received considerable attention in the machine learning literature. In the area of document retrieval, we refer to feature selection as indexing. Indexing has not traditionally been evaluated by the same methods used in machine learning feature selection. Here we show how indexing quality may be evaluated in a machine learning setting and apply this methodology to results of the Indexing Initiative at the National Library of Medicine.

MeSHmap: a text mining tool for MEDLINE. (60/831)

Our research goal is to explore text mining from the metadata included in MEDLINE documents. We present MeSHmap, our prototype text mining system that exploits the MeSH indexing accompanying MEDLINE records. MeSHmap supports searches via PubMed followed by user-driven exploration of the MeSH terms and subheadings in the retrieved set. The potential of the system goes beyond text retrieval. It may also be used to compare entities of the same type, such as pairs of drugs or pairs of procedures. In addition, there is the potential to generate maps of entities (drugs, diseases, etc.) such that the strength of the link between two entities in the map represents their similarity as expressed in the MeSH metadata of the MEDLINE documents. Higher-level operators have been proposed to support these comparison and mapping functions. This paper motivates and describes MeSHmap. Future work will include user evaluations of the system.
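
The abstract does not say how the similarity between two entities is computed from MeSH metadata. As a minimal sketch only, assuming each entity is summarized by the counts of MeSH terms on its retrieved records and that link strength is the cosine similarity of those counts (both assumptions, as are the function names and sample terms), the idea might look like this:

```python
import math
from collections import Counter


def mesh_profile(records):
    """Count MeSH terms over an entity's retrieved records.

    `records` is an iterable of term lists, one list per MEDLINE record.
    This representation is an assumption, not MeSHmap's documented format.
    """
    profile = Counter()
    for terms in records:
        profile.update(terms)
    return profile


def cosine_similarity(p, q):
    """Cosine similarity of two term-count profiles (0.0 if either is empty)."""
    dot = sum(p[t] * q[t] for t in p.keys() & q.keys())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


# Hypothetical records for two drugs; the terms are illustrative only.
drug_a = mesh_profile([["Hypertension", "Adult"], ["Hypertension"]])
drug_b = mesh_profile([["Hypertension", "Child"]])
link_strength = cosine_similarity(drug_a, drug_b)
```

A map of entities could then be drawn by computing this value for every pair and using it as the edge weight. Other measures, such as set overlap, would fit the description equally well.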

Comparing frequency of word occurrences in abstracts and texts using two stop word lists. (61/831)

Retrieval tests have assumed that the abstract is a true surrogate of the entire text. However, the frequency of terms in abstracts has never been compared to that of the articles they represent. Even though many sources are now available in full text, many still rely on the abstract for retrieval. A total of 1,138 articles with their abstracts were downloaded from the Journal of the American Medical Association, the New England Journal of Medicine, the British Medical Journal, and the Lancet. Based on two stop word lists, one long and one short, content-bearing words were extracted from the articles and their abstracts, and the frequency of each word was counted in both sources. Each article and its abstract were tested using a chi-squared test to determine if the words in the abstract occurred as frequently as would be expected. Between 96% and 98% of the abstracts tested were not significantly different from random samples of the articles they represented. In these four journals, the abstracts are lexical, as well as intellectual, surrogates for the articles they represent.
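
The comparison described above can be sketched as a chi-squared goodness-of-fit calculation. This is an illustration under stated assumptions, not the study's actual procedure: the stop word list is a made-up stand-in for the two lists used, tokenization is a simple lowercase letter match, and words that appear only in the abstract are ignored because their expected count would be zero.

```python
import re
from collections import Counter

# Hypothetical short stop word list; the study's lists are not reproduced here.
STOP_WORDS = {"the", "of", "and", "a", "in", "to", "is", "was", "were", "with", "for"}


def content_word_counts(text, stop_words=STOP_WORDS):
    """Count lowercase alphabetic tokens that are not stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in stop_words)


def chi_squared_statistic(abstract_counts, article_counts):
    """Return (statistic, degrees_of_freedom) for abstract vs. article.

    Expected abstract counts assume the abstract is a random sample of the
    article: expected = abstract_total * article_count / article_total.
    """
    article_total = sum(article_counts.values())
    # Restrict to the article's vocabulary so every expected count is non-zero.
    abstract_total = sum(abstract_counts.get(w, 0) for w in article_counts)
    if article_total == 0 or abstract_total == 0:
        raise ValueError("both texts must share at least one content word")
    statistic = 0.0
    for word, article_n in article_counts.items():
        expected = abstract_total * article_n / article_total
        observed = abstract_counts.get(word, 0)
        statistic += (observed - expected) ** 2 / expected
    return statistic, len(article_counts) - 1


# Toy example; real articles would have thousands of tokens.
article = content_word_counts("The heart valve and the heart")
abstract = content_word_counts("heart valve")
stat, dof = chi_squared_statistic(abstract, article)
```

The statistic would then be compared against a chi-squared distribution with the returned degrees of freedom, for example with `scipy.stats.chi2.sf(stat, dof)`, to decide whether the abstract differs significantly from a random sample of the article. With real texts, many expected counts are small, so rare words would likely need to be pooled before the test is reliable.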

Developing a test collection for biomedical word sense disambiguation. (62/831)

Ambiguity, the phenomenon that a word has more than one sense, poses difficulties for many current Natural Language Processing (NLP) systems. Algorithms that assist in the resolution of these ambiguities, i.e., that make a word or, more generally, a text string unambiguous, will boost the performance of these systems. To test such techniques in the biomedical language domain, we have developed a Word Sense Disambiguation (WSD) test collection that comprises 5,000 unambiguous instances for 50 ambiguous UMLS Metathesaurus strings.

Structured data management--the design and implementation of a web-based video archive prototype. (63/831)

In response to the lack of readily available multimedia-rich medical knowledge sources to support medical education and patient care, we designed and implemented a web-based video publishing platform. In order to promote the development of high-quality, up-to-date educational content, we have devised a scalable structure that allows online submissions and continuous updating of video and accompanying textual descriptions. Our goal is to enable experts in varied medical domains to collaborate in the construction of a video library using an intuitive web-based interface. Neurologists at Stanford built a well-annotated neurology video collection that initially emphasized childhood and adult movement disorders. The collection may be accessed either as a stand-alone resource or as part of the Stanford Skolar MD, an integrated online medical knowledge provider. This manuscript discusses the design framework and implementation details of structured media content development. We present examples illustrating media data collection, content indexing using UMLS concepts, media storage, and web presentation.

A computerized tool for evaluating the effectiveness of preventive interventions. (64/831)

In identifying appropriate strategies for effective use of preventive services for particular settings or populations, public health practitioners employ a systematic approach to evaluating the literature. Behavioral intervention studies that focus on prevention, however, pose special challenges for these traditional methods. Tools for synthesizing evidence on preventive interventions can improve public health practice. The authors developed a literature abstraction tool and a classification for preventive interventions. They incorporated the tool into a PC-based relational database and user-friendly evidence reporting system, then tested the system by reviewing behavioral interventions for hypertension management. They performed a structured literature search and reviewed 100 studies on behavioral interventions for hypertension management. They abstracted information using the abstraction tool and classified important elements of interventions for comparison across studies. The authors found that many studies in their pilot project did not report sufficient information to allow for complete evaluation, comparison across studies, or replication of the intervention. They propose that studies reporting on preventive interventions should (a) categorize interventions into discrete components; (b) report sufficient participant information; and (c) report characteristics such as intervention leaders, timing, and setting so that public health professionals can compare and select the most appropriate interventions.