Knowledge requirements for automated inference of medical textbook markup. (17/831)

Indexing medical text in journals or textbooks requires a tremendous amount of resources. We tested two algorithms for automatically indexing nouns, noun-modifiers, and noun phrases, and inferring selected binary relations between UMLS concepts in a textbook of infectious disease. Sixty-six percent of nouns and noun-modifiers and 81% of noun phrases were correctly matched to UMLS concepts. Semantic relations were identified with 100% specificity and 94% sensitivity. For some medical sub-domains, these algorithms could permit expeditious generation of more complex indexing.
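For readers less familiar with the two evaluation measures quoted above, the following minimal sketch shows how sensitivity and specificity are computed from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration; they are not the study's data.

```python
def sensitivity_specificity(tp, fp, tn, fn):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of true relations that were found.
    specificity = TN / (TN + FP): share of non-relations correctly rejected.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts (NOT taken from the paper): 50 true relations of which
# 3 were missed, and 50 non-relations of which none were wrongly accepted.
sens, spec = sensitivity_specificity(tp=47, fp=0, tn=50, fn=3)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With no false positives the specificity is 100%, while each missed relation lowers the sensitivity.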

Automation and integration of components for generalized semantic markup of electronic medical texts. (18/831)

Our group has built an information retrieval system based on a complex semantic markup of medical textbooks. We describe the construction of a set of web-based knowledge-acquisition tools that expedites the collection and maintenance of the concepts required for text markup and the search interface required for information retrieval from the marked text. In the text markup system, domain experts (DEs) identify sections of text that contain one or more elements from a finite set of concepts. End users can then query the text using a predefined set of questions, each of which identifies a subset of complementary concepts. The search process matches that subset of concepts to relevant points in the text. The current process requires that the DE invest significant time to generate the required concepts and questions. We propose a new system--called ACQUIRE (Acquisition of Concepts and Queries in an Integrated Retrieval Environment)--that assists a DE in two essential tasks in the text-markup process. First, it helps her to develop, edit, and maintain the concept model: the set of concepts with which she marks the text. Second, ACQUIRE helps her to develop a query model: the set of specific questions that end users can later use to search the marked text. The DE incorporates concepts from the concept model when she creates the questions in the query model. The major benefit of the ACQUIRE system is a reduction in the time and effort required for the text-markup process. We compared the process of concept- and query-model creation using ACQUIRE to the process used in previous work by rebuilding two existing models that we previously constructed manually. We observed a significant decrease in the time required to build and maintain the concept and query models.
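The relationship between the concept model, the query model, and the marked text can be sketched roughly as follows. This is an illustrative toy, not the ACQUIRE implementation: the class name, the concept labels, the question wording, and the rule that a section matches when it carries every concept of the question are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    """A section of text marked by a domain expert with concepts."""
    section_id: str
    concepts: frozenset


# Hypothetical concept model: the finite set of concepts used for markup.
CONCEPT_MODEL = frozenset({"pathogen", "drug", "treatment", "diagnosis"})

# Hypothetical query model: each predefined question names a subset of concepts.
QUERY_MODEL = {
    "Which drug treats this pathogen?": frozenset({"pathogen", "drug", "treatment"}),
    "How is this pathogen diagnosed?": frozenset({"pathogen", "diagnosis"}),
}


def search(sections, question):
    """Return ids of sections whose markup covers every concept of the question."""
    wanted = QUERY_MODEL[question]
    # Questions may only use concepts that exist in the concept model.
    assert wanted <= CONCEPT_MODEL
    return [s.section_id for s in sections if wanted <= s.concepts]


sections = [
    Section("ch3-p2", frozenset({"pathogen", "drug", "treatment"})),
    Section("ch3-p5", frozenset({"pathogen", "diagnosis"})),
    Section("ch4-p1", frozenset({"drug"})),
]
print(search(sections, "Which drug treats this pathogen?"))
```

Building questions only from concepts in the concept model mirrors the abstract's statement that the DE incorporates concepts from the concept model when creating the query model.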

Maintaining a catalog of manually-indexed, clinically-oriented World Wide Web content. (19/831)

With no quality controls and a highly distributed means of posting information, finding high-quality, clinically-oriented content on the World Wide Web can be difficult. Maintaining a catalog of such information can be equally challenging. CliniWeb is a catalog of quality-filtered and clinically-oriented content on the Web designed to enhance access to such information. This paper describes a group of semi-automated tools that have been developed to maintain the CliniWeb database. One allows easier identification of content by utilizing Web crawling techniques from high-level pages. Another allows easier selection of content for inclusion and its indexing. A final one checks links to help keep the database current. These are augmented by general plans to adopt more detailed metadata and linkages into the medical literature.

Creating and indexing teaching files from free-text patient reports. (20/831)

Teaching files based on real patient data can enhance the education of students, staff and other colleagues. Although information retrieval systems can index free-text documents using keywords, these systems do not work well where content-bearing terms (e.g., anatomy descriptions) frequently appear. This paper describes a system that uses multi-word indexing terms to provide access to free-text patient reports. The utilization of multi-word indexing allows better modeling of the content of medical reports, thus improving retrieval performance. The method used to select indexing terms as well as early evaluation of retrieval performance is discussed.
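To illustrate why multi-word indexing terms can model report content better than single keywords, here is a minimal sketch of a phrase-level inverted index. The abstract does not say how the indexing terms are selected, so the phrase list, the report texts, and the function names below are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_phrase_index(reports, phrases):
    """Map each multi-word term to the ids of reports containing it contiguously."""
    phrase_tokens = {p: tuple(tokenize(p)) for p in phrases}
    index = defaultdict(set)
    for report_id, text in reports.items():
        tokens = tokenize(text)
        for phrase, wanted in phrase_tokens.items():
            n = len(wanted)
            # Slide a window of n tokens over the report and compare.
            if any(tuple(tokens[i:i + n]) == wanted
                   for i in range(len(tokens) - n + 1)):
                index[phrase].add(report_id)
    return dict(index)


# Hypothetical reports: both contain the words "right", "upper", and "lobe",
# but only r1 contains them as the anatomical phrase "right upper lobe".
reports = {
    "r1": "Mass in the right upper lobe.",
    "r2": "Right lower lobe is clear; upper abdomen normal.",
}
index = build_phrase_index(reports, ["right upper lobe"])
print(index)
```

A single-keyword index would return both reports for a query on "right", "upper", and "lobe"; the phrase index returns only the report that actually describes the right upper lobe.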

Quality criteria and access characteristics of Web sites: proposal for the design of a health Internet directory. (21/831)

The increasing volume of information available on the Internet today is a problem for health care professionals who want rapid access to high-quality data. Standard search engines and directories are not sufficient to satisfy their needs. Moreover, the information published by Web sites is not always guaranteed. Some institutions around the world are working to define a set of criteria for the evaluation of medical Web sites. We base our current work on the technologies we developed previously in order to integrate sources of information of various kinds using the "Unified Medical Language System" knowledge bases. This paper focuses on the quality criteria and access characteristics that Web sites should satisfy to be registered in a "Health Internet Directory". The design of such a system is proposed and discussed.

A strategy for statistical Master Person Index linking. (22/831)

A linking program used by Connecticut Healthcare Information Management and Exchange to maintain the Master Person Index for its large, state-wide patient data repository is being stretched beyond its limits by the growing size and complexity of the database. This paper presents the early work on developing a second-generation linking program. Like the original program, the new linker will use a unique multi-step process to allow effective linking of data from a large number of dissimilar data sources. The new linker will use parallel multi-processing to allow improved performance and scalability. These changes will also make possible more sophisticated statistical methods of defining link confidence. The system is implemented using a scalable collection of inexpensive, PC-based systems running the Linux operating system, a freely available database engine, and the Java programming language.
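The abstract does not specify which statistical method defines link confidence. As one common illustration only, the sketch below scores a candidate pair of person records with Fellegi-Sunter style agreement weights. The field names and the m/u probabilities are invented for the example and are not taken from the described system (which, per the abstract, is written in Java).

```python
import math

# Hypothetical per-field probabilities (assumptions, not from the paper):
#   m = P(field agrees | records are the same person)
#   u = P(field agrees | records are different people)
FIELD_PROBS = {
    "last_name": (0.95, 0.01),
    "birth_date": (0.97, 0.003),
    "zip": (0.90, 0.05),
}


def match_weight(rec_a, rec_b, field_probs=FIELD_PROBS):
    """Sum log2 likelihood ratios over fields; higher means more likely a link."""
    total = 0.0
    for field, (m, u) in field_probs.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # a missing value contributes no evidence either way
        if a == b:
            total += math.log2(m / u)              # agreement weight (positive)
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight (negative)
    return total


a = {"last_name": "smith", "birth_date": "1960-02-01", "zip": "06510"}
b = {"last_name": "smith", "birth_date": "1960-02-01", "zip": "06511"}
c = {"last_name": "jones", "birth_date": "1975-07-30", "zip": "06511"}
print(round(match_weight(a, b), 2), round(match_weight(a, c), 2))
```

In a scheme like this, the total weight would be compared against upper and lower thresholds to accept a link, reject it, or route it to manual review; the thresholds themselves would need to be tuned on real data.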

Evaluation of the Information Sources Map. (23/831)

As part of preliminary studies for the development of a digital library, we have studied the possibility of using the UMLS Information Sources Map (ISM) database to provide the means to connect and map different terminologies, as well as to facilitate access to available information sources. The main issues discussed are the indexing of and connection to relevant online sources. We found the features of the ISM to be consistent with the need to support automated source selection and retrieval. However, attention should be paid to three aspects of the information: granularity, completeness, and accuracy. We found the ISM to be potentially useful; however, significant modifications will be required if the ISM is to be able to support automated source selection and retrieval.

Task-specific journal extracts for using the medical literature. (24/831)

Clinicians and researchers use the medical literature in a variety of ways. The overwhelming volume of clinical journals necessitates tools to help healthcare professionals identify and employ relevant information. The structured abstract can facilitate browsing articles, but may not contain appropriate types of information or sufficient detail for all uses of the medical literature. We have created customized views of journal articles that provide information for specific research or clinical tasks, such as evaluating the scientific validity of a clinical trial. These summaries are called extracts because we literally extract information of a particular type from the full text of an article. We employ a context-based indexing scheme, previously designed for improving precision in literature searches, to automatically generate extracts from clinical research articles. In this paper, we present an evaluation of the content and utility of these task-specific extracts. Our results provide preliminary evidence that such extracts contain information that is relevant to clinical and research tasks and may facilitate understanding and use of the medical literature.