Automated extraction of information on protein-protein interactions from the biological literature.
MOTIVATION: To understand biological processes, we must clarify how proteins interact with each other. However, since information about protein-protein interactions still exists primarily in the scientific literature, it is not accessible in a computer-readable format. Efficient processing of large numbers of interactions therefore needs an intelligent information extraction method. Our aim is to develop an efficient method for extracting information on protein-protein interactions from the scientific literature. RESULTS: We present a method for extracting information on protein-protein interactions from the scientific literature. This method, which employs only a protein name dictionary, surface clues on word patterns and simple part-of-speech rules, achieved high recall and precision rates for yeast (recall = 86.8% and precision = 94.3%) and Escherichia coli (recall = 82.5% and precision = 93.5%). The result of extraction suggests that our method should be applicable to any species for which a protein name dictionary is constructed. AVAILABILITY: The program is available on request from the authors.
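The abstract above names the ingredients of the method (a protein name dictionary and surface word patterns) but gives no implementation details. The following is a minimal sketch of that general idea, not the authors' program: the dictionary entries, the cue-word list, and the function names `extract_interactions` and `precision_recall` are all assumptions made for illustration, and the part-of-speech rules mentioned in the abstract are left out entirely.

```python
import re
from itertools import combinations

# Hypothetical toy dictionary and cue words; neither list comes from the paper.
PROTEIN_NAMES = {"Cdc28", "Cln2", "Sic1", "RecA", "LexA"}
INTERACTION_WORDS = {"binds", "interacts", "phosphorylates", "activates", "inhibits"}


def extract_interactions(text):
    """Return (protein_a, cue_word, protein_b) triples, sentence by sentence."""
    triples = []
    # Naive sentence split on whitespace that follows ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        tokens = re.findall(r"[A-Za-z0-9]+", sentence)
        # Dictionary lookup: positions of tokens that are known protein names.
        hits = [(i, t) for i, t in enumerate(tokens) if t in PROTEIN_NAMES]
        for (i, a), (j, b) in combinations(hits, 2):
            # Surface pattern: a cue word strictly between the two names.
            cues = [w.lower() for w in tokens[i + 1:j] if w.lower() in INTERACTION_WORDS]
            if cues and a != b:
                triples.append((a, cues[0], b))
    return triples


def precision_recall(predicted, gold):
    """Precision = TP / predicted, recall = TP / gold, over sets of triples."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    sample = "Cdc28 binds Cln2 in vitro. Sic1 inhibits Cdc28. RecA was purified."
    print(extract_interactions(sample))
```

Scoring the extracted triples against a hand-curated gold set with `precision_recall` yields the kind of recall and precision figures the abstract reports, although the sample sentences here are invented and say nothing about the published numbers.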
Effects of the American College of Rheumatology systemic sclerosis trial guidelines on the nature of systemic sclerosis patients entering a clinical trial.
OBJECTIVES: To compare the systemic sclerosis (SSc) patients entered into the d-penicillamine trial with SSc patients entered into previous controlled SSc trials. It was hypothesized that the d-penicillamine trial patients, who conformed to the American College of Rheumatology (ACR) guidelines for clinical trials in SSc, were different from patients entered into previous trials. METHODS: Patients entering a double-blind, randomized trial of low- vs high-dose d-penicillamine were described carefully and completely. Their characteristics were then compared with previously published data on SSc and its treatment. RESULTS: One hundred and thirty-four patients had early [mean duration 9.5 (s.d. 4.2) months], diffuse [skin score 21 (8)] disease. Organ involvement in the patients was as follows: pulmonary 54%, cardiac 20%, joints 38%, muscular 20%. Thirty-three per cent had mild proteinuria and 13% were hypertensive when first seen. Compared with patients in most previous studies, these SSc patients had earlier disease and uniformly had diffuse disease. They had less muscular involvement, less dyspnoea, less abnormal pulmonary function and less cardiac and less renal involvement than patients in earlier studies. CONCLUSIONS: The use of the new ACR guidelines for SSc trials may change the nature of patient populations entering future studies.
Evidence-based dentistry: Part V. Critical appraisal of the dental literature: papers about therapy.
Evidence-based dentistry involves defining a question focused on a patient-related problem and searching for reliable evidence to provide an answer. Once potential evidence has been found, it is necessary to determine whether the information is credible and whether it is useful in your practice by using the techniques of critical appraisal. In this paper, the fifth in a 6-part series on evidence-based dentistry, a framework is described which provides a series of questions to help the reader assess both the validity and applicability of an article related to questions of therapy or prevention.
Comparing like with like: some historical milestones in the evolution of methods to create unbiased comparison groups in therapeutic experiments.
Histories of clinical trials have recorded and analysed the development of quantification in therapeutic evaluation, the emergence of probabilistic thinking, the application of statistical methods and theory, and the sociology, ethics and politics of clinical trials; but it is surprising that they only rarely identify as a distinct theme the development of efforts to control biases. An exception is Kaptchuk's recent account of the history of blinding and placebos for reducing observer biases. In this complementary paper I introduce and discuss some milestones between 1662 and 1948 in the development of methods to control selection biases when assembling therapeutic comparison groups, to ensure, as far as possible, that 'like is compared with like'. In the paper I note (i) that treatment allocation based on strict alternation abolishes selection bias as effectively as treatment allocation based on strict random allocation; (ii) that use of schedules based on random numbers is more likely to prevent foreknowledge of allocation schedules, and thus the risk of introducing selection bias at the point of recruitment to trials; (iii) that a concern to conceal allocation schedules was the rationale for using schedules based on random numbers in the Medical Research Council trials of vaccination for whooping cough and streptomycin for pulmonary tuberculosis; and (iv) that the introduction of allocation concealment more than half a century ago remains the most recent substantive milestone in the history of efforts to control selection biases in therapeutic experiments.
Comparing syntactic complexity in medical and non-medical corpora.
With the growing use of Natural Language Processing (NLP) techniques as solutions in Medical Informatics, the need to quickly and efficiently create the knowledge structures used by these systems has grown concurrently. Automatic discovery of a lexicon for use by an NLP system through machine learning will require information about the syntax of medical language. Understanding the syntactic differences between medical and non-medical corpora may allow more efficient acquisition of a lexicon. Three experiments designed to quantify the syntactic differences in medical and non-medical corpora were conducted. The results show that the syntax of medical language shows less variation than non-medical language and is likely simpler. The differences were great enough to question the applicability of general language tools on medical language. These differences may reduce the difficulty of some free text machine learning problems by capitalizing on the simpler nature of narrative medical syntax.
Handheld computing in medicine.
Handheld computers have become a valuable and popular tool in various fields of medicine. A systematic review of articles was undertaken to summarize the current literature regarding the use of handheld devices in medicine. A variety of articles were identified, and relevant information for various medical fields was summarized. The literature search covered general information about handheld devices, the use of these devices to access medical literature, electronic pharmacopoeias, patient tracking, medical education, research, business management, e-prescribing, patient confidentiality, and costs as well as specialty-specific uses for personal digital assistants (PDAs). The authors concluded that only a small number of articles provide evidence-based information about the use of PDAs in medicine. The majority of articles provide descriptive information, which is nevertheless of value. This article aims to increase the awareness among physicians about the potential roles for handheld computers in medicine and to encourage the further evaluation of their use.
Can Mary Shelley's Frankenstein be read as an early research ethics text?
The current popular view of the novel Frankenstein is that it describes the horrors consequent upon scientific experimentation; the pursuit of science leading inevitably to tragedy. In reality the importance of the book is far from this. Although the evil and tragedy resulting from one medical experiment are its theme, a critical and fair reading finds a more balanced view that includes science's potential to improve the human condition and reasons why such an experiment went awry. The author argues that Frankenstein is an early and balanced text on the ethics of research upon human subjects and that it provides insights that are as valid today as when the novel was written. As a narrative it provides a gripping story that merits careful analysis by those involved in medical research and its ethical review, and it is more enjoyable than many current textbooks! To support this thesis, the author will place the book in historical, scientific context, analyse it for lessons relevant to those involved in research ethics today, and then draw conclusions.
Textpresso: an ontology-based information retrieval and extraction system for biological literature.
We have developed Textpresso, a new text-mining system for scientific literature whose capabilities go far beyond those of a simple keyword search engine. Textpresso's two major elements are a collection of the full text of scientific articles split into individual sentences, and the implementation of categories of terms for which a database of articles and individual sentences can be searched. The categories are classes of biological concepts (e.g., gene, allele, cell or cell group, phenotype, etc.) and classes that relate two objects (e.g., association, regulation, etc.) or describe one (e.g., biological process, etc.). Together they form a catalog of types of objects and concepts called an ontology. After this ontology is populated with terms, the whole corpus of articles and abstracts is marked up to identify terms of these categories. The current ontology comprises 33 categories of terms. A search engine enables the user to search for one or a combination of these tags and/or keywords within a sentence or document, and as the ontology allows word meaning to be queried, it is possible to formulate semantic queries. Full text access increases recall of biological data types from 45% to 95%. Extraction of particular biological facts, such as gene-gene interactions, can be accelerated significantly by ontologies, with Textpresso automatically performing nearly as well as expert curators to identify sentences; in searches for two uniquely named genes and an interaction term, the ontology confers a 3-fold increase of search efficiency. Textpresso currently focuses on Caenorhabditis elegans literature, with 3,800 full text articles and 16,000 abstracts. The lexicon of the ontology contains 14,500 entries, each of which includes all versions of a specific word or phrase, and it includes all categories of the Gene Ontology database. Textpresso is a useful curation tool, as well as a search engine for researchers, and can readily be extended to other organism-specific corpora of text. Textpresso can be accessed at http://www.textpresso.org or via WormBase at http://www.wormbase.org.
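The Textpresso abstract describes sentence-level markup with ontology categories followed by searches that combine category tags and keywords. The sketch below illustrates that idea on a toy scale only; it is not Textpresso's code or API. The three-category `LEXICON`, its terms, and the functions `tag_sentence`, `build_index` and `search` are hypothetical names introduced here, and the matching is simple token lookup rather than whatever markup the real system uses.

```python
import re

# Hypothetical miniature lexicon (category -> terms); the abstract reports
# 33 categories and 14,500 lexicon entries in the real system.
LEXICON = {
    "gene": {"lin-12", "glp-1", "daf-16"},
    "association": {"interacts", "binds", "associates"},
    "regulation": {"regulates", "represses", "activates"},
}


def tag_sentence(sentence):
    """Return the set of category names whose terms occur in the sentence."""
    tokens = set(re.findall(r"[a-z0-9-]+", sentence.lower()))
    return {category for category, terms in LEXICON.items() if tokens & terms}


def build_index(documents):
    """Split each document into sentences and record each sentence's tags."""
    index = []
    for doc_id, text in documents.items():
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            if sentence:
                index.append((doc_id, sentence, tag_sentence(sentence)))
    return index


def search(index, categories=(), keywords=()):
    """Return (doc_id, sentence) pairs carrying all categories and keywords."""
    wanted = set(categories)
    lowered = [k.lower() for k in keywords]
    return [
        (doc_id, sentence)
        for doc_id, sentence, tags in index
        if wanted <= tags and all(k in sentence.lower() for k in lowered)
    ]


if __name__ == "__main__":
    docs = {
        "paper1": "lin-12 interacts with glp-1. The worms were grown at 20 degrees.",
        "paper2": "daf-16 regulates lifespan.",
    }
    index = build_index(docs)
    print(search(index, categories=["gene", "association"]))
```

Indexing at the sentence level is what lets a query such as "gene plus association" return the specific sentence that states an interaction instead of a whole article that happens to mention both somewhere.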