HICLAS: a taxonomic database system for displaying and comparing biological classification and phylogenetic trees. (1/2276)

MOTIVATION: Numerous database management systems have been developed for processing various taxonomic databases on biological classification or phylogenetic information. In this paper, we present an integrated system to deal with interacting classifications and phylogenies concerning particular taxonomic groups. RESULTS: An information-theoretic view (taxon view) has been applied to capture taxonomic concepts as taxonomic data entities. A data model which is suitable for supporting semantically interacting dynamic views of hierarchic classifications and a query method for interacting classifications have been developed. The concept of taxonomic view and the data model can also be expanded to carry phylogenetic information in phylogenetic trees. We have designed a prototype taxonomic database system called HICLAS (HIerarchical CLAssification System) based on the concept of taxon view, and the data models and query methods have been designed and implemented. This system can be effectively used in the taxonomic revisionary process, especially when databases are being constructed by specialists in particular groups, and the system can be used to compare classifications and phylogenetic trees. AVAILABILITY: Freely available at the WWW URL: http://aims.cps.msu.edu/hiclas/ CONTACT: [email protected]; [email protected]

Identifying diabetes mellitus or heart disease among health maintenance organization members: sensitivity, specificity, predictive value, and cost of survey and database methods. (2/2276)

We conducted a study of the sensitivity, specificity, positive predictive value, and cost of two methods of identifying diagnosed diabetes mellitus or heart disease among members of a health maintenance organization (HMO). Among 3186 adult HMO members who were attending one primary care clinic, 2326 were reached for a telephone survey (efficiency = 0.73). Among these members, 1991 answered standardized questions to ascertain whether they had diabetes or heart disease (corrected response rate = 0.85). Linkage was then made to computerized diagnostic databases. By means of both a database method and a survey method, the 1976 members with complete data for analysis were classified as having or not having diabetes or heart disease. When results with the two methods disagreed, charts were reviewed to confirm the presence or absence of diabetes or heart disease. Diabetes was identified among 4.7% of adult members, and heart disease was identified among 3.7%. Identification of diabetes differed between the database method and the survey method (sensitivity 0.91 vs 0.98, specificity 0.99 vs 0.99, positive predictive value 0.94 vs 0.83). Identification of heart attack history was similar for the database method and the survey method (sensitivity 0.89 vs 0.95, specificity 0.99 vs 0.99, positive predictive value 0.79 vs 0.81). The cost of obtaining data was $13.50 per member for the survey method and $0.30 per member for the database method. Database methods or survey methods of identifying selected chronic diseases among HMO members may be acceptable for various purposes, but database identification methods appear to be less expensive and provide information on a higher proportion of HMO members than do survey methods. Accurate identification of chronic diseases among patients supports clinic-level measures for clinical improvement, research, and accountability.
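
The three accuracy measures reported above follow from a two-by-two table comparing each identification method against the confirmed diagnosis. The sketch below shows the standard definitions only. The counts in it are hypothetical, chosen for round numbers, and are not the study's data; the `Confusion` class and function names are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Two-by-two table of a method's result against the confirmed diagnosis."""
    tp: int  # method says disease, disease confirmed
    fp: int  # method says disease, disease not confirmed
    fn: int  # method says no disease, disease confirmed
    tn: int  # method says no disease, disease not confirmed


def sensitivity(c: Confusion) -> float:
    # Share of confirmed cases that the method identifies.
    return c.tp / (c.tp + c.fn)


def specificity(c: Confusion) -> float:
    # Share of confirmed non-cases that the method correctly excludes.
    return c.tn / (c.tn + c.fp)


def positive_predictive_value(c: Confusion) -> float:
    # Share of method-identified cases that are confirmed.
    return c.tp / (c.tp + c.fp)


# Hypothetical counts for illustration only (not taken from the study).
example = Confusion(tp=90, fp=10, fn=10, tn=890)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(example):.2f}")
    print(f"specificity = {specificity(example):.2f}")
    print(f"PPV         = {positive_predictive_value(example):.2f}")
```

Because positive predictive value depends on how common the disease is among those tested, two methods with the same specificity can still differ in positive predictive value, as the diabetes figures above do.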

Prevalence and cost of hospitalization for gastrointestinal complications related to peptic ulcers with bleeding or perforation: comparison of two national databases. (3/2276)

The purpose of this study was to determine the prevalence and cost of hospitalization for upper gastrointestinal complications, including peptic ulcers with hemorrhage or perforation. Upper gastrointestinal complications and corresponding economic data were obtained from two sources. The first was a 20% sample of all community hospital discharges (about 6 million per year) from 11 states for 1991 and 1992 (Hospital Cost Utilization Project; HCUP-3). The second source of data was a claims database for employees of large US corporations and their dependents for 1992, 1993, and 1994 (about 3.5 million covered lives per year; MarketScan). A group of ICD-9 codes for the diagnosis of peptic and gastroduodenal ulcers with bleeding or perforation was used to identify hospital admissions because of upper gastrointestinal complications. Similar patterns were observed across the MarketScan and HCUP-3 databases regarding hospitalization with diagnoses related to gastrointestinal complications identified according to the ICD-9 codes. The average age of patients with upper gastrointestinal complications was 66 years in the HCUP-3 database and 52 years in the MarketScan database. The average annual rates of upper gastrointestinal complications as a primary or secondary diagnosis were 6.4 and 6.7 per 1000 discharges for 1991 and 1992, respectively (HCUP-3), and 4.3, 4.2, and 4.9 per 1000 admissions for 1992, 1993, and 1994, respectively (MarketScan). The average length of stay for upper gastrointestinal complications as a primary diagnosis was 7.8 days in 1991 and 7.5 days in 1992 (HCUP-3) and 6.1, 5.1, and 5.1 days in 1992, 1993, and 1994, respectively (MarketScan). The national average total charge for hospitalization for gastrointestinal problems as a primary diagnosis was $12,970 in 1991 and $14,294 in 1992 (HCUP-3). The average total reimbursement for hospitalizations related to upper gastrointestinal problems was $15,309 in 1992, $12,987 in 1993, and $13,150 in 1994 (MarketScan). Hospital admissions for upper gastrointestinal complications are expensive. The rate and cost per admission are higher for the older population. The results on the elements covered by both databases are consistent. Therefore the databases complement each other on the type of information abstracted.

DNA microarray technology: the anticipated impact on the study of human disease. (4/2276)

One can imagine that, one day, there will be a general requirement that relevant array data be deposited, at the time of publication of manuscripts in which they are described, into a single site made available for the storage and analysis of array data (modeled after the GenBank submission requirements for DNA sequence information). With this system in place, one can anticipate a time when data from thousands of gene expression experiments will be available for meta-analysis, which has the potential to balance out artifacts from many individual studies, thus leading to more robust results and subtle conclusions. This will require that data adhere to some type of uniform structure and format that would ideally be independent of the particular expression technology used to generate it. The pros and cons of various publication modalities for these large electronic data sets have been discussed elsewhere [12], but, practical difficulties aside, general depositing must occur for this technology to reach the broadest range of investigators. Finally, as mentioned at the beginning of this review, it is unfortunate that this important research tool remains largely restricted to a few laboratories that have developed expertise in this area and to a growing number of commercial interests. Ultimately the real value of microarray technology will only be realized when this approach is generally available. It is hoped that issues including platforms, instrumentation, clone availability, and patents [20] will be resolved shortly, making this technology accessible to the broadest range of scientists at the earliest possible moment.

Motif-based searching in TOPS protein topology databases. (5/2276)

MOTIVATION: TOPS cartoons are a schematic abstraction of protein three-dimensional structures in two dimensions, and are used for understanding and manual comparison of protein folds. Recently, an algorithm that produces the cartoons automatically from protein structures has been devised and cartoons have been generated to represent all the structures in the structural databank. There is now a need to be able to define target topological patterns and to search the database for matching domains. RESULTS: We have devised a formal language for describing TOPS diagrams and patterns, and have designed an efficient algorithm to match a pattern to a set of diagrams. A pattern-matching system has been implemented, and tested on a database derived from all the current entries in the Protein Data Bank (15,000 domains). Users can search on patterns selected from a library of motifs or, alternatively, they can define their own search patterns. AVAILABILITY: The system is accessible over the Web at http://tops.ebi.ac.uk/tops
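
To make the idea of matching a topological pattern against a diagram concrete, the following is a minimal sketch and not the authors' formal language or algorithm. It assumes, purely for illustration, that a diagram is a string of secondary structure elements in sequence order (`E`/`e` for strands and `H`/`h` for helices, with case standing for orientation) plus a set of labelled pairwise relations such as `"A"` (antiparallel) and `"P"` (parallel). A pattern matches if its elements can be mapped, in order and possibly with gaps, onto diagram elements of the same type so that every pattern relation is present in the diagram. The function name `match_pattern` is hypothetical.

```python
from typing import List, Optional, Set, Tuple

# (i, j, relation) with i < j; indices refer to positions in the element string.
Edge = Tuple[int, int, str]


def match_pattern(
    pattern: str,
    pattern_edges: Set[Edge],
    diagram: str,
    diagram_edges: Set[Edge],
) -> Optional[List[int]]:
    """Return one order-preserving assignment of pattern elements to
    diagram positions, or None if the pattern does not occur."""

    def extend(assigned: List[int]) -> Optional[List[int]]:
        k = len(assigned)
        if k == len(pattern):
            return assigned
        # Sequence order must be preserved, so only look to the right.
        start = assigned[-1] + 1 if assigned else 0
        for pos in range(start, len(diagram)):
            if diagram[pos] != pattern[k]:
                continue
            # Every pattern relation that ends at element k must also
            # hold between the corresponding diagram elements.
            consistent = all(
                (assigned[i], pos, rel) in diagram_edges
                for (i, j, rel) in pattern_edges
                if j == k
            )
            if consistent:
                found = extend(assigned + [pos])
                if found is not None:
                    return found
        return None  # dead end: backtrack

    return extend([])


# Illustrative diagram: three strands, a helix, and a fourth strand.
diagram = "EeEhE"
diagram_edges = {(0, 1, "A"), (1, 2, "A"), (2, 4, "P")}

if __name__ == "__main__":
    # Three consecutive antiparallel strands.
    print(match_pattern("EeE", {(0, 1, "A"), (1, 2, "A")}, diagram, diagram_edges))
    # Two parallel strands with anything allowed in between.
    print(match_pattern("EE", {(0, 1, "P")}, diagram, diagram_edges))
```

Plain backtracking is used here for brevity; searching a database of many thousands of domains efficiently would need the kind of purpose-built algorithm the abstract refers to.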

Wrapping SRS with CORBA: from textual data to distributed objects. (6/2276)

MOTIVATION: Biological data come in very different shapes. Databanks are maintained and used by distinct organizations. Text is the de facto standard exchange format. The SRS system can integrate heterogeneous textual databanks but lacked a way to structure the extracted data. RESULTS: This paper presents a CORBA interface to the SRS system which manages databanks in a flat file format. SRS Object Servers are CORBA wrappers for SRS. They allow client applications (visualisation tools, data mining tools, etc.) to access and query SRS servers remotely through an Object Request Broker (ORB). They provide loader objects that contain the information extracted from the databanks by SRS. Loader objects are not hard-coded but generated in a flexible way by using loader specifications which allow SRS administrators to package data coming from distinct databanks. AVAILABILITY: The prototype may be available for beta-testing. Please contact the SRS group (http://srs.ebi.ac.uk).

The strategic and operational characteristics of a distributed phased archive for a multivendor incremental implementation of picture archiving and communications systems. (7/2276)

The long-term (10-year) multimodality distributed phased archive for the Medical Information, Communication and Archive System (MICAS) is being implemented in three phases. The selection process took approximately 10 months. Based on the mandatory archive attributes and desirable features, Cemax-Icon (Fremont, CA) was selected as the vendor. The archive provides an open solution that allows incorporation of leading-edge, "best of breed" hardware and software and provides maximum flexibility and automation of workflow both within and outside of radiology. The solution selected is media-independent, provides expandable storage capacity, and will provide redundancy and fault tolerance in phase II at minimum cost. Other attributes of the archive include scalable archive strategy, virtual image database with global query, and an object-oriented database. The archive is seamlessly integrated with the radiology information system (RIS) and provides automated fetching and routing, automated study reconciliation using modality worklist manager, clinical reports available at any Digital Imaging and Communications in Medicine (DICOM) workstation, and studies available for interpretation whether validated or not. Within 24 hours after a new study is acquired, four copies will reside within different components of the archive including a copy that can be stored off-site. Phase II of the archive will be installed during 1999 and will include a second Cemax-Icon archive and database using archive manager (AM) Version 4.0 in a second computer room.

Establishing radiologic image transmission via a transmission control protocol/Internet protocol network between two teaching hospitals in Houston. (8/2276)

The technical and management considerations necessary for the establishment of a network link between computed tomography (CT) and magnetic resonance imaging (MRI) networks of two geographically separated teaching hospitals are presented. The University of Texas Medical School at Houston Department of Radiology provides radiology residency training at its primary teaching hospital and at a second county-run hospital located approximately 12 miles away. A direct network link between the two hospitals was desired to permit timely consultative services to residents and professional colleagues. The network link was established by integrating the county hospital's free-standing imaging network into the network infrastructure of the Medical School and the main teaching hospital. Technical issues involved in the integration were reassignment of Internet protocol (IP) addresses, determination of data transmission protocol compatibilities, proof of connectivity and image transmission, transmission speeds and network loading, and management of the new network. These issues were resolved in a planned stepwise fashion and, although the system has a rate-limiting T1 segment between the county hospital and the teaching hospital, the transmission speed was deemed suitable. The project has proven successful and can provide a guide for planning similar projects elsewhere. It has in fact made possible several new services for the teaching and research activities of the department's faculty and residents, which were not envisaged before the implementation of this connection.
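
A rough sense of what a rate-limiting T1 segment means for image transfer can be had from a back-of-envelope calculation. The sketch below is illustrative only: the T1 line rate of 1.544 Mbit/s is standard, but the image dimensions (512 x 512 pixels at 2 bytes per pixel), the study size, and the `efficiency` factor standing in for protocol overhead and competing traffic are all assumptions, not figures from the project described above.

```python
T1_BITS_PER_SECOND = 1_544_000  # nominal T1 line rate


def transfer_seconds(
    n_bytes: int,
    bits_per_second: int = T1_BITS_PER_SECOND,
    efficiency: float = 1.0,
) -> float:
    """Estimated time to move n_bytes over a link.

    efficiency is the assumed fraction of the line rate actually
    available to the transfer (1.0 means the full nominal rate).
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return (n_bytes * 8) / (bits_per_second * efficiency)


# Assumed image size: 512 x 512 matrix, 2 bytes per pixel, uncompressed.
IMAGE_BYTES = 512 * 512 * 2

if __name__ == "__main__":
    one_image = transfer_seconds(IMAGE_BYTES)
    # Hypothetical 40-image study at 70% usable bandwidth.
    study = transfer_seconds(40 * IMAGE_BYTES, efficiency=0.7)
    print(f"one image at full line rate: {one_image:.1f} s")
    print(f"40-image study at 70% efficiency: {study / 60:.1f} min")
```

Under these assumptions a single image takes a few seconds and a whole study a few minutes, which is consistent with a link that is adequate for consultation without being instantaneous.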