Selection of optimal DNA oligos for gene expression arrays. (33/2093)

MOTIVATION: High-density DNA oligo microarrays are widely used in biomedical research. Selection of optimal DNA oligos that are deposited on the microarrays is critical. Based on sequence information and hybridization free energy, we developed a new algorithm to select optimal short (20-25 bases) or long (50 or 70 bases) oligos from genes or open reading frames (ORFs) and predict their hybridization behavior. Having optimized probes for each gene is valuable for two reasons. First, by minimizing background hybridization, they provide more accurate determinations of true expression levels. Second, having optimum probes minimizes the number of probes needed per gene, thereby decreasing the cost of each microarray, raising the number of genes on each chip and increasing its usage. RESULTS: In this paper we describe algorithms to optimize the selection of specific probes for each gene in an entire genome. The criteria for truly optimum probes are easily stated, but they are not currently computable at all levels. We have developed a heuristic approach that is efficiently computable at all levels and should provide a good approximation to the true optimum set. We have run the program on the complete genomes of several model organisms and deposited the results in a database that is available online (http://ural.wustl.edu/~lif/probe.pl). AVAILABILITY: The program is available upon request.

Database-driven multi locus sequence typing (MLST) of bacterial pathogens. (34/2093)

MOTIVATION: Multi Locus Sequence Typing (MLST) is a newly developed typing method for bacteria based on the sequence determination of internal fragments of seven house-keeping genes. It has proved useful in characterizing and monitoring disease-causing and antibiotic resistant lineages of bacteria. The strength of this approach is that unlike data obtained using most other typing methods, sequence data are unambiguous, can be held on a central database and be queried through a web server. RESULTS: A database-driven software system (mlstdb) has been developed, which is used by public health laboratories and researchers globally to query their nucleotide sequence data against centrally held databases over the internet. The mlstdb system consists of a set of perl scripts for defining the database tables and generating the database management interface and dynamic web pages for querying the databases. AVAILABILITY: http://www.mlst.net.
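The lookup that MLST relies on can be illustrated with a minimal Python sketch. It is not the mlstdb code (which the abstract describes as Perl scripts); the locus names, sequences, allele numbers and sequence types below are all hypothetical, and only three loci are used for brevity where a real scheme uses seven. Each fragment sequence is mapped to an allele number, and the ordered tuple of allele numbers is mapped to a sequence type.

```python
# Hypothetical toy data: a real MLST scheme has seven loci and long fragments.
LOCI = ("locusA", "locusB", "locusC")

# locus -> {fragment sequence: allele number}
ALLELES = {
    "locusA": {"ACGT": 1, "ACGA": 2},
    "locusB": {"GGCC": 1, "GGCT": 2},
    "locusC": {"TTAG": 1},
}

# tuple of allele numbers (in LOCI order) -> sequence type
PROFILES = {(1, 1, 1): 1, (2, 1, 1): 2}


def allele_profile(fragments):
    """Return allele numbers in LOCI order; None marks an unknown allele."""
    return tuple(ALLELES[locus].get(fragments[locus]) for locus in LOCI)


def sequence_type(fragments):
    """Return the sequence type of an isolate, or None if the profile is new."""
    return PROFILES.get(allele_profile(fragments))


isolate = {"locusA": "ACGA", "locusB": "GGCC", "locusC": "TTAG"}
print(allele_profile(isolate), sequence_type(isolate))
```

Because the comparison is an exact match on sequence, two laboratories querying the same central tables obtain the same allele numbers, which is the sense in which the abstract calls sequence data unambiguous.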

Comparison of the likelihood ratio and identity-by-state scoring methods for analyzing sib-pair test cases: a study using computer simulation. (35/2093)

To assess the power and significance of the likelihood ratio (LR) and the identity-by-state scoring (IBS) methods for a pair of siblings, we performed computer simulations using 10 DNA markers (HLA-DQalpha, D1S80, and 8 short tandem repeat loci) that were frequently analyzed in paternity tests in Japan. The combined power of discrimination of these 10 markers in the Japanese population is 0.999 999 999 98. Pedigrees each consisting of 10,000 pairs of full-siblings, half-siblings and unrelated individuals were generated and typed on all markers as random samples. Both the summation of log10 LR and IBS of each group had an approximately standard normal distribution with significant differences between the means. Statistical studies showed that the LR method has 91% power to detect unrelated individuals and 38% power to detect half-siblings as not full-siblings with a 5% false-positive rate, whereas the IBS method has 87% and 42% power, respectively. In 62% of full-siblings, in contrast with only 0.2% of unrelated individuals, the values of LR exceeded 100, which is equivalent to a probability of full-sibship of 0.99 at a 50% prior probability. The advantage of the LR method over the IBS method was convincing, especially for the detection of unrelated individuals as not half-siblings; however, the latter would also be informative for sib-pair tests if a sufficient number of polymorphic markers is available.
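The stated equivalence between an LR of 100 and a probability of 0.99 follows from Bayes' rule in odds form, and can be checked with a short Python sketch. The IBS function is an assumption: the abstract does not define its scoring, so the sketch uses the common convention of counting the alleles two individuals share at a locus (0, 1 or 2) and summing over loci. The genotypes are made up for illustration.

```python
def posterior_from_lr(lr, prior=0.5):
    """Posterior probability of the tested relationship given a likelihood ratio."""
    prior_odds = prior / (1 - prior)
    posterior_odds = lr * prior_odds
    return posterior_odds / (1 + posterior_odds)


def ibs_score(genotype_a, genotype_b):
    """Alleles shared identical by state at one locus (assumed convention: 0, 1 or 2)."""
    remaining = list(genotype_b)
    shared = 0
    for allele in genotype_a:
        if allele in remaining:
            remaining.remove(allele)  # each allele of b can be matched only once
            shared += 1
    return shared


def total_ibs(profile_a, profile_b):
    """Sum of per-locus IBS scores over two multi-locus profiles of equal length."""
    return sum(ibs_score(a, b) for a, b in zip(profile_a, profile_b))


# LR = 100 at a 50% prior gives 100/101, which rounds to 0.99.
print(round(posterior_from_lr(100), 2))

# Hypothetical two-locus genotypes (allele labels are arbitrary).
person_1 = [(12, 14), (7, 9)]
person_2 = [(14, 14), (7, 9)]
print(total_ibs(person_1, person_2))
```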

A physical amplified fragment-length polymorphism map of Arabidopsis. (36/2093)

We have positioned amplified fragment-length polymorphism (AFLP) markers directly on the genome sequence of a complex organism, Arabidopsis, by combining gel-based AFLP analysis with in silico restriction fragment analysis using the published genome sequence. For placement of the markers, we used information on restriction fragment size, four selective nucleotides, and the rough genetic position of the markers as deduced from the analysis of a limited number of Columbia (Col)/Landsberg (Ler) recombinant inbred lines. This approach allows for exact physical positioning of markers as opposed to the statistical localization resulting from traditional genetic mapping procedures. In addition, it is fast because no extensive segregation analysis is needed. In principle, the method can be applied to all organisms for which a complete or nearly complete genome sequence is available. We have located 1,267 AFLP Col/Ler markers resulting from 256 SacI+2, MseI+2 primer combinations to a physical position on the Arabidopsis genome. The positioning was verified by sequence analysis of 70 markers and by segregation analysis of two leaf-form mutants. Approximately 50% of the mapped Col/Ler AFLP markers can be used for segregation analysis in Col/C24, Col/Wassilewskija, or Col/Cape Verde Islands crosses. We present data on one such cross: the localization of a viviparous-like mutant segregating in a Col/C24 cross.
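The in silico side of this approach can be sketched in Python as follows. This is a simplified illustration and not the authors' pipeline: it uses the known recognition sequences of SacI (GAGCTC) and MseI (TTAA), measures fragments from the start of one recognition site to the end of the next rather than from the exact cut positions, ignores adapter lengths, and reports the selective nucleotides on the top strand only. The test sequence is made up. The count of 256 primer combinations is consistent with two selective nucleotides on each primer (16 x 16).

```python
from itertools import product

SITES = {"SacI": "GAGCTC", "MseI": "TTAA"}

# Two selective nucleotides per primer: 4**2 choices on each side.
N_COMBINATIONS = len(list(product("ACGT", repeat=2))) ** 2


def find_sites(seq):
    """Return sorted (position, enzyme) pairs for every recognition site."""
    hits = []
    for enzyme, site in SITES.items():
        pos = seq.find(site)
        while pos != -1:
            hits.append((pos, enzyme))
            pos = seq.find(site, pos + 1)
    return sorted(hits)


def aflp_fragments(seq):
    """List fragments bounded by adjacent sites of two different enzymes."""
    hits = find_sites(seq)
    fragments = []
    for (p1, e1), (p2, e2) in zip(hits, hits[1:]):
        if e1 == e2:
            continue  # only SacI/MseI fragments would be amplified
        inner_start = p1 + len(SITES[e1])
        if p2 - inner_start < 4:
            continue  # no room for two selective bases at each end
        fragments.append({
            "enzymes": (e1, e2),
            "start": p1,
            "length": p2 + len(SITES[e2]) - p1,
            # Top-strand bases adjacent to each site (a simplification).
            "selective": (seq[inner_start:inner_start + 2], seq[p2 - 2:p2]),
        })
    return fragments


toy = "AA" + "GAGCTC" + "GGCCAT" + "TTAA" + "CC"  # hypothetical sequence
print(N_COMBINATIONS)
print(aflp_fragments(toy))
```

Matching a gel band to the genome then amounts to searching this list for a fragment of the observed size with the selective nucleotides of the primer pair used, and narrowing the candidates by the marker's rough genetic position.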

Differential expression of genes coding for ribosomal proteins in different human tissues. (37/2093)

MOTIVATION: To perform a computational and statistical study on a large set of gene expression data pertaining to six adult human tissues (brain, liver, skeletal muscle, ovary, retina and uterus) for analyzing the expression of ribosomal protein genes. RESULTS: Unexpectedly, in each of the considered tissues large variations in the expression of ribosomal protein genes were observed. Moreover, when comparing the expression levels of 89 ribosomal protein genes in six different tissues, 13 genes appeared differentially expressed among tissues. AVAILABILITY: The expression data of the ribosomal protein genes together with supplementary material (complete transcriptional profiles of the considered human tissues) are freely available at the site GETProfiles (http://telethon.bio.unipd.it/GETProfiles/). CONTACT: [email protected]

Taxonomy workbench. (38/2093)

At advanced stages of working with user-defined protein and gene sequence collections, it is frequently necessary to link these data to the taxonomic tree and to extract subsets in accordance with taxonomic considerations. Since no general automatic tools had been available, this was a tedious manual effort. Our taxonomy workbench allows processing of sequence sets, mapping of these sets onto the taxonomic tree, collection of taxonomic subsets from them and printing of the whole tree or some part of it. As a side effect, the system enables queries to and navigation within the taxonomy database. AVAILABILITY: An implementation of the taxonomy workbench is accessible for public use as a www-service at http://mendel.imp.univie.ac.at/taxonomy/. Software components for the command-line and for the www-version are available on request. CONTACT: [email protected]; [email protected] SUPPLEMENTARY INFORMATION: Documentation for the taxonomy workbench can be accessed at http://mendel.imp.univie.ac.at/taxonomy/help.html.
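Mapping a sequence set onto a taxonomic tree and extracting a subset can be sketched in a few lines of Python. This is an illustration of the general idea only, not the workbench's implementation: the tree is a heavily abbreviated toy stored as a child-to-parent table, and the sequence identifiers are hypothetical.

```python
# Abbreviated toy tree as a child -> parent table (many ranks omitted).
PARENT = {
    "Eukaryota": "root",
    "Bacteria": "root",
    "Metazoa": "Eukaryota",
    "Mammalia": "Metazoa",
    "Homo sapiens": "Mammalia",
    "Mus musculus": "Mammalia",
    "Escherichia coli": "Bacteria",
}


def lineage(taxon):
    """Return the path from a taxon up to the root of the toy tree."""
    path = [taxon]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def taxonomic_subset(seq_to_taxon, clade):
    """Return the sequence identifiers whose source taxon lies within the clade."""
    return sorted(s for s, t in seq_to_taxon.items() if clade in lineage(t))


# Hypothetical sequence collection: identifier -> source organism.
collection = {
    "seqA": "Homo sapiens",
    "seqB": "Escherichia coli",
    "seqC": "Mus musculus",
}
print(taxonomic_subset(collection, "Mammalia"))
```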

Progenetix.net: an online repository for molecular cytogenetic aberration data. (39/2093)

Through sequencing projects and, more recently, array-based expression analysis experiments, a wealth of genetic data has become accessible via online resources. In contrast, few of the (molecular-) cytogenetic aberration data collected in the last decades are available in a format suitable for data mining procedures. www.progenetix.net is a new online repository for previously published chromosomal aberration data, allowing the addition of band-specific information about chromosomal imbalances to oncologic data analysis efforts. AVAILABILITY: http://www.progenetix.net CONTACT: [email protected]

Sequence type analysis and recombinational tests (START). (40/2093)

The 32-bit Windows application START is implemented using Visual Basic and C(++) and performs analyses to aid in the investigation of bacterial population structure using multilocus sequence data. These analyses include data summary, lineage assignment, and tests for recombination and selection. AVAILABILITY: START is available at http://outbreak.ceid.ox.ac.uk/software.htm. CONTACT: [email protected]