  • query sequences
  • The general algorithmic process followed by BLAT is similar to BLAST's in that it first searches for short segments in the database and query sequences which have a certain number of matching elements. (wikipedia.org)
  • It does this by keeping an indexed list (hash table) of the target database in memory, which significantly reduces the time required for the comparison of the query sequences with the target database. (wikipedia.org)
  • Calculating a global alignment is a form of global optimization that "forces" the alignment to span the entire length of all query sequences. (wikipedia.org)
  • In many cases, the input set of query sequences is assumed to have an evolutionary relationship by which they share a linkage and are descended from a common ancestor. (wikipedia.org)
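The indexing idea in the snippets above (keep a hash table of the target database in memory, then look for short exact matches with the query) can be sketched in a few lines. This is a minimal illustration, not BLAT's or BLAST's actual implementation: the function names `build_kmer_index` and `find_seed_hits`, the toy sequences, and the choice of overlapping k-mers with `k=4` are all assumptions made for the example.

```python
from collections import defaultdict


def build_kmer_index(targets, k):
    """Map each k-mer to the (target name, offset) pairs where it occurs."""
    index = defaultdict(list)
    for name, seq in targets.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((name, i))
    return index


def find_seed_hits(query, index, k):
    """Return (query offset, target name, target offset) for each exact k-mer match."""
    hits = []
    for q in range(len(query) - k + 1):
        # .get avoids inserting empty entries into the defaultdict
        for name, t in index.get(query[q:q + k], []):
            hits.append((q, name, t))
    return hits


# Hypothetical toy database and query
targets = {"chrA": "ACGTACGTGA", "chrB": "TTGACCA"}
index = build_kmer_index(targets, k=4)
hits = find_seed_hits("CGTGAC", index, k=4)
print(hits)
```

Each hit is only a candidate region. A real tool would go on to extend and align the regions around the seeds, which this sketch leaves out.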
  • gene expression
  • MicroRNAs (miRNAs) are endogenous small RNAs that recognize target sequences by base complementarity and play a role in the regulation of target gene expression. (plantcell.org)
  • Scotty: a web tool for designing RNA-Seq experiments to measure differential gene expression. (wikipedia.org)
  • Statistical tests specifically designed to handle count-based data can be used for differential gene expression and alternative splicing analysis. (wikipedia.org)
  • homologous
  • The resultant sRNA library data revealed that the surveyed tick populations produced reads that were homologous to St. Croix River Virus (SCRV) sequences. (frontiersin.org)
  • As part of this analysis, a phylogenetic tree was constructed to display the relationships among the homologous sequences that were identified. (frontiersin.org)
  • Rather, it first attempts to rapidly detect short sequences which are more likely to be homologous, and then it aligns and further extends the homologous regions. (wikipedia.org)
  • There are three different strategies used in order to search for candidate homologous regions: The first method requires single perfect matches between the query and database sequences i.e. the two k-mer words are exactly the same. (wikipedia.org)
  • A common use for pairwise sequence alignment is to take a sequence of interest and compare it to all known sequences in a database to identify homologous sequences. (wikipedia.org)
  • rRNA
  • The project also financed deep sequencing of bacterial 16S rRNA sequences amplified by polymerase chain reaction from human subjects. (wikipedia.org)
  • 28S ribosomal RNA is the structural ribosomal RNA (rRNA) for the large component, or large subunit (LSU) of eukaryotic cytoplasmic ribosomes, and thus one of the basic components of all eukaryotic cells. (wikipedia.org)
  • Genomes
  • The INFERNAL package can also be used with Rfam to annotate sequences (including complete genomes) for homologues to known ncRNAs. (wikipedia.org)
  • reads
  • RNA reads may be obtained using a variety of RNA-seq methods. (wikipedia.org)
  • The Python script htseq-qa takes a file with sequencing reads (either raw or aligned reads) and produces a PDF file with useful plots to assess the technical quality of a run. (wikipedia.org)
  • QC-Chain is a package of quality control tools for next generation sequencing (NGS) data, consisting of both raw reads quality evaluation and de novo contamination screening, which could identify all possible contamination sequences. (wikipedia.org)
  • Quickly scans reads and gathers statistics on base and quality frequencies, read length, and frequent sequences. (wikipedia.org)
  • Strand NGS also allows users to perform quality control on the imported data and filter reads before the main analysis is performed. (wikipedia.org)
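The kind of scan described above (base and quality frequencies, read lengths) can be illustrated with a small sketch. It is not the code of any tool named in this list. It assumes simple four-line FASTQ records and Phred+33 quality encoding, and the names `read_fastq` and `summarize` are hypothetical.

```python
from collections import Counter


def read_fastq(lines):
    """Yield (sequence, quality string) pairs from FASTQ lines, four lines per record."""
    lines = [ln.rstrip("\n") for ln in lines]
    for i in range(0, len(lines) - 3, 4):
        yield lines[i + 1], lines[i + 3]


def summarize(records, phred_offset=33):
    """Collect base counts, a read-length histogram, and the mean base quality."""
    base_counts = Counter()
    lengths = Counter()
    qual_sum = 0
    n_bases = 0
    for seq, qual in records:
        base_counts.update(seq)
        lengths[len(seq)] += 1
        qual_sum += sum(ord(c) - phred_offset for c in qual)
        n_bases += len(qual)
    mean_q = qual_sum / n_bases if n_bases else 0.0
    return {"bases": dict(base_counts), "lengths": dict(lengths), "mean_quality": mean_q}


# Two made-up reads; 'I' encodes Phred 40 and '5' encodes Phred 20 under Phred+33
example = ["@r1", "ACGT", "+", "IIII", "@r2", "AAC", "+", "II5"]
stats = summarize(read_fastq(example))
print(stats)
```

For a real file, the same functions could be fed an open file handle instead of the in-memory list, provided the records are not line-wrapped.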
  • read sequences
  • One common way of analyzing RNA sequencing data consists of three parts: aligning the read sequences to a reference genome, assembling the transcripts, and determining the expression levels of the transcripts. (helsinki.fi)
  • It can import raw read sequences from sequencing platforms like Illumina, Ion Torrent, PacBio, ABI, and 454 Life Sciences and supports fragment, single-end, paired-end, mate-paired, and directional single-end/paired-end library types. (wikipedia.org)
  • bacterial
  • Important components of the HMP were culture-independent methods of microbial community characterization, such as metagenomics (which provides a broad genetic perspective on a single microbial community), as well as extensive whole genome sequencing (which provides a "deep" genetic perspective on certain aspects of a given microbial community, i.e. of individual bacterial species). (wikipedia.org)
  • Transfer-messenger RNA (abbreviated tmRNA, also known as 10Sa RNA and by its genetic name SsrA) is a bacterial RNA molecule with dual tRNA-like and messenger RNA-like properties. (wikipedia.org)
  • In trans-translation, tmRNA and its associated proteins bind to bacterial ribosomes which have stalled in the middle of protein biosynthesis, for example when reaching the end of a messenger RNA which has lost its stop codon. (wikipedia.org)
  • In other bacterial species, a permuted ssrA gene produces a two-piece tmRNA in which two separate RNA chains are joined by base-pairing. (wikipedia.org)
  • common ancestor
  • If two sequences in an alignment share a common ancestor, mismatches can be interpreted as point mutations and gaps as indels (that is, insertion or deletion mutations) introduced in one or both lineages in the time since they diverged from one another. (wikipedia.org)
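To make the interpretation above concrete, the following sketch tallies the columns of an existing pairwise alignment into matches, mismatches (candidate point mutations), and gaps (candidate indels). The function name `summarize_alignment`, the `-` gap symbol, and the example sequences are assumptions for illustration. A run of consecutive gap columns on the same sequence is counted as one indel event, which is one possible convention among several.

```python
def summarize_alignment(a, b, gap="-"):
    """Count matches, mismatches, gap columns, and indel events in an aligned pair."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = mismatches = gap_columns = indel_events = 0
    prev_gap_side = None  # which sequence carried the gap in the previous column
    for x, y in zip(a, b):
        if x == gap and y == gap:
            raise ValueError("a column cannot be a gap in both sequences")
        if x == gap or y == gap:
            side = "a" if x == gap else "b"
            gap_columns += 1
            if side != prev_gap_side:
                indel_events += 1  # a new run of gaps starts here
            prev_gap_side = side
        else:
            prev_gap_side = None
            if x == y:
                matches += 1
            else:
                mismatches += 1
    return {
        "matches": matches,
        "mismatches": mismatches,
        "gap_columns": gap_columns,
        "indel_events": indel_events,
    }


# Made-up alignment: one substitution (G/C) and one two-base indel
summary = summarize_alignment("ACGT--AC", "ACCTGGAC")
print(summary)
```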
  • Antisense
  • According to the pairing region of a sense and antisense RNA pair, hNATs are divided into 6 classes, of which about 87% involve 5' or 3' UTR sequences, supporting the regulatory role of UTRs. (mendeley.com)
  • conservation
  • For example, it can be used to predict secondary structure, generate trees, and assess consensus and conservation across sequence families. (jalview.org)
  • Here, we designed a strategy to systematically analyze MIRNAs from different species generating a graphical representation of the conservation of the primary sequence and secondary structure. (plantcell.org)
  • Our results describe a link between the evolutionary conservation of plant MIRNAs and the mechanisms underlying the biogenesis of these small RNAs and show that the MIRNA pattern of conservation can be used to infer the mode of miRNA biogenesis. (plantcell.org)
  • Although DNA and RNA nucleotide bases are more similar to each other than are amino acids, the conservation of base pairs can indicate a similar functional or structural role. (wikipedia.org)
  • the consensus sequence is also often represented in graphical format with a sequence logo in which the size of each nucleotide or amino acid letter corresponds to its degree of conservation. (wikipedia.org)
  • Rather than using a single sequence, profile methods use a multiple sequence alignment to encode a profile which contains information about the conservation level of each residue. (wikipedia.org)
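The per-residue conservation that consensus sequences, sequence logos, and profile methods rely on can be approximated with a simple column count over a multiple sequence alignment. The sketch below is one simple measure (the fraction of sequences carrying the most common residue), not the scoring used by any specific tool mentioned here. The function name `column_conservation` and the toy RNA alignment are hypothetical.

```python
from collections import Counter


def column_conservation(alignment, gap="-"):
    """Return, per column, (most common residue, fraction of sequences carrying it)."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n = len(alignment)
    profile = []
    for column in zip(*alignment):
        counts = Counter(c for c in column if c != gap)
        if not counts:
            profile.append((gap, 0.0))  # column made up only of gaps
            continue
        residue, count = counts.most_common(1)[0]
        profile.append((residue, count / n))
    return profile


# Made-up alignment of four short RNA sequences
alignment = ["ACGU", "ACGA", "AC-U", "AGGU"]
profile = column_conservation(alignment)
consensus = "".join(residue for residue, _ in profile)
print(consensus, profile)
```

Sequence logos typically use an information-content measure rather than a raw fraction, so this should be read as a rough proxy only.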
  • phylogenetic
  • In eukaryotes, making phylogenetic inferences using RNA is complicated by alternative splicing, which produces multiple transcripts from a single gene. (wikipedia.org)
  • As proper ortholog identification is pivotal to phylogenetic analyses, there are a variety of methods available to infer orthologs and paralogs. (wikipedia.org)
  • RevTrans will even use protein data to inform DNA alignments, which can be beneficial for resolving more distant phylogenetic relationships. (wikipedia.org)
  • Rfam
  • Rfam is a database containing information about non-coding RNA (ncRNA) families and other structured RNA elements. (wikipedia.org)
  • Rfam researchers also contribute to Wikipedia's RNA WikiProject. (wikipedia.org)
  • The interface at the Rfam website allows users to search ncRNAs by keyword, family name, or genome as well as to search by ncRNA sequence or EMBL accession number. (wikipedia.org)
  • This seed alignment is used to create the SCFG, which is used with the Rfam software INFERNAL to identify additional family members and add them to the alignment. (wikipedia.org)
  • structural
  • Extracts structural motifs from a set of RNA sequences. (biosupplynet.com)
  • It is a hand-curated alignment that contains representative members of the ncRNA family and is annotated with structural information. (wikipedia.org)
  • The absence of substitutions, or the presence of only very conservative substitutions (that is, the substitution of amino acids whose side chains have similar biochemical properties) in a particular region of the sequence, suggests that this region has structural or functional importance. (wikipedia.org)
  • The complete E. coli tmRNA secondary structure was elucidated by comparative sequence analysis and structural probing. (wikipedia.org)
  • Sanger
  • Since the very first sequences of the insulin protein were characterized by Fred Sanger in 1951, biologists have been trying to use this knowledge to understand the function of molecules. (wikipedia.org)
  • The method used in this study, which is called the "Sanger method" or Sanger sequencing, was a milestone in sequencing long-strand molecules such as DNA. (wikipedia.org)
  • data
  • The first part of this thesis focuses on the analysis of short-read RNA-seq data. (helsinki.fi)
  • The second part, where the main contributions of this thesis lie, focuses on the analysis of long-read RNA-seq data. (helsinki.fi)
  • This one-day, computer-based, hands-on training course is designed for life sciences graduate students and research scientists who work with protein, RNA and DNA sequence data. (jalview.org)
  • Software for Life Scientists DNASTAR is committed to providing innovative and easy-to-use data analysis software tools for today's life scientists. (dnastar.com)
  • DNA sequencing has become increasingly efficient over the years, resulting in an enormous increase in the amount of data generated. (springer.com)
  • This sheer volume of available data makes advanced computer methods essential to analysis, and a familiarity with computers and sequence analysis software a vital requirement for the researcher involved with DNA sequencing. (springer.com)
  • This two-part work on Analysis of Data is designed to be a practical aid to the researcher who uses computers for the acquisition, storage, or analysis of nucleic acid (and/or protein) sequences. (springer.com)
  • Data and images of protein-RNA interactions. (biosupplynet.com)
  • Analysis of DNA/RNA and protein sequence data. (umich.edu)
  • Analysis of expression array data. (umich.edu)
  • Among them, sequence data is increasing at an exponential rate due to the advent of next-generation sequencing technologies. (wikipedia.org)
  • The advent of next-generation sequencing technologies has resulted in the generation of voluminous sequencing data. (wikipedia.org)
  • There are a number of public databases that contain freely available RNA-Seq data. (wikipedia.org)
  • RNA-Seq data may be directly assembled into transcripts using sequence assembly. (wikipedia.org)
  • Genome-guided assembly (sometimes called mapping or reference-guided assembly) is capable of using a pre-existing reference to guide the assembly of transcripts. Both methods attempt to generate biologically representative isoform-level constructs from RNA-seq data and generally attempt to associate isoforms with a gene-level construct. (wikipedia.org)
  • When selecting or generating sequence data, it is also vital to consider the tissue type, developmental stage and environmental conditions of the organisms. (wikipedia.org)
  • It is not uncommon to translate RNA sequence into protein sequence when using transcriptomic data, especially when analyzing highly diverged taxa. (wikipedia.org)
  • Sequerome directly queries the input sequence against a variety of databases/tools ('popular public domains' and 'privately hosted services') including BLAST, Protein Data Bank (PDB), REBASE and others, and generates outputs that are intuitive and easily comprehensible. (wikipedia.org)
  • One of the key features of profiling input sequence data is the ability to store, retrieve, and effectively combine and re-use older inputs. (wikipedia.org)
  • Often, it is necessary to filter data, removing low-quality sequences or bases (trimming), adapters, contamination, and overrepresented sequences, or correcting errors, to ensure a coherent final result. (wikipedia.org)
  • mRIN: assesses mRNA integrity directly from RNA-Seq data. (wikipedia.org)
  • NGSQC: a cross-platform quality analysis pipeline for deep sequencing data. (wikipedia.org)
  • NGS QC Toolkit: a toolkit for the quality control (QC) of next generation sequencing (NGS) data. (wikipedia.org)
  • The toolkit comprises user-friendly standalone tools for quality control of the sequence data generated using Illumina and Roche 454 platforms, with detailed results in the form of tables and graphs, and filtering of high-quality sequence data. (wikipedia.org)
  • It also includes a few other tools, which are helpful in NGS data quality control and analysis. (wikipedia.org)
  • PRINSEQ is a tool that generates summary statistics of sequence and quality data and that is used to filter, reformat and trim next-generation sequence data. (wikipedia.org)
  • It is particularly designed for 454/Roche data, but can also be used for other types of sequence. (wikipedia.org)
  • QC3: a quality control tool for DNA sequencing data, covering raw data, alignment, and variant calling. (wikipedia.org)
  • Strand NGS is a software platform for next-generation sequencing data analysis. (wikipedia.org)
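The filtering step mentioned in the list above (removing low-quality bases by trimming, then discarding reads that end up too short) can be sketched as follows. This is a simplified illustration and not the algorithm of any toolkit named here. It assumes Phred+33 quality strings, and the names `trim_3prime` and `filter_reads` as well as the thresholds `min_q=20` and `min_len=3` are arbitrary choices for the example.

```python
def trim_3prime(seq, qual, min_q=20, phred_offset=33):
    """Remove bases from the 3' end while their Phred score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - phred_offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def filter_reads(records, min_q=20, min_len=3):
    """Trim every (name, sequence, quality) record and keep those still long enough."""
    kept = []
    for name, seq, qual in records:
        trimmed_seq, trimmed_qual = trim_3prime(seq, qual, min_q)
        if len(trimmed_seq) >= min_len:
            kept.append((name, trimmed_seq, trimmed_qual))
    return kept


# Made-up reads; 'I' encodes Phred 40 and '#' encodes Phred 2 under Phred+33
records = [("r1", "ACGTAC", "IIII##"), ("r2", "ACGT", "I###")]
kept = filter_reads(records)
print(kept)
```

Adapter removal, contamination screening, and error correction are separate steps that this sketch does not attempt.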
  • Transcripts
  • A thorough in silico analysis of human transcripts will help expand our knowledge of NATs. (mendeley.com)
  • Combined with endogenous micro RNAs, hNATs could be regarded as a special group of transcripts contributing to the complex regulation networks. (mendeley.com)
  • Databases that contain and/or detect orthologous relationships include: DIOPT, Ensembl Compara, GreenPhylDB, HaMStR, HomoloGene, InParanoid, MultiParanoid, OMA, OrthoDB, OrthologID, OrthoMCL, OrtholugeDB, PhylomeDB, TreeFam, eggNOG, and metaPhOrs. As eukaryotic transcription is a complex process by which multiple transcripts may be generated from a single gene through alternative splicing with variable expression, the utilization of RNA is more complicated than DNA. (wikipedia.org)
  • transcript
  • For transcript assembly we propose a novel (at the time of the publication) approach of using minimum-cost flows to solve the problem of covering a graph created from the read alignments with a set of paths with the minimum cost, under some cost model. (helsinki.fi)
  • GENCODE Release 1 contained 416 known loci, 26 novel CDS (coding DNA sequence) loci, 82 novel transcript loci, 78 putative loci, 104 processed pseudogenes and 66 unprocessed pseudogenes. (wikipedia.org)
  • At the time of release, GENCODE Release 7 had the most comprehensive annotation of long noncoding RNA (lncRNA) loci publicly available with the predominant transcript form consisting of two exons. (wikipedia.org)
  • mismatches
  • Because the search is greedy, the first valid alignment encountered by Bowtie will not necessarily be the 'best' in terms of the number of mismatches or in terms of quality. (wikipedia.org)