Highly repeated sequences, 6K-8K base pairs in length, which contain RNA polymerase II promoters. They also have an open reading frame that is related to the reverse transcriptase of retroviruses but they do not contain LTRs (long terminal repeats). Copies of the LINE 1 (L1) family form about 15% of the human genome. The jockey elements of Drosophila are LINEs.
Highly repeated sequences, 100-300 bases long, which contain RNA polymerase III promoters. The primate Alu (ALU ELEMENTS) and the rodent B1 SINEs are derived from 7SL RNA, the RNA component of the signal recognition particle. Most other SINEs are derived from tRNAs including the MIRs (mammalian-wide interspersed repeats).
The sequence of PURINES and PYRIMIDINES in nucleic acids and polynucleotides. It is also called nucleotide sequence.
Descriptions of specific amino acid, carbohydrate, or nucleotide sequences which have appeared in the published literature and/or are deposited in and maintained by databanks such as GENBANK, European Molecular Biology Laboratory (EMBL), National Biomedical Research Foundation (NBRF), or other sequence repositories.
Addition of methyl groups to DNA. DNA methyltransferases (DNA methylases) perform this reaction using S-ADENOSYLMETHIONINE as the methyl group donor.
The monomeric units from which DNA or RNA polymers are constructed. They consist of a purine or pyrimidine base, a pentose sugar, and a phosphate group. (From King & Stansfield, A Dictionary of Genetics, 4th ed)
Discrete segments of DNA which can excise and reintegrate to another site in the genome. Most are inactive, i.e., have not been found to exist outside the integrated state. DNA transposable elements include bacterial IS (insertion sequence) elements, Tn elements, the maize controlling elements Ac and Ds, Drosophila P, gypsy, and pogo elements, the human Tigger elements and the Tc and mariner elements which are found throughout the animal kingdom.
The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION.
A single nucleotide variation in a genetic sequence that occurs at appreciable frequency in the population.
Nucleotide sequences, usually upstream, which are recognized by specific regulatory transcription factors, thereby causing gene response to various regulatory agents. These elements may be found in both promoter and enhancer regions.
Adenine nucleotides are molecules that consist of an adenine base attached to a ribose sugar and one, two, or three phosphate groups, including adenosine monophosphate (AMP), adenosine diphosphate (ADP), and adenosine triphosphate (ATP), which play crucial roles in energy transfer and signaling processes within cells.
The insertion of recombinant DNA molecules from prokaryotic and/or eukaryotic sources into a replicating vehicle, such as a plasmid or virus vector, and the introduction of the resultant hybrid molecules into recipient cells without altering the viability of those cells.
DNA sequences which are recognized (directly or indirectly) and bound by a DNA-dependent RNA polymerase during the initiation of transcription. Highly conserved sequences within the promoter include the Pribnow box in bacteria and the TATA BOX in eukaryotes.
Cis-acting DNA sequences which can increase transcription of genes. Enhancers can usually function in either orientation and at various distances from a promoter.
The biosynthesis of RNA carried out on a template of DNA. The biosynthesis of DNA from an RNA template is called REVERSE TRANSCRIPTION.
Guanine nucleotides are cyclic or linear molecules that consist of a guanine base, a pentose sugar (ribose in the cyclic form, deoxyribose in the linear form), and one or more phosphate groups, playing crucial roles in signal transduction, protein synthesis, and regulation of enzymatic activities.
The sequential correspondence of nucleotides in one nucleic acid molecule with those of another nucleic acid molecule. Sequence homology is an indication of the genetic relatedness of different organisms and gene function.
A deoxyribonucleotide polymer that is the primary genetic material of all cells. Eukaryotic and prokaryotic organisms normally contain DNA in a double-stranded state, yet several important biological processes transiently involve single-stranded regions. DNA, which consists of a polysugar-phosphate backbone possessing projections of purines (adenine and guanine) and pyrimidines (thymine and cytosine), forms a double helix that is held together by hydrogen bonds between these purines and pyrimidines (adenine to thymine and guanine to cytosine).
Purines attached to a RIBOSE and a phosphate that can polymerize to form DNA and RNA.
A multistage process that includes cloning, physical mapping, subcloning, determination of the DNA SEQUENCE, and information analysis.
The parts of a macromolecule that directly participate in its specific combination with another molecule.
The spatial arrangement of the atoms of a nucleic acid or polynucleotide that results in its characteristic 3-dimensional shape.
Any detectable and heritable change in the genetic material that causes a change in the GENOTYPE and which is transmitted to daughter cells and to succeeding generations.
Nucleic acid sequences involved in regulating the expression of genes.
Extrachromosomal, usually CIRCULAR DNA molecules that are self-replicating and transferable from one organism to another. They are found in a variety of bacterial, archaeal, fungal, algal, and plant species. They are used in GENETIC ENGINEERING as CLONING VECTORS.
RNA sequences that serve as templates for protein synthesis. Bacterial mRNAs are generally primary transcripts in that they do not require post-transcriptional processing. Eukaryotic mRNA is synthesized in the nucleus and must be exported to the cytoplasm for translation. Most eukaryotic mRNAs have a sequence of polyadenylic acid at the 3' end, referred to as the poly(A) tail. The function of this tail is not known for certain, but it may play a role in the export of mature mRNA from the nucleus as well as in helping stabilize some mRNA molecules by retarding their degradation in the cytoplasm.
Sequences of DNA or RNA that occur in multiple copies. There are several types: INTERSPERSED REPETITIVE SEQUENCES are copies of transposable elements (DNA TRANSPOSABLE ELEMENTS or RETROELEMENTS) dispersed throughout the genome. TERMINAL REPEAT SEQUENCES flank both ends of another sequence, for example, the long terminal repeats (LTRs) on RETROVIRUSES. Variations may be direct repeats, those occurring in the same direction, or inverted repeats, those opposite to each other in direction. TANDEM REPEAT SEQUENCES are copies which lie adjacent to each other, direct or inverted (INVERTED REPEAT SEQUENCES).
A species of gram-negative, facultatively anaerobic, rod-shaped bacteria (GRAM-NEGATIVE FACULTATIVELY ANAEROBIC RODS) commonly found in the lower part of the intestine of warm-blooded animals. It is usually nonpathogenic, but some strains are known to produce DIARRHEA and pyogenic infections. Pathogenic strains (virotypes) are classified by their specific pathogenic mechanisms such as toxins (ENTEROTOXIGENIC ESCHERICHIA COLI), etc.
The relationships of groups of organisms as reflected by their genetic makeup.
Proteins which bind to DNA. The family includes proteins which bind to both double- and single-stranded DNA and also includes specific DNA binding proteins in serum which can be used as markers for malignant diseases.
Use of restriction endonucleases to analyze and generate a physical map of genomes, genes, or other segments of DNA.
Endogenous substances, usually proteins, which are effective in the initiation, stimulation, or termination of the genetic transcription process.
Cyclic nucleotides are closed-chain molecules formed from nucleotides (ATP or GTP) through the action of enzymes called cyclases, functioning as second messengers in various cellular signaling pathways, with cAMP and cGMP being the most prominent members.
Protein factors that promote the exchange of GTP for GDP bound to GTP-BINDING PROTEINS.
A category of nucleic acid sequences that function as units of heredity and which code for the basic instructions for the development, reproduction, and maintenance of organisms.
The arrangement of two or more amino acid or base sequences from an organism or organisms in such a way as to align areas of the sequences sharing common properties. The degree of relatedness or homology between the sequences is predicted computationally or statistically based on weights assigned to the elements aligned between the sequences. This in turn can serve as a potential indicator of the genetic relatedness between the organisms.
A group of chemical elements that are needed in minute quantities for the proper growth, development, and physiology of an organism. (From McGraw-Hill Dictionary of Scientific and Technical Terms, 4th ed)
Any of the processes by which nuclear, cytoplasmic, or intercellular factors influence the differential control (induction or repression) of gene action at the level of transcription or translation.
A computer based method of simulating or analyzing the behavior of structures or components.
Established cell cultures that have the potential to propagate indefinitely.
Substances that comprise all matter. Each element is made up of atoms that are identical in number of electrons and protons and in nuclear charge, but may differ in mass or number of neutrons.
Pyrimidines with a RIBOSE and phosphate attached that can polymerize to form DNA and RNA.
The process in which substances, either endogenous or exogenous, bind to proteins, peptides, enzymes, protein precursors, or allied compounds. Specific protein-binding measures are often used as assays in diagnostic assessments.