
2. Biological meaning

Phylogenetic footprinting is a popular technique for discovering transcription factor binding sites (TFBSs) in eukaryotes.


Orthologous sequences from different species are compared pairwise, looking for peaks of similarity within a given region (window). In this example the mouse gene Sp5 (NM_022435) is compared with its orthologous sequences in human, dog and fugu.
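The window-based comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sequences are already aligned to the same length, the function names, window size and threshold are hypothetical, and the toy sequences are invented (they are not real Sp5 data).

```python
def window_identity(seq_a, seq_b, window=10, step=1):
    """Return (start, percent identity) for each window of two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    profile = []
    for start in range(0, len(seq_a) - window + 1, step):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        # A column counts as a match only if both bases agree and neither is a gap.
        matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
        profile.append((start, 100.0 * matches / window))
    return profile


def conserved_windows(profile, threshold=80.0):
    """Keep the windows whose identity reaches the (hypothetical) threshold."""
    return [(start, pct) for start, pct in profile if pct >= threshold]


# Toy aligned sequences, invented for illustration only.
mouse = "ACGTTGCAAGGCTTACGATC"
human = "ACGATGCAAGGCTAACTTTC"

profile = window_identity(mouse, human, window=5)
peaks = conserved_windows(profile, threshold=100.0)
for start, pct in peaks:
    print(f"window starting at {start}: {pct:.0f}% identity")
```

Plotting the second element of each pair in `profile` against the window start gives the kind of similarity curve in which the peaks are sought.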

Comparative studies have been very effective at identifying conserved non-coding elements (CNEs) that might have regulatory functions; however, CNEs may regulate a broad variety of biological functions, not necessarily confined to transcriptional regulation. For example, CNEs may be involved in DNA replication or mRNA splicing. One of the main reasons for the high false positive rate of current TFBS discovery software is therefore that it relies mainly on the Kimura rule alone: "functionally less important molecules or parts of molecules evolve (in terms of mutant substitutions) faster than more important ones". Such tools take into account only the alignment of the sequences and the percentage of conservation among them, and ignore extra information that could be used, such as the modular structure of TFBSs.


Modular binding of proteins to a regulatory region: they typically bind in a restricted region and then act cooperatively to induce transcription activity. Two scenarios are possible: a single transcription factor that binds repeatedly to the same kind of binding site (simple module), or different transcription factors that bind to different binding sites (complex module).
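The distinction between simple and complex modules can be made concrete with a small sketch. It assumes that predicted binding sites are available as (position, factor name) pairs; the function name `find_modules`, the `max_gap` and `min_sites` parameters, and the factor names are all hypothetical and chosen only for illustration.

```python
def find_modules(hits, max_gap=50, min_sites=2):
    """Group site hits into candidate modules and label each one.

    hits: iterable of (position, factor_name) pairs.
    max_gap and min_sites are illustrative assumptions, not values from the text.
    Returns a list of (first_position, last_position, kind) tuples.
    """
    clusters = []
    cluster = []
    for pos, factor in sorted(hits):
        # Start a new cluster when the next site is too far from the previous one.
        if cluster and pos - cluster[-1][0] > max_gap:
            clusters.append(cluster)
            cluster = []
        cluster.append((pos, factor))
    if cluster:
        clusters.append(cluster)

    labelled = []
    for cluster in clusters:
        if len(cluster) < min_sites:
            continue  # an isolated site is not treated as a module here
        factors = {factor for _, factor in cluster}
        # One factor bound repeatedly: simple module; several factors: complex module.
        kind = "simple" if len(factors) == 1 else "complex"
        labelled.append((cluster[0][0], cluster[-1][0], kind))
    return labelled


# Invented hits: three sites for one factor, an isolated site, and a mixed pair.
hits = [(10, "TF_A"), (40, "TF_A"), (75, "TF_A"),
        (300, "TF_B"),
        (600, "TF_A"), (630, "TF_C")]
modules = find_modules(hits)
print(modules)
```

Grouping by distance reflects the observation that the factors bind within a restricted region; the cutoff itself is arbitrary and would need tuning on real data.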

The current view is that once a regulatory region is accessible, it is bound by a combination of transcription factors. Binding of proteins is generally cooperative: while a single protein binds weakly, multiple transcription factors involved in protein-protein interactions increase their affinity for the regulatory region.


Cooperative activity of transcription factors: a few proteins bind to a cis-regulatory module (CRM) and then interact with the co-activator complex to enhance transcription. From: "Applied bioinformatics for the identification of regulatory elements", W. Wasserman and A. Sandelin, Nature Reviews Genetics.

Although a well-defined structure for regulatory regions has not yet been described in detail, we can summarize it as follows:

Moreover, most current software is based on global alignment, whereas the modular structure of binding sites suggests that local alignment should be used. Even tools that carry out local alignment to discover regulatory regions make no direct use of the modular structure of TFBSs.
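To illustrate why local alignment suits short conserved sites embedded in diverged flanking sequence, the following sketch computes a global (Needleman-Wunsch) and a local (Smith-Waterman) alignment score for the same pair of sequences. The scoring values (match 2, mismatch -1, linear gap -2) and the toy sequences are assumptions made for this example; this is not the scheme of any particular tool.

```python
def needleman_wunsch_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best global alignment score: both sequences are aligned end to end."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score: only the best-matching segment counts."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Flooring at zero lets an alignment restart anywhere.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


# Invented sequences sharing one short motif (GGGCGG) inside unrelated flanks.
seq_1 = "TTTTGGGCGGAAAA"
seq_2 = "CACAGGGCGGTCTC"

print("global score:", needleman_wunsch_score(seq_1, seq_2))
print("local score: ", smith_waterman_score(seq_1, seq_2))
```

The global score is pulled down by the unrelated flanks, while the local score reflects only the shared segment. Neither function uses the modular structure of TFBSs, which is the limitation noted above.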

