FOOTER (version 1.0)
This program analyses a pair of homologous mammalian DNA sequences (i.e. human and mouse/rat) for high-probability binding sites of known transcription factors. A set of Position-Specific Scoring Matrices (PSSMs) has been carefully constructed from mammalian transcription factor binding sites deposited in the TRANSFAC database. These matrices are species-specific (i.e. human and mouse/rat) and class-specific (human/mouse/rat). The matrices were created using the program WConsensus (Hertz and Stormo, Bioinformatics 1999).
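As an illustration of how a position-specific scoring matrix scores a candidate site, the following minimal Python sketch converts a count matrix into log-odds weights and scans a sequence for the best-scoring window. The count matrix, the uniform background, the pseudocount and all function names are assumptions made for this example. They are not FOOTER's matrices or code, and the sketch does not describe how WConsensus constructs its matrices.

```python
import math

# Hypothetical count matrix for a 4-bp motif, one dict per motif position.
# The counts are invented for illustration and are not taken from TRANSFAC.
COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}  # assumed uniform
PSEUDOCOUNT = 1.0  # assumed value; keeps zero counts from producing log(0)


def log_odds_matrix(counts, background, pseudocount):
    """Turn per-position base counts into log-odds weights."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + pseudocount
        matrix.append({
            base: math.log(
                (column[base] + pseudocount * background[base])
                / total / background[base]
            )
            for base in "ACGT"
        })
    return matrix


def score_site(matrix, site):
    """Sum the log-odds weights of the bases in one candidate site."""
    if len(site) != len(matrix):
        raise ValueError("site length must equal matrix width")
    return sum(column[base] for column, base in zip(matrix, site.upper()))


def best_hit(matrix, sequence):
    """Return (score, 1-based start position) of the best-scoring window."""
    width = len(matrix)
    hits = [
        (score_site(matrix, sequence[i:i + width]), i + 1)
        for i in range(len(sequence) - width + 1)
    ]
    return max(hits)


if __name__ == "__main__":
    pssm = log_odds_matrix(COUNTS, BACKGROUND, PSEUDOCOUNT)
    print(best_hit(pssm, "TTACGTTT"))
```

The sketch scans one strand only and accepts only the bases A, C, G and T. A fuller version would also score the reverse complement, since FOOTER reports an orientation for each site.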
FOOTER is a statistical algorithm that
employs phylogenetic footprinting principles to identify the most
likely human-mouse pairs of binding sites. The algorithm evaluates
each pair of sites using two tail probabilities:
1) Relative distance of the two sites from the Transcription
Start Site (TSS) or the closest boundary of the conserved
region.
2) The relative PSSM score.
Tail probabilities are calculated under the hypothesis that the compared pair of sites occurs by chance and that the sites are not true binding sites. The weighted sum of these tail probabilities constitutes the FOOTER score, and a Weighted Average p-value (WAP) is calculated from this score.
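The relationship between the two tail probabilities, the combined score and the WAP can be sketched in Python as follows. This is only an illustration under stated assumptions: the score is taken to be the negative weighted sum of the natural logs of the two tail probabilities, and the WAP is taken to be exp(-score). The function names and the default weights of 0.5 are hypothetical; the exact definition is given in Corcoran et al. and may differ.

```python
import math


def footer_score(p_distance, p_score, w_distance=0.5, w_score=0.5):
    """Combine two tail probabilities into one score (assumed form).

    p_distance: tail probability for the relative distance criterion.
    p_score:    tail probability for the relative PSSM score criterion.
    The weights are illustrative defaults, not FOOTER's defaults.
    """
    for p in (p_distance, p_score):
        if not 0.0 < p <= 1.0:
            raise ValueError("tail probabilities must lie in (0, 1]")
    return -(w_distance * math.log(p_distance) + w_score * math.log(p_score))


def wap_from_score(score):
    """Invert score = -ln(WAP) to recover the weighted average p-value."""
    return math.exp(-score)


if __name__ == "__main__":
    s = footer_score(0.01, 0.01)
    print(s, wap_from_score(s))
```

Under these assumptions, a pair of sites whose two tail probabilities are both 0.01 has a WAP of 0.01 when the weights sum to 1, and a score of 5 corresponds to a WAP of about 0.0067.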
The program presents the results in table format. The results consist of the transcription factor name, the identified human site and its position/orientation, the identified mouse site and its position/orientation, the FOOTER combined score, and the weighted average p-value. The output also provides links to the analyzed human and mouse promoter sequences and to the alignment output.
Search length:
The user can specify the lengths of the upstream and downstream sequences
to be analyzed. These lengths apply only if the
user submits a protein sequence.
a) PFd Weight:
The user can define the weight given to the probability for the relative
distance of the putative sites from the TSS or the closest conserved region
boundary. For more details on the probability score, see (Corcoran
et al.).
b) PFs Weight:
The user can define the weight given to the probability for the relative PSSM
scores. For more details on the probability score, see (Corcoran
et al.).
c) WAP Cut-off:
The user specifies the weighted average p-value cutoff.
d) Seed:
The user can define the number of seed putative binding sites per 'n' base
pairs.
FOOTER predictions are presented in table format,
preceded by a PNG file showing the degree of conservation (color coded)
between the two regions analysed. The columns of the table contain
information about the transcription factor(s) predicted to bind, the
sequence and position of each site, and the
FOOTER combined score and WAP value.
a) Factor:
The name of the transcription factor predicted to bind in this
region.
b) Human site:
The transcription factor binding site sequence in the human promoter.
c) PosH (orient.):
The position where the human site was found and its orientation. Note that
position 1 is the beginning of the specified sequence, not the
Transcription Start Site (TSS). Orientation is
determined with respect to the specified sequences.
d) Mouse site:
The transcription factor binding site sequence in the mouse promoter.
e) PosM (orient.):
The position where the mouse site was found and its orientation. Note that
position 1 is the beginning of the specified sequence, not the
Transcription Start Site (TSS). Orientation is
determined with respect to the specified sequences.
f) Score:
The FOOTER combined score, which is the negative sum of the
log-probabilities of the scores for the individual criteria
(Corcoran et al.). In practice, this score is equal to -ln(WAP).
Score values around 5 give the best results in most cases.
g) WAP value:
The Weighted Average p-value as calculated by FOOTER.
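Because PosH and PosM count from the beginning of the specified sequence rather than from the TSS, a reported position may need converting before it is compared with TSS-relative annotations. The Python sketch below shows one way to do this. It assumes that the analyzed sequence starts exactly upstream_length bases upstream of the TSS, that the TSS itself is numbered +1, and that there is no position 0. The function name and this numbering convention are assumptions for illustration and are not part of the FOOTER output.

```python
def to_tss_relative(position, upstream_length):
    """Convert a 1-based sequence position to a TSS-relative coordinate.

    Assumes the sequence begins upstream_length bases upstream of the TSS,
    so the TSS sits at sequence position upstream_length + 1 and is
    numbered +1. Upstream bases are numbered -1, -2, ... with no zero.
    """
    if position < 1:
        raise ValueError("positions are 1-based")
    if upstream_length < 0:
        raise ValueError("upstream_length must be non-negative")
    offset = position - upstream_length
    return offset if offset >= 1 else offset - 1


if __name__ == "__main__":
    # With 1000 bp of upstream sequence, position 1 is -1000 and
    # position 1001 is the TSS (+1).
    for pos in (1, 1000, 1001, 1010):
        print(pos, to_tss_relative(pos, 1000))
```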
Please reference this work as follows:
Corcoran et al. "FOOTER: a quantitative phylogenetic footprinting method for efficient recognition of cis-regulatory elements" Genome Res (2005) in press.
The user can select which transcription factors should be considered in the search. FOOTER currently offers 127 different factors. We are working on adding the ability for users to input their own count matrices. Note: The 'check all' and factor group check do not currently work on all browsers.
>Mouse_PEPCK
mppqlhngld fsakviqgsl dslpqavrkf vegnaqlcqp eyihicdgse eeygqllahm qeegvirklk kydncwlalt dprdvaries ktviitqeqr dtvpipktgl sqlgrwmsee dfekafnarf pgcmkgrtmy vipfsmgplg splakigiel tdspyvvasm rimtrmgisv lealgdgefi kclhsvgcpl plkkplvnnw acnpeltlia hlpdrreiis fgsgyggnsl lgkkcfalri asrlakeegw laehmlilgi tnpegkkkyl aaafpsacgk tnlammnpsl pgwkvecvgd diawmkfdaq gnlrainpen gffgvapgts vktnpnaikt iqkntiftnv aetsdggvyw egideplapg vtitswknke wrpqdaepca hpnsrfctpa sqcpiidpaw espegvpieg iifggrrpeg vplvyealsw qhgvfvgaam rseataaaeh kgkiimhdpf amrpffgynf gkylahwlsm ahrpaaklpk ifhvnwfrkd kdgkflwpgf gensrvlewm fgriegedsa kltpigyipk enalnlkglg gvnveelfgi skefwekeve eidryledqv ntdlpyeier elralkqris qm
Factors to Try with the Sample Sequence:
Under Leucine Zipper Factors:
CREB and C/EBP-beta