These exercises continue to examine the ENm008 region from the last workshop, but now we'll look at conservation of transcription factor binding sites (TFBSs) and at regulatory potential (RP). Of course, you can look at any locus you like, but you should stick with the human genome, May 2004 (hg17) assembly.
Ivan Ovcharenko designed zPicture and Mulan to export their alignments to the rVista and MultiTF servers to obtain information on TFBSs and whether or not they are conserved.
If your RID from the last workshop is still active, use it to return to your multiple alignments in Mulan. If not, try m11040822693626, or repeat the alignments.
From the summary page (which includes the dynamic visualization and the dot-plots), send the alignments to MultiTF. Stick with the default settings on the page "Defining transcription factor binding sites".
Choose the transcription factors GATA1, NFE2, and TAL1BETAE4. There are multiple entries related to GATA1 and TAL1; these are alternative weight matrices for the same binding site. Feel free to choose others and follow your own interests. Click on submit. (My RID for this exercise is mlr11042005210219578; you may want to use it if you run into problems.)
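To make the idea of a weight matrix concrete, here is a minimal Python sketch of scoring a sequence against one. The count matrix is a made-up toy for a 4-bp GATA-like core, not one of the matrices offered by MultiTF, and the pseudocount, uniform background, and threshold are assumptions chosen only for illustration.

```python
import math

# Hypothetical toy count matrix (NOT a real library matrix):
# each list gives the count of that base at positions 1-4 of the site.
TOY_COUNTS = {
    "A": [1, 18, 1, 17],
    "C": [1, 0, 1, 1],
    "G": [17, 1, 1, 1],
    "T": [1, 1, 17, 1],
}


def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Convert base counts to log2 odds against a uniform background."""
    length = len(counts["A"])
    matrix = []
    for i in range(length):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        column = {
            b: math.log2((counts[b][i] + pseudocount) / total / background)
            for b in "ACGT"
        }
        matrix.append(column)
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, site, score) for forward-strand windows scoring >= threshold."""
    seq = sequence.upper()
    width = len(matrix)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits


matrix = log_odds_matrix(TOY_COUNTS)
hits = scan("CCGATACC", matrix, threshold=4.0)
print(hits)
```

Different matrices for the same factor will score the same window differently, which is one reason the multiple GATA1 and TAL1 entries can give different predictions.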
View the results as a dynamic visualization. Where do you see groups of conserved TFBSs (cTFBSs)? Compare these to the pattern of all sites, i.e. including the sites found in only a single species. Which is more selective?
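The comparison between conserved sites and all sites can be sketched as set operations. This is only an illustration of the idea, not MultiTF's actual procedure: it assumes each predicted site has already been mapped to a column of the multiple alignment, and the species names and hits below are invented.

```python
def conserved_sites(hits_by_species):
    """Sites (alignment column, factor) predicted in every species."""
    site_sets = [set(hits) for hits in hits_by_species.values()]
    return set.intersection(*site_sets) if site_sets else set()


def all_sites(hits_by_species):
    """Sites predicted in at least one species."""
    return set().union(*hits_by_species.values())


# Hypothetical predictions, keyed by species; positions are alignment columns.
example_hits = {
    "human": {(120, "GATA1"), (300, "NFE2"), (415, "GATA1")},
    "mouse": {(120, "GATA1"), (300, "NFE2"), (502, "TAL1")},
    "dog": {(120, "GATA1"), (610, "NFE2")},
}

conserved = conserved_sites(example_hits)
everything = all_sites(example_hits)
print(len(conserved), "conserved of", len(everything), "total sites")
```

In this toy example, requiring a site in all three species keeps one of five predictions, which shows how strongly a conservation filter can reduce the candidate list.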
You may wish to explore other features, such as the ability to search for any user-defined consensus sequence (on the page "Defining transcription factor binding sites") and a tool for finding clusters of cTFBSs. In fact, dcode.org has a server that finds such clusters genome-wide for human and mouse. Pairwise alignments can be sent from zPicture to an equivalent cTFBS finder, called rVista.
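A user-defined consensus search can be approximated offline with a regular expression built from IUPAC nucleotide codes. The sketch below is an assumption about how such a search might be written, not a description of the server's implementation; it scans the forward strand only, and the test sequence is invented. WGATAR is the commonly cited GATA consensus.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Build a regex that also finds overlapping matches (via a lookahead)."""
    body = "".join(IUPAC[code] for code in consensus.upper())
    return re.compile("(?=(" + body + "))")


def find_consensus(sequence, consensus):
    """Return (0-based start, matched site) for each forward-strand match."""
    pattern = consensus_to_regex(consensus)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


matches = find_consensus("TTAGATAAGGCTGATAGC", "WGATAR")
print(matches)
```

To cover both strands, the same function could be run on the reverse complement of the sequence.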
At the UCSC Genome Browser, bring up the 5x Regulatory Potential track (full mode, max set to 0.2) and the TFBS Conserved track (pack mode), both found under Expression and Regulation, along with the RefSeq genes and Conservation tracks. Zoom in on chr16:100,001-170,000.
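If you later want to pull data for this window with your own scripts, the position string has to be converted to numeric coordinates. The helper below is a small sketch (the function names are made up); it relies on the browser's position box being 1-based and inclusive, while BED-style files are 0-based and half-open.

```python
def parse_position(position):
    """Parse a browser position such as 'chr16:100,001-170,000'.

    Returns (chrom, start, end) as 1-based inclusive coordinates.
    """
    chrom, span = position.strip().split(":")
    start_text, end_text = span.replace(",", "").split("-")
    return chrom, int(start_text), int(end_text)


def to_bed(chrom, start, end):
    """Convert 1-based inclusive coordinates to 0-based half-open (BED style)."""
    return chrom, start - 1, end


chrom, start, end = parse_position("chr16:100,001-170,000")
print(chrom, start, end, "length:", end - start + 1)
print(to_bed(chrom, start, end))
```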
Do you see any noncoding regions with high RP and some cTFBSs that look like good candidates for cis-regulatory modules?
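The visual search for candidates can also be expressed as a simple interval filter. The following is a rough sketch under stated assumptions: the RP cutoff, the minimum number of sites, and all coordinates are invented for illustration, intervals are half-open, and "noncoding" is approximated as "does not overlap any listed exon".

```python
def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) overlap."""
    return a[0] < b[1] and b[0] < a[1]


def candidate_modules(rp_windows, exons, ctfbs, rp_cutoff, min_sites):
    """Keep windows with RP >= cutoff, no exon overlap, and enough cTFBSs."""
    candidates = []
    for start, end, score in rp_windows:
        if score < rp_cutoff:
            continue  # regulatory potential too low
        if any(overlaps((start, end), exon) for exon in exons):
            continue  # overlaps an exon, so not treated as noncoding
        n_sites = sum(1 for site in ctfbs if overlaps((start, end), site))
        if n_sites >= min_sites:
            candidates.append((start, end, score, n_sites))
    return candidates


# Hypothetical data: (start, end, RP score), exon intervals, cTFBS intervals.
rp_windows = [(1000, 1200, 0.12), (3000, 3200, 0.02),
              (5000, 5200, 0.15), (7000, 7200, 0.11)]
exons = [(5100, 5300)]
ctfbs = [(1010, 1020), (1100, 1112), (7050, 7060)]

found = candidate_modules(rp_windows, exons, ctfbs, rp_cutoff=0.1, min_sites=2)
print(found)
```

Changing the cutoff or the required number of sites changes which windows survive, so it is worth trying a few settings before settling on candidates.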
Note that this set of cTFBSs is similar to, but not the same as, the set obtained with MultiTF. Can you think of factors that would contribute to the differences?