CSHL Computational Genomics - November 6, 2005 -
Large Scale Genome Alignment



The human GSTM (glutathione transferase class-mu) gene cluster comprises 5 genes, oriented GSTM4->>GSTM2->GSTM1->>GSTM5-><-GSTM3.

The exact number of GSTM genes in the mouse is not known, and, more importantly, the orthology between the mouse and human genes is unclear.

In this exercise, we will do large-scale comparisons of a syntenic region from human, mouse, rat, and fugu to try to identify the orthologous relationships based on similarity and gene order, and to determine whether fugu has both head-to-tail and tail-to-tail GSTM genes.
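
The head-to-tail versus tail-to-tail distinction can be made concrete with a small sketch. It assumes that every right-pointing arrow in the diagram above (both -> and ->>) marks a gene on the same strand and that <- marks the opposite strand; that reading of the notation is ours, and the function names are purely illustrative.

```python
# Gene order and strands transcribed from the arrow diagram in the text.
# Assumption: '->' and '->>' both mean the same (here '+') strand, '<-' means '-'.
HUMAN_GSTM = [
    ("GSTM4", "+"),
    ("GSTM2", "+"),
    ("GSTM1", "+"),
    ("GSTM5", "+"),
    ("GSTM3", "-"),
]


def classify_pair(left_strand, right_strand):
    """Name the arrangement of two neighbouring genes, read left to right."""
    if left_strand == right_strand:
        return "head-to-tail"      # tandem: both genes point the same way
    if left_strand == "+" and right_strand == "-":
        return "tail-to-tail"      # convergent: the 3' ends face each other
    return "head-to-head"          # divergent: the 5' ends face each other


def adjacent_arrangements(cluster):
    """Return (left_gene, right_gene, arrangement) for each neighbouring pair."""
    return [
        (left_name, right_name, classify_pair(left_strand, right_strand))
        for (left_name, left_strand), (right_name, right_strand)
        in zip(cluster, cluster[1:])
    ]


if __name__ == "__main__":
    for left, right, kind in adjacent_arrangements(HUMAN_GSTM):
        print(f"{left} / {right}: {kind}")
```

Once you have read the gene order and strands off the mouse, rat, and fugu maps, the same list format can be filled in for each species and compared with the human arrangement.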

  1. Examining a genome map
    1. Go to the human map and note the gene orientation, shown in blue at the top of the map. Pay attention to the genes that bound the syntenic region; you will use them to orient the non-human clusters.
    2. Go to the mouse map. What is the orientation of these genes in mouse? Which genes lie at the boundaries of the cluster?
    3. Go to the rat map. What is the orientation of these genes in rat? Which genes lie at the boundaries of the cluster?

  2. Does Ensembl annotate the same mouse genes?
  3. The human, mouse, rat and syntenic fugu genome sequences have been downloaded to cerebus.cshl.edu:/ecg/data/genome/ucsc_human.nt, ucsc_mouse.nt, ucsc_rat.nt, and ucsc_fugu.nt. Copy these genomic DNA sequences to your computer:

    On the Mac:

    1. Open a terminal window
    2. Copy the genome data:
      cp /ecg/data/genome/ucsc_* ~/Desktop

    On the PCs, use the ws_ftp program to transfer the files.
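
Before uploading anything, it can be worth confirming that the copy in step 3 worked. The sketch below assumes the .nt files are plain FASTA text and that they were copied to ~/Desktop as in the Mac instructions; neither is guaranteed by the handout, and the helper name fasta_lengths is made up for this example.

```python
from pathlib import Path


def fasta_lengths(path):
    """Return {record_name: sequence_length} for a FASTA file (assumed format)."""
    lengths = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue                      # skip blank lines
            if line.startswith(">"):
                parts = line[1:].split()
                name = parts[0] if parts else "unnamed"
                lengths[name] = 0             # start a new record
            elif name is not None:
                lengths[name] += len(line)    # count residues on sequence lines
    return lengths


if __name__ == "__main__":
    # Assumption: files were copied to the Desktop, as in the Mac instructions.
    desktop = Path.home() / "Desktop"
    for species in ("human", "mouse", "rat", "fugu"):
        fasta = desktop / f"ucsc_{species}.nt"
        if not fasta.exists():
            print(f"{fasta.name}: missing")
            continue
        for record, length in fasta_lengths(fasta).items():
            print(f"{fasta.name}\t{record}\t{length} bp")
```

A file reported as missing, or a record with a length of 0 bp, suggests the transfer should be repeated.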

  4. Use PiPMaker to compare the following genomes:
    1. Human to Mouse - use ucsc_human.repeats as First Sequence Mask and ucsc_human.genes as First Sequence exons.
    2. Mouse to Human - use ucsc_mouse.repeats as First Sequence Mask and ucsc_mouse.genes as First Sequence exons.
    3. Mouse to Rat - use ucsc_mouse.repeats and ucsc_mouse.genes.
    4. Mouse to Fugu - use ucsc_mouse.repeats and ucsc_mouse.genes.

  5. Use the VISTA Genome Browser to look at this region from the human and mouse perspectives. (On the Mac, you may need to use the Safari browser to use Java.) The human location is: chr1:109850000-110050000. The mouse location is: chr3:108250000-108500000. The rat location is: chr2:203400000-203650000. Which view gives the best perspective on potential mouse genes?

  6. You can also run the AVID sequence aligner at the LBL site. Compare the maps produced by AVID with those produced by LAGAN in the VISTA Genome Browser.
  7. Now try comparing genomes with Ensembl, using this guide.
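
The locations given in step 5 are easier to compare across browsers once they are split into chromosome, start, and end. The following is a minimal sketch; the function names are hypothetical, and the span is computed simply as end minus start (add one if you treat the coordinates as 1-based and inclusive).

```python
import re

# Matches strings such as 'chr1:109850000-110050000'.
REGION_PATTERN = re.compile(r"^(chr\w+):(\d+)-(\d+)$")

# Locations copied from step 5 of the exercise.
REGIONS = {
    "human": "chr1:109850000-110050000",
    "mouse": "chr3:108250000-108500000",
    "rat": "chr2:203400000-203650000",
}


def parse_region(text):
    """Split 'chrN:start-end' into (chromosome, start, end)."""
    match = REGION_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a chrN:start-end location: {text!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if end < start:
        raise ValueError(f"end precedes start in {text!r}")
    return chrom, start, end


def region_span(text):
    """Return end minus start for a location string."""
    _, start, end = parse_region(text)
    return end - start


if __name__ == "__main__":
    for species, location in REGIONS.items():
        chrom, start, end = parse_region(location)
        print(f"{species}: {chrom} {start}-{end} spans {region_span(location):,} bp")
```

With the coordinates above, the human window spans 200,000 bp and the mouse and rat windows span 250,000 bp each, which is worth keeping in mind when comparing plots drawn at different scales.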

