A Seacoast in Bohemia


Catching up the Chromosomes
October 6, 2009, 8:38 pm

Yesterday, a new build of the human genome was published to the NCBI genome viewer.  I checked it when a student had a question about the number of genes on a particular chromosome, and discovered that the chromosomes in the Gallery of Human Chromosomes on Genome Island needed updating!

Max updates the chromosomes

New genes everywhere!  Even the little Y chromosome has picked up a fair few.  It raises the interesting question of just what a gene is.  Genes with known functions are relatively easy to spot, since chunks of their sequence can be compared with the sequence of the same gene where it has already been characterized somewhere else.

Nature doesn’t reinvent the wheel any more often than necessary.  Once it arrives at a workable beta globin or a methyltransferase, that gene becomes part of the archive and just turns up everywhere.  I remember how impressed I was when I first learned about leghemoglobin, the hemoglobin relative found in the root nodules of legumes.  Who knew?

Genes that don’t have a known relative can sometimes be recognized because they contain known motifs: short sequences that bind Ca++ or DNA or ATP in some other protein.  If a DNA sequence encodes a bit of a protein that does something, then it’s probably part of a gene, even if you don’t know exactly why it might be binding Ca++.
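For the curious, here is a rough sketch of what a motif scan might look like in Python.  It translates a DNA sequence in the three forward reading frames and looks for the P-loop (Walker A) pattern, [AG]-x(4)-G-K-[ST], which turns up in many ATP- and GTP-binding proteins.  The choice of motif and the function names (translate, scan_for_motif) are my own for illustration; this is not how the NCBI annotators do it, and real motif searches also check the reverse strand and use far more sensitive profiles than a single regular expression.

```python
import re

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))

# P-loop (Walker A) motif; [^*] keeps a match from running through a stop codon.
P_LOOP = re.compile(r"[AG][^*]{4}GK[ST]")


def translate(dna, frame=0):
    """Translate DNA starting at offset `frame`; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def scan_for_motif(dna, pattern=P_LOOP):
    """Return (frame, amino_acid_index, matched_text) for each motif hit."""
    hits = []
    for frame in range(3):
        protein = translate(dna, frame)
        for match in pattern.finditer(protein):
            hits.append((frame, match.start(), match.group()))
    return hits
```

A hit only says that the stretch of DNA could encode something that looks like a nucleotide-binding loop.  It is a hint that a gene may be there, not proof.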

But what about REALLY unknown genes, genes with no recognizable homologs or motifs, especially in eukaryotes, whose sequences are usually complicated by introns?  What you need is an open reading frame: a bit of sequence that has a start codon and a reasonable string of amino acid codons following in the same translational frame before a stop codon turns up.  But codons can break across an intron; that is, part of the codon may be in one exon and the rest of it thousands of nucleotides away in the next exon.  Fortunately, introns have markers: they tend to begin with GT and end with AG, which helps in guessing where they might be interrupting the coding sequence.
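To make that concrete, here is a toy version of both ideas in Python.  The first function looks for open reading frames (an ATG followed by at least a few codons and then a stop) in the three forward frames; the second lists every stretch that begins with GT and ends with AG as a candidate intron.  The names find_orfs and candidate_introns, and the length limits, are assumptions I made up for the sketch.  Real gene finders also check the reverse strand and weigh splice sites statistically rather than accepting every GT...AG pair.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end) spans of forward-strand ORFs, stop codon included.

    min_codons counts the ATG and the codons after it, excluding the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


def candidate_introns(dna, min_len=4, max_len=50):
    """Return (start, end) spans that begin with GT and end with AG."""
    dna = dna.upper()
    hits = []
    for s in range(len(dna) - 1):
        if dna[s:s + 2] != "GT":
            continue
        last_end = min(len(dna), s + max_len)
        for e in range(s + min_len, last_end + 1):
            if dna[e - 2:e] == "AG":
                hits.append((s, e))
    return hits


def splice(dna, intron):
    """Remove one candidate intron and join the flanking exons."""
    start, end = intron
    return dna[:start] + dna[end:]


# Made-up example: the codon TTT is split as T | TT by a short intron.
genomic = "ATGAAAT" + "GTCCCCAG" + "TTGGGTAA"
for intron in candidate_introns(genomic):
    print(intron, find_orfs(splice(genomic, intron)))
```

In the made-up sequence the reading frame that starts at the first ATG only shows up after the candidate intron is removed, which is exactly the headache: you have to guess the splicing before the frame makes sense.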

Nevertheless, none of this gene hunting is as simple as it might sound, and I’m in awe of the annotators who have added the new genes to the NCBI database.  And Max has now added them to the Chromosome Gallery!