Decoding the human genome

  Kelly A. Frazer
  Moores UCSD Cancer Center, Department of Pediatrics and Rady Children's Hospital, University of California at San Diego, La Jolla, California 92093, USA

    Interpreting the human genome sequence is one of the major scientific endeavors of our time. In February 2001, when the human genome reference sequence was initially released (Lander et al. 2001), our understanding of the encoded contents was surprisingly limited. It was perplexing to many in the scientific community when we realized that the human genome contains only ∼21,000 distinct protein-coding genes (Claverie 2001; Hollon 2001; Pennisi 2003; Clamp et al. 2007), because less complex species such as the nematode Caenorhabditis elegans were known to have a similar number of protein-coding genes (Hillier et al. 2005). It quickly became apparent that the developmental and physiological complexity of humans would not be explained solely by the number of protein-coding genes, and the quest to understand the contents of the human genome began in full force.

    The Encyclopedia of DNA Elements (ENCODE) Project was launched in September of 2003 with the daunting task of identifying all the functional elements encoded in the human genome sequence. To accomplish this task, the National Human Genome Research Institute (NHGRI) organized The ENCODE Project Consortium, which consists of an international group of scientists with diverse expertise in experimental and computational methods for generating and analyzing high-throughput genomic data (The ENCODE Project Consortium 2004). During the initial four years, the consortium conducted a pilot project which focused on annotating functional elements in a defined 1% of the human genome consisting of ∼30 Mb divided among 44 genomic regions. On June 14, 2007, a report summarizing the findings of the pilot project revealed pervasive transcription of the human genome, with the majority of nucleotides represented in transcripts in at least a limited number of cell types at some time (The ENCODE Project Consortium 2007). Many of these transcripts comprised novel noncoding RNA …
