David Haussler

David Haussler (born 1953) is an American bioinformatician known for leading the team that assembled the first human genome sequence in the race to complete the Human Genome Project, and subsequently for comparative genome analysis that deepens understanding of the molecular function and evolution of the genome.[11][12] He is a Howard Hughes Medical Institute Investigator, professor of biomolecular engineering and founding scientific director of the UC Santa Cruz Genomics Institute at the University of California, Santa Cruz, director of the California Institute for Quantitative Biosciences (QB3) on the UC Santa Cruz campus, and a consulting professor at the Stanford University School of Medicine and the UC San Francisco Biopharmaceutical Sciences Department.[10][13]

Born: October 1953[1]
Nationality: United States
Institutions: University of California, Santa Cruz
Thesis: Insertion and iterated insertion as operations on formal language (1982)
Doctoral advisor: Andrzej Ehrenfeucht[8]
Other notable students: Anders Krogh[10]


Haussler studied art briefly at the Academy of Art in San Francisco in 1971 and then psychotherapy at Immaculate Heart College in Hollywood until 1973, when he transferred to Connecticut College, finishing in 1975 with a major in mathematics and a minor in physics. He earned an MS in applied mathematics from California Polytechnic State University in San Luis Obispo in 1979 and received his PhD in computer science from the University of Colorado at Boulder in 1982.


During summers while he was in college, Haussler worked for his brother, Mark Haussler, a biochemist at the University of Arizona studying vitamin D metabolism. They were the first to measure the levels of 1alpha,25-dihydroxyvitamin D3, the hormonal form of vitamin D, in the human bloodstream.[14] Between 1975 and 1979 he traveled and worked a variety of jobs, including work at a petroleum refinery in Burghausen, Germany, tomato farming on Crete, and farming kiwifruit, almonds, and walnuts in Templeton, California. While in Templeton he worked on his master's degree at nearby California Polytechnic State University.

Haussler was an assistant professor in Mathematics and Computer Science at the University of Denver in Colorado from 1982 to 1986. Since 1986 he has been at UC Santa Cruz, initially in the Computer Science Department and, since 2004, as an inaugural member of the Biomolecular Engineering Department.

While pursuing his doctorate in theoretical computer science at the University of Colorado, Haussler became interested in the mathematical analysis of DNA along with fellow students Gene Myers, Gary Stormo, and Manfred Warmuth. Haussler's current research stems from his early work in machine learning. In 1988 he organized the first Workshop on Computational Learning Theory with Leonard Pitt. With Blumer, Ehrenfeucht, and Warmuth he introduced the Vapnik-Chervonenkis framework to computational learning theory, solving some problems posed by Leslie Valiant. In the 1990s he obtained various results in information theory, empirical processes, artificial intelligence, neural networks, statistical decision theory, and pattern recognition.


Haussler’s research combines mathematics, computer science, and molecular biology.[7] He develops new statistical and algorithmic methods to explore the molecular function and evolution of the human genome, integrating cross-species comparative and high-throughput genomics data to study gene structure, function, and regulation.[15][16][17][18] He is credited with pioneering the use of hidden Markov models (HMMs), stochastic context-free grammars, and the discriminative kernel method for analyzing DNA, RNA, and protein sequences. He was the first to apply the latter methods to the genome-wide search for gene expression biomarkers in cancer, now a major effort of his laboratory.

As a collaborator on the international Human Genome Project, his team, featuring programming work by graduate student Jim Kent, posted the first publicly available computational assembly of the human genome sequence on the Internet on July 7, 2000.[19] Following this, his team developed the UCSC Genome Browser,[20] a web-based tool that is used extensively in biomedical research and serves as the platform for several large-scale genomics projects. These include NHGRI’s ENCODE project to use omics methods to explore the function of every base in the human genome (for which UCSC serves as the Data Coordination Center), NIH’s Mammalian Gene Collection, NHGRI’s 1000 Genomes Project to explore human genetic variation, and NCI’s Cancer Genome Atlas project to explore the genomic changes in cancer.

His group’s informatics work on cancer genomics, including the UCSC Cancer Genomics Browser,[21] provides a complete analysis pipeline from raw DNA reads through the detection and interpretation of mutations and altered gene expression in tumor samples. His group collaborates with researchers at medical centers nationally, including members of the Stand Up To Cancer “Dream Teams” and the Cancer Genome Atlas, to discover molecular causes of cancer and develop a new personalized, genomics-based approach to cancer treatment.[22]

Haussler is one of eight organizing committee members of the Global Alliance for Genomic and Clinical Data Sharing, along with David Altshuler from the Broad Institute of Harvard and MIT; Peter Goodhand and Thomas Hudson from the Ontario Institute for Cancer Research; Brad Margus from the A-T Children's Project; Elizabeth Nabel from Brigham and Women's Hospital; Charles Sawyers from Memorial Sloan-Kettering; and Michael Stratton from Wellcome Trust Sanger Institute.[citation needed]

He co-founded the Genome 10K Project to assemble a genomic zoo—a collection of DNA sequences representing the genomes of 10,000 vertebrate species—to capture genetic diversity as a resource for the life sciences and for worldwide conservation efforts.[23][24]

Through wet-lab experiments, Haussler explores and validates predictions generated from computational genomic research about the evolution and function of human genes. For instance, in his lab he uses embryonic and induced pluripotent stem cells to investigate neurodevelopment from a functional and evolutionary perspective. Research project areas include genome evolution, comparative genomics, alternative splicing, and functional genomics.[25][26][27][28][29][30][31][32]

Awards and recognition

Haussler is a member of the National Academy of Sciences,[33] the National Academy of Engineering,[34] and the American Academy of Arts and Sciences,[35] and a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI). His awards include the 2011 Weldon Memorial Prize from the University of Oxford, the 2009 American Society of Human Genetics (ASHG) Curt Stern Award in Human Genetics, the 2008 ISCB Senior Scientist Award from the International Society for Computational Biology (which also elected him an ISCB Fellow in 2009),[4] the 2005 Dickson Prize for Science from Carnegie Mellon University, and the 2003 Association for Computing Machinery/Association for the Advancement of Artificial Intelligence Allen Newell Award in Artificial Intelligence.

Alongside Cyrus Chothia and Michael Waterman, Haussler was awarded the 2015 Dan David Prize for his contributions to the field of bioinformatics.[36]


  1. ^ Jones, Pevzner An introduction to bioinformatics algorithms, 2004, p. 403.
  2. ^ Anon (2010). "2009 ASHG Awards and Addresses". The American Journal of Human Genetics. 86 (3): 309–310. doi:10.1016/j.ajhg.2010.02.013. PMC 3591852.
  3. ^ Sansom, C.; Morrison Mckay, B. J. (2008). Bourne, Philip E. (ed.). "ISCB Honors David Haussler and Aviv Regev". PLOS Computational Biology. 4 (7): e1000101. Bibcode:2008PLSCB...4E0101S. doi:10.1371/journal.pcbi.1000101. PMC 2536508. PMID 18795145.
  4. ^ a b Anon (2017). "ISCB Fellows". iscb.org. International Society for Computational Biology. Archived from the original on 2017-03-20.
  5. ^ n88141915
  6. ^ "Wikipedia co-founder Jimmy Wales among 2015 Dan David Prize winners". Retrieved 13 February 2015.
  7. ^ a b David Haussler publications indexed by Google Scholar
  8. ^ a b c David Haussler at the Mathematics Genealogy Project
  9. ^ Freund, Yoav (1993). Data filtering and distribution modeling algorithms for machine learning (PhD thesis). University of California, Santa Cruz. OCLC 679396091.
  10. ^ a b Gitschier, J. (2013). "Life, the Universe, and Everything: An Interview with David Haussler". PLOS Genetics. 9 (1): e1003282. doi:10.1371/journal.pgen.1003282. PMC 3561096. PMID 23382705.
  11. ^ Haussler, D. (2011). "David Haussler". Nature Biotechnology. 29 (3): 243. doi:10.1038/nbt.1808. PMID 21390032.
  12. ^ Downey, P. (2008). "Profile of David Haussler". Proceedings of the National Academy of Sciences. 105 (38): 14251–14253. Bibcode:2008PNAS..10514251D. doi:10.1073/pnas.0808284105. PMC 2567157. PMID 18799747.
  13. ^ Don't throw it out: 'Junk DNA' essential in evolution, radio interview by Joe Palca, NPR, Aug 19, 2011.
  14. ^ Brumbaugh, P. F.; Haussler, D. H.; Bressler, R.; Haussler, M. R. (1974). "Radioreceptor assay for 1 alpha,25-dihydroxyvitamin D3". Science. 183 (4129): 1089–1091. Bibcode:1974Sci...183.1089B. doi:10.1126/science.183.4129.1089. PMID 4812038.
  15. ^ Pearson, Helen (2004). "'Junk' DNA reveals vital role". Nature. doi:10.1038/news040503-9.
  16. ^ Biello, David. "Scientists Identify Gene Difference Between Humans and Chimps". Scientific American. 17 August 2006. Retrieved 2012-02-29.
  17. ^ "Vertebrate Evolution Occurred in Genetically Distinct Epochs". HHMI News. 19 August 2011. Retrieved 2012-02-29.
  18. ^ Zimmer, Carl. "When Bats and Humans Were One and the Same". The New York Times. 7 December 2004. Retrieved 2012-02-29.
  19. ^ Maher, Brendan. "Postcard from the party". The Scientist. 17 April 2003. Retrieved 2012-02-29.
  20. ^ Wade, Nicholas. "Reading the book of life; Grad student becomes gene effort's unlikely hero". The New York Times. 13 February 2001. Retrieved 2012-02-29.
  21. ^ Zhu, J.; Sanborn, J. Z.; Benz, S.; Szeto, C.; Hsu, F.; Kuhn, R. M.; Karolchik, D.; Archie, J.; Lenburg, M. E.; Esserman, L. J.; Kent, W. J.; Haussler, D.; Wang, T. (2009). "The UCSC Cancer Genomics Browser". Nature Methods. 6 (4): 239–240. doi:10.1038/nmeth0409-239. PMC 5027375. PMID 19333237.
  22. ^ Patterson, David. "Computer Scientists May Have What It Takes to Help Cure Cancer". The New York Times. 5 December 2011. Retrieved 2012-02-29.
  23. ^ Pennisi, Elizabeth. No Genome Left Behind. Science News. November 2009. Retrieved 2012-02-29.
  24. ^ "Building the Genome Zoo: The Genome 10K Project". The 7th Avenue Project. 22 November 2009. Retrieved 2012-02-29.
  25. ^ Park, J; Karplus, K; Barrett, C; Hughey, R; Haussler, D; Hubbard, T; Chothia, C (1998). "Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods". Journal of Molecular Biology. 284 (4): 1201–10. doi:10.1006/jmbi.1998.2221. PMID 9837738.
  26. ^ Farrell, C. M.; O'Leary, N. A.; Harte, R. A.; Loveland, J. E.; Wilming, L. G.; Wallin, C.; Diekhans, M.; Barrell, D.; Searle, S. M. J.; Aken, B.; Hiatt, S. M.; Frankish, A.; Suner, M. -M.; Rajput, B.; Steward, C. A.; Brown, G. R.; Bennett, R.; Murphy, M.; Wu, W.; Kay, M. P.; Hart, J.; Rajan, J.; Weber, J.; Snow, C.; Riddick, L. D.; Hunt, T.; Webb, D.; Thomas, M.; Tamez, P.; Rangwala, S. H. (2013). "Current status and new features of the Consensus Coding Sequence database". Nucleic Acids Research. 42 (Database issue): D865–D872. doi:10.1093/nar/gkt1059. PMC 3965069. PMID 24217909.
  27. ^ Harrow, J; Frankish, A; Gonzalez, J. M.; Tapanari, E; Diekhans, M; Kokocinski, F; Aken, B. L.; Barrell, D; Zadissa, A; Searle, S; Barnes, I; Bignell, A; Boychenko, V; Hunt, T; Kay, M; Mukherjee, G; Rajan, J; Despacio-Reyes, G; Saunders, G; Steward, C; Harte, R; Lin, M; Howald, C; Tanzer, A; Derrien, T; Chrast, J; Walters, N; Balasubramanian, S; Pei, B; et al. (2012). "GENCODE: The reference human genome annotation for the ENCODE Project". Genome Research. 22 (9): 1760–74. doi:10.1101/gr.135350.111. PMC 3431492. PMID 22955987.
  28. ^ International Cancer Genome Consortium; Hudson, T. J.; Anderson, W; Artez, A; Barker, A. D.; Bell, C; Bernabé, R. R.; Bhan, M. K.; Calvo, F; Eerola, I; Gerhard, D. S.; Guttmacher, A; Guyer, M; Hemsley, F. M.; Jennings, J. L.; Kerr, D; Klatt, P; Kolar, P; Kusada, J; Lane, D. P.; Laplace, F; Youyong, L; Nettekoven, G; Ozenberger, B; Peterson, J; Rao, T. S.; Remacle, J; Schafer, A. J.; Shibata, T; et al. (2010). "International network of cancer genome projects". Nature. 464 (7291): 993–8. Bibcode:2010Natur.464..993T. doi:10.1038/nature08987. PMC 2902243. PMID 20393554.
  29. ^ Pruitt, K. D.; Harrow, J; Harte, R. A.; Wallin, C; Diekhans, M; Maglott, D. R.; Searle, S; Farrell, C. M.; Loveland, J. E.; Ruef, B. J.; Hart, E; Suner, M. M.; Landrum, M. J.; Aken, B; Ayling, S; Baertsch, R; Fernandez-Banet, J; Cherry, J. L.; Curwen, V; Dicuccio, M; Kellis, M; Lee, J; Lin, M. F.; Schuster, M; Shkeda, A; Amid, C; Brown, G; Dukhanina, O; Frankish, A; et al. (2009). "The consensus coding sequence (CCDS) project: Identifying a common protein-coding gene set for the human and mouse genomes". Genome Research. 19 (7): 1316–23. doi:10.1101/gr.080531.108. PMC 2704439. PMID 19498102.
  30. ^ ENCODE Project Consortium, Birney E, Stamatoyannopoulos JA, Dutta A, Guigó R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET; et al. (2007). "Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project". Nature. 447 (7146): 799–816. Bibcode:2007Natur.447..799B. doi:10.1038/nature05874. PMC 2212820. PMID 17571346.
  31. ^ Chinwalla, A. T.; Waterston, L. L.; Lindblad-Toh, K. D.; Birney, G. A.; Rogers, L. A.; Abril, R. S.; Agarwal, T. A.; Agarwala, L. W.; Ainscough, E. R.; Alexandersson, J. D.; An, T. L.; Antonarakis, W. E.; Attwood, J. O.; Baertsch, M. N.; Bailey, K. H.; Barlow, C. S.; Beck, T. C.; Berry, B.; Birren, J.; Bloom, E.; Bork, R. H.; Botcherby, M. C.; Bray, R. K.; Brent, S. P.; Brown, P.; Brown, E.; Bult, B.; Burton, T.; Butler, D. G.; et al. (2002). "Initial sequencing and comparative analysis of the mouse genome". Nature. 420 (6915): 520–562. Bibcode:2002Natur.420..520W. doi:10.1038/nature01262. PMID 12466850.
  32. ^ Lander, E. S.; Linton, M.; Birren, B.; Nusbaum, C.; Zody, C.; Baldwin, J.; Devon, K.; Dewar, K.; Doyle, M.; Fitzhugh, W.; Funke, R.; Gage, D.; Harris, K.; Heaford, A.; Howland, J.; Kann, L.; Lehoczky, J.; Levine, R.; McEwan, P.; McKernan, K.; Meldrim, J.; Mesirov, J. P.; Miranda, C.; Morris, W.; Naylor, J.; Raymond, C.; Rosetti, M.; Santos, R.; Sheridan, A.; et al. (Feb 2001). "Initial sequencing and analysis of the human genome" (PDF). Nature. 409 (6822): 860–921. Bibcode:2001Natur.409..860L. doi:10.1038/35057062. ISSN 0028-0836. PMID 11237011.
  33. ^ "National Academy of Sciences".
  34. ^ "National Academy of Engineering".
  35. ^ "American Academy of Arts & Sciences".
  36. ^ "Dan David Prize".