Guide to Bioinformatics at Stanford University
Russ B. Altman, MD, PhD

(updated 11/25/06)

The following offers a summary of information about bioinformatics at Stanford. It is meant as advisory information for people interested in bioinformatics, written from my own perspective.

What is Bioinformatics?

Who does bioinformatics at Stanford?

What courses are offered at Stanford in bioinformatics?

Can I get a PhD or Master's degree in bioinformatics?

What about an MD/PhD?

What about if I am an undergraduate?

What about post-docs in bioinformatics?

Other programs I should consider?

What are the main conferences?

Who funds bioinformatics research?

Where to learn more?

Professor Don Knuth on Bioinformatics.


What is Bioinformatics and Biomedical Informatics?

The definition of bioinformatics is not universally agreed upon. Generally speaking, we define it as the creation and development of advanced information and computational technologies for problems in biology, most commonly molecular biology (but increasingly in other areas of biology). As such, it deals with methods for storing, retrieving and analyzing biological data, such as nucleic acid (DNA/RNA) and protein sequences, structures, functions, pathways and genetic interactions. Some people construe bioinformatics more narrowly, and include only those issues dealing with the management of genome project sequencing data. Others construe bioinformatics more broadly and include all areas of computational biology, including population modeling and numerical simulations. Biomedical informatics is a slightly broader umbrella that includes not only bioinformatics, but other areas of informatics in biology, medicine and health-care. The two fields are closely related.

The Stanford Biomedical Informatics training program has adopted the following mission statement:

“The mission of the Stanford BMI program is to train the next generation of researchers in biomedical informatics. Our students gain a knowledge of the scholarly informatics literature and the application requirements of specific areas within biology and/or medicine.  They learn to design and implement novel methods that are generalizable to a defined class of problems--focusing on the acquisition, representation, retrieval, and analysis of  biomedical data and  knowledge.”

At the Stanford Biomedical Informatics (BMI) training program, bioinformatics is included along with clinical informatics as part of a general informatics training program. Clinical informatics is the creation and development of advanced information and computational technologies for problems in the delivery of medical care. As such, it deals with patients, hospitals, laboratory tests, physicians, and other health-care professionals rather than bench scientists. However, the issues in clinical informatics and bioinformatics are sufficiently complementary that we believe they can profit from being under the same roof. I have discussed the issue of "what is bioinformatics?" in a paper, and have recently written an editorial about a proposed curriculum for bioinformatics. I have also addressed the issue of clinical informatics and bioinformatics co-existing.

US News & World Report recently ranked Stanford as #1 for graduate training in biological sciences. Stanford's Computer Science graduate department was ranked #1. In addition to excellent ratings in individual departments, Stanford was ranked #2 (tied with MIT) in "Genetics/Genomics/Bioinformatics." The BMI training program covers more than bioinformatics (it spans all of biomedical informatics, but US News does not separate out this discipline), but we are happy that our efforts in this subfield are recognized by our peers.

There is significant industrial interest in bioinformatics currently because of the information being produced by the genome sequencing projects, and the need to harness this for medical diagnostic and therapeutic uses, as well as the need to use this information for other industrial applications. There is an International Society for Computational Biology (ISCB) that is organizing a job list. It is affiliated with a number of scholarly journals.

There are many companies that have bioinformatics research and development units.  The best way to learn about them is to peruse the last few issues of the BIOINFORM newsletter, and look at job listings at the various computational biology conferences.  These companies are often co-sponsors of meetings.  The companies include big pharmaceutical companies, small pharmaceutical companies, bioinformatics companies and others.

Who does bioinformatics or computational biology at Stanford?

The list of participating faculty on the BMI web pages is a very good starting list.  It may not be comprehensive, but it includes many faculty members who do bioinformatics or computational biology.

In addition, there is an annual student-organized symposium on Biocomputation at Stanford (BCatS).

What courses are offered at Stanford in bioinformatics?

Of course, any biology or computer science or related course can be considered a bioinformatics course, but there are some courses that can be considered core courses in bioinformatics.  The number of courses is growing, and so you should check the latest Stanford course catalog.

Representations and Algorithms for Computational Molecular Biology (Biomedical Informatics 214/Computer Science 274). A programming course that introduces the nuts and bolts of basic algorithms.

Translational bioinformatics (BMI 217) is a new course covering the use of bioinformatics to assist translational medicine. 

Computer Applications in Molecular Biology (Biochemistry 218). A course for biologists that introduces key ideas in bioinformatics.

Genomics (Genetics 211). A course for biologists that introduces PERL programming and covers analysis of genomic data.

Protein Architecture, Dynamics and Structure Prediction (Structural Biology 228). Introduces the basic concepts of molecular structure and how to compute with molecular structure.

Computational Genomics (CS 262) covers important algorithms for genomics research, including comparative genomics.

Algorithms for Structure and Function in Biology (CS 273) covers algorithms for modeling and motion in molecular biology.

Algorithms in Biology (CS 374) covers detailed study of exciting current algorithms in bioinformatics.

Computational methods for analysis and reconstruction of biological networks (CS 279) covers the algorithms and data structures for analyzing and reconstructing biological networks.

Computational Systems Biology (CS 278) is an introduction to systems biology computing.

Biomedical Informatics 210 and Biomedical Informatics 211 offer a basic introduction to informatics, with both clinical and biological applications.

Biomedical Informatics 212 is a project course that allows students with an interest in the field to work in teams of three or four to create a novel software system in some area of biomedical informatics.

In addition, the Stanford Center for Professional Development is offering a Certificate in Bioinformatics for industry.  They can also make arrangements for academic licenses. Contact the Center for more information.

Can I get a PhD or Master's degree in bioinformatics?

Stanford offers MS and PhD degrees in Biomedical Informatics (BMI), which includes bioinformatics. The deadline for application is mid-December each year. The core curriculum includes basic courses in biology/physiology, computer science, probability/statistics/decision theory, social/ethical issues, and core biomedical informatics. All students are exposed to both clinical informatics and bioinformatics, but may choose courses to allow some degree of specialization based on their interests. Currently the program has about 70% bioinformatics and 30% clinical informatics students. The BMI program has an NIH training grant with pre-doctoral slots for US citizens and permanent residents. Other students must find research assistantships or other sources of funds for their training. Students who apply to the Biomedical Informatics program will automatically be considered for all sources of funding.

In addition to the BMI program, applicants can apply to any department affiliated with one of the faculty listed above and pursue bioinformatics within bioengineering, computer science, genetics, structural biology, mathematics, etc.

Coterm MS in BMI?

The BMI program also offers a coterminal MS degree to Stanford undergraduates, which lets them add an MS to their undergraduate program in approximately one extra year. There is no funding available for MS students unless they can secure research assistantships.

What about an MD/PhD?

Stanford has two ways to do an MD/PhD. US citizens and permanent residents can apply to the Medical Scientist Training Program and be accepted with funding for both MD and PhD with stipend/tuition. They can then select an informatics laboratory for their PhD.

Students can also apply to the MD program and, either simultaneously or during the first year of MD training, apply to the PhD program of their choice. If accepted for the PhD, they may be eligible for training grant funding of their PhD: either the training grant associated with their PhD program (such as the MIS training grant for MIS students) or the Genome Training Grant for students in any program relevant to bioinformatics. They will still have to pay for their MD, but these costs can be defrayed if they can secure a 50% research assistantship with their research mentor, which allows them to take medical school courses with greatly reduced tuition, and work in a lab for a stipend.

Students with an interest in MD/PhD in informatics should, in general, have some exposure to both biology and computer science/applied mathematics. Although this is not mandatory, the MSTP program is sufficiently competitive that the most successful applicants will have a strong undergraduate experience in their chosen field, and evidence of long-term commitment to the field. I would be happy to speak with MD/PhD candidates interested particularly in bioinformatics.

What about if I am an undergraduate?

The School of Engineering has a pre-approved independently designed major (IDM) entitled “Biomedical Computation” which may be of interest to students who want to combine training in biology and computation. 

Undergraduates can contact interested faculty investigators. It is normally (although not always) critical to have some basic knowledge of biology as well as computer programming skills in order to have a good experience in one of these labs. It is usually important to spend one or two quarters of research with a faculty member before a summer position can be secured, in order to ensure that the PI and the student are compatible. Undergraduates from both biology and computer science have participated profitably in bioinformatics research at Stanford, with significant contributions and publishable results. It is appropriate to consider bioinformatics for a biology honors thesis, as well as for a computer science programming project.

The types of skills that are appropriate for preparing for bioinformatics graduate school include the following:

  1. Computer programming (including text processing--PERL, and regular computation--C, C++, Java)
  2. Databases
  3. Probability and statistics
  4. Molecular biology and/or physiology

 

Undergraduates with programming skills and some knowledge of biology should also consider taking the Stanford courses in bioinformatics that are listed above.

What about post-docs in bioinformatics?

The issue of post-docs is tricky in informatics. Some people have done PhDs in computer science or biology or a related field, but have not worked within the intersection of the two fields, and thus are not really qualified to do independent work in bioinformatics. Others have done PhDs in these fields with an emphasis on bioinformatics, and are thus capable of entering a lab and becoming an independent contributor. Thus, I believe that post-docs come in two varieties: those ready to be bioinformatics post-docs, and those requiring additional training before they can take off on independent research. It is important to figure out which group you belong to, since it implies a different set of requirements.

Post-docs with a significant gap in biology or computer science need basic coursework and training in these areas, and are often not able to get up and going in a research project right away. Post-docs with no such gaps are ready to do relatively independent research under the direction of a PI with little extra formal instruction required.

The Stanford Biomedical Informatics training program has some training grant funding for post-docs (all for US citizens and permanent residents) in both situations.

People with an advanced degree but without a complete set of skills in informatics are eligible for BMI training grant support ONLY if they enter the MS or PhD program in Biomedical Informatics.

There are two other options:

  1. Established bioinformatics PhDs (US citizens/permanent residents) with relevant previous experience may also apply for post-doctoral support from external sources.

  2. Finally, of course, post-docs can apply to specific labs and work on specific research grants with individual PIs by their own arrangement.

Other programs I should consider?

I am not in the business of recommending or not recommending other programs, but it is clear that a look at recent important bioinformatics conferences will quickly indicate where there is exciting work going on, and who one might want to contact to find out about training opportunities. One rule-of-thumb is that many of the genome sequencing centers are affiliated with bioinformatics programs due to the heavy informatics requirements within genome science.   In addition, many people working on large scale biological and genomic data collection techniques (for example, gene chips) have active informatics efforts.

What are the main conferences?

Who funds bioinformatics research?

The main funding sources have been the National Institutes of Health (via the National Library of Medicine, the National Center for Human Genome Research and the National Institute of General Medical Sciences), the National Science Foundation (especially the Biological Information Resources and Computational Database Activity programs), the Department of Energy (through Genome projects), as well as some foundations (Burroughs Wellcome, Keck Foundation, Howard Hughes Medical Institute, Culpeper Foundation). Some pharmaceutical and biotech companies have grant programs as well, such as PhRMA.

Where to learn more?

  1. Talk to your local bioinformatics professional.
  2. Join the International Society for Computational Biology, and look at their web page with training programs both online and for degrees.
  3. Look into sources of funding for computational biology.

Professor Donald Knuth on Bioinformatics

Professor Donald Knuth of the Stanford Computer Science Department was once interviewed (the entire interview is available online) and had some comments on issues related to bioinformatics:

CLB: If you were a soon-to-graduate college senior or Ph.D. and you didn't have any "baggage", what kind of research would you want to do? Or would you even choose research again?

Knuth: I think the most exciting computer research now is partly in robotics, and partly in applications to biochemistry. Robotics, for example, that's terrific. Making devices that actually move around and communicate with each other. Stanford has a big robotics lab now, and our plan is for a new building that will have a hundred robots walking the corridors, to stimulate the students. It'll be two or three years until we move in to the building. Just seeing robots there, you'll think of neat projects. These projects also suggest a lot of good mathematical and theoretical questions. And high level graphical tools, there's a tremendous amount of great stuff in that area too. Yeah, I'd love to do that... only one life, you know, but...

CLB: Why do you mention biochemistry?

Knuth: There's millions and millions of unsolved problems. Biology is so digital, and incredibly complicated, but incredibly useful. The trouble with biology is that, if you have to work as a biologist, it's boring. Your experiments take you three years and then, one night, the electricity goes off and all the things die! You start over. In computers we can create our own worlds. Biologists deserve a lot of credit for being able to slug it through.

It is hard for me to say confidently that, after fifty more years of explosive growth of computer science, there will still be a lot of fascinating unsolved problems at peoples' fingertips, that it won't be pretty much working on refinements of well-explored things. Maybe all of the simple stuff and the really great stuff has been discovered. It may not be true, but I can't predict an unending growth. I can't be as confident about computer science as I can about biology. Biology easily has 500 years of exciting problems to work on, it's at that level.