Guide to Bioinformatics at Stanford University
Russ B. Altman, MD, PhD
(updated 11/25/06)
The following offers a summary of information about bioinformatics at
Stanford. It is meant as advisory information for people interested in bioinformatics,
written from my own perspective.
Who does bioinformatics at Stanford?
What courses are offered at Stanford in bioinformatics?
Can I get a PhD or Master's degree in bioinformatics?
What about if I am an undergraduate?
What about post-docs in bioinformatics?
Other programs I should consider?
What are the main conferences?
Who funds bioinformatics research?
Professor Don Knuth on Bioinformatics.
What is Bioinformatics and Biomedical Informatics?
The definition of bioinformatics is not universally agreed upon. Generally
speaking, we define it as the creation and development of advanced information
and computational technologies for problems in biology, most commonly molecular
biology (but increasingly in other areas of biology). As such, it deals with
methods for storing, retrieving and analyzing biological data, such as nucleic
acid (DNA/RNA) and protein sequences, structures, functions, pathways and
genetic interactions. Some people
construe bioinformatics more narrowly, and include only those issues dealing
with the management of genome project sequencing data. Others construe bioinformatics
more broadly and include all areas of computational biology, including
population modeling and numerical simulations. Biomedical informatics is a slightly broader umbrella
that includes not only bioinformatics, but other areas of informatics in
biology, medicine and health-care.
They are closely related.
The Stanford Biomedical Informatics training program has adopted the
following mission statement:
“The mission of the Stanford
BMI program is to train the next
generation of researchers in biomedical informatics. Our students gain a
knowledge of the scholarly informatics literature and the application
requirements of specific areas within biology and/or medicine. They learn to design and implement
novel methods that are generalizable to a defined class of problems--focusing
on the acquisition, representation, retrieval, and analysis of biomedical data and knowledge.”
At the Stanford Biomedical Informatics
(BMI) training program, bioinformatics is included along with clinical
informatics as part of a general informatics training program. Clinical
informatics is the creation and development of advanced information and
computational technologies for problems in the delivery of medical care. As
such, it deals with patients, hospitals, laboratory tests, physicians, and
health-care professionals instead of bench scientists. However, the issues in
clinical informatics and bioinformatics are sufficiently complementary that we
believe they can profit from being under the same roof. I have discussed the
issue of "what is bioinformatics?" in a paper, and have recently written an
editorial about a proposed curriculum for bioinformatics. I have also addressed
the issue of clinical informatics and bioinformatics co-existing.
US News & World Report recently ranked Stanford #1 for graduate training in
biological sciences. Stanford's Computer Science graduate department was also
ranked #1. In addition to excellent ratings in individual departments, Stanford
was ranked #2 (tied with MIT) in "Genetics/Genomics/Bioinformatics." The BMI
training program covers more than bioinformatics (it covers all of biomedical
informatics, but US News does not separate out this discipline), but we are
happy that our efforts in this subfield are recognized by our peers.
There is significant industrial interest in bioinformatics currently because
of the information being produced by the genome sequencing projects, and the
need to harness this for medical diagnostic and therapeutic uses, as well as
the need to use this information for other industrial applications.
There is an International Society for
Computational Biology (ISCB) that is organizing a job list. It is
affiliated with a number of scholarly journals.
There are many companies that have bioinformatics research and development
units. The best way to learn about them is to peruse the last few issues
of the BIOINFORM newsletter, and look
at job listings at the various computational biology conferences. These
companies are often co-sponsors of meetings. The companies include big
pharmaceutical companies, small pharmaceutical companies, bioinformatics
companies and others.
Who does bioinformatics or computational biology at Stanford?
The list of participating faculty on the BMI web pages is a very good starting list. It may not be comprehensive, but it includes many faculty members who do bioinformatics or computational biology.
In addition, there is an annual student-organized symposium on Biocomputation at Stanford (BCatS).
What courses are offered at Stanford in bioinformatics?
Of course, any biology or computer science or related course can be
considered a bioinformatics course, but there are some courses that can be
considered core courses in bioinformatics. The number of courses is growing, and so you should check
the latest Stanford course catalog.
Representations and Algorithms for Computational Molecular Biology (Biomedical Informatics 214/Computer Science 274). A programming course that introduces the nuts and bolts of basic algorithms.
Translational Bioinformatics (BMI 217). A new course covering the use of bioinformatics to assist translational medicine.
Computer Applications in Molecular Biology (Biochemistry 218). A course for biologists that introduces key ideas in bioinformatics.
Genomics (Genetics 211). A course for biologists that introduces Perl programming and covers analysis of genomic data.
Protein Architecture, Dynamics and Structure Prediction (Structural Biology 228). Introduces the basic concepts of molecular structure and how to compute with molecular structure.
Computational Genomics (CS 262). Covers important algorithms for genomics research, including comparative genomics.
Algorithms for Structure and Function in Biology (CS 273). Covers algorithms for modeling and motion in molecular biology.
Algorithms in Biology (CS 374). A detailed study of exciting current algorithms in bioinformatics.
Computational Methods for Analysis and Reconstruction of Biological Networks (CS 279). Covers the algorithms and data structures for analyzing and reconstructing biological networks.
Computational Systems Biology (CS 278). An introduction to systems biology computing.
Biomedical Informatics 210 and Biomedical Informatics 211. Offer a basic introduction to informatics, with both clinical and biological applications.
Biomedical Informatics 212. A project course that allows students with an interest in the field to work in teams of 3 or 4 students to create a novel software system in some area of biomedical informatics.
In addition, the Stanford
Center for Professional Development is offering a Certificate
in Bioinformatics for industry. They can also make arrangements for
academic licenses. Contact the Center for more information.
Can I get a PhD or Master's degree in bioinformatics?
Stanford offers MS and PhD degrees in Biomedical Informatics
(BMI), which includes bioinformatics. The deadline for application is
mid-December each year. The core curriculum includes basic courses in
biology/physiology, computer science, probability/statistics/decision theory,
social/ethical issues, and core biomedical informatics. All students are
exposed to both clinical and bioinformatics, but may choose courses to allow
some degree of specialization based on their interests. Currently the program
has about 70% bioinformatics and 30% clinical informatics students. The
BMI program has an NIH training grant with pre-doctoral slots for US citizens
and permanent residents. Other students must find research assistantships or
other sources of funds for their training. Students who apply to the Biomedical Informatics program
will automatically be considered for all sources of funding.
In addition to the BMI program, applicants can apply to any department
affiliated with one of the faculty listed above and pursue bioinformatics
within bioengineering, computer science, genetics, structural biology,
mathematics, etc.
Coterm MS in BMI?
The BMI program also offers a coterminal MS degree to Stanford
undergraduates, and this is a way for undergraduates to spend approximately one
extra year to add an MS to their undergraduate program. There is no funding available for MS
students, except if they can secure research assistantships.
What about an MD/PhD?
Stanford has two ways to do an MD/PhD. US citizens and permanent residents
can apply to the Medical
Scientist Training Program and be accepted with funding for both MD and PhD
with stipend/tuition. They can then select an informatics laboratory for their
PhD.
Students can also apply to the MD program and, either simultaneously or
during the first year of MD training, apply to the PhD program of their choice.
If accepted for the PhD, they may be eligible for training grant funding of
their PhD: either the training grant associated with their PhD program (such as
the MIS training grant for MIS students) or the Genome Training Grant for
students in any program relevant to bioinformatics. They will still have to pay for their
MD, but these costs can be defrayed if they can secure a 50% research
assistantship with their research mentor, which allows them to take medical
school courses with greatly reduced tuition, and work in a lab for a stipend.
Students with an interest in MD/PhD in informatics should, in general, have
some exposure to both biology and computer science/applied mathematics.
Although this is not mandatory, the MSTP program is sufficiently competitive
that the most successful applicants will have a strong undergraduate experience
in their chosen field, and evidence of long-term commitment to the field. I
would be happy to speak with MD/PhD candidates interested particularly in
bioinformatics.
What about if I am an undergraduate?
The School of Engineering has a pre-approved independently designed major
(IDM) entitled “Biomedical Computation”
which may be of interest to students who want to combine training in biology
and computation.
Undergraduates can contact interested faculty investigators. Normally
(although not exclusively) it is critical to have some basic knowledge of
biology as well as computer programming skills in order to have a good
experience in one of these labs. It is usually important to spend one or two
quarters of research with a faculty member before a summer position can be
secured, in order to ensure that the PI and the student are compatible.
Undergraduates from both biology and computer science have participated in
bioinformatics research at Stanford profitably with significant contributions
and publishable results. It is appropriate to consider bioinformatics for a
biology honors thesis, as well as for a computer science programming project.
The types of skills that are appropriate for preparing for bioinformatics
graduate school include the following:
Undergraduates with programming skills and some knowledge of biology should also consider taking the Stanford courses in bioinformatics that are listed above.
What about post-docs in bioinformatics?
The issue of post-docs is tricky in informatics. Some people have done PhDs
in computer science or biology or a related field, but have not worked within the
intersection of the two fields, and thus are not really qualified to do
independent work in bioinformatics. Others have done PhDs in these fields with
an emphasis on bioinformatics, and are thus capable of entering a lab and
becoming an independent contributor. Thus, I believe that post-docs come in two
varieties: those ready to be bioinformatics post-docs, and those requiring
additional training before they can take off on independent research. It is
important to figure out which group you belong to, since it implies a different
set of requirements.
Post-docs with a significant gap in biology or computer science need basic
coursework and training in these areas, and are often not able to get up and
going in a research project right away. Post-docs with no such gaps are ready
to do relatively independent research under the direction of a PI with little
extra formal instruction required.
The Stanford Biomedical Informatics training program has some training grant
funding for post-docs (all for US citizens and permanent residents) in both
situations.
People with an advanced degree, but not a complete set of skills in
informatics, are eligible for BMI training grant support ONLY if they
enter the MS or PhD program in Biomedical Informatics.
There are two other options:
3. Established bioinformatics PhDs (US citizens/permanent residents) with
relevant previous experience may also apply for post-doctoral support from
external sources.
4. Finally, of course, post-docs can apply to specific labs and work on
specific research grants with individual PIs by their own arrangement.
Other programs I should consider?
I am not in the business of recommending or not recommending other programs,
but it is clear that a look at recent important bioinformatics conferences will
quickly indicate where there is exciting work going on, and who one might want
to contact to find out about training opportunities. One rule-of-thumb is that
many of the genome sequencing centers are affiliated with bioinformatics
programs due to the heavy informatics requirements within genome
science. In addition, many people working on large scale biological
and genomic data collection techniques (for example, gene chips) have active
informatics efforts.
What are the main conferences?
Who funds bioinformatics research?
The main funding sources have been the National Institutes of Health (via the
National Library of Medicine, the National Center for Human Genome Research, and
the National Institute of General Medical Sciences), the National Science
Foundation (especially the Biological Information Resources and Computational
Database Activity programs), the Department of Energy (through Genome projects), as
well as some foundations (Burroughs Wellcome, Keck Foundation, Howard Hughes
Medical Institute, Culpeper Foundation). Some pharmaceutical and biotech
companies have grant programs as well, such as PhRMA.
Professor Donald Knuth on Bioinformatics
Professor Donald Knuth, of the Stanford Computer Science Department, was once
interviewed (entire interview available online) and had some comments on issues
related to bioinformatics:
CLB: If you were a soon-to-graduate college senior or Ph.D. and you
didn't have any "baggage", what kind of research would you want to
do? Or would you even choose research again?
Knuth: I think the most exciting computer research now is partly in
robotics, and partly in applications to biochemistry. Robotics, for example,
that's terrific. Making devices that actually move around and communicate with
each other. Stanford has a big robotics lab now, and our plan is for a new
building that will have a hundred robots walking the corridors, to stimulate
the students. It'll be two or three years until we move in to the building.
Just seeing robots there, you'll think of neat projects. These projects also
suggest a lot of good mathematical and theoretical questions. And high level
graphical tools, there's a tremendous amount of great stuff in that area too.
Yeah, I'd love to do that... only one life, you know, but...
CLB: Why do you mention biochemistry?
Knuth: There's millions and millions of unsolved problems. Biology is so
digital, and incredibly complicated, but incredibly useful. The trouble with biology
is that, if you have to work as a biologist, it's boring. Your experiments take
you three years and then, one night, the electricity goes off and all the
things die! You start over. In computers we can create our own worlds.
Biologists deserve a lot of credit for being able to slug it through.
It is hard for me to say confidently that, after fifty more years of
explosive growth of computer science, there will still be a lot of fascinating
unsolved problems at peoples' fingertips, that it won't be pretty much working
on refinements of well-explored things. Maybe all of the simple stuff and the
really great stuff has been discovered. It may not be true, but I can't predict
an unending growth. I can't be as confident about computer science as I can about
biology. Biology easily has 500 years of exciting problems to work on, it's at
that level.