CISC 636 Intro to Bioinformatics (Spring 2008)
Lecture 1: Course Overview
Li Liao
Computer and Information Sciences, University of Delaware
CISC 636, S08, Lec1, Liao

Administrative stuff
- Syllabus and tentative schedule (check frequently for updates)
- Office hours: 3:00PM-5:00PM Wednesdays. Appointments
- Collect info (name, email, dept, language)
- Introduce textbook and other resources
  > URLs, PDF/PS files, or hardcopy handout
  > A reading list
- Workload
  > 4 homework assignments (hands-on, to learn the nuts and bolts)
    • Language issue: Perl is strongly recommended (a tutorial is provided)
  > Mid-term and final exams
  > Late policy: 15% off per class, up to two class meetings

Bioinformatics - use and develop computing methods to solve biological problems
The field is characterized by
- an explosion of data
- difficulty in interpreting the data
- a large number of open problems
- until recently, a relative lack of sophistication of computational techniques (compared with, say, signal processing, graphics, etc.)

Why is this course good for you?
- According to a report in recent ACM Technews, CS enrollment has dropped, for good or bad.
- A factor for this drop is "the growing prominence of biotechnology and other fields."
- Bioinformatics is a computational wing of biotechnology.
Downloaded in March 2005
1. "A Survey of Peer-to-Peer Content Distribution Technologies." Stephanos Androutsellis-Theotokis, Diomidis Spinellis. ACM Computing Surveys. Dec. 2004.
2. "Bioinformatics—An Introduction for Computer Scientists." Jacques Cohen. ACM Computing Surveys. June 2004.
3. "Data Clustering." A.K. Jain, M.N. Murty, P.J. Flynn. ACM Computing Surveys. Sept. 1999.
4. "The University's Next Challenges." Peter J. Denning. Communications of the ACM. May 1996.
5. "Face Recognition." W. Zhao, R. Chellappa, P.J. Phillips, A. Rosenfeld. ACM Computing Surveys. Dec. 2003.
6. "Why We Blog." Bonnie A. Nardi, Diane J. Schiano, Michelle Gumbrecht, Luke Swartz. Communications of the ACM. Dec. 2004.
7. "Aspect-Oriented Programming." Tzilla Elrad, Robert E. Filman, Atef Bader. Communications of the ACM. Oct. 2001.
8. "Emerging Business Models for Mobile Brokerage Services." Clayton A. Looney, Leonard M. Jessup, Joseph S. Valacich. Communications of the ACM. June 2004.
9. "The State of the Art in Automating Usability Evaluation of User Interfaces." Melody Y. Ivory, Marti A. Hearst. ACM Computing Surveys. Dec. 2001.
10. "U.S. Technology Policy in the Information Age." Hal Berghel. Communications of the ACM. June 1996.
This list does not include downloads for the February and March issues, as those statistics tend to reflect e-subscribers downloading current issues.
Naturejobs Biotechnology
March 2001 Volume 19 Number 3 pp 285 - 286
University bioinformatics
programs on the rise
Randy J. Zauhar
Randy J. Zauhar is associate professor of biochemistry and
director of the graduate program in bioinformatics, Department
of Chemistry and Biochemistry, University of the Sciences in
Philadelphia, 600 S. 43rd Street, Philadelphia, PA
19104-4495 (e-mail: pzauhar@usip.edu).
Fueled by strong demand from students and industry's
need for trained bioinformaticists, universities are
increasing their offerings in this fast-growing field.
Information is found in the arrangement of things with respect to
each other. Whether we observe the arrangement of ink on the
page to form words, or the order of nucleotide bases to form a
gene sequence, we recognize that information is the key to
understanding the world around us. Bioinformatics studies how
information is stored, reproduced, and used by living systems. It
is not an overstatement to say that bioinformatics is what
biology is evolving to become in the 21st century.
The subject matter of bioinformatics has existed as long as
there has been molecular biology, but it has emerged as a
distinct discipline only in the last two decades. This was
prompted by a veritable explosion of information triggered by
careers and recruitment
Nature 389, 420 - 422 (1997) © Macmillan Publishers Ltd.
Running to catch up in Europe
HELEN GAVAGHAN
Helen Gavaghan is a science and technology writer based in Hebden Bridge, Yorkshire, UK.
Across Europe, the story is the same. Demand for those skilled in bioinformatics exceeds supply. Like biochemistry and biophysics befo
the barriers between traditional academic fields, and demanding flexibility and a new way of thinking from its adherents
Computational biology has meant different things to different people. Not too long ago, says Hans Prydz of the University of Oslo's Biot
handling NMR data or analysing Doppler echograms. Now renamed bioinformatics, it means looking for patterns in DNA and RNA, pr
modelling proteins and mining massive databases that continue to grow. When the DNA database run by the European Bioinformatics In
contained 700,000 nucleotides: now there are more than a billion.
Driven by the scientific and commercial importance of bioinformatics in genomics and drug discovery and development, governments, un
responding with varying degrees of vigour and success to the skills shortage and are seeking ways to cross the boundaries between disci
physics, mathematics, computer science, statistics, protein chemistry, genetics and molecular biology.
At European level, the EBI, based near Cambridge (United Kingdom), is funded to the sum of about DM 9 million ($5 million) by memt
Israel via their contributions to the European Molecular Biology Laboratory (EMBL) in Heidelberg, Germany. Contributions from the ph
industries roughly double the institute's income. The EBI, an offshoot of EMBL, develops tools for bioinformatics, seeks innovative ways
training courses for academics and industrialists. Initiatives with industry include the Industry Affiliates Initiative, which helps small and me
and apply new techniques; the BioTitan Project, running nodes to enable faster access to databases; and the Biostandards project, funde
European Union for promoting and developing standards
National initiatives also exist, particularly in the United Kingdom and Germany. Says Andrew Lyall, responsible for bioinformatics at Gla
in pretty good shape." There are two government-financed initiatives in the United Kingdom, both of which received a second lease of lif
One of these schemes, supported by the Biotechnology and Biological Sciences Research Council (BBSRC), coordinates the UK bioinf
the scheme has concentrated on developing software that would enable biologists without information technology (IT) skills to use some
their trade that are found on the World Wide Web. At a meeting earlier this month, the steering committee of the scheme decided to cha
Brass, who runs a masters' degree course in bioinformatics at the University of Manchester and is a member of the committee, says, "Wi
careers and recruitment
Nature 404, 686 - 697 (2000) © Macmillan Publishers Ltd.
Training: United States gives priority to skills shortage
POTTER WICKWARE
Bioinformatics marries together a wide range of scientific disciplines, but with a global shortage of skilled researchers, train
[WASHINGTON] Industry is draining bioinformatics talent from universities faster than it can be replenished. This is good news for the peo
news for the institutions that are scrambling to provide it, says Francis Ouellette, at the University of British Columbia's Center for Molec
Ouellette and Christoph Sensen at Canadian Bioinformatics Resource, in Halifax, Nova Scotia, run a four-part survey series (one week
genomics, proteomics and tools development), which introduces people to the field. Ouellette worries that the series is only a temporary
Sensen stresses the difficulties academic groups have in finding and retaining talent. "In two years of looking I haven't found a person will
environment. PhDs either go to a company or to a nice warm place in the United States where they also get more money. But there is an
academia because that's where much of the real science is done."
Chris Lee, of the Bioinformatics Institute at the University of California, Los Angeles, concurs. Industry has the data, he says. But it lack
full-service university, as well as the freedom to "sit around talking about problems with people from different backgrounds"
The gap between supply and demand in bioinformatics is receiving official recognition in the United States. The 13 National Institutes of
bioinformatics mainly through two institutes, the National Human Genome Research Institute and the National Library of Medicine. How
centres outside the NIH must also arise. The NIH approves the concept of developing such "centres of excellence", but has been slow t
infrastructure.
The National Institute of General Medical Sciences has also committed itself to funding training slots, and a fourth branch of the NIH, the
Resources (which is not an institute), has put itself behind shared bio-computational resources at more than a dozen centres nationwide.
Argonne and Oak Ridge laboratories are also huge funders of bioinformatics work, as is, to a somewhat smaller extent, the Department
On the private side, the Howard Hughes Medical Institute (HHMI) has declared that it will appoint investigators in computational biolog
that until now has avoided funding research in what it viewed as engineering disciplines. Now, however, it is becoming clear that biocom
HHMI's biomedical mission, but is one of its most critical elements.
Other support is also issuing from the Alfred P. Sloan Foundation, which has recently called for proposals to fund academic units that or
in biology. Traditionally, these degrees have not carried the same weight in biology as in engineering or business, where they are terminal
Computing and IT skills
- Algorithm design and model building
- Working with Unix systems/Web servers
- Programming (in Perl, Java, etc.)
- RDBMS: SQL, Oracle PL/SQL

People
- International Society for Computational Biology (www.iscb.org): ~1000 members
- Severe shortage of qualified bioinformaticians

Conferences
- ISMB (Intelligent Systems for Molecular Biology), started in 1992
- RECOMB (International Conference on Computational Molecular Biology), started in 1997
- PSB (Pacific Symposium on Biocomputing), started in 1996
- TIGR Computational Genomics, started in 1997
- ...

How much should I know about biology?
- Apparently, the more the better
- The least: Pevzner's 3-page "All you need to know about Molecular biology".
  > I will tell you.
- We adopt an "object-oriented" scheme, namely, we will transform biological problems into abstract computing problems and hide unnecessary details.
So another big goal of this course is to learn how to do abstraction.

Organisms: three kingdoms -- eukaryotes, eubacteria, and archaea
Cell: the basic unit of life
Chromosome (DNA)
  > circular, also called plasmid when small (for bacteria)
  > linear (for eukaryotes)
Genes: segments of DNA that contain the instructions for an organism's structure and function
Proteins: the workhorses of the cell
  > establishment and maintenance of structure
  > transport, e.g., hemoglobin and integral transmembrane proteins
  > protection and defense, e.g., immunoglobulin G
  > control and regulation, e.g., receptors and DNA-binding proteins
  > catalysis, e.g., enzymes

[Figure: concept map of molecular biology topics. Recoverable labels: structure of DNA & RNA, nucleotides, double helix, base pairing; from DNA to protein, transcription, mRNA, genetic code, translation, ribosomes, protein; regulatory elements, promoter, terminator, activator, repressor; splicing, introns, exons; nucleic acid world, RNA structure; evolution, tree of life, eukaryotes, bacteria.]
Genetic Code: codons

1st |       T       |       C       |       A       |       G       | 3rd
----+---------------+---------------+---------------+---------------+----
 T  | TTT Phe [F]   | TCT Ser [S]   | TAT Tyr [Y]   | TGT Cys [C]   | T
    | TTC Phe [F]   | TCC Ser [S]   | TAC Tyr [Y]   | TGC Cys [C]   | C
    | TTA Leu [L]   | TCA Ser [S]   | TAA Ter [end] | TGA Ter [end] | A
    | TTG Leu [L]   | TCG Ser [S]   | TAG Ter [end] | TGG Trp [W]   | G
----+---------------+---------------+---------------+---------------+----
 C  | CTT Leu [L]   | CCT Pro [P]   | CAT His [H]   | CGT Arg [R]   | T
    | CTC Leu [L]   | CCC Pro [P]   | CAC His [H]   | CGC Arg [R]   | C
    | CTA Leu [L]   | CCA Pro [P]   | CAA Gln [Q]   | CGA Arg [R]   | A
    | CTG Leu [L]   | CCG Pro [P]   | CAG Gln [Q]   | CGG Arg [R]   | G
----+---------------+---------------+---------------+---------------+----
 A  | ATT Ile [I]   | ACT Thr [T]   | AAT Asn [N]   | AGT Ser [S]   | T
    | ATC Ile [I]   | ACC Thr [T]   | AAC Asn [N]   | AGC Ser [S]   | C
    | ATA Ile [I]   | ACA Thr [T]   | AAA Lys [K]   | AGA Arg [R]   | A
    | ATG Met [M]   | ACG Thr [T]   | AAG Lys [K]   | AGG Arg [R]   | G
----+---------------+---------------+---------------+---------------+----
 G  | GTT Val [V]   | GCT Ala [A]   | GAT Asp [D]   | GGT Gly [G]   | T
    | GTC Val [V]   | GCC Ala [A]   | GAC Asp [D]   | GGC Gly [G]   | C
    | GTA Val [V]   | GCA Ala [A]   | GAA Glu [E]   | GGA Gly [G]   | A
    | GTG Val [V]   | GCG Ala [A]   | GAG Glu [E]   | GGG Gly [G]   | G
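
To make the codon table concrete, here is a minimal Python sketch of how it could be used to translate a DNA coding sequence into a protein sequence. It is an illustration only, not part of the course material: the names `CODON_TABLE` and `translate` are hypothetical, the table is the standard genetic code written with T instead of U (as in the slide), and the function assumes the input is already in the correct reading frame and contains only the letters A, C, G, and T.

```python
from itertools import product

# Bases in the same order as the rows and columns of the slide's table.
BASES = "TCAG"

# One-letter amino acid codes for the 64 codons, enumerated with the first
# base varying slowest and the third base fastest; "*" marks Ter (stop).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base T
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)

# Map each codon string (e.g. "ATG") to its one-letter amino acid code.
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, stopping at the first stop codon.

    Assumes the sequence starts in frame; trailing bases that do not form
    a complete codon are ignored. Letters other than A, C, G, T raise
    KeyError.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # ATG -> Met [M], GCC -> Ala [A], TAA -> Ter [end]
    print(translate("ATGGCCTAA"))
```

Building the table from two short strings rather than typing 64 dictionary entries keeps the sketch compact and makes the layout of the slide's table (first base by row, third base by line within a row) visible in the code.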
Challenges in Life Sciences
- Understanding correlation between genotype and phenotype
- Predicting genotype <=> phenotype
- Phenotypes:
  > drug/therapy response
  > drug-drug interactions for expression
  > drug mechanism
  > interacting pathways of metabolism
The New York Times
January 24, 2008
Scientists Take New Step Toward Man-Made Life
By ANDREW POLLACK
Taking a significant step toward the creation of man-made forms of life, researchers reported Thursday that they had manufactured the
entire genome of a bacterium by painstakingly stitching together its chemical components.
While scientists had previously synthesized the complete DNA of viruses, this is the first time it has been done for bacteria, which are much
more complex. The genome is more than 10 times as long as the longest piece of DNA ever previously synthesized.
The feat is a watershed for the emerging field called synthetic biology, which involves the design of organisms to perform particular tasks,
such as making biofuels. Synthetic biologists envision being able one day to design an organism on a computer, press the “print” button to
have the necessary DNA made, and then put that DNA into a cell to produce a custom-made creature.
“What we are doing with the synthetic chromosome is going to be the design process of the future,” said Dr. J. Craig Venter, the boundary-
pushing gene scientist. He assembled the team that made the bacterial genome as part of his well publicized quest to create the first
synthetic organism. The work was published online Thursday by the journal Science.
But there are concerns that synthetic biology could be used to make pathogens, or that errors by well-intended scientists could produce